Status

Current state: "Draft"

Discussion thread: https://lists.apache.org/thread/3qppq1nks6vwf6m5lljbw9nmofb90lbp

JIRA

Released: <Solr Version>

Please keep the discussion on the mailing list rather than commenting on the wiki (wiki discussions get unwieldy fast). Confluence supports inline comments that can also be used.

Motivation

Solr's omnipresent NamedList (not SimpleOrderedMap) holds the project back from better JSON compatibility and from using the standard Java Map abstraction.  There are two reasons:

(A) In response structures, switching a NamedList (even a SimpleOrderedMap) to a MapWriter or Map breaks compatibility with SolrJ "javabin" consumers, which distinguish between NamedList and Map.  This compatibility break can be subtle, showing itself only later.  The type change can also cause the JSON response to change.  While that may seem fair/expected, if we minimize NamedList usage then the problem is also minimized.

(B) A NamedList may hold repeated keys, and thus it can't be a Map (or be serialized to JSON) without a conversion that addresses key repetition somehow.  Minimizing NamedList usage minimizes that conversion, which is also faster.  Solr serializes NamedLists to JSON using one of several strategies the user can choose with the "json.nl" parameter, which is awkward.  Ideally, with less NamedList usage, this would be a rare thing so that Solr has a more consistent JSON API that doesn't flip-flop.  Mapping JSON-based configuration to a NamedList is awkward for the same underlying reason: repeated key handling.
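The trade-off in (B) can be illustrated without Solr itself.  The sketch below uses a hypothetical stand-in class, PairList, for an ordered list of name/value pairs; it is not Solr's NamedList and does not reproduce how Solr implements any "json.nl" strategy.  It only shows why a structure with repeated names needs some conversion choice: a flat rendering keeps everything but is not object-shaped, while a plain map rendering silently drops earlier values.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a NamedList: an ordered list of (name, value) pairs. */
class PairList {
  private final List<Map.Entry<String, Object>> pairs = new ArrayList<>();

  /** Appends a pair; the same name may be added more than once. */
  PairList add(String name, Object value) {
    pairs.add(new AbstractMap.SimpleImmutableEntry<>(name, value));
    return this;
  }

  /** Flat rendering: order and repeats survive, but the result is a list, not a map. */
  List<Object> toFlatList() {
    List<Object> flat = new ArrayList<>();
    for (Map.Entry<String, Object> e : pairs) {
      flat.add(e.getKey());
      flat.add(e.getValue());
    }
    return flat;
  }

  /** Map rendering: object-shaped, but a repeated name keeps only its last value. */
  Map<String, Object> toMapLastWins() {
    Map<String, Object> map = new LinkedHashMap<>();
    for (Map.Entry<String, Object> e : pairs) {
      map.put(e.getKey(), e.getValue());
    }
    return map;
  }
}
```

With pairs ("a", 1), ("b", 2), ("a", 3), the flat rendering has six elements while the map rendering has two entries, and the value 1 is lost.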

Some important Solr APIs for both configuration and response processing specify a NamedList, and consequently the possibility of repeating keys.  But the actual need for this is rare.  For an API producing a response, it's too easy / obvious to supply a NamedList, even if keys don't repeat.  For configuration, ideally Solr would prevent a likely mistake of repeating a key, assuming Solr is modified to disallow such for root configuration elements.

While there is a special and important subclass of NamedList named "SimpleOrderedMap", all references to NamedList here mean the implementing type NamedList and not SimpleOrderedMap.

Public Interfaces

SolrResponse.getResponse, getResponseHeader, NamedListInitializedPlugin.init, and many more; really it's nearly all public methods with NamedList in the signature.

Several Options; pick one:

(A) keep NamedList but always use SimpleOrderedMap (no repeating keys).  Maybe enforce in some places.  This is the "don't rock the boat" approach.  Zero-ish upgrade pain.

(B) use SimpleOrderedMap.  A small upgrade hassle but otherwise it's business as usual.  No rush to update any code.  Lots of Solr code will experience a one-liner change.  Minor grumble:  IMO the name is clumsy and suddenly will be rather omnipresent.

(C) use Map, with SimpleOrderedMap being a typical but not mandatory implementation.  To help transition, a static utility method on SimpleOrderedMap can cast or convert a Map to a SimpleOrderedMap for code still using a NamedList.  Note that NamedList implements NavigableObject, which has useful utility methods to extract data from nested structures, especially in tests.  Because a Map isn't a NavigableObject, the switch to Map may mean wanting to make SolrResponse implement NavigableObject, or other tweaks to retain succinct calls.

Proposed Changes

(should be done in this order, mostly, and with one Jira per step unless specified otherwise)

SimpleOrderedMap (the subclass of NamedList) shall actually implement Map.  Backported to 9x.

"javabin" encoding shall decode all map data as a SimpleOrderedMap (vs. say a LinkedHashMap), thus allowing "javabin" consumers to compatibly read a SimpleOrderedMap or Map, regardless of whether the calling code casts the result to one or the other.  A SimpleOrderedMap is still a NamedList, so the calling code might cast to that type.  This should be configurable, particularly for Solr 9.9.  If "javabin" is retained even longer term (not yet replaced with CBOR or...), this decision might be reversed as we reduce casts to NamedList/SimpleOrderedMap, instead preferring Map.

NamedList.findRecursive: deprecate and stop using.  It assumes NamedList; doesn't know about Maps.  Obsoleted by NavigableObject.  One Jira & PR.  Could backport to 9x.

Minor: Use Java 17 "sealed classes" to insist the NamedList type hierarchy is exactly what we want; no surprises.
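As a rough illustration of what sealing would look like, here is a minimal sketch with hypothetical stand-in classes (SealedPairList for NamedList, UniqueKeyMap for SimpleOrderedMap).  Which subclasses would actually need to be permitted in Solr is an open question not addressed here.

```java
/** Hypothetical stand-in for NamedList: only the listed subclass may extend it. */
sealed class SealedPairList permits UniqueKeyMap {}

/** Hypothetical stand-in for SimpleOrderedMap; "final" closes the hierarchy. */
final class UniqueKeyMap extends SealedPairList {}

// Any other "class Surprise extends SealedPairList {}" is rejected by the compiler,
// because Surprise is not named in the permits clause.
```

A permitted subclass must itself be declared final, sealed, or non-sealed, so the decision about further extension has to be made explicitly for each type.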

Change NamedList instantiations to SimpleOrderedMap as appropriate; probably many places.  This is Solr 10 only; it can affect backwards compatibility.  One Jira & PR(s); don't combine with anything else.

Disallow direct/obvious NamedList instantiation, requiring an alternative like a factory method or another subclass.  This forces the developer / call-site to determine if they truly need repeating keys.  One Jira/PR.  Useful to add to Solr 9, with deprecations of the current constructors.
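One possible shape for this step is sketched below, using a hypothetical stand-in class (RepeatableKeyList) and a hypothetical factory name (withRepeatingKeys); neither is an existing Solr API.  The idea is only that the constructor is hidden, so the call site has to name its intent.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

/** Hypothetical stand-in for NamedList; not Solr's actual class or API. */
class RepeatableKeyList {
  private final List<String> names = new ArrayList<>();
  private final List<Object> values = new ArrayList<>();

  // Private constructor: "new RepeatableKeyList()" does not compile outside this class.
  private RepeatableKeyList() {}

  /** Hypothetical factory; its name makes the caller state that repeats are intended. */
  static RepeatableKeyList withRepeatingKeys() {
    return new RepeatableKeyList();
  }

  /** Appends a pair; repeated names are allowed. */
  RepeatableKeyList add(String name, Object value) {
    names.add(name);
    values.add(value);
    return this;
  }

  /** Returns every value stored under the given name, in insertion order. */
  List<Object> getAll(String name) {
    List<Object> matches = new ArrayList<>();
    for (int i = 0; i < names.size(); i++) {
      if (Objects.equals(names.get(i), name)) {
        matches.add(values.get(i));
      }
    }
    return matches;
  }
}
```

A call site then reads RepeatableKeyList.withRepeatingKeys().add("fq", "a:1").add("fq", "b:2"), which makes the need for repeated keys visible in review.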

Make the above "public interface" API change.

Solr shall read configuration at a plugin root level (e.g. for PluginInfo from ConfigNode from solrconfig.xml) as a SimpleOrderedMap (not NamedList), with new enforcement of unique keys here (not for SimpleOrderedMap generally).  Plugins that use repeated keys will need to change their configuration strategy, for example by using an array.  This will not change how "lst" in XML Solr configuration is interpreted, as those elements are nested and are not root-level configurables (e.g. not requestHandler or queryParser or...).
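The unique-key enforcement described here could look roughly like the following sketch.  RootConfigReader and toUniqueKeyMap are hypothetical names, and a plain LinkedHashMap stands in for SimpleOrderedMap; how Solr would actually wire this into PluginInfo or ConfigNode is not shown.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of Solr. */
class RootConfigReader {

  /**
   * Copies root-level configuration entries into an insertion-ordered map,
   * failing fast on a repeated key instead of silently keeping one value.
   */
  static Map<String, Object> toUniqueKeyMap(List<? extends Map.Entry<String, ?>> entries) {
    Map<String, Object> result = new LinkedHashMap<>();
    for (Map.Entry<String, ?> e : entries) {
      if (result.containsKey(e.getKey())) {
        throw new IllegalArgumentException("Repeated configuration key: " + e.getKey());
      }
      result.put(e.getKey(), e.getValue());
    }
    return result;
  }
}
```

A plugin that previously repeated a key would instead supply a single key whose value is a list, which passes this check unchanged.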

Compatibility, Deprecation, and Migration Plan

  • Primary changes are in Solr 10.  Some aspects affect plugin compatibility.  Some aspects affect javabin and/or JSON compatibility.

Test Plan

Solr doesn't have backwards-compatibility tests; we should have some.
