[
https://issues.apache.org/jira/browse/SOLR-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris M. Hostetter updated SOLR-18033:
--------------------------------------
Attachment: SOLR-18033.patch
Status: Open (was: Open)
Attaching the patch I've been working up.
90% of this patch is improvements to the existing tests of {{BinaryField}} so they
also cover some new PoC test cases that use binary payloads internally, but modify
them in some way from the external representation...
* StrBinaryField - always uses a custom string representation externally which
it encodes/decodes for the internal binary docValues or stored field
* SwapBytesBinaryField - which uses a binary representation (when possible)
externally but internally "swaps" the order of the bytes
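
As a rough illustration of the two conversions described above (this is not the code from the patch), the sketch below uses hypothetical helper names. It assumes that "swapping" means reversing the byte order, and it uses hex purely as a stand-in for the custom string representation.

```java
import java.util.Arrays;

// Hypothetical stand-ins for the two PoC conversions; NOT the code from the patch.
final class BinaryFieldSketch {

  // SwapBytes-style: the internal form is the external bytes in reverse order.
  // Reversal is an assumed reading of "swaps the order of the bytes".
  static byte[] swapBytes(byte[] in) {
    byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = in[in.length - 1 - i];
    }
    return out;
  }

  // Str-style: decode the external string into internal bytes.
  // Hex is an assumed example encoding, chosen only for illustration.
  static byte[] hexToInternal(String external) {
    if (external.length() % 2 != 0) {
      throw new IllegalArgumentException("odd-length hex string");
    }
    byte[] out = new byte[external.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(external.charAt(2 * i), 16);
      int lo = Character.digit(external.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("not a hex string: " + external);
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return out;
  }

  // Str-style: encode internal bytes back into the external string.
  static String internalToHex(byte[] internal) {
    StringBuilder sb = new StringBuilder(internal.length * 2);
    for (byte b : internal) {
      sb.append(Character.forDigit((b >> 4) & 0xF, 16));
      sb.append(Character.forDigit(b & 0xF, 16));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    byte[] external = {0x01, 0x02, 0x03};
    byte[] internal = swapBytes(external);
    System.out.println(Arrays.toString(internal));                    // [3, 2, 1]
    System.out.println(Arrays.equals(external, swapBytes(internal))); // true
    System.out.println(internalToHex(hexToInternal("cafe01")));       // cafe01
  }
}
```

In both cases the point is that the bytes held internally differ from what a client should see, so any code path that returns the internal bytes untouched gives the wrong answer.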
The problem initially identified in SOLR-17975 is "fixed" in this patch by
modifying {{SolrDocumentFetcher}} to delegate to the {{FieldType.toObject()}}
method before serializing raw binary docvalues (I punted on the general issue of
all docvalues, since focusing on {{BINARY}} docValues seemed less risky for
custom FieldTypes users might have).
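
The shape of that change can be sketched roughly as follows. {{SketchFieldType}} and both fetch methods are hypothetical, simplified stand-ins, not Solr's real {{FieldType}} or {{SolrDocumentFetcher}} signatures; the sketch only contrasts passing raw bytes through with letting the field type decode them.

```java
// Simplified model of the delegation; all names here are hypothetical.
final class DocValuesFetchSketch {

  // Stand-in for a field type's internal-to-external conversion.
  interface SketchFieldType {
    Object toObject(byte[] internalDocValue);
  }

  // Behavior described as the problem: raw internal bytes go straight to the response.
  static Object fetchRaw(byte[] internalDocValue) {
    return internalDocValue;
  }

  // Behavior described as the fix: the field type decides the external value.
  static Object fetchViaFieldType(SketchFieldType type, byte[] internalDocValue) {
    return type.toObject(internalDocValue);
  }

  public static void main(String[] args) {
    // A toy field type that exposes its internal bytes as a UTF-8 string.
    SketchFieldType asString =
        bytes -> new String(bytes, java.nio.charset.StandardCharsets.UTF_8);
    byte[] internal = "abc".getBytes(java.nio.charset.StandardCharsets.UTF_8);

    System.out.println(fetchRaw(internal) instanceof byte[]);  // true
    System.out.println(fetchViaFieldType(asString, internal)); // abc
  }
}
```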
The new problem these tests identify comes up when trying to use {{stored=true}}
instances of {{SwapBytesBinaryField}} with the javabin codec (see the nocommits in
TestBinaryField) – the values always come back as strings.
IIURC this is because {{SwapBytesBinaryField}} isn't in the hardcoded list of
{{DocsStreamer.KNOWN_TYPES}}.
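
A minimal model of that suspected behavior, assuming the check is a lookup of the field type's class in a fixed set with a fallback to the string form. The classes and the {{valueForJavabin}} method below are hypothetical and only mimic the pattern; they are not the actual {{DocsStreamer}} code.

```java
import java.util.Base64;
import java.util.Set;

// Toy model of a hardcoded "known types" check; all names are hypothetical.
final class KnownTypesSketch {

  interface SketchFieldType {
    Object toObject(byte[] stored);   // typed value, e.g. for javabin
    String toExternal(byte[] stored); // older string-only form
  }

  // Stand-in for a built-in binary field type that is on the list.
  static final class PlainBinary implements SketchFieldType {
    public Object toObject(byte[] stored) { return stored; }
    public String toExternal(byte[] stored) {
      return Base64.getEncoder().encodeToString(stored);
    }
  }

  // Stand-in for a custom binary field type that is NOT on the list.
  static final class SwapBytes implements SketchFieldType {
    public Object toObject(byte[] stored) {
      byte[] out = new byte[stored.length];
      for (int i = 0; i < stored.length; i++) {
        out[i] = stored[stored.length - 1 - i];
      }
      return out;
    }
    public String toExternal(byte[] stored) {
      return Base64.getEncoder().encodeToString((byte[]) toObject(stored));
    }
  }

  // Only classes listed here get toObject(); everything else becomes a String.
  static final Set<Class<?>> KNOWN_TYPES = Set.of(PlainBinary.class);

  static Object valueForJavabin(SketchFieldType type, byte[] stored) {
    if (KNOWN_TYPES.contains(type.getClass())) {
      return type.toObject(stored);
    }
    return type.toExternal(stored);
  }

  public static void main(String[] args) {
    byte[] stored = {1, 2, 3};
    System.out.println(valueForJavabin(new PlainBinary(), stored) instanceof byte[]); // true
    System.out.println(valueForJavabin(new SwapBytes(), stored) instanceof String);   // true
  }
}
```

Under this model a custom type never gets the chance to return a typed value, however its {{toObject}} is implemented, which would match the "always come back as strings" symptom.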
> binary based fields can't control their external based representation
> ---------------------------------------------------------------------
>
> Key: SOLR-18033
> URL: https://issues.apache.org/jira/browse/SOLR-18033
> Project: Solr
> Issue Type: Bug
> Reporter: Chris M. Hostetter
> Priority: Major
> Attachments: SOLR-18033.patch
>
>
> This was discovered while working on SOLR-17975:
> {quote}... the main issue i *HAVE* found is that even though FieldType has
> method(s) for converting internal BytesRef's into something type specific
> that can be returned to the client, SolrDocumentFetcher isn't consistently
> using those methods in all useDocValuesAsStored situations ... it's got its
> own special case statement with specialized conditionals that would need to
> be rethought/refactored to ensure the BINARY DocValues for this new field
> type (and any other similar field types) can be reconstituted back into
> _whatever_ external representation _[...the field type wants to use...]_
> {quote}
> While working on a patch with tests for this, I discovered that the same
> problem exists for stored fields due to hardcoded logic in DocsStreamer --
> even though {{FieldType}} has a {{void
> write(TextResponseWriter,...)}} method (for when the response format requires
> String-ification) distinct from the {{Object toObject(IndexableField)}} method
> (designed for use by the javabin codec), {{DocsStreamer}} only uses {{toObject}}
> with a hardcoded list of special classes -- otherwise it uses the very old
> {{String toExternal(IndexableField)}} method.