[ 
https://issues.apache.org/jira/browse/SOLR-18033?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris M. Hostetter updated SOLR-18033:
--------------------------------------
    Attachment: SOLR-18033.patch
        Status: Open  (was: Open)

Attaching the patch i've been working up.

90% of this patch is improvements to the existing tests of {{BinaryField}} to also 
cover some new PoC test cases that use binary payloads internally, but modify them in 
some way from the external representation...
 * StrBinaryField - always uses a custom string representation externally which 
it encodes/decodes for the internal binary docValues or stored field
 * SwapBytesBinaryField - which uses a binary representation (when possible) 
externally but internally "swaps" the order of the bytes
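
For illustration only, the two internal/external mappings described above can be sketched in plain Java without any Solr dependencies. Everything here is an assumption: the class and method names are hypothetical, and "swapping" is modeled as simply reversing the byte order, which may not be what the PoC field types in the patch actually do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical stand-alone sketch; none of these names come from the patch.
final class BinaryEncodingSketch {

  // Assumption: "swapping" means reversing the byte order. Applying it twice
  // returns the original bytes, so one method serves as both encoder and decoder.
  static byte[] swapBytes(byte[] external) {
    byte[] internal = new byte[external.length];
    for (int i = 0; i < external.length; i++) {
      internal[i] = external[external.length - 1 - i];
    }
    return internal;
  }

  // String-style field: external String -> internal UTF-8 bytes.
  static byte[] encodeString(String external) {
    return external.getBytes(StandardCharsets.UTF_8);
  }

  // String-style field: internal UTF-8 bytes -> external String.
  static String decodeString(byte[] internal) {
    return new String(internal, StandardCharsets.UTF_8);
  }

  public static void main(String[] args) {
    byte[] external = {1, 2, 3, 4};
    byte[] internal = swapBytes(external);
    System.out.println(Arrays.toString(internal));                     // [4, 3, 2, 1]
    System.out.println(Arrays.equals(external, swapBytes(internal)));  // true
    System.out.println(decodeString(encodeString("hello")));           // hello
  }
}
```

The point of both mappings is that the bytes stored internally differ from what a client should see, so any code path that returns the raw internal bytes without consulting the field type gives the wrong answer.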

The problem initially identified in SOLR-17975 is "fixed" in this patch by 
modifying {{SolrDocumentFetcher}} to delegate to the {{FieldType.toObject()}} 
method before serializing raw binary docvalues (i punted on the general issue of 
all docvalues, since focusing on {{BINARY}} docValues seemed less risky for 
custom FieldTypes users might have.)

The new problem these tests identify comes up when trying to use {{stored=true}} 
instances of {{SwapBytesBinaryField}} with the javabin codec (see the nocommits in 
TestBinaryField) – the values always come back as strings.

IIURC this is because {{SwapBytesBinaryField}} isn't in the hardcoded list of 
{{DocsStreamer.KNOWN_TYPES}}
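
A toy model of that behavior, assuming the dispatch is an exact-class lookup against a fixed set. The {{Sketch*}} classes below are hypothetical stand-ins written for this example; they are not the real Solr {{FieldType}} or {{DocsStreamer}} classes, and the real method signatures differ. The {{delegating}} method is shown only as a contrast, not as the change made in the patch.

```java
import java.util.Base64;
import java.util.Set;

// Hypothetical stand-in for a field type with two conversion paths.
class SketchFieldType {
  // Rich object form, suitable for a binary-capable response format.
  Object toObject(byte[] stored) { return stored; }
  // Legacy string form (Base64 is an arbitrary choice for this sketch).
  String toExternal(byte[] stored) { return Base64.getEncoder().encodeToString(stored); }
}

class SketchBinaryField extends SketchFieldType {}
class SketchCustomBinaryField extends SketchFieldType {}

final class StreamerSketch {
  // Assumption: only classes listed here get the toObject() path.
  static final Set<Class<?>> KNOWN_TYPES = Set.of(SketchBinaryField.class);

  // Hardcoded dispatch: unlisted field types fall back to the string form.
  static Object hardcoded(SketchFieldType ft, byte[] stored) {
    return KNOWN_TYPES.contains(ft.getClass()) ? ft.toObject(stored) : ft.toExternal(stored);
  }

  // Contrast: always let the field type decide its own external representation.
  static Object delegating(SketchFieldType ft, byte[] stored) {
    return ft.toObject(stored);
  }

  public static void main(String[] args) {
    byte[] stored = {1, 2};
    System.out.println(hardcoded(new SketchBinaryField(), stored) instanceof byte[]);        // true
    System.out.println(hardcoded(new SketchCustomBinaryField(), stored) instanceof String);  // true
    System.out.println(delegating(new SketchCustomBinaryField(), stored) instanceof byte[]); // true
  }
}
```

In this model a custom field type has no way to opt in to the binary path short of being added to the hardcoded set, which matches the symptom of values always coming back as strings.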

> binary based fields can't control their external based representation
> ---------------------------------------------------------------------
>
>                 Key: SOLR-18033
>                 URL: https://issues.apache.org/jira/browse/SOLR-18033
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Chris M. Hostetter
>            Priority: Major
>         Attachments: SOLR-18033.patch
>
>
> This was discovered while working on SOLR-17975:
> {quote}... the main issue i *HAVE* found is that even though FieldType has 
> method(s) for converting internal BytesRef's into something type specific 
> that can be returned to the client, SolrDocumentFetcher isn't consistently 
> using those methods in all useDocValuesAsStored situations ... it's got its 
> own special case statement with specialized conditionals that would need to 
> be rethought/refactored to ensure the BINARY DocValues for this new field 
> type (and any other similar field types) can be reconstituted back into 
> _whatever_ external representation _[...the field type wants to use...]_
> {quote}
> While working on a patch with tests for this, I discovered that the same 
> problem exists for stored fields due to hardcoded logic in DocsStreamer -- 
> even though {{FieldType}} has a {{void write(TextResponseWriter,...)}} 
> method (for when the response format requires String-ification) distinct 
> from the {{Object toObject(IndexableField)}} method (designed for use by the 
> javabin codec), {{DocsStreamer}} only uses {{toObject}} with a hardcoded list 
> of special classes -- otherwise it uses the very old 
> {{String toExternal(IndexableField)}} method


