[
https://issues.apache.org/jira/browse/SOLR-9166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15304236#comment-15304236
]
Erick Erickson commented on SOLR-9166:
--------------------------------------
Right, I'm looking at the code in SortingResponseWriter, so I think that would
handle the issue of sorting etc.; we would just not write the value to the
returned doc at the very last second. I have no clue what the implications are
for, say, the code that processes analytics on the worker nodes. We'll just
have to harden that, I suppose.
As far as a switch is concerned, I'm torn. I think the current behavior is
surprising, and I would rather have the people who need the current behavior do
something special than have someone expecting what I think is correct behavior
do something special. I guess we can argue about "correct", but you get the
idea.
To be consistent it should work the same way for both strings and numerics.
So even though it would be a change in behavior for current users, I'd propose
two properties. The default behavior if neither is specified would be to not
return anything in the tuple for absent strings or absent numerics.
returnZeroForMissingNumerics=true would do what happens now: zero gets returned
for missing numeric fields. Otherwise the field is not returned in the tuple
(default false).
returnEmptyForMissingStrings=true would return the field with "" rather than
not returning the field in the tuple (default false).
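To make the proposed semantics concrete, here is a rough sketch in Python (not Solr's Java, and not the actual export handler code). The function build_tuple, its snake_case parameters, and the sample docs are all made up for illustration; only the two property names and their defaults come from the proposal above. It also shows the over-counted zero bucket from the issue description.

```python
from collections import Counter

def build_tuple(doc, numeric_fields, string_fields,
                return_zero_for_missing_numerics=False,
                return_empty_for_missing_strings=False):
    """Hypothetical helper: build an export tuple (dict) from a document (dict)."""
    out = {}
    for name in numeric_fields:
        if name in doc:
            out[name] = doc[name]
        elif return_zero_for_missing_numerics:
            out[name] = 0   # what happens now: a missing numeric comes back as zero
        # otherwise the field is left out of the tuple entirely
    for name in string_fields:
        if name in doc:
            out[name] = doc[name]
        elif return_empty_for_missing_strings:
            out[name] = ""  # opt-in: a missing string comes back as ""
    return out

def facet_counts(tuples, field):
    """Client-side facet: count values of `field`, skipping tuples that lack it."""
    return Counter(t[field] for t in tuples if field in t)

# Made-up sample data: doc 1 really has price 0, doc 2 has no price at all.
docs = [{"id": "1", "price": 0}, {"id": "2"}, {"id": "3", "price": 5}]

legacy = [build_tuple(d, ["price"], ["id"],
                      return_zero_for_missing_numerics=True) for d in docs]
proposed = [build_tuple(d, ["price"], ["id"]) for d in docs]

print(facet_counts(legacy, "price"))    # zero bucket counts docs 1 and 2
print(facet_counts(proposed, "price"))  # zero bucket counts only doc 1
```

With the flag set, the client cannot tell doc 2 from doc 1, so the zero bucket is over-counted. With the proposed default, doc 2 simply has no price key and the client can decide what to do with it.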
BTW, I assigned this to myself so as not to lose track of it; if anyone wants
to jump in, feel free.
> Export handler returns zero for numeric fields that are not in the
> original doc
> --------------------------------------------------------------------------------------
>
> Key: SOLR-9166
> URL: https://issues.apache.org/jira/browse/SOLR-9166
> Project: Solr
> Issue Type: Bug
> Reporter: Erick Erickson
> Assignee: Erick Erickson
>
> From the dev list discussion:
> My original post.
> Zero is different from not
> existing. And let's claim that I want to process a stream and, say,
> facet on an integer field over the result set. There's no way on the
> client side to distinguish between a document that has a zero in the
> field and one that didn't have the field in the first place so I'll
> over-count the zero bucket.
> From Dennis Gove:
> Is this true for non-numeric fields as well? I agree that this seems like a
> very bad thing.
> I can't imagine that a fix would cause a problem with Streaming Expressions,
> ParallelSQL, or others, given that the /select handler is not returning 0 for
> these missing fields (the /select handler is the default handler for the
> Streaming API so if nulls were a problem I imagine we'd have already seen
> it).
> That said, within Streaming Expressions there is a select(...) function which
> supports a replace(...) operation which allows you to replace one value (or
> null) with some other value. If a 0 were necessary one could use a
> select(...) to replace null with 0 using an expression like this
> select(<stream>, replace(fieldA, null, withValue=0)).
> The end result of that would be that the field fieldA would never have a null
> value and for all tuples where a null value existed it would be replaced with
> 0.
> Details on the select function can be found at
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61330338#StreamingExpressions-select.
> And to answer Dennis's question, null gets returned for string DocValues fields.