[ 
https://issues.apache.org/jira/browse/LUCENE-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882321#action_12882321
 ] 

Uwe Schindler commented on LUCENE-2514:
---------------------------------------

bq. For example, currently the priority queue in TopTerms does BytesRef -> 
String conversion and creates a new Term for each add, but this might be 
entirely useless as it could fall off the pq, so i think its ScoreTerm or 
whatever should not hold term at all but just bytesref

Exactly! We removed support for TermEnum (without s), so the field name is never 
null. You can always take the field from the MTQ when building TermQueries, and 
for that we create the Term using new Term(field, BytesRef) or the 
non-interning placeholder (see also below). This makes MTQ much simpler; I 
have started working on it...
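
To illustrate the idea of keeping only bytes in the queue, here is a minimal sketch in plain Java. It does not use Lucene: ScoredBytes, topTerms and the "field:text" output format are hypothetical stand-ins for ScoreTerm, the TopTerms rewrite and the final Term, and a byte[] stands in for BytesRef. The point is only that the bytes-to-String conversion happens for the entries that survive, not on every add.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

public class TopTermsSketch {
    /** Hypothetical queue entry: raw term bytes plus a score, no String and no Term. */
    static final class ScoredBytes {
        final byte[] bytes;
        final float score;

        ScoredBytes(byte[] bytes, float score) {
            this.bytes = bytes;
            this.score = score;
        }
    }

    /**
     * Keeps the n highest-scoring candidates and returns them as "field:text"
     * strings in ascending score order. The field comes from the caller (as it
     * would from the query), and strings are built only for the survivors.
     */
    static List<String> topTerms(String field, List<ScoredBytes> candidates, int n) {
        // Min-heap on score: the head is always the weakest entry.
        PriorityQueue<ScoredBytes> pq =
            new PriorityQueue<>((x, y) -> Float.compare(x.score, y.score));
        for (ScoredBytes c : candidates) {
            pq.add(c);          // no decoding here, just the bytes
            if (pq.size() > n) {
                pq.poll();      // the weakest entry falls off without ever being decoded
            }
        }
        List<String> out = new ArrayList<>();
        while (!pq.isEmpty()) {
            ScoredBytes s = pq.poll();
            // Decode only now, once per surviving entry.
            out.add(field + ":" + new String(s.bytes, StandardCharsets.UTF_8));
        }
        return out;
    }

    public static void main(String[] args) {
        List<ScoredBytes> in = Arrays.asList(
            new ScoredBytes("apple".getBytes(StandardCharsets.UTF_8), 1.0f),
            new ScoredBytes("banana".getBytes(StandardCharsets.UTF_8), 3.0f),
            new ScoredBytes("cherry".getBytes(StandardCharsets.UTF_8), 2.0f));
        System.out.println(topTerms("body", in, 2));
    }
}
```

In the real code the last step would build the Term from the MTQ's field and the bytes, as described above.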

By the way: could we now remove all String interning for field names? We no 
longer compare fields, do we?

> Change Term to use bytes
> ------------------------
>
>                 Key: LUCENE-2514
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2514
>             Project: Lucene - Java
>          Issue Type: Task
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Robert Muir
>         Attachments: LUCENE-2514-surrogates-dance.patch, LUCENE-2514.patch, 
> LUCENE-2514.patch, LUCENE-2514.patch, LUCENE-2514.patch
>
>
> in LUCENE-2426, the sort order was changed to codepoint order.
> unfortunately, Term is still using string internally, and more importantly 
> its compareTo() uses the wrong order [utf-16].
> So MultiTermQuery, etc (especially its priority queues) are currently wrong.
> By changing Term to use bytes, we can also support terms encoded as bytes 
> such as numerics, instead of using strange string encodings.
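
The ordering mismatch described in the issue can be reproduced with plain Java, without Lucene. This is a small sketch, not code from the patch: TermOrderDemo and its helper methods are hypothetical names. It compares a BMP character (U+FFFD) with a supplementary character (U+10000) once with String.compareTo, which compares UTF-16 code units, and once by unsigned UTF-8 bytes, which matches code point order.

```java
import java.nio.charset.StandardCharsets;

public class TermOrderDemo {
    /** Lexicographic comparison of two byte arrays, treating each byte as unsigned. */
    static int compareUnsignedBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xFF) - (b[i] & 0xFF);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    /** Compares two strings by their UTF-8 encoding, i.e. in code point order. */
    static int compareUtf8(String a, String b) {
        return compareUnsignedBytes(
            a.getBytes(StandardCharsets.UTF_8),
            b.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        String bmp = "\uFFFD";                 // U+FFFD, a single UTF-16 code unit
        String supplementary = "\uD800\uDC00"; // U+10000, a surrogate pair in UTF-16

        // UTF-16 order: 0xFFFD > 0xD800, so the BMP character sorts after U+10000.
        System.out.println("UTF-16 order:     " + Integer.signum(bmp.compareTo(supplementary)));
        // UTF-8 order: EF BF BD < F0 90 80 80, so the BMP character sorts first.
        System.out.println("UTF-8 byte order: " + Integer.signum(compareUtf8(bmp, supplementary)));
    }
}
```

The two comparisons disagree in sign for this pair, which is why a String-based compareTo() cannot be mixed with an index sorted in code point order.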
