[jira] [Commented] (LUCENE-7419) performance bug in tokenstream.end()

Adrien Grand (JIRA) Fri, 19 Aug 2016 01:11:49 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-7419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15427806#comment-15427806
 ]


Adrien Grand commented on LUCENE-7419:
--------------------------------------

So if I understand correctly, the slowdown was caused by the fact that binary 
terms use a different impl for their TermToBytesRefAttribute 
(BytesTermAttributeImpl rather than PackedTokenAttributeImpl), and this confuses 
HotSpot when getAttribute is called (regardless of which attribute is looked 
up)?

Should PackedTokenAttributeImpl only do {{positionIncrement = 0;}} after 
calling {{super.end();}}? The position increment seems to be the only attribute 
that deserves special handling? Or maybe you wanted to have explicit handling 
of all attributes that are wrapped in this über attribute impl?
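
To make the question concrete, here is a minimal sketch of what I have in mind. It does not use the real Lucene classes: {{EndSketch}}, {{SketchAttributeImpl}}, {{SketchTermImpl}} and {{SketchPackedImpl}} are hypothetical stand-ins, and the fields are a simplification. The only assumptions taken from the issue are that {{end()}} defers to {{clear()}} by default and that the position increment is the one value that needs a different end state.

```java
// Hypothetical stand-ins for illustration only; these are NOT the Lucene classes.
class EndSketch {

    /** Stand-in for AttributeImpl: end() defers to clear() unless overridden. */
    static abstract class SketchAttributeImpl {
        abstract void clear();

        void end() {
            clear();
        }
    }

    /** Stand-in for a term attribute impl that only holds the term text. */
    static class SketchTermImpl extends SketchAttributeImpl {
        final StringBuilder term = new StringBuilder();

        @Override
        void clear() {
            term.setLength(0);
        }
    }

    /** Stand-in for a packed impl that also carries position increment and offsets. */
    static class SketchPackedImpl extends SketchTermImpl {
        int positionIncrement = 1;
        int startOffset;
        int endOffset;

        @Override
        void clear() {
            super.clear();
            positionIncrement = 1;
            startOffset = 0;
            endOffset = 0;
        }

        @Override
        void end() {
            // super.end() dispatches to this class's clear(), so every wrapped
            // value is already reset here.
            super.end();
            // Only the position increment needs a different end-of-stream value.
            positionIncrement = 0;
        }
    }

    public static void main(String[] args) {
        SketchPackedImpl att = new SketchPackedImpl();
        att.term.append("foo");
        att.positionIncrement = 3;
        att.startOffset = 4;
        att.endOffset = 7;

        att.end();
        System.out.println("term length after end(): " + att.term.length());
        System.out.println("positionIncrement after end(): " + att.positionIncrement);

        att.clear();
        System.out.println("positionIncrement after clear(): " + att.positionIncrement);
    }
}
```

In this sketch no attribute lookup happens in {{end()}} at all, and the override stays two lines long because the inherited default already resets everything else through {{clear()}}.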

> performance bug in tokenstream.end()
> ------------------------------------
>
>                 Key: LUCENE-7419
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7419
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Robert Muir
>            Priority: Blocker
>             Fix For: master (7.0), 6.2.0
>
>         Attachments: LUCENE-7419.patch
>
>
> TokenStream.end() calls getAttribute(), which is pretty costly to do 
> per-stream.
> It uses its current hack because doing the lookup in the ctor of TokenStream is "too early".
> Instead, we can just add a variant of clear(), called end(), to AttributeImpl. 
> For most attributes it defers to clear(), but for PosIncAtt it can handle the 
> special case.



