[ https://issues.apache.org/jira/browse/LUCENE-6199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14292717#comment-14292717 ]

Robert Muir commented on LUCENE-6199:
-------------------------------------

All I want is a reasonable tradeoff. If the code can be rearranged so that
this stuff is clear and isn't causing bugs for non-abuse cases, then I think
it's OK. But the current patch seems to go much too far?

I don't fully understand the blocktree changes without spending more time on
whether the right risk/reward tradeoffs (where the reward is an abuse case)
are being made. Maybe it's fine, but could there be sneaky reuse bugs?
Also, why did we lose the node/arc counts in stats? Was that by accident?

The FISReader attributes caching stuff seems extraordinarily risky. Are you
sure it is really OK for the attributes to suddenly become null where they
were not before? Why not propose caching the attributes in SIReader too, for
abuse cases where people have too many segments?
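
To make the concern concrete, here is a rough sketch of the kind of consumer
code that breaks if a previously non-null attributes map can suddenly come
back null; the class, method, and attribute key below are hypothetical, not
the patch or Lucene's actual API:

    import java.util.Map;

    // Hypothetical consumer of per-field attributes. If the reader starts
    // returning null where it used to return an (empty) map, the first
    // method throws NullPointerException even for non-abuse callers.
    class AttributeConsumer {
      String pickMode(Map<String, String> attributes) {
        // Old assumed contract: attributes is never null.
        return attributes.getOrDefault("mode", "DEFAULT");
      }

      String pickModeDefensively(Map<String, String> attributes) {
        // Defensive version every caller would need if null becomes legal.
        if (attributes == null) {
          return "DEFAULT";
        }
        return attributes.getOrDefault("mode", "DEFAULT");
      }
    }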

I think adding Accountable to FieldInfos will ultimately be very invasive no
matter how you do it. And in most cases it is pretty ridiculous, right?
Shouldn't we instead think about non-abuse cases, like adding LiveDocs to this
RAM computation, before making FieldInfos more complex?

Honestly, if the right tests are in place, I would be a lot less upset about
it. But I don't like the idea of introducing bugs that hurt real use cases,
caused by the complexity of optimizing for abuse cases. Do you agree or
disagree that these changes are really scary?

> Reduce per-field heap usage for indexed fields
> ----------------------------------------------
>
>                 Key: LUCENE-6199
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6199
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>             Fix For: Trunk, 5.1
>
>         Attachments: LUCENE-6199.patch
>
>
> Lucene uses a non-trivial amount of baseline heap for each indexed
> field, and I know it's abusive for an app to create 100K indexed
> fields, but I still think we can and should make some effort to reduce
> heap usage per unique field?
> E.g., in block tree we store 3 BytesRefs per field, when 3 byte[]s
> would do...
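
Purely to illustrate the overhead the quoted description points at: a hedged
sketch of the BytesRef-wrapper-vs-raw-byte[] difference. The class and field
names (BytesRefLike, FieldMetaCompact, minTerm, maxTerm, rootCode) are
assumptions for this example, not the actual blocktree code:

    import java.util.Arrays;

    // A BytesRef-style wrapper: payload plus offset/length bookkeeping,
    // costing an extra object per value held.
    class BytesRefLike {
      final byte[] bytes;
      final int offset;
      final int length;

      BytesRefLike(byte[] bytes) {
        this.bytes = bytes;
        this.offset = 0;
        this.length = bytes.length;
      }
    }

    // Holding the three small per-field values as exact-length byte[]s
    // drops the wrapper object (and any slack in its backing array).
    class FieldMetaCompact {
      final byte[] minTerm;
      final byte[] maxTerm;
      final byte[] rootCode;

      FieldMetaCompact(byte[] minTerm, byte[] maxTerm, byte[] rootCode) {
        this.minTerm = Arrays.copyOf(minTerm, minTerm.length);
        this.maxTerm = Arrays.copyOf(maxTerm, maxTerm.length);
        this.rootCode = Arrays.copyOf(rootCode, rootCode.length);
      }
    }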


