[ 
https://issues.apache.org/jira/browse/LUCENE-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12933911#action_12933911
 ] 

Uwe Schindler edited comment on LUCENE-2771 at 11/19/10 1:49 PM:
-----------------------------------------------------------------

Here the relevant comments:

{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but 
populate its normBuffer from the subs. 
in my opinion it seems crazy we are currently creating these big arrays this 
way (yeah there is the hairy code for re-open that re-uses the big merged cache 
for the NRT case, but still).

Maybe i am missing something.

*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms. we need to change 
SegmentMerger to not call norms on the top-level IR but populate its normBuffer 
from the subs. in my opinion it seems crazy we are currently creating these big 
arrays this way (yeah there is the hairy code for re-open that re-uses the big 
merged cache for the NRT case, but still). Maybe i am missing something. 

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms 
and need to be fixed. 
maybe we could add an uncached "MultiNorms" or something at least in src/test 
for convenience,
just to fill the byte arrays so these tests can assertEquals

otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these 
tests.

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms 
and need to be fixed. maybe we could add an uncached "MultiNorms" or something 
at least in src/test for convenience, just to fill the byte arrays so these 
tests can assertEquals otherwise we are going to have to put a lot of 
SlowMultiReaderWrappers in these tests. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. 
For ParallelReader i forced it to require non-composite readers only (e.g. 
SlowMultiReaderWrap them if thats not the case).

TODO: 
- ParallelReader shouldnt need multifields etc anymore 
- there are 5 @Ignore'd ParallelReader-related tests, because of things like 
reopen/isOptimized/isCurrent 
- merge in Uwe's improved SegmentsMerger 
- clean up code. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. For 
ParallelReader i forced it to require non-composite readers only (e.g. 
SlowMultiReaderWrap them if thats not the case). TODO: 
ParallelReader shouldnt need multifields etc anymore 
there are 5 @Ignore'd ParallelReader-related tests, because of things like 
reopen/isOptimized/isCurrent 
merge in Uwe's improved SegmentsMerger 
clean up code. 
{quote}

      was (Author: thetaphi):
    Here the relevant comments:

{quote}
*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms.
we need to change SegmentMerger to not call norms on the top-level IR but 
populate its normBuffer from the subs. 
in my opinion it seems crazy we are currently creating these big arrays this 
way (yeah there is the hairy code for re-open that re-uses the big merged cache 
for the NRT case, but still).

Maybe i am missing something.

*Robert Muir added a comment - 19/Nov/10 04:28 AM*

here is a hack patch for Uwe's idea about the norms. we need to change 
SegmentMerger to not call norms on the top-level IR but populate its normBuffer 
from the subs. in my opinion it seems crazy we are currently creating these big 
arrays this way (yeah there is the hairy code for re-open that re-uses the big 
merged cache for the NRT case, but still). Maybe i am missing something. 

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms 
and need to be fixed. 
maybe we could add an uncached "MultiNorms" or something at least in src/test 
for convenience,
just to fill the byte arrays so these tests can assertEquals

otherwise we are going to have to put a lot of SlowMultiReaderWrappers in these 
tests.

*Robert Muir added a comment - 19/Nov/10 04:58 AM*

here's another hacky update: but still a few tests explicitly check these norms 
and need to be fixed. maybe we could add an uncached "MultiNorms" or something 
at least in src/test for convenience, just to fill the byte arrays so these 
tests can assertEquals otherwise we are going to have to put a lot of 
SlowMultiReaderWrappers in these tests. 

*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*

Here a better patch for the segment merger. We should even apply this if we not 
remove top-level norms! 
It saves lots of memory during merging by using ReaderUtil to go down to 
segment level! Don't wonder about BytesRef, but we need a reference here 
because of the anonymous inner class.

*Uwe Schindler added a comment - 19/Nov/10 07:17 AM*

Here a better patch for the segment merger. We should even apply this if we not 
remove top-level norms! It saves lots of memory during merging by using 
ReaderUtil to go down to segment level! Don't wonder about BytesRef, but we 
need a reference here because of the anonymous inner class. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. 
For ParallelReader i forced it to require non-composite readers only (e.g. 
SlowMultiReaderWrap them if thats not the case).

TODO: 
- ParallelReader shouldnt need multifields etc anymore 
- there are 5 @Ignore'd ParallelReader-related tests, because of things like 
reopen/isOptimized/isCurrent 
- merge in Uwe's improved SegmentsMerger 
- clean up code. 

*Robert Muir added a comment - 19/Nov/10 07:45 AM*

here is an updated patch, with core/contrib/solr tests passing. For 
ParallelReader i forced it to require non-composite readers only (e.g. 
SlowMultiReaderWrap them if thats not the case). TODO: 
ParallelReader shouldnt need multifields etc anymore 
there are 5 @Ignore'd ParallelReader-related tests, because of things like 
reopen/isOptimized/isCurrent 
merge in Uwe's improved SegmentsMerger 
clean up code. 
{quote}
  
> Remove norms() support from non-atomic IndexReaders
> ---------------------------------------------------
>
>                 Key: LUCENE-2771
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2771
>             Project: Lucene - Java
>          Issue Type: Improvement
>            Reporter: Uwe Schindler
>             Fix For: 4.0
>
>
> Spin-off from LUCENE-2769:
> Currently all IndexReaders support norms(), but the core of Lucene never uses 
> it and its even dangerous because of memory usage. We should do the same like 
> with MultiFields and factor it out and throw UOE on non-atomic readers.
> The SlowMultiReaderWrapper can then manage the norms. Also ParallelReader 
> needs to be fixed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to