[jira] Commented: (LUCENE-2831) Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context

Michael McCandless (JIRA) Thu, 30 Dec 2010 03:10:15 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12976010#action_12976010
 ]


Michael McCandless commented on LUCENE-2831:
--------------------------------------------

bq. The same IR could be a sub-reader in a different context though.

Right, but that's perfectly fine w/ the proposed API.

Ie, any IR is able to be a top reader if you ask it to.  So you can
call IR.getTopReaderInfo(), and this will fill in the full tree as
seen by that top reader *only*.  Meaning, the resulting tree is
"private" to that reader.  If you go and ask another reader for its
tree, a new tree (private to that reader's "view" (parent/child
relationships)) is computed.
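
To make the "private tree" idea concrete, here is a minimal sketch. It
assumes hypothetical stand-in classes (SketchReader, SketchReaderInfo)
rather than Lucene's real IndexReader, and the shape of the info node
(reader, parent, ord, children) is only a guess at what the proposed
struct might carry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for IndexReader (IR); not Lucene's real class.
class SketchReader {
    final String name;
    final List<SketchReader> subs = new ArrayList<>();

    SketchReader(String name, SketchReader... subs) {
        this.name = name;
        Collections.addAll(this.subs, subs);
    }

    // Treats this reader as the top reader and builds a fresh tree
    // on every call; nothing is stored on the sub-readers themselves.
    SketchReaderInfo getTopReaderInfo() {
        return build(this, null, 0);
    }

    private static SketchReaderInfo build(SketchReader reader,
                                          SketchReaderInfo parent,
                                          int ord) {
        SketchReaderInfo info = new SketchReaderInfo(reader, parent, ord);
        for (int i = 0; i < reader.subs.size(); i++) {
            info.children.add(build(reader.subs.get(i), info, i));
        }
        return info;
    }
}

// Hypothetical tree node: one reader as seen from one particular top reader.
class SketchReaderInfo {
    final SketchReader reader;
    final SketchReaderInfo parent; // null for the top reader
    final int ord;                 // position among the parent's subs
    final List<SketchReaderInfo> children = new ArrayList<>();

    SketchReaderInfo(SketchReader reader, SketchReaderInfo parent, int ord) {
        this.reader = reader;
        this.parent = parent;
        this.ord = ord;
    }
}
```

In this sketch a sub-reader shared by two top readers shows up as two
distinct info nodes, each with its own parent and ord, which is the
sense in which each tree is private to the reader that produced it.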

The API is fully general: it allows for readers that are shared by
more than one top reader.

Really (as Yonik said before) this is a question of caching.  It's
gonna be the same tree, whether it's stored on IR or IS.  Yet, the
tree in no way depends on IS -- it's entirely a function of the
relationships b/w IR and its subs.  An app should be able to store the
top IR, and pull the tree from it on demand.  And init'ing an IS from
an IR should continue to be cheap.

Also, looking towards the future... at some point we will split apart
"composite reader" and "atomic reader".  Ie, these really should be
separate classes, but today in Lucene they are one and the same and so
we have "dynamic typing" (you hit exceptions at runtime) instead of
static typing (compilation errors).  At some point we have to fix this
and make these two IRs separate classes.
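
A rough sketch of what such a split could look like, purely for
illustration: the class and method names below (BaseSketchReader,
AtomicSketchReader, CompositeSketchReader, segmentName, leaves) are
hypothetical and are not Lucene's API. The point is only that
per-segment operations would exist on the atomic type alone, so calling
one on a composite reader fails at compile time instead of at runtime.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical common base: only what both kinds of reader support.
abstract class BaseSketchReader {
    abstract int maxDoc();
}

// Hypothetical atomic (single-segment) reader.
final class AtomicSketchReader extends BaseSketchReader {
    private final String segmentName;
    private final int maxDoc;

    AtomicSketchReader(String segmentName, int maxDoc) {
        this.segmentName = segmentName;
        this.maxDoc = maxDoc;
    }

    @Override
    int maxDoc() {
        return maxDoc;
    }

    // Per-segment operation: declared only on the atomic type.
    String segmentName() {
        return segmentName;
    }
}

// Hypothetical composite reader: it only aggregates atomic readers.
final class CompositeSketchReader extends BaseSketchReader {
    private final List<AtomicSketchReader> leaves;

    CompositeSketchReader(AtomicSketchReader... leaves) {
        this.leaves = Collections.unmodifiableList(Arrays.asList(leaves));
    }

    @Override
    int maxDoc() {
        int total = 0;
        for (AtomicSketchReader leaf : leaves) {
            total += leaf.maxDoc();
        }
        return total;
    }

    List<AtomicSketchReader> leaves() {
        return leaves;
    }
}
```

With this shape, `composite.segmentName()` simply does not compile,
whereas a single merged class would have to throw at runtime.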

But to say that this API should be moved to IS instead of IR because
of this dynamic/static typing problem is a step in the wrong direction
-- ie once we split out these two IR classes we'll want to move the
getTopReaderInfo API back to the "composite" IR.

BTW we will have to somehow fix CachingWrapperFilter to behave
properly here.  Ie, it needs to know whether the cache key is purely
the sub-reader (as it is, always, today), or the combo of root reader
+ sub (as it is for eg DuplicateFilter).
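
One way to picture the two kinds of cache key, as a sketch only:
FilterCacheKey below is a hypothetical helper, not part of
CachingWrapperFilter, and readers are represented as plain Objects
compared by identity. How a filter would declare which kind of key it
needs is left open here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical cache key: either the sub-reader alone, or root + sub.
final class FilterCacheKey {
    private final Object root; // null when the result depends on the sub only
    private final Object sub;

    private FilterCacheKey(Object root, Object sub) {
        this.root = root;
        this.sub = sub;
    }

    // Key for filters whose result is purely a function of the sub-reader.
    static FilterCacheKey perSegment(Object sub) {
        return new FilterCacheKey(null, sub);
    }

    // Key for filters whose result also depends on the root reader.
    static FilterCacheKey perRootAndSegment(Object root, Object sub) {
        return new FilterCacheKey(root, sub);
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof FilterCacheKey)) {
            return false;
        }
        FilterCacheKey that = (FilterCacheKey) other;
        // Readers are compared by identity, not by equals().
        return this.root == that.root && this.sub == that.sub;
    }

    @Override
    public int hashCode() {
        // identityHashCode(null) is 0, so per-segment keys hash on sub alone.
        return 31 * System.identityHashCode(root) + System.identityHashCode(sub);
    }
}

// Minimal usage: the same segment under two roots yields two distinct entries.
class FilterCacheKeyDemo {
    public static void main(String[] args) {
        Object seg = new Object();
        Object rootA = new Object();
        Object rootB = new Object();
        Map<FilterCacheKey, String> cache = new HashMap<>();
        cache.put(FilterCacheKey.perRootAndSegment(rootA, seg), "bits for A");
        cache.put(FilterCacheKey.perRootAndSegment(rootB, seg), "bits for B");
        System.out.println(cache.size());
    }
}
```

A per-segment key lets the cached bits be reused when the same segment
appears under a different root; a root + sub key gives that reuse up in
exchange for correctness when the result depends on the whole index.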



> Revise Weight#scorer & Filter#getDocIdSet API to pass Readers context
> ---------------------------------------------------------------------
>
>                 Key: LUCENE-2831
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2831
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>    Affects Versions: 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-2831.patch
>
>
> Spinoff from LUCENE-2694 - instead of passing a reader into Weight#scorer(IR, 
> boolean, boolean) we should / could revise the API and pass in a struct that 
> has parent reader, sub reader, ord of that sub. The ord mapping plus the 
> context with its parent would make several issues way easier. See 
> LUCENE-2694, LUCENE-2348 and LUCENE-2829 to name some.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
