[
https://issues.apache.org/jira/browse/LUCENE-2575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12914802#action_12914802
]
Michael McCandless commented on LUCENE-2575:
--------------------------------------------
{quote}
bq. Can we just have IW allocate a new byte[][] after flush? So then any open
readers can keep using the one they have?
This means the prior byte[]s will still be recycled after all
active previous flush readers are closed?
{quote}
Probably we should stop reusing the byte[] with this change? Then a given
byte[] is reclaimed only once all readers using it have finally been GCd.
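
To make the idea concrete, here is a rough sketch of "new byte[][] after
flush, no recycling". The class and method names (SnapshotBlockPool,
newBlock, snapshot, resetAfterFlush) are made up for illustration and are not
the existing *BlockPool API; the tiny block size is only for the example.

```java
import java.util.Arrays;

// Hypothetical sketch, not Lucene's ByteBlockPool.
final class SnapshotBlockPool {
    private static final int BLOCK_SIZE = 4; // unrealistically small, for illustration

    // The outer array is replaced, never mutated in place, so a reader that
    // captured it keeps a stable view.
    private volatile byte[][] blocks = new byte[0][];

    // Writer side: always allocate a fresh block instead of recycling one.
    synchronized byte[] newBlock() {
        byte[][] next = Arrays.copyOf(blocks, blocks.length + 1);
        next[next.length - 1] = new byte[BLOCK_SIZE];
        blocks = next;
        return next[next.length - 1];
    }

    // Reader side: capture the current outer array once and keep using it.
    byte[][] snapshot() {
        return blocks;
    }

    // After flush: the writer starts over with a new outer array. Old blocks
    // stay reachable only through reader snapshots, so they become garbage
    // once the last such reader is gone.
    synchronized void resetAfterFlush() {
        blocks = new byte[0][];
    }
}
```

The cost is more garbage for the collector, in exchange for not having to
track which readers still reference which byte[].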
{quote}
bq. it's possible single level skipping, with a larger skip interval, is fine
for even large RAM buffers.
True, I'll implement a default of one level, and a default
large-ish skip interval.
{quote}
Well, I was thinking only implement the single-level skip case (since it ought
to be a lot simpler than the MultiLevelSkipListWriter/Reader)....
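
As a rough sketch of what single-level skipping with a fixed interval could
look like over an in-memory docID array: the class below is hypothetical
(SingleLevelSkipList is not an existing Lucene class), and it assumes the
postings are already a sorted int[] rather than encoded in byte blocks.

```java
// Hypothetical sketch of a single-level skip list over sorted docIDs.
final class SingleLevelSkipList {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    private final int[] docs;      // sorted docIDs
    private final int interval;    // one skip entry per `interval` docs
    private final int[] skipDocs;  // skipDocs[i] == docs[i * interval]
    private int pos = -1;          // index of the current doc, -1 before the first

    SingleLevelSkipList(int[] sortedDocs, int interval) {
        this.docs = sortedDocs;
        this.interval = interval;
        int n = (sortedDocs.length + interval - 1) / interval;
        this.skipDocs = new int[n];
        for (int i = 0; i < n; i++) {
            skipDocs[i] = sortedDocs[i * interval];
        }
    }

    // Moves to the first doc beyond the current one whose ID is >= target.
    int advance(int target) {
        int start = pos + 1;
        int block = start / interval;
        // Walk the single skip level while the next block still starts at or
        // before the target.
        while (block + 1 < skipDocs.length && skipDocs[block + 1] <= target) {
            block++;
        }
        // Jump to the chosen block (never backwards), then scan linearly.
        int p = Math.max(start, block * interval);
        while (p < docs.length && docs[p] < target) {
            p++;
        }
        pos = Math.min(p, docs.length);
        return p < docs.length ? docs[p] : NO_MORE_DOCS;
    }
}
```

With one level, the work per advance is a walk over skip entries plus at most
one interval of linear scanning, which is why the choice of interval matters
more here than in the multi-level case.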
{quote}
How many scorers, or how often is skipping used? It's mostly for
disjunction queries?
{quote}
Actually, conjunction (AND) queries, and also PhraseQuery (which is really an
AND query followed by positions checking). One thing to remember is that
skipping is *costly* (especially the first time you use it) -- I think we
over-use it today, i.e., in many cases we should do a spin loop (.next())
instead, if your target "is not that far away". PhraseQuery (the exact case)
has a heuristic to do this, but really this ought to be implemented in the
codec.
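
A minimal sketch of the "spin first, skip only if that fails" idea follows.
Everything in it is an assumption for illustration: the Postings interface,
the maxSpins cutoff, and the array-backed ArrayPostings (whose skipTo uses a
binary search as a stand-in for real skip data) are not Lucene APIs.

```java
import java.util.Arrays;

// Hypothetical postings abstraction: next() is cheap, skipTo() is costly.
interface Postings {
    int NO_MORE_DOCS = Integer.MAX_VALUE;
    int next();
    int skipTo(int target);
}

final class SpinThenSkip {
    // Try up to maxSpins calls to next(); fall back to skipTo() only when the
    // target was not reached that way.
    static int advance(Postings p, int currentDoc, int target, int maxSpins) {
        int doc = currentDoc;
        for (int i = 0; i < maxSpins && doc < target; i++) {
            doc = p.next();
        }
        if (doc < target) {
            doc = p.skipTo(target);
        }
        return doc;
    }
}

// Array-backed stand-in that counts how often each path is taken.
final class ArrayPostings implements Postings {
    private final int[] docs;
    private int pos = -1;
    int nextCalls;
    int skipCalls;

    ArrayPostings(int... sortedDocs) {
        this.docs = sortedDocs;
    }

    public int next() {
        nextCalls++;
        pos++;
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }

    public int skipTo(int target) {
        skipCalls++;
        int from = Math.min(pos + 1, docs.length);
        int idx = Arrays.binarySearch(docs, from, docs.length, target);
        pos = idx >= 0 ? idx : -idx - 1; // insertion point when not found
        return pos < docs.length ? docs[pos] : NO_MORE_DOCS;
    }
}
```

Where the cutoff lives is the open question: a caller such as PhraseQuery can
only guess, whereas the codec knows what a skip actually costs.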
bq. get deletes working in the RT branch,
Do we have a design thought out for this? The challenge is that, because every
doc state now has its own private docID stream, we need a global sequence ID
to track "when" a deletion arrived, so we know whether or not that deletion
applies to each docID, right? (And each added doc must also record the
sequenceID when it was added.)
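
Roughly, the sequence ID scheme could look like the sketch below. The names
(SequencedDeletes, nextSequenceID, deleteApplies) are hypothetical, and it
assumes a single shared counter that both adds and deletes draw from.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: one global counter shared by all doc states.
final class SequencedDeletes {
    private final AtomicLong nextSeqID = new AtomicLong();

    // Called once per added doc and once per arriving delete, from any thread.
    long nextSequenceID() {
        return nextSeqID.incrementAndGet();
    }

    // A delete applies only to docs that were added before it arrived.
    static boolean deleteApplies(long deleteSeqID, long docAddSeqID) {
        return docAddSeqID < deleteSeqID;
    }
}
```

Each doc state would keep the sequence ID next to each private docID, so a
buffered delete can be checked per doc no matter which thread indexed it.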
> Concurrent byte and int block implementations
> ---------------------------------------------
>
> Key: LUCENE-2575
> URL: https://issues.apache.org/jira/browse/LUCENE-2575
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Affects Versions: Realtime Branch
> Reporter: Jason Rutherglen
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2575.patch, LUCENE-2575.patch, LUCENE-2575.patch,
> LUCENE-2575.patch
>
>
> The current *BlockPool implementations aren't quite concurrent.
> We really need something that has a locking flush method, where
> flush is called at the end of adding a document. Once flushed,
> the newly written data would be available to all other reading
> threads (ie, postings etc). I'm not sure I understand the slices
> concept, it seems like it'd be easier to implement a seekable
> random access file like API. One'd seek to a given position,
> then read or write from there. The underlying management of byte
> arrays could then be hidden?
--
This message is automatically generated by JIRA.