[
https://issues.apache.org/jira/browse/LUCENE-2312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922597#action_12922597
]
Jason Rutherglen commented on LUCENE-2312:
------------------------------------------
FreqProxTermsWriterPerField writes the prox posting data as terms are seen.
The freq data is handled differently: to record the doc freq accurately, an
entry is written to the freq stream only after we have moved on to a later doc
and see a previously analyzed term again. Until then, the pending entry lives
in the last doc code array of the posting array. Because that array should not
be copied per reader, we need to flush the freq info for every term seen in a
doc as soon as that doc is finished. This way, on a reader-instigated flush,
the reader can always read all necessary posting data from the byte slices and
does not have to rely partially on the posting array. I don't think this will
affect indexing performance.
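To make the proposal concrete, here is a minimal sketch of flushing the freq
info at the end of each doc. It is a toy model, not Lucene code: the class
EagerFreqFlushSketch, its methods, and the use of a List<Integer> in place of
a byte slice are all assumptions made for illustration. The doc code layout
(doc delta shifted left by one, low bit set when the freq is 1) is only
loosely modeled on the existing freq stream format.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model only; none of these names exist in Lucene. */
class EagerFreqFlushSketch {

    /** Per-term state, standing in for one row of the posting array. */
    static final class Posting {
        int lastDocId = 0;    // last doc whose entry reached the freq stream
        int pendingDocId = 0; // doc whose freq is still being counted
        int pendingFreq = 0;  // occurrences of the term in the pending doc
        // Stands in for the term's freq byte slice (hypothetical simplification).
        final List<Integer> freqStream = new ArrayList<>();
    }

    private final Map<String, Posting> postings = new HashMap<>();
    private final Set<String> termsInCurrentDoc = new LinkedHashSet<>();
    private int currentDocId = -1;

    void startDocument(int docId) {
        currentDocId = docId;
    }

    /** Counts one occurrence; nothing is written to the freq stream yet. */
    void addTerm(String term) {
        Posting p = postings.computeIfAbsent(term, t -> new Posting());
        p.pendingDocId = currentDocId;
        p.pendingFreq++;
        termsInCurrentDoc.add(term);
    }

    /** Flushes the freq entry of every term seen in the doc that just ended. */
    void finishDocument() {
        for (String term : termsInCurrentDoc) {
            Posting p = postings.get(term);
            int code = (p.pendingDocId - p.lastDocId) << 1;
            if (p.pendingFreq == 1) {
                p.freqStream.add(code | 1); // low bit set means freq == 1
            } else {
                p.freqStream.add(code);
                p.freqStream.add(p.pendingFreq);
            }
            p.lastDocId = p.pendingDocId;
            p.pendingFreq = 0;
        }
        termsInCurrentDoc.clear();
    }

    /** What a reader would see: only entries for finished docs. */
    List<Integer> freqStream(String term) {
        Posting p = postings.get(term);
        return p == null ? new ArrayList<>() : new ArrayList<>(p.freqStream);
    }
}
```

In this sketch a reader that snapshots freqStream(term) between docs never
needs the pending fields, which is the property the comment above is after.
The cost is one extra pass over the terms of each doc in finishDocument().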
> Search on IndexWriter's RAM Buffer
> ----------------------------------
>
> Key: LUCENE-2312
> URL: https://issues.apache.org/jira/browse/LUCENE-2312
> Project: Lucene - Java
> Issue Type: New Feature
> Components: Search
> Affects Versions: Realtime Branch
> Reporter: Jason Rutherglen
> Assignee: Michael Busch
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2312.patch
>
>
> In order to offer users near realtime search, without incurring
> an indexing performance penalty, we can implement search on
> IndexWriter's RAM buffer. This is the buffer that is filled in
> RAM as documents are indexed. Currently the RAM buffer is
> flushed to the underlying directory (usually disk) before being
> made searchable.
> Today's Lucene-based NRT systems must incur the cost of merging
> segments, which can slow indexing.
> Michael Busch has good suggestions regarding how to handle deletes using max
> doc ids.
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841923&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841923
> The area that isn't fully fleshed out is the terms dictionary,
> which needs to be sorted prior to queries executing. Currently
> IW implements a specialized hash table. Michael B has a
> suggestion here:
> https://issues.apache.org/jira/browse/LUCENE-2293?focusedCommentId=12841915&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12841915