Re: Single filter instance with different searchers

2010-11-09 Thread Samarendra Pratap
Thanks Erick for your insight. I'd appreciate it if someone could throw more light on it. Thanks. On Tue, Nov 9, 2010 at 11:27 PM, Erick Erickson wrote: > I'm going to have to leave answering that to people with more > familiarity with the underlying code than I have... > > That said, I'd #guess# th

Implementing indexing of Versioned Document Collections

2010-11-09 Thread Alex vB
Hello everybody, for my diploma thesis I would like to implement the paper "Compact Full-Text Indexing of Versioned Document Collections" [1] by Torsten Suel in Lucene. The basic idea is to create a two-level index structure. On the first level a document is identified by document ID with a pos

Re: Single filter instance with different searchers

2010-11-09 Thread Erick Erickson
I'm going to have to leave answering that to people with more familiarity with the underlying code than I have... That said, I'd #guess# that you'll be OK because I'd #guess# that filters are maintained on a per-reader basis and the results are synthesized when combined in a MultiSearcher. But th

Re: Lucene index exchange format?

2010-11-09 Thread Grant Ingersoll
You can do this in trunk right now using the Codec capability. In fact, there is a text version already, but it is likely to be really slow on anything significant. You could likely produce something that is faster but still readable. On Nov 9, 2010, at 5:46 AM, Paul Libbrecht wrote: > hello

Re: high frequent terms in the search result set

2010-11-09 Thread starz10de
Thanks for the answer. My request might be simpler. I will describe it in a basic way: 1- I submit a query 2- I retrieve the matched documents 3- From these matched documents I need to have a list of terms based on their high co-occurrence. Currently I could do this for the whole index but I still
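
For illustration only, the three steps in the message above (query, matched documents, frequent terms within the matches) might be sketched as follows. This is plain Python, not the Lucene API; the function name `top_terms`, the toy "all query terms must appear" matching rule, and the sample documents are assumptions made for the example.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def top_terms(docs, query, k=5):
    """Return the k terms that occur in the most documents matching `query`.

    Hypothetical helper: a document matches when it contains every query
    term (a toy AND query), standing in for a real search.
    """
    query_terms = set(tokenize(query))
    counts = Counter()
    for text in docs:
        terms = set(tokenize(text))
        if query_terms <= terms:
            # Count each term once per matched document and skip the
            # query terms themselves, which co-occur trivially.
            counts.update(terms - query_terms)
    return counts.most_common(k)


# Made-up sample data for the sketch.
docs = [
    "lucene index format",
    "lucene filter searcher",
    "lucene index codec",
    "solr faceting",
]
print(top_terms(docs, "lucene", k=1))
```

The counts here are restricted to the result set rather than computed over the whole index, which is the distinction the message draws.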

Lucene index exchange format?

2010-11-09 Thread Paul Libbrecht
Hello list, more and more I seem to encounter situations where the delivery of a prebuilt Lucene index is desirable. The binary format probably works (experience hints would be welcome) but I fear it would be fragile with versioning (it certainly fails at version-downgrading). Did anyone work