RE: Best practices for searcher memory usage?

2010-07-15 Thread Christopher Condit
> [Toke: No frequent updates]
>
> So everything is rebuilt from scratch each time? Or do you mean that you're
> only adding new documents, not changing old ones?
Everything is reindexed from scratch - indexing speed is not essential to us...
> Either way, optimizing to a single 140GB segment is

Re: Index corruption using Lucene 2.4.1 - thread safety issue?

2010-07-15 Thread Frank Geary
Just to conclude this issue, it seems that my theory below was correct. I implemented code at the time of my previous posting to "delete the last IndexReader used whenever we re-create a new RAMDir IW" and we have not seen the ArrayOutOfBounds Exception since. Thus, the lesson here is never to r

Re: Continuously iterate over documents in index

2010-07-15 Thread Max Lynch
Erick, This is what I ended up doing. I initially avoided it because I was storing dates using Solr's date type, which AFAIK isn't usable in Lucene, but I ended up using DateTools to store a Lucene-readable version that seems to work well. Thanks! On Wed, Jul 14, 2010 at 7:59 PM, Erick Erickson

XML results ranking

2010-07-15 Thread Maciej
Hello, I'm a newbie to Lucene and before I start playing with it I would like to know whether it fits my application. I have a collection of XML documents marked up according to a stable XML schema (WSDL definitions). I wonder whether Lucene: (1) provides full-text search over con

London open source search meet-up

2010-07-15 Thread Richard Marr
Hi all, Apologies for the cross-post. We are organising another open source search social evening in London on Wednesday 28 July. As usual the plan is to get together and chat about search technology, from Lucene to Solr, Hadoop, Mahout, Xapian and the like - bringing together people from ac

Re: subset query :query filter or boolean query

2010-07-15 Thread Ian Lea
Loads of questions ... some answers below.
> I have 4 query search fields.
>
> case 1 : if i use one search
> field to make a query filter and then use the query filter to search on
> other 3 fields so as to reduce the searching docs subset.
>
> case 2: i use
> all query parameters using boolean q

RE: Best practices for searcher memory usage?

2010-07-15 Thread Toke Eskildsen
On Wed, 2010-07-14 at 20:28 +0200, Christopher Condit wrote:
[Toke: No frequent updates]
> Correct - in fact there are no updates and no deletions. We index
> everything offline when necessary and just swap the new index in...
So everything is rebuilt from scratch each time? Or do you mean that