Lucene Merge Algorithm, max number of segments

2006-03-06 Thread Dalton, Jeffery
I am just going to wax philosophical for a minute. I am trying to understand Lucene's merging algorithm in depth. Let's say I create an index of 25M web pages on a single machine. While creating this index I am doing both search and indexing / re-indexing at the same time, a bit like Technorat

Unable to optimize index: cannot delete deletable.new

2006-01-30 Thread Dalton, Jeffery
I have a process that runs as a timer task and periodically optimizes my search index. However, I am having difficulties with this process failing: java.io.IOException: Cannot overwrite: C:\04950_04959\deleteable.new at org.apache.lucene.store.FSDirectory.createOutput(FSDirectory

RE: Lucene performance bottlenecks

2005-12-08 Thread Dalton, Jeffery
Andrzej, I think you did a great job elucidating my thoughts as well. I heartily concur with everything you said. Andrzej Bialecki wrote: > Hmm... Please define what "adequate" means. :-) IMHO, > "adequate" is when for any query the response time is well > below 1 second. Otherwise the serv

RE: Can Lucene be Used To Substitute Real Database?

2005-10-25 Thread Dalton, Jeffery
It depends on the application. Depending on the access pattern of your system you might be able to use Lucene. It's been done ;-). If you have very few tables with very simple relationships, it might be an answer -- perhaps not the best one though. If you want to use advanced RDBMS feature

RE: Displaying search context

2005-09-23 Thread Dalton, Jeffery
You mentioned that "it will scale well in the future". Does this imply that it doesn't scale well now? What are the current limitations of the Lucene Highlighter? How does it perform under high query load? This is just a curiosity of mine, but Nutch has a separate Summarizer: net.nutch.sear