Re: Directory flushing / commit / openIfChanged

2012-08-10 Thread Michael McCandless
This is a hard use case to do with pure Lucene ... NRTManager (plus NRTCachingDirectory) is the closest you can get, but given the Zipf distribution you'll be flushing/opening a new reader very frequently which leads to low perf. I think you have to have a cache above, which buffers up changes, an
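
[Illustration, not part of the original message] The "cache above, which buffers up changes" idea could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: the class name CountBuffer, its capacity parameter, and the sink callback are all hypothetical and come from neither the thread nor Lucene. The sketch keeps recently touched keys in memory in access order, aggregates their counts there, and hands a key to the sink only when it is evicted, so that frequently hit keys (the head of the Zipf distribution) rarely reach the index. In a real setup the sink would presumably read the stored value and then call something like IndexWriter.updateDocument.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/**
 * Hypothetical write-behind buffer: aggregates per-key counts in memory and
 * passes a key's accumulated delta to a sink only when the key is evicted.
 */
class CountBuffer {
    private final int capacity;
    // Assumption: the sink adds the delta to whatever is already stored,
    // e.g. by re-reading the document and replacing it in the index.
    private final BiConsumer<String, Long> sink;
    // accessOrder = true keeps the least recently touched key first.
    private final LinkedHashMap<String, Long> counts =
            new LinkedHashMap<>(16, 0.75f, true);

    CountBuffer(int capacity, BiConsumer<String, Long> sink) {
        this.capacity = capacity;
        this.sink = sink;
    }

    /** Count one event for the key; may evict the least recently used key. */
    void increment(String key) {
        counts.merge(key, 1L, Long::sum);
        if (counts.size() > capacity) {
            Iterator<Map.Entry<String, Long>> it = counts.entrySet().iterator();
            Map.Entry<String, Long> eldest = it.next();
            it.remove();
            sink.accept(eldest.getKey(), eldest.getValue());
        }
    }

    /** Push every buffered delta to the sink, e.g. before a commit. */
    void flushAll() {
        counts.forEach(sink);
        counts.clear();
    }
}
```

Note that the sink receives deltas, not totals: a key that was evicted and later reappears is flushed a second time, so the sink has to merge the new delta with the value already stored. Anything still in the buffer is lost on a crash unless flushAll() has been called first.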

Re: Directory flushing / commit / openIfChanged

2012-08-10 Thread Harald Kirsch
Maybe I did something wrong, maybe it does indeed not help, but pushing data into Lucene was not any faster than before. I would like to remove my project-specific baggage and try to rephrase my question by means of a simple example. Suppose a Lucene document is used to count events of certain t

Re: Directory flushing / commit / openIfChanged

2012-08-07 Thread Harald Kirsch
Hello Simon, ok, I'll try this out. Just to be sure. I was after a way to update documents before they are even written to disk, but this seems not to be the Lucene way. From what you propose I understand that this approach tries to keep documents from being written up to the time they need to

Re: Directory flushing / commit / openIfChanged

2012-08-06 Thread Simon Willnauer
hey harald, On Mon, Aug 6, 2012 at 1:22 PM, Harald Kirsch wrote: > Hi, > > in my application I have to write tons of small documents to the index, but > with a twist. Many of the documents are actually aggregations of pieces of > information that appear in a data stream, usually close together, b

Directory flushing / commit / openIfChanged

2012-08-06 Thread Harald Kirsch
Hi, in my application I have to write tons of small documents to the index, but with a twist. Many of the documents are actually aggregations of pieces of information that appear in a data stream, usually close together, but nevertheless merged with information for other documents. When info