RE: Flushing Thread

2012-07-19 Thread Simon McDuff
Thank you Simon Willnauer! With your explanation, we've decided to control the flushing by spawning another thread, so the thread is still available to ingest! :-) (Correct me if I'm wrong.) We do so by checking the RAM size provided by Lucene! (Thank you!) By putting the automatic flushing at 1
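
For readers following the thread, here is a minimal sketch of the pattern described above: the ingest thread keeps adding documents, polls the buffered RAM size, and hands the flush to a separate thread once a threshold is crossed. It is not the poster's code. `RamBufferedWriter`, `FakeWriter`, `Ingester`, and the 100-byte threshold are hypothetical names and values invented for illustration, and the writer is a toy stand-in so the example runs without Lucene on the classpath.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public class BackgroundFlushSketch {

    /** Hypothetical stand-in for an index writer; NOT a Lucene type. */
    interface RamBufferedWriter {
        void add(String doc);
        long ramBytesUsed();
        void flush();
    }

    /** Toy writer that only counts bytes, so the sketch runs without Lucene. */
    static final class FakeWriter implements RamBufferedWriter {
        private final AtomicLong buffered = new AtomicLong();
        final AtomicLong flushedBytes = new AtomicLong();

        public void add(String doc) { buffered.addAndGet(doc.length()); }
        public long ramBytesUsed() { return buffered.get(); }
        // getAndSet(0) atomically drains the buffer, so no byte is counted twice.
        public void flush() { flushedBytes.addAndGet(buffered.getAndSet(0)); }
    }

    /** Adds documents on the caller's thread and flushes on a dedicated thread. */
    static final class Ingester {
        private final RamBufferedWriter writer;
        private final long thresholdBytes;
        private final ExecutorService flusher = Executors.newSingleThreadExecutor();
        private final AtomicBoolean flushPending = new AtomicBoolean(false);

        Ingester(RamBufferedWriter writer, long thresholdBytes) {
            this.writer = writer;
            this.thresholdBytes = thresholdBytes;
        }

        void add(String doc) {
            writer.add(doc);
            // Schedule at most one flush at a time; the ingest thread never waits for it.
            if (writer.ramBytesUsed() >= thresholdBytes
                    && flushPending.compareAndSet(false, true)) {
                flusher.execute(() -> {
                    try {
                        writer.flush();
                    } finally {
                        flushPending.set(false);
                    }
                });
            }
        }

        /** Waits for any scheduled flush, then flushes whatever is still buffered. */
        void close() {
            flusher.shutdown();
            try {
                flusher.awaitTermination(1, TimeUnit.MINUTES);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            writer.flush();
        }
    }

    /** Ingests 1000 ten-character documents and returns the total bytes flushed. */
    static long runDemo() {
        FakeWriter writer = new FakeWriter();
        Ingester ingester = new Ingester(writer, 100);
        for (int i = 0; i < 1000; i++) {
            ingester.add("0123456789");
        }
        ingester.close();
        return writer.flushedBytes.get();
    }

    public static void main(String[] args) {
        System.out.println("flushed bytes: " + runDemo());
    }
}
```

The `AtomicBoolean` guard keeps a fast ingest loop from queuing a flush for every document once the threshold is crossed. How this maps onto a real `IndexWriter` (which call reports buffered RAM, and which call triggers the flush) depends on the Lucene version and is left open here.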

Problem with TermVector offsets and positions not being preserved

2012-07-19 Thread Mike O'Leary
I created an index using Lucene 3.6.0 in which I specified that a certain text field in each document should be indexed, stored, analyzed with no norms, with term vectors, offsets and positions. Later I looked at that index in Luke, and it said that term vectors were created for this field, but

Re: Flushing Thread

2012-07-19 Thread Simon Willnauer
hey, On Thu, Jul 19, 2012 at 7:41 PM, Simon McDuff wrote: > > Thank you for your answer! > > I read all your blogs! It is always interesting! for details see: http://www.searchworkings.org/blog/-/blogs/gimme-all-resources-you-have-i-can-use-them!/ and http://www.searchworkings.org/blog/-/blog

RE: Flushing Thread

2012-07-19 Thread Simon McDuff
Thank you for your answer! I read all your blogs! It is always interesting! My understanding is probably incorrect ... I observed that if you have only one thread that calls addDocument, it will not spawn another thread for flushing; it uses the main thread. In this case, my main thread is locked. Co

Re: Flushing Thread

2012-07-19 Thread Michael McCandless
This has already been fixed in Lucene 4.0 (we now have fully concurrent flushing), e.g. see: http://blog.mikemccandless.com/2011/05/265-indexing-speedup-with-lucenes.html Mike McCandless http://blog.mikemccandless.com On Thu, Jul 19, 2012 at 12:54 PM, Simon McDuff wrote: > > I see some behavio

Flushing Thread

2012-07-19 Thread Simon McDuff
I see some behavior at the moment when I'm flushing and would like to know if I can change that. One main thread is inserting; when it flushes, it blocks. During that time my main thread is blocking. Instead of blocking, could it spawn another thread to do that? Basically, would like to h

RE: Lucene 4.0 .FDT

2012-07-19 Thread Simon McDuff
Thank you for your answer. In our case, in 983 seconds of processing, the sizes of these files are: - *.fdt : 366 Megs - *.fdx : 2898 Megs It is kind of useless to write more than 3 gigs for nothing... We already modified Lucene40StoredFieldsWriter to fix our problems. I hope we could do som

Re: Lucene 4.0 .FDT

2012-07-19 Thread Andrzej Bialecki
On 19/07/2012 14:26, Simon McDuff wrote: I'm using Lucene 4.0. I'm inserting around 300 000 documents / second. We do not have any stored fields. But we noticed that .fdt gets populated even so. .fdx contains useless information. .fdt contains only zerouseless... Is there a way to minimi

Lucene 4.0 .FDT

2012-07-19 Thread Simon McDuff
I'm using Lucene 4.0. I'm inserting around 300 000 documents / second. We do not have any stored fields. But we noticed that .fdt gets populated even so. .fdx contains useless information. .fdt contains only zerouseless... Is there a way to minimize the impact? Thank you Simon

Re: RAM or SSD...

2012-07-19 Thread Dawid Weiss
Read this: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html Dawid On Thu, Jul 19, 2012 at 1:32 PM, Dragon Fly wrote: > > The slowest part of my application is to read the search hits from disk. I > was hoping that using an SSD or RAMDirectory/MMapDirectory would speed th

RE: RAM or SSD...

2012-07-19 Thread Dragon Fly
The slowest part of my application is reading the search hits from disk. I was hoping that using an SSD or RAMDirectory/MMapDirectory would speed that up. I read the JavaDoc for MMapDirectory but didn't really understand how that differs from RAMDirectory. Could someone please explain? > Da
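
As a rough illustration of the distinction being asked about, the sketch below contrasts the two underlying mechanisms using only the JDK: copying a file's bytes onto the Java heap (the general idea behind a RAM-resident directory) versus memory-mapping the file so the operating system pages it in on demand (the idea behind MMapDirectory). It does not use Lucene. The class, method, and file names are hypothetical, and it shows the mechanism only, not the performance of either directory implementation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class HeapVsMmapSketch {

    /** Heap approach: every byte is copied into a Java array and counts against -Xmx. */
    static byte[] loadOntoHeap(Path file) throws IOException {
        return Files.readAllBytes(file);
    }

    /**
     * Mmap approach: the file is mapped into virtual address space. Pages are
     * read lazily by the OS and live in its page cache, outside the Java heap.
     */
    static MappedByteBuffer mapReadOnly(Path file) throws IOException {
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            return channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
    }

    /** Writes a small temp file and checks that both views expose the same bytes. */
    static boolean demo() {
        try {
            Path file = Files.createTempFile("segment", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, "term dictionary bytes".getBytes(StandardCharsets.UTF_8));

            byte[] heapCopy = loadOntoHeap(file);
            MappedByteBuffer mapped = mapReadOnly(file);

            if (mapped.remaining() != heapCopy.length) {
                return false;
            }
            for (int i = 0; i < heapCopy.length; i++) {
                if (mapped.get(i) != heapCopy[i]) {
                    return false;
                }
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("same content: " + demo());
    }
}
```

Both views return identical bytes. What differs is where the data lives: the heap copy is managed by the garbage collector, while the mapped buffer is backed by the OS page cache, which is the main point of the blog post linked in the reply.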

Re: change of API Javadoc interface functionality in 4.0.x

2012-07-19 Thread Bernd Fehling
LUCENE-4237 - add ant task to generate optionally ALL javadocs https://issues.apache.org/jira/browse/LUCENE-4237 Am 19.07.2012 07:59, schrieb Robert Muir: > On Thu, Jul 19, 2012 at 1:53 AM, Bernd Fehling > wrote: >> ... >> Robert Muir added a comment - 12/Apr/12 16:24 >> >> We can save 10MB wit