I tried changing the merge policy, but it had no effect on the test times. I can also rule out ReiserFS as the culprit now, since I was able to run with the indexes stored on an ext3 partition and observed a similar slowdown.
So there's something else going on here with this particular test setup, but I can't really distill a simpler use case, so for now I'll leave it at that. Will post to this thread again if I find something promising...

On Thu, Feb 9, 2012 at 4:13 AM, Simon Willnauer <simon.willna...@googlemail.com> wrote:
> one major thing that changed from 3.0.3 to 3.5 is that we use
> TieredMergePolicy by default. can you try to use the same merge policy
> on both 3.0.3 and 3.5 and report back? ie LogByteSizeMergePolicy or
> whatever you are using...
>
> simon
>
> On Thu, Feb 9, 2012 at 5:28 AM, Vitaly Funstein <vfunst...@gmail.com> wrote:
>> Hello,
>>
>> I am currently evaluating Lucene 3.5.0 for upgrading from 3.0.3, and
>> in the context of my usage, the most important parameter is index
>> writing throughput. To that end, I have been running various tests,
>> but seeing some contradictory results from different setups, which
>> hopefully someone with a better knowledge of Lucene's internals could
>> explain...
>>
>> First, let me describe my usage of Lucene, which is common across all
>> of these cases.
>>
>> 1. Terms: non-analyzed strings or integral types, mostly. No free form
>> text values on fields.
>> 2. All indexed fields are stored.
>> 3. Multiple threads per index writer, in the overall application
>> currently capped at 4.
>> 4. Document deletes are performed with each index update, using a
>> simple string term to identify the document.
>> 5. Default IndexWriter config settings are used, i.e. directory type,
>> merge policy, RAM buffer size, etc.
>> 6. Typical data size for an index is anywhere from a few hundred K
>> docs up to a few hundred M.
>> 7. Hardware config:
>> - kernel 2.6.16-60 SMP (SuSE Enterprise Server 10)
>> - 16x CPU
>> - 16G RAM
>> - ReiserFS partition for index data (more on this below)
>>
>> Here is where things diverge though.
>> The first use case is a standalone performance test, which writes 1M
>> documents containing 4 fields (2 string, 2 numeric) to a single index
>> using 10 worker threads. In this case, I do not see any writing
>> performance degradation when going from 3.0.3 to 3.5.
>>
>> The second setup is a distributed multi-threaded client server
>> application, where Lucene is used on the server to implement the
>> search functionality. Clients have the ability to submit searchable
>> data for indexing, as well as to run queries against the data. I
>> realize this is a very generic description, and if needed could
>> provide more specifics later. For now, let's say the second test runs
>> on one such client, and submits 3 million records for the server to
>> process (and also index via Lucene). Total time taken is then
>> reported.
>>
>> But when running the test above, I can definitely observe a consistent
>> increase in test times when the only thing changing is Lucene going
>> from 3.0.3 to 3.5.0, on the order of 15-35%.
>>
>> How could I reconcile this discrepancy? My theory at this point is
>> that the combination of the kernel above and ReiserFS (default FS for
>> the distro) somehow making index writing in 3.5.0 slower, possibly due
>> to the BKL issue, but only when used in a heavily multi-threaded
>> system. Unfortunately, I currently have no ext3 partitions, or ability
>> to upgrade the kernel on the system to prove or disprove this.
>>
>> Has anyone experienced issues like this in a similar setup, or maybe
>> benchmarked Lucene across different file system types and release
>> versions?
>>
>> Thanks,
>> -V
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> For additional commands, e-mail: java-user-h...@lucene.apache.org
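
For anyone trying to reproduce the kind of comparison discussed in this thread, here is a rough sketch of a multi-threaded write-timing harness in plain Java. It is not the test code used above, which was never posted. The names `DocumentSink`, `WriteThroughputHarness`, `timeWrites` and `slowdownPercent` are made up for illustration and are not Lucene APIs. The assumption is that you would wrap your own indexing call (presumably something like `IndexWriter.updateDocument`, given the delete-by-term pattern described) in a `DocumentSink`, run the same harness once per Lucene version with the same thread count, and compare the elapsed times.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for whatever writes one document to the index under test. */
interface DocumentSink {
    void write(long docId) throws Exception;
}

final class WriteThroughputHarness {

    /** Writes totalDocs documents from the given number of threads; returns elapsed nanoseconds. */
    static long timeWrites(DocumentSink sink, long totalDocs, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicLong nextId = new AtomicLong();
        List<Callable<Void>> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            workers.add(() -> {
                long id;
                // Each worker claims the next unwritten document id until none are left.
                while ((id = nextId.getAndIncrement()) < totalDocs) {
                    sink.write(id);
                }
                return null;
            });
        }
        long start = System.nanoTime();
        try {
            // invokeAll blocks until every worker has finished.
            for (Future<Void> f : pool.invokeAll(workers)) {
                f.get(); // surfaces any failure thrown inside a worker
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while writing", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a writer thread failed", e.getCause());
        } finally {
            pool.shutdown();
        }
        return System.nanoTime() - start;
    }

    /** Relative slowdown of candidate versus baseline, in percent (positive means slower). */
    static double slowdownPercent(long baselineNanos, long candidateNanos) {
        return (candidateNanos - baselineNanos) * 100.0 / baselineNanos;
    }

    public static void main(String[] args) {
        // Dummy sink that only counts calls; replace with a real indexing call.
        AtomicLong written = new AtomicLong();
        long elapsed = timeWrites(id -> written.incrementAndGet(), 1_000_000L, 10);
        System.out.println(written.get() + " docs in " + (elapsed / 1_000_000) + " ms");
    }
}
```

The work is handed out through a shared counter rather than pre-split ranges, so a slow thread does not leave the others idle. Keeping the sink behind an interface also makes it easy to time the same workload against a no-op sink first, which gives a rough idea of how much of the total is harness overhead rather than index writing.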