Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-05-23 Thread Michael McCandless
I finally dug into this, and it turns out the nightly benchmark I run had bottlenecks bad enough that it couldn't feed documents to Lucene quickly enough to take advantage of the concurrent hardware in beast2. I fixed that and re-ran the nightly benchmark, and it shows good gains: https://plus.google.

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-15 Thread Robert Muir
You won't see indexing improvements there because the dataset in question is Wikipedia, and it is mostly indexing full text. I think it may have one measly numeric field.
On Thu, Apr 14, 2016 at 6:25 PM, Otis Gospodnetić wrote:
> (replying to my original email because I didn't get people's replies, even

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Otis Gospodnetić
(replying to my original email because I didn't get people's replies, even though I see in the archives people replied)
Re BJ and beast2 upgrade. Yeah, I saw that, but
* if there is no indexing throughput improvement after that, does that mean that those particular indexing tests happen to be

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Stephen Green
As someone who runs Lucene on big hardware, I'd be very interested to see the tuning parameters when you do get a chance.
On Thu, Apr 14, 2016 at 3:41 PM, Michael McCandless <luc...@mikemccandless.com> wrote:
> Yes, dual 2699 v3, with 256 GB of RAM, yet indexing throughput somehow
> got slower

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Michael McCandless
Yes, dual 2699 v3, with 256 GB of RAM, yet indexing throughput somehow got slower :) I haven't re-tuned indexing threads or IW buffer size yet for this new hardware ...
Mike McCandless
http://blog.mikemccandless.com
On Thu, Apr 14, 2016 at 2:09 PM, Ishan Chattopadhyaya wrote:
> Wow, 72 cores? T

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Ishan Chattopadhyaya
Wow, 72 cores? That sounds astounding. Are they dual Xeon E5 2699 v3 CPUs with 18 cores each, with hyperthreading = 18*2*2=72 threads?
On Thu, Apr 14, 2016 at 11:33 PM, Dawid Weiss wrote:
> The GC change is after this:
>
> BJ (2015-12-02): Upgrade to beast2 (72 cores, 256 GB RAM)
>
> which leads
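The arithmetic in the message above (18*2*2=72) can be checked with a short sketch. The class and method names here are hypothetical and are not part of Lucene or lucenebench. The figure of 2 hardware threads per core is the hyperthreading assumption stated in the message, and the "72 cores" in the benchmark annotation is read here as 72 logical threads.

```java
// Hypothetical helper for illustration only; not part of Lucene or lucenebench.
final class CpuThreads {
    private CpuThreads() {}

    /** Logical (hardware) threads = sockets * cores per socket * threads per core. */
    static int logicalThreads(int sockets, int coresPerSocket, int threadsPerCore) {
        if (sockets <= 0 || coresPerSocket <= 0 || threadsPerCore <= 0) {
            throw new IllegalArgumentException("all counts must be positive");
        }
        // multiplyExact throws ArithmeticException on int overflow instead of wrapping.
        return Math.multiplyExact(Math.multiplyExact(sockets, coresPerSocket), threadsPerCore);
    }

    public static void main(String[] args) {
        // Figures taken from the thread: 2 sockets, 18 cores each,
        // and an assumed 2 threads per core (hyperthreading enabled).
        System.out.println("beast2 as described: " + logicalThreads(2, 18, 2));

        // What the JVM reports on whatever machine runs this sketch.
        System.out.println("this JVM: " + Runtime.getRuntime().availableProcessors());
    }
}
```

On a machine configured as described, `Runtime.getRuntime().availableProcessors()` would normally report the logical thread count rather than the physical core count, which is the number that matters when choosing how many indexing threads to try.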

Re: Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Dawid Weiss
The GC change is after this:
BJ (2015-12-02): Upgrade to beast2 (72 cores, 256 GB RAM)
which leads me to believe these results are not comparable (different machines, architectures, disks, CPUs perhaps?).
Dawid
On Thu, Apr 14, 2016 at 7:13 PM, Otis Gospodnetić wrote:
> Hi,
>
> I was looking a

Lucene indexing throughput (and Mike's lucenebench charts)

2016-04-14 Thread Otis Gospodnetić
Hi,
I was looking at Mike's http://home.apache.org/~mikemccand/lucenebench/indexing.html secretly hoping to spot some recent improvements in indexing throughput, but instead it looks like:
* indexing throughput hasn't really gone up in the last ~5 years
* indexing was faster in 2014, but then