Anshum: Have you looked into the ParallelMultiSearcher? With it, you would split your index into N sub-indices and search each one with a dedicated thread.
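A minimal sketch of that fan-out idea, written in plain Java rather than against Lucene's own classes. The names `SubIndexSearcher`, `Hit`, and `searchAll` are hypothetical and exist only for illustration; the sketch also assumes that each sub-index searcher is safe to call from its own thread and that scores from different sub-indices are comparable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ShardedSearchSketch {

    /** One scored result; both fields are illustrative. */
    public static final class Hit {
        public final String docId;
        public final double score;

        public Hit(String docId, double score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Hypothetical stand-in for a searcher over one sub-index (not a Lucene type). */
    public interface SubIndexSearcher {
        List<Hit> search(String query, int topN);
    }

    /** Searches every sub-index on its own thread and merges the best topN hits. */
    public static List<Hit> searchAll(List<SubIndexSearcher> shards, String query, int topN)
            throws InterruptedException, ExecutionException {
        if (shards.isEmpty() || topN <= 0) {
            return new ArrayList<>();
        }
        // One dedicated thread per sub-index.
        ExecutorService pool = Executors.newFixedThreadPool(shards.size());
        try {
            List<Future<List<Hit>>> pending = new ArrayList<>();
            for (SubIndexSearcher shard : shards) {
                Callable<List<Hit>> task = () -> shard.search(query, topN);
                pending.add(pool.submit(task));
            }
            // Wait for every sub-index, then merge by descending score.
            List<Hit> merged = new ArrayList<>();
            for (Future<List<Hit>> result : pending) {
                merged.addAll(result.get());
            }
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Two fake sub-indices with canned hits, just to exercise the fan-out.
        SubIndexSearcher first = (q, n) -> Arrays.asList(new Hit("a1", 0.9), new Hit("a2", 0.2));
        SubIndexSearcher second = (q, n) -> Arrays.asList(new Hit("b1", 0.7));
        for (Hit hit : searchAll(Arrays.asList(first, second), "lucene", 2)) {
            System.out.println(hit.docId + " " + hit.score);
        }
    }
}
```

The pool here is created per query only to keep the sketch short; a long-lived pool sized to the number of sub-indices would probably suit a server handling several requests per second better.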
--Renaud

-----Original Message-----
From: Anshum [mailto:[EMAIL PROTECTED]
Sent: Monday, April 21, 2008 9:10 PM
To: java-user@lucene.apache.org
Subject: Re: Binding lucene instance/threads to a particular processor(or core)

The paper seems pretty good, but I am still wondering whether there is a way to achieve this through command-line parameters. I'm trying this to optimize the code; if it works I will let everyone know, and either way I will keep the list informed :)

Any other suggestions for handling a concurrency of over 7 search requests per second on an index of over 15 GB containing over 13 million records?

Also, could someone help me obtain a formula (or something similar) relating index size, concurrency, processor power, and memory?

--
Anshum

On Tue, Apr 22, 2008 at 3:55 AM, Antony Bowesman <[EMAIL PROTECTED]> wrote:

> That paper from 1997 is pretty old, but mirrors our experiences in
> those days. Then, we used Solaris processor sets to really improve
> performance by binding one of our processes to a particular CPU while
> leaving the other CPUs to manage the thread-intensive work.
>
> You can bind processes/LWPs to a CPU on Solaris with psrset.
>
> The Solaris thread model in the late '90s was also a significant
> factor in the performance of multi-threaded programs. The default thread
> library in Solaris 8 implemented an MxN unbound thread model
> (threads/LWPs). In those days we found that it did not perform well,
> so we used the bound thread model (i.e. 1:1), where a Solaris thread was
> bound permanently to an LWP. That improved performance a lot. In
> Solaris 8, Sun had what they called the 'alternate' thread library
> (T2) around 2000, which became the default library in Solaris 9 and
> implemented a 1:1 model of Solaris threads to LWPs. That new library
> had dramatic performance improvements over the old.
>
> Some background info for Java and threading:
>
> http://java.sun.com/j2se/1.5.0/docs/guide/vm/thread-priorities.html
>
> Antony
>
> Glen Newton wrote:
>
> > I realised that not everyone on this list might be able to access
> > the IEEE paper I pointed out, so I am including the abstract and
> > some paragraphs from the paper below.
> >
> > Also of interest (and should be available to all): Fedorova et al.
> > 2005. Performance of Multithreaded Chip Multiprocessors And
> > Implications For Operating System Design. Usenix 2005.
> > http://www.eecs.harvard.edu/margo/papers/usenix05/paper.pdf
> > "Abstract: We investigated how operating system design should be
> > adapted for multithreaded chip multiprocessors (CMT) - a new
> > generation of processors that exploit thread-level parallelism to
> > mask the memory latency in modern workloads. We determined that the
> > L2 cache is a critical shared resource on CMT and that an
> > insufficient amount of L2 cache can undermine the ability to hide
> > memory latency on these processors. To use the L2 cache as
> > efficiently as possible, we propose an L2-conscious scheduling
> > algorithm and quantify its performance potential. Using this
> > algorithm it is possible to reduce miss ratios in the L2 cache by
> > 25-37% and improve processor throughput by 27-45%."
> >
> > From Lundberg, L. 1997:
> > Abstract: "The default scheduling algorithm in Solaris and other
> > operating systems may result in frequent relocation of threads at
> > run-time. Excessive thread relocation cause poor memory performance.
> > This can be avoided by binding threads to processors. However,
> > binding threads to processors may result in an unbalanced load.
> > By considering a previously obtained theoretical result and by
> > evaluating a set of multithreaded Solaris programs using a
> > multiprocessor with 8 processors, we are able to bound the maximum
> > performance loss due to binding threads. The theoretical result is
> > also recapitulated. By evaluating another set of multithreaded
> > programs, we show that the gain of binding threads to processors may
> > be substantial, particularly for programs with fine grained
> > parallelism."
> >
> > First paragraph: "The thread concept in Solaris [3][5] and other
> > operating systems makes it possible to write multithreaded programs,
> > which can be executed in parallel on a multiprocessor. Previous
> > experience from real world programs [4] show that, using the default
> > scheduling algorithm in Solaris, threads are frequently relocated
> > from one processor to another at run-time. After each such
> > relocation, the code and data associated with the relocated thread
> > is moved from the cache memory of the old processor to the cache of
> > the new processor. This reduces the performance. Run-time relocation
> > of threads to processors can also result in unpredictable response
> > times. This is a problem in systems which operate in a real-time
> > environment. In order to avoid poor memory performance and
> > unpredictable real-time behaviour due to frequent thread relocation,
> > threads can be bound to processors using the processor-bind
> > directive [3] [5]. The major problem with binding threads is that
> > one can end up with an unbalanced load, i.e. some processors may be
> > extremely busy during some time periods while other processors are
> > idle."
> >
> > -Glen
> >
> > On 21/04/2008, Glen Newton <[EMAIL PROTECTED]> wrote:
> >
> > > And this discussion on bound threads may also shed light on things:
> > >
> > > http://coding.derkeiler.com/Archive/Java/comp.lang.java.programmer/2007-11/msg02801.html
> > >
> > > -Glen
> > >
> > > On 21/04/2008, Glen Newton <[EMAIL PROTECTED]> wrote:
> > > > Binding threads to processors - in many situations - improves
> > > > throughput by reducing memory overhead. When a thread is
> > > > running on a core, its state is local. If it is timeshared out
> > > > and then 1) swapped back in on the same core, it is likely
> > > > that its state will still be in the core's L1 cache; or 2) swapped
> > > > in on another core, there will not be a cache hit and the state
> > > > will have to be fetched from L2 or main memory, incurring a
> > > > performance hit, especially in the latter case. See Lundberg, L.
> > > > 1997. Evaluating the Performance Implications of Binding Threads
> > > > to Processors. 393.
> > > > http://ieeexplore.ieee.org/iel3/5020/13768/00634520.pdf
> > > > for more info.
> > > >
> > > > If you are using the JVM on Solaris on SPARC, you should take a
> > > > look at the following links for tuning (the Sun JVM on Solaris
> > > > SPARC has many more performance tuning parameters available),
> > > > including threading:
> > > > - http://java.sun.com/docs/hotspot/threads/threads.html
> > > > - http://java.sun.com/j2se/1.5.0/docs/guide/vm/thread-priorities.html
> > > > - http://www-1.ibm.com/support/docview.wss?rs=180&context=SSEQTP&uid=swg21107291
> > > > - http://java.sun.com/javase/technologies/performance.jsp
> > > >
> > > > -Glen
> > > >
> > > > On 21/04/2008, Ulf Dittmer <[EMAIL PROTECTED]> wrote:
> > > > > This sounds odd. Why would restricting it to a single
> > > > > core improve performance?
> > > > > The point of using multiple cores (and multiple threads) is
> > > > > to improve performance, isn't it? I'd leave thread scheduling
> > > > > decisions to the JVM. Plus, I don't think there is anything
> > > > > in Java to facilitate this (short of using JNI).
> > > > >
> > > > > Are you talking about indexing or searching? You may
> > > > > be able to use multiple parallel threads to improve
> > > > > indexing performance. I don't think Lucene uses
> > > > > multi-threading for searching; not unless you have multiple
> > > > > indices, anyway.
> > > > >
> > > > > Ulf
> > > > >
> > > > > --- Anshum <[EMAIL PROTECTED]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I have been trying to bind my lucene instance (JVM -
> > > > > > Sun Hotspot*) to a particular core so as to improve
> > > > > > the performance. Is there a way to do so, or is
> > > > > > there support in lucene to explicitly control the
> > > > > > thread - processor linkup?
> > > > > >
> > > > > > --
> > > > > > The facts expressed here belong to everybody, the
> > > > > > opinions to me.
> > > > > > The distinction is yours to draw............

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]