On Jul 16, 2011, at 2:53, "h.g. g.h." <highigh...@gmail.com> wrote:
> Thanks a lot for your replies, Andi and Christian. This is exactly what I
> saw... about 11 threads per python process that was making lucene queries...
>
> I have a follow-up question now (maybe even naive):
> Andi, about your suggestion of using multithreading and that PyLucene will
> release GIL when it calls into JVM... So, when it does that, another python
> thread will acquire GIL, and then call into JVM, and then another thread...
> so on and so forth. Will this again not lead to too many Java threads
> running in parallel?? Did I misunderstand what you were suggesting?

It depends on how long the queries run. It's true that only one Python thread can start a query at a time, because it must hold the GIL to do so. But the GIL is released once the call enters the JVM, so if the queries are difficult enough and run for a long time, decent concurrency can still be achieved.

> The staff here who run parallel/multithreaded java code use the cmd "java
> -XX:ParallelGCThreads=16 JavaApp". I tried to pass the same argument as
> initVM(vmargs="-XX:ParallelGCThreads=4"), but it doesn't obey. Am I missing
> something here, or misusing it, maybe??

What are your assumptions here? Maybe Python isn't suited to the constraints you must run in? Why not use Java directly?

Andi..

>
> Himanshu
>
>
> On Fri, Jul 15, 2011 at 9:47 AM, Christian Heimes <li...@cheimes.de> wrote:
>
>> On 15.07.2011 10:10, Andi Vajda wrote:
>>> PyLucene embeds a Java VM. Thus, with each subprocess, a new JVM is
>>> created with all its threads. This can get insane pretty quickly.
>>
>> The Java VM starts a lot of threads. On my Linux box eleven
>> additional threads are running after initVM() has been called.
>>
>> >>> import lucene, os, psutil
>> >>> psutil.Process(os.getpid()).get_num_threads()
>> 1
>> >>> lucene.initVM()
>> <jcc.JCCEnv object at 0x7f23a66f31e0>
>> >>> psutil.Process(os.getpid()).get_num_threads()
>> 12
>>
>> Christian
>>
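
To illustrate the point about the GIL above, here is a minimal sketch using only the Python standard library. It does not call PyLucene at all: fake_query is a hypothetical stand-in for a long-running search, and time.sleep is used only because it, like a call into the JVM as described above, releases the GIL while it blocks. The thread count and duration are arbitrary assumptions.

```python
import threading
import time


def fake_query(duration):
    # Hypothetical stand-in for a long-running search call.
    # time.sleep releases the GIL while blocked, which is the
    # behaviour described above for calls into the JVM.
    time.sleep(duration)


def run_serially(n_queries, duration):
    # Run the queries one after another in the calling thread.
    start = time.perf_counter()
    for _ in range(n_queries):
        fake_query(duration)
    return time.perf_counter() - start


def run_concurrently(n_queries, duration):
    # Start one Python thread per query. Each thread needs the GIL
    # only briefly to start its query, then gives it up while blocked.
    threads = [
        threading.Thread(target=fake_query, args=(duration,))
        for _ in range(n_queries)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("serial:     %.2fs" % run_serially(4, 0.2))
    print("concurrent: %.2fs" % run_concurrently(4, 0.2))
```

The threaded run should take roughly the time of one query rather than four, which is the kind of concurrency meant above. In real PyLucene code, each worker thread would presumably also need to attach itself to the JVM (via lucene.getVMEnv().attachCurrentThread()) before making any Lucene call; that step is left out here because the sketch does not use PyLucene.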