What is the best way to call Lucene from Python?

2009-08-14 Thread Valery Khamenya
Hi, what would be the best way to call Lucene from a Python application? Is PyLucene really a good way to do it? In particular: What about PyLucene's scalability? What about PyLucene vs Lucene performance? (this post is quite old: http://markmail.org/message/5pjbs7mdh4fpvsjb) best regards -- Valery

Re: What is the best way to call Lucene from Python?

2009-08-15 Thread Valery Khamenya
On Aug 14, 2009, at 12:33, Valery Khamenya wrote: > > Hi >> what would be the best way to call Lucene from Python application? >> >> Is PyLucene really a good way for it? >> >> In particular: >> >> What about PyLucene's scalability? >> >

Re: What is the best way to call Lucene from Python?

2009-08-15 Thread Valery Khamenya
Hi Andi, > If you have any questions, feel free to ask this list. and here we go! :) I've benchmarked the StandardAnalyzer against a 3 MB text file. This test (see below) was executed both in PyLucene and in *plain* Lucene, i.e. in Java. The execution time in Java was 1.7 sec, whereas the PyLucen
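The benchmark code itself is cut off in this preview, so the following is not the test from the post. It is only a minimal sketch of how such a timing comparison could be set up on the Python side, assuming a hypothetical helper `best_time` and a stand-in `count_tokens` function where the real analyzer loop would go.

```python
import time


def best_time(func, *args, repeats=3):
    """Call func(*args) several times; return (best elapsed seconds, last result)."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best, result


def count_tokens(text):
    # Stand-in for the analyzer loop (assumption): a real test would
    # iterate over the token stream produced by StandardAnalyzer here.
    return len(text.split())


if __name__ == "__main__":
    sample = "the quick brown fox " * 1000
    seconds, tokens = best_time(count_tokens, sample)
    print(f"{tokens} tokens in {seconds:.4f} s")
```

Taking the best of several runs reduces the influence of one-off costs such as JVM start-up or cold file caches, which matters when comparing against a plain Java run.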

Re: What is the best way to call Lucene from Python?

2009-08-15 Thread Valery Khamenya
o, it's better to write that loop in > Java, generate Python wrapper access code to it with JCC and invoke it that > way. > > Andi.. > > > On Aug 15, 2009, at 15:52, Valery Khamenya wrote: > > Hi Andi, >> >> If you have any questions, feel free to ask th

Re: Request: Windows 64-bit build with Python 3.1

2009-08-17 Thread Valery Khamenya
Dennis, I can only say that PyLucene works fine on my 64-bit Linux box. best regards -- Valery A.Khamenya On Mon, Aug 17, 2009 at 2:08 PM, Andi Vajda wrote: > > On Mon, 17 Aug 2009, Dennis Cooper wrote: > > Can I use PyLucene with Python 3.1? If not, are there plans to make >> PyLucene >> comp

Why 12 child threads in samples/ThreadIndexFiles.py ?

2010-03-25 Thread Valery Khamenya
Hi all, Q1: Why 12 threads in samples/ThreadIndexFiles.py ? I have no idea how this example http://svn.apache.org/repos/asf/lucene/pylucene/trunk/samples/ThreadIndexFiles.py starts more than 1 thread: └─python,4922 ThreadIndexFiles.py ./corp ├─{python},4923 ├─{python},4924 ├─{

Re: Why 12 child threads in samples/ThreadIndexFiles.py ?

2010-03-25 Thread Valery Khamenya
OK, initVM() immediately brings 11 child threads. So, we have 1 indexing thread. Now only Q3 is left: could someone please modify this example for indexing with more than 1 thread? best regards -- Valery A.Khamenya On Thu, Mar 25, 2010 at 11:19 AM, Valery Khamenya wrote: > Hi all, >
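Regarding Q3, here is a minimal sketch of one way to spread indexing work over several Python threads. It is not a modified version of samples/ThreadIndexFiles.py. The names `index_in_threads`, `index_one` and `on_thread_start` are hypothetical, and the code uses only the standard library so that it runs without PyLucene installed.

```python
import queue
import threading


def index_in_threads(paths, index_one, num_threads=4, on_thread_start=None):
    """Call index_one(path) for every path, using num_threads worker threads.

    Returns a list of (path, exception) pairs for the items that failed.
    """
    work = queue.Queue()
    for path in paths:
        work.put(path)
    errors = []

    def worker():
        # Hook for per-thread setup (assumption: with PyLucene this is
        # where the thread would be attached to the Java VM).
        if on_thread_start is not None:
            on_thread_start()
        while True:
            try:
                path = work.get_nowait()
            except queue.Empty:
                return  # no work left, let the thread finish
            try:
                index_one(path)
            except Exception as exc:
                errors.append((path, exc))

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors
```

With PyLucene, `on_thread_start` would presumably be `lucene.getVMEnv().attachCurrentThread`, since a thread created in Python has to be attached to the JVM before it calls into Java, and `index_one` would add a document to a shared IndexWriter. Both of these are assumptions about how the pieces fit together, not something tested against the sample.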

Re: Why 12 child threads in samples/ThreadIndexFiles.py ?

2010-03-26 Thread Valery Khamenya
Hi Andi, On Fri, Mar 26, 2010 at 1:34 AM, Andi Vajda wrote: > Why 12 ? Only one is created explicitly. The others you found are probably > java's or lucene's. As I wrote in my previous post, initVM() indeed creates 11 child threads on my box. > Indeed, it starts only one. > The point of

Slowdown during the search for similar documents

2010-03-27 Thread Valery Khamenya
Hi, there is a strange slowdown during the search for similar documents. For some reason the PyLucene version is much slower than the pure Lucene one. The test document collection contains 200K docs. Here is the PyLucene version: content = ref_doc.getField('content').stringValue() similarity_query =

Re: Slowdown during the search for similar documents

2010-03-28 Thread Valery Khamenya
yes, the same parameters. best regards -- Valery A.Khamenya On Sun, Mar 28, 2010 at 12:18 AM, Andi Vajda wrote: > > On Sat, 27 Mar 2010, Valery Khamenya wrote: > >> there is a strange slowdown during the search for similar documents. >> For some reason pylucene version is

Re: Slowdown during the search for similar documents

2010-03-28 Thread Valery Khamenya
hm, during new runs we see only a 2x slowdown. So, not "much slower", but just "2x slower" -- both searches compared are done in a single thread. best regards -- Valery A.Khamenya On Sun, Mar 28, 2010 at 2:09 PM, Valery Khamenya wrote: > yes, the same parameters. >

a Debian package of PyLucene 3 for Ubuntu 10.10 "Maverick" (AMD64) ?

2011-02-17 Thread Valery Khamenya
Hi guys, I tried several times to install PyLucene 3 on Ubuntu 10.10 "Maverick" (AMD64). I have never succeeded with it (maybe 1 hour was never enough for it). Could someone please build a Debian package of PyLucene 3 for Ubuntu 10.10, or alternatively write a clear guide on how to do it? If I

Re: [ANNOUNCE] Apache PyLucene 3.1.0

2011-04-14 Thread Valery Khamenya
Thanks! btw, originally I went this way http://lucene.apache.org/pylucene/jcc/documentation/install.html , but it is not up-to-date and trunk doesn't seem to be compilable (problems with ant xml files, plus no doc directory etc). Then I used the tar and it worked like a charm. best regards -- Vale