Re: Problem getting tokens for document

2010-04-14 Thread Andi Vajda
On Apr 14, 2010, at 10:32, Aric Coady wrote: Hey, Herb. There is a memory leak in the string array in PyLucene 2.4. In this case it would be triggered by iterating over tfvP.getTerms(). The fix made it into 2.9; more history here: http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/20090

Re: Problem getting tokens for document

2010-04-14 Thread Andi Vajda
On Apr 14, 2010, at 10:21, "Herbert Roitblat" wrote: Hi, folks. I am using PyLucene and doing a lot of token retrieval. lucene.py reports version 2.4.0. The machine runs rPath Linux with 8 GB of memory. Python is 2.4. The system indexes 116,000 documents just fine. Maxheap is '2048m' in a 64-bit environment.

Re: Problem getting tokens for document

2010-04-14 Thread Aric Coady
Hey, Herb. There is a memory leak in the string array in PyLucene 2.4. In this case it would be triggered by iterating over tfvP.getTerms(). The fix made it into 2.9; more history here: http://mail-archives.apache.org/mod_mbox/lucene-pylucene-dev/200907.mbox/%3calpine.osx.2.01.0907301553230.5...@yuzu%3e

Problem getting tokens for document

2010-04-14 Thread Herbert Roitblat
Hi, folks. I am using PyLucene and doing a lot of token retrieval. lucene.py reports version 2.4.0. The machine runs rPath Linux with 8 GB of memory. Python is 2.4. The system indexes 116,000 documents just fine. Maxheap is '2048m' in a 64-bit environment. Then I need to get the tokens from these documents and