Re: Calculate Term Co-occurrence Matrix

2010-08-19 Thread Ivan Provalov
I used this before almost as is, with a couple of fixes: http://issues.apache.org/jira/browse/LUCENE-474 Thanks, IP --- On Thu, 8/19/10, ahmed algohary wrote: > From: ahmed algohary > Subject: Calculate Term Co-occurrence Matrix > To: java-user@lucene.apache.org > Date: Thursday, August 19, 20

Tokenization / Analyzer question

2010-08-19 Thread Beard, Brian
I'm using lucene 2.9.1. I'm indexing documents which correspond to an ID. Each field in the ID document is made up of data from all subId's. (It's a requirement that searches must work across all subId's within an ID). They will be indexed and stored in some format similar to: subId0Value0 subId0

Calculate Term Co-occurrence Matrix

2010-08-19 Thread ahmed algohary
Hi all, I need to know if there is a Lucene plug-in or a Lucene-based API for calculating the term co-occurrence matrix for a given text corpus. Thanks! -- Ahmed

Re: Sorting a Lucene index

2010-08-19 Thread Erick Erickson
You haven't yet told us how many documents you're talking about here, so it's hard to have a good idea of what the solutions are. That said, I'd just try sorting first. The sorting cache size will be something like (sizeof(int or long)) * (number of documents). Measure (remember to measure the response
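
The estimate above, (sizeof(int or long)) * (number of documents), can be worked through with a small calculation. This is only a sketch of that arithmetic: the class name, method name, and the document count of 10 million are hypothetical and not taken from the thread, and the real memory footprint depends on the field type and Lucene version, so measuring remains the safer guide.

```java
public final class SortCacheEstimate {
    private SortCacheEstimate() {}

    /**
     * Rough estimate from the thread: bytes per value times number of documents.
     * Both parameters are supplied by the caller; nothing here inspects a real index.
     */
    public static long estimateBytes(long numDocs, int bytesPerValue) {
        if (numDocs < 0 || bytesPerValue <= 0) {
            throw new IllegalArgumentException("numDocs must be >= 0 and bytesPerValue > 0");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        return Math.multiplyExact(numDocs, (long) bytesPerValue);
    }

    public static void main(String[] args) {
        long docs = 10_000_000L; // hypothetical corpus size, for illustration only
        System.out.println("int sort field:  " + estimateBytes(docs, Integer.BYTES) + " bytes");
        System.out.println("long sort field: " + estimateBytes(docs, Long.BYTES) + " bytes");
    }
}
```

With these assumed numbers, an int sort field comes to roughly 40 MB and a long sort field to roughly 80 MB per sorted field.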

Re: asking about incremental update

2010-08-19 Thread Yakob
Do you reckon I should use a timer or a thread instead to periodically update the index? On 8/19/10, findbestopensource wrote: > Hi jacobian, > > Lucene will not do incremental update by itself. Lucene is just a > library. Your app should periodically add the content to the index and > once done

Re: Problems with Lucene 3.0.2 and Java 1.6.0_12

2010-08-19 Thread Michael McCandless
Phew... thanks for bringing closure! And, good sleuthing. So the takeaway is JRE 1.6.0_12 = BAD and JRE 1.6.0_21 = GOOD. Mike On Wed, Aug 18, 2010 at 10:48 PM, Nader, John P wrote: > > This is a follow up related to my original post Term browsing performance > problems with our upgrade to Luc

Re: asking about incremental update

2010-08-19 Thread findbestopensource
Hi jacobian, Lucene will not do incremental updates by itself. Lucene is just a library. Your app should periodically add the content to the index and, once done, reopen the reader to get your changes reflected. Regards Aditya www.findbestopensource.com On Thu, Aug 19, 2010 at 12:13 PM, Yakob w
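
The pattern described above (periodically add content, then reopen the reader) could be scheduled along the following lines. This is a minimal sketch, not code from the thread: the class `PeriodicIndexUpdater` and the two `Runnable` callbacks are hypothetical placeholders for the application's own indexing and reader-reopen logic, and no Lucene API is called here. It uses the JDK's `ScheduledExecutorService`, which is one option among several.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public final class PeriodicIndexUpdater implements AutoCloseable {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable addNewContent; // hypothetical: app code that adds documents
    private final Runnable reopenReader;  // hypothetical: app code that reopens the reader

    public PeriodicIndexUpdater(Runnable addNewContent, Runnable reopenReader) {
        this.addNewContent = addNewContent;
        this.reopenReader = reopenReader;
    }

    /** One update cycle: add content first, then reopen so searches see the changes. */
    public void runOnce() {
        addNewContent.run();
        reopenReader.run();
    }

    /** Runs a cycle immediately, then waits the given delay after each cycle finishes. */
    public void start(long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                runOnce();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs, so log and continue.
                e.printStackTrace();
            }
        }, 0, delay, unit);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

A caller would construct it with its own two callbacks, call `start(5, TimeUnit.MINUTES)` (the interval is arbitrary), and call `close()` on shutdown. A single-threaded scheduler keeps update cycles from overlapping.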

Re: Sorting a Lucene index

2010-08-19 Thread findbestopensource
Hi Shelly, Have you tried sorting in your queries? Is it creating any issues? Once you open a reader and warm your search with sorting, the FieldCache will be loaded for that field. You may see higher RAM usage. You can run as many queries with sorting as you like until you reopen the reader. If you ad