I used this before almost as is with a couple of fixes:
http://issues.apache.org/jira/browse/LUCENE-474
Thanks,
IP
--- On Thu, 8/19/10, ahmed algohary wrote:
> From: ahmed algohary
> Subject: Calculate Term Co-occurrence Matrix
> To: java-user@lucene.apache.org
> Date: Thursday, August 19, 20
I'm using Lucene 2.9.1.
I'm indexing documents which correspond to an ID.
Each field in the ID document is made up of data from all subIds.
(It's a requirement that searches must work across all subIds within an
ID.)
They will be indexed and stored in some format similar to:
subId0Value0 subId0
Hi all,
I need to know if there is a Lucene plug-in or a Lucene-based API for
calculating the term co-occurrence matrix for a given text corpus.
Thanks!
--
Ahmed
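For reference, here is a minimal sketch of what a document-level term co-occurrence matrix looks like in plain Java. It does not use any Lucene API; the class name CooccurrenceSketch, the naive tokenizer, and the choice to count "both terms appear in the same document" are all assumptions made for illustration, so treat it as a starting point rather than a recommended implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene.
class CooccurrenceSketch {

    // Naive tokenizer (assumption): lowercase, split on anything that is
    // not a letter or digit, and keep each distinct term once per document.
    static Set<String> terms(String text) {
        Set<String> out = new TreeSet<>();
        for (String tok : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!tok.isEmpty()) {
                out.add(tok);
            }
        }
        return out;
    }

    // Sparse symmetric matrix: result.get(a).get(b) is the number of
    // documents that contain both a and b (for a != b).
    static Map<String, Map<String, Integer>> cooccurrence(List<String> docs) {
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (String doc : docs) {
            List<String> ts = new ArrayList<>(terms(doc));
            for (int i = 0; i < ts.size(); i++) {
                for (int j = i + 1; j < ts.size(); j++) {
                    bump(counts, ts.get(i), ts.get(j));
                    bump(counts, ts.get(j), ts.get(i));
                }
            }
        }
        return counts;
    }

    private static void bump(Map<String, Map<String, Integer>> counts, String a, String b) {
        counts.computeIfAbsent(a, k -> new HashMap<>()).merge(b, 1, Integer::sum);
    }

    // Convenience lookup that returns 0 for pairs that never co-occur.
    static int count(Map<String, Map<String, Integer>> counts, String a, String b) {
        return counts.getOrDefault(a, Collections.emptyMap()).getOrDefault(b, 0);
    }
}
```

The work per document is quadratic in the number of distinct terms, so for a large corpus you would probably want to restrict the vocabulary or use a sliding window instead of whole documents.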
You haven't yet told us how many documents you're talking about here, so
it's hard to have a good idea of which solutions would fit. That said, I'd
just try sorting first.
The sorting cache size will be something like (sizeof(int or long)) *
(number of documents).
Measure (remember to measure the response
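To put rough numbers on the estimate above, here is a small sketch of the arithmetic in Java. The class and method names are made up for illustration, and the result is only the raw (bytes per value) * (number of documents) figure for a numeric sort field; it ignores JVM array overhead and anything else the cache may hold.

```java
// Hypothetical helper: back-of-the-envelope size of a per-document
// sort cache with one fixed-width value per document.
class SortCacheEstimate {

    // bytesPerValue is 4 for an int field and 8 for a long field.
    static long estimateBytes(long numDocs, int bytesPerValue) {
        // multiplyExact throws ArithmeticException instead of silently overflowing.
        return Math.multiplyExact(numDocs, (long) bytesPerValue);
    }

    public static void main(String[] args) {
        long numDocs = 10_000_000L; // assumed corpus size, for illustration only
        System.out.println("int sort field:  " + estimateBytes(numDocs, Integer.BYTES) + " bytes");
        System.out.println("long sort field: " + estimateBytes(numDocs, Long.BYTES) + " bytes");
    }
}
```

With the assumed 10 million documents that works out to about 40 MB for an int field and 80 MB for a long field, per sorted field.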
do you reckon I should use a timer or a thread instead to periodically
update the index?
On 8/19/10, findbestopensource wrote:
> Hi jacobian,
>
> Lucene will not do incremental update by itself. Lucene is just a
> library. Your app should periodically add the content to the index and
> once done
Phew... thanks for bringing closure! And, good sleuthing.
So the takeaway is JRE 1.6.0_12 = BAD and JRE 1.6.0_21 = GOOD.
Mike
On Wed, Aug 18, 2010 at 10:48 PM, Nader, John P wrote:
>
> This is a follow up related to my original post Term browsing performance
> problems with our upgrade to Luc
Hi jacobian,
Lucene will not do incremental update by itself. Lucene is just a
library. Your app should periodically add the content to the index and
once done, reopen the reader to get your changes reflected.
Regards
Aditya
www.findbestopensource.com
On Thu, Aug 19, 2010 at 12:13 PM, Yakob w
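On the timer-versus-thread question, one possible shape for the periodic update is a single-threaded scheduler from java.util.concurrent. The sketch below is only that: PeriodicIndexUpdater and the updateAndReopen callback are hypothetical names, and the Lucene-specific work (adding the new content, then reopening the reader) is assumed to live inside the callback rather than being shown here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper: runs an application-supplied "add new content,
// then reopen the reader" task at a fixed delay on one background thread.
class PeriodicIndexUpdater {

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
    private final Runnable updateAndReopen;

    PeriodicIndexUpdater(Runnable updateAndReopen) {
        this.updateAndReopen = updateAndReopen;
    }

    void start(long delay, TimeUnit unit) {
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                updateAndReopen.run();
            } catch (RuntimeException e) {
                // An exception escaping the task would cancel all later runs,
                // so log it and keep the schedule alive.
                e.printStackTrace();
            }
        }, 0, delay, unit);
    }

    void stop() {
        scheduler.shutdown();
        try {
            scheduler.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
        }
    }
}
```

Fixed delay (rather than fixed rate) means a slow update never overlaps with the next one; the single thread also keeps updates from running concurrently with each other.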
Hi Shelly,
Have you tried sorting in your queries? Is it creating any issues?
Once you open a reader and warm your search with sorting, the
FieldCache will be loaded for that field. You may see higher RAM
usage. You can run as many sorted queries as you like until you reopen
the reader.
If you ad