Re: Using categories with Lucene

2010-08-11 Thread Luan Cestari
Julien, You're right. We discovered carrot by searching the mailing-list and thought it was mentioned in one of our conversations. We are sorry for our mistake. Best Regards, Daniel Gimenes Luan Cestari On Wed, Aug 11, 2010 at 2:16 PM, Julien Nioche < lists.digitalpeb...@gmail.com> wrote: > BTW

word frequency counting

2010-08-11 Thread Shuai Weng
Hey, I'm new to Lucene... I was wondering if we can use Lucene/Solr for word frequency counting (eg, in a subset of full text papers). Thanks for any info you may provide. Shuai On Aug 11, 2010, at 10:16 AM, Julien Nioche wrote: > BTW I don't remember anyone on the Nutch list suggesting you

Re: Using categories with Lucene

2010-08-11 Thread Julien Nioche
BTW I don't remember anyone on the Nutch list suggesting you to use Carrot for this (see: http://search-lucene.com/?q=luan+carrot) or classifying at querying time. What I suggested in http://search-lucene.com/m/JWZTj1q4lB92 was about classifying during the parsing or indexing and generating a fiel

Re: Using categories with Lucene

2010-08-11 Thread Luan Cestari
Hi Glen, Here is the URL of the Creative Commons plugin package: http://netlikon.de/docs/javadoc-nutch/trunk/org/creativecommons/nutch/package-summary.html . The CCIndexingFilter class adds a field, and the CCQueryFilter class filters the results using that new field. Regards, Luan On T
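The pattern in this message (one component adds a field at indexing time, another filters results on that field at query time) can be illustrated with a minimal plain-Python sketch. This is not the Nutch or Lucene API: `index_document`, `search`, `toy_classifier`, and the `category` field name are all hypothetical, chosen only to show the idea.

```python
from typing import Callable, Dict, List

# Hypothetical in-memory "index": a list of documents stored as field dictionaries.
Index = List[Dict[str, str]]


def index_document(index: Index, text: str, classify: Callable[[str], str]) -> None:
    """Indexing-time step: compute a category once and store it as an extra field."""
    index.append({"content": text, "category": classify(text)})


def search(index: Index, term: str, category: str) -> List[str]:
    """Query-time step: match on content, then filter on the stored category field."""
    term = term.lower()
    return [
        doc["content"]
        for doc in index
        if term in doc["content"].lower().split() and doc["category"] == category
    ]


def toy_classifier(text: str) -> str:
    """Stand-in classifier (assumption): a real one would be far more elaborate."""
    return "sports" if "match" in text.lower().split() else "other"


index: Index = []
index_document(index, "The match ended in a draw", toy_classifier)
index_document(index, "A draw function for the canvas", toy_classifier)

# Only documents whose stored category is "sports" survive the filter.
sports_hits = search(index, "draw", "sports")
```

Because the category is computed once per document at indexing time, the query-time step reduces to a cheap comparison on a stored field rather than a classification of every hit.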

[ANN] SearchBlox Version 6.0 incorporating Apache Lucene 3.0.2 released

2010-08-11 Thread Lucene
The SearchBlox Team is pleased to announce the availability of SearchBlox Version 6.0. The new version upgrades to Apache Lucene 3.0.2. SearchBlox is an integrated Enterprise Search Server incorporating Lucene and includes: - Web/RSS/FileSystem Crawlers - REST API - Web Admin Console for Server

Re: read past EOF

2010-08-11 Thread Michael McCandless
Hmm, this issue only leads to corruption if an exception is hit at the wrong time. Is it at all possible you missed an exception? Because that EOF on reading del docs is precisely the corruption seen if you hit this issue. The issue was backported to 2.9.4-dev, so you may want to check out 2.9 b

Re: read past EOF

2010-08-11 Thread Ganesh
I am using Lucene 2.9.1 and there was no exception in the past. Regards Ganesh - Original Message - From: "Michael McCandless" To: Sent: Wednesday, August 11, 2010 3:28 PM Subject: Re: read past EOF It looks like it may be this issue: https://issues.apache.org/jira/browse/LUCENE

Rank by Number of token

2010-08-11 Thread Philippe
Hi all, I want to rank my query by the number of tokens in a field. What would be the best way to implement such a ranking? Regards, Philippe

Re: read past EOF

2010-08-11 Thread Michael McCandless
It looks like it may be this issue: https://issues.apache.org/jira/browse/LUCENE-2593 But can you describe the circumstances leading up to this? EG were there any exceptions (eg disk full) before this one? Which version of Lucene? Mike On Wed, Aug 11, 2010 at 12:30 AM, Ganesh wrote: > He

Indexing and searching phrases

2010-08-11 Thread Christian S.
Hi all, I'm using Lucene to extract keywords out of a text. The Lucene index is built over a set of defined words (we call them keywords). Then a text is used as the query against that index. The goal is to find out which keywords appear in the given text. This works fine as long as the defined keywords
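The stated goal, finding which of a fixed set of keywords occur in a given text, can be sketched outside Lucene in plain Python. The sketch below is an illustration under assumptions, not the poster's implementation: it assumes keywords may be multi-word phrases (as the thread subject suggests), uses a simplistic lowercase tokenizer in place of a Lucene analyzer, and the names `tokenize` and `find_keywords` are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase and split on non-alphanumeric characters (a simplistic analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def find_keywords(text: str, keywords: List[str]) -> Set[str]:
    """Return the keywords whose tokens appear contiguously, in order, in the text."""
    tokens = tokenize(text)
    # Map each keyword's token tuple back to its original spelling.
    wanted = {tuple(tokenize(k)): k for k in keywords}
    wanted.pop((), None)  # ignore keywords that tokenize to nothing
    found: Set[str] = set()
    # Slide a window of each keyword length over the text tokens.
    for n in {len(key) for key in wanted}:
        for i in range(len(tokens) - n + 1):
            gram = tuple(tokens[i:i + n])
            if gram in wanted:
                found.add(wanted[gram])
    return found


keywords = ["lucene", "full text search", "index"]
text = "Lucene provides full text search over an inverted index."
matches = find_keywords(text, keywords)
```

Treating each keyword as a token sequence means a phrase matches only when all of its words occur adjacently and in order, which is the behaviour a phrase query is meant to provide.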