Re: How to support stemming and case folding for English content mixed with non-English content?

2009-06-07 Thread KK
Hi Robert, The problem is that WordDelimiterFilter is doing its job for English content, but for non-English Indian content in Unicode it highlights the searched word and, along with that, also highlights individual characters of that word, which was not happening without WordDelimiterFilter; that's my c

Re: Retrieving the term vectors of a document in Nutch

2009-06-07 Thread House Less
In retrospect, pardon my stupidity: surely it cannot be right that the term frequency vector for a page is not present within Nutch, for it needs this to compute the score for a page given a query. I would appreciate it if you would tell me where I may find it given a document number. Thank you

Retrieving the term vectors of a document in Nutch

2009-06-07 Thread House Less
Hello everyone, I am quite new to development with Nutch, so you must forgive my question if it is amateurish. After some reading of Luke's source code, I found to my dismay that obtaining the TermFreqVector of a document via the IndexReader resulted in no vectors at all. A mailing list entry

Re: Search fails every time

2009-06-07 Thread Delip Rao
Thanks Simon & Grant. Yes, the indexWriter was close()'d before searching. As Grant pointed out, the issue really was with the Analyzer. Everything worked when I replaced the KeywordAnalyzer with a SimpleAnalyzer. Luke seems to be god-sent (no pun) -- With a KeywordAnalyzer, the entire "content"
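
The analyzer mismatch described in this thread can be illustrated with a small sketch. This is a plain-Python simulation, not Lucene code: the functions `keyword_tokens`, `simple_tokens`, and `matches` are hypothetical helpers that only approximate what a keyword-style analyzer (whole field value kept as one token) and a simple analyzer (split on non-letters, lowercased) would produce, and the sample field value is made up.

```python
import re


def keyword_tokens(text):
    """Approximate keyword-style analysis: the whole value is one token."""
    return [text]


def simple_tokens(text):
    """Approximate simple analysis: runs of letters, lowercased."""
    # [^\W\d_]+ matches runs of word characters that are neither
    # digits nor underscores, i.e. (roughly) runs of letters.
    return [t.lower() for t in re.findall(r"[^\W\d_]+", text)]


def matches(query_term, indexed_tokens):
    """A term query hits only if the exact term was indexed."""
    return query_term in indexed_tokens


# Hypothetical field value, for illustration only.
content = "Lucene in Action"

# With keyword-style analysis the only indexed term is the full string,
# so a single-word query finds nothing.
print(matches("lucene", keyword_tokens(content)))

# With simple analysis each lowercased word is its own term.
print(matches("lucene", simple_tokens(content)))
```

In this simulation a single-word query can only match the keyword-analyzed field if the query is the complete, case-exact field value, which is consistent with the searches failing until the analyzer was swapped.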

Re: Search fails every time

2009-06-07 Thread Simon Willnauer
Another question: do you commit your IndexWriter before you open your searcher? You could also check how many docs are in the index using IndexReader#numDocs and pass the index reader to the IndexSearcher's constructor. Just a guess too... simon On Sun, Jun 7, 2009 at 6:40 PM, Grant Ingersoll wrote:

Re: Search fails every time

2009-06-07 Thread Grant Ingersoll
If I had to guess, I'd say you have some type of Analysis mismatch between what you are indexing and what you are searching. Do you really want to use the KeywordAnalyzer? You might use Luke (http://www.getopt.org/luke) to have a look at your index and see if that sheds some light. Also