Hi Robert,
The problem is that WordDelimiterFilter does its job for English content,
but for non-English Indian content in Unicode it highlights the searched
word and, along with it, also highlights individual characters of that
word. This was not happening without WordDelimiterFilter; that's my c
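One possible explanation, offered as an assumption rather than a diagnosis of this setup: in Indic scripts such as Devanagari, vowel signs and the virama are Unicode combining marks, not letters. Any splitter that breaks a word wherever a character is not a letter will cut such a word into single characters, which would then be highlighted on their own. The sketch below is plain Java and does not call Solr or Lucene; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; this is not WordDelimiterFilter's code.
class IndicSplitSketch {

    // True for Unicode combining marks (categories Mn, Mc, Me).
    static boolean isCombiningMark(int cp) {
        int type = Character.getType(cp);
        return type == Character.NON_SPACING_MARK
                || type == Character.COMBINING_SPACING_MARK
                || type == Character.ENCLOSING_MARK;
    }

    // Splits text into runs of word characters. When keepMarks is false,
    // only letters count as word characters, so combining marks act as
    // delimiters; when true, combining marks stay attached to the word.
    static List<String> split(String text, boolean keepMarks) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            boolean wordChar = Character.isLetter(cp)
                    || (keepMarks && isCombiningMark(cp));
            if (wordChar) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                parts.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        // The word "Hindi" in Devanagari: HA, vowel sign I, NA, virama, DA, vowel sign II.
        String hindi = "\u0939\u093F\u0928\u094D\u0926\u0940";
        System.out.println(split(hindi, false).size()); // prints 3
        System.out.println(split(hindi, true).size());  // prints 1
    }
}
```

If this is what is happening, the fix would lie in how the filter classifies combining marks, not in the highlighter.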
In retrospect, pardon my stupidity: surely it cannot be right that the term
frequency vector for a page is not present within Nutch, for it needs this to
compute the score for a page given a query. I would appreciate it if you would
tell me where I may find it given a document number. Thank you
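For readers unsure what is being asked for: a term frequency vector is just the list of distinct terms in one document field together with how often each occurs. As far as I know, Lucene of this era scores from the inverted index and only keeps per-document vectors when the field was indexed with term vectors enabled, which would explain getting none back. The sketch below is plain Java with a hypothetical class name and a naive tokenizer; it only shows the shape of the data, not how Nutch or Lucene builds it.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; this does not read a Lucene index.
class TermFrequencySketch {

    // Counts occurrences of each lowercased run of letters in the text.
    // A TreeMap keeps the terms in sorted order.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> freqs = new TreeMap<>();
        String lower = text.toLowerCase(Locale.ROOT);
        for (String token : lower.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                freqs.merge(token, 1, Integer::sum);
            }
        }
        return freqs;
    }

    public static void main(String[] args) {
        System.out.println(termFrequencies("To be, or not to be"));
        // prints {be=2, not=1, or=1, to=2}
    }
}
```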
Hello everyone,
I am quite new to development with Nutch, so you must forgive my question if it
is amateurish.
After some reading of Luke's source code, I found to my dismay that obtaining
the TermFreqVector of a document via the IndexReader resulted in no vectors at
all. A mailing list entry
Thanks Simon & Grant.
Yes, the indexWriter was close()'d before searching. As Grant pointed out,
the issue really was with the Analyzer. Everything worked when I replaced
the KeywordAnalyzer with a SimpleAnalyzer.
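To make the mismatch concrete, here is a rough plain-Java imitation of the two behaviours: KeywordAnalyzer emits the whole field value as a single token, while SimpleAnalyzer splits at non-letters and lowercases. The class and method names are hypothetical and no Lucene code is called; it is only a sketch of why a single-word query finds nothing in a keyword-analyzed field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Plain-Java illustration only; these methods imitate, not invoke, Lucene.
class AnalyzerMismatchSketch {

    // Imitates KeywordAnalyzer: the entire value is one token, unchanged.
    static List<String> keywordTokens(String text) {
        return Collections.singletonList(text);
    }

    // Imitates SimpleAnalyzer: split at non-letters, lowercase each token.
    static List<String> simpleTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(Character.toLowerCase(cp));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String content = "Lucene in Action";
        System.out.println(keywordTokens(content));                    // prints [Lucene in Action]
        System.out.println(simpleTokens(content));                     // prints [lucene, in, action]
        // A query term "lucene" matches only the second token stream.
        System.out.println(keywordTokens(content).contains("lucene")); // prints false
        System.out.println(simpleTokens(content).contains("lucene"));  // prints true
    }
}
```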
Luke seems to be god-sent (no pun) -- With a KeywordAnalyzer, the entire
"content"
Another question:
do you commit your IndexWriter before you open your searcher? You
could also check how many docs are in the index using IndexReader#numDocs
and pass the index reader to the IndexSearcher's constructor.
Just a guess too...
simon
On Sun, Jun 7, 2009 at 6:40 PM, Grant Ingersoll wrote:
If I had to guess, I'd say you have some type of Analysis mismatch
between what you are indexing and what you are searching. Do you
really want to use the KeywordAnalyzer?
You might use Luke (http://www.getopt.org/luke) to have a look at your
index and see if that sheds some light.
Also