Highlighter, Term Positions and Stopwords

2005-12-05 Thread Dan Climan
Do stopfilters create non-contiguous token positions? I was interested in experimenting with the highlighter and using the TokenSources.getTokenStream(TermPositionVector tpv, boolean tokenPositionsGuaranteedContiguous) method. The javadocs for this method
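
To make the question concrete, here is a minimal, self-contained sketch of what "contiguous" versus "non-contiguous" positions mean once stopwords are dropped. It is a toy model, not Lucene code: the class name, the stopword list, and the keepGaps flag are all hypothetical, and it says nothing about which behaviour a given Lucene version's StopFilter actually has.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StopwordPositions {
    // Hypothetical stopword list, for illustration only.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("the", "a", "of"));

    /**
     * Returns the position assigned to each token that survives stopword removal.
     * keepGaps = true:  a removed stopword still consumes a position,
     *                   so the surviving positions are non-contiguous.
     * keepGaps = false: survivors are renumbered 0, 1, 2, ... (contiguous).
     */
    static List<Integer> positions(String text, boolean keepGaps) {
        List<Integer> result = new ArrayList<>();
        int position = 0;
        for (String token : text.toLowerCase().split("\\s+")) {
            boolean isStopword = STOPWORDS.contains(token);
            if (!isStopword) {
                result.add(position);
            }
            // Advance for every kept token, and for stopwords only when gaps are kept.
            if (!isStopword || keepGaps) {
                position++;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "the quick fox of the north";
        System.out.println(positions(text, true));   // positions with gaps
        System.out.println(positions(text, false));  // positions renumbered
    }
}
```

In this model, a consumer that assumes contiguous positions would misplace every token after the first removed stopword whenever gaps are kept, which is the situation the tokenPositionsGuaranteedContiguous flag name appears to be asking about.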

RE: Why are tokens not being indexed?

2005-12-05 Thread Combs, Craig
I hate to admit this, but I must: my error was caused by a simple offset on a counter. Luke was very helpful in tracking it down. The benefit is that I learned more about the internals of Lucene and now have a much better understanding. I appreciate the responses and apologize

Re: Why are tokens not being indexed?

2005-12-05 Thread Erik Hatcher
Craig, Again, please try a TermQuery(new Term(fieldname, value)) for a known field/term combination that you are having issues with. MultiFieldQueryParser is adding complexity on top of complexity. Start simple, get a known term query to work, and move up from there. Erik On D

Re: Why are tokens not being indexed?

2005-12-05 Thread Andrzej Bialecki
Combs, Craig wrote: I'm able to see the documents that were indexed, but not the tokens associated with the document, in Luke. I'm using the multifield query parser, and I did call query.toString; the tokens returned by the query parser matched the tokens returned from the analyzer. Somehow

RE: Why are tokens not being indexed?

2005-12-05 Thread Combs, Craig
I'm able to see the documents that were indexed, but not the tokens associated with the document, in Luke. I'm using the multifield query parser, and I did call query.toString; the tokens returned by the query parser matched the tokens returned from the analyzer. Somehow I need to see which to

Re: Why are tokens not being indexed?

2005-12-05 Thread Erik Hatcher
On Dec 5, 2005, at 8:20 AM, Combs, Craig wrote: This is very mysterious. I have checked my parser and I'm returned body:. My analyzer during indexing returns in the token stream. But when I perform my search, no results are found. Is there a way I can see what tokens are actually written b

RE: Why are tokens not being indexed?

2005-12-05 Thread Combs, Craig
This is very mysterious. I have checked my parser and I'm returned body:. My analyzer during indexing returns in the token stream. But when I perform my search, no results are found. Is there a way I can see what tokens are actually written by the index writer of Lucene? My analyzer returns the

Re: how to control terms to be highlighted?

2005-12-05 Thread mark harwood
Looks like you may need to do more intelligent parsing of the source document. In the specific example you gave, the text comes from a drop-down combo in a form unrelated to the article text I imagine you are interested in. There is a lot of other "guff" around the edges to do with related news sto

Re: how to control terms to be highlighted?

2005-12-05 Thread Harini Raghavan
Hi, I was able to use the Highlighter API to extract the text where the keywords occur. However, I am facing another related problem. My application downloads the news items to the local server. The indexer API parses these HTML files, extracts the content, and stores it in the index. The p