Hi all, I am new to Lucene. I am trying to use Lucene to generate data for a document classifier. I need to generate a wordno, lineno, and pageno for each term/phrase. I was able to use SpanQuery/SpanNearQuery to get the wordno (span.start()) for the term/phrase. To get the pageno and lineno, does a custom Analyzer need to be written? Can the Analyzer be made to recognize newline and page feed characters and keep track of the lineno and pageno for the tokens?
Or is this possible with an existing Lucene Analyzer?

Thanks,
Arun
--
Where there is a will, there is a way!
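
To make the lineno/pageno requirement concrete, here is a minimal sketch in plain Java with no Lucene dependency. It assumes the raw document text is available, that '\n' ends a line and '\f' ends a page, and that line numbers restart at 1 on each new page. The class name LinePageLocator and its methods are hypothetical, not part of Lucene. The idea is that a character offset for a token (for example, a start offset reported by the analysis chain) could be mapped to a line and page with a lookup like this, rather than tracking the counters inside the Analyzer itself.

```java
// Hypothetical helper, not a Lucene class: maps a character offset in the
// original text to a 1-based line number and page number.
// Assumptions: '\n' ends a line, '\f' (form feed) ends a page, and line
// numbering restarts at 1 after every form feed.
final class LinePageLocator {
    private final int[] lineAt; // line number of the character at each offset
    private final int[] pageAt; // page number of the character at each offset

    LinePageLocator(String text) {
        lineAt = new int[text.length()];
        pageAt = new int[text.length()];
        int line = 1;
        int page = 1;
        for (int i = 0; i < text.length(); i++) {
            // The separator character itself belongs to the line/page it ends.
            lineAt[i] = line;
            pageAt[i] = page;
            char c = text.charAt(i);
            if (c == '\n') {
                line++;
            } else if (c == '\f') {
                page++;
                line = 1; // assumption: lines are counted per page
            }
        }
    }

    /** Line number (1-based, within its page) of the character at offset. */
    int lineOf(int offset) {
        return lineAt[offset];
    }

    /** Page number (1-based) of the character at offset. */
    int pageOf(int offset) {
        return pageAt[offset];
    }
}
```

For example, with the text "alpha beta\ngamma\fdelta\nepsilon", the offset of "gamma" (11) maps to line 2 of page 1, and the offset of "delta" (17) maps to line 1 of page 2. If lines should instead be numbered across the whole document, the reset after the form feed can simply be dropped.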