RE: StandardTokenizer generation from JFlex grammar

2012-10-04 Thread vempap
Thanks Steve for the pointers. I'll look into it.

RE: StandardTokenizer generation from JFlex grammar

2012-10-04 Thread Steven A Rowe
Hi Phani, Assuming you're using Lucene 3.6.X, see: and

StandardTokenizer generation from JFlex grammar

2012-10-04 Thread vempap
Hello, I'm trying to regenerate the standard tokenizer from the JFlex specification (StandardTokenizerImpl.jflex), but I'm not able to do so due to some errors (I would like to create my own JFlex file based on the standard tokenizer, which is why I'm trying to first generate using that to get a

Re: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Thomas Matthijs
And to include the code On Thu, Oct 4, 2012 at 3:52 PM, Markus Jelsma wrote: > I forgot to add that this is with today's build of trunk. > > -Original message- >> From:Markus Jelsma >> Sent: Thu 04-Oct-2012 15:42 >> To: java-user@lucene.apache.org >> Subject: Highlighter IOOBE with modif

RE: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Markus Jelsma
I forgot to add that this is with today's build of trunk. -Original message- > From:Markus Jelsma > Sent: Thu 04-Oct-2012 15:42 > To: java-user@lucene.apache.org > Subject: Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter > > Hi, > > I've modified the HyphenationComp

Highlighter IOOBE with modified HyphenationCompoundWordTokenFilter

2012-10-04 Thread Markus Jelsma
Hi, I've modified the HyphenationCompoundWordTokenFilter to emit fewer subtokens because the original filter can emit all kinds of subtokens that have a very different meaning on their own. I've modified it so that no overlapping subtokens are emitted and no subtokens are emitted that can be found wi