Hey,
Thanks for the tips. I was pointed towards the KeywordTokenizer by the
Java people; it returns the full content as one token (not a very
intuitive name in my opinion, but anyway). I might still need to extend
this to do some customizations, so I'll look into the PythonAnalyzer
sample.
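
A rough plain-Python sketch of the "whole input as one token" behaviour,
for illustration only. It does not use PyLucene: the class name
WholeInputTokenizer, the term attribute and the tokens() helper are
hypothetical, and only the incrementToken() name comes from this thread.
The assumed contract is that incrementToken() returns True while a token
is available and False once the input is exhausted.

```python
class WholeInputTokenizer:
    """Hypothetical stand-in, not the PyLucene class: one token per input."""

    def __init__(self, text):
        self._text = text
        self._done = False
        self.term = None  # current token, set by a successful incrementToken()

    def incrementToken(self):
        # Emit the full text exactly once, then report exhaustion.
        if self._done:
            return False
        self.term = self._text
        self._done = True
        return True


def tokens(tokenizer):
    # Drain the tokenizer the way a consumer loop would.
    out = []
    while tokenizer.incrementToken():
        out.append(tokenizer.term)
    return out
```

Any customization would go inside incrementToken(), for example
normalizing the text before it is stored as the token.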
On Jul 17, 2010, at 22:30, Andi Vajda wrote:
On Jul 17, 2010, at 22:23, Martin wrote:
Hi there,
I'm trying to extend the PythonTokenizer class to build my own
custom tokenizer, but seem to get stuck pretty much soon after that.
I know that I'm supposed to extend the incrementToken() method, but
what exactly am I dealing with in th