Re: Looking for a way to customize how StandardAnalyzer handles punctuation

2008-12-11 Thread Greg Shackles
I still need all the normal benefits of the StandardAnalyzer as far as punctuation and everything else goes, with just this one special exception. Since I was on a tight schedule, I ended up escaping these cases myself so that they get tokenized. Certainly
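
For readers landing on this thread, the escaping workaround described above could look roughly like the sketch below. It is only an illustration under assumptions, not the code actually used: the class name `PunctuationEscaper` and the placeholder word `xxampxx` are made up, and the idea is simply to swap the ampersand for a letters-only marker before the text reaches the analyzer, so it ends up inside an ordinary alphanumeric token.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: rewrites punctuation that should
// survive analysis into letters-only placeholder words.
public class PunctuationEscaper {

    // Assumed placeholder; pick a string that never occurs in real content.
    // More characters can be added here in the same way.
    private static final Map<String, String> PLACEHOLDERS = new LinkedHashMap<>();
    static {
        PLACEHOLDERS.put("&", "xxampxx");
    }

    // Apply before handing text to the analyzer (at index time and query time).
    public static String escape(String text) {
        String result = text;
        for (Map.Entry<String, String> e : PLACEHOLDERS.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    // Reverse the mapping, e.g. before showing a stored term to a user.
    public static String unescape(String text) {
        String result = text;
        for (Map.Entry<String, String> e : PLACEHOLDERS.entrySet()) {
            result = result.replace(e.getValue(), e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        String escaped = escape("AT&T");
        System.out.println(escaped);            // ATxxampxxT
        System.out.println(unescape(escaped));  // AT&T
    }
}
```

The same `escape` call would have to be applied to query strings as well, otherwise a search for the original punctuation would not match the escaped terms in the index.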

Re: Looking for a way to customize how StandardAnalyzer handles punctuation

2008-12-10 Thread Grant Ingersoll
Let's take a quick step back and see if it helps. Why do you feel you need the StandardAnalyzer to solve your problem? What else are you gaining from it? Would you be better served by a WhitespaceTokenizer? That being said, hacking up the grammar isn't as bad as you might think. There a
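
To make the trade-off behind the WhitespaceTokenizer question concrete, here is a plain-Java approximation of whitespace-only splitting. It does not call Lucene at all; the class name `WhitespaceSplitDemo` is hypothetical and `String.split` merely stands in for the tokenizer's behavior. Splitting on whitespace keeps the ampersand, but it also leaves commas and periods attached to the neighboring words.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class: approximates whitespace-only tokenization
// with String.split instead of using Lucene's tokenizer.
public class WhitespaceSplitDemo {

    // Split on runs of whitespace; all punctuation stays inside the tokens.
    public static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        // Prints [AT&T,, Inc., rocks]: the ampersand survives,
        // but so do the trailing comma and period.
        System.out.println(tokenize("AT&T, Inc. rocks"));
    }
}
```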

Looking for a way to customize how StandardAnalyzer handles punctuation

2008-12-09 Thread Greg Shackles
Hey everyone, I'm running into a problem where some punctuation that I actually want to keep gets thrown out because it doesn't get tokenized. By far the most common case is the ampersand, but it happens with other characters as well. My concern isn't even so much in that I need to be able t