Re: Performing a like query

2006-10-10 Thread Rahil
Hi Erick, I think you've made some really important observations. Steven has provided a good regular expression to help with words and non-words. For the moment I have reverted to my analyser and am going to try some clever pattern matching later. Also, I'll try using a different analyser

Re: Performing a like query

2006-10-09 Thread Steven Rowe
Hi Rahil, Rahil wrote: > I was just wondering whether there is a difference between the regular expression you sent me, i.e. (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* and (ii) \\b, as they lead to the same output. For example, the string search "testing a-new string=3/4

Re: Performing a like query

2006-10-09 Thread Rahil
Hi Steve, thanks for your response. I was just wondering whether there is a difference between the regular expression you sent me, i.e. (i) \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* and (ii) \\b, as they lead to the same output. For example, the string search "testing a-new string=3/4

Re: Performing a like query

2006-10-06 Thread Steven Rowe
Steven Rowe wrote: > \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S))\s* Oops, here's an improved version to cover the beginning- and end-of-string non-alphanumeric cases (e.g. "=some text-"): \s*(?:\b|(?<=\S)(?=\s)|(?<=\s)(?=\S)|\A|\z)\s*

Re: Performing a like query

2006-10-06 Thread Steven Rowe
Hi Rahil, Rahil wrote: > I couldn't figure out a valid regular expression for Pattern.compile(String regex) that can tokenise a string such as "O/E - visual acuity R-eye=6/24" into "O", "/", "E", "-", "visual", "acuity", "R", "-", "eye", "=", "6", "/", "24". The following regular e
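The tokenisation asked for above can also be sketched by matching the tokens themselves instead of splitting on boundaries. This is not the regular expression from Steven's message (which is cut off in this listing); it is a swapped-in alternative using only java.util.regex. The class name FindTokenizer and the ASCII-only definition of "alphanumeric" are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene or of the original thread.
final class FindTokenizer {
    // Either a run of ASCII letters/digits, or exactly one character that is
    // neither alphanumeric nor whitespace. Whitespace never matches, so it is dropped.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z0-9]+|[^A-Za-z0-9\\s]");

    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            tokens.add(m.group()); // each match is one token, delimiters included
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [O, /, E, -, visual, acuity, R, -, eye, =, 6, /, 24]
        System.out.println(tokenize("O/E - visual acuity R-eye=6/24"));
    }
}
```

Because every symbol is emitted as its own token, a term such as "6/12" would come out as "6", "/", "12" under this sketch; whether that suits the index depends on how the query side is analysed.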

Re: Performing a like query

2006-10-06 Thread Erick Erickson
My intuition is that you'll have a real problem using regular expressions. It'll either be incredibly ugly (and unmaintainable) or just won't work since the regular expression tools tend to throw out the delimiters. I think you'll be much better off writing your own analyzer (see LIA, the synonym
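As a rough illustration of the hand-written route suggested above, the scanning logic such an analyzer would need can be sketched without any regular expression. This shows only the character loop, not the Lucene Analyzer or Tokenizer API; the class name CharScanTokenizer is hypothetical, and treating every non-alphanumeric, non-whitespace character as its own token is an assumption based on the examples in this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scanning step only; wiring it into a Lucene
// analyzer is deliberately left out.
final class CharScanTokenizer {
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetterOrDigit(c)) {
                word.append(c); // keep growing the current word
                continue;
            }
            if (word.length() > 0) { // a delimiter or space ends the word
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (!Character.isWhitespace(c)) {
                tokens.add(String.valueOf(c)); // keep the delimiter as a token
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString()); // flush the final word
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [6, /, 12]
        System.out.println(tokenize("6/12"));
    }
}
```

The delimiters survive because they are added explicitly, which is the property a split-based regular expression tends to lose.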

Re: Performing a like query

2006-10-06 Thread Rahil
Hi Erick, I'm having trouble writing a good regular expression for the PatternAnalyzer to deal with word and non-word characters. I couldn't figure out a valid regular expression for Pattern.compile(String regex) that can tokenise a string such as "O/E - visual acuity R-eye=6/24"

Re: Performing a like query

2006-10-02 Thread Chris Hostetter
: I have a custom-built Analyzer where I also tokenize all the non-whitespace characters available in the field "TERM" (which is the only field being tokenised). If I now query my index file for a term "6/12" for instance, I get back only ONE result instead of TWO. There is another token

Re: Performing a like query

2006-10-01 Thread Erick Erickson
Well, I'm not the greatest expert, but a quick look doesn't show me anything obvious. But I have to ask, wouldn't WhitespaceAnalyzer work for you? Although I don't remember whether WhitespaceAnalyzer lowercases or not. It sure looks like you're getting reasonable results given how you're tokenizi

Re: Performing a like query

2006-10-01 Thread Rahil
Hi Erick, thanks for your response. There's a lot to chew on in your reply and I'm looking at the suggestions you've made. Yes, I have Luke installed and have queried my index, but I'm not getting any great explanation out of it. A query for "6/12" is sent as "TERM:6/12" which is quite

Re: Performing a like query

2006-10-01 Thread Erick Erickson
Most often, from what I've seen on this e-mail list, unexpected results are because you're not indexing on the tokens you *think* you're indexing. Or not searching on them. By that I mean that the analyzers you're using are behaving in ways you don't expect. That said, I think you're getting exac

Performing a like query

2006-10-01 Thread Rahil
Hi, I have a custom-built Analyzer where I also tokenize all the non-whitespace characters available in the field "TERM" (which is the only field being tokenised). If I now query my index file for a term "6/12" for instance, I get back only ONE result: SCORE DESCRIPTION STATUS CONCEPTID