Re: Spaces in regular expressions

2016-02-13 Thread Jack Krupansky
Just to be clear, the whitespace tokenizer would treat "A=foo(){" as a single token. I presume you want "A" and "foo" to be separate terms. You still haven't indicated what regex you were considering. Try explaining your query in plain English. I mean, do you want to search for two keywords with a
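The single-token behaviour described above can be illustrated without Lucene at all. The following is a minimal sketch that only approximates whitespace tokenization by splitting on runs of whitespace with the JDK; the class and method names are hypothetical and it does not call the Lucene tokenizer itself.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: approximates whitespace tokenization with String.split.
class WhitespaceSplitSketch {

    // Splits on runs of whitespace only; punctuation stays attached to its neighbours.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("A=foo(){"));    // prints [A=foo(){]
        System.out.println(tokenize("A = foo() {")); // prints [A, =, foo(), {]
    }
}
```

With no spaces in the input, "A" and "foo" never become terms of their own, which is why a term-level query for either of them would not find this statement.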

Re: Spaces in regular expressions

2016-02-13 Thread Kudrettin Güleryüz
As mentioned, the document is source code. As you know, all of the statements below are equivalent: A = foo() { A=foo(){ A= foo(){ ... With the standard whitespace analyzer in action, the statement I want to match can span one to five terms in this case. If the spacing were fixed, I could use either a phrase search or rege

Re: Spaces in regular expressions

2016-02-13 Thread Jack Krupansky
Obviously you wouldn't need a regex for simple terms like foo and bar - just use simple terms and a quoted phrase to match "foo bar". If you really do need complex regex patterns that match across adjacent terms, your best bet is to keep a copy of the source text in a separate string (not
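To make the cross-term idea concrete, here is a minimal sketch of a spacing-tolerant pattern applied to the raw source text as one string, using only java.util.regex. The class name, the pattern, and the sample statement are illustrative assumptions, and how the untokenized copy would be stored or queried in Lucene is not shown.

```java
import java.util.regex.Pattern;

// Hypothetical helper: runs one regex over the whole source text, not over terms.
class RawTextRegexSketch {

    // \s* between the pieces tolerates any amount of spacing, including none.
    static final Pattern ASSIGN_FOO =
            Pattern.compile("\\bA\\s*=\\s*foo\\s*\\(\\s*\\)\\s*\\{");

    static boolean containsAssignment(String source) {
        return ASSIGN_FOO.matcher(source).find();
    }

    public static void main(String[] args) {
        System.out.println(containsAssignment("A = foo() {")); // prints true
        System.out.println(containsAssignment("A=foo(){"));    // prints true
        System.out.println(containsAssignment("A= foo(){"));   // prints true
        System.out.println(containsAssignment("B = foo() {")); // prints false
    }
}
```

Because the pattern sees the text before any tokenization, the number of terms the analyzer would have produced no longer matters.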

RE: Spaces in regular expressions

2016-02-13 Thread Uwe Schindler
Hi, That's very easy to explain: Regexp queries only work on terms, as you already said in your introduction. There is no phrase query in Lucene that accepts regular expressions. Uwe
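The term-level restriction can be pictured with a small sketch that tests a pattern against each term separately, requiring the whole term to match. This is only an illustration with java.util.regex, whose syntax differs from Lucene's regexp syntax; the class and method names are hypothetical and no Lucene query is executed.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: a pattern is tried against one term at a time.
class TermRegexSketch {

    // True if at least one single term matches the pattern in full.
    static boolean anyTermMatches(List<String> terms, String regex) {
        Pattern pattern = Pattern.compile(regex);
        for (String term : terms) {
            if (pattern.matcher(term).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("foo", "bar"); // terms of the text "foo bar"
        System.out.println(anyTermMatches(terms, "foo"));     // prints true
        System.out.println(anyTermMatches(terms, "ba."));     // prints true
        System.out.println(anyTermMatches(terms, "foo bar")); // prints false
        System.out.println(anyTermMatches(terms, "foo.*bar")); // prints false
    }
}
```

No term produced by whitespace splitting contains a space, so a pattern that needs to span "foo" and "bar" has nothing it can match.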

Spaces in regular expressions

2016-02-13 Thread Kudrettin Güleryüz
Hello, I am using the standard whitespace analyzer to index a source code document with Lucene 5. I understand that a document with the content foo bar would have only two terms: foo and bar. When I search for "foo bar", it matches the document as expected. Similarly, a regexp query /foo/ or /bar/ also matc