Just to be clear, the whitespace tokenizer would treat "A=foo(){" as a
single token. I presume you want "A" and "foo" to be separate terms.
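
To make that concrete, here is a minimal sketch that imitates
whitespace-only splitting in plain Java. It does not use Lucene; the
class name WhitespaceSplitSketch and the method tokenize are hypothetical,
and String.split on runs of whitespace is only assumed to approximate what
a whitespace tokenizer produces for these inputs.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: imitates whitespace-only tokenization with the JDK.
final class WhitespaceSplitSketch {

    // Splits on runs of whitespace only; punctuation stays attached
    // to whatever characters it touches.
    static List<String> tokenize(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    public static void main(String[] args) {
        System.out.println(tokenize("A=foo(){"));    // one token
        System.out.println(tokenize("A= foo(){"));   // two tokens
        System.out.println(tokenize("A = foo() {")); // four tokens
    }
}
```

With splitting like this, "A" and "foo" only become standalone terms when
the author happened to put spaces around them.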
You still haven't indicated what regex you were considering. Try explaining
your query in plain English. I mean, do you want to search for two keywords
with a
As mentioned, the document is source code. As you know, all of the
statements below are equivalent:
A = foo() {
A=foo(){
A= foo(){
...
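
As a sketch of what "equivalent" means here, the spellings above can be
matched by one pattern over the raw source text if whitespace is made
optional between symbols. This uses java.util.regex on a plain string,
not a Lucene query, and Lucene's own regexp syntax differs from it. The
names SpacingTolerantMatchSketch and containsStatement are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper: matches the statement regardless of spacing.
final class SpacingTolerantMatchSketch {

    // \s* between every symbol lets the spacing vary freely;
    // \b keeps "A" from matching the tail of a longer identifier.
    static final Pattern ASSIGN_FOO =
            Pattern.compile("\\bA\\s*=\\s*foo\\s*\\(\\s*\\)\\s*\\{");

    static boolean containsStatement(String source) {
        return ASSIGN_FOO.matcher(source).find();
    }

    public static void main(String[] args) {
        String[] samples = {"A = foo() {", "A=foo(){", "A= foo(){", "B = foo() {"};
        for (String s : samples) {
            System.out.println(s + " -> " + containsStatement(s));
        }
    }
}
```

The first three samples match and the last does not, because its
left-hand side is B rather than A.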
With the standard whitespace analyzer in action, the statement I want to
match can span anywhere from one to five terms in this case. If the spacing
were fixed, I could go with either a phrase search or rege
Obviously you wouldn't need a regex for simple terms like foo and bar
- just use simple terms and a quoted phrase to match "foo bar". If you
really do need complex regex patterns that match across adjacent terms,
your best bet is to keep a copy of the source text in a separate string (not
Hi,
That's very easy to explain: regexp queries only work on individual terms,
as you already said in your introduction. There is no phrase query in
Lucene that accepts regular expressions.
Uwe
-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Hello,
I am using the standard whitespace analyzer to index a source code document
with Lucene 5.
I understand that a document with the content foo bar would have only two
terms: foo and bar. When I search for "foo bar", it matches the document
as expected. Similarly, a regexp query /foo/ or /bar/ also matc