Re: How to use RegexTermEnum

2009-07-04 Thread Raf
Yes, I thought about this solution too, but the problem is that the "sid" part can be different in different domains. So, sometimes we have sid=..., other times we have s= and so on. If we decide to solve the problem by removing the sid from the url in the index, when we discover a new "patter

Re: How to use RegexTermEnum

2009-07-04 Thread Raf
It works, thanks. I thought I had to call next() to know IF there was a term, as you normally do with hasNext() - next() using iterators, but I was wrong. So, in order to know if there is a match, I have to check if rte.term() is null, correct? Then I can use next() to look for additional matches.
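
For readers following along, the loop shape described above (check term() for null first, then advance with next()) can be sketched as below. This is only an illustration under stated assumptions: RegexEnum is a hypothetical stand-in written with the JDK alone, not Lucene's RegexTermEnum, and it only mimics the contract discussed in this thread (the constructor positions the enum on the first match, term() returns null when nothing matches, next() returns false when exhausted).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class PositionedEnumSketch {

    /** Hypothetical stand-in for a regex-filtered term enum; NOT Lucene's class. */
    static final class RegexEnum {
        private final List<String> terms;
        private final Pattern pattern;
        private int pos = -1;
        private String current;

        RegexEnum(List<String> terms, String regex) {
            this.terms = terms;
            this.pattern = Pattern.compile(regex);
            next(); // the constructor already positions us on the first match, if any
        }

        /** Current matching term, or null if there is none. */
        String term() {
            return current;
        }

        /** Advances to the next matching term; returns false when exhausted. */
        boolean next() {
            while (++pos < terms.size()) {
                String candidate = terms.get(pos);
                if (pattern.matcher(candidate).matches()) {
                    current = candidate;
                    return true;
                }
            }
            current = null;
            return false;
        }
    }

    /** Collects every match using the "check term(), then next()" idiom. */
    static List<String> collectMatches(List<String> terms, String regex) {
        List<String> out = new ArrayList<>();
        RegexEnum rte = new RegexEnum(terms, regex);
        if (rte.term() == null) {
            return out; // no match at all
        }
        do {
            out.add(rte.term()); // consume the current term BEFORE advancing
        } while (rte.next());
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("a1", "b2", "a3");
        System.out.println(collectMatches(terms, "a\\d"));
    }
}
```

The point of the do/while is that calling next() before reading term() would skip the first match, which is the mistake described in the message above.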

Re: How to use RegexTermEnum

2009-07-04 Thread Shayak Sen
I might be skirting the issue here, but wouldn't it be easier and faster if you remove the sid before you add it to the index? Cheers, Shayak On Sat, Jul 4, 2009 at 3:03 AM, Erick Erickson wrote: > WARNING: I haven't actually tried using RegexTermEnum in a > long time, but... > > I *think* that th
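
The "remove the sid before indexing" idea might look roughly like the sketch below. It is an assumption-laden illustration, not code from the thread: the class name SessionIdStripper, the example.com URLs, and the parameter names passed to the constructor are all hypothetical. It drops configured query parameters from a URL string so that two URLs differing only in their session id normalize to the same key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SessionIdStripper {
    // Names of query parameters treated as session ids (assumed, configurable).
    private final Set<String> sessionParams;

    SessionIdStripper(String... names) {
        this.sessionParams = new HashSet<>(Arrays.asList(names));
    }

    /** Returns the url with every configured session parameter removed. */
    String strip(String url) {
        // Split off the fragment first so a '?' inside it is not misread.
        int hash = url.indexOf('#');
        String fragment = hash >= 0 ? url.substring(hash) : "";
        String rest = hash >= 0 ? url.substring(0, hash) : url;

        int q = rest.indexOf('?');
        if (q < 0) {
            return url; // no query string, nothing to strip
        }
        String base = rest.substring(0, q);
        List<String> kept = new ArrayList<>();
        for (String pair : rest.substring(q + 1).split("&")) {
            if (pair.isEmpty()) {
                continue;
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            if (!sessionParams.contains(name)) {
                kept.add(pair); // keep every parameter that is not a session id
            }
        }
        String query = kept.isEmpty() ? "" : "?" + String.join("&", kept);
        return base + query + fragment;
    }

    public static void main(String[] args) {
        SessionIdStripper stripper = new SessionIdStripper("sid", "s");
        System.out.println(stripper.strip("http://example.com/page?id=1&sid=abc"));
    }
}
```

Because the parameter names are passed in rather than hard-coded, a newly discovered session parameter only needs to be added to the list, although URLs already indexed under the old rule would still have to be re-normalized.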

Re: How to use RegexTermEnum

2009-07-03 Thread Erick Erickson
WARNING: I haven't actually tried using RegexTermEnum in a long time, but... I *think* that the constructor positions you at the first term that matches, without calling next(). At least there's nothing I saw in the documentation that indicates you need to call next() before calling term(). Assum

How to use RegexTermEnum

2009-07-03 Thread Raf
Hi, I am trying to solve the following problem: In my index I have a "url" field added as Field.Store.YES, Field.Index.NOT_ANALYZED and I must use this field as a "key" to identify a document. The problem is that sometimes two urls can differ only because they contain a different session id: i.e.