Or take a look at the search.regex.RegexQuery contrib module.  You won't
be able to use that via QueryParser either.

It might make more sense to do the sanitizing before indexing rather than after.
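
As a rough sketch of what sanitizing before indexing could look like, the plain-Java helper below masks SSN-like tokens in the raw text before it is handed to the IndexWriter. The class name SsnScrubber, the mask string, and the assumption that the three digit groups are separated by a hyphen or a single space are illustrative choices, not anything Lucene provides; adjust the pattern to whatever the documents actually contain.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): masks SSN-like tokens in raw
// text so they never reach the index.
final class SsnScrubber {
    // Assumed format: 3 digits, 2 digits, 4 digits, each group separated by
    // a hyphen or a single space. The lookarounds stop the pattern from
    // matching inside a longer run of digits.
    private static final Pattern SSN =
        Pattern.compile("(?<!\\d)\\d{3}[- ]\\d{2}[- ]\\d{4}(?!\\d)");

    // True if the text contains at least one SSN-like token.
    static boolean containsSsn(String text) {
        return SSN.matcher(text).find();
    }

    // Replaces every SSN-like token with a fixed mask.
    static String scrub(String text) {
        return SSN.matcher(text).replaceAll("XXX-XX-XXXX");
    }

    public static void main(String[] args) {
        // prints: SSN XXX-XX-XXXX on file
        System.out.println(scrub("SSN 123-45-6789 on file"));
    }
}
```

Calling scrub() on each field value just before adding it to the Document keeps the numbers out of the index entirely, so nothing has to be searched for afterwards.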


--
Ian.


On Fri, Mar 2, 2012 at 7:26 AM, Trejkaz <trej...@trypticon.org> wrote:
> On Fri, Mar 2, 2012 at 6:22 PM, su ha <s_han...@yahoo.com> wrote:
>> Hi,
>> I'm new to Lucene. I've indexed some documents with Lucene and need to
>> sanitize them to ensure that they do not contain any social security
>> numbers (3 digits, 2 digits, 4 digits).
>>
>> (How) can I write a query (with the QueryParser) that searches for this
>> pattern?
>>
>> E.g. I can do [000 TO 999] or [00 TO 99] or [0000 TO 9999], but this
>> causes hits with any 2-, 3- or 4-digit number.
>> With something like "[000 TO 999] [00 TO 99] [0000 TO 9999]", I get no
>> hits at all.
>>
>> Is this possible with the default QueryParser?
>> Or is there some other programmatic way to do it?
>
> The programmatic way is to use SpanMultiTermQueryWrapper around each
> RangeQuery and then SpanNearQuery around the lot.
>
> The default QueryParser probably can't do it. I believe someone was
> enhancing it for wildcards but I'm not sure if range queries were
> included in all that.
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>
