Erick Erickson wrote:
OK, a not very helpful answer, but "of course the span versions are
slower, they do more work". But that's fairly useless, since the question
is really "is it enough slower in my situation that I need to find an
alternative?". And the only way I know of to answer that question is to
run some tests with data that represents my particular problem.

Sorry I can't be more help....
Erick

On 9/1/06, Mark Miller <[EMAIL PROTECTED]> wrote:

Erick Erickson wrote:
> Let me chime in here on a different note.... before you get happy with
> wildcard queries, take a look at the thread "I just don't get wildcards
> at all". There is lots of good info that Erik, Chris and Otis provided
> me.
>
> The danger with PrefixQuery and WildcardQuery is that they will throw
> TooManyClauses exceptions when you start matching a large number of
> terms (the default limit is 1024, although you can make this much bigger
> if memory allows). If you're aware of this and it is and will be OK in
> your app, ignore this. But if your index is going to grow significantly,
> this is a real problem. I went with implementing filters with
> WildcardTermEnum (you could also use RegexTermEnum) for the wildcard
> portions of my query. That has interesting implications for spans: we
> elected to say spans didn't work with wildcards.
>
> Anyway, as I said, if you're aware of the TooManyClauses issue and are
> sure it doesn't matter, ignore me. After all, everybody else does
> <G>.....
>
>
> Best
> Erick
>
>
>
> On 8/30/06, Mark Miller <[EMAIL PROTECTED]> wrote:
>>
>> Ignore that last question. I see that you said prefix wildcard query
>> and not wildcard query. A quick look at the code seems to show it
>> grabbing a prefix as well.
>>
>> Do you think one would be any faster than the other? Should I use
>> WildcardQuery outside of span queries and RegexQuery inside span
>> queries, or use regex in both places?
>>
>> - Mark
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
Thanks a lot for the info, Erick. Good stuff to know for sure.
I guess the real question I have been trying to spit out is this:
Is a span version of any of these searches--fuzzy, wildcard,
etc--inherently slower than its non-span brother? If they have the
same limitations and speeds then that is all I am looking for.

P.S.
I realize I have been screwing up the threading by replying when
starting a new topic. I have been alerted and will stop this pernicious
activity.

- Mark

Thanks Erick. You're always more than helpful. The reason I only care that they are as good as they can be is that I am looking for a general solution and not one tailored to a particular dataset. This is for a general query parser. I want to be able to search for wildcards, fuzzies, etc. in a proximity search, for example: mark*off NEAR Bork?on. This may just be a slow query in general, but other search engines appear to offer this, and they must face similar limitations. So if a fuzzy search is slow in a proximity search just because it is slow...I don't mind. If it is slow because Lucene implements spans in a way that makes wildcards and fuzzies particularly slow in them...that's what I would like to know. And if that is the case...someone should make a fuzzy and wildcard that is fast in a span :)

- Mark
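
To make the two costs discussed in this thread concrete, here is a minimal toy sketch in plain Java. It is not Lucene code and none of the names below are Lucene classes; they are hypothetical helpers for illustration. It assumes a single document represented as a map from term to positions, and it uses that map's keys as a stand-in for the index's term dictionary. The wildcard is first expanded into every matching term (this is where a clause limit such as the default of 1024 gets hit), and the NEAR check then has to compare positions on top of that, which is the extra work the span version does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

// Toy model only: none of these names are Lucene classes.
class WildcardNearSketch {
    // Mirrors the default clause limit of 1024 mentioned in the thread.
    static final int MAX_CLAUSES = 1024;

    // Convert a wildcard pattern (* = any run of chars, ? = one char) to a regex.
    static Pattern toRegex(String wildcard) {
        StringBuilder sb = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') sb.append(".*");
            else if (c == '?') sb.append('.');
            else sb.append(Pattern.quote(String.valueOf(c)));
        }
        return Pattern.compile(sb.toString());
    }

    // Expand a wildcard against a term dictionary; fail once the match count
    // would exceed maxClauses (a stand-in for a TooManyClauses-style error).
    static List<String> expand(String wildcard, Iterable<String> terms, int maxClauses) {
        Pattern p = toRegex(wildcard);
        List<String> out = new ArrayList<>();
        for (String t : terms) {
            if (p.matcher(t).matches()) {
                if (out.size() == maxClauses) {
                    throw new IllegalStateException("too many clauses: limit is " + maxClauses);
                }
                out.add(t);
            }
        }
        return out;
    }

    // Collect every position in the document for any of the expanded terms.
    static List<Integer> positions(List<String> expanded, Map<String, int[]> docPositions) {
        List<Integer> out = new ArrayList<>();
        for (String t : expanded) {
            int[] ps = docPositions.get(t);
            if (ps != null) for (int p : ps) out.add(p);
        }
        return out;
    }

    // Unordered NEAR: true if some left match and some right match occur
    // within maxDistance positions of each other.
    static boolean near(String left, String right, Map<String, int[]> docPositions,
                        int maxDistance, int maxClauses) {
        // Assumption: the document's own terms double as the term dictionary.
        List<Integer> a = positions(expand(left, docPositions.keySet(), maxClauses), docPositions);
        List<Integer> b = positions(expand(right, docPositions.keySet(), maxClauses), docPositions);
        for (int x : a) {
            for (int y : b) {
                if (x != y && Math.abs(x - y) <= maxDistance) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, int[]> doc = new TreeMap<>();
        doc.put("markedoff", new int[] {3});
        doc.put("borkson", new int[] {5});
        doc.put("lucene", new int[] {0, 9});
        System.out.println(near("mark*off", "bork?on", doc, 2, MAX_CLAUSES));
    }
}
```

In this model the expansion cost is the same whether or not proximity is requested; the NEAR variant only adds the position comparison. Whether that matches the relative cost inside Lucene's span implementation is exactly the open question in this thread, so treat the sketch as a way to frame measurements, not as an answer.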
