Hi, thanks for the tip !! Yes, basically, I would like to reduce the number of comparisons. Using this prefix length seems doable for my problem.. (even though I'm not 100% sure it is appropriate, this has to be investigated)
Is there a way to use this prefix length (or something similar) on some other property than the city ? In fact, I can also index the Country, so if this prefix length could be useable on the country, this could easily divide the search space by 400, which is way better than /26... Any idea ? Sami Dalouche Le mercredi 31 mai 2006 à 20:53 +0100, markharw00d a écrit : > >>I tried the cityName:city~0.8, and it is still not fast enough.. > >>something around 2 seconds... to return only 2 results... > > OK, so we trimmed down the search terms we actually used in the query but I > suspect what you are seeing is the effect of having to perform edit-distance > comparisons on ALL town names to get to this shortlist. If this is the case > then you'll probably be seeing a lot of CPU activity. One way of avoiding > this is to set the "prefix length" parameter on fuzzy queries to at least > one. This determines if you are comparing Rambouillet with ALL terms (as the > default zero prefix length setting does) or just those beginning with "R". > > Assuming an even spread of town names to letters that would cut the > computation down to 1/26th of the original cost. > > Cheers > Mark > > > > > Sami Dalouche wrote: > > >Hi, > > > >Compass offers me any kind of control Lucene does. it gives access to > >the low level Lucene API if you want too, so if you have a nice way of > >optimizing it, I can have Compass adapt to that. > > > > > >I tried the cityName:city~0.8, and it is still not fast enough.. > >something around 2 seconds... to return only 2 results... > >(city:Rambouillet~0.8) > > > >Sami Dalouche > > > >Le mercredi 31 mai 2006 à 09:28 +0100, mark harwood a écrit : > > > > > >>>>Actually, I am not using Lucene directly, but a > >>>> > >>>> > >>wrapper called compass > >> > >> > >>I don't know what controls it offers you then. > >>One option which could offer a speed up is to raise > >>the minimum quality match threshold above the default > >>of 0.5 and use a query string like this: > >> > >> cityName:London~0.8 > >> > >>This would reduce the number of alternative terms > >>considered and therefore the query time. > >> > >> > >>--- Sami Dalouche <[EMAIL PROTECTED]> wrote: > >> > >> > >> > >>>Hi, > >>> > >>>1) Actually, I am not using Lucene directly, but a > >>>wrapper called > >>>compass. I am using the find() method of the > >>>CompassSession, which code > >>>is : > >>>public CompassHits find(String query) throws > >>>CompassException { > >>> return > >>> > >>> > >>> > >>createQueryBuilder().queryString(query).toQuery().hits(); > >> > >> > >>> } > >>>And all of these objects are pure wrappers around > >>>lucene equivalents, > >>>nothing more. > >>> > >>> > >>>2) What I am timing is only the find call : > >>>-- start timer > >>>CompassHits hits = compassSession.find("cityName:"+ > >>>name+"~"); > >>>-- stop timer > >>> > >>>3) I am not sorting anything, but lucene is > >>>returning the hits by > >>>relevance. Does this count as sorting ? > >>> > >>>4) I tried to time the thing for ~10 queries, and > >>>the results are > >>>roughly the same. Can go down to 2 seconds, which is > >>>still way too > >>>much... > >>> > >>>Thanks for helping > >>>sami Dalouche > >>> > >>>On Tue, 2006-05-30 at 13:58 -0700, Chris Hostetter > >>>wrote: > >>> > >>> > >>>>: Fuzzy searching against this property takes > >>>> > >>>> > >>>around 3 seconds, which is > >>> > >>> > >>>>: way too much for what I plan to do, so I am > >>>> > >>>> > >>>considering the possible > >>> > >>> > >>>>whenever anyone has a question about how to speed > >>>> > >>>> > >>>up a search, and the > >>> > >>> > >>>>current amount of time the search takes is more > >>>> > >>>> > >>>then a second, there are a > >>> > >>> > >>>>few questions i allways want to ask: > >>>> > >>>> 1) what method exactly on the Searcher interface > >>>> > >>>> > >>>are you using the > >>> > >>> > >>>> execute the search? > >>>> 2) what exactly are you timing? (the time the > >>>> > >>>> > >>>search method call takes?, > >>> > >>> > >>>> the time it takes you to iterate over the > >>>> > >>>> > >>>results? etc...) > >>> > >>> > >>>> 3) are you sorting by any particular field? > >>>> 4) are you reusing the Searcher instance for more > >>>> > >>>> > >>>then one query? are > >>> > >>> > >>>> you timing more then one query and taking the > >>>> > >>>> > >>>average? > >>> > >>> > >>>>-Hoss > >>>> > >>>> > >>>> > >>>> > >>>> > >>--------------------------------------------------------------------- > >> > >> > >>>>To unsubscribe, e-mail: > >>>> > >>>> > >>>[EMAIL PROTECTED] > >>> > >>> > >>>>For additional commands, e-mail: > >>>> > >>>> > >>>[EMAIL PROTECTED] > >>> > >>> > >>> > >>> > >>> > >>--------------------------------------------------------------------- > >> > >> > >>>To unsubscribe, e-mail: > >>>[EMAIL PROTECTED] > >>>For additional commands, e-mail: > >>>[EMAIL PROTECTED] > >>> > >>> > >>> > >>> > >> > >> > >>___________________________________________________________ > >>The all-new Yahoo! Mail goes wherever you go - free your email address from > >>your Internet provider. http://uk.docs.yahoo.com/nowyoucan.html > >> > >>--------------------------------------------------------------------- > >>To unsubscribe, e-mail: [EMAIL PROTECTED] > >>For additional commands, e-mail: [EMAIL PROTECTED] > >> > >> > >> > > > > > >--------------------------------------------------------------------- > >To unsubscribe, e-mail: [EMAIL PROTECTED] > >For additional commands, e-mail: [EMAIL PROTECTED] > > > > > > > > > > > > > Send instant messages to your online friends http://uk.messenger.yahoo.com > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]