See below... On 6/8/07, Chris Lu <[EMAIL PROTECTED]> wrote:
Thanks to all who answered with their experience and insights! LUCENE-625 is very interesting, but not sure about the scalability. "Begin completion only with 3 letters or more" is reasonable for special cases, but not ideal. What I wanted to implement is a pretty general software. WildcardTermEnum seems closest to what I planned to search on existing Lucene index, possibly pretty large. I can use it to list, say top 10 matching terms, and I can use another search to find all matching docs. This is actually 2 searches.
Not really. WildcardTermEnum is actually MUCH less costly than a search. Imagine a list that contains every term in the index, already ordered lexically. A *trailing* wildcard character is really a prefix, right? So all that's really happening is something (and I've never looked at the code, so beware <G>) get the index of the first matching term. starting with that index, increment until the term you find doesn't have that prefix. That's it. Way, way, way different than queries which find *all* the terms that could possibly match your prefix. Find *all* the documents that have *any* of those terms. Calculate the relevancy scores. Optionally sort them. But try it. In fact, try enumerating all the terms for a given prefix and take some timings. I think you'll be impressed how fast it is and you can lay your fears to rest. Sounds pertty good?
-- Chris Lu ------------------------- Instant Scalable Full-Text Search On Any Database/Application site: http://www.dbsight.net demo: http://search.dbsight.com Lucene Database Search in 3 minutes: http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes On 6/8/07, Erick Erickson <[EMAIL PROTECTED]> wrote: > You can get the information pretty quickly by using a > WildcardTermEnum (NOT query). Especially if you > terminate after some number of characters.... > > Erick > > On 6/7/07, Chris Lu <[EMAIL PROTECTED]> wrote: > > > > Hi, > > > > I would like to implement an AJAX search. Basically when user types in > > several characters, I will try to search the Lucene index and found > > all possible matching items. > > > > Seems I need to use wildcard query like "test*" to matching anything. > > Is this the only way to do it? It doesn't seems quite efficient, > > especially when you just typed in the first character. > > > > I guess the "good" way is to go through the terms, and return as soon > > as, for example, 10 terms are found. > > > > I am wondering is there anything like this already built? > > > > -- > > Chris Lu > > ------------------------- > > Instant Scalable Full-Text Search On Any Database/Application > > site: http://www.dbsight.net > > demo: http://search.dbsight.com > > Lucene Database Search in 3 minutes: > > > > http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]