You can take the same approach as the Lucene spell checker.
Build a dedicated index with one Document per word, boosted by something
proportional to its number of occurrences (with a log and some math magic).
The magic part is n Fields holding the leading n-grams of the word, not
stored and not tokenized.
For example, if you want to index the word "carott", you will index
the fields carott, carot, caro, car, ca, c.
With this huge index, you can search quickly, ordered by whatever you want.
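
As a rough illustration of the scheme above, here is a minimal in-memory sketch in plain Python, not Lucene's API. The names `leading_ngrams`, `build_index` and `suggest` are hypothetical, and the `1 + log(count)` boost is an assumption: the description only says the boost is proportional to the number of occurrences, with a log involved.

```python
import math
from collections import defaultdict


def leading_ngrams(word):
    """Return every leading n-gram of word, longest first."""
    return [word[:n] for n in range(len(word), 0, -1)]


def build_index(counts):
    """Map each leading n-gram to (boost, word) pairs, best boost first.

    counts: dict of word -> number of occurrences (assumed to be >= 1).
    The boost formula 1 + log(count) is an assumption, not Lucene's.
    """
    index = defaultdict(list)
    for word, count in counts.items():
        boost = 1.0 + math.log(count)
        for gram in leading_ngrams(word):
            index[gram].append((boost, word))
    for postings in index.values():
        # Highest boost first; ties broken alphabetically.
        postings.sort(key=lambda p: (-p[0], p[1]))
    return index


def suggest(index, typed, limit=10):
    """Return up to limit words starting with typed, most frequent first."""
    return [word for _, word in index.get(typed, [])[:limit]]
```

With `index = build_index({"carott": 3, "car": 50, "cat": 12})`, calling `suggest(index, "ca")` is a single dictionary lookup that returns the words starting with "ca", most frequent first. The price is one entry per prefix of every word, which is why the index gets huge.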
Two improvements: to reduce the index size, you can limit the number of
letters per n-gram (min and max), and then pick out the right entries
from the small result set after a request.
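
The two size reductions could look something like the sketch below, again in plain Python and not Lucene code. `MIN_GRAM`, `MAX_GRAM` and the function names are hypothetical, and the bounds 2 and 4 are arbitrary. Prefixes longer than the maximum are not indexed, so a longer query is looked up under its first `MAX_GRAM` letters and the small candidate set is then filtered against what the user actually typed.

```python
from collections import defaultdict

MIN_GRAM = 2  # assumed lower bound: one-letter prefixes are not indexed
MAX_GRAM = 4  # assumed upper bound: longer prefixes share the 4-letter entry


def bounded_ngrams(word, min_gram=MIN_GRAM, max_gram=MAX_GRAM):
    """Leading n-grams of word whose length lies in [min_gram, max_gram]."""
    longest = min(len(word), max_gram)
    return [word[:n] for n in range(min_gram, longest + 1)]


def build_bounded_index(words):
    """Map each bounded leading n-gram to the set of words starting with it."""
    index = defaultdict(set)
    for word in words:
        for gram in bounded_ngrams(word):
            index[gram].add(word)
    return index


def lookup(index, typed, max_gram=MAX_GRAM):
    """Find words starting with typed, filtering the candidate set afterwards."""
    candidates = index.get(typed[:max_gram], set())
    # The post-filter only matters when typed is longer than max_gram,
    # but it is harmless for shorter input.
    return sorted(w for w in candidates if w.startswith(typed))
```

In this sketch, input shorter than `MIN_GRAM` returns nothing, and words shorter than `MIN_GRAM` are never indexed. That is the trade-off for the smaller index.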
What about bad words?
This index could be extended with two-word suggestions, like Google
and co do.
M.
On 9 June 07, at 00:09, Chris Lu wrote:
Thanks to all who answered with their experience and insights!
LUCENE-625 is very interesting, but I am not sure about its scalability.
"Begin completion only with 3 letters or more" is reasonable for
special cases, but not ideal. What I want to implement is fairly
general-purpose software.
WildcardTermEnum seems closest to what I planned: searching an existing
Lucene index, possibly a pretty large one. I can use it to list, say, the
top 10 matching terms, and I can use another search to find all matching
docs. So this is actually 2 searches.
Sounds pretty good?
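
The two-search plan can be sketched on a toy inverted index. This is plain Python, not Lucene: `postings` (a dict of term -> set of document ids) and both function names are hypothetical, and "top" here just means first in sorted term order, which is an assumption about the intended ranking.

```python
def top_matching_terms(postings, prefix, limit=10):
    """First search: up to limit terms starting with prefix, in sorted order."""
    matches = []
    # A real term dictionary is already sorted; the toy dict is sorted here.
    for term in sorted(postings):
        if term.startswith(prefix):
            matches.append(term)
            if len(matches) == limit:
                break
    return matches


def matching_docs(postings, terms):
    """Second search: ids of all documents containing any of the terms."""
    docs = set()
    for term in terms:
        docs |= postings.get(term, set())
    return sorted(docs)
```

For example, with `postings = {"test": {1, 2}, "testing": {2, 3}, "team": {5}}`, the first call lists the terms for "test" and the second call turns those terms into document ids.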
--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php?title=Create_Lucene_Database_Search_in_3_minutes
On 6/8/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
You can get the information pretty quickly by using a
WildcardTermEnum (NOT query). Especially if you
terminate after some number of characters....
Erick
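
To show why enumerating terms can be cheaper than running a wildcard query, here is a small sketch over a sorted term list. It is plain Python and does not use WildcardTermEnum itself; `enumerate_prefix` is a hypothetical name, and stopping after a fixed number of terms is only one reading of "terminate" above.

```python
from bisect import bisect_left


def enumerate_prefix(sorted_terms, prefix, limit=10):
    """Walk a sorted term list from the first term >= prefix, stopping early."""
    found = []
    # Binary search jumps straight to where matching terms would begin.
    start = bisect_left(sorted_terms, prefix)
    for i in range(start, len(sorted_terms)):
        term = sorted_terms[i]
        # Matching terms are contiguous, so the first miss ends the walk.
        if not term.startswith(prefix) or len(found) == limit:
            break
        found.append(term)
    return found
```

Because the terms are sorted, the walk touches at most `limit` matching terms plus one, even when the user has typed a single character that matches a large part of the dictionary.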
On 6/7/07, Chris Lu <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I would like to implement an AJAX search. Basically, when the user types in
> several characters, I will search the Lucene index and find
> all possible matching items.
>
> It seems I need to use a wildcard query like "test*" to match anything.
> Is this the only way to do it? It doesn't seem very efficient,
> especially when you have just typed the first character.
>
> I guess the "good" way is to go through the terms and return as soon
> as, for example, 10 terms are found.
>
> I am wondering, is there anything like this already built?
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]