You can work like with lucene spelling.
A specific Index with word as Document, boost with something proportionnal of number of occurences (with log and math magic) The magical stuff is n Fields with starting ngram, not stored, no tokenized. For example, if you wont to index the word "carott", you will index the fields carott, carot, caro, car, ca, c.
With this huge index, you can search quickly, ordered by what you wont.
Two improvement, for reducing index size, you can limit the number of letter (min and max), and extract the right one in a little set, after a request.
What about bad words?
This index could be extended with two words suggestion like Google and co do.

M.


Le 9 juin 07 à 00:09, Chris Lu a écrit :

Thanks to all who answered with their experience and insights!

LUCENE-625 is very interesting, but not sure about the scalability.
"Begin completion only with 3 letters or more" is reasonable for
special cases, but not ideal. What I wanted to implement is a pretty
general software.

WildcardTermEnum seems closest to what I planned to search on existing
Lucene index, possibly pretty large. I can use it to list, say top 10
matching terms, and I can use another search to find all matching
docs. This is actually 2 searches.

Sounds pertty good?

--
Chris Lu
-------------------------
Instant Scalable Full-Text Search On Any Database/Application
site: http://www.dbsight.net
demo: http://search.dbsight.com
Lucene Database Search in 3 minutes:
http://wiki.dbsight.com/index.php? title=Create_Lucene_Database_Search_in_3_minutes


On 6/8/07, Erick Erickson <[EMAIL PROTECTED]> wrote:
You can get the information pretty quickly by using a
WildcardTermEnum (NOT query). Especially if you
terminate after some number of characters....

Erick

On 6/7/07, Chris Lu <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I would like to implement an AJAX search. Basically when user types in > several characters, I will try to search the Lucene index and found
> all possible matching items.
>
> Seems I need to use wildcard query like "test*" to matching anything.
> Is this the only way to do it? It doesn't seems quite efficient,
> especially when you just typed in the first character.
>
> I guess the "good" way is to go through the terms, and return as soon
> as, for example, 10 terms are found.
>
> I am wondering is there anything like this already built?
>
> --
> Chris Lu
> -------------------------
> Instant Scalable Full-Text Search On Any Database/Application
> site: http://www.dbsight.net
> demo: http://search.dbsight.com
> Lucene Database Search in 3 minutes:
>
> http://wiki.dbsight.com/index.php? title=Create_Lucene_Database_Search_in_3_minutes
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]




---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to