On Tue, Feb 28, 2012 at 8:42 AM, Alan Woodward <alan.woodw...@romseysoftware.co.uk> wrote: > > On 28 Feb 2012, at 13:31, Michael McCandless wrote: > >> Neat :) It's like a FuzzyQuery w/ a custom (binary?) cost matrix for >> the insert/delete/transposition changes... >> >> Is the number of edits smallish? Ie you're not concerned about >> combinatoric explosion of step 1? > > We're only allowing expansions within an edit distance of 1, which should > keep the numbers of terms down.
Ahh, ok. So even if the term has two occurrences of cl, only one of them is allowed to substitute d? >> For steps 2 and 3 you shouldn't use FST at all. Instead, for 2) use >> BasicAutomata.makeString(String) on each of your expanded terms, then >> BasicOperations.union on all of those automata to make a single >> automaton accepting all your expanded terms, then likely call >> .determinize() on the resulting automaton (maybe also .minimize() but >> I think that may not help). Then pass that automaton to AQ. > > Excellent, thanks for your help. I'll give that a go. You might also try the DaciukMihovAutomatonBuilder class (it's in lucene/test-framework): it builds a minimal deterministic automaton from sorted terms, very quickly... you'd just have to pre-sort your terms. Mike --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org