s,
> > Xmx3g OK)
> >
> > On Thu, May 26, 2016 at 12:13 PM, Michael McCandless <
> > luc...@mikemccandless.com> wrote:
> >
> >> But how many states does the not-yet-determinized union of 5000+
> >> Levenshtein automata contain?
> >>
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Thu, May 26, 2016 at 12:08 PM, Luke Nezda wrote:
>
I should note, I know I can
call Operations.determinize(union, 10_000_000) but union of 5000+
Levenshtein automata seems to require too many states to be tractable, and
that's on the low end of what I'd like to work with.
On Thu, May 26, 2016 at 9:59 AM, Luke Nezda wrote:
> I
yeah, converting THAT to an FST is tricky...
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Wed, May 25, 2016 at 2:46 PM, Luke Nezda wrote:
>
> > Oof, sounds too tricky for me to justify pursuing right now. While
> > union'ing 10k Levenshtein
shtein
> automaton for that word, and recording the first arcs you hit that have one
> unique original word as its output, and placing outputs on those arcs, and
> then doing a "rote" conversion to the syn filter's FST format. This part
> sounds tricky :)
>
> Mi
the match character offsets
of each match in each document.
>
> Mike McCandless
>
> http://blog.mikemccandless.com
>
> On Mon, May 23, 2016 at 8:59 PM, Luke Nezda wrote:
>
Hello, all -
I'd like to use Lucene's automaton/FST code to achieve fast fuzzy (OSA edit
distance up to 2) search for many (10k+) strings (knowledge base: kb) in
many large strings (docs).
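
For reference, "OSA edit distance" here means optimal string alignment distance: insertions, deletions, substitutions, and transpositions of adjacent characters, where no substring is edited more than once. A minimal sketch of the plain dynamic-programming definition follows. The class name OsaDistance is hypothetical and this is not Lucene code; it is a brute-force baseline for checking candidate matches, not the automaton/FST approach discussed in this thread.

```java
// Hypothetical helper, not part of Lucene: optimal string alignment (OSA) distance.
final class OsaDistance {
    private OsaDistance() {}

    // Returns the minimum number of insertions, deletions, substitutions,
    // and adjacent transpositions needed to turn a into b under OSA rules.
    static int distance(String a, String b) {
        int n = a.length();
        int m = b.length();
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i; // delete all of a's prefix
        for (int j = 0; j <= m; j++) d[0][j] = j; // insert all of b's prefix
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
                // Adjacent transposition: the last two characters are swapped.
                if (i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }
}
```

A kb entry would count as a fuzzy match for a candidate token when OsaDistance.distance(entry, token) <= 2. Running this against every token for 10k+ entries is the slow path that the automaton approach is meant to avoid.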
Approach I was thinking of: create Levenshtein FST with all paths
associated with unedited form for each kb
> From: "Luke Nezda" <[EMAIL PROTECTED]>
> To:
> Sent: Sunday, December 11, 2005 6:28 PM
> Subject: Re: ApacheCon next week
>
>
Hello Grant-
Could you post the material you present (e.g., slides, handouts, etc.) for those
of us who cannot attend?
Thanks in advance,
-Luke
On 12/9/05, Grant Ingersoll <[EMAIL PROTECTED]> wrote:
>
> Anyone planning on going to ApacheCon next week? I will be giving a
> talk on Lucene on Monday af