Hi Rob,

First of all, kudos for the great work moving things from [lang] into [text].

I got a copy of the Lothaire book last weekend, but haven't had a chance
to read it yet.

There was also some discussion around the name parser. Since we couldn't
reach a consensus, I think we could either start another discussion
thread, or stash the code somewhere so that it doesn't block a release.


I would also like to implement more edit distances and string similarity
measures, as well as look into the duration unit parser, probably
adapting code from github.com/jchampemont/gunip.


But I'd vote for (4), after first moving the human name parser
elsewhere, reviewing the edit distances, and checking whether there's
anything else from [lang] that we could put into this initial release.

Once it has been released, we will be able to add things from the
Lothaire book and more edit distances, maybe bring back the name parser,
and make any other enhancements and bug fixes.

Bruno

>________________________________
> From: Rob Tompkins <chtom...@gmail.com>
>To: Commons Developers List <dev@commons.apache.org> 
>Sent: Tuesday, 29 November 2016 11:45 AM
>Subject: [text] Next steps.
> 
>
>Hello,
>
>I'm a tad curious what folks (along with Gary, Benedikt, and Bruno) think
>the next steps are for text in the hopeful thought that we are eventually
>heading towards a 1.0 release. Some thoughts that come to mind are:
>
>(1) Go over lang with a fine-tooth comb and see what we think should move,
>(2) Go through the Lothaire "Applied Combinatorics on Words" book (
>http://lipn.univ-paris13.fr/~duchamp/Books&more/Lothaire/(Encyclopedia_of_Mathematics_and_its_Applications_)M._Lothaire-Applied_Combinatorics_On_Words-Cambridge_University_Press(2005).pdf)
>and minimally implement some of the standard algorithms.
>(3) Implement, from the Lothaire book, some of the more complex stuff:
>heavier pattern matching, and/or natural language processing,
>and/or
>(4) Go straight for a release.
>
>I'm less for (4) because I think there are probably some smaller bits of
>code in lang that should come over. I like the idea of (2) before heading
>out the door. Regarding (3), I would have to do considerable reading to
>make considerable headway here. I'm not opposed to doing that, but it
>would prolong getting to a 1.0 release if we predicated the release upon
>my getting that done.
>
>So, what do you guys think?
>
>Cheers,
>-Rob
>
>
>
