Hi ebourg!

> I think a PROPOSAL.html file defining the scope and goals of the component
> would be a good idea.
+1
> > - Inclusion of Talend code into [text] is not possible (the code is
> > licensed by www.talend.com)
>
> What is this code about?
It comes from Talend Open Studio [1]. More specifically, it is the part of the
solution that provides a mechanism for combining multiple algorithms, using
probabilities, weights and thresholds to compare attributes.

I started out using Talend Open Studio, but have already switched to Duke [2],
which is released under the Apache License, has some algorithms already
implemented [3] and includes the probabilistic methods as well.
If we decide to include the probabilistic method, it is probably a better idea
to use Duke's code.
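
To make the idea a bit more concrete, here is a minimal Java sketch of what I
mean by combining per-attribute comparisons with probabilities, weights and a
threshold. It is not code from Talend or Duke. The class name, the Rule type,
the linear mapping from similarity to probability and the exact-match
comparator are all assumptions made up for illustration. The only standard
piece is the naive Bayes formula used to combine two probabilities.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only; every name here is hypothetical.
final class ProbabilisticMatchSketch {

    /** Per-attribute weights: probability when values differ (low) or agree (high). */
    static final class Rule {
        final double low;
        final double high;

        Rule(double low, double high) {
            // Keep both values strictly between 0 and 1 so combine() never divides by zero.
            this.low = low;
            this.high = high;
        }

        /** Assumed mapping: linear interpolation of a similarity in [0, 1] onto [low, high]. */
        double probability(double similarity) {
            return low + (high - low) * similarity;
        }
    }

    /** Placeholder comparator: 1.0 for a case-insensitive exact match, otherwise 0.0. */
    static double similarity(String a, String b) {
        return a.equalsIgnoreCase(b) ? 1.0 : 0.0;
    }

    /** Naive Bayes combination of two match probabilities treated as independent. */
    static double combine(double p1, double p2) {
        return (p1 * p2) / (p1 * p2 + (1.0 - p1) * (1.0 - p2));
    }

    /** Folds the per-attribute probabilities of two records into one overall probability. */
    static double matchProbability(Map<String, Rule> rules,
                                   Map<String, String> r1,
                                   Map<String, String> r2) {
        double overall = 0.5; // neutral starting point
        for (Map.Entry<String, Rule> e : rules.entrySet()) {
            String v1 = r1.get(e.getKey());
            String v2 = r2.get(e.getKey());
            if (v1 == null || v2 == null) {
                continue; // a missing value contributes no evidence
            }
            overall = combine(overall, e.getValue().probability(similarity(v1, v2)));
        }
        return overall;
    }

    /** Two records are treated as the same entity when the probability reaches the threshold. */
    static boolean isMatch(double probability, double threshold) {
        return probability >= threshold;
    }

    public static void main(String[] args) {
        Map<String, Rule> rules = new LinkedHashMap<>();
        rules.put("name", new Rule(0.3, 0.9));
        rules.put("city", new Rule(0.4, 0.7));

        Map<String, String> a = new LinkedHashMap<>();
        a.put("name", "Emmanuel");
        a.put("city", "Paris");
        Map<String, String> b = new LinkedHashMap<>();
        b.put("name", "emmanuel");
        b.put("city", "Paris");

        double p = matchProbability(rules, a, b);
        System.out.println(p + " match=" + isMatch(p, 0.8));
    }
}
```

In a real component the placeholder comparator would be replaced by the string
metrics we are collecting (Levenshtein, Jaro-Winkler and so on), and the
low/high values per attribute would play the role of the weights.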

[1] https://www.talend.com/products/talend-open-studio
[2] https://github.com/larsga/Duke/
[3] https://github.com/larsga/Duke/tree/master/src/main/java/no/priv/garshol/duke/comparators


From: Emmanuel Bourg <ebo...@apache.org>
To: Commons Developers List <dev@commons.apache.org>
Sent: Wednesday, November 12, 2014 11:34 AM
Subject: Re: [text] Incorporating Bruno Kinoshita's work

On 12/11/2014 13:34, Benedikt Ritter wrote:

> the git repo for [text] is ready and I've done the initial bootstrapping
> already. I've also created a new component in the SANDBOX jira project. The
> first issue is to extract algorithms from [lang] [1]. I remember people
> saying that there is code in codec too. Please feel free to create
> tickets for this.

I think a PROPOSAL.html file defining the scope and goals of the
component would be a good idea.

> - Inclusion of Talend code into [text] is not possible (the code is
> licensed by www.talend.com)

What is this code about?



> - spellchecker package: nice idea, which I haven't thought about before.
> Furthermore, I could imagine a hyphenation package. Both should be locale
> dependent.

I may be wrong, but I'm under the impression a full spellchecking API is
probably too big for a small utility component.

Emmanuel Bourg


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



   
