Hi ebourg!

>I think a PROPOSAL.html file defining the scope and goals of the component
>would be a good idea.

+1

> > - Inclusion of Talend code into [text] is not possible (the code is
> > licensed by www.talend.com)
>
> What is this code about?

Talend Open Studio [1]. More specifically, the part of the solution that provides a mechanism for combining multiple algorithms and that uses probabilities, weights and thresholds for comparing attributes.

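To make "probability, weights and thresholds" a bit more concrete, here is a rough sketch of the general idea. It is not Talend's or Duke's actual code: the class name ProbabilisticMatcher, the linear mapping from similarity to probability, and the naive Bayes style combination are all assumptions made purely for illustration.

```java
import java.util.List;
import java.util.function.ToDoubleBiFunction;

/** Hypothetical sketch: combines per-attribute similarities into one match probability. */
final class ProbabilisticMatcher {

    /** One compared attribute: a similarity function plus the probabilities it maps to. */
    static final class Attribute {
        final ToDoubleBiFunction<String, String> comparator; // expected to return a similarity in [0, 1]
        final double low;  // match probability when the two values are completely different
        final double high; // match probability when the two values are identical

        Attribute(ToDoubleBiFunction<String, String> comparator, double low, double high) {
            if (low <= 0.0 || low >= 1.0 || high <= 0.0 || high >= 1.0) {
                throw new IllegalArgumentException("low and high must lie strictly between 0 and 1");
            }
            this.comparator = comparator;
            this.low = low;
            this.high = high;
        }

        /** Maps the similarity linearly onto [low, high] (an assumed weighting scheme). */
        double probability(String a, String b) {
            double sim = Math.max(0.0, Math.min(1.0, comparator.applyAsDouble(a, b)));
            return low + (high - low) * sim;
        }
    }

    /** Combines two independent probabilities, naive Bayes style. */
    static double combine(double p1, double p2) {
        double both = p1 * p2;
        return both / (both + (1.0 - p1) * (1.0 - p2));
    }

    /** Folds the per-attribute probabilities together, starting from a neutral prior of 0.5. */
    static double matchProbability(List<Attribute> attributes, String[] left, String[] right) {
        double p = 0.5;
        for (int i = 0; i < attributes.size(); i++) {
            p = combine(p, attributes.get(i).probability(left[i], right[i]));
        }
        return p;
    }

    /** Two records are treated as the same entity once the combined probability reaches the threshold. */
    static boolean isMatch(List<Attribute> attributes, String[] left, String[] right, double threshold) {
        return matchProbability(attributes, left, right) >= threshold;
    }
}
```

In this sketch an attribute with a high value close to 1 and a low value close to 0 carries a lot of weight, while one with both values near 0.5 barely moves the result; the threshold then decides at what combined probability two records count as a match.
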
I started out using Talend Open Studio, but have already switched to Duke [2], which is under the Apache License, already implements several algorithms [3] and includes the probabilistic methods as well. If we decide to include the probabilistic method, it is probably a better idea to use Duke's code.

[1] https://www.talend.com/products/talend-open-studio
[2] https://github.com/larsga/Duke/
[3] https://github.com/larsga/Duke/tree/master/src/main/java/no/priv/garshol/duke/comparators

From: Emmanuel Bourg <ebo...@apache.org>
To: Commons Developers List <dev@commons.apache.org>
Sent: Wednesday, November 12, 2014 11:34 AM
Subject: Re: [text] Incorporating Bruno Kinoshita's work

On 12/11/2014 13:34, Benedikt Ritter wrote:

> the git repo for [text] is ready and I've done the initial bootstrapping
> already. I've also created a new component in the SANDBOX jira project. The
> first issue is to extract algorithms from [lang] [1]. I remember people
> saying that there is code in codec too. Please feel free to create
> tickets for this.

I think a PROPOSAL.html file defining the scope and goals of the
component would be a good idea.

> - Inclusion of Talend code into [text] is not possible (the code is
> licensed by www.talend.com)

What is this code about?

> - spellchecker package: nice idea, which I haven't thought about before.
> Furthermore I could imagine a hyphenation package. Both should be locale
> dependent.

I may be wrong, but I'm under the impression a full spellchecking API is
probably too big for a small utility component.

Emmanuel Bourg