Hi Daniela,

2014-02-24 14:30 GMT-03:00 Daniela Meneses <daniela11...@gmail.com>:
> Hi to all,
>
> As you may know I'm working on some improvements for the String class.
> Until now I implemented some missing tests. Right now I'm looking forward
> to adding new methods that could be useful, based on the Ruby API (
> http://www.ruby-doc.org/core-2.1.0/String.html). These are a few of the
> methods that I'm planning to implement:
>
>
>    - chomp(separator=$/) -> new_str
>    - chop() -> new_str
>    - ljust(integer, padstr=' ') -> new_str
>    - next -> new_str
>    - partition(sep) -> [head, sep, tail]
>
>
> Could you help to find out if these methods are already available for the
> String class?
>
> If you have any idea of new methods for the string class, it will be really
> welcome.
>
>

We could have an information retrieval API for approximate string matching,
e.g. Levenshtein distance (already implemented, in various versions) and
Hamming distance; these two are the simplest and most widely used edit
distances. Then you have longest common subsequence and longest common
substring (they are implemented in a package called "Fuzz",
#longestCommonSubsequenceWith: ). There is also the shift-or algorithm
adapted for approximate matches (also implemented); fuzzy phrasing is a
whole other world.

Many applications use the Damerau edit distance. Bioinformatics uses
Needleman-Wunsch and Smith-Waterman, although there they are called
"aligners" :) You don't want to code the optimized versions in Smalltalk,
though; some say it could take years.

Each edit distance has its own requirements, and none is better than the
others in all cases. For example, Jaro-Winkler is useful for short, one-word
strings. You have a lot of options for research.

Smalltalkers here are very experienced and clever and always give great
advice, so don't be afraid to ask.

Cheers,

Hernán

> --
> Cheers,
> Daniela Meneses
>
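
As an illustration of the two simplest distances mentioned above, here is a minimal sketch of Hamming and Levenshtein distance. It is written in Python purely for illustration, not in Smalltalk, and the function names `hamming` and `levenshtein` are hypothetical, not selectors from any existing package. The Levenshtein version is the plain two-row dynamic-programming formulation, not one of the optimized variants.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        # Hamming distance is only defined for strings of the same length.
        raise ValueError("Hamming distance needs strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (illustrative, unoptimized)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] = distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x from a
                current[j - 1] + 1,          # insert y into a
                previous[j - 1] + (x != y),  # substitute, or free match
            ))
        previous = current
    return previous[-1]
```

For instance, levenshtein("kitten", "sitting") evaluates to 3 and hamming("karolin", "kathrin") to 3, while hamming raises ValueError for strings of different lengths. That restriction is one example of the specific requirements each distance comes with.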