On 4/26/15 3:14 AM, Benedikt Ritter wrote: > 2015-04-26 11:52 GMT+02:00 Benedikt Ritter <brit...@apache.org>: > >> >> 2015-04-19 17:29 GMT+02:00 Benedikt Ritter <brit...@apache.org>: >> >>> >>> 2015-04-19 17:24 GMT+02:00 Benedikt Ritter <brit...@apache.org>: >>> >>>> Hello Bruno, >>>> >>>> 2015-04-19 12:15 GMT+02:00 Bruno P. Kinoshita < >>>> brunodepau...@yahoo.com.br>: >>>> >>>>> Thanks for the thorough code review Benedikt! Comments inline. >>>>> >>>>>> 1. IP clearance! There are several comments talking about code >>>>> adapted from >>>>>> PHP libraries. This needs to be listed in NOTICE.txt. For an example >>>>> see >>>>>> the NOTICE.txt in [lang]. >>>>> Filed https://issues.apache.org/jira/browse/SANDBOX-497 >>>>> And first try to fix it: >>>>> >>>>> >>>>> https://git1-us-west.apache.org/repos/asf?p=commons-text.git;a=commitdiff;h=refs/heads/SANDBOX-497 >>>>> Does that look correct? I also checked with the author before porting >>>>> it to the initial Java version, and told him about the [text] component - >>>>> https://github.com/jasonpriem/HumanNameParser.php/issues/11 >>>> >>>> I've adjusted the changes a bit and merged it into master. >>>> >>>> >>>>>> 2. The names package needs work. >>>>> Completely agree. We can try to make it thread safe too. This will take >>>>> a little longer. >>>>> File new issue for tracking it >>>>> https://issues.apache.org/jira/browse/SANDBOX-498 >>>> >>>> I've pushed a new branch containing a proposal how to fix this. Please >>>> review. >>>> >>>> >>>>> >>>>>> 3. There are a some useless JavaDocs, which can be dropped. For >>>>> example: >>>>>> (...)> Better rename the variable to middleName and drop the comment. >>>>> What about renaming the variables but leaving the comments? >>>>> >>>> For I've corrected this for the names package in the SANDBOX-498 branch. >>>> >>>> >>>>>> 4. Several classes have no tests. The overall test coverage looks >>>>> good, so >>>>>> this may not be a problem. >>>>> Before the initial release I can start a development cycle just to add >>>>> new tests for parts that seem important - like code with somewhat high >>>>> [+5] >>>>> cyclomatic complexity and uncovered branches. Just to make sure that the >>>>> initial release will have an acceptable quality and less hidden bugs :-) >>>>> >>>> Nice! >>>> >>>> >>>>>> 5. I don't think the shade configuration is correct. To me it looks >>>>> like >>>>>> commons-text depends on [lang] and also shades some classes. Do we >>>>> really >>>>>> need the explicit dependency? Furthermore, why do we need to shade so >>>>> many >>>>>> classes? Why do we need anything beside StringUtils? >>>>> My mistake. I copied the [functor] configuration without much thinking. >>>>> IIRC, this dependency was added because of the names package. But it >>>>> shouldn't be too complicate to replace it. It is mainly used for checking >>>>> empty/blank values. >>>>> >>>>> So let's drop the [lang] dependency and use something like `$VAR != >>>>> null && ! "".equals($VAR)` instead? >>>>> >>>> I think we also need StringUtils in the similarity package. We should >>>> handle this after we have merged the changes made to the names package. >>>> >>>> >>>>> Thanks again for reviewing the code! >>>>> >>>> No problem, thanks for pushing this forward. >>>> >>>> Here are some more things we need to take care of before 1.0: >>>> - JIRA project >>>> >>> Requested a JIRA project: >>> https://issues.apache.org/jira/servicedesk/customer/portal/1/INFRA-9477 >>> >> Here is the JIRA project [1]. I'll find out, how we can move the Commons >> Text issues from the Sandbox project to the new project. >> > I've migrated all issues from the Sandbox project to the new Commons Text > project. I've also updated the project documentation.
Did I miss the promotion VOTE? Phil > > >> Benedikt >> >> [1] https://issues.apache.org/jira/browse/TEXT >> >> >>> >>>> - Rename similarity package to distance package? We have five Distance >>>> implementations but only one Similarity. Furthermore we have FuzzyScore. Is >>>> it a distance or a similarity? It should be renamed accordingly. >>>> - ConsineSimilariy uses Vectors modeled as Map<CharSequence, Integer>. >>>> Does it make sense to introduce a new class called Vector? >>>> >>>> Benedikt >>>> >>>> >>>>> All the best,Bruno >>>>> >>>>> >>>>> From: Benedikt Ritter <brit...@apache.org> >>>>> To: Commons Developers List <dev@commons.apache.org> >>>>> Sent: Sunday, April 19, 2015 9:01 PM >>>>> Subject: [TEXT] Review of the code base >>>>> >>>>> Hi all, >>>>> >>>>> I've looked through the code base of [text] and I already did some clean >>>>> up. However there are a few things, that need more work IMHO: >>>>> >>>>> 1. IP clearance! There are several comments talking about code adapted >>>>> from >>>>> PHP libraries. This needs to be listed in NOTICE.txt. For an example see >>>>> the NOTICE.txt in [lang]. >>>>> >>>>> 2. The names package needs work. Currently the HumanNameParser takes a >>>>> Name >>>>> object (which is just a wrapper around a String) as parameter. The parse >>>>> result is then saved in fields (e.g. firstName). This makes the parser >>>>> unusable after it has parsed a name. I think the API should be reworked >>>>> such as: >>>>> - The constructor of the parser takes configuration options which can be >>>>> reused for several names to parse >>>>> - the parse method takes a string as parameter, containing a name >>>>> - the parse method returns an immutable Name objects which has getters >>>>> for >>>>> firstName, lastName etc. >>>>> >>>>> 3. There are a some useless JavaDocs, which can be dropped. For example: >>>>> /** >>>>> * Middle name. >>>>> */ >>>>> private String middle; >>>>> >>>>> Better rename the variable to middleName and drop the comment. >>>>> >>>>> 4. Several classes have no tests. The overall test coverage looks good, >>>>> so >>>>> this may not be a problem. >>>>> >>>>> 5. I don't think the shade configuration is correct. To me it looks like >>>>> commons-text depends on [lang] and also shades some classes. Do we >>>>> really >>>>> need the explicit dependency? Furthermore, why do we need to shade so >>>>> many >>>>> classes? Why do we need anything beside StringUtils? >>>>> >>>>> Nevertheless, this library looks very good. Kudos to Bruno for pushing >>>>> this >>>>> forward! >>>>> >>>>> Regards, >>>>> Benedikt >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> http://people.apache.org/~britter/ >>>>> http://www.systemoutprintln.de/ >>>>> http://twitter.com/BenediktRitter >>>>> http://github.com/britter >>>>> >>>>> >>>>> >>>>> >>>>> >>>> >>>> >>>> -- >>>> http://people.apache.org/~britter/ >>>> http://www.systemoutprintln.de/ >>>> http://twitter.com/BenediktRitter >>>> http://github.com/britter >>>> >>> >>> >>> -- >>> http://people.apache.org/~britter/ >>> http://www.systemoutprintln.de/ >>> http://twitter.com/BenediktRitter >>> http://github.com/britter >>> >> >> >> -- >> http://people.apache.org/~britter/ >> http://www.systemoutprintln.de/ >> http://twitter.com/BenediktRitter >> http://github.com/britter >> > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org