On 4/26/15 3:14 AM, Benedikt Ritter wrote:
> 2015-04-26 11:52 GMT+02:00 Benedikt Ritter <brit...@apache.org>:
>
>>
>> 2015-04-19 17:29 GMT+02:00 Benedikt Ritter <brit...@apache.org>:
>>
>>>
>>> 2015-04-19 17:24 GMT+02:00 Benedikt Ritter <brit...@apache.org>:
>>>
>>>> Hello Bruno,
>>>>
>>>> 2015-04-19 12:15 GMT+02:00 Bruno P. Kinoshita <
>>>> brunodepau...@yahoo.com.br>:
>>>>
>>>>> Thanks for the thorough code review Benedikt! Comments inline.
>>>>>
>>>>>> 1. IP clearance! There are several comments talking about code
>>>>> adapted from
>>>>>> PHP libraries. This needs to be listed in NOTICE.txt. For an example
>>>>> see
>>>>>> the NOTICE.txt in [lang].
>>>>> Filed https://issues.apache.org/jira/browse/SANDBOX-497
>>>>> And first try to fix it:
>>>>>
>>>>>
>>>>> https://git1-us-west.apache.org/repos/asf?p=commons-text.git;a=commitdiff;h=refs/heads/SANDBOX-497
>>>>> Does that look correct? I also checked with the author before porting
>>>>> it to the initial Java version, and told him about the [text] component -
>>>>> https://github.com/jasonpriem/HumanNameParser.php/issues/11
>>>>
>>>> I've adjusted the changes a bit and merged it into master.
>>>>
>>>>
>>>>>> 2. The names package needs work.
>>>>> Completely agree. We can try to make it thread safe too. This will take
>>>>> a little longer.
>>>>> File new issue for tracking it
>>>>> https://issues.apache.org/jira/browse/SANDBOX-498
>>>>
>>>> I've pushed a new branch containing a proposal how to fix this. Please
>>>> review.
>>>>
>>>>
>>>>>
>>>>>> 3. There are a some useless JavaDocs, which can be dropped. For
>>>>> example:
>>>>>> (...)> Better rename the variable to middleName and drop the comment.
>>>>> What about renaming the variables but leaving the comments?
>>>>>
>>>> For I've corrected this for the names package in the SANDBOX-498 branch.
>>>>
>>>>
>>>>>> 4. Several classes have no tests. The overall test coverage looks
>>>>> good, so
>>>>>> this may not be a problem.
>>>>> Before the initial release I can start a development cycle just to add
>>>>> new tests for parts that seem important - like code with somewhat high 
>>>>> [+5]
>>>>> cyclomatic complexity and uncovered branches. Just to make sure that the
>>>>> initial release will have an acceptable quality and less hidden bugs :-)
>>>>>
>>>> Nice!
>>>>
>>>>
>>>>>> 5. I don't think the shade configuration is correct. To me it looks
>>>>> like
>>>>>> commons-text depends on [lang] and also shades some classes. Do we
>>>>> really
>>>>>> need the explicit dependency? Furthermore, why do we need to shade so
>>>>> many
>>>>>> classes? Why do we need anything beside StringUtils?
>>>>> My mistake. I copied the [functor] configuration without much thinking.
>>>>> IIRC, this dependency was added because of the names package. But it
>>>>> shouldn't be too complicate to replace it. It is mainly used for checking
>>>>> empty/blank values.
>>>>>
>>>>> So let's drop the [lang] dependency and use something like `$VAR !=
>>>>> null && ! "".equals($VAR)` instead?
>>>>>
>>>> I think we also need StringUtils in the similarity package. We should
>>>> handle this after we have merged the changes made to the names package.
>>>>
>>>>
>>>>> Thanks again for reviewing the code!
>>>>>
>>>> No problem, thanks for pushing this forward.
>>>>
>>>> Here are some more things we need to take care of before 1.0:
>>>> - JIRA project
>>>>
>>> Requested a JIRA project:
>>> https://issues.apache.org/jira/servicedesk/customer/portal/1/INFRA-9477
>>>
>> Here is the JIRA project [1]. I'll find out, how we can move the Commons
>> Text issues from the Sandbox project to the new project.
>>
> I've migrated all issues from the Sandbox project to the new Commons Text
> project. I've also updated the project documentation.

Did I miss the promotion VOTE?

Phil
>
>
>> Benedikt
>>
>> [1] https://issues.apache.org/jira/browse/TEXT
>>
>>
>>>
>>>> - Rename similarity package to distance package? We have five Distance
>>>> implementations but only one Similarity. Furthermore we have FuzzyScore. Is
>>>> it a distance or a similarity? It should be renamed accordingly.
>>>> - ConsineSimilariy uses Vectors modeled as Map<CharSequence, Integer>.
>>>> Does it make sense to introduce a new class called Vector?
>>>>
>>>> Benedikt
>>>>
>>>>
>>>>> All the best,Bruno
>>>>>
>>>>>
>>>>>       From: Benedikt Ritter <brit...@apache.org>
>>>>>  To: Commons Developers List <dev@commons.apache.org>
>>>>>  Sent: Sunday, April 19, 2015 9:01 PM
>>>>>  Subject: [TEXT] Review of the code base
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I've looked through the code base of [text] and I already did some clean
>>>>> up. However there are a few things, that need more work IMHO:
>>>>>
>>>>> 1. IP clearance! There are several comments talking about code adapted
>>>>> from
>>>>> PHP libraries. This needs to be listed in NOTICE.txt. For an example see
>>>>> the NOTICE.txt in [lang].
>>>>>
>>>>> 2. The names package needs work. Currently the HumanNameParser takes a
>>>>> Name
>>>>> object (which is just a wrapper around a String) as parameter. The parse
>>>>> result is then saved in fields (e.g. firstName). This makes the parser
>>>>> unusable after it has parsed a name. I think the API should be reworked
>>>>> such as:
>>>>> - The constructor of the parser takes configuration options which can be
>>>>> reused for several names to parse
>>>>> - the parse method takes a string as parameter, containing a name
>>>>> - the parse method returns an immutable Name objects which has getters
>>>>> for
>>>>> firstName, lastName etc.
>>>>>
>>>>> 3. There are a some useless JavaDocs, which can be dropped. For example:
>>>>> /**
>>>>>  * Middle name.
>>>>>  */
>>>>> private String middle;
>>>>>
>>>>> Better rename the variable to middleName and drop the comment.
>>>>>
>>>>> 4. Several classes have no tests. The overall test coverage looks good,
>>>>> so
>>>>> this may not be a problem.
>>>>>
>>>>> 5. I don't think the shade configuration is correct. To me it looks like
>>>>> commons-text depends on [lang] and also shades some classes. Do we
>>>>> really
>>>>> need the explicit dependency? Furthermore, why do we need to shade so
>>>>> many
>>>>> classes? Why do we need anything beside StringUtils?
>>>>>
>>>>> Nevertheless, this library looks very good. Kudos to Bruno for pushing
>>>>> this
>>>>> forward!
>>>>>
>>>>> Regards,
>>>>> Benedikt
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> http://people.apache.org/~britter/
>>>>> http://www.systemoutprintln.de/
>>>>> http://twitter.com/BenediktRitter
>>>>> http://github.com/britter
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> http://people.apache.org/~britter/
>>>> http://www.systemoutprintln.de/
>>>> http://twitter.com/BenediktRitter
>>>> http://github.com/britter
>>>>
>>>
>>>
>>> --
>>> http://people.apache.org/~britter/
>>> http://www.systemoutprintln.de/
>>> http://twitter.com/BenediktRitter
>>> http://github.com/britter
>>>
>>
>>
>> --
>> http://people.apache.org/~britter/
>> http://www.systemoutprintln.de/
>> http://twitter.com/BenediktRitter
>> http://github.com/britter
>>
>
>


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

Reply via email to