2015-04-26 11:52 GMT+02:00 Benedikt Ritter <brit...@apache.org>:

>
>
> 2015-04-19 17:29 GMT+02:00 Benedikt Ritter <brit...@apache.org>:
>
>>
>>
>> 2015-04-19 17:24 GMT+02:00 Benedikt Ritter <brit...@apache.org>:
>>
>>> Hello Bruno,
>>>
>>> 2015-04-19 12:15 GMT+02:00 Bruno P. Kinoshita <
>>> brunodepau...@yahoo.com.br>:
>>>
>>>> Thanks for the thorough code review Benedikt! Comments inline.
>>>>
>>>> > 1. IP clearance! There are several comments talking about code
>>>> adapted from
>>>> >PHP libraries. This needs to be listed in NOTICE.txt. For an example
>>>> see
>>>> >the NOTICE.txt in [lang].
>>>> Filed https://issues.apache.org/jira/browse/SANDBOX-497
>>>> And first try to fix it:
>>>>
>>>>
>>>> https://git1-us-west.apache.org/repos/asf?p=commons-text.git;a=commitdiff;h=refs/heads/SANDBOX-497
>>>> Does that look correct? I also checked with the author before porting
>>>> it to the initial Java version, and told him about the [text] component -
>>>> https://github.com/jasonpriem/HumanNameParser.php/issues/11
>>>
>>>
>>> I've adjusted the changes a bit and merged it into master.
>>>
>>>
>>>>
>>>> > 2. The names package needs work.
>>>> Completely agree. We can try to make it thread safe too. This will take
>>>> a little longer.
>>>> File new issue for tracking it
>>>> https://issues.apache.org/jira/browse/SANDBOX-498
>>>
>>>
>>> I've pushed a new branch containing a proposal how to fix this. Please
>>> review.
>>>
>>>
>>>>
>>>>
>>>> > 3. There are a some useless JavaDocs, which can be dropped. For
>>>> example:
>>>> > (...)> Better rename the variable to middleName and drop the comment.
>>>> What about renaming the variables but leaving the comments?
>>>>
>>>
>>> For I've corrected this for the names package in the SANDBOX-498 branch.
>>>
>>>
>>>>
>>>> > 4. Several classes have no tests. The overall test coverage looks
>>>> good, so
>>>> >this may not be a problem.
>>>> Before the initial release I can start a development cycle just to add
>>>> new tests for parts that seem important - like code with somewhat high [+5]
>>>> cyclomatic complexity and uncovered branches. Just to make sure that the
>>>> initial release will have an acceptable quality and less hidden bugs :-)
>>>>
>>>
>>> Nice!
>>>
>>>
>>>> > 5. I don't think the shade configuration is correct. To me it looks
>>>> like
>>>> >commons-text depends on [lang] and also shades some classes. Do we
>>>> really
>>>> >need the explicit dependency? Furthermore, why do we need to shade so
>>>> many
>>>> >classes? Why do we need anything beside StringUtils?
>>>> My mistake. I copied the [functor] configuration without much thinking.
>>>> IIRC, this dependency was added because of the names package. But it
>>>> shouldn't be too complicate to replace it. It is mainly used for checking
>>>> empty/blank values.
>>>>
>>>> So let's drop the [lang] dependency and use something like `$VAR !=
>>>> null && ! "".equals($VAR)` instead?
>>>>
>>>
>>> I think we also need StringUtils in the similarity package. We should
>>> handle this after we have merged the changes made to the names package.
>>>
>>>
>>>> Thanks again for reviewing the code!
>>>>
>>>
>>> No problem, thanks for pushing this forward.
>>>
>>> Here are some more things we need to take care of before 1.0:
>>> - JIRA project
>>>
>>
>> Requested a JIRA project:
>> https://issues.apache.org/jira/servicedesk/customer/portal/1/INFRA-9477
>>
>
> Here is the JIRA project [1]. I'll find out, how we can move the Commons
> Text issues from the Sandbox project to the new project.
>

I've migrated all issues from the Sandbox project to the new Commons Text
project. I've also updated the project documentation.


>
> Benedikt
>
> [1] https://issues.apache.org/jira/browse/TEXT
>
>
>>
>>
>>> - Rename similarity package to distance package? We have five Distance
>>> implementations but only one Similarity. Furthermore we have FuzzyScore. Is
>>> it a distance or a similarity? It should be renamed accordingly.
>>> - ConsineSimilariy uses Vectors modeled as Map<CharSequence, Integer>.
>>> Does it make sense to introduce a new class called Vector?
>>>
>>> Benedikt
>>>
>>>
>>>>
>>>> All the best,Bruno
>>>>
>>>>
>>>>       From: Benedikt Ritter <brit...@apache.org>
>>>>  To: Commons Developers List <dev@commons.apache.org>
>>>>  Sent: Sunday, April 19, 2015 9:01 PM
>>>>  Subject: [TEXT] Review of the code base
>>>>
>>>> Hi all,
>>>>
>>>> I've looked through the code base of [text] and I already did some clean
>>>> up. However there are a few things, that need more work IMHO:
>>>>
>>>> 1. IP clearance! There are several comments talking about code adapted
>>>> from
>>>> PHP libraries. This needs to be listed in NOTICE.txt. For an example see
>>>> the NOTICE.txt in [lang].
>>>>
>>>> 2. The names package needs work. Currently the HumanNameParser takes a
>>>> Name
>>>> object (which is just a wrapper around a String) as parameter. The parse
>>>> result is then saved in fields (e.g. firstName). This makes the parser
>>>> unusable after it has parsed a name. I think the API should be reworked
>>>> such as:
>>>> - The constructor of the parser takes configuration options which can be
>>>> reused for several names to parse
>>>> - the parse method takes a string as parameter, containing a name
>>>> - the parse method returns an immutable Name objects which has getters
>>>> for
>>>> firstName, lastName etc.
>>>>
>>>> 3. There are a some useless JavaDocs, which can be dropped. For example:
>>>> /**
>>>>  * Middle name.
>>>>  */
>>>> private String middle;
>>>>
>>>> Better rename the variable to middleName and drop the comment.
>>>>
>>>> 4. Several classes have no tests. The overall test coverage looks good,
>>>> so
>>>> this may not be a problem.
>>>>
>>>> 5. I don't think the shade configuration is correct. To me it looks like
>>>> commons-text depends on [lang] and also shades some classes. Do we
>>>> really
>>>> need the explicit dependency? Furthermore, why do we need to shade so
>>>> many
>>>> classes? Why do we need anything beside StringUtils?
>>>>
>>>> Nevertheless, this library looks very good. Kudos to Bruno for pushing
>>>> this
>>>> forward!
>>>>
>>>> Regards,
>>>> Benedikt
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> http://people.apache.org/~britter/
>>>> http://www.systemoutprintln.de/
>>>> http://twitter.com/BenediktRitter
>>>> http://github.com/britter
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> http://people.apache.org/~britter/
>>> http://www.systemoutprintln.de/
>>> http://twitter.com/BenediktRitter
>>> http://github.com/britter
>>>
>>
>>
>>
>> --
>> http://people.apache.org/~britter/
>> http://www.systemoutprintln.de/
>> http://twitter.com/BenediktRitter
>> http://github.com/britter
>>
>
>
>
> --
> http://people.apache.org/~britter/
> http://www.systemoutprintln.de/
> http://twitter.com/BenediktRitter
> http://github.com/britter
>



-- 
http://people.apache.org/~britter/
http://www.systemoutprintln.de/
http://twitter.com/BenediktRitter
http://github.com/britter

Reply via email to