Hello,

I am planning to work on ticket TEXT-2 and need your guidance on naming and placing the class files for the implementation. The ticket asks for the Jaccard index (which measures similarity) and the Jaccard distance (which measures dissimilarity). Below is what I am planning to do.

Option 1:
- Add a new class JaccardBase under the package org.apache.commons.text. It will hold the logic to calculate both the index and the distance. Since the Jaccard distance is 1 - Jaccard index, neither value needs separate logic, so I plan to keep the calculation in one common place.
- Add a new class JaccardIndex under the package org.apache.commons.text.similarity. It will extend JaccardBase and expose a public method that returns the Jaccard index.
- Similarly, add a new class JaccardDistance under the package org.apache.commons.text.diff. It will extend JaccardBase and expose a public method that returns the Jaccard distance.

The advantage is that there is no code duplication. The disadvantage is that a caller who wants both the index and the distance needs to call two separate methods (one on JaccardIndex and one on JaccardDistance), so the calculation is done twice for the same input.
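To make the three-class layout described above concrete, here is a rough sketch. It rests on several assumptions: the inputs are compared as sets of characters, two empty inputs count as identical, and the method names calculateIndex, calculateDistance and apply are placeholders that are not taken from the ticket. The package declarations are omitted so that everything fits in one file.

```java
import java.util.HashSet;
import java.util.Set;

// Shared calculation logic; in the proposal this class would live in org.apache.commons.text.
abstract class JaccardBase {

    // Hypothetical helper: |A ∩ B| / |A ∪ B| over the sets of characters in each input.
    protected double calculateIndex(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("Inputs must not be null");
        }
        Set<Character> a = toCharSet(left);
        Set<Character> b = toCharSet(right);
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // assumption: two empty inputs are treated as identical
        }
        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    // The distance is derived from the index, so there is no second algorithm to maintain.
    protected double calculateDistance(CharSequence left, CharSequence right) {
        return 1.0 - calculateIndex(left, right);
    }

    private static Set<Character> toCharSet(CharSequence cs) {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < cs.length(); i++) {
            set.add(cs.charAt(i));
        }
        return set;
    }
}

// Would live in org.apache.commons.text.similarity in the proposal.
class JaccardIndex extends JaccardBase {
    public double apply(CharSequence left, CharSequence right) {
        return calculateIndex(left, right);
    }
}

// Would live in org.apache.commons.text.diff in the proposal.
class JaccardDistance extends JaccardBase {
    public double apply(CharSequence left, CharSequence right) {
        return calculateDistance(left, right);
    }
}
```

With this sketch, new JaccardIndex().apply("abc", "abd") returns 0.5 (two shared characters out of four distinct ones), and new JaccardDistance().apply("abc", "abd") also returns 0.5. A caller who wants both values calls both methods, so the sets are built twice.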
Another option (option 2) is to have a single class that returns both the index and the distance. With this option, I have two questions:
1. Where should the new class go (under which package)?
2. What should the new class be named?

The disadvantage of option 1 is fixed here.

I personally prefer option 1, as it looks cleaner given the way the classes are arranged in the packages. Could you kindly review and share your thoughts? Do let me know if anything is unclear.

Thank you,
Regards,
Don Jeba