> On 2 Mar 2019, at 16:59, Mark Dacek <m...@syberion.com> wrote:
> 
> Is your proposed method a stepwise charAt comparison across both, assuming
> non-null and equal length?

Yes, although StringUtils.equals(CharSequence, CharSequence) from [lang]
will already do the job correctly (thanks Gary). It currently does all the
edge-case checks and then calls a region-matching method over the entire
length, but the effect is the same as:

for (int i = 0; i < cs1.length(); i++) {
    if (cs1.charAt(i) != cs2.charAt(i)) {
        return false;
    }
}
return true;

Substituting the above loop for the call to regionMatches(…) at the end of
StringUtils.equals(CharSequence, CharSequence) would avoid repeating the
length edge-case checks that regionMatches performs at its start, and would
skip its case-insensitivity handling.

The StringUtils.equals method already detects when both arguments are Strings
and uses String.equals in that case. So this is basically for any other
combination of CharSequence types where a simple stepwise charAt comparison is
wanted.
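
Putting the pieces together, a minimal sketch of such a method might look like
the following. This is an illustration only, not the actual [lang] or [text]
source: the class name CharSequenceEquality and the method name contentEquals
are made up for this example, and the null handling (two nulls are equal, one
null is not) is an assumption about the desired edge-case behaviour.

```java
// Hypothetical helper for illustration; not the [lang] or [text] implementation.
final class CharSequenceEquality {

    private CharSequenceEquality() {
        // static utility, no instances
    }

    /** Compares two CharSequences by content, whatever their concrete types. */
    static boolean contentEquals(final CharSequence cs1, final CharSequence cs2) {
        if (cs1 == cs2) {
            return true; // same instance, or both null (assumed to be equal)
        }
        if (cs1 == null || cs2 == null) {
            return false; // exactly one is null
        }
        final int length = cs1.length();
        if (length != cs2.length()) {
            return false; // different lengths can never match
        }
        if (cs1 instanceof String && cs2 instanceof String) {
            return cs1.equals(cs2); // String.equals compares content
        }
        // Stepwise charAt comparison; stops at the first mismatch.
        for (int i = 0; i < length; i++) {
            if (cs1.charAt(i) != cs2.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}
```

The length check up front is what makes the plain loop safe, and the early
return on the first mismatch means no characters are copied anywhere.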


> Doesn't seem like a bad idea, though I'm curious whether there's a use-case
> where toString() on both and comparing isn't more expedient.

Just the memory overhead of copying the characters to create a String. If a
mismatch is likely, especially near the start, then that copy is a cost to
consider for longer sequences.

I was just after something to put in place of the incorrect usage of:

CharSequence cs1, cs2;
cs1.equals(cs2);

The CharSequence interface does not refine the contract of equals, so this
works only if both objects are of a type whose equals compares content, such
as two Strings. StringBuilder, for one, inherits identity equals from Object.
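
To make the pitfall concrete, here is a small sketch of how equals behaves for
a few common CharSequence pairings, next to the toString() alternative. The
class and method names (EqualsPitfallDemo, viaEquals, viaToString) are
hypothetical and exist only for this illustration.

```java
// Hypothetical demo class; the names are made up for illustration.
final class EqualsPitfallDemo {

    private EqualsPitfallDemo() {
        // static utility, no instances
    }

    /** The problematic pattern: relies on the concrete type's equals. */
    static boolean viaEquals(final CharSequence a, final CharSequence b) {
        return a.equals(b);
    }

    /** Correct for content, but copies characters for non-String inputs. */
    static boolean viaToString(final CharSequence a, final CharSequence b) {
        return a.toString().equals(b.toString());
    }

    public static void main(final String[] args) {
        final CharSequence s = "abc";
        final CharSequence sb1 = new StringBuilder("abc");
        final CharSequence sb2 = new StringBuilder("abc");

        System.out.println(viaEquals(s, "abc"));   // true: String vs String
        System.out.println(viaEquals(s, sb1));     // false: String vs StringBuilder
        System.out.println(viaEquals(sb1, sb2));   // false: StringBuilder uses identity
        System.out.println(viaToString(s, sb1));   // true
        System.out.println(viaToString(sb1, sb2)); // true
    }
}
```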

I’ve just had a look for .equals() in all of [text] and this is actually a bug
that is in the newly submitted JaroWinklerSimilarity too.

I’ll do a PR to fix that one.

> 
> On Sat, Mar 2, 2019 at 11:53 AM Alex Herbert <alex.d.herb...@gmail.com>
> wrote:
> 
>> I am helping with the PR for TEXT-126 to add to the similarity package.
>> 
>> Part of the new algorithm requires identifying if two CharSequences are
>> identical. Is there a utility in Text to do something like this:
>> 
>> public static boolean CharSequenceUtils.equals(CharSequence, CharSequence);
>> 
>> I cannot find one with a quick regex search of the library. I am not
>> familiar with Lang either but this is a dependency so a method from there
>> could be used.
>> 
>> The current PR is using left.equals(right) on the input CharSequence to
>> compare one to another, which is wrong if the two input CharSequences do
>> not support matching, e.g. if the input was a String and StringBuilder then
>> String.equals(StringBuilder) would not match, even if the characters were
>> the same.
>> 
>> Regards,
>> 
>> Alex
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
>> For additional commands, e-mail: dev-h...@commons.apache.org
>> 
>> 
