On Tue, Mar 17, 2020 at 4:38 PM Alex Herbert <alex.d.herb...@gmail.com>
wrote:

>
>
> > On 17 Mar 2020, at 15:41, Claude Warren <cla...@xenei.com> wrote:
> >
> > I agree with the HashFunction changes.
>
> OK, but which ones?
>

DOH! this one...

>
> Changing HashFunction to have two methods:
>
> long hash(byte[])
> long increment(int seed)
>
> > I think Builder should have
> > with(byte[])
> > with(byte[], int offset, int len )
>
> Not convinced here. The HashFunction requires a byte[] and cannot operate
> on a range. This change should be made in conjunction with a similar change
> to HashFunction. So should we update HashFunction to:
>
>
Given the depth of the change, let's just leave the with(byte[]).


> > with(String)
> >
> > I find that I use with(String) more than any other with() method.
>
> That may be so but String.getBytes(Charset) is trivial to call for the
> user. Then they get to decide on the encoding and not leave it to the
> Hasher. I would use UTF-16 because it would be fast. But UTF-8 is nice as a
> cross-language standard. Leave it out of the API for now, or add both:
>
> Builder with(CharSequence, Charset);
> Builder withUnencoded(CharSequence);
>

CharSequence has no easy method to convert to a byte[]. While it could be
done, it looks to be more of a streaming interface.  Let's leave that out.
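
For reference, a rough sketch of what the conversion could look like if we
ever wanted it. The class and method names here are made up for
illustration and are not part of the proposed API. It assumes
Charset.encode(CharBuffer) is acceptable, which silently replaces
unmappable characters, and it treats "unencoded" as two bytes per char,
high byte first.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

// Hypothetical helper, not part of the Hasher.Builder API under discussion.
final class CharSequenceBytes {
    private CharSequenceBytes() {}

    /** Encodes the sequence with the given charset without first building a String. */
    static byte[] encode(CharSequence cs, Charset charset) {
        ByteBuffer buf = charset.encode(CharBuffer.wrap(cs));
        byte[] out = new byte[buf.remaining()];
        buf.get(out);
        return out;
    }

    /** Writes each char as two bytes, high byte first, with no charset involved. */
    static byte[] unencoded(CharSequence cs) {
        byte[] out = new byte[cs.length() * 2];
        for (int i = 0; i < cs.length(); i++) {
            char c = cs.charAt(i);
            out[2 * i] = (byte) (c >>> 8);
            out[2 * i + 1] = (byte) c;
        }
        return out;
    }
}
```

Either way the whole sequence ends up copied into a byte[] before it can
be hashed, which is the extra work I would rather not put in the Builder.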


> I would argue that you may use BloomFilters for Strings but if we see a
> BloomFilter as a collection then we should really support all Objects (with
> a decorator or by typing the Builder) or not support Objects. Currently we
> are not supporting any Object so for now would drop this and the
> Hasher.Builder then becomes a very simple API that specifies that you put
> in items represented as a byte[] and call build to create a Hasher
> containing those items and reset for further use.
>

I have code examples in several places where I hash GeoCode entities.  Since
they are composed mostly of strings, building a hasher for them simply
requires hashing the Strings.  Many web services use JSON, and most JSON is
string based.  I disagree with removing with(String) because it is so
convenient in so many cases.  It also makes the code cleaner and easier to
read.  But if you feel strongly about removing it, then OK.
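
To illustrate the convenience argument, here is a minimal sketch of
with(String) as a thin wrapper over with(byte[]). ItemBuilder and
RecordingBuilder are hypothetical stand-ins, not the real Hasher.Builder,
and fixing the encoding to UTF-8 is an assumption made only for the
example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for Hasher.Builder; not the real interface. */
interface ItemBuilder {
    ItemBuilder with(byte[] item);

    /** Convenience overload: fixes the encoding to UTF-8 (an assumption) and delegates. */
    default ItemBuilder with(String item) {
        return with(item.getBytes(StandardCharsets.UTF_8));
    }
}

/** Toy implementation that only records the items it is given. */
final class RecordingBuilder implements ItemBuilder {
    final List<byte[]> items = new ArrayList<>();

    @Override
    public ItemBuilder with(byte[] item) {
        items.add(item.clone()); // defensive copy so callers can reuse their array
        return this;
    }
}
```

With that in place a caller writes builder.with(name).with(country)
instead of calling getBytes(...) on every field.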

The only other thing that is really bothersome is the lack of
Shape.getNumberOfBytes().  Yes, it is easy to call Math.ceil(
Shape.getNumberOfBits() / 8.0 ).  But getNumberOfBytes() is much more
readable in the code.
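
Something along these lines is all I am asking for. ShapeBytes is a
made-up holder class for the sketch; in practice the method would live on
Shape and read its own bit count. The second variant is an integer-only
alternative and assumes the bit count is never negative.

```java
// Hypothetical helper; the real method would be Shape.getNumberOfBytes().
final class ShapeBytes {
    private ShapeBytes() {}

    /** Bytes needed to hold numberOfBits bits, rounding up. */
    static int numberOfBytes(int numberOfBits) {
        return (int) Math.ceil(numberOfBits / 8.0);
    }

    /** Same result using integer arithmetic only; assumes numberOfBits >= 0. */
    static int numberOfBytesIntegerOnly(int numberOfBits) {
        // Whole bytes, plus one more if any bits are left over.
        return (numberOfBits >>> 3) + ((numberOfBits & 7) == 0 ? 0 : 1);
    }
}
```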

Claude
