Re: [Python-ideas] incremental hashing in __hash__

Matt Gilson Thu, 05 Jan 2017 09:32:49 -0800

I agree with Paul -- I'm not convinced that this is common enough or that
the benefits are big enough to warrant something builtin.  However, I did
decide to dust off some of my old skills and I threw together a simple gist
to see how hard it would be to create something using Cython based on the
CPython tuple hash algorithm.  I don't know how well it works for arbitrary
iterables without a `__length_hint__`, but it seems to work as intended for
iterables that have the length hint.


https://gist.github.com/mgilson/129859a79487a483163980db25b709bf
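
For readers who would rather not open the gist, here is a rough pure-Python
sketch of the general idea: hash an iterable one item at a time, in the style
of the CPython tuple hash of that era, without building a tuple first. This is
not the code from the gist. The name `hash_iterable` is hypothetical, and the
constants are assumptions modelled on that tuple hash. One deliberate
difference: the tuple hash mixes in the number of items remaining, which
requires knowing the length up front, so this sketch mixes in a running index
instead. As a result it makes no attempt to return the same value as
`hash(tuple(iterable))`.

```python
import sys

# Hash values are fixed-width; emulate C unsigned wraparound with a mask.
_WIDTH = sys.hash_info.width
_MASK = (1 << _WIDTH) - 1


def hash_iterable(iterable):
    """Order-sensitive hash of an iterable, consumed one item at a time.

    Hypothetical helper: constants are assumptions modelled on the old
    CPython tuple hash, and the result is NOT equal to hash(tuple(iterable)).
    """
    x = 0x345678
    mult = 1000003
    for index, item in enumerate(iterable):
        y = hash(item) & _MASK          # treat the item hash as unsigned
        x = ((x ^ y) * mult) & _MASK    # mix the item into the running state
        # Vary the multiplier per position; uses the index, not the length.
        mult = (mult + 82520 + 2 * index) & _MASK
    x = (x + 97531) & _MASK
    # Convert back to a signed value in the range hash() accepts.
    if x >= 1 << (_WIDTH - 1):
        x -= 1 << _WIDTH
    # -1 is reserved as an error marker in CPython, so avoid returning it.
    return -2 if x == -1 else x
```

A class could then write `return hash_iterable(self._items)` inside its own
`__hash__` (with `_items` being whatever iterable it wraps), avoiding the
temporary tuple. Whether that is actually faster than `hash(tuple(...))` in
pure Python is a separate question; the point of the Cython version is to make
the loop itself cheap.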

If you're interested, or want to pick this up and actually do something
with it, feel free... Also, I haven't written anything using Cython for
ages, so if this could be further optimized, feel free to let me know.

On Thu, Jan 5, 2017 at 7:58 AM, Paul Moore <[email protected]> wrote:

> On 5 January 2017 at 13:28, Neil Girdhar <[email protected]> wrote:
> > The point is that the OP doesn't want to write his own hash function, but
> > wants Python to provide a standard way of hashing an iterable.  Today, the
> > standard way is to convert to tuple and call hash on that.  That may not be
> > efficient. FWIW from a style perspective, I agree with OP.
>
> The debate here regarding tuple/frozenset indicates that there may not
> be a "standard way" of hashing an iterable (should order matter?).
> Although I agree that assuming order matters is a reasonable
> assumption to make in the absence of any better information.
>
> Hashing is low enough level that providing helpers in the stdlib is
> not unreasonable. It's not obvious (to me, at least) that it's a
> common enough need to warrant it, though. Do we have any information
> on how often people implement their own __hash__, or how often
> hash(tuple(my_iterable)) would be an acceptable hash, except for the
> cost of creating the tuple? The OP's request is the only time this has
> come up as a requirement, to my knowledge. Hence my suggestion to copy
> the tuple implementation, modify it to work with general iterables,
> and publish it as a 3rd party module - its usage might give us an idea
> of how often this need arises. (The other option would be for someone
> to do some analysis of published code).
>
> Assuming it is a sufficiently useful primitive to add, then we can
> debate naming. But I'd prefer it to be named in such a way that it
> makes it clear that it's a low-level helper for people writing their
> own __hash__ function, and not some sort of variant of hashing (which
> hash.from_iterable implies to me).
>
> Paul
> _______________________________________________
> Python-ideas mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 

Matt Gilson // SOFTWARE ENGINEER

E: [email protected] // P: 603.892.7736
