Re: finding items that occur more than once in a list

2008-03-22 Thread Bryan Olson
Arnaud Delobelle wrote: > Bryan Olson wrote: >> We mean that the party supplying the keys deliberately chose >> them to make the hash table inefficient. In this thread the goal >> is efficiency; a party working toward an opposing goal is an >> adversary. > > There are situations where this can hap

Re: finding items that occur more than once in a list

2008-03-22 Thread castironpi
On Mar 22, 3:31 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > John Machin wrote: > >> On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: > >>> A collision sequence is not so rare. > >> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] > >>> [1, 1, 1, 1, 1, 1, 1, 1] > >> Bryan

Re: finding items that occur more than once in a list

2008-03-22 Thread sturlamolden
On 22 Mar, 09:31, Bryan Olson <[EMAIL PROTECTED]> wrote: > Even a hash function that behaves as a random oracle has > worst-case quadratic-time in the algorithm here In which case inserts are not amortized to O(1). -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-22 Thread John Machin
On Mar 22, 9:58 pm, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > > Lastly, if one deals with a totally ordered set of object but they are > not hashable (there isn't a good hash function), then Ninereeds' idea > of sorting first is still useful. ... and if not totally ordered, then ... I'll just

Re: finding items that occur more than once in a list

2008-03-22 Thread Arnaud Delobelle
On Mar 22, 8:31 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > [EMAIL PROTECTED] wrote: > > John Machin wrote: > >> On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: > >>> A collision sequence is not so rare. > >> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] > >>> [1, 1, 1, 1, 1, 1, 1, 1] > >> Bryan

Re: finding items that occur more than once in a list

2008-03-22 Thread Bryan Olson
[EMAIL PROTECTED] wrote: > John Machin wrote: >> On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: >>> A collision sequence is not so rare. >> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] >>> [1, 1, 1, 1, 1, 1, 1, 1] >> Bryan did qualify his remarks: "If we exclude the case where an >> adversary i

Re: finding items that occur more than once in a list

2008-03-22 Thread castironpi
On Mar 21, 3:39 pm, John Machin <[EMAIL PROTECTED]> wrote: > On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: > > A collision sequence is not so rare. > > > >>> [ hash( 2**i ) for i in range( 0, 256, 32 ) ] > > > [1, 1, 1, 1, 1, 1, 1, 1] > > Bryan did qualify his remarks: "If we exclude the case where

Re: finding items that occur more than once in a list

2008-03-21 Thread John Machin
On Mar 22, 1:11 am, [EMAIL PROTECTED] wrote: > On Mar 21, 4:17 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > > > > > Arnaud Delobelle wrote: > > > Bryan Olson wrote: > > [...] > > >> Arnaud Delobelle offered a good Wikipedia link, and for more background > > >> look up "amortized analysis. > > > > H

Re: finding items that occur more than once in a list

2008-03-21 Thread castironpi
On Mar 21, 4:17 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > Arnaud Delobelle wrote: > > Bryan Olson wrote: > [...] > >> Arnaud Delobelle offered a good Wikipedia link, and for more background > >> look up "amortized analysis. > > > Hrvoje Niksic provided the link :). > > Oops, careless thread-foll

Re: finding items that occur more than once in a list

2008-03-21 Thread Bryan Olson
Arnaud Delobelle wrote: > Bryan Olson wrote: [...] >> Arnaud Delobelle offered a good Wikipedia link, and for more background >> look up "amortized analysis. > > Hrvoje Niksic provided the link :). Oops, careless thread-following. Hrvoje Niksic it was. > I still think two unrelated > things are

Re: finding items that occur more than once in a list

2008-03-20 Thread castironpi
On Mar 20, 2:07 am, John Machin <[EMAIL PROTECTED]> wrote: > On Mar 20, 12:50 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> > wrote:> En Wed, 19 Mar 2008 20:16:36 -0300, John Machin <[EMAIL PROTECTED]>   > > escribió: > > > > On Mar 20, 9:14 am, sturlamolden <[EMAIL PROTECTED]> wrote: > > >> Is a Pyt

Re: finding items that occur more than once in a list

2008-03-20 Thread Arnaud Delobelle
On Mar 19, 3:08 am, Bryan Olson <[EMAIL PROTECTED]> wrote: > Arnaud Delobelle wrote: > > Ninereeds wrote: > >> Hrvoje Niksic wrote: > >>> This doesn't apply to Python, which implements dict storage as an > >>> open-addressed table and automatically (and exponentially) grows the > >>> table when the

Re: finding items that occur more than once in a list

2008-03-20 Thread John Machin
On Mar 20, 12:50 pm, "Gabriel Genellina" <[EMAIL PROTECTED]> wrote: > En Wed, 19 Mar 2008 20:16:36 -0300, John Machin <[EMAIL PROTECTED]>   > escribió: > > > On Mar 20, 9:14 am, sturlamolden <[EMAIL PROTECTED]> wrote: > >> Is a Python set implemented using a hash table? > > > What don't you underst

Re: finding items that occur more than once in a list

2008-03-19 Thread Gabriel Genellina
En Wed, 19 Mar 2008 20:16:36 -0300, John Machin <[EMAIL PROTECTED]> escribió: > On Mar 20, 9:14 am, sturlamolden <[EMAIL PROTECTED]> wrote: >> Is a Python set implemented using a hash table? > > What don't you understand about the comments in the first two > screenfuls of Objects/setobject.c? T

Re: finding items that occur more than once in a list

2008-03-19 Thread sturlamolden
On 20 Mar, 00:16, John Machin <[EMAIL PROTECTED]> wrote: > What don't you understand about the comments in the first two > screenfuls of Objects/setobject.c? I had not looked at it, but now I have. Is seems Hettinger is the author :) Ok, so sets are implemented as hash tables. Then I agree, use R

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 20, 9:14 am, sturlamolden <[EMAIL PROTECTED]> wrote: > On 19 Mar, 22:48, John Machin <[EMAIL PROTECTED]> wrote: > > > I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's, > > and is IMHO more readable than Paul's. > > Is a Python set implemented using a hash table? What don't

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 20, 9:57 am, Justin Bozonier <[EMAIL PROTECTED]> wrote: > On Mar 19, 2:48 pm, John Machin <[EMAIL PROTECTED]> wrote: > > > > > On Mar 19, 10:08 am, sturlamolden <[EMAIL PROTECTED]> wrote: > > > > On 18 Mar, 23:45, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > > > > > > def nonunique(lst): >

Re: finding items that occur more than once in a list

2008-03-19 Thread Justin Bozonier
On Mar 19, 2:48 pm, John Machin <[EMAIL PROTECTED]> wrote: > On Mar 19, 10:08 am, sturlamolden <[EMAIL PROTECTED]> wrote: > > > > > On 18 Mar, 23:45, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > > > > > def nonunique(lst): > > > >slst = sorted(lst) > > > >dups = [s[0] for s in > > > >

Re: finding items that occur more than once in a list

2008-03-19 Thread sturlamolden
On 19 Mar, 22:48, John Machin <[EMAIL PROTECTED]> wrote: > I'd use Raymond Hettinger's solution. It is as much O(N) as Paul's, > and is IMHO more readable than Paul's. Is a Python set implemented using a hash table? -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-19 Thread John Machin
On Mar 19, 10:08 am, sturlamolden <[EMAIL PROTECTED]> wrote: > On 18 Mar, 23:45, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > > > > def nonunique(lst): > > >slst = sorted(lst) > > >dups = [s[0] for s in > > > filter(lambda t : t[0] == t[1], zip(slst[:-1],slst[1:]))] > > >return

Re: finding items that occur more than once in a list

2008-03-18 Thread castironpi
On Mar 18, 10:08 pm, Bryan Olson <[EMAIL PROTECTED]> wrote: > Arnaud Delobelle wrote: > > Ninereeds wrote: > >> Hrvoje Niksic wrote: > >>> This doesn't apply to Python, which implements dict storage as an > >>> open-addressed table and automatically (and exponentially) grows the > >>> table when th

Re: finding items that occur more than once in a list

2008-03-18 Thread Bryan Olson
Arnaud Delobelle wrote: > Ninereeds wrote: >> Hrvoje Niksic wrote: >>> This doesn't apply to Python, which implements dict storage as an >>> open-addressed table and automatically (and exponentially) grows the >>> table when the number of entries approaches 2/3 of the table size. >>> Assuming a goo

Re: finding items that occur more than once in a list

2008-03-18 Thread castironpi
On Mar 18, 6:38 pm, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > On Mar 18, 3:59 pm, Ninereeds <[EMAIL PROTECTED]> wrote: > > > Hrvoje Niksic wrote: > > > This doesn't apply to Python, which implements dict storage as an > > > open-addressed table and automatically (and exponentially) grows the >

Re: finding items that occur more than once in a list

2008-03-18 Thread Arnaud Delobelle
On Mar 18, 3:59 pm, Ninereeds <[EMAIL PROTECTED]> wrote: > Hrvoje Niksic wrote: > > This doesn't apply to Python, which implements dict storage as an > > open-addressed table and automatically (and exponentially) grows the > > table when the number of entries approaches 2/3 of the table size. > > A

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 23:45, Arnaud Delobelle <[EMAIL PROTECTED]> wrote: > > def nonunique(lst): > >slst = sorted(lst) > >dups = [s[0] for s in > > filter(lambda t : t[0] == t[1], zip(slst[:-1],slst[1:]))] > >return [dups[0]] + [s[1] for s in > > filter(lambda t : t[0] != t[1], zi

Re: finding items that occur more than once in a list

2008-03-18 Thread Arnaud Delobelle
On Mar 18, 9:56 pm, sturlamolden <[EMAIL PROTECTED]> wrote: > On 18 Mar, 22:25, sturlamolden <[EMAIL PROTECTED]> wrote: > > > def nonunique(lst): > >    slst = sorted(lst) > >    return list(set([s[0] for s in > >        filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) > > Obviously that

Re: finding items that occur more than once in a list

2008-03-18 Thread Simon Forman
I love you guys, thanks! ;-) ~Simon -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-18 Thread Michael Mabin
How about using list comprehension? l1 = ["apples","apples","bananas","oranges","oranges","peaches"] s1 = set([x for x in l1 if l1.count(x) > 1]) On Tue, Mar 18, 2008 at 4:56 PM, sturlamolden <[EMAIL PROTECTED]> wrote: > On 18 Mar, 22:25, sturlamolden <[EMAIL PROTECTED]> wrote: > > > def nonun

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 22:25, sturlamolden <[EMAIL PROTECTED]> wrote: > def nonunique(lst): >slst = sorted(lst) >return list(set([s[0] for s in >filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) Obviously that should be 'lambda t : t[0] == t[1]'. Instead of using the set function, t

Re: finding items that occur more than once in a list

2008-03-18 Thread Paul Rubin
sturlamolden <[EMAIL PROTECTED]> writes: > def nonunique(lst): >slst = sorted(lst) >return list(set([s[0] for s in >filter(lambda t : t[0] != t[1], zip(slst[:-1],slst[1:]))])) The items are all comparable and you're willing to take them out of order? from collections import defaul

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 22:22, sturlamolden <[EMAIL PROTECTED]> wrote: > def nonunique(lst): >slst = sorted(lst) >return list(set([s[0] for s in > filter(lambda t : not(t[0]-t[1]), zip(slst[:-1],slst[1:]))])) Or perhaps better: def nonunique(lst): slst = sorted(lst) return list(set([s[0]

Re: finding items that occur more than once in a list

2008-03-18 Thread sturlamolden
On 18 Mar, 10:57, Simon Forman <[EMAIL PROTECTED]> wrote: > def f(L): > '''Return a set of the items that occur more than once in L.''' > L = list(L) > for item in set(L): > L.remove(item) > return set(L) def nonunique(lst): slst = sorted(lst) return list(set([s[0]

Re: finding items that occur more than once in a list

2008-03-18 Thread Hrvoje Niksic
Ninereeds <[EMAIL PROTECTED]> writes: > As for the growth pattern, each time you grow the table you have to > redistribute all the items previously inserted to new locations. > Resizes would get rarer as more items are added due to the > exponential growth, but every table resize would take longer

Re: finding items that occur more than once in a list

2008-03-18 Thread Raymond Hettinger
On Mar 18, 2:57 am, Simon Forman <[EMAIL PROTECTED]> wrote: > Is there a more efficient way to do this? > > def f(L): > '''Return a set of the items that occur more than once in L.''' > L = list(L) > for item in set(L): > L.remove(item) > return set(L) > > |>> f([0, 0, 1, 1,

Re: finding items that occur more than once in a list

2008-03-18 Thread Ninereeds
Hrvoje Niksic wrote: > This doesn't apply to Python, which implements dict storage as an > open-addressed table and automatically (and exponentially) grows the > table when the number of entries approaches 2/3 of the table size. > Assuming a good hash function, filling the dict should yield amort

Re: finding items that occur more than once in a list

2008-03-18 Thread Hrvoje Niksic
Ninereeds <[EMAIL PROTECTED]> writes: > The dictionary version Chris suggests (and the essentially > equivalent set-based approach) is doing essentially the same thing > in a way, but using hashing rather than ordering to organise the > list and spot duplicates. This is *not* O(n) due to the rate

Re: finding items that occur more than once in a list

2008-03-18 Thread Ninereeds
Just to throw in one more alternative, if you sort your list, you only need to test adjacent items for equality rather than needing a search for each unique item found. You should get O(n log n) rather than O(n^2), since the performance bottleneck is now the sorting rather than the searching for d

Re: finding items that occur more than once in a list

2008-03-18 Thread Bryan Olson
Simon Forman wrote: > Is there a more efficient way to do this? > > def f(L): > '''Return a set of the items that occur more than once in L.''' > L = list(L) > for item in set(L): > L.remove(item) > return set(L) That's neat, but quadratic time because list.remove() requir

Re: finding items that occur more than once in a list

2008-03-18 Thread Paul Rubin
Simon Forman <[EMAIL PROTECTED]> writes: > Is there a more efficient way to do this? http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502263 -- http://mail.python.org/mailman/listinfo/python-list

Re: finding items that occur more than once in a list

2008-03-18 Thread Chris
On Mar 18, 11:57 am, Simon Forman <[EMAIL PROTECTED]> wrote: > Is there a more efficient way to do this? > > def f(L): >     '''Return a set of the items that occur more than once in L.''' >     L = list(L) >     for item in set(L): >         L.remove(item) >     return set(L) > > |>> f([0, 0, 1, 1