David Eppstein wrote: > In article <[EMAIL PROTECTED]>, > Patrick Useldinger <[EMAIL PROTECTED]> wrote: > > > > Well, but the spec didn't say efficiency was the primary criterion, it > > > said minimizing the number of comparisons was. > > > > That's exactly what my program does. > > If you're doing any comparisons at all, you're not minimizing the number > of comparisons.
Gah. You two should be lawyers. I don't know why you're arguing about whatever you are arguing about. Just look at the efficiency of processing N files of the same size S, where they differ after d bytes: [If they don't differ, d = S] PU: O(Nd) reading time, O(Nd) data comparison time [Actually (N-1)d which is important for small N and large d]. Hashing method: O(NS) reading time, O(NS) hash calc time 1. d <= S 2. Anyone have a usable hash function that's faster than string comparison? 3. Anyone have an answer to PU's request for advice where his algorithm could be improved by using a hash method? Temporary injunction in favour of PU. -- http://mail.python.org/mailman/listinfo/python-list