On 20/12/2019 18:59, Peter Otten wrote:
Chris Angelico wrote:
On Sat, Dec 21, 2019 at 5:03 AM Peter Otten <__pete...@web.de> wrote:
PS: If you are sorting files by size and checksum as part of a
deduplication effort, consider using dicts instead:
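A minimal sketch of the dict idea, not necessarily the code from the original message. The record layout (name, size, checksum), the helper name group_duplicates and the sample data are all assumptions made for illustration:

```python
from collections import defaultdict


def group_duplicates(files):
    """Group (name, size, checksum) records that share size and checksum.

    Hypothetical helper: the record layout is an assumption.
    """
    groups = defaultdict(list)
    for name, size, checksum in files:
        groups[(size, checksum)].append(name)
    # Only keys that collected more than one name are duplicate candidates.
    return {key: names for key, names in groups.items() if len(names) > 1}


# Made-up sample data.
files = [
    ("a.txt", 100, "abc"),
    ("b.txt", 100, "abc"),
    ("c.txt", 100, "def"),
    ("d.txt", 200, "abc"),
]
duplicates = group_duplicates(files)
```

No sorting is involved here: files with the same (size, checksum) pair land in the same dict entry in a single pass.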
Yeah, I'd agree if that's the purpose. But let's say the point is to
have a guaranteed-stable ordering of files that are primarily to be
sorted by file size - in order to ensure that two files are in the
same order every time you refresh the view, they get sorted by their
checksums.
One thing that struck me about Eli's example is that it features two key
functions rather than a complex comparison.
If sort() accepted a sequence of key functions, each function could be
used to sort slices that compare equal when using the previous key.
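For illustration, something close to this can already be built on top of sorted() by combining the key functions into one tuple-valued key. sort_by_keys is a hypothetical name, not part of the standard library, and the sample records are made up:

```python
from operator import itemgetter


def sort_by_keys(items, *key_funcs):
    """Sort by several key functions, earlier ones taking priority.

    Hypothetical helper: each item is mapped to a tuple of key values,
    and tuples compare element by element from left to right.
    """
    return sorted(items, key=lambda item: tuple(f(item) for f in key_funcs))


# Made-up sample data.
records = [
    {"name": "a.txt", "size": 200, "checksum": "bb"},
    {"name": "b.txt", "size": 100, "checksum": "zz"},
    {"name": "c.txt", "size": 100, "checksum": "aa"},
]
by_size_then_checksum = sort_by_keys(
    records, itemgetter("size"), itemgetter("checksum")
)
names = [r["name"] for r in by_size_then_checksum]
```

This evaluates every key function for every item, whereas the idea above would only need the later keys for slices that tie on the earlier ones.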
You don't need a sequence of key functions: the sort algorithm used in
Python (Timsort) is stable, which means that if two items (A & B) are in a
given order in the sequence before the sort starts, and A & B compare
equal during the sort, then after the sort A & B retain their ordering.
So if you want to sort by file size as the primary key and then by checksum
when file sizes are equal, you sort by checksum first, and then by file
size: this guarantees that the items will always be in file size order,
and if file sizes are equal then they will be ordered by checksum.
The rule to remember: sort in the reverse order of the criteria.
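A small sketch of that rule, assuming each file is a (name, size, checksum) tuple; the sample data is made up. The two-pass result is compared with a single sort on a (size, checksum) tuple key:

```python
from operator import itemgetter

# Made-up sample data: (name, size, checksum).
files = [
    ("a.txt", 200, "bb"),
    ("b.txt", 100, "zz"),
    ("c.txt", 100, "aa"),
    ("d.txt", 200, "aa"),
]

# Sort in reverse order of the criteria: secondary key first ...
two_pass = sorted(files, key=itemgetter(2))  # by checksum
# ... then primary key; stability keeps the checksum order within each size.
two_pass.sort(key=itemgetter(1))             # by size

# Equivalent single pass with a tuple key (size, checksum).
one_pass = sorted(files, key=itemgetter(1, 2))

names = [name for name, _, _ in two_pass]
```

When every criterion sorts in the same direction the tuple key is usually simpler; the multi-pass form is handy when the passes need different directions, since each pass can take its own reverse= argument.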
There ARE good reasons to do weird things with sorting, and a custom
key object (either with cmp_to_key or directly implemented) can do
that.
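As one hedged example of the cmp_to_key route: the comparison below puts larger files first and breaks ties by ascending checksum. The function name compare_files, the (name, size, checksum) layout and the sample data are assumptions made for illustration:

```python
from functools import cmp_to_key


def compare_files(a, b):
    """Old-style comparison: larger files first, ties by ascending checksum.

    Returns a negative number if a sorts before b, positive if after,
    and zero if they are equal for sorting purposes.
    """
    (_, size_a, sum_a), (_, size_b, sum_b) = a, b
    if size_a != size_b:
        return size_b - size_a  # negative when a is larger, so a comes first
    return (sum_a > sum_b) - (sum_a < sum_b)


# Made-up sample data: (name, size, checksum).
files = [
    ("a.txt", 200, "bb"),
    ("b.txt", 100, "zz"),
    ("c.txt", 100, "aa"),
    ("d.txt", 200, "aa"),
]
ordered = sorted(files, key=cmp_to_key(compare_files))
ordered_names = [name for name, _, _ in ordered]
```

A comparison function like this is called once per comparison rather than once per item, so it tends to be slower than a plain key function; it earns its place when the ordering is awkward to express as a key.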
Indeed.