On 24 December 2012 16:18, Roy Smith <r...@panix.com> wrote: > In article <40d108ec-b019-4829-a969-c8ef51386...@googlegroups.com>, > Pander Musubi <pander.mus...@gmail.com> wrote: > > > Hi all, > > > > I would like to sort according to this order: > > > > (' ', '.', '\'', '-', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', > 'a', > > 'A', '?', '?', '?', '?', '?', '?', '?', '?', '?', '?', 'b', 'B', 'c', > 'C', > > '?', '?', 'd', 'D', 'e', 'E', '?', '?', '?', '?', '?', '?', '?', '?', > 'f', > > 'F', 'g', 'G', 'h', 'H', 'i', 'I', '?', '?', '?', '?', '?', '?', '?', > '?', > > 'j', 'J', 'k', 'K', 'l', 'L', 'm', 'M', 'n', '?', 'N', '?', 'o', 'O', > '?', > > '?', '?', '?', '?', '?', '?', '?', '?', '?', 'p', 'P', 'q', 'Q', 'r', > 'R', > > 's', 'S', 't', 'T', 'u', 'U', '?', '?', '?', '?', '?', '?', '?', '?', > 'v', > > 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z') > > > > How can I do this? The default sorted() does not give the desired result. >
<snip> Given all that, I would start by writing some code which turned your > alphabet into a pair of dicts. One maps from the code point to a > collating sequence number (i.e. ordinals), the other maps back. > Something like (for python 2.7): > > alphabet = (' ', '.', '\'', '-', '0', '1', '2', '3', '4', '5', > '6', '7', '8', '9', 'a', 'A', '?', '?', '?', '?', > [...] > 'v', 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z') > > map1 = {c: n for n, c in enumerate(alphabet)} > map2 = {n: c for n, c in enumerate(alphabet)} > > Next, I would write some functions which encode your strings as lists of > ordinals (and back again) > > def encode(s): > "encode('foo') ==> [34, 19, 19]" # made-up ordinals > return [map1[c] for c in s] > > def decode(l): > "decode([34, 19, 19]) ==> 'foo'" > return ''.join(map2[i] for i in l) > > Use these to convert your strings to lists of ints which will sort as > per your specified collating order, and then back again: > > encoded_strings = [encode(s) for s in original_list] > encoded_strings.sort() > sorted_strings = [decode(l) for l in encoded_strings] > This isn't needed and the not-so-new way to do this is through .sort's key attribute. encoded_strings = [encode(s) for s in original_list] encoded_strings.sort() sorted_strings = [decode(l) for l in encoded_strings] changes to encoded_strings.sort(key=encode) [Which happens to be faster </reasonable_guess>] Hence you neither need map2 or decode: ## CODE ## alphabet = ( ' ', '.', '\'', '-', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'A', 'ä', 'Ä', 'á', 'Á', 'â', 'Â', 'à', 'À', 'å', 'Å', 'b', 'B', 'c', 'C', 'ç', 'Ç', 'd', 'D', 'e', 'E', 'ë', 'Ë', 'é', 'É', 'ê', 'Ê', 'è', 'È', 'f', 'F', 'g', 'G', 'h', 'H', 'i', 'I', 'ï', 'Ï', 'í', 'Í', 'î', 'Î', 'ì', 'Ì', 'j', 'J', 'k', 'K', 'l', 'L', 'm', 'M', 'n', 'ñ', 'N', 'Ñ', 'o', 'O', 'ö', 'Ö', 'ó', 'Ó', 'ô', 'Ô', 'ò', 'Ò', 'ø', 'Ø', 'p', 'P', 'q', 'Q', 'r', 'R', 's', 'S', 't', 'T', 'u', 'U', 'ü', 'Ü', 'ú', 'Ú', 'û', 'Û', 'ù', 'Ù', 'v', 'V', 'w', 'W', 'x', 'X', 'y', 'Y', 'z', 'Z' ) hashindex = {character:index for index, character in enumerate(alphabet)} def string2sortlist(string): return [hashindex[s] for s in string] # Quickly make some stuff to sort. Let's try 200k, as that's what's suggested. import random things_to_sort = ["".join(random.sample(alphabet, random.randint(4, 6))) for _ in range(200000)] print(things_to_sort[:15]) things_to_sort.sort(key=string2sortlist) print(things_to_sort[:15]) ## END CODE ## Not-so-coincidentally, this is exactly the same as Ian Kelly's extension to Tomas Bach's method.
-- http://mail.python.org/mailman/listinfo/python-list