Sorting a multidimensional array by multiple keys
Hello everyone, can I sort a multidimensional array in Python by multiple sort keys? A little code sample would be nice! Thx, Rehceb -- http://mail.python.org/mailman/listinfo/python-list
Re: Sorting a multidimensional array by multiple keys
> If you want a good answer you have to give me/us more details, and an
> example too.

OK, here is some example data:

reaction is BUT by the
sodium , BUT it is
sea , BUT it is
this manner BUT the dissolved
pattern , BUT it is
rapid , BUT it is

As each line consists of 5 words, I would break up the data into an array of five-field arrays (would you use lists or tuples or a combination in Python?). The word "BUT" would be in the middle, with two fields/words to the left and two fields/words to the right of it. I then want to sort this list by

- field 3
- field 4
- field 1
- field 0

in this hierarchy. This is the desired result:

pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
reaction is BUT by the
this manner BUT the dissolved

The first 4 lines could not be sorted by fields 3 & 4, as they are identical ("it", "is"), so they have been sorted first by field 1 (which is also identical: ",") and then by field 0:

pattern
rapid
sea
sodium

I hope I have explained this in an understandable way. It would be cool if you could show me how this can be done in Python!

Regards, Rehceb
Re: Sorting a multidimensional array by multiple keys
Wait, I made a mistake. The correct result would be

reaction is BUT by the
pattern , BUT it is
rapid , BUT it is
sea , BUT it is
sodium , BUT it is
this manner BUT the dissolved

because "by the" comes before and "the dissolved" after "it is". Sorry for the confusion.
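The replies to this thread are not part of this archive, so the following is only a minimal sketch of one common way to get the corrected ordering, not necessarily what was suggested on the list. It assumes Python 3 syntax, that each line splits on whitespace into exactly five fields, and that plain string comparison is the ordering wanted. The names `lines`, `rows` and `result` are chosen for illustration.

```python
from operator import itemgetter

# Example data from the post, one five-word line per entry.
lines = [
    "reaction is BUT by the",
    "sodium , BUT it is",
    "sea , BUT it is",
    "this manner BUT the dissolved",
    "pattern , BUT it is",
    "rapid , BUT it is",
]

# Break each line into a list of five fields.
rows = [line.split() for line in lines]

# itemgetter(3, 4, 1, 0) builds a tuple (field 3, field 4, field 1, field 0)
# for each row; tuples compare element by element, which gives the
# requested key hierarchy.
sorted_rows = sorted(rows, key=itemgetter(3, 4, 1, 0))

result = [" ".join(row) for row in sorted_rows]
for line in result:
    print(line)
```

Whether the inner records are lists or tuples makes no difference to the sort; lists are used here only because `str.split` returns them.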
Re: Sorting a multidimensional array by multiple keys
Thank you all for your helpful solutions! Regards, Rehceb
Unicode list
Hello, I have this little grep-like program:

++snip++
#!/usr/bin/python
import sys
import re

pattern = sys.argv[1]
inputfile = file(sys.argv[2], 'r')
for line in inputfile:
    matches = re.findall(pattern, line)
    if matches:
        print matches
++snip++

Like this, the program prints some characters as strange escape sequences, which is due to the input file being encoded in utf-8. When I convert "re.findall..." to a string and wrap a "unicode()" around it, the matches get printed correctly. Is it possible to make "matches" unicode without saving it as a single string first? The function "unicode()" seems to work only for strings. Or is there a general way of telling Python to abandon the ancient and evil land of iso-8859 for good and use utf-8 only?

Regards, Rehceb
Re: Unicode list
> When printing a list, the individual elements are converted with repr(),
> not with str(). For a string object, repr() adds escape codes for all
> bytes that are not printable ASCII characters.

Thanks Martin, you're right, it was the repr() calls that messed up the output. Iterating over the array as you proposed is even 1/100 s faster ;)

Regards, Rehceb
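The repr() effect described in the quote can be sketched as follows. The thread is about Python 2 byte strings; this illustration assumes Python 3, where the closest analogue is a list of `bytes` objects. The sample text and the `\S+` pattern are made up for the example.

```python
import re

# Bytes as they would arrive from a UTF-8 file opened in binary mode
# (sample text is hypothetical; \u00e4 is the letter a-umlaut).
line = "B\u00e4r isst K\u00e4se".encode("utf-8")

# A bytes pattern returns a list of bytes objects.
matches = re.findall(rb"\S+", line)

# Printing the list itself uses repr() on every element, so the
# non-ASCII bytes show up as escape sequences such as \xc3\xa4.
list_repr = repr(matches)
print(list_repr)

# Decoding each element individually yields readable text strings.
decoded = [m.decode("utf-8") for m in matches]
```

Printing the elements of `decoded` one by one, rather than the list as a whole, shows the words without escape sequences.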
Overlapping matches
In the re documentation, it says that the matching functions return "non-overlapping" matches only, but I also need overlapping ones. Does anyone know how this can be done? Regards, Rehceb Rotkiv
Re: Overlapping matches
Both methods work well, thank you!
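The two methods referred to are not preserved in this archive. As a hedged illustration only, here are two common ways to collect overlapping matches with the standard `re` module, in Python 3 syntax. The function names are hypothetical, and the lookahead variant assumes the pattern contains no capturing groups of its own.

```python
import re

def overlapping_lookahead(pattern, text):
    """Wrap the pattern in a zero-width lookahead with one capturing group.

    The lookahead consumes no characters, so the scan advances one
    position at a time and each start position can yield a match.
    """
    return [m.group(1) for m in re.finditer("(?=(%s))" % pattern, text)]

def overlapping_search(pattern, text):
    """Search repeatedly, restarting one character after each match start."""
    regex = re.compile(pattern)
    results = []
    pos = 0
    while True:
        m = regex.search(text, pos)
        if m is None:
            break
        results.append(m.group(0))
        pos = m.start() + 1
    return results

plain = re.findall("aba", "ababa")                  # non-overlapping only
with_lookahead = overlapping_lookahead("aba", "ababa")
with_search = overlapping_search("aba", "ababa")
print(plain, with_lookahead, with_search)
```

Both variants report at most one match per start position, so a pattern that could match several different lengths at the same position still yields only one result there.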
Checking whether list element exists
I want to check whether, for example, the element myList[-3] exists. So far I did it like this:

index = -3
if len(myList) >= abs(index):
    print myList[index]

Another idea I had was to (ab-?)use the try...except structure:

index = -3
try:
    print myList[index]
except:
    print "Does not exist!"

Is it ok to use try...except for the test or is it bad coding style? Or is there another, more elegant method than these two?

Regards, Rehceb
Re: Checking whether list element exists
> In general case it won't work, because lists accept negative indexes:
> http://docs.python.org/lib/typesseq.html, 3rd note.

Yes, I know! I _want_ the "3rd last list element", i.e. list[-3]. But it may be that the list does not have 3 elements. In this case, list[-3] will throw an error, cf.:

>>> arr = ['a','b','c']
>>> print arr[-3]
a
>>> print arr[-4]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
IndexError: list index out of range
>>>

I thought maybe I could catch the error with try...except so that I do not need the if-test, but I don't know whether this is proper usage of the try...except structure.
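As a sketch of both approaches (Python 3 syntax; the helper names `get_or_default` and `has_index` are made up for illustration): catching the specific `IndexError` instead of using a bare `except` keeps unrelated errors visible, and an explicit range test covers positive and negative indexes alike.

```python
def get_or_default(items, index, default=None):
    """Return items[index], or default if the index is out of range."""
    try:
        return items[index]
    except IndexError:  # catch only the out-of-range case
        return default

def has_index(items, index):
    """True if items[index] exists, for positive or negative indexes."""
    return -len(items) <= index < len(items)

arr = ['a', 'b', 'c']
print(get_or_default(arr, -3))   # third-last element exists
print(get_or_default(arr, -4))   # out of range, falls back to the default
print(has_index(arr, -3), has_index(arr, -4))
```

Note that a test of the form `len(items) >= abs(index)` is only correct for negative indexes: for a three-element list it would accept index 3, which is out of range.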
Unicode problem
Please have a look at this little script:

#!/usr/bin/python
import sys
import codecs
fileHandle = codecs.open(sys.argv[1], 'r', 'utf-8')
fileString = fileHandle.read()
print fileString

If I call it from a Bash shell like this

$ ./test.py testfile.utf8.txt

it works just fine, but when I try to pipe the output to another process ("|") or into a file (">"), e.g. like this

$ ./test.py testfile.utf8.txt | cat

I get an error:

Traceback (most recent call last):
  File "./test.py", line 6, in ?
    print fileString
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 538: ordinal not in range(128)

I absolutely don't know what the problem is here, can you help?

Thanks, Rehceb
Re: Checking whether list element exists
Thanks for your many helpful tips! Rehceb Rotkiv
Re: Unicode problem
On Sat, 07 Apr 2007 12:46:49 -0700, Gabriel Genellina wrote:

> You have to encode the Unicode object explicitly:
>     print fileString.encode("utf-8")
> (or any other suitable one; I said utf-8 just because you read the input
> file using that)

Thanks! That's a nice little stumbling block for a newbie like me ;) Is there a way to make utf-8 the default encoding for every string, so that I do not have to encode each string explicitly?
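One way to avoid encoding every string by hand is to wrap the output stream once in an encoding writer. The sketch below assumes Python 3 and uses an in-memory `io.BytesIO` as a stand-in for the pipe; the sample string and variable names are hypothetical.

```python
import io

text = "B\u00e4r"  # contains U+00E4, the character from the traceback

# Encoding to ASCII fails in the same way as the piped script did.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Explicit encoding, as suggested in the quoted reply.
explicit_bytes = text.encode("utf-8")

# Alternative: wrap a byte stream once, then write text to it as usual.
pipe = io.BytesIO()  # stand-in for a pipe or redirected file
writer = io.TextIOWrapper(pipe, encoding="utf-8")
writer.write(text)
writer.flush()
piped_bytes = pipe.getvalue()
```

In a real Python 3 script the wrapped stream would be `sys.stdout.buffer` rather than a `BytesIO`. For the Python 2 used in this thread, the comparable approach is to wrap standard output with `codecs.getwriter("utf-8")(sys.stdout)`.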