On 19 Oct, 21:07, George Trojan <george.tro...@noaa.gov> wrote:
> A trivial one, this is the first time I have to deal with Unicode. I am
> trying to parse a string s='''48° 13' 16.80" N'''. I know the charset is
> "iso-8859-1". To get the degrees I did
> >>> encoding='iso-8859-1'
> >>> q=s.decode(encoding)
> >>> q.split()
> [u'48\xc2\xb0', u"13'", u'16.80"', u'N']
> >>> r=q.split()[0]
> >>> int(r[:r.find(unichr(ord('\xc2')))])
> 48
>
> Is there a better way of getting the degrees?
>
> George

When parsing strings, use regular expressions. If you don't know how, spend some time teaching yourself; it is time well spent. A great tool for playing around with REs is KODOS.

For the problem at hand you can do, e.g.:

    import re
    degrees = int(re.findall(r'\d+', s)[0])

In essence, re.findall() returns every run of consecutive digits in s; the code takes the first run and passes it to int(). There is no need to know or care that the string is Unicode, or what the underlying encoding of the charset is.
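
If the minutes, seconds and hemisphere are wanted as well, the same idea extends to a single pattern with capture groups. What follows is only a rough sketch: the names DMS_PATTERN and parse_dms are made up for illustration, and it assumes the input always has the shape "degrees minutes seconds hemisphere", as in the example string.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical pattern (not from the original post): three numeric fields
# separated by any non-digit characters, followed by a hemisphere letter.
DMS_PATTERN = re.compile(r'(\d+)\D+(\d+)\D+(\d+(?:\.\d+)?)\D*([NSEW])')


def parse_dms(text):
    """Return (degrees, minutes, seconds, hemisphere) found in text."""
    match = DMS_PATTERN.search(text)
    if match is None:
        raise ValueError("no coordinate found in %r" % (text,))
    degrees, minutes, seconds, hemisphere = match.groups()
    return int(degrees), int(minutes), float(seconds), hemisphere


# The string from the question, with the degree sign written as an escape
# so the example does not depend on the source file encoding.
s = u'''48\u00b0 13' 16.80" N'''
degrees, minutes, seconds, hemisphere = parse_dms(s)
```

Because the separators are matched with \D, the pattern never has to name the degree sign or the quote characters, so it does not matter how those characters were encoded.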