2012/9/13 Tim Chase <python.l...@tim.thechases.com>:
> I've got a bunch of text in Portuguese and to transmit them, need to
> have them in us-ascii (7-bit). I'd like to keep as much information
> as possible, just stripping accents, cedillas, tildes, etc. So
> "serviço móvil" becomes "servico movil". Is there anything stock
> that I've missed? I can do mystring.encode('us-ascii', 'replace')
> but that doesn't keep as much information as I'd hope.
>
> -tkc
>

Hi,
would something like the following be enough for your needs?
Unfortunately, I can't check it reliably with regard to Portuguese.

>>> import unicodedata
>>> unicodedata.normalize("NFD", u"serviço móvil").encode("ascii", "ignore").decode("ascii")
u'servico movil'
>>>

There is also "Unidecode", but I haven't used it myself so far...
http://pypi.python.org/pypi/Unidecode/

hth,
  vbr
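
For reference, the same NFD approach could be wrapped in a small helper. This is only a sketch, assuming Python 3 (where str is already Unicode); the name strip_accents is made up for illustration and is not part of unicodedata.

```python
import unicodedata


def strip_accents(text):
    """Return text reduced to 7-bit ASCII (hypothetical helper, not a stdlib function)."""
    # NFD splits each accented letter into a base letter plus combining marks,
    # e.g. "ç" becomes "c" followed by U+0327 (combining cedilla).
    decomposed = unicodedata.normalize("NFD", text)
    # The combining marks are not ASCII, so the "ignore" handler drops them
    # and only the base letters survive.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(strip_accents("serviço móvil"))
```

One caveat: characters that have no canonical decomposition, such as "ß" or "ø", are dropped entirely by the "ignore" handler rather than replaced with a look-alike, so this is mainly useful for text whose non-ASCII characters are accented Latin letters.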