On 26 Apr 2005 19:16:25 -0700, [EMAIL PROTECTED] wrote:

>
>John Machin wrote:
>> On 26 Apr 2005 13:39:26 -0700, [EMAIL PROTECTED] (dumbkiwi) wrote:
>>
>> >Peter Otten <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
>> >> Dumbkiwi wrote:
>> >>
>> >> >> Just encode the data in the target encoding before passing it to
>> >> >> os.popen():
>> >> >
>> >Anyway, from your post, I've done some more digging, and found the
>> >command:
>> >
>> >sys.setappdefaultencoding()
>> >
>> >which I've used, and it's fixed the problem (I think).
>> >
>>
>> Dumb Kiwi, eh? Maybe not so dumb -- where'd you find
>> sys.setappdefaultencoding()? I'm just a dumb Aussie [1]; I looked in
>> the 2.4.1 docs and also did import sys; dir(sys) and I can't spot it.
>
>Hmmm. See post above, seems to be something generated by eric3. So
>this may not be the fix I'm looking for.
>
>>
>> In any case, how could the magical sys.setappdefaultencoding() fix
>> your problem? From your description, your problem appeared to be that
>> you didn't know what encoding to use.
>
>I knew what encoding to use,

Would you mind telling us
(a) what that encoding is
(b) how you came to that knowledge
(c) why you just didn't do

    test = os.popen('kdialog --inputbox %s' %(data.encode('that_encoding')))

instead of

    test = os.popen('kdialog --inputbox %s' %(data.encode('utf-8')))

> the problem was that the text was being
>passed to kdialog as ascii.

It wasn't being passed to kdialog; there was an attempt which failed.

> The .encode('utf-8') at least allows
>kdialog to run, but the text still looks like crap. Using
>sys.setappdefaultencoding() seemed to help. The text looked a bit
>better - although not entirely perfect - but I think that's because the
>font I was using didn't have the correct characters (they came up as
>square boxes).

And the font you *were* using is what? And the font you are now using
is what? What facilities do you have to use different fonts?

>>
>> What is the essential difference between
>>
>> send(u_data.encode('polish'))
>>
>> and
>>
>> sys.setappdefaultencoding('polish')
>> ...
>> send(u_data)
>
>Not sure - I'm new to character encoding, and most of this seems like
>black magic to me.

The essential difference is that setting a default encoding is a daft
idea.

>
>>
>> [1]: Now that's *TWO* contenders for TautologyOTW :-)
>>

Before I retract that back to one contender, I'll give it one more
shot:

1. Your data: you say it is Polish text, and is utf-8. This implies
that it is in Unicode, encoded as utf-8. What evidence do you have?
Have you been able to display it anywhere so that it "looks good"? If
it's not confidential, can you show us a dump of the first say 100
bytes of text, in an unambiguous form, like this:

    print repr(open('polish.text', 'rb').read(100))

2. Your script: You say "I then manipulate the data to break it down
into text snippets" - uh-huh ... *what* manipulations? Care to tell
us? Care to show us the code?

3. kdialog: I know nothing of KDE and its toolkit.
I would expect either
(a) it should take utf-8 and be able to display *any* of the first 64K
(nominal) Unicode characters, given a Unicode font, or
(b) you can encode your data in a legacy charset, *AND* tell it what
that charset is, and have a corresponding font, or
(c) you have both options.
Which is correct, and what are the details of how you can tell kdialog
what to do -- configuration? command-line arguments?

HTHYTHYS,
John
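
P.S. To make the "encode explicitly and dump the bytes" advice concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the sample word, the helper name dump(), and the choice of iso-8859-2 as an example legacy charset for Polish. None of that comes from the original poster's script, and nothing here says which encoding kdialog actually wants. Note also that 'polish' in the quoted snippet is a placeholder, not a real codec name. The code avoids Python-version-specific syntax, so it should behave the same on a current interpreter.

```python
# Illustrative sketch only: the sample text, the helper name and the choice
# of iso-8859-2 are assumptions, not details taken from the thread.

def dump(data, count=100):
    """Return an unambiguous repr of the first `count` bytes of `data`."""
    return repr(data[:count])

# A short Polish word with diacritics, written with escapes so that the
# source file itself stays pure ASCII.
text = u"\u017c\u00f3\u0142w"

# Explicit encoding: the caller decides, visibly, which bytes go out.
utf8_bytes = text.encode("utf-8")
latin2_bytes = text.encode("iso-8859-2")

# Same text, two different byte sequences.
print(dump(utf8_bytes))
print(dump(latin2_bytes))

# Decoding with the codec that was used for encoding restores the text.
roundtrip = utf8_bytes.decode("utf-8")

# Decoding utf-8 bytes as iso-8859-2 raises no error but yields garbage,
# which is one way text ends up "looking like crap" on screen.
mojibake = utf8_bytes.decode("iso-8859-2")
print(roundtrip == text)
print(mojibake == text)
```

The point of the explicit .encode() call is that the encoding is named at the place where text becomes bytes, so a mismatch with the receiving program can be seen and fixed there, rather than being hidden behind a process-wide default.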