Re: choosing a default text-encoding in Python programs (was: To unicode or not to unicode)

2009-02-22 Thread John Machin
On Feb 23, 11:46 am, Joshua Judson Rosen wrote: > Denis Kasak writes: > > > > > Python "assumes" ASCII and if the decoded/encoded text doesn't > > > > fit that encoding it refuses to guess. > > > > Which is reasonable given that Python is a programming language where it's > > > better to have more

choosing a default text-encoding in Python programs (was: To unicode or not to unicode)

2009-02-22 Thread Joshua Judson Rosen
Denis Kasak writes: > > > > Python "assumes" ASCII and if the decoded/encoded text doesn't > > > fit that encoding it refuses to guess. > > > > Which is reasonable given that Python is a programming language where it's > > better to have more conservative assumptions about encodings so errors > > can

Re: To unicode or not to unicode

2009-02-22 Thread Denis Kasak
On Sun, Feb 22, 2009 at 1:39 AM, Ross Ridge wrote: > Ross Ridge (Sat, 21 Feb 2009 18:06:35 -0500) >> I understand what Unicode and MIME are for and why they exist. Neither >> their merits nor your insults change the fact that the only current >> standard governing the content of Usenet posts doesn

Re: To unicode or not to unicode

2009-02-22 Thread dineshv
re: "You should never have to rely on the default encoding. You should explicitly decode and encode data." What is the best practice for 1) doing this in Python and 2) for Unicode support? I want to standardize on Unicode and want to put into place best Python practice so that we don't have to w
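
One common way to follow that advice is to decode bytes to text where data enters the program, work with text internally, and encode again where data leaves. The sketch below uses Python 3 syntax, which is newer than this thread. The helper names `read_text` and `write_text` and the choice of UTF-8 are assumptions for illustration, not something stated by the posters.

```python
# Assumption: the program standardizes on UTF-8 at its boundaries.
ENCODING = "utf-8"


def read_text(raw: bytes, encoding: str = ENCODING) -> str:
    """Decode bytes at the input boundary (hypothetical helper)."""
    # errors="strict" is the default, so bad input raises instead of being guessed at.
    return raw.decode(encoding)


def write_text(text: str, encoding: str = ENCODING) -> bytes:
    """Encode text at the output boundary (hypothetical helper)."""
    return text.encode(encoding)


# UTF-8 bytes for U+00B5 (MICRO SIGN) followed by "Wiki".
raw = b"\xc2\xb5Wiki"
name = read_text(raw)
round_trip = write_text(name)

# Decoding the same bytes as ASCII fails rather than guessing.
ascii_error = None
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    ascii_error = exc
```

The point of naming the codec at every boundary is that the result no longer depends on whatever default encoding the installation happens to have.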

Re: To unicode or not to unicode

2009-02-21 Thread Joshua Judson Rosen
Ross Ridge writes: > > > It's all about declaring your charset. In Python as well as in your > > newsreader. If you don't declare your charset it's ASCII for you - in > > Python as well as in your newsreader. > > Except in practice unlike Python, many newsreaders don't assume ASCII. > The origi

Re: To unicode or not to unicode

2009-02-21 Thread Martin v. Löwis
> Since when is "Google Groups" a newsreader? So far as I know, all > the display/formatting is handled by my web browser and GG merely stuffs > messages into an HTML wrapper... It also transmits this HTML wrapper via HTTP, where it claims that the charset of the HTML is UTF-8. To do that, i

Re: To unicode or not to unicode

2009-02-21 Thread Steve Holden
Thorsten Kampe wrote: > * Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500) >> Thorsten Kampe wrote: >>> It's all about declaring your charset. In Python as well as in your >>> newsreader. If you don't declare your charset it's ASCII for you - in >>> Python as well as in your newsreader. >> Except in

Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 19:39:42 -0500) > Thorsten Kampe wrote: > >That's right. As long as you use pure ASCII you can skip this nasty step > >of informing other people which charset you are using. If you do use non > >ASCII then you have to do that. That's the way virtually all newsread

Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 18:06:35 -0500) > I understand what Unicode and MIME are for and why they exist. Neither > their merits nor your insults change the fact that the only current > standard governing the content of Usenet posts doesn't require their > use. Thorsten Kampe wrote: >That's

Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 18:06:35 -0500) > > The link demonstrates that Google Groups doesn't assume ASCII like > > Python does. Since popular newsreaders like Google Groups and Outlook > > Express can display the message correctly without the MIME headers, > > but your obscure one can't, th

Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 17:07:35 -0500) > The link demonstrates that Google Groups doesn't assume ASCII like > Python does. Since popular newsreaders like Google Groups and Outlook > Express can display the message correctly without the MIME headers, > but your obscure one can't, there's a m

Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 17:07:35 -0500) > The link demonstrates that Google Groups doesn't assume ASCII like > Python does. Since popular newsreaders like Google Groups and Outlook > Express can display the message correctly without the MIME headers, > but your obscure one can't, there's a

Re: To unicode or not to unicode

2009-02-21 Thread Carl Banks
On Feb 19, 6:57 pm, Ron Garret wrote: > I'm writing a little wiki that I call µWiki.  That's a lowercase Greek > mu at the beginning (it's pronounced micro-wiki).  It's working, except > that I can't actually enter the name of the wiki into the wiki itself > because the default unicode encoding on

Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500) > Except in practice unlike Python, many newsreaders don't assume ASCII. Thorsten Kampe wrote: >They assume ASCII - unless you declare your charset (the exception being >Outlook Express and a few Windows newsreaders). Everything else is >"guessing".

Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500) > Thorsten Kampe wrote: >> It's all about declaring your charset. In Python as well as in your >> newsreader. If you don't declare your charset it's ASCII for you - in >> Python as well as in your newsreader. > > Except in practice unlike Python, ma

Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Thorsten Kampe wrote: >> RFC 1036 doesn't require nor give a meaning to a Content-Type header >> in a Usenet message > >Well, /maybe/ the reason for that is that RFC 1036 was written in 1987 >and the first MIME RFC in 1992...? Obviously. >"Son of RFC 1036" mentions MIME more often than you can

Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 12:22:36 -0500) > "Martin v. Löwis" wrote: > >I don't think that was the complaint. Instead, the complaint was > >that the OP's original message did not have a Content-type header, > >and that it was thus impossible to tell what the byte in front

Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
"Martin v. Löwis" wrote: >I don't think that was the complaint. Instead, the complaint was >that the OP's original message did not have a Content-type header, >and that it was thus impossible to tell what the byte in front of >"Wiki" meant. To properly post either MICRO SIGN or
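
The two characters under discussion, and the reason a bare byte cannot be interpreted without a declared charset, can be checked with Python's standard `unicodedata` module. This is a minimal sketch in Python 3 syntax. The variable names are made up for illustration, and the assumption that the byte in front of "Wiki" was 0xB5 is mine, not confirmed by the thread.

```python
import unicodedata

micro = "\u00b5"  # the character usually typed for "micro"
mu = "\u03bc"     # the lowercase Greek letter

# The two look alike but are distinct code points with distinct names.
micro_name = unicodedata.name(micro)
mu_name = unicodedata.name(mu)

# Their encoded forms differ, and differ again between charsets.
micro_latin1 = micro.encode("latin-1")
micro_utf8 = micro.encode("utf-8")
mu_utf8 = mu.encode("utf-8")

# Latin-1 has no Greek mu at all, so encoding it there fails.
mu_in_latin1 = True
try:
    mu.encode("latin-1")
except UnicodeEncodeError:
    mu_in_latin1 = False

# Assumption: the undeclared byte was 0xB5. Its meaning depends on the charset.
as_latin1 = b"\xb5".decode("latin-1")
as_cp437 = b"\xb5".decode("cp437")

# Compatibility normalization folds MICRO SIGN into the Greek letter.
folded = unicodedata.normalize("NFKC", micro)
```

Without a charset declaration, a reader of the byte has no way to choose between interpretations such as `as_latin1` and `as_cp437`.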

Re: To unicode or not to unicode

2009-02-20 Thread Martin v. Löwis
Ron Garret wrote: > In article <499f0cf0.8070...@v.loewis.de>, > "Martin v. Löwis" wrote: > > > I'm the OP. I'm using MT-Newswatcher 3.5.1. I thought I had it > configured properly, but I guess I didn't. Probably you did. However, it then means that the newsreader is crap. > Under > Pref

Re: To unicode or not to unicode

2009-02-20 Thread Ron Garret
In article <499f0cf0.8070...@v.loewis.de>, "Martin v. Löwis" wrote: > MRAB wrote: > > Thorsten Kampe wrote: > >> * Ron Garret (Thu, 19 Feb 2009 18:57:13 -0800) > >>> I'm writing a little wiki that I call µWiki. That's a lowercase > >>> Greek mu at the beginning (it's pronounced micro-wiki).

Re: To unicode or not to unicode

2009-02-20 Thread Martin v. Löwis
MRAB wrote: > Thorsten Kampe wrote: >> * Ron Garret (Thu, 19 Feb 2009 18:57:13 -0800) >>> I'm writing a little wiki that I call µWiki. That's a lowercase >>> Greek mu at the beginning (it's pronounced micro-wiki). >> >> No, it's not. I suggest you start your Unicode adventure by >> configuring yo

Re: To unicode or not to unicode

2009-02-20 Thread Ron Garret
MRAB wrote: > Thorsten Kampe wrote: > > * Ron Garret (Thu, 19 Feb 2009 18:57:13 -0800) > >> I'm writing a little wiki that I call µWiki. That's a lowercase Greek > >> mu at the beginning (it's pronounced micro-wiki). > > > > No, it's not. I suggest you start your Unicode advent

Re: To unicode or not to unicode

2009-02-20 Thread MRAB
Thorsten Kampe wrote: * Ron Garret (Thu, 19 Feb 2009 18:57:13 -0800) I'm writing a little wiki that I call µWiki. That's a lowercase Greek mu at the beginning (it's pronounced micro-wiki). No, it's not. I suggest you start your Unicode adventure by configuring your newsreader. It looked

Re: To unicode or not to unicode

2009-02-20 Thread Thorsten Kampe
* Ron Garret (Thu, 19 Feb 2009 18:57:13 -0800) > I'm writing a little wiki that I call µWiki. That's a lowercase Greek > mu at the beginning (it's pronounced micro-wiki). No, it's not. I suggest you start your Unicode adventure by configuring your newsreader. Thorsten

Re: To unicode or not to unicode

2009-02-19 Thread Benjamin Peterson
Ron Garret writes: > > I'm writing a little wiki that I call µWiki. That's a lowercase Greek > mu at the beginning (it's pronounced micro-wiki). It's working, except > that I can't actually enter the name of the wiki into the wiki itself > because the default unicode encoding o

To unicode or not to unicode

2009-02-19 Thread Ron Garret
I'm writing a little wiki that I call µWiki. That's a lowercase Greek mu at the beginning (it's pronounced micro-wiki). It's working, except that I can't actually enter the name of the wiki into the wiki itself because the default unicode encoding on my Python installation is "ascii". So I'm