Re: [Python-Dev] File encodings

2004-11-30 Thread "Martin v. Löwis"
Gustavo Niemeyer wrote:
> Given the fact that files have an 'encoding' parameter, and that
> any unicode strings with characters not in the 0-127 range will
> raise an exception if being written to files, isn't it reasonable
> to respect the 'encoding' attribute whenever writing data to a
> file?

In general, files don't have an encoding parameter - sys.stdout
is an exception.
The reason why this works for print and not for write is that
I considered "print unicodeobject" important, and wanted to
implement that. file.write is an entirely different code path,
so it doesn't currently consider Unicode objects; instead, it
only supports strings (or, more generally, buffers).
> This difference may become a really annoying problem when trying to
> internationalize programs, since it's usual to see third-party code
> dealing with sys.stdout, instead of using 'print'.
Apparently, it isn't important enough that somebody had analysed this,
and offered a patch. In any case, it would be quite unreliable to
pass unicode strings to .write even *if* .write supported .encoding,
since most files don't have .encoding. Even sys.stdout does not
always have .encoding - only when it is a terminal, and only if we
managed to find out what the encoding of the terminal is.
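
A minimal sketch of that caveat, written in Python 3 syntax (where str plays the role of the unicode type discussed here). The helper name guess_output_encoding and the fallback to the locale's preferred encoding are assumptions made for illustration; nothing in this thread says print or file.write behaves this way:

```python
import locale
import sys


def guess_output_encoding(stream, default="ascii"):
    """Hypothetical helper: pick an encoding for text written to `stream`."""
    # Most file objects carry no usable .encoding; sys.stdout has one only
    # when it is a terminal whose encoding could be determined.
    declared = getattr(stream, "encoding", None)
    if declared:
        return declared
    # Assumed fallback: ask the locale, then settle for the default.
    return locale.getpreferredencoding(False) or default


print(guess_output_encoding(sys.stdout))
```

Code that wants to write non-ASCII text could call such a helper once and encode explicitly, rather than relying on every stream to announce its encoding.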
Regards,
Martin
___
Python-Dev mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] File encodings

2004-11-30 Thread M.-A. Lemburg
Gustavo Niemeyer wrote:
> Greetings,
>
> Today, while trying to internationalize a program I'm working on,
> I found an interesting side-effect of how we're dealing with
> encoding of unicode strings while being written to files.
>
> Suppose the following example:
>
>   # -*- encoding: iso-8859-1 -*-
>   print u"á"
>
> This will correctly print the string 'á', as expected. Now, what
> surprises me, is that the following code won't work in an equivalent
> way (unless using sys.setdefaultencoding()):
>
>   # -*- encoding: iso-8859-1 -*-
>   import sys
>   sys.stdout.write(u"á\n")
>
> This will raise the following error:
>
>   Traceback (most recent call last):
>     File "asd.py", line 3, in ?
>       sys.stdout.write(u"á")
>   UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1'
>   in position 0: ordinal not in range(128)
>
> This difference may become a really annoying problem when trying to
> internationalize programs, since it's usual to see third-party code
> dealing with sys.stdout, instead of using 'print'. The standard
> optparse module, for instance, has a reference to sys.stdout which
> is used in the default --help handling mechanism.

You are mixing things here:
The source encoding is meant for the
parser and defines the way Unicode literals are converted
into Unicode objects.
The encoding used on the stdout stream doesn't have anything
to do with the source code encoding and has to be handled
differently.
The idiom presented by Bob is the right way to go: wrap
sys.stdout with a StreamEncoder.
Using sys.setdefaultencoding() is *not* the right solution
to the problem.
In general when writing programs that are targeted for
i18n, you should use Unicode for all text data and
convert from Unicode to 8-bit only at the IO/UI layer.
The various wrappers in the codecs module make this
rather easy.
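
A small sketch of the wrapping idiom, in Python 3 syntax. In the codecs module the wrapper class is StreamWriter, obtained through codecs.getwriter(). The io.BytesIO object standing in for a byte-oriented stdout and the choice of iso-8859-1 are assumptions for illustration only:

```python
import codecs
import io

# Stand-in for a byte-oriented stdout, so the result can be inspected.
raw = io.BytesIO()

# codecs.getwriter() returns the StreamWriter class for an encoding;
# calling that class wraps the byte stream.
out = codecs.getwriter("iso-8859-1")(raw)

# Text stays Unicode inside the program and is encoded only here,
# at the I/O boundary.
out.write("á\n")

encoded = raw.getvalue()
print(repr(encoded))  # prints b'\xe1\n'
```

Third-party code that calls write() on the wrapped object then gets the encoding step without knowing about it.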
--
Marc-Andre Lemburg
eGenix.com


[Python-Dev] RELEASED Python 2.4 (final)

2004-11-30 Thread Anthony Baxter
On behalf of the Python development team and the Python community, I'm
happy to announce the release of Python 2.4.

Python 2.4 is a final, stable release, and we can recommend that Python
users upgrade to this version.

Python 2.4 is the result of almost 18 months' worth of work on top 
of Python 2.3 and represents another stage in the careful evolution 
of Python. New language features have been kept to a minimum, many 
bugs have been fixed and a wide variety of improvements have been made.

Notable changes in Python 2.4 include improvements to the importing of
modules, generator expressions, function decorators, a number of new 
modules (including subprocess, decimal and cookielib) and countless 
numbers of fixed bugs and smaller enhancements. For more, see the 
(subjective) highlights, the release notes, or Andrew Kuchling's What's 
New In Python, all available from the 2.4 web page.

http://www.python.org/2.4/

Please log any problems you have with this release in the SourceForge
bug tracker (noting that you're using Python 2.4):

http://sourceforge.net/bugs/?group_id=5470

Enjoy the new (stable!) release,
Anthony

Anthony Baxter
[EMAIL PROTECTED]
Python Release Manager
(on behalf of the entire python-dev team)




Re: [Python-Dev] File encodings

2004-11-30 Thread Gustavo Niemeyer
Hello Bob,

[...]
> >Given the fact that files have an 'encoding' parameter, and that
> >any unicode strings with characters not in the 0-127 range will
> >raise an exception if being written to files, isn't it reasonable
> >to respect the 'encoding' attribute whenever writing data to a
> >file?
> 
> No, because you don't know it's a file.  You're calling a function with 
> a unicode object.  The function doesn't know that the object was some 
> unicode object that came from a source file of some particular 
> encoding.

I don't understand what you're saying here. The file object knows
that it is a file. The write function knows the parameter is unicode.

> >The workaround for that problem is to either use the evil-considered
> >sys.setdefaultencoding(), or to wrap sys.stdout. IMO, both options
> >seem unreasonable for such a common idiom.
> 
> There's no guaranteed correlation whatsoever between the claimed 
> encoding of your source document and the encoding of the user's 
> terminal, why do you want there to be?  What if you have some source 

I don't. I want the write() function of file objects to respect
the encoding attribute of these objects. This is already being
done when print is used. I'm proposing to extend that behavior to
the write function. That's all.

> files with 'foo' encoding and others with 'bar' encoding?  What about 
> ascii encoded source documents that use escape sequences to represent 
> non-ascii characters?  What you want doesn't make any sense so long as 
> python strings and file objects deal in bytes not characters :)

Please, take a long breath, and read my message again. :-)

> Wrapping sys.stdout is the ONLY reasonable solution.
[...]

No, it's not. But I'm glad to know other people are also doing
workarounds for that problem.

-- 
Gustavo Niemeyer
http://niemeyer.net


[Python-Dev] TRUNK UNFROZEN; release24-maint branch has been cut

2004-11-30 Thread Anthony Baxter
I've cut the release24-maint branch, and updated the Include/patchlevel.h
on trunk and branch (trunk is now 2.5a0, branch is 2.4+)

The trunk and the branch are now both unfrozen and suitable for checkins.
The feature freeze on the trunk is lifted. Remember - if you're checking 
bugfixes into the trunk, either backport them to the branch, or else mark 
the commit message with 'bugfix candidate' or 'backport candidate' or the
like.

Next up will be a 2.3.5 release. I'm going to be travelling for a large chunk
of December (at very short notice) so it's likely that this will happen at the
start of January. If someone else wants to cut a 2.3.5 sooner than that,
please feel free to volunteer! 2.3.5 will be the last 2.3.x release, barring
some almighty cockup - the next scheduled release will be 2.4.1, which will 
probably happen around May 2005.

Anthony
and yes, I'm drinking.


Re: [Python-Dev] File encodings

2004-11-30 Thread Gustavo Niemeyer
> Gustavo Niemeyer wrote:
> >Given the fact that files have an 'encoding' parameter, and that
> >any unicode strings with characters not in the 0-127 range will
> >raise an exception if being written to files, isn't it reasonable
> >to respect the 'encoding' attribute whenever writing data to a
> >file?
> 
> In general, files don't have an encoding parameter - sys.stdout
> is an exception.

That's the only case I'd like to solve.

If there are platforms that don't know how to set it, we could make
the encoding attribute writable, and that would allow people to
easily set it to the encoding which is deemed correct in their
systems.

> The reason why this works for print and not for write is that
> I considered "print unicodeobject" important, and wanted to
> implement that. file.write is an entirely different code path,
> so it doesn't currently consider Unicode objects; instead, it
> only supports strings (or, more generally, buffers).

I understand your reasoning behind it, and would like to extend
your idea to the write function, allowing anyone to use the common
sys.stdout idiom to implement print-like functionality (like optparse
and many others). For normal files, the absence of the encoding
parameter would ensure the current behavior.
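
The proposed semantics could be sketched roughly as follows, in Python 3 syntax (str stands in for unicode). The class EncodingAwareWriter is hypothetical and is not an existing file type or a patch; it only illustrates "use .encoding when set, otherwise keep the strict ASCII behavior":

```python
import io


class EncodingAwareWriter:
    """Hypothetical wrapper whose write() honours an optional encoding."""

    def __init__(self, raw, encoding=None):
        self.raw = raw
        # Writable attribute, so a program can set it when the platform
        # could not determine the terminal encoding.
        self.encoding = encoding

    def write(self, data):
        if isinstance(data, str):
            # Without an encoding, fall back to ASCII, which raises
            # UnicodeEncodeError for characters outside 0-127.
            data = data.encode(self.encoding or "ascii")
        return self.raw.write(data)


terminal = EncodingAwareWriter(io.BytesIO(), encoding="iso-8859-1")
terminal.write("á\n")

plain_file = EncodingAwareWriter(io.BytesIO())
try:
    plain_file.write("á\n")
    plain_failed = False
except UnicodeEncodeError:
    plain_failed = True
```

Byte strings pass through untouched in both cases, so existing callers would see no difference.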

> > This difference may become a really annoying problem when trying to
> > internationalize programs, since it's usual to see third-party code
> > dealing with sys.stdout, instead of using 'print'.
> 
> Apparently, it isn't important enough that somebody had analysed this,
> and offered a patch. In any case, it would be quite unreliable to

That's what I'm doing here! :-)

> pass unicode strings to .write even *if* .write supported .encoding,
> since most files don't have .encoding. Even sys.stdout does not always
> have .encoding - only when it is a terminal, and only if we managed to
> find out what the encoding of the terminal is.

I think that's acceptable. The encoding parameter is meant for output
streams, and Python does its best to try to find a reasonable value
for showing output strings.

Thanks for your answer and clarifications,

-- 
Gustavo Niemeyer
http://niemeyer.net


Re: [Python-Dev] File encodings

2004-11-30 Thread Gustavo Niemeyer
[...]
> You are mixing things here:
> 
> The source encoding is meant for the
> parser and defines the way Unicode literals are converted
> into Unicode objects.
> 
> The encoding used on the stdout stream doesn't have anything
> to do with the source code encoding and has to be handled
> differently.

Sorry. I probably wasn't clear enough in my message. I understand
the issue, and I'm not discussing source encoding at all. The
only problem I'd like to solve is that of output streams not
being able to have unicode strings written.

> The idiom presented by Bob is the right way to go: wrap
> sys.stdout with a StreamEncoder.

I don't see that as a good solution, since every Python software
that is internationalized will have to figure out this wrapping,
introducing extra overhead unnecessarily.

> Using sys.setdefaultencoding() is *not* the right solution
> to the problem.

I understand.

> In general when writing programs that are targeted for
> i18n, you should use Unicode for all text data and
> convert from Unicode to 8-bit only at the IO/UI layer.

That's what I think as well. I just would expect that Python was
kind enough to allow me to tell which output encoding I want,
instead of wrapping the sys.stdout object with a non-native-file.

IOW, being widely necessary, handling internationalization without
wrapping sys.stdout every time seems like a good step for a language
like Python.

> The various wrappers in the codecs module make this
> rather easy.

Thanks for the suggestion!

-- 
Gustavo Niemeyer
http://niemeyer.net


Re: [Python-Dev] File encodings

2004-11-30 Thread Walter Dörwald
Gustavo Niemeyer wrote:
> [...]
> > You are mixing things here:
> >
> > The source encoding is meant for the
> > parser and defines the way Unicode literals are converted
> > into Unicode objects.
> >
> > The encoding used on the stdout stream doesn't have anything
> > to do with the source code encoding and has to be handled
> > differently.
>
> Sorry. I probably wasn't clear enough in my message. I understand
> the issue, and I'm not discussing source encoding at all. The
> only problem I'd like to solve is that of output streams not
> being able to have unicode strings written.
>
> > The idiom presented by Bob is the right way to go: wrap
> > sys.stdout with a StreamEncoder.
>
> I don't see that as a good solution, since every Python software
> that is internationalized will have to figure out this wrapping,
> introducing extra overhead unnecessarily.

This wrapping is probably necessary for stateful encodings. If you
had a sys.stdout.encoding=="utf-16", print would probably add the
BOM every time a unicode object is printed. This doesn't happen if
you wrap sys.stdout in a StreamWriter.
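
The difference between encoding each object on its own and going through a StreamWriter can be seen in a short experiment, written here in Python 3 syntax. It does not reproduce what print does in 2.4; it only contrasts stateless per-call encoding with a stateful writer, using io.BytesIO as an assumed stand-in for the output stream:

```python
import codecs
import io

text_pieces = ["a", "b"]

# Stateless: each piece is encoded on its own, so each gets its own BOM.
stateless = b"".join(piece.encode("utf-16") for piece in text_pieces)

# Stateful: a StreamWriter remembers that the BOM was already written.
raw = io.BytesIO()
writer = codecs.getwriter("utf-16")(raw)
for piece in text_pieces:
    writer.write(piece)
stateful = raw.getvalue()

bom = codecs.BOM_UTF16  # BOM in the machine's native byte order
print(stateless.count(bom), stateful.count(bom))  # prints: 2 1
```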
[...]
> That's what I think as well. I just would expect that Python was
> kind enough to allow me to tell which output encoding I want,
> instead of wrapping the sys.stdout object with a non-native-file.
>
> IOW, being widely necessary, handling internationalization without
> wrapping sys.stdout every time seems like a good step for a language
> like Python.

You can't have stateful encodings without something that keeps
state. The only thing that does keep state in Python is a
StreamReader/StreamWriter.
Bye,
   Walter Dörwald


[Python-Dev] Trouble installing 2.4

2004-11-30 Thread Andrew Koenig
I'm using Windows XP SP2.

Uninstalled 2.3, installed 2.4 (running as me, not as administrator).
No problems so far.

Tried installing pywin32-203.win32-py2.4.exe

When I try to install it as me, it gets as far as "ready to install."  When
I click Next, it says

Can't load Python for pre-install script

and quits, even though earlier it said it had found Python 2.4 in the
registry.

When I try to install it as Administrator, it quits immediately, saying that
it couldn't locate a Python 2.4 installation.

My hypothesis: When I install 2.4 as me, it puts it in my user registry, not
the system-wide registry, and then pywin32 can't find it.

I'm going to uninstall and try again as Administrator.




Re: [Python-Dev] File encodings

2004-11-30 Thread Gustavo Niemeyer
Hello Walter,

> >I don't see that as a good solution, since every Python software
> >that is internationalized will have to figure out this wrapping,
> >introducing extra overhead unnecessarily.
> 
> This wrapping is probably necessary for stateful encodings. If you
> had a sys.stdout.encoding=="utf-16", print would probably add the
> BOM every time a unicode object is printed. This doesn't happen if
> you wrap sys.stdout in a StreamWriter.

I'm not sure this is an issue for a terminal output stream, which
is the case I'm trying to find a solution for. Otherwise, Python
would already be in trouble for using this scheme in the print
statement. Can you show an example of the print statement not
working?

-- 
Gustavo Niemeyer
http://niemeyer.net


RE: [Python-Dev] Trouble installing 2.4

2004-11-30 Thread Andrew Koenig
Follow-up:  When I install Python as Administrator, all is well.  In that
case (but not when installing it as me), it asks whether I want to install
it for all users or for myself only.  I then install pywin32 and it works.

So it may be that a caveat is in order to people who do not install 2.4 as
Administrator.




Re: [Python-Dev] File encodings

2004-11-30 Thread Walter Dörwald
Gustavo Niemeyer wrote:
> Hello Walter,
>
> > > I don't see that as a good solution, since every Python software
> > > that is internationalized will have to figure out this wrapping,
> > > introducing extra overhead unnecessarily.
> >
> > This wrapping is probably necessary for stateful encodings. If you
> > had a sys.stdout.encoding=="utf-16", print would probably add the
> > BOM every time a unicode object is printed. This doesn't happen if
> > you wrap sys.stdout in a StreamWriter.
>
> I'm not sure this is an issue for a terminal output stream, which
> is the case I'm trying to find a solution for. Otherwise, Python
> would already be in trouble for using this scheme in the print
> statement. Can you show an example of the print statement not
> working?

No, I can't. Python doesn't accept UTF-16 as encoding.
This works:
> LANG=de_DE.UTF-8 python2.4
Python 2.4 (#1, Nov 30 2004, 14:16:24)
[GCC 2.96 2731 (Red Hat Linux 7.3 2.96-113)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'UTF-8'
This doesn't:
> LANG=de_DE.UTF-16 python2.4
Python 2.4 (#1, Nov 30 2004, 14:16:24)
[GCC 2.96 2731 (Red Hat Linux 7.3 2.96-113)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.stdout.encoding
'ANSI_X3.4-1968'
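
The name reported in the second session is an alias that the codec registry resolves to plain ASCII, so non-ASCII output would fail under that setting. A quick check in Python 3 syntax, relying only on the alias table shipped with the standard encodings package:

```python
import codecs

# 'ANSI_X3.4-1968' is the name many C libraries report for plain ASCII.
fallback = codecs.lookup("ANSI_X3.4-1968")
print(fallback.name)

try:
    "á".encode("ANSI_X3.4-1968")
    encodable = True
except UnicodeEncodeError as exc:
    encodable = False
    print("cannot encode:", exc.reason)
```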
Bye,
   Walter Dörwald


[Python-Dev] Python.org current docs

2004-11-30 Thread Thomas Heller
http://www.python.org/doc/current/
and
http://docs.python.org/

still point to 2.3.4 docs.

Thomas



[Python-Dev] Small subprocess patch

2004-11-30 Thread Peter Åstrand

I'm planning to change the signature for subprocess.call slightly:

-def call(*args, **kwargs):
+def call(*popenargs, **kwargs):

The purpose is to make it clearer that "args" in this context is not the
same as the "args" argument to the Popen constructor. Two questions:

1) Is it OK to commit changes like this on the 2.4 branch, in addition to
trunk?

2) Anyone that thinks that "kwargs" should be changed into "popenkwargs"?
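
To illustrate why the name matters, here is a sketch of a call-style wrapper in Python 3 syntax. The function call_sketch is a hypothetical stand-in, not the module's source; it assumes only that the wrapper forwards all of its arguments to Popen and waits for the child:

```python
import subprocess
import sys


def call_sketch(*popenargs, **kwargs):
    """Illustrative stand-in for subprocess.call (hypothetical name)."""
    # Everything is forwarded unchanged to the Popen constructor, whose
    # first positional parameter is itself named `args`.
    return subprocess.Popen(*popenargs, **kwargs).wait()


# The child exits with status 3, which is what wait() returns.
status = call_sketch([sys.executable, "-c", "import sys; sys.exit(3)"])
print(status)  # prints 3
```

Here popenargs is the tuple of positional arguments for Popen, while the command list inside it is what Popen calls args.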


/Peter Åstrand <[EMAIL PROTECTED]>



Re: [Python-Dev] Trouble installing 2.4

2004-11-30 Thread "Martin v. Löwis"
Andrew Koenig wrote:
> So it may be that a caveat is in order to people who do not install 2.4 as
> Administrator.

I think the trouble is not with 2.4, here - the trouble is with
installing pywin32. As you said, the installation of Python itself
went fine.
> My hypothesis: When I install 2.4 as me, it puts it in my user
> registry, not the system-wide registry,
I can confirm this hypothesis. In a per-user installation, the registry
settings are deliberately changed for the user, not for the entire
system. Otherwise, it wouldn't be per-user. Also, the user might not
be able to write to the machine registry (unless he is a member of
the Power Users group).
> and then pywin32 can't find it.
That sounds likely, but I cannot confirm it. If it is, it is a bug
in pywin32 (and, in turn, possibly in distutils).
Regards,
Martin


Re: [Python-Dev] Python.org current docs

2004-11-30 Thread Fred L. Drake, Jr.
On Tuesday 30 November 2004 02:46 pm, Thomas Heller wrote:
 > http://www.python.org/doc/current/
 > and
 > http://docs.python.org/
 >
 > still point to 2.3.4 docs.


I'll be fixing that up tonight.


  -Fred

-- 
Fred L. Drake, Jr.  



[Python-Dev] Re: Small subprocess patch

2004-11-30 Thread Peter Astrand
On Tue, 30 Nov 2004, Peter Åstrand wrote:

> 1) Is it OK to commit changes like this on the 2.4 branch, in addition to
> trunk?

I'm also wondering if patch 1071755 and 1071764 should go into
release24-maint:

* 1071755 makes subprocess raise TypeError if Popen is called with a
bufsize that is not an integer.

* 1071764 adds a new, small utility function.
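
The effect described for the first patch can be observed in current Python 3 releases, where Popen rejects a non-integer bufsize before starting the child. Whether this matches patch 1071755 line for line is an assumption; the snippet only shows the observable behavior:

```python
import subprocess
import sys

try:
    # bufsize is given as a string on purpose
    subprocess.Popen([sys.executable, "-c", "pass"], bufsize="4096")
except TypeError as exc:
    rejected = True
    print("rejected:", exc)
else:
    rejected = False
```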


/Peter Åstrand <[EMAIL PROTECTED]>



[Python-Dev] Roster Deadline

2004-11-30 Thread Tim Hochberg
Hi Larry,
FYI: I asked EB about the roster deadline and she says that she doesn't 
know when it is either. Checking on the Lei Out web page didn't help 
much either.

So, you are no wiser now than at the start of this message.
-tim


Re: [Python-Dev] Python.org current docs

2004-11-30 Thread Fred L. Drake, Jr.
On Tuesday 30 November 2004 02:46 pm, Thomas Heller wrote:
  > http://www.python.org/doc/current/
  > and
  > http://docs.python.org/
  >
  > still point to 2.3.4 docs.

I think everything is properly updated now.  Please let me know if I've missed 
anything.


  -Fred

-- 
Fred L. Drake, Jr.  
