Re: [Python-Dev] New lines, carriage returns, and Windows

2007-09-29 Thread Nick Maclaren
"Paul Moore" <[EMAIL PROTECTED]> wrote:
> 
> OK, so far so good - although I'm not *quite* sure there's a
> self-consistent definition of "code that only uses \n". I'll assume
> you mean code that has a concept of lines, that lines never contain
> anything other than text (specifically, neither \r or \n can appear in
> a line, I'll punt on whether other weird stuff like form feed are
> legal), and that whenever your code needs to write data to a file, it
> writes lines with \n alone between them.

I won't.  There are a few of us still left who know how this started,
and here is a simplified description.

Unix was a computer scientist's workbench, and made no attempt to be
general.  In particular, its text datastream model was appropriate
for the important devices of the day - teletypes and similar.  So
far, so good.  But what was forgotten later is that the model does
NOT extend to other systems and, in particular, made no sense on the
record-oriented models generally used by mainframes (see Fortran for
an example).

When C was standardised, this was fudged.  I tried to get it improved,
but it is one of the many things I failed to do.  The handling of
ALL of the control characters in text I/O is non-portable (even \t,
despite what the standard says), and you have to follow the system's
constraints if things are to work.  Unfortunately, the kludging that
the compiler does to map C to the operating system confuses things
still further - though it is essential.

Now, BCPL was an ancestor of C, but always was a more portable
language (i.e. it didn't start with a specific operating system in
mind), and used/uses a rather better model.  In this, line separators
are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
"newline-with-overprinting".  Now, THAT model is more generic.
Not fully generic, of course, but it would cater for all of Unix,
CPM and its derivatives (yes, Microsoft), MacOS and most mainframes
(with some reservations).

So, until and unless Python chooses to define its own I/O model,
these problems will continue to arise.  Whether this one is a simple
bug or an avoidable feature, I can't say without looking harder,
but bugs are often caused by attempting to implement impossible
or confusing specifications.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] New lines, carriage returns, and Windows

2007-09-29 Thread Guido van Rossum
On 9/29/07, Nick Maclaren <[EMAIL PROTECTED]> wrote:
> "Paul Moore" <[EMAIL PROTECTED]> wrote:
> >
> > OK, so far so good - although I'm not *quite* sure there's a
> > self-consistent definition of "code that only uses \n". I'll assume
> > you mean code that has a concept of lines, that lines never contain
> > anything other than text (specifically, neither \r or \n can appear in
> > a line, I'll punt on whether other weird stuff like form feed are
> > legal), and that whenever your code needs to write data to a file, it
> > writes lines with \n alone between them.
>
> I won't.  There are a few of us still left who know how this started,
> and here is a simplified description.
>
> Unix was a computer scientist's workbench, and made no attempt to be
> general.  In particular, its text datastream model was appropriate
> for the important devices of the day - teletypes and similar.  So
> far, so good.  But what was forgotten later is that the model does
> NOT extend to other systems and, in particular, made no sense on the
> record-oriented models generally used by mainframes (see Fortran for
> an example).
>
> When C was standardised, this was fudged.  I tried to get it improved,
> but it is one of the many things I failed to do.  The handling of
> ALL of the control characters in text I/O is non-portable (even \t,
> despite what the standard says), and you have to follow the system's
> constraints if things are to work.  Unfortunately, the kludging that
> the compiler does to map C to the operating system confuses things
> still further - though it is essential.
>
> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model.  In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".  Now, THAT model is more generic.
> Not fully generic, of course, but it would cater for all of Unix,
> CPM and its derivatives (yes, Microsoft), MacOS and most mainframes
> (with some reservations).
>
> So, until and unless Python chooses to define its own I/O model,
> these problems will continue to arise.  Whether this one is a simple
> bug or an avoidable feature, I can't say without looking harder,
> but bugs are often caused by attempting to implement impossible
> or confusing specifications.

Have you looked at Py3k at all, especially PEP 3116 (new I/O)?

Python *does* have its own I/O model. There are binary files and text
files. For binary files, you write bytes and the semantic model is
that of an array of bytes; byte indices are seek positions.

For text files, the contents are considered to be Unicode, encoded as
bytes in a binary file. So a text file always has an underlying binary
file. Two translations take place, both of which have defaults varying
by platform. One translation is encoding Unicode text into bytes upon
output, and decoding bytes to Unicode text upon input. This can use
any encoding supported by the encodings package.
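The layering described above can be sketched with the `io` module as it eventually shipped in Python 3, which may differ in detail from the PEP 3116 draft under discussion. The in-memory `BytesIO` stands in for a real binary file, and the sample string is only an illustration.

```python
import io

# The underlying binary file: just an array of bytes, held in memory here.
raw = io.BytesIO()

# The text layer on top of it. The encoding is given explicitly instead of
# being left to the platform default; newline="\n" turns off line-ending
# translation so that only the encoding step is visible.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
text.write("na\u00efve\n")  # "naive" spelled with U+00EF
text.flush()

encoded = raw.getvalue()           # the bytes stored in the binary file
decoded = encoded.decode("utf-8")  # what decoding on input recovers
```

The non-ASCII character occupies two bytes in the underlying file, so byte positions and character positions no longer coincide once the text layer is involved.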

The other translation deals with line endings. Upon input, any of
\r\n, \r, or \n is translated to a single \n by default (this is the
"universal newlines" algorithm from Python 2.x). This can be tweaked
or disabled. Upon output, \n is translated into a platform specific
string chosen from \r\n, \r, or \n. This can also be disabled or
overridden. Note that \r, when written, is never treated specially; if
you want special processing for \r on output, you can write your own
translation layer.
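A minimal sketch of the line-ending translation, assuming the `newline` argument of `io.TextIOWrapper` as found in released Python 3 (the draft API may have differed). The helper names `read_text` and `write_text` are made up for this illustration; passing `newline` explicitly keeps the result independent of the platform the code runs on.

```python
import io

def write_text(s, newline):
    # Write s through a text layer and return the bytes that reach the file.
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    f.write(s)
    f.flush()
    return raw.getvalue()

def read_text(data, newline=None):
    # Read all of data back through a text layer.
    f = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return f.read()

mixed = b"unix\nwindows\r\nold mac\r"

universal = read_text(mixed)                 # default: universal newlines
untranslated = read_text(mixed, newline="")  # translation disabled

windows_out = write_text("a\nb\n", "\r\n")   # "\n" becomes "\r\n"
bare_cr_out = write_text("a\rb\n", "\r\n")   # "\r" is left alone
```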

That's all. There is nothing unimplementable or confusing in these
specifications.

Python doesn't care about record I/O on legacy OSes; it does care
about variability found in practice between popular OSes.

Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
literals) or Unicode (in text literals). Again, no support for legacy
systems that don't use ASCII or a superset.

Legacy OSes are called that for a reason.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Decimal news

2007-09-29 Thread Thomas Wouters
On 9/28/07, Thomas Heller <[EMAIL PROTECTED]> wrote:
>
> Thomas Wouters wrote:
> >> > If you re-eally need to check something into the trunk that re-eally
> >> > must not be merged into py3k, but you're afraid it's not going to be
> >> > obvious to the merger, please record the change as 'merged' using
> >> > "svnmerge merge -M -r". Please take care when picking the
> >> > revision ;) You can also just email me or someone else you see doing
> >> > merges, as I doubt this will be a common occurrence.
>
> I think that the 'svnmerge block -r' command should be used.  Or
> not?


If you're comfortable with using svnmerge yourself, sure. If you're worried
that you might mess up the state of the branch, you can leave it up to us
(me.)


-- 
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Guido van Rossum wrote:
> [snip..]
> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.
>
> For text files, the contents are considered to be Unicode, encoded as
> bytes in a binary file. So a text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.
>
> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is the
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.
>   
So the question is, that when a string containing '\r\n' is written to a 
file in text mode on a Windows platform, should it be written with the 
encoded representation of '\r\n' or '\r\r\n'?

Purity would dictate the latter and practicality the former (IMO)...

However, that would mean that round tripping a string would change it 
('\r\n' would be written as '\r\n' and then read as '\n') - on the other 
hand (particularly given that we are treating the data as text and not a 
binary blob) I don't see how writing '\r\r\n' would ever actually be 
useful in text.

+1 on just writing '\r\n' from me.
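The '\r\r\n' case in question can be reproduced on any platform by forcing the Windows-style output translation. This is a sketch against the `io` module of released Python 3, not the draft code being discussed, and `write_windows_style` is a hypothetical helper name.

```python
import io

def write_windows_style(s):
    # Write s through a text layer that maps "\n" to "\r\n", which is the
    # default output translation on Windows.
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    f.write(s)
    f.flush()
    return raw.getvalue()

plain = write_windows_style("one\ntwo\n")        # string uses "\n" only
doubled = write_windows_style("one\r\ntwo\r\n")  # string already has "\r\n"
```

Only the '\n' is translated, so a string that already carries '\r\n' ends up with a doubled carriage return.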

Michael Foord
http://www.manning.com/foord


> That's all. There is nothing unimplementable or confusing in these
> specifications.
>
> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.
>
> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.
>
> Legacy OSes are called that for a reason.
>
>   



Re: [Python-Dev] Decimal news

2007-09-29 Thread Raymond Hettinger
If the differences are few, I prefer that you insert some conditionals  
that attach different functions based on the version number. That way  
we can keep a single version of the source that works on all of the  
pythons.
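One way to read this suggestion, as a rough sketch: pick the implementation once at import time based on `sys.version_info`. The helper `is_integer` is purely hypothetical and is not taken from the decimal module.

```python
import sys

if sys.version_info[0] >= 3:
    def is_integer(value):
        # Python 3 has a single integer type.
        return isinstance(value, int)
else:
    def is_integer(value):
        # Python 2 distinguishes int and long; the name `long` is only
        # looked up if this branch's function is actually called.
        return isinstance(value, (int, long))  # noqa: F821
```

Callers use `is_integer` without caring which definition was attached, which keeps the version check in one place.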


Raymond

On Sep 29, 2007, at 8:26 AM, "Thomas Wouters" <[EMAIL PROTECTED]> wrote:

> On 9/28/07, Thomas Heller <[EMAIL PROTECTED]> wrote:
>> Thomas Wouters wrote:
>>>> > If you re-eally need to check something into the trunk that re-eally
>>>> > must not be merged into py3k, but you're afraid it's not going to be
>>>> > obvious to the merger, please record the change as 'merged' using
>>>> > "svnmerge merge -M -r". Please take care when picking the
>>>> > revision ;) You can also just email me or someone else you see doing
>>>> > merges, as I doubt this will be a common occurrence.
>>
>> I think that the 'svnmerge block -r' command should be used.  Or not?
>
> If you're comfortable with using svnmerge yourself, sure. If you're
> worried that you might mess up the state of the branch, you can
> leave it up to us (me.)
>
> --
> Thomas Wouters <[EMAIL PROTECTED]>
>
> Hi! I'm a .signature virus! copy me into your .signature file to
> help me spread!



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Terry Reedy

"Michael Foord" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Guido van Rossum wrote:

[snip first part of nice summary of Python i/o model]

| > The other translation deals with line endings. Upon input, any of
| > \r\n, \r, or \n is translated to a single \n by default (this is the
| > "universal newlines" algorithm from Python 2.x). This can be tweaked
| > or disabled. Upon output, \n is translated into a platform specific
| > string chosen from \r\n, \r, or \n. This can also be disabled or
| > overridden. Note that \r, when written, is never treated specially; if
| > you want special processing for \r on output, you can write your own
| > translation layer.

| So the question is, that when a string containing '\r\n' is written to a
| file in text mode on a Windows platform, should it be written with the
| encoded representation of '\r\n' or '\r\r\n'?

I think Guido pretty clearly said that on output, the default behavior is 
that \r is nothing special.  If you want a special case exception, write a 
special case translator. +1 from me.

To propose otherwise is to propose that the default semantic meaning of 
Python text objects depend on the platform that it might be 
output-translated for.  I believe the point of universal newline support 
was to get away from this.

| Purity would dictate the latter and practicality the former (IMO)...

I disagree.  Special case exceptions complicate both learnability and code 
readability and maintainability.  Simplicity is practicality.  The symmetry 
of 'platform-line-endings =input> \n =output> platform-line-endings' is both 
pure and practical.

| However, that would mean that round tripping a string would change it
| ('\r\n' would be written as '\r\n' and then read as '\n')

Whereas \r\r\n would be read back as \r\n, which is what should happen. 
Round-trip-ability is practical to me.
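The symmetry argued for here can be modelled with two plain string functions. This is only a model of platform-specific (non-universal) translation under the Windows convention; the names `to_platform` and `from_platform` are hypothetical and this is not how Python implements its text layer.

```python
def to_platform(text, linesep="\r\n"):
    # Output direction: every "\n" becomes the platform line ending;
    # "\r" gets no special treatment.
    return text.replace("\n", linesep)

def from_platform(data, linesep="\r\n"):
    # Input direction: only the platform line ending becomes "\n".
    return data.replace(linesep, "\n")

original = "first\r\nsecond\n"
on_disk = to_platform(original)    # contains "\r\r\n" after "first"
restored = from_platform(on_disk)  # identical to the original string
```

Under this model the round trip is exact for any input, because every '\n' on disk is preceded by a '\r' that the output step inserted. Reading the same data with universal newlines instead would not give back the original.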

| - on the other
| hand (particularly given that we are treating the data as text and not a
| binary blob) I don't see how writing '\r\r\n' would ever actually be
| useful in text.

There are two normal ways for internal Python text to have \r\n:
1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
same platform).
2. Intentionally put there by a programmer.  If s/he also chooses default \n 
translation on output, \r is correct.

That leaves
1. Bugs due to ignorance or accident.  These should be repaired.
2. Other special situations, which can be handled by disabling, overriding, 
and layering the defaults.  This seems enough flexibility to me.

Terry Jan Reedy






Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Terry Reedy wrote:
> "Michael Foord" <[EMAIL PROTECTED]> wrote in message 
> news:[EMAIL PROTECTED]
> | Guido van Rossum wrote:
>
> [snip first part of nice summary of Python i/o model]
>
> | > The other translation deals with line endings. Upon input, any of
> | > \r\n, \r, or \n is translated to a single \n by default (this is the
> | > "universal newlines" algorithm from Python 2.x). This can be tweaked
> | > or disabled. Upon output, \n is translated into a platform specific
> | > string chosen from \r\n, \r, or \n. This can also be disabled or
> | > overridden. Note that \r, when written, is never treated specially; if
> | > you want special processing for \r on output, you can write your own
> | > translation layer.
>
> | So the question is, that when a string containing '\r\n' is written to a
> | file in text mode on a Windows platform, should it be written with the
> | encoded representation of '\r\n' or '\r\r\n'?
>
> I think Guido pretty clearly said that on output, the default behavior is 
> that \r is nothing special.  If you want a special case exception, write a 
> special case translator. +1 from me.
>
> To propose otherwise is to propose that the default semantic meaning of 
> Python text objects depend on the platform that it might be 
> output-translated for.  I believe the point of universal newline support 
> was to get away from this.
>
> | Purity would dictate the latter and practicality the former (IMO)...
>
> I disagree.  Special case exceptions complicate both learnability and code 
> readability and maintainability.  Simplicity is practicality.  The symmetry 
> of 'platform-line-endings =input> \n =output> platform-line-endings' is both 
> pure and practical.
>
> | However, that would mean that round tripping a string would change it
> | ('\r\n' would be written as '\r\n' and then read as '\n')
>
> Whereas \r\r\n would be read back as \r\n, which is what should happen. 
> Round-trip-ability is practical to me.
>
> | - on the other
> | hand (particularly given that we are treating the data as text and not a
> | binary blob) I don't see how writing '\r\r\n' would ever actually be
> | useful in text.
>
> There are two normal ways for internal Python text to have \r\n:
> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the 
> same platform).
> 2. Intentionally put there by a programmer.  If s/he also chooses default \n 
> translation on output, \r is correct.
>   
Actually, I usually get these strings from Windows UI components. A file 
containing '\r\n' is read in with '\r\n' being translated to '\n'. New 
user input is added containing '\r\n' line endings. The file is written 
out and now contains a mix of '\r\n' and '\r\r\n'.
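The scenario can be reproduced without Windows or a UI toolkit. In this sketch the string `ui_input` merely stands in for text obtained from a UI component, the helper names are hypothetical, and the `io` calls are those of released Python 3.

```python
import io

def read_universal(data):
    # Read with the default universal-newlines translation.
    f = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return f.read()

def write_crlf(text):
    # Write with the Windows default output translation ("\n" -> "\r\n").
    raw = io.BytesIO()
    f = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    f.write(text)
    f.flush()
    return raw.getvalue()

existing = read_universal(b"saved line\r\n")  # "\r\n" arrives as "\n"
ui_input = "typed line\r\n"                   # stand-in for UI text
result = write_crlf(existing + ui_input)
```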

Michael




Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Steven Bethard
On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> Terry Reedy wrote:
> > There are two normal ways for internal Python text to have \r\n:
> > 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
> > same platform).
> > 2. Intentionally put there by a programmer.  If s/he also chooses default \n
> > translation on output, \r is correct.
> >
> Actually, I usually get these strings from Windows UI components. A file
> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> user input is added containing '\r\n' line endings. The file is written
> out and now contains a mix of '\r\n' and '\r\r\n'.

Out of curiosity, why don't the Python wrappers for your Windows UI
components do the appropriate '\r\n' -> '\n' conversions?

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Steven Bethard wrote:
> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>   
>> Terry Reedy wrote:
>> 
>>> There are two normal ways for internal Python text to have \r\n:
>>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
>>> same platform).
>>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
>>> translation on output, \r is correct.
>>>
>>>   
>> Actually, I usually get these strings from Windows UI components. A file
>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>> user input is added containing '\r\n' line endings. The file is written
>> out and now contains a mix of '\r\n' and '\r\r\n'.
>> 
>
> Out of curiosity, why don't the Python wrappers for your Windows UI
> components do the appropriate '\r\n' -> '\n' conversions?
>   

One of the great things about IronPython is that you don't *need* any 
wrappers - you access .NET objects natively (which in fact wrap the 
lower level win32 API) - and the .NET APIs are usually not as bad as you 
probably assume. ;-)

You just have to be aware that line endings are '\r\n'. I'm not sure how 
or if pywin32 handles this.

Michael

> STeVe
>   



Re: [Python-Dev] New lines, carriage returns, and Windows

2007-09-29 Thread Nick Maclaren
"Guido van Rossum" <[EMAIL PROTECTED]> wrote:
> 
> Have you looked at Py3k at all, especially PEP 3116 (new I/O)?

No.

> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.

That is the same model as C and Unix.  It is text files that we are
discussing.

> For text files, the contents are considered to be Unicode, encoded as
> bytes in a binary file. So a text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.

The character code isn't the issue here, and is almost completely
irrelevant.

> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is the
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.

Grrk.  That's the problem.  You don't get back what you have written,
for a start, which isn't nice.  There are other issues, too.

> That's all. There is nothing unimplementable or confusing in these
> specifications.

Nothing unimplementable, I agree.  Nothing confusing?  Not in the
experience of the users I have dealt with.

> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.

As a short-term solution, that is fine.  But I have seen the wheel
turn a couple of times in 40 years, and expect it to continue after
I am safely 6' under.

> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.

That's not a problem.  I don't see that changing in the foreseeable
future.

> Legacy OSes are called that for a reason.

Well, I remember when the text I/O model that C, Unix and Python
use WAS a feature of legacy OSs :-)

Seriously.


Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email:  [EMAIL PROTECTED]
Tel.:  +44 1223 334761    Fax:  +44 1223 334679


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Steven Bethard
On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> Steven Bethard wrote:
> > On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> >
> >> Terry Reedy wrote:
> >>
> >>> There are two normal ways for internal Python text to have \r\n:
> >>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
> >>> same platform).
> >>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
> >>> translation on output, \r is correct.
> >>>
> >>>
> >> Actually, I usually get these strings from Windows UI components. A file
> >> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> >> user input is added containing '\r\n' line endings. The file is written
> >> out and now contains a mix of '\r\n' and '\r\r\n'.
> >
> > Out of curiosity, why don't the Python wrappers for your Windows UI
> > components do the appropriate '\r\n' -> '\n' conversions?
>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)
>
> You just have to be aware that line endings are '\r\n'.

Ahh, I see.  So all the .NET components function like Python 3.0's
io.open(..., newline='\n'), where no translation of \n (to or from
\r\n) is performed.
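As a hedged illustration of the `newline='\n'` behaviour mentioned here, using the `io` module of released Python 3 (the 2007 draft may have differed): lines are split on '\n' only, and nothing is translated in either direction.

```python
import io

data = b"dos\r\nunix\n"

# Reading: lines end at "\n" only, and "\r\n" is passed through untouched.
reader = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline="\n")
lines = list(reader)
as_read = "".join(lines)

# Writing: "\n" is written as-is, so the bytes come back unchanged.
raw = io.BytesIO()
writer = io.TextIOWrapper(raw, encoding="ascii", newline="\n")
writer.write(as_read)
writer.flush()
written = raw.getvalue()
```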

STeVe
-- 
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
--- Bucky Katt, Get Fuzzy


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Michael Foord
Steven Bethard wrote:
> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>   
>> Steven Bethard wrote:
>> 
>>> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>>>
>>>   
>>>> Terry Reedy wrote:
>>>>
>>>>
>>>>> There are two normal ways for internal Python text to have \r\n:
>>>>> 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
>>>>> same platform).
>>>>> 2. Intentionally put there by a programmer.  If s/he also chooses default \n
>>>>> translation on output, \r is correct.
>>>>>
>>>>>
>>>>>
>>>> Actually, I usually get these strings from Windows UI components. A file
>>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>>> user input is added containing '\r\n' line endings. The file is written
>>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>>
>>> Out of curiosity, why don't the Python wrappers for your Windows UI
>>> components do the appropriate '\r\n' -> '\n' conversions?
>>>   
>> One of the great things about IronPython is that you don't *need* any
>> wrappers - you access .NET objects natively (which in fact wrap the
>> lower level win32 API) - and the .NET APIs are usually not as bad as you
>> probably assume. ;-)
>>
>> You just have to be aware that line endings are '\r\n'.
>> 
>
> Ahh, I see.  So all the .NET components function like Python 3.0's
> io.open(..., newline='\n'), where no translation of \n (to or from
> \r\n) is performed.
>   

Effectively yes. Although for Python compatibility, opening a file in 
text mode using the python 'open' or 'file' will behave in the usual way.

Michael

> STeVe
>   



Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Paul Moore
>>> Actually, I usually get these strings from Windows UI components. A file
>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>> user input is added containing '\r\n' line endings. The file is written
>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>
>> Out of curiosity, why don't the Python wrappers for your Windows UI
>> components do the appropriate '\r\n' -> '\n' conversions?
>>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)

Given the current lengthy discussion about newline translation, maybe
it isn't such a great thing :-)

Seriously, you do need a wrapper in this particular case - to convert
the .NET line ending convention to Python's. The issue here is that
such a wrapper is so trivial, that it's usually easier to simply do
the translation with ad hoc .replace('\r\n', '\n') calls. The problem
comes when you accidentally forget a translation - then you get the
clash between the .NET (\r\n) and Python (\n) models. But of course,
the solution in that case is to simply add the omitted translation,
not to change Python's IO model.
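The trivial wrapper described above might look like the following sketch. The function names `from_dotnet` and `to_dotnet` are hypothetical and not part of IronPython or any .NET API.

```python
def from_dotnet(text):
    # Convert text using the .NET "\r\n" convention to Python's "\n".
    return text.replace("\r\n", "\n")

def to_dotnet(text):
    # Normalise first, so text that already contains "\r\n" is not doubled.
    return from_dotnet(text).replace("\n", "\r\n")

from_ui = from_dotnet("a\r\nb\r\n")  # what Python code should work with
to_ui = to_dotnet("a\nb\r\n")        # mixed input, uniform "\r\n" output
```

Applying `from_dotnet` at the point where text enters Python code means the default text-mode output translation then produces a single '\r\n' per line on Windows.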

Of course, all this grand theory is just that - theory. In my case, it
helped me understand what's going on, but that's all. For real life
code, you just add the appropriate replace() calls. Whether theory
helps you keep track of where replace() is needed, or whether you just
know, doesn't really matter much.

But regardless - the Python IO model doesn't need changing. (Not even
2.x, and the py3k model is even better in this regard).

Paul.


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Terry Reedy

"Michael Foord" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Terry Reedy wrote:
| > There are two normal ways for internal Python text to have \r\n:
| > 1. Read from a file with \r\r\n.  Then \r\r\n is correct output (on the
| > same platform).
| > 2. Intentionally put there by a programmer.  If s/he also chooses default \n
| > translation on output, \r is correct.
| >
| Actually, I usually get these strings from Windows UI components. A file
| containing '\r\n' is read in with '\r\n' being translated to '\n'. New
| user input is added containing '\r\n' line endings. The file is written
| out and now contains a mix of '\r\n' and '\r\r\n'.

I covered this in the part you snipped:

"2. Other special situations, which can be handled by disabling, 
overriding,
and layering the defaults.  This seems enough flexibility to me."

While mixing input like this may seem 'normal' to you, I believe it is 
'special'
considering the total Python community.  I can think of at least 4 decent 
solutions, depending on the details of the input and what you do with it.

tjr





Re: [Python-Dev] New lines, carriage returns, and Windows

2007-09-29 Thread Greg Ewing
Nick Maclaren wrote:
> Grrk.  That's the problem.  You don't get back what you have written

You do as long as you *don't* use universal newlines mode
for reading. This is the best that can be done, because
universal newlines are inherently ambiguous.

If you want universal newlines, you just have to accept
that you can't also have \r characters meaning something
other than newlines in your files. This is true regardless
of what programming language or I/O model is being used.
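The ambiguity can be seen with two byte sequences that differ only in a bare '\r'. This sketch uses the `io` module of released Python 3; the sample data and helper names are made up for illustration.

```python
import io

def read_universal(data):
    # Universal newlines: "\r\n", "\r" and "\n" all become "\n".
    f = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return f.read()

def read_untranslated(data):
    # newline="" recognises line endings but leaves them as they are.
    f = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline="")
    return f.read()

with_bare_cr = b"progress 50%\rprogress 100%\n"
with_newline = b"progress 50%\nprogress 100%\n"

# Two different files read back as the same text under universal newlines.
same_text = read_universal(with_bare_cr) == read_universal(with_newline)
kept = read_untranslated(with_bare_cr)
```

Once both files map to the same string, no output translation can tell which one to regenerate, which is the sense in which the information is lost on input rather than on output.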

--
Greg


Re: [Python-Dev] New lines, carriage returns, and Windows

2007-09-29 Thread Greg Ewing
On 9/29/07, Nick Maclaren <[EMAIL PROTECTED]> wrote:

> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model.  In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".

I don't see how this is different from Unix/C "\n" being
an atomic newline character.

If you're saying that BCPL is better because it defines
standard semantics for more control characters than just
"\n", that may be true, but C is doing about the best it
can with "\n" as far as I can see, given all the crazy
things that different OSes want to do with line endings.

In any case, the problem which started all this isn't
really an I/O problem at all, it's a mismatch between
the world of Python strings which use "\n" and .NET
library code expecting strings which use "\r\n".

The correct thing to do with that is to translate whenever
a string crosses a boundary between Python code and
.NET code. This is something that ought to be done
automatically by the Python/.NET interfacing machinery,
maybe by having a different type for .NET strings.
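A rough sketch of the "different type" idea, assuming a small wrapper class is acceptable at the boundary. `DotNetText` is entirely hypothetical; nothing like it is claimed to exist in IronPython.

```python
class DotNetText:
    """Hypothetical holder for text that follows the .NET '\\r\\n' convention."""

    def __init__(self, crlf_text):
        # Text exactly as a .NET API produced or expects it.
        self._crlf_text = crlf_text

    @classmethod
    def from_python(cls, text):
        # Crossing from Python to .NET: normalise, then use "\r\n".
        return cls(text.replace("\r\n", "\n").replace("\n", "\r\n"))

    def to_python(self):
        # Crossing from .NET to Python: "\r\n" becomes "\n".
        return self._crlf_text.replace("\r\n", "\n")

    def raw(self):
        # The untranslated text, for handing back to .NET code.
        return self._crlf_text

incoming = DotNetText("a\r\nb").to_python()
outgoing = DotNetText.from_python("a\nb").raw()
```

Keeping the two conventions in distinct types makes a forgotten translation show up as a type mismatch rather than as stray '\r' characters in a file.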

--
Greg


Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows

2007-09-29 Thread Greg Ewing
Michael Foord wrote:
> One of the great things about IronPython is that you don't *need* any 
> wrappers - you access .NET objects natively

But it seems that you really *do* need wrappers to
deal with the line endings problem, whether they're
provided automatically or you do it yourself manually.

This is reminiscent of the C-string vs. Pascal-string
fiasco when Apple switched from Pascal to C as their
main application programming language. Some development
environments provided glue code that did the translation
automatically; others required you to do it yourself,
which was a huge nuisance.

--
Greg