Re: [Python-Dev] New lines, carriage returns, and Windows
"Paul Moore" <[EMAIL PROTECTED]> wrote: > > OK, so far so good - although I'm not *quite* sure there's a > self-consistent definition of "code that only uses \n". I'll assume > you mean code that has a concept of lines, that lines never contain > anything other than text (specifically, neither \r or \n can appear in > a line, I'll punt on whether other weird stuff like form feed are > legal), and that whenever your code needs to write data to a file, it > writes lines with \n alone between them. I won't. There are a few of us still left who know how this started, and here is a simplified description. Unix was a computer scientist's workbench, and made no attempt to be general. In particular, its text datastream model was appropriate for the imnportant devices of the day - teletypes and similar. So far, so good. But what was forgotten later is that the model does NOT extend to other systems and, in particular, made no sense on the record-oriented models generally used by mainframes (see Fortran for an example). When C was standardised, this was fudged. I tried to get it improved, but it is one of the many things I failed to do. The handling of ALL of the control characters in text I/O is non-portable (even \t, despite what the satndard says), and you have to follow the system's constraints if things are to work. Unfortunately, the kludging that the compiler does to map C to the operating system confuses things still further - though it is essential. Now, BCPL was an ancestor of C, but always was a more portable language (i.e. it didn't start with a specific operating system in mind), and used/uses a rather better model. In this, line separators are atomic - e.g. '\f' is newline-with-form-feed and '\r' is "newline-with-overprinting". Now, THAT model is more generic. Not fully generic, of course, but it would cater for all of Unix, CPM and its derivatives (yes, Microsoft), MacOS and most mainframes (with some reservations). 
So, until and unless Python chooses to define its own I/O model, these problems will continue to arise. Whether this one is a simple bug or an avoidable feature, I can't say without looking harder, but bugs are often caused by attempting to implement impossible or confusing specifications. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New lines, carriage returns, and Windows
On 9/29/07, Nick Maclaren <[EMAIL PROTECTED]> wrote:
> "Paul Moore" <[EMAIL PROTECTED]> wrote:
> >
> > OK, so far so good - although I'm not *quite* sure there's a
> > self-consistent definition of "code that only uses \n". I'll assume
> > you mean code that has a concept of lines, that lines never contain
> > anything other than text (specifically, neither \r or \n can appear in
> > a line, I'll punt on whether other weird stuff like form feed are
> > legal), and that whenever your code needs to write data to a file, it
> > writes lines with \n alone between them.
>
> I won't. There are a few of us still left who know how this started,
> and here is a simplified description.
>
> Unix was a computer scientist's workbench, and made no attempt to be
> general. In particular, its text datastream model was appropriate
> for the important devices of the day - teletypes and similar. So
> far, so good. But what was forgotten later is that the model does
> NOT extend to other systems and, in particular, made no sense on the
> record-oriented models generally used by mainframes (see Fortran for
> an example).
>
> When C was standardised, this was fudged. I tried to get it improved,
> but it is one of the many things I failed to do. The handling of
> ALL of the control characters in text I/O is non-portable (even \t,
> despite what the standard says), and you have to follow the system's
> constraints if things are to work. Unfortunately, the kludging that
> the compiler does to map C to the operating system confuses things
> still further - though it is essential.
>
> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model. In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting". Now, THAT model is more generic.
> Not fully generic, of course, but it would cater for all of Unix,
> CPM and its derivatives (yes, Microsoft), MacOS and most mainframes
> (with some reservations).
>
> So, until and unless Python chooses to define its own I/O model,
> these problems will continue to arise. Whether this one is a simple
> bug or an avoidable feature, I can't say without looking harder,
> but bugs are often caused by attempting to implement impossible
> or confusing specifications.

Have you looked at Py3k at all, especially PEP 3116 (new I/O)?

Python *does* have its own I/O model. There are binary files and text
files. For binary files, you write bytes and the semantic model is
that of an array of bytes; byte indices are seek positions.

For text files, the contents is considered to be Unicode, encoded as
bytes in a binary file. So text file always has an underlying binary
file. Two translations take place, both of which have defaults varying
by platform. One translation is encoding Unicode text into bytes upon
output, and decoding bytes to Unicode text upon input. This can use
any encoding supported by the encodings package.

The other translation deals with line endings. Upon input, any of
\r\n, \r, or \n is translated to a single \n by default (this is nhe
"universal newlines" algorithm from Python 2.x). This can be tweaked
or disabled. Upon output, \n is translated into a platform specific
string chosen from \r\n, \r, or \n. This can also be disabled or
overridden. Note that \r, when written, is never treated specially; if
you want special processing for \r on output, you can write your own
translation layer.

That's all. There is nothing unimplementable or confusing in these
specifications.

Python doesn't care about record I/O on legacy OSes; it does care
about variability found in practice between popular OSes.

Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
literals) or Unicode (in text literals). Again, no support for legacy
systems that don't use ASCII or a superset.

Legacy OSes are called that for a reason.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
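The two-layer model described above can be tried out in a few lines. This is only a sketch, assuming the io module as it later shipped with Python 3 (io.TextIOWrapper and its newline argument); the helper names read_text and write_text are made up for illustration, and an in-memory io.BytesIO stands in for the underlying binary file.

```python
import io


def read_text(raw, newline=None):
    """Decode bytes through a text layer using the given newline setting."""
    wrapper = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline)
    return wrapper.read()


def write_text(text, newline=None):
    """Encode text through a text layer and return the bytes that were written."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()  # push the encoded bytes down to the binary layer
    return buffer.getvalue()


raw = b"one\r\ntwo\rthree\n"

# Default input behaviour: \r\n, \r and \n all become a single \n.
universal = read_text(raw)

# newline="" disables the input translation.
untranslated = read_text(raw, newline="")

# On output, \n is replaced by whichever line ending was requested.
windows_style = write_text("one\ntwo\n", newline="\r\n")
unix_style = write_text("one\ntwo\n", newline="\n")

print(repr(universal))
print(repr(untranslated))
print(windows_style)
print(unix_style)
```

Passing newline explicitly keeps the sketch independent of the platform default, which is what varies between Windows and Unix.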
Re: [Python-Dev] Decimal news
On 9/28/07, Thomas Heller <[EMAIL PROTECTED]> wrote:
>
> Thomas Wouters wrote:
> > If you re-eally need to check something into the trunk that re-eally
> > must not be merged into py3k, but you're afraid it's not going to be
> > obvious to the merger, please record the change as 'merged' using
> > "svnmerge merge -M -r". Please take care when picking the
> > revision ;) You can also just email me or someone else you see doing
> > merges, as I doubt this will be a common occurrence.
>
> I think that the 'svnmerge block -r' command should be used. Or
> not?

If you're comfortable with using svnmerge yourself, sure. If you're
worried that you might mess up the state of the branch, you can leave it up
to us (me.)

--
Thomas Wouters <[EMAIL PROTECTED]>

Hi! I'm a .signature virus! copy me into your .signature file to help me
spread!
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Guido van Rossum wrote:
> [snip..]
> Python *does* have its own I/O model. There are binary files and text
> files. For binary files, you write bytes and the semantic model is
> that of an array of bytes; byte indices are seek positions.
>
> For text files, the contents is considered to be Unicode, encoded as
> bytes in a binary file. So text file always has an underlying binary
> file. Two translations take place, both of which have defaults varying
> by platform. One translation is encoding Unicode text into bytes upon
> output, and decoding bytes to Unicode text upon input. This can use
> any encoding supported by the encodings package.
>
> The other translation deals with line endings. Upon input, any of
> \r\n, \r, or \n is translated to a single \n by default (this is nhe
> "universal newlines" algorithm from Python 2.x). This can be tweaked
> or disabled. Upon output, \n is translated into a platform specific
> string chosen from \r\n, \r, or \n. This can also be disabled or
> overridden. Note that \r, when written, is never treated specially; if
> you want special processing for \r on output, you can write your own
> translation layer.
>
So the question is, that when a string containing '\r\n' is written to a
file in text mode on a Windows platform, should it be written with the
encoded representation of '\r\n' or '\r\r\n'?
Purity would dictate the latter and practicality the former (IMO)...
However, that would mean that round tripping a string would change it
('\r\n' would be written as '\r\n' and then read as '\n') - on the other
hand (particularly given that we are treating the data as text and not a
binary blob) I don't see how writing '\r\r\n' would ever actually be
useful in text.
+1 on just writing '\r\n' from me.
Michael Foord
http://www.manning.com/foord
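For reference, here is a small sketch of the two outcomes being weighed. It assumes the io module as it later shipped with Python 3, and the helper name bytes_written is invented for this illustration. newline="\r\n" stands in for the Windows default so the result does not depend on the platform running it.

```python
import io


def bytes_written(text, newline):
    """Return the bytes that reach the binary layer for the given newline setting."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    return buffer.getvalue()


text = "spam\r\n"

# With \n -> \r\n translation on output, the existing \r is left alone,
# so the file ends up with \r\r\n.
translated = bytes_written(text, newline="\r\n")

# With translation disabled, the string is written exactly as given.
untouched = bytes_written(text, newline="")

print(translated)
print(untouched)
```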
> That's all. There is nothing unimplementable or confusing in these
> specifications.
>
> Python doesn't care about record I/O on legacy OSes; it does care
> about variability found in practice between popular OSes.
>
> Note that \r, \n and friends in Python 3000 are either ASCII (in bytes
> literals) or Unicode (in text literals). Again, no support for legacy
> systems that don't use ASCII or a superset.
>
> Legacy OSes are called that for a reason.
>
>
Re: [Python-Dev] Decimal news
If the differences are few, I prefer that you insert some conditionals
that attach different functions based on the version number. That way we
can keep a single version of the source that works on all of the pythons.

Raymond

On Sep 29, 2007, at 8:26 AM, "Thomas Wouters" <[EMAIL PROTECTED]> wrote:

> On 9/28/07, Thomas Heller <[EMAIL PROTECTED]> wrote:
> > Thomas Wouters wrote:
> > > If you re-eally need to check something into the trunk that re-eally
> > > must not be merged into py3k, but you're afraid it's not going to be
> > > obvious to the merger, please record the change as 'merged' using
> > > "svnmerge merge -M -r". Please take care when picking the
> > > revision ;) You can also just email me or someone else you see doing
> > > merges, as I doubt this will be a common occurance.
> >
> > I think that the 'svnmerge block -r' command should be used. Or not?
>
> If you're comfortable with using svnmerge yourself, sure. If you're
> worried that you might mess up the state of the branch, you can leave it
> up to us (me.)
>
> --
> Thomas Wouters <[EMAIL PROTECTED]>
>
> Hi! I'm a .signature virus! copy me into your .signature file to help me
> spread!
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
"Michael Foord" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
| Guido van Rossum wrote:
[snip first part of nice summary of Python i/o model]
| > The other translation deals with line endings. Upon input, any of
| > \r\n, \r, or \n is translated to a single \n by default (this is nhe
[sic]
| > "universal newlines" algorithm from Python 2.x). This can be tweaked
| > or disabled. Upon output, \n is translated into a platform specific
| > string chosen from \r\n, \r, or \n. This can also be disabled or
| > overridden. Note that \r, when written, is never treated specially; if
| > you want special processing for \r on output, you can write your own
| > translation layer.
| So the question is, that when a string containing '\r\n' is written to a
| file in text mode on a Windows platform, should it be written with the
| encoded representation of '\r\n' or '\r\r\n'?
I think Guido pretty clearly said that on output, the default behavior is
that \r is nothing special. If you want a special case exception, write a
special case translator. +1 from me.
To propose otherwise is to propose that the default semantic meaning of
Python text objects depend on the platform that it might be
output-translated for. I believe the point of universal newline support
was to get away from this.
| Purity would dictate the latter and practicality the former (IMO)...
I disagree. Special case exceptions complicate both learnability and code
readability and maintainability. Simplicity is practicality. The symmetry
of 'platform-line-endings =input> \n =output> platform-line-endings' is both
pure and practical.
| However, that would mean that round tripping a string would change it
| ('\r\n' would be written as '\r\n' and then read as '\n')
Whereas \r\r\n would be read back as \r\n, which is what should happen.
Round-trip-ability is practical to me.
| - on the other
| hand (particularly given that we are treating the data as text and not a
| binary blob) I don't see how writing '\r\r\n' would ever actually be
| useful in text.
There are two normal ways for internal Python text to have \r\n:
1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
same platform).
2. Intentionally put there by a programmer. If s/he also chooses default \n
translation on output, \r is correct.
That leaves
1. Bugs due to ignorance or accident. These should be repaired.
2. Other special situations, which can be handled by disabling, overriding,
and layering the defaults. This seems enough flexibility to me.
Terry Jan Reedy
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Terry Reedy wrote:
> "Michael Foord" <[EMAIL PROTECTED]> wrote in message
> news:[EMAIL PROTECTED]
> | Guido van Rossum wrote:
>
> [snip first part of nice summary of Python i/o model]
>
> | > The other translation deals with line endings. Upon input, any of
> | > \r\n, \r, or \n is translated to a single \n by default (this is nhe
> [sic]
> | > "universal newlines" algorithm from Python 2.x). This can be tweaked
> | > or disabled. Upon output, \n is translated into a platform specific
> | > string chosen from \r\n, \r, or \n. This can also be disabled or
> | > overridden. Note that \r, when written, is never treated specially; if
> | > you want special processing for \r on output, you can write your own
> | > translation layer.
>
> | So the question is, that when a string containing '\r\n' is written to a
> | file in text mode on a Windows platform, should it be written with the
> | encoded representation of '\r\n' or '\r\r\n'?
>
> I think Guido pretty clearly said that on output, the default behavior is
> that \r is nothing special. If you want a special case exception, write a
> special case translator. +1 from me.
>
> To propose otherwise is to propose that the default semantic meaning of
> Python text objects depend on the platform that it might be
> output-translated for. I believe the point of universal newline support
> was to get away from this.
>
> | Purity would dictate the latter and practicality the former (IMO)...
>
> I disagree. Special case exceptions complicate both learnability and code
> readability and maintainability. Simplicity is practicality. The symmetry
> of 'platform-line-endings =input> \n =output> plaform-line-endings' is both
> pure and practical.
>
> | However, that would mean that round tripping a string would change it
> | ('\r\n' would be written as '\r\n' and then read as '\n')
>
> Whereas \r\r\n would be read back as \r\n, which is what should happen.
> Round-trip-ability is practical to me.
>
> | - on the other
> | hand (particularly given that we are treating the data as text and not a
> | binary blob) I don't see how writing '\r\r\n' would ever actually be
> | useful in text.
>
> There are two normal ways for internal Python text to have \r\n:
> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
> same platform).
> 2. Intentially put there by a programmer. If s/he also chooses default \n
> translation on output, \r is correct.
>
Actually, I usually get these strings from Windows UI components. A file
containing '\r\n' is read in with '\r\n' being translated to '\n'. New
user input is added containing '\r\n' line endings. The file is written
out and now contains a mix of '\r\n' and '\r\r\n'.
Michael
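The scenario can be reproduced with in-memory files. This is only a sketch, assuming the io module as it later shipped with Python 3; the load and save helpers and the sample strings are invented, and newline="\r\n" stands in for the Windows output default.

```python
import io


def load(raw):
    """Read bytes as text with the default universal-newlines translation."""
    return io.TextIOWrapper(io.BytesIO(raw), encoding="ascii").read()


def save(text):
    """Write text with Windows-style output translation (\\n becomes \\r\\n)."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    return buffer.getvalue()


on_disk = b"first\r\n"      # existing file with Windows line endings
ui_input = "second\r\n"     # stands in for text from a Windows UI component

in_memory = load(on_disk) + ui_input   # now mixes \n and \r\n
saved = save(in_memory)

print(repr(in_memory))
print(saved)
```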
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> Terry Reedy wrote:
> > There are two normal ways for internal Python text to have \r\n:
> > 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
> > same platform).
> > 2. Intentially put there by a programmer. If s/he also chooses default \n
> > translation on output, \r is correct.
> >
> Actually, I usually get these strings from Windows UI components. A file
> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> user input is added containing '\r\n' line endings. The file is written
> out and now contains a mix of '\r\n' and '\r\r\n'.

Out of curiosity, why don't the Python wrappers for your Windows UI
components do the appropriate '\r\n' -> '\n' conversions?

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Steven Bethard wrote:
> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>> Terry Reedy wrote:
>>> There are two normal ways for internal Python text to have \r\n:
>>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
>>> same platform).
>>> 2. Intentially put there by a programmer. If s/he also chooses default \n
>>> translation on output, \r is correct.
>>>
>> Actually, I usually get these strings from Windows UI components. A file
>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>> user input is added containing '\r\n' line endings. The file is written
>> out and now contains a mix of '\r\n' and '\r\r\n'.
>
> Out of curiosity, why don't the Python wrappers for your Windows UI
> components do the appropriate '\r\n' -> '\n' conversions?

One of the great things about IronPython is that you don't *need* any
wrappers - you access .NET objects natively (which in fact wrap the
lower level win32 API) - and the .NET APIs are usually not as bad as you
probably assume. ;-)

You just have to be aware that line endings are '\r\n'. I'm not sure how
or if pywin32 handles this.

Michael

> STeVe
Re: [Python-Dev] New lines, carriage returns, and Windows
"Guido van Rossum" <[EMAIL PROTECTED]> wrote: > > Have you looked at Py3k at all, especially PEP 3116 (new I/O)? No. > Python *does* have its own I/O model. There are binary files and text > files. For binary files, you write bytes and the semantic model is > that of an array of bytes; byte indices are seek positions. That is the same model as C and Unix. It is text files that we are discussing. > For text files, the contents is considered to be Unicode, encoded as > bytes in a binary file. So text file always has an underlying binary > file. Two translations take place, both of which have defaults varying > by platform. One translation is encoding Unicode text into bytes upon > output, and decoding bytes to Unicode text upon input. This can use > any encoding supported by the encodings package. The character code isn't the issue here, and is almost completely irrelevant. > The other translation deals with line endings. Upon input, any of > \r\n, \r, or \n is translated to a single \n by default (this is nhe > "universal newlines" algorithm from Python 2.x). This can be tweaked > or disabled. Upon output, \n is translated into a platform specific > string chosen from \r\n, \r, or \n. This can also be disabled or > overridden. Note that \r, when written, is never treated specially; if > you want special processing for \r on output, you can write your own > translation layer. Grrk. That's the problem. You don't get back what you have written, for a start, which isn't nice. There are other issues, too. > That's all. There is nothing unimplementable or confusing in these > specifications. Nothing unimplementable, I agree. Nothing confusing? Not in the experience of the users I have dealt with. > Python doesn't care about record I/O on legacy OSes; it does care > about variability found in practice between popular OSes. As a short-term solution, that is fine. 
But I have seen the wheel turn a couple of times in 40 years, and expect it to continue after I am safely 6' under > Note that \r, \n and friends in Python 3000 are either ASCII (in bytes > literals) or Unicode (in text literals). Again, no support for legacy > systems that don't use ASCII or a superset. That's not a problem. I don't see that changing in the forseeable future. > Legacy OSes are called that for a reason. Well, I remember when the text I/O model that C, Unix and Python use WAS a feature of legacy OSs :-) Seriously. Regards, Nick Maclaren, University of Cambridge Computing Service, New Museums Site, Pembroke Street, Cambridge CB2 3QH, England. Email: [EMAIL PROTECTED] Tel.: +44 1223 334761Fax: +44 1223 334679 ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> Steven Bethard wrote:
> > On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
> >> Terry Reedy wrote:
> >>> There are two normal ways for internal Python text to have \r\n:
> >>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
> >>> same platform).
> >>> 2. Intentially put there by a programmer. If s/he also chooses default \n
> >>> translation on output, \r is correct.
> >>>
> >> Actually, I usually get these strings from Windows UI components. A file
> >> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
> >> user input is added containing '\r\n' line endings. The file is written
> >> out and now contains a mix of '\r\n' and '\r\r\n'.
> >
> > Out of curiosity, why don't the Python wrappers for your Windows UI
> > components do the appropriate '\r\n' -> '\n' conversions?
>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)
>
> You just have to be aware that line endings are '\r\n'.

Ahh, I see. So all the .NET components function like Python 3.0's
io.open(..., newline='\n'), where no translation of \n (to or from
\r\n) is performed.

STeVe
--
I'm not *in*-sane. Indeed, I am so far *out* of sane that you appear a
tiny blip on the distant coast of sanity.
        --- Bucky Katt, Get Fuzzy
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Steven Bethard wrote:
> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>> Steven Bethard wrote:
>>> On 9/29/07, Michael Foord <[EMAIL PROTECTED]> wrote:
>>>> Terry Reedy wrote:
>>>>> There are two normal ways for internal Python text to have \r\n:
>>>>> 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the
>>>>> same platform).
>>>>> 2. Intentially put there by a programmer. If s/he also chooses default \n
>>>>> translation on output, \r is correct.
>>>>>
>>>> Actually, I usually get these strings from Windows UI components. A file
>>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>>> user input is added containing '\r\n' line endings. The file is written
>>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>
>>> Out of curiosity, why don't the Python wrappers for your Windows UI
>>> components do the appropriate '\r\n' -> '\n' conversions?
>>
>> One of the great things about IronPython is that you don't *need* any
>> wrappers - you access .NET objects natively (which in fact wrap the
>> lower level win32 API) - and the .NET APIs are usually not as bad as you
>> probably assume. ;-)
>>
>> You just have to be aware that line endings are '\r\n'.
>
> Ahh, I see. So all the .NET components function like Python 3.0's
> io.open(..., newline='\n'), where no translation of \n (to or from
> \r\n) is performed.

Effectively yes. Although for Python compatibility, opening a file in
text mode using the Python 'open' or 'file' will behave in the usual way.

Michael

> STeVe
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
>>> Actually, I usually get these strings from Windows UI components. A file
>>> containing '\r\n' is read in with '\r\n' being translated to '\n'. New
>>> user input is added containing '\r\n' line endings. The file is written
>>> out and now contains a mix of '\r\n' and '\r\r\n'.
>>>
>> Out of curiosity, why don't the Python wrappers for your Windows UI
>> components do the appropriate '\r\n' -> '\n' conversions?
>>
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively (which in fact wrap the
> lower level win32 API) - and the .NET APIs are usually not as bad as you
> probably assume. ;-)
Given the current lengthy discussion about newline translation, maybe
it isn't such a great thing :-)
Seriously, you do need a wrapper in this particular case - to convert
the .NET line ending convention to Python's. The issue here is that
such a wrapper is so trivial, that it's usually easier to simply do
the translation with ad hoc .replace('\r\n', '\n') calls. The problem
comes when you accidentally forget a translation - then you get the
clash between the .NET (\r\n) and Python (\n) models. But of course,
the solution in that case is to simply add the omitted translation,
not to change Python's IO model.
Of course, all this grand theory is just that - theory. In my case, it
helped me understand what's going on, but that's all. For real life
code, you just add the appropriate replace() calls. Whether theory
helps you keep track of where replace() is needed, or whether you just
know, doesn't really matter much.
But regardless - the Python IO model doesn't need changing. (Not even
2.x, and the py3k model is even better in this regard).
Paul.
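A sketch of what such a trivial wrapper might look like, so that the replace() calls live in one place instead of being scattered around. The function names from_dotnet and to_dotnet are hypothetical; no actual IronPython or .NET API is involved here, only plain string handling.

```python
def from_dotnet(text):
    """Convert text using the .NET convention (\\r\\n) to Python's (\\n)."""
    return text.replace("\r\n", "\n")


def to_dotnet(text):
    """Convert Python text (\\n) to the .NET convention (\\r\\n).

    Normalising first means text that already contains \\r\\n is not
    turned into \\r\\r\\n.
    """
    return from_dotnet(text).replace("\n", "\r\n")


ui_text = "second line\r\nthird line\r\n"   # stands in for a UI control's text
document = "first line\n" + from_dotnet(ui_text)

print(repr(document))
print(repr(to_dotnet(document)))
```

With every string crossing the boundary going through one of these two helpers, a forgotten translation is at least a single missing call rather than a missing replace() somewhere in the middle of the code.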
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
"Michael Foord" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] | Terry Reedy wrote: | > There are two normal ways for internal Python text to have \r\n: | > 1. Read from a file with \r\r\n. Then \r\r\n is correct output (on the | > same platform). | > 2. Intentially put there by a programmer. If s/he also chooses default \n | > translation on output, \r is correct. | > | Actually, I usually get these strings from Windows UI components. A file | containing '\r\n' is read in with '\r\n' being translated to '\n'. New | user input is added containing '\r\n' line endings. The file is written | out and now contains a mix of '\r\n' and '\r\r\n'. I covered this in the part you snipped: "2. Other special situations, which can be handled by disabling, overriding, and layering the defaults. This seems enough flexibility to me." While mixing input like this may seem 'normal' to you, I believe it is 'special' considering the total Python community. I can think of at least 4 decent solutions, depending on the details of the input and what you do with it. tjr ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] New lines, carriage returns, and Windows
Nick Maclaren wrote:
> Grrk. That's the problem. You don't get back what you have written

You do as long as you *don't* use universal newlines mode
for reading. This is the best that can be done, because
universal newlines are inherently ambiguous.

If you want universal newlines, you just have to accept
that you can't also have \r characters meaning something
other than newlines in your files. This is true regardless
of what programming language or I/O model is being used.

--
Greg
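The difference can be seen with a string that uses \r for something other than a line ending. This is a sketch only, assuming the io module as it later shipped with Python 3; round_trip is an invented helper, and an in-memory buffer replaces the file.

```python
import io


def round_trip(text, newline):
    """Write text and read it back, using the same newline setting both ways."""
    buffer = io.BytesIO()
    writer = io.TextIOWrapper(buffer, encoding="ascii", newline=newline)
    writer.write(text)
    writer.flush()
    reader = io.TextIOWrapper(
        io.BytesIO(buffer.getvalue()), encoding="ascii", newline=newline
    )
    return reader.read()


text = "progress 50%\rprogress 100%\n"   # \r used for overprinting

# newline="" turns translation off in both directions.
exact = round_trip(text, newline="")

# newline=None reads with universal newlines, so the lone \r becomes \n.
lossy = round_trip(text, newline=None)

print(repr(exact))
print(repr(lossy))
```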
Re: [Python-Dev] New lines, carriage returns, and Windows
On 9/29/07, Nick Maclaren <[EMAIL PROTECTED]> wrote:
> Now, BCPL was an ancestor of C, but always was a more portable
> language (i.e. it didn't start with a specific operating system in
> mind), and used/uses a rather better model. In this, line separators
> are atomic - e.g. '\f' is newline-with-form-feed and '\r' is
> "newline-with-overprinting".

I don't see how this is different from Unix/C "\n" being
an atomic newline character. If you're saying that BCPL
is better because it defines standard semantics for more
control characters than just "\n", that may be true, but
C is doing about the best it can with "\n" as far as I
can see, given all the crazy things that different OSes
want to do with line endings.

In any case, the problem which started all this isn't
really an I/O problem at all, it's a mismatch between
the world of Python strings which use "\n" and .NET
library code expecting strings which use "\r\n". The
correct thing to do with that is to translate whenever
a string crosses a boundary between Python code and
.NET code. This is something that ought to be done
automatically by the Python/.NET interfacing machinery,
maybe by having a different type for .NET strings.

--
Greg
Re: [Python-Dev] [python] Re: New lines, carriage returns, and Windows
Michael Foord wrote:
> One of the great things about IronPython is that you don't *need* any
> wrappers - you access .NET objects natively

But it seems that you really *do* need wrappers to
deal with the line endings problem, whether they're
provided automatically or you do it yourself manually.

This is reminiscent of the C-string vs. Pascal-string
fiasco when Apple switched from Pascal to C as their
main application programming language. Some development
environments provided glue code that did the translation
automatically; others required you to do it yourself,
which was a huge nuisance.

--
Greg
