[issue1395] py3k: duplicated line endings when using read(1)

2011-11-23 Thread Sébastien Sablé
Sébastien Sablé added the comment: OK, sorry. Done in issue 13461.

[issue1395] py3k: duplicated line endings when using read(1)

2011-11-23 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: You should open a new issue for this new problem.

[issue1395] py3k: duplicated line endings when using read(1)

2011-11-23 Thread STINNER Victor
Changes by STINNER Victor: -- nosy: +haypo

[issue1395] py3k: duplicated line endings when using read(1)

2011-11-23 Thread Sébastien Sablé
Sébastien Sablé added the comment: I am trying to get Python working when compiled with Visual Studio 2010 (cf. issue 13210). When running the tests with the Python 2.7 branch compiled with VS2010, the "test_issue_1395_5" in test_io.py will cause Python to eat the whole memory within a few se

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-19 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Committed the patch in r59060. -- resolution: -> fixed; status: open -> closed

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-11 Thread Christian Heimes
Christian Heimes added the comment: By the way, I've found the daily builds you were asking for, Raghuram. http://www.python.org/dev/daily-msi/ :)

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-09 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: On 11/8/07, Guido van Rossum <[EMAIL PROTECTED]> wrote: > I do think there is something iffy here -- the 2.x version of this > test opens the files in binary mode. I wonder what end users are > supposed to do. I think that requirement (need to open in bina

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Guido van Rossum
Guido van Rossum added the comment: > Considering that test_csv is failing on windows even without any changes > related to this issue, I looked at it and came up with this patch: > > - > Index: Lib/test/test_csv.py > ===

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Updated patch (io6.diff):
- simplifications in readline
- seennl is now completely handled by the NewlineDecoder
Added file: http://bugs.python.org/file8719/io6.diff

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: On 11/8/07, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > OK, I have taken another approach which seems to work (see io4.diff): > It uses an IncrementalNewlineDecoder, which wraps the initial (e.g. > utf-8) decoder. I like this approach even though I h

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: Considering that test_csv is failing on windows even without any changes related to this issue, I looked at it and came up with this patch: - Index: Lib/test/test_csv.py === --

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: > The new io4.diff breaks test_io and test_univnewlines on Linux oops, I missed this one. Here is a new version: io5.diff, which should handle the "seen newlines" better. Two more bug fixes found by test_univnewlines: - write() should return the number o

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Christian Heimes
Christian Heimes added the comment: The new io4.diff breaks test_io and test_univnewlines on Linux. Added file: http://bugs.python.org/file8713/linux_test.log.gz

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: > About mailbox.py: it seems that the code cannot work Of course: the file mode was recently changed from rb+ to r+ (revision 57809). This means that every occurrence of os.linesep has to disappear. Oh my.
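Why os.linesep becomes a problem once a file is opened in text mode can be sketched as follows. This is only an illustration, not code from mailbox.py: the helper name written_bytes is hypothetical, and newline="\r\n" is passed explicitly to imitate the Windows default on any platform.

```python
import io


def written_bytes(text, newline="\r\n"):
    """Return the bytes a text-mode stream produces for `text`.

    newline="\r\n" imitates a Windows text file (assumption made for
    the sake of a platform-independent illustration).
    """
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # keep `raw` open when the wrapper goes away
    return data


# Writing the Windows line separator by hand: the "\n" inside it is
# translated again, so the file ends up with "\r\r\n".
doubled = written_bytes("a" + "\r\n")

# Writing a plain "\n" lets the text layer do the translation once.
correct = written_bytes("a\n")
```

In text mode the caller is expected to write "\n" and leave the platform translation to the text layer, which is why explicit os.linesep has to go.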

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Sorry, I think I corrupted the file by hand. Here is another version. Added file: http://bugs.python.org/file8712/io4.diff

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Christian Heimes
Christian Heimes added the comment: The patch doesn't apply:
$ patch -p0 < io4.diff
(Stripping trailing CRs from patch.)
patching file Lib/io.py
patch: malformed patch at line 41: @@ -1133,7 +1160,10 @@

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-08 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: OK, I have taken another approach which seems to work (see io4.diff): It uses an IncrementalNewlineDecoder, which wraps the initial (e.g. utf-8) decoder. All the tests in test_io pass on Windows, including those added by io.diff and io2.diff. This was not t
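The wrapping idea can be sketched with the io.IncrementalNewlineDecoder class that ships with current Python 3. This may differ in detail from what io4.diff contained; the helper name decode_chunks is hypothetical.

```python
import codecs
import io


def decode_chunks(chunks):
    """Decode byte chunks through a newline decoder wrapping a UTF-8 decoder.

    Returns the list of decoded pieces and the newline kinds seen.
    """
    inner = codecs.getincrementaldecoder("utf-8")()
    decoder = io.IncrementalNewlineDecoder(inner, translate=True)
    pieces = [decoder.decode(chunk) for chunk in chunks]
    # final=True flushes a "\r" that may still be held back.
    pieces.append(decoder.decode(b"", final=True))
    return pieces, decoder.newlines


# The "\r\n" pair is split across two chunks. The trailing "\r" of the
# first chunk is held back until the next chunk shows what follows it.
pieces, seen = decode_chunks([b"a\r", b"\nb"])
```

Because the decoder keeps the pending "\r" as internal state, a chunk boundary in the middle of "\r\n" no longer produces two line endings.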

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: On 11/7/07, Guido van Rossum wrote: > Cool. How hard do you think it would be to extend your work on > StringIO into a translation of TextIOWrapper into C? Well, StringIO and TextIOWrapper are quite different. The only part that I could reuse, from StringI

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: > io3.diff does replacenl() in adjust_chunk() (trying Amaury's > suggestion). Can you see if it fixes test_mailbox failures? Unfortunately, it does not. And some tests now fail in test_io (correcting testTelling seems a good start point, since we just chan

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Guido van Rossum
Guido van Rossum added the comment: Cool. How hard do you think it would be to extend your work on StringIO into a translation of TextIOWrapper into C? On Nov 7, 2007 3:55 PM, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote: > On 11/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote: > > On 11/7/07

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: On 11/7/07, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On 11/7/07, Christian Heimes <[EMAIL PROTECTED]> wrote: > > > > Christian Heimes added the comment: > > > > By the way what happened to the SoC project related to Python's new IO > > subsystem? IIRC

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Christian Heimes added the comment: I take it back. I accidentally ran the unit tests on the trunk instead of the py3k branch. mailbox and csv are still breaking with your test, netrc is doing fine. Added file: http://bugs.python.org/file8709/py3k_windows.log.gz

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Christian Heimes added the comment: Good work! The tests for mailbox, netrc and csv are passing with your test. I'm going to run the entire suite now.

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Changes by Christian Heimes: -- nosy: +alexandre.vassalotti

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Changes by Christian Heimes:

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: Hi Amaury and Christian, io3.diff does replacenl() in adjust_chunk() (trying Amaury's suggestion). Can you see if it fixes test_mailbox failures? Added file: http://bugs.python.org/file8708/io3.diff

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Guido van Rossum
Guido van Rossum added the comment: On 11/7/07, Christian Heimes <[EMAIL PROTECTED]> wrote: > > Christian Heimes added the comment: > > By the way what happened to the SoC project related to Python's new IO > subsystem? IIRC somebody was working on a C optimization of the io lib. > >

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Christian Heimes added the comment: By the way, what happened to the SoC project related to Python's new IO subsystem? IIRC somebody was working on a C optimization of the io lib.

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Guido van Rossum
Guido van Rossum added the comment: Somebody needs to reverse-engineer the invariants applying to the various instance variables and add comments explaining them, and showing how they are maintained by each call.

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: > The new patch fixes test_netrc for me but test_csv and test_mailbox are > still broken. For test_mailbox at least, I think I have a clue: the _pending member now contains translated newlines. But in tell(), we use its length and compare it with the lengt
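The mismatch between translated text and raw bytes can be sketched like this, assuming the io module of a current Python 3 where the issue is fixed. The helper name roundtrip_after_first_char is hypothetical, and the value returned by tell() is treated as an opaque cookie.

```python
import io

DATA = b"a\r\nb"  # 4 bytes on disk, but only 3 characters after translation


def roundtrip_after_first_char(data):
    """Read one char, remember the position, read the rest, seek back, reread."""
    f = io.TextIOWrapper(io.BytesIO(data), encoding="ascii")
    first = f.read(1)
    cookie = f.tell()      # opaque position, not a character count
    rest = f.read()
    f.seek(cookie)
    again = f.read()
    return first, rest, again


first, rest, again = roundtrip_after_first_char(DATA)
translated_length = len(first + rest)
```

Any position arithmetic done on the translated text undercounts by one for every "\r\n" pair, which is why tell() cannot simply use the length of already translated pending text.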

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Guido van Rossum
Guido van Rossum added the comment: This looks promising, but my head hurts when I try to understand the code that's already there and think about whether your patch will always do the right thing... I'll look more later. Regarding "universal newlines without translation:" that means that \r\n

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: > The new patch fixes test_netrc for me but test_csv and test_mailbox are > still broken. Unfortunately, I am not able to build Python on Windows so I cannot test there. Can you post the exact errors? You can send me private email as well if you think the

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Christian Heimes
Christian Heimes added the comment: The new patch fixes test_netrc for me but test_csv and test_mailbox are still broken. -- components: +Library (Lib), Windows; keywords: +py3k

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-07 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: I am attaching another patch (io2.diff). Please review. I am not sure whether _adjust_chunk() should also adjust "readahead". BTW, PEP 3116 says: "If universal newlines without translation are requested on input (i.e. newline=''), if a system read operat

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Guido van Rossum
Guido van Rossum added the comment: > In my opinion the check for \r should only happen when os.linesep or > newline starts with \r. On Unix with standard newline the \r should be > treated as every other char. No: it is not dependent on os.linesep but on the newline parameter passed to open().
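How the newline parameter, rather than os.linesep, governs reading can be sketched with the io module of a current Python 3. DATA and the helper names read_with and lines_with are illustrative only.

```python
import io

DATA = b"a\r\nb\rc\n"  # one line ending of each kind


def read_with(newline):
    """Read all of DATA through a text wrapper using the given newline mode."""
    f = io.TextIOWrapper(io.BytesIO(DATA), encoding="ascii", newline=newline)
    return f.read()


def lines_with(newline):
    """Split DATA into lines using the given newline mode."""
    f = io.TextIOWrapper(io.BytesIO(DATA), encoding="ascii", newline=newline)
    return f.readlines()


translated = read_with(None)    # universal newlines, translated to "\n"
untranslated = read_with("")    # universal newlines, no translation
only_lf = lines_with("\n")      # "\r" is treated like any other character
```

The same bytes give different results purely as a function of the argument passed, independent of the platform the code runs on.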

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Christian Heimes
Christian Heimes added the comment: Guido van Rossum wrote: > IMO you shouldn't read another chunk when the last character you've seen > is \r; instead, you should set a flag so that on the next read, you'll > ignore an initial \n. The flag should be made part of the tell/seek state as > well. In my

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: On 11/6/07, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > - it reads a complete chunk for just one more byte > - the re-read should be disabled when lineends are not translated > these two are minor annoyance and can be easily corrected, but: > > - ther

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Guido van Rossum
Guido van Rossum added the comment: IMO you shouldn't read another chunk when the last character you've seen is \r; instead, you should set a flag so that on the next read, you'll ignore an initial \n. The flag should be made part of the tell/seek state as well. (The problem with reading another char
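The flag idea can be sketched as a small stand-alone translator. This is a simplified illustration under assumed names (PendingLFTranslator, skip_lf), not the code that was eventually committed.

```python
class PendingLFTranslator:
    """Translate "\r" and "\r\n" to "\n" across arbitrary chunk boundaries.

    A trailing "\r" is emitted as "\n" immediately; a flag remembers that
    a "\n" at the start of the next chunk belongs to it and must be dropped.
    """

    def __init__(self):
        self.skip_lf = False  # would also have to be saved by tell()/seek()

    def translate(self, text):
        if not text:
            return text  # nothing was read; leave the flag untouched
        if self.skip_lf and text[0] == "\n":
            text = text[1:]  # second half of a "\r\n" split across reads
        self.skip_lf = text.endswith("\r")
        return text.replace("\r\n", "\n").replace("\r", "\n")


def translate_all(chunks):
    """Feed every chunk through one translator and join the results."""
    translator = PendingLFTranslator()
    return "".join(translator.translate(chunk) for chunk in chunks)


one_at_a_time = translate_all(["a", "\r", "\n", "b"])
```

Note that a chunk consisting only of the skipped "\n" translates to an empty string, so a real read(1) would have to fetch more data in that case instead of returning what looks like end of file.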

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: This patch goes in the right direction, but has several problems IMO: - it reads a complete chunk for just one more byte - the re-read should be disabled when lineends are not translated these two are minor annoyance and can be easily corrected, but: - th

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: I am attaching the patch io.diff that does the following: - If a chunk ends in "\r", read the next chunk to check if it starts with "\n". This is obviously a very simplified solution that can be optimized. - invoke _replacenl on the complete string read i

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-06 Thread Amaury Forgeot d'Arc
Amaury Forgeot d'Arc added the comment: Some thoughts: - is it ok to call replacenl multiple times on the same string? \r\r\n and other combinations like this. - I wonder if it is possible to correctly handle \r\n at the CHUNK_SIZE limit without an incremental decoder. it seems that we need at le

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-05 Thread Raghuram Devarakonda
Raghuram Devarakonda added the comment: I think the solution is to do the translation on a bigger chunk than on the string being returned in each read call. For example, the following patch correctly returns "a" and "\n" (instead of "a" and two "\n"s). Index: Lib/io.py =

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-05 Thread Guido van Rossum
Guido van Rossum added the comment: Wow, thanks! This is not just a bug on Windows -- it is a bug in the TextIOWrapper code that is just more likely on Windows. It is easily reproduced on Linux too:
>>> f = open("@", "wb")
>>> f.write(b"a\r\n")
6
>>> f.close()
>>> f = open("@", "r")
>>> f.read(1)

[issue1395] py3k: duplicated line endings when using read(1)

2007-11-05 Thread Amaury Forgeot d'Arc
New submission from Amaury Forgeot d'Arc: When reading a Windows text file one byte at a time, \r\n gets split into two function calls, and the user receives two \n. The following test fails (put it somewhere in test_io.py, inside TextIOWrapperTest for example): def testReadOneByOne(self):
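The body of the test is cut off in this archive. A check in the same spirit can be sketched as follows; it is not the original test, the helper name read_one_by_one is hypothetical, and it assumes the io module of a current Python 3 where the fix is in place.

```python
import io


def read_one_by_one(data):
    """Read `data` through a text wrapper one character per call."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding="ascii")
    chars = []
    while True:
        c = wrapper.read(1)
        if not c:
            break  # end of file
        chars.append(c)
    return "".join(chars)


# Each "\r\n" should come back as exactly one "\n".
result = read_one_by_one(b"a\r\nb\r\n")
```

With the bug described in this report, the result would contain two "\n" characters for each "\r\n" in the input.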