[ python-Bugs-1117302 ] sgmllib.SGMLParser
Bugs item #1117302, was opened at 2005-02-06 15:04
Message generated for change (Comment added) made by effbot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1117302&group_id=5470

Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Paul Birnie (pbirnie)
Assigned to: Nobody/Anonymous (nobody)
Summary: sgmllib.SGMLParser

Initial Comment:
sgmllib.SGMLParser calls the start-tag and end-tag methods correctly until it encounters

One Two Three

This seems to confuse its parsing, and I only get callbacks for tag a twice (links 1 and 3).

--

>Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:14

Message:
Logged In: YES
user_id=38376

footnote 3: for the link case, also note that the HTMLParser module handles this in a more practical way (that is, it limits itself to SGML features that are actually used on the web).

--

Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:03

Message:
Logged In: YES
user_id=38376

footnote 2: if you need to deal with broken HTML, use TidyLib:

http://utidylib.berlios.de/
http://effbot.org/zone/element-tidylib.htm

--

Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:01

Message:
Logged In: YES
user_id=38376

footnote: is an XML construct, and is not valid HTML. In HTML, "blah", so the BR section is parsed as

START br
DATA >Two<
END br
DATA a>

which is 100% correct. For more on this topic, see:

http://www.cs.tut.fi/~jkorpela/html/empty.html

--

___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
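
The markup in the report did not survive archiving, so the following is only a minimal sketch of footnote 3 under stated assumptions. It assumes the input was three links separated by XML-style `<br/>` tags (the `SAMPLE` string is a hypothetical reconstruction, not the reporter's test case), and it targets Python 3, where the HTMLParser module is spelled `html.parser`. The `EventLogger` class is an illustrative name.

```python
from html.parser import HTMLParser  # this module was named HTMLParser in Python 2.4


class EventLogger(HTMLParser):
    """Record a (kind, tag) pair for every start and end tag seen."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("START", tag))

    def handle_endtag(self, tag):
        self.events.append(("END", tag))


# Hypothetical input: the original report's markup is not recoverable.
SAMPLE = (
    '<a href="1.html">One</a><br/>'
    '<a href="2.html">Two</a><br/>'
    '<a href="3.html">Three</a>'
)

parser = EventLogger()
parser.feed(SAMPLE)
parser.close()

# Count the start-tag callbacks received for the "a" element.
link_starts = [event for event in parser.events if event == ("START", "a")]
print(len(link_starts))
```

With this parser, `<br/>` is reported as a start tag immediately followed by an end tag, so each of the three links gets its own callbacks.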
[ python-Bugs-1116571 ] Wrong match with regex, non-greedy problem
Bugs item #1116571, was opened at 2005-02-05 01:12
Message generated for change (Comment added) made by effbot
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1116571&group_id=5470

Category: Regular Expressions
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: rengel (engel_re)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: Wrong match with regex, non-greedy problem

Initial Comment:
# This is executable.
import re

# My test string is rather long:
tst = "In this Buch, used to designate Dinge der Wirklichkeit rather than SW Ent."

# I want to match the last part of the string:
# SW Ent
# So I define the following pattern and compile it:
pat = r"(.*?) (.*?)"
rex = re.compile(pat)

# Then I search the string to get a match group:
mat = rex.search(tst)

# If found, print the group
if mat: print mat.group()

# Instead of
# SW Ent
# I get the whole string starting with
# Buch...
# up to the very last
# Apparently the non-greedy operator doesn't work correctly.
# What's wrong?

--

>Comment By: Fredrik Lundh (effbot)
Date: 2005-02-08 09:27

Message:
Logged In: YES
user_id=38376

Search returns the first (left-most) location where the pattern matches, if any. The non-greedy operator only guarantees that you get the shortest possible match at that location.

--
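
The left-most rule described in the comment can be illustrated with a small sketch. The tags in the report's test string were lost in archiving, so the `text` value and both patterns below are hypothetical stand-ins chosen only to show the effect; the code uses `print()` so that it also runs on Python 3.

```python
import re

# Hypothetical stand-in for the report's test string, whose markup was lost.
text = "<b>one</b> filler <b>two</b> end"

# Non-greedy, but the search still starts at the left-most "<b>" and
# stretches the group until the rest of the pattern can match.
lazy = re.search(r"<b>(.*?)</b> end", text)

# Excluding "<" from the group makes a match at the first "<b>" impossible,
# so the search moves on and succeeds at the last one.
strict = re.search(r"<b>([^<]*)</b> end", text)

print(lazy.group(1))
print(strict.group(1))
```

Here the non-greedy group captures `one</b> filler <b>two`, while the stricter character class captures `two`. Restricting what the group may contain is one way to get the match the reporter expected; anchoring the pattern or using `re.findall` and taking the last result are others.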
[ python-Bugs-1118729 ] Error in representation of complex numbers(again)
Bugs item #1118729, was opened at 2005-02-09 01:26
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1118729&group_id=5470

Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: George Yoshida (quiver)
Assigned to: Nobody/Anonymous (nobody)
Summary: Error in representation of complex numbers(again)

Initial Comment:
>>> -(1+0j)
(-1+-0j)

I encountered this while I was calculating the conjugate of complex numbers (e.g. z.conjugate()).

Related bug
* http://www.python.org/sf/1013908

One thing to note is that -(0j) can return 0j or -0j depending on the OS. Confirmed on SuSE 9.1 & cygwin.

--
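
As a hedged illustration of what is going on, the imaginary part of `-(1+0j)` is a negative zero: it compares equal to `0.0` but carries a sign. The sketch below shows one way to observe that sign with `math.copysign` (available in Python 2.6 and later, so not in the 2.4 release the report targets). The helper name `zero_sign` is illustrative, and how `repr()` renders the value depends on the Python version and platform.

```python
import math


def zero_sign(x):
    """Return -1.0 if x carries a negative sign (including -0.0), else 1.0."""
    return math.copysign(1.0, x)


z = 1 + 0j
negated = -z
conjugated = z.conjugate()

# Both imaginary parts compare equal to zero ...
print(negated.imag == 0.0, conjugated.imag == 0.0)
# ... but each zero carries a sign, which repr() has to render somehow.
print(zero_sign(negated.imag), zero_sign(conjugated.imag))
print(repr(negated))
```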
[ python-Bugs-1112856 ] patch 1079734 broke cgi.FieldStorage w/ multipart post req.
Bugs item #1112856, was opened at 2005-01-31 01:58
Message generated for change (Comment added) made by irmen
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1112856&group_id=5470

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Irmen de Jong (irmen)
Assigned to: Nobody/Anonymous (nobody)
Summary: patch 1079734 broke cgi.FieldStorage w/ multipart post req.

Initial Comment:
Patch 1079734 "Make cgi.py use email instead of rfc822 or mimetools" seems to have broken the cgi.FieldStorage in cases where the request is a multipart post (for instance, when a file upload form field is used).

See the attached test program. With cgi.py revision <1.83 (python 2.4 for instance) I get the expected results:

374
FieldStorage(None, None, [FieldStorage('param1', None, 'Value of param1'), FieldStorage('param2', None, 'Value of param2'), FieldStorage('file', '', ''), FieldStorage(None, None, '')])

but with cgi.py rev 1.83 (current) I get this:

374
FieldStorage(None, None, [FieldStorage('param1', None, '')])

Another thing that I observed (which isn't reproduced by this test program) is that cgi.FieldStorage.__init__ never completes when the fp is a socket-file (and the request is again a multipart post). It worked fine with the old cgi.py.

--

>Comment By: Irmen de Jong (irmen)
Date: 2005-02-08 22:40

Message:
Logged In: YES
user_id=129426

I've added a test that shows the 'freezing' problem I talked about. Start the server.py, it will listen on port 9000. Open the post.html in your web browser, enter some form data, and submit the form. It will POST to the server.py and if you started that one with cvs-python (2.5a0) it will freeze on the marked line. If you start server.py with Python 2.4, it will work fine.

--

Comment By: Irmen de Jong (irmen)
Date: 2005-02-06 23:58

Message:
Logged In: YES
user_id=129426

Yes, I'll try to make a test case for that within the next few days.
--

Comment By: Josh Hoyt (joshhoyt)
Date: 2005-02-05 00:02

Message:
Logged In: YES
user_id=693077

Irmen, can you try to create a test case where the cgi.FieldStorage never completes, so I can make sure that any fix I come up with resolves it?

I will try to put together an implementation where the email parser parses the whole multipart message.

--

Comment By: Irmen de Jong (irmen)
Date: 2005-02-04 23:06

Message:
Logged In: YES
user_id=129426

Johannes: while your patch makes my cgibug.py test case run fine, it has 2 problems:

1- it runs much slower than the python2.4 code (probably because of the reading back thing Josh is talking about);
2- it still doesn't fix the second problem that I observed: cgi.FieldStorage never completes when fp is a socket. I don't have a separate test case for this yet, sorry.

So Josh: perhaps your idea doesn't have these 2 problems?

--

Comment By: Josh Hoyt (joshhoyt)
Date: 2005-02-04 16:45

Message:
Logged In: YES
user_id=693077

Johannes, your patch looks fine to me. It would be nice if we didn't have to keep reading back each part from the parsed message, though.

I had an idea for another approach. Use email to parse the MIME message fully, then convert it to FieldStorage fields. Parsing could go something like:

== CODE ==
from email.FeedParser import FeedParser
parser = FeedParser()
# Create bogus content-type header...
parser.feed('Content-type: %s ; boundary=%s \r\n\r\n' % (self.type, self.innerboundary))
parser.feed(self.fp.read())
message = parser.close()
# Then take parsed message and convert to FieldStorage fields
== END CODE ==

This lets the email parser handle all of the complexities of MIME, but it does mean that we have to accurately re-create all of the necessary headers. I can cook up a full patch if anyone thinks this would fly.

--

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2005-02-04 11:15

Message:
Logged In: YES
user_id=469548

Here's a patch.
We're interested in two things in the patched loop:

* the rest of the multipart/form-data, including the headers for the current part, for consumption by HeaderParser (this is the tail variable)
* the rest of the multipart/form-data without the headers for the current part, for consumption by FieldStorage (this is the message.get_payload() call)

Josh, Irmen, do you see any problems with this patch?

(BTW, this fix should be ported to the parse_multipart function as well, when I check it in (and I should make cgibug.py into a test))

--

Comment By: Josh Hoyt (josh
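
The "parse the whole message with email" idea from the 2005-02-04 16:45 comment can be sketched as follows. This is only an illustration under assumptions, not the patch discussed in the thread: it uses the Python 3 spelling `email.feedparser` (the comment uses the 2.4 name `email.FeedParser`), the function `parse_form_data`, the boundary `XyZ`, and the sample body are all hypothetical, and it returns a plain dict instead of FieldStorage objects.

```python
from email.feedparser import FeedParser


def parse_form_data(body, boundary):
    """Parse a multipart/form-data body into a {field name: value} dict."""
    parser = FeedParser()
    # Re-create the Content-Type header so the parser knows the boundary.
    parser.feed('Content-Type: multipart/form-data; boundary="%s"\r\n\r\n' % boundary)
    parser.feed(body)
    message = parser.close()

    fields = {}
    for part in message.get_payload():
        # The field name lives in the part's Content-Disposition header.
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload()
    return fields


# Hypothetical request body with two simple form fields.
SAMPLE_BODY = (
    "--XyZ\r\n"
    'Content-Disposition: form-data; name="param1"\r\n'
    "\r\n"
    "Value of param1\r\n"
    "--XyZ\r\n"
    'Content-Disposition: form-data; name="param2"\r\n'
    "\r\n"
    "Value of param2\r\n"
    "--XyZ--\r\n"
)

fields = parse_form_data(SAMPLE_BODY, "XyZ")
print(sorted(fields.items()))
```

Because the whole body is fed to the parser before any field is available, a sketch like this reads the input to the end, which matters when the input is a socket rather than a file.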
[ python-Bugs-1118977 ] builtin file() vanishes
Bugs item #1118977, was opened at 2005-02-08 18:42
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1118977&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Barry Alan Scott (barry-scott)
Assigned to: Nobody/Anonymous (nobody)
Summary: builtin file() vanishes

Initial Comment:
The attached files reproduce a weird problem whereby the builtin file() function completely vanishes from python. Notice that __builtin__ changes type from module to dict.

In the attached tar files find:

manufacture - main program
bob.py - module
a.a - file to open

Untar and run:

python manufacture

Notice that file() is nowhere to be found inside of bob.py.

This runs the same on all 2.3 and 2.4 on Windows, Linux and Mac OS X.

--

>Comment By: Tim Peters (tim_one)
Date: 2005-02-08 22:51

Message:
Logged In: YES
user_id=31435

No, __builtins__ (note the trailing 's') changes type from module to dict, not __builtin__. __builtins__ is an implementation detail, and you shouldn't use it at all. __builtin__ (no trailing 's') is a built-in module, and you're free to use that, but then you have to import it explicitly:

import __builtin__
__builtin__.file

There's more in the Language (not Library) reference manual, in the section "Naming and Binding".

--
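
The distinction drawn in the comment can be seen in a short sketch. It is written to run on both Python 2 and Python 3, which renamed `__builtin__` to `builtins` and dropped `file()` in favor of `open()`, so `open` stands in for `file` here. The names `namespace` and `opener` are illustrative, and the `__builtins__` behavior shown is a CPython implementation detail.

```python
try:
    import __builtin__ as builtins  # Python 2 name, as used in the comment above
except ImportError:
    import builtins  # Python 3 name for the same module

# Executing code in a fresh namespace makes CPython insert __builtins__ there,
# and what it inserts is the built-in module's dictionary, not the module.
namespace = {}
exec("kind = type(__builtins__).__name__", namespace)
print(namespace["kind"])

# The explicitly imported module is the dependable way to reach a built-in.
opener = getattr(builtins, "open")
print(opener is open)
```

Code that needs a built-in by name can import the module explicitly, as above, and never has to care whether `__builtins__` happens to be a module or a dict in the current namespace.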