Changes by Glenn Linderman :
--
nosy: +v+python
___
Python tracker
<http://bugs.python.org/issue13703>
___
___
Python-bugs-list mailing list
Unsubscribe:
Glenn Linderman added the comment:
Given Martin's comment (msg150832) I guess I should add my suggestion to this
issue, at least for the record.
Rather than change hash functions, randomization could be added to those dicts
that are subject to attack by wanting to store user-supplie
Glenn Linderman added the comment:
Alex, I agree the issue has to do with the origin of the data, but the modules
listed are the ones that deal with the data supplied by this particular attack.
Note that changing the hash algorithm for a persistent process, even though
each process may have
Glenn Linderman added the comment:
[offlist]
Paul, thanks for the enumeration and response. Some folks have more
experience, but the rest of us need to learn. Having the proposal in
the ticket, with an explanation of its deficiencies is not all bad,
however, others can learn, perhaps. On
Glenn Linderman added the comment:
I don't find a way to delete my prior comment, so I'll add one more
(only). The prior comment was intended to go to one person, but I didn't
notice the From, having one person's name, actually went back to the
ticket (the email addres
Glenn Linderman added the comment:
In msg142098 Ezio said:
> Keep in mind that we should be able to access and use lone surrogates too,
> therefore:
> s = '\ud800' # should be valid
> len(s) # should this raise an error? (or return 0.5 ;)?
I say:
For streams and da
Glenn Linderman added the comment:
Just some comments for the historical record:
During the discussion of issue 4953 research and testing revealed that browsers
send back their cgi data using the same charset as the page that they are
responding to. So the only way that quoting would be
Glenn Linderman added the comment:
Sergey says:
I wanted to add that the fact that browsers encode the field names in the page
encoding does not change that they should escape the header according to RFC
2047.
I respond:
True, but RFC 2047 is _so_ weird, that it seems that browsers have a
Glenn Linderman added the comment:
I can certainly agree with the opinion that raw strings are working as
documented, but I can also agree with the opinion that they contain traps for
the unwary, and after getting trapped several times, I have chosen to put up
with the double-backslash
Glenn Linderman added the comment:
@Graham: seems like the two primary gotchas are trailing \ and \" \' not
removing the \. The one that tripped me up the most was the trailing \, but I
did get hit with \" once. Probably if Python had been my first programming
language that
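The two gotchas named above can be reproduced directly. This is only a minimal sketch; the variable names and the sample path are made up for illustration.

```python
# Inside a raw string, \" keeps the backslash: the value has two characters.
quoted = r'\"'

# A raw string cannot end in a single backslash, because the backslash
# still "protects" the closing quote. compile() is used here so the
# syntax error can be caught instead of stopping the script.
source = 'path = r"C:\\temp\\"'   # source text is: path = r"C:\temp\"
try:
    compile(source, '<example>', 'exec')
    trailing_backslash_ok = True
except SyntaxError:
    trailing_backslash_ok = False
```

The usual workarounds are doubled backslashes in a normal string, or appending the final backslash separately.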
Glenn Linderman added the comment:
On 3/12/2011 7:11 PM, R. David Murray wrote:
> R. David Murray added the comment:
>
> I've opened issue 11479 with a proposed patch to the tutorial along the lines
> suggested by Graham.
Which is good, for people that use the tutorial. I
Glenn Linderman added the comment:
Presently, a correct application only needs to flush between a sequence of
writes and a sequence of buffer.writes.
Don't assume the flush happens after every write, for a correct application.
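As a rough illustration of that rule, the sketch below puts a text layer on top of an in-memory byte stream (a stand-in for stdout, which is an assumption of this example) and flushes once when switching from text writes to byte writes. Without the flush, the text may still be sitting in the text layer's buffer when the bytes are written.

```python
import io

# Stand-in for sys.stdout: a text wrapper over a byte stream.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")

# A sequence of text writes...
text.write("Content-Type: application/octet-stream\n")
text.write("\n")

# ...then one flush before switching to the byte layer...
text.flush()

# ...then a sequence of byte writes through .buffer.
text.buffer.write(b"\x00\x01")
text.buffer.write(b"\x02\x03")

result = raw.getvalue()
```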
Glenn Linderman added the comment:
Would it suffice if the new scheme internally flushed after every buffer.write?
It wouldn't be needed after write, because the correct application would
already do one there?
Am I off-base in supposing that the performance of buffer.write is expect
Glenn Linderman added the comment:
David-Sarah said:
In any case, given that the buffer of the initial std{out,err} will always be a
BufferedWriter object (since .buffer is readonly), it would be possible for the
TextIOWriter to test a dirty flag in the BufferedWriter, in order to check
Glenn Linderman added the comment:
David-Sarah wrote:
Windows is very slow at scrolling a console, which might make the cost of
flushing insignificant in comparison.)
Just for the record, I noticed a huge speedup in Windows console scrolling when
I switched from WinXP to Win7 on a faster
Glenn Linderman added the comment:
Bertrand Meyer's exposition is flowery, and he is a learned man, but the basic
argument he makes is:
Reflexivity of equality is something that we expect for any data type, and it
seems hard to justify that a value is not equal to itself. As to assig
Glenn Linderman added the comment:
Nick says (and later explains better what he meant):
The status quo works. Proposals to change it on theoretical grounds have a
significantly higher bar to meet than proposals to simply document it clearly.
I say:
What the status quo doesn't provi
Glenn Linderman added the comment:
Regarding http://bugs.python.org/issue4953#msg91444 POST with
multipart/form-data encoding can use UTF-8, other stuff is restricted to ASCII!
From http://www.w3.org/TR/html401/interact/forms.html:
Note. The "get" method restricts form data
New submission from Glenn Linderman :
The CGI interface is a binary stream, because it is pumped directly to/from the
HTTP protocol, which is a binary stream.
Hence, cgitb.py should produce binary output. Presently, it produces text
output.
When one sets stdout to a binary stream, and then
New submission from Glenn Linderman :
CGI is a bytestream protocol. Python assumes a text mode encoding for stdin
and stdout, this is inappropriate for the CGI interface.
CGI should provide an API to "do the right thing" to make stdin and stdout
binary mode interfaces (includ
New submission from Glenn Linderman :
While http://bugs.python.org/issue2683 did clarify the fact that the
.communicate API takes a byte stream as input, it is easy to miss the
implication. Because Python programs start up with stdin as a text stream, it
might be good to point out that some
Glenn Linderman added the comment:
Maybe it should also be mentioned that p.stdout and p.stderr and p.stdin, when
set to be PIPEs, are also byte streams. Of course that is the reason that
communicate accepts and produces byte streams
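A small sketch of what this means in practice, assuming a child process that simply upper-cases its input (the child script is invented for this example): the caller encodes before calling communicate and gets bytes back.

```python
import subprocess
import sys

# Hypothetical child: read bytes from stdin, write them upper-cased to stdout.
child = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"

p = subprocess.Popen([sys.executable, "-c", child],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# The pipes are byte streams, so text must be encoded by the caller.
out, _ = p.communicate("hello".encode("ascii"))
```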
New submission from Glenn Linderman :
.communicate is a nice API for programs that produce little output, and can be
buffered. While that may cover a wide range of uses, it doesn't cover
launching CGI programs, such as is done in http.server. Now there are nice
warnings about that iss
New submission from Glenn Linderman :
The def executable for CGIHTTPRequestHandler is simply wrong on Windows. The
Unix executable bits do not apply.
Yet it is not clear what to use instead. One could check the extension against
PATHEXT, perhaps, but Windows doesn't limit itself to
New submission from Glenn Linderman :
is_cgi doesn't properly handle PATH_INFO parts of the path. The Python2.x
CGIHTTPServer.py had this right, but the introduction and use of
_url_collapse_path_split broke it.
_url_collapse_path_split splits the URL into two parts, the second pa
New submission from Glenn Linderman :
http.server on Python 3 and CGIHTTPServer on Python 2 both contain the same
code with the same bug. In run_cgi, rest.rfind('?') is used to separate the
path from the query string. However, it should be rest.find('?') as the query
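The difference between the two calls shows up as soon as the query string itself contains a '?'. The helper below is a standalone sketch, not the run_cgi code; its name and the sample URL are made up for illustration.

```python
def split_query(rest, use_rfind=False):
    """Split 'path?query' at a '?'; return (path, query)."""
    i = rest.rfind('?') if use_rfind else rest.find('?')
    if i >= 0:
        return rest[:i], rest[i + 1:]
    return rest, ''

rest = 'script.py/extra?q=what?&x=1'

first = split_query(rest)                  # split at the first '?'
last = split_query(rest, use_rfind=True)   # split at the last '?'
```

Splitting at the first '?' keeps the whole query together; splitting at the last one leaves part of the query attached to the path.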
Changes by Glenn Linderman :
--
type: -> behavior
___
Python tracker
<http://bugs.python.org/issue10485>
New submission from Glenn Linderman :
HTTP_HOST HTTP_PORT REQUEST_URI are variables that my CGI scripts use, but
which are not available from http.server or CGIHTTPServer (until I added them).
There may be more standard variables that are not set, I didn't attempt to
enumerate the whole
New submission from Glenn Linderman :
While it is documented that http.server (and Python 2's CGIHTTPServer) do not
process the status header, and limit the usefulness of CGI scripts as a result,
that doesn't make it less of a bug, just a documented bug. But I guess that it
might
Changes by Glenn Linderman :
--
type: -> behavior
___
Python tracker
<http://bugs.python.org/issue10479>
Changes by Glenn Linderman :
--
assignee: -> d...@python
components: +Documentation
nosy: +d...@python
type: -> behavior
Glenn Linderman added the comment:
Martin, that is an interesting viewpoint, and one I considered, but didn't
state, because it seems much too restrictive. Most CGI programs are written in
scripting languages, not compiled to .exe. So it seems the solution should
allow for launchi
Glenn Linderman added the comment:
The rest of the code has clearly never had its deficiencies exposed on Windows,
simply because executable() has prevented that. So what the rest of the code
"already supports" is basically nothing. Reasonable Windows support is
appropriate to im
Glenn Linderman added the comment:
Martin, you are splitting hairs about the "reported problem". The original
message does have a paragraph about the executable bits being wrong. But the
bulk of the message is commenting about the difficulty of figuring out what to
replace it wi
Glenn Linderman added the comment:
Took a little more time to do a little more analysis on this one. Compared a
sample query via Apache on Linux vs http.server, then looked up the CGI RFC for
more info:
DOCUMENT_ROOT: ...
GATEWAY_INTERFACE: CGI/1.1
HTTP_ACCEPT: text/html,application/xhtml
Glenn Linderman added the comment:
Here is a replacement for the body of is_cgi that will work with the current
_url_collapse_path_split function, but it seems to me that it is inefficient to
do multiple splits and joins of the path between the two functions.
splitpath = server
Glenn Linderman added the comment:
So I've experimented a bit, and it looks like simply exposing ._readerthread as
an external API would handle the buffered case for stdout or stderr. For
http.server CGI scripts, I think it is fine to buffer stderr, as it should not
be a high-volume ch
Glenn Linderman added the comment:
Here's an updated _writerthread idea that handles more cases:
def _writerthread(self, fhr, fhw, length=None):
    if length is None:
        flag = True
        while flag:
            buf = fhr.read( 512 )
            fhw.write
Glenn Linderman added the comment:
Sorry, left some extraneous code in the last message, here is the right code:
def _writerthread(self, fhr, fhw, length=None):
    if length is None:
        flag = True
        while flag:
            buf = fhr.read( 512
Glenn Linderman added the comment:
Just to mention, with the added code from issue 10482, I was able to get a
3-stream functionality working great in http.server and also backported it to
2.6 CGIHTTPServer... and to properly process the Status: header on stdout.
Works very well in 2.6; Issue
Glenn Linderman added the comment:
Looking at the code the way I've used it in my modified server.py:
stderr = []
stderr_thread = threading.Thread(target=self._readerthread,
args=(p.stderr, s
Glenn Linderman added the comment:
Pierre, thanks for your work on this. I hope a fix can make it in to 3.2.
However, while starting Python with -u can help a bit, that should not, in my
opinion, be a requirement to use CGI. Rather, the stdin should be set into
binary mode by the CGI
Glenn Linderman added the comment:
Regarding the use of detach(), I don't know if it works. Maybe it would. I
know my code works, because I have it working. But if there are simpler
solutions that are shown to work, that would be
Glenn Linderman added the comment:
Peter, it seems that detach is relatively new (3.1) likely the code samples and
suggestions that I had found to cure the problem predate that. While I haven't
yet tried detach, your code doesn't seem to modify stdin, so are you
suggesti
Glenn Linderman added the comment:
Rereading the doc link I pointed at, I guess detach() is part of the new API
since 3.1, so doesn't need to be checked for in 3.1+ code... but instead, may
need to be coded as:
try:
    sys.stdin = sys.stdin.detach()
except UnsupportedOper
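One way to spell out that idea as a complete, testable unit is sketched below. The helper name is hypothetical, and an in-memory stream stands in for sys.stdin so the example can run anywhere.

```python
import io

def binary_stream(stream):
    """Return the byte stream underneath a text stream, if there is one."""
    try:
        return stream.detach()
    except (AttributeError, io.UnsupportedOperation):
        # Already a byte stream, or a stream with nothing to detach.
        return stream

# Stand-in for a text-mode sys.stdin carrying form data.
text_in = io.TextIOWrapper(io.BytesIO(b"a=1&b=2"), encoding="ascii")
data = binary_stream(text_in).read()
```

After detach() the original text wrapper is unusable, so the returned object has to replace it, as the assignment to sys.stdin in the message does.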
Glenn Linderman added the comment:
So then David, is your suggestion to use
sys.stdin = sys.stdin.detach()
and you claim that the Windows-specific hacks are not needed in 3.x land? They
are, in 2.x land, I have proven empirically, but haven't been able to test CGI
forms very well i
Glenn Linderman added the comment:
David, Starting from a working (but hacked to work) version of http.server and
using 3.2a1 (I should upgrade to the Beta, but I doubt it makes a difference at
the moment), I modified
# if hasattr( sys.stdin, 'buffer'):
#
Glenn Linderman added the comment:
(and I should mention that all the "hacked to work" issues in my copy of
http.server have been reported as bugs, on 2010-11-21. The ones of most
interest related to this binary bytestream stuff are issue 10479 and is
Glenn Linderman added the comment:
R. David said:
>From looking over the cgi code it is not clear to me whether Pierre's approach
>is simpler or more complex than the alternative approach of starting with
>binary input and decoding as appropriate. From a consistency perspec
Glenn Linderman added the comment:
R. David said:
(I believe http uses latin-1 when no charset is specified, but I need to double
check that)
See http://bugs.python.org/issue4953#msg121864 ASCII and UTF-8 are what HTTP
defines. Some implementations may, in fact, use latin-1 instead of ASCII
New submission from Glenn Linderman :
Per Antoine's request, I wrote this test code, it isn't elegant, I whipped it
together quickly; but it shows the issue. The issue may be one of my
ignorance, but it does show the behavior I described in issue 4953. Here's the
output from t
Glenn Linderman added the comment:
Pierre said:
In all cases the interpreter must be launched with the -u option. As stated in
the documentation, the effect of this option is to "force the binary layer of
the stdin, stdout and stderr streams (which is available as their buffer
attribut
Glenn Linderman added the comment:
tested on Windows, for those that aren't following issue 4953
--
components: +IO
type: -> behavior
Glenn Linderman added the comment:
The same. This can be tested with the same test program,
c:\python32\python.exe test.py 1 > test1.txt
similar for 2, 3, 4. Then add -u and repeat. All 8 cases produce the same
results, either via a pipe, or with a redirected std
Glenn Linderman added the comment:
Actually, it seems like this "-u" behaviour, should simply be the default for
Python 3.x on Windows. The new IO subsystem seems to be able to add \r when
desired anyway. And except for Notepad, most programs on Windows can deal with
\r\n or solo
Glenn Linderman added the comment:
Is there an easy way for me to find the code for -u? I haven't learned my way
around the Python sources much, just peeked in modules that I've needed to fix
or learn something from a little. I'm just surprised you think it is
orthogonal, b
Glenn Linderman added the comment:
I can read and understand C well enough, having coded in it for about 40 years
now... but I left C for Perl and Perl for Python, I try not to code in C when I
don't have to, these days, as the P languages are more productive, overall.
But there has
Glenn Linderman added the comment:
I suppose the FileIO in _io is next to look at, wherever it can be found.
Glenn Linderman added the comment:
Found it.
The file browser doesn't tell what line number it is, but in _io/Fileio.c
function fileio_init, there is code like
#ifdef O_BINARY
    flags |= O_BINARY;
#endif
#ifdef O_APPEND
    if (append)
        flags |= O_APPEND;
#endif
    if (fd
Glenn Linderman added the comment:
Etienne, I'm not sure what you are _really_ referring to by
HTTP_TRANSFER_ENCODING. There is a TRANSFER_ENCODING defined by HTTP but it is
completely orthogonal to character encoding issues. There is a
CONTENT_ENCODING defined which is a char
Glenn Linderman added the comment:
Don't find "initstdio" "stdio" in pythonrun.c. Has it moved? There are
precious few references to stdin, stdout, stderr in that module, mostly for
attaching the default encoding.
Glenn Linderman added the comment:
stderr is notable by its absence in the list of O_BINARY adjustments.
So -u does do 2/3 of what my windows_binary() does :) Should I switch my test
case to use stderr to demonstrate that it doesn't help with that? I vaguely
remember that early versio
Glenn Linderman added the comment:
Etienne said:
yes, lets not complexify anymore please...
Albert Einstein said:
Things should be as simple as possible, but no simpler.
I say:
My "learning" of HTTP predates "chunked". I've mostly heard of it being used
in downloads
Glenn Linderman added the comment:
Makes sense to me. Still should document the open file parameter when passed
an fd, and either tell the user that it should be O_BINARY, or that it will be
O_BINARYd for them, whichever technique is chosen. But having two newline
techniques is bad, and if
Glenn Linderman added the comment:
Victor, Thanks for your interest and patches.
msg125530 points out the location of the code where _all_ fds could be
O_BINARYed, when passed in to open. I think this would make all fds open in
binary mode, per Guido's comment... he made exactl
Glenn Linderman added the comment:
We have several, myself included, that can't use CGI under 3.x because it
doesn't take a binary stream.
I believe there are several alternatives:
1) Document that CGI needs a binary stream, and expect the user to provide it,
either an explicit han
Glenn Linderman added the comment:
Pierre said:
Option 1 is impossible, because the CGI script sometimes has no control on the
stream : for instance on a shared web host, it will receive sys.stdin as a text
stream
I say:
It is the user code of the CGI script that calls CGI.FieldStorage. So
Glenn Linderman added the comment:
Thanks for your work on this Victor, and other commenters also.
Glenn Linderman added the comment:
Interesting!
I was able to tweak David-Sarah's code to work with Python 3.x, mostly doing
things that 2to3 would probably do: changing unicode() to str(), dropping u
from u'...', etc.
I skipped the unmangling of command-line arguments, beca
Glenn Linderman added the comment:
I would certainly be delighted if someone would reopen this issue, and figure
out how to translate unicode2.py to Python internals so that Python's console
I/O on Windows would support Unicode "out of the box".
Otherwise, I'll have to in
New submission from Glenn Linderman :
In attempting to review issue 4953, I discovered a conundrum in handling of
multipart/form-data.
cgi.py has claimed for some time (at least since 2.4) that it "handles" file
storage for uploading large files. I looked at the code in 2.6 that han
Glenn Linderman added the comment:
This looks much simpler than the previous patch. However, I think it can be
further simplified. This is my first reading of this code, however, so I might
be totally missing something(s).
Pierre said:
Besides FieldStorage, I modified the parse() function
Glenn Linderman added the comment:
Also, the required behavior of make_file changes, to need the right encoding,
or binary, so that needs to be documented as a change for people porting from
2.x. It would be possible, even for files, which will be uploaded as binary,
for a user to know the
Glenn Linderman added the comment:
Trying to code some of this, it would be handy if BytesFeedParser.feed would
return a status, indicating if it has seen the end of the headers yet. But that
would only work if it is parsing as it goes, rather than just buffering, with
all the real parsing
Glenn Linderman added the comment:
I wrote:
Additionally, if there is a CONTENT-LENGTH specified for non-binary data, the
read_binary method should be used for it also, because it is much more
efficient than readlines... less scanning of the data, and fewer outer
iterations. This goes well
Glenn Linderman added the comment:
It seems the choice of whether to make_file or StringIO is based on the
existence of self.length... per my previous comment, content-length doesn't
seem to appear in any of the multipart/ item headers, so it is unlikely that
real files will be creat
Glenn Linderman added the comment:
Victor said:
Don't you think that a warning would be appropriate if sys.stdin is passed
here?
---
# self.fp.read() must return bytes
if isinstance(fp,TextIOBase):
    self.fp = fp.buffer
else:
    self.fp = fp
---
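For illustration only, the quoted check could be combined with the warning Victor asks about roughly as follows. The function name, warning category and message are assumptions made for this sketch, not part of any patch.

```python
import io
import warnings

def as_binary(fp):
    """Return a stream whose read() yields bytes, warning on text input."""
    if isinstance(fp, io.TextIOBase):
        warnings.warn("text stream passed; using its .buffer instead",
                      RuntimeWarning, stacklevel=2)
        # Note: text streams without a byte layer (e.g. io.StringIO)
        # have no .buffer, so this line would raise AttributeError.
        return fp.buffer
    return fp

# Stand-in for sys.stdin opened in text mode.
text_in = io.TextIOWrapper(io.BytesIO(b"field=value"), encoding="ascii")

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    body = as_binary(text_in).read()
```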
Glenn Linderman added the comment:
R. David said:
However, I'm not clear on how that helps. Doesn't FieldStorage also load
everything into memory?
I say:
FieldStorage in 2.x (for x <= 6, at least) copies incoming file data to a file,
using limited size read/write operations.
Glenn Linderman added the comment:
Victor said:
"Set sys.stdin.buffer.encoding attribute is not a good idea. Why do you modify
fp, instead of using a separated attribute on FieldStorage (eg.
self.fp_encoding)?"
Pierre said:
I set an attribute encoding to self.fp because, for each
Glenn Linderman added the comment:
Victor said:
I mean: you should pass sys.stdin.buffer instead of sys.stdin.
I say:
That would be possible, but it is hard to leave it at default, in that case,
because sys.stdin will, by default, not be a binary stream. It is a
convenience for FieldStorage
Glenn Linderman added the comment:
I said:
I wonder what result you get with the same browser, at the web page
http://rishida.net/tools/conversion/ by entering the euro symbol into the
Characters entry field, and choosing convert.
But I couldn't wait, so I ran a test with € in one
Glenn Linderman added the comment:
R. David:
Pierre said:
BytesFeedParser only uses the ascii codec ; if the header has non ASCII
characters (filename in a multipart/form-data), they are replaced by ? : the
original file name is lost. So for the moment I leave the text version of
FeedParser
Glenn Linderman added the comment:
In my previous message I quoted Pierre rightly cautioning about headers
containing non-ASCII... and that BytesFeedParser doesn't, so using it to parse
headers may be questionable.
So I decided to try one... I show the Live HTTP headers below, from a s
Glenn Linderman added the comment:
Pierre said:
Since it works the same with 2 browsers and 2 web servers, I'm almost sure it's
not dependent on the configuration - but if others can test on different
configurations I'd like to know the result
So I showed in my just previous
Glenn Linderman added the comment:
Aha!
Found a page <http://htmlpurifier.org/docs/enduser-utf8.html#whyutf8-support>
which links to another page
<http://web.archive.org/web/20060427015200/ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>
that explains the behavior.
The syno
Glenn Linderman added the comment:
I notice the version on this issue is Python 3.3, but it affects 3.2 and 3.1 as
well. While I would like to see it fixed for 3.2, perhaps it is too late for
that, with rc1 coming up this weekend?
Could at least the non-deprecated parse functions be
Glenn Linderman added the comment:
Pierre,
I applied your patch to my local copy of cgi.py for my installation of 3.2, and
have been testing. Lots of things work great!
My earlier comment regarding make_file seems to be relevant. Files that are
not binary should have an encoding. Likely
Glenn Linderman added the comment:
I'd be willing to propose such a patch and tests, but I haven't a clue how,
other than starting by reading the contributor document... I was putting off
learning the process until hg conversion, not wanting to learn an old process
for a few mont
Glenn Linderman added the comment:
Pierre said:
The encoding used by the browser is defined in the Content-Type meta tag, or
the content-type header ; if not, the default seems to vary for different
browsers. So it's definitely better to define it
The argument stream_encoding us
Glenn Linderman added the comment:
Pierre,
Looking better.
I see you've retained the charset parameter, but do not pass it through to
nested calls of FieldStorage. This is good, because it wouldn't work if you
did. However, purists might still complain that FieldStorage should
Glenn Linderman added the comment:
The O_BINARY stuff was probably necessary because issue 10841 is not yet in the
build Pierre was using? I agree it is not necessary with the fix for that
issue, but neither does it hurt.
It could be stripped out, if you think that is best, Antoine.
But
Glenn Linderman added the comment:
Victor, thanks for your comments, and interest in this bug. Other than the
existence of the charset parameter, and whether or not to include IOMix, I
think all of the others could be fixed later, and do not hurt at present. So I
will just comment on those
Glenn Linderman added the comment:
Graham, Thanks for your comments. Fortunately, if the new charset parameter is
not supplied, no mucking with stdout or stderr is done, which is the only
reason I cannot argue strongly against the feature, which I would have
implemented as a separate API
Glenn Linderman added the comment:
Pierre, Thank you for the new patch, with the philosophy of "it's broke, so
let's produce something the committers like to get it fixed".
I see you overlooked removing the second use of O_BINARY. Locally, I removed
that also, and test
Changes by Glenn Linderman :
--
versions: +Python 3.2 -Python 3.3
___
Python tracker
<http://bugs.python.org/issue4953>
Glenn Linderman added the comment:
Thanks to Pierre for producing patch after patch and testing testing testing,
and to Victor for committing it, as well as others that contributed in smaller
ways, as I tried to. I look forward to 3.2 rc1 so I can discard all my
temporary patched copies of
Glenn Linderman added the comment:
Victor said:
Why do you set the code page to 65001? In all my tests (on Windows XP), it
always break the standard input.
My response:
Because when I searched Windows for Unicode and/or UTF-8 stuff, I found 65001,
and it seems like it might help, and it does
Glenn Linderman added the comment:
Issue 4953 has somewhat resolved this issue by using email only for parsing
headers (more like 2.x did). So this issue could be closed, or could be left
open to point out the required additional features needed from email before
cgi.py can use it for
Glenn Linderman added the comment:
So since cgi.py was fixed to use the .buffer attribute of sys.stdout, that
leaves sys.stdout itself as a character stream, and cgitb.py can successfully
write to that.
If cgitb.py never writes anything but ASCII, then maybe that should be
documented, and
Glenn Linderman added the comment:
Fixed by issue 10841 and issue 4953.
--
status: open -> closed
___
Python tracker
<http://bugs.python.org/issue10480>