Changes by David Watson :
nosy: +baikie
Python tracker
Python-bugs-list mailing list
David Watson added the comment:
I had a look at this patch, and the FD passing looked OK, except
that calculating the buffer size with CMSG_SPACE() may allow more
than one file descriptor to be received, with the extra one going
unnoticed - it should use CMSG_LEN() instead (the existing C
David Watson added the comment:
For reference, the warnings are partially explained here:
I get
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
nosy: +baikie
David Watson added the comment:
On Sun 18 Sep 2011, Charles-François Natali wrote:
> > I had a look at this patch, and the FD passing looked OK, except
> > that calculating the buffer size with CMSG_SPACE() may allow more
> > than one file descriptor to be received, with t
New submission from David Watson :
The function _multiprocessing.recvfd() calls recvmsg() and
expects to receive a file descriptor in an SCM_RIGHTS control
message, but doesn't check that such a control message is
actually present. So if the sender sends data without an
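The check described in this report can be illustrated at the Python level with the socket.recvmsg() API. This is only a minimal sketch, not the C code of _multiprocessing.recvfd(): the helper name recv_one_fd is hypothetical, and it assumes a Unix system and Python 3.3 or later. It sizes the ancillary buffer with CMSG_LEN() for a single descriptor and raises an error when no SCM_RIGHTS message arrives.

```python
import os
import socket
import struct

def recv_one_fd(sock):
    """Receive one data byte and exactly one file descriptor (hypothetical helper)."""
    fd_size = struct.calcsize("i")
    # CMSG_LEN() leaves room for one descriptor only, with no trailing padding.
    data, ancdata, flags, _ = sock.recvmsg(1, socket.CMSG_LEN(fd_size))
    for level, ctype, cdata in ancdata:
        if (level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS
                and len(cdata) >= fd_size):
            return struct.unpack("i", cdata[:fd_size])[0]
    # The sender sent plain data without a descriptor attached.
    raise RuntimeError("no SCM_RIGHTS control message received")

# Usage: pass the write end of a pipe across a socket pair.
left, right = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
read_end, write_end = os.pipe()
left.sendmsg([b"x"],
             [(socket.SOL_SOCKET, socket.SCM_RIGHTS, struct.pack("i", write_end))])
passed_fd = recv_one_fd(right)
os.write(passed_fd, b"hi")
```

Raising an exception instead of returning an uninitialised value is one possible design; the actual patch may handle the missing message differently.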
Changes by David Watson :
Added file:
David Watson added the comment:
On Tue 20 Sep 2011, Charles-François Natali wrote:
> I committed the patch to catch the ImportError in test_multiprocessing.
This should go in all branches, I think - see issue #13022.
David Watson added the comment:
On Mon 23 May 2011, Gergely Kálmán wrote:
> It's been a while since I had a look at that code. As far as I remember though
> the code is fairly decent not
> taking the missing unit tests into account. There are a few todos, and
> also a pretty bad bu
David Watson added the comment:
On Sun 12 Jun 2011, Charles-François Natali wrote:
> The patches look good to me, except that instead of passing
> (addrlen > buflen) ? buflen : addrlen
> as addrlen argument every time makesockaddr is called, I'd
> prefer if this
New submission from David Watson :
Changeset fd10d042b41d removed the wrappers on ssl.SSLSocket for
the new socket.send/recvmsg() methods (since I forgot to check
for the existence of the underlying methods - see issue #6560),
but this leaves SSLSocket with send/recvmsg() methods inherited
David Watson added the comment:
On Tue 23 Aug 2011, Nick Coghlan wrote:
> As you can see, I just pushed a change that removed the new
> methods from SSLSocket objects. If anyone wants to step up with
> a valid use case (not already covered by wrap_socket),
> preferably with a patch
New submission from David Watson :
Changeset 4736e172fa61 for issue #12810 removed the test
"msg->msg_controllen < 0" from socketmodule.c, where
msg_controllen happened to be unsigned on the reporter's system.
I included this test deliberately, because msg_controllen m
David Watson added the comment:
On Wed 24 Aug 2011, Charles-François Natali wrote:
> > I included this test deliberately, because msg_controllen may be
> > of signed type [...] POSIX allows socklen_t to be signed
David Watson added the comment:
On Thu 25 Aug 2011, Antoine Pitrou wrote:
> Adding an explanation message to the NotImplementedError would be more
> helpful. Otherwise, good catch.
OK, I've copied the messages from the ValueErrors the other
methods raise.
New submission from David Watson :
Attaching simple tests for these functions, which aren't currently tested.
components: Extension Modules
files: test-mknod-mkfifo-3.x.diff
keywords: patch
messages: 113609
nosy: baikie
priority: normal
severity: normal
status: open
title: Add
Changes by David Watson :
Added file:
New submission from David Watson :
These functions still use the "s" format for their arguments; the attached
patch fixes them to use PyUnicode_FSConverter() in 3.2. Some simple tests for
these functions (not for PEP 383 behaviour) are at issue #9569.
components: Extensi
New submission from David Watson :
It may be hard to find a configuration string this long, but you
can see the problem if you apply the attached
confstr-reduce-bufsize.diff to reduce the size of the local array
buffer that posix_confstr() uses. With it applied:
>>> import os
Changes by David Watson :
Added file:
New submission from David Watson :
The attached patch applies on top of the patch from issue #9579 to
make it use PyUnicode_DecodeFSDefaultAndSize(). (You could use
it in the existing code, but until that issue is fixed, there is
sometimes nothing to decode!)
components: Extension
David Watson added the comment:
The returned string should also be decoded with the file system
encoding and surrogateescape error handler, as per PEP 383 -
there's a patch at issue #9580 to do this.
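As a rough illustration of the PEP 383 behaviour mentioned here, the sketch below decodes bytes that are not valid UTF-8 with the surrogateescape error handler and encodes them back unchanged. The sample byte string is invented for the example, and UTF-8 is assumed as the file system encoding; os.fsdecode() and os.fsencode() apply the same idea using whatever encoding the platform reports.

```python
import os

# Hypothetical value: Latin-1 bytes that are not valid UTF-8.
raw = b"/opt/caf\xe9/bin"

# The undecodable byte 0xE9 becomes the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", "surrogateescape")

# The os helpers do the same with the real file system encoding (POSIX assumed).
fs_roundtrip = os.fsencode(os.fsdecode(raw))
```

The point of the handler is that no information is lost: a value returned by the OS can be passed back to it byte for byte.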
David Watson added the comment:
I'm not quite sure what you mean, but the man page for FreeBSD 5.3 specifies
EPERM for an unprivileged user and EINVAL for an attempt to create something
other than a device node. POSIX requires creating a FIFO to work for any user,
and just says that E
David Watson added the comment:
OK, these patches work on FreeBSD 5.3 (root and non-root) if you want to check
the errno. I don't know what other systems might return though. I did also
find that the 2.x tests were failing on cleanup because the test class used
os.unlink rather
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
David Watson added the comment:
The CS_PATH variable is a colon-separated list of directories ("the value for
the PATH environment variable that finds all standard utilities"), so the file
system encoding is certainly correct there.
I don't see any reference to an encoding in
David Watson added the comment:
I don't see why confstr() values shouldn't change. sysconf() values can change
between calls, IIRC. Implementations can also define their own confstr
variables - they don't have to stick to the POSIX stuff.
And using a loop means the conf
New submission from David Watson :
These functions each return the path to a terminal, so they should use
PyUnicode_DecodeFSDefault(). Patch attached.
components: Extension Modules
files: ttyname-ctermid-pep383.diff
keywords: patch
messages: 113920
nosy: baikie
priority: normal
New submission from David Watson :
The pwd module decodes usernames using PyUnicode_DecodeFSDefault(), so
initgroups() should use PyUnicode_FSConverter() for the username. Patch
components: Extension Modules
files: initgroups-pep383.diff
keywords: patch
messages: 113921
New submission from David Watson :
The pwd module decodes usernames with PyUnicode_DecodeFSDefault(), and the
LOGNAME environment variable (suggested as an alternative to getlogin()) is
decoded the same way. Attaching a patch to use PyUnicode_DecodeFSDefault() in
David Watson added the comment:
> CS_PATH is hardcoded to "/bin:/usr/bin" in the GNU libc for UNIX. Do you know
> another key for which the value can be controlled by the user (or the system
> administrator)?
No, not a specific example, but CS_PATH could conceivably
David Watson added the comment:
> I just fear that the loop is "endless". Imagine the worst case: confstr()
> returns a counter (n, n+1, n+2, ...). In 64 bits, it can be long.
The returned length is supposed to be determined by the length of
the variable, not the length of t
New submission from David Watson :
The statvfs() function still converts its argument with the "s"
format; the attached patch (for 3.2) fixes it to use
components: Extension Modules
files: statvfs-pep383-3.2.diff
keywords: patch
messages: 114392
New submission from David Watson :
The pathconf() function still converts its argument with the "s"
format; the attached pathconf-pep383-3.2.diff fixes it to use
PyUnicode_FSConverter() (in 3.2). Also attaching
pathconf-cleanup.diff to clean up the indentation, which
otherwise make
Changes by David Watson :
Added file:
New submission from David Watson :
This came up in relation to issue #9579; there is some discussion
of it there. Basically, if os.confstr() has to call confstr()
twice because the buffer wasn't big enough the first time, the
existing code assumes the string is the same length that t
David Watson added the comment:
I've opened a separate issue for the changing-length problem
(issue #9647; it affects 2.x as well). Here is a patch that
fixes the 255-byte issue only, and has similar results to the 2.x
code if the value changes length between calls (except that it
David Watson added the comment:
I wrote this patch to make confstr() return bytes (with code
similar to 2.x), and document the change in "Porting to Python
3.2" and elsewhere, but it then occurred to me that you might
have been talking about making a separate bytes API like
New submission from David Watson :
The protocol and service/port number databases are typically
implemented as text files on Unix and can contain non-ASCII names
in any encoding (presumably for local services), but the socket
module tries to decode them as strict UTF-8. In particular
David Watson added the comment:
I noticed that try-surrogateescape-first.diff missed out one of
the string references that needed to be changed to point to the
bytes object, and also used PyBytes_AS_STRING() in an unlocked
section. This version fixes these things by taking the generally
David Watson added the comment:
Updated the socket module patch to include gethostbyaddr() - it
happens to accept hostnames and is used this way in the standard
Added file:
David Watson added the comment:
Come to think of it, I'm not sure if the patch is correct for
Windows, as PyUnicode_DecodeFSDefault() appears to do strict MBCS
decoding by default (similarly with PyUnicode_FSConverter() for
encoding). Can Windows return service names that won't d
David Watson added the comment:
> Thanks for the patch. Committed as r84261.
> I'm not sure what the point is of supporting IDNA in getnameinfo, so I have
> removed that from the patch. If you think it's needed, please elaborate.
I don't see the point of it eithe
David Watson added the comment:
> Is this patch in response to an actual problem, or a theoretical problem?
> If "actual problem": what was the specific application, and what was the
> specific host name?
It's about environments, not applications - the local network ma
David Watson added the comment:
> > It's about environments, not applications
> Still, my question remains. Is it a theoretical problem (i.e. one
> of your imagination), or a real one (i.e. one you observed in real
> life, without explicitly triggering it)? If real:
David Watson added the comment:
> Is this still an issue on later versions of Python and/or FreeBSD?
Yes, there is still an issue. There is no longer a deadlock on
FreeBSD because the module has been changed to use only lockf() and
dot-locking (on all platforms), but the issue is now about
David Watson added the comment:
> The surrogateescape mechanism is a very hackish approach, and
> violates the principle that errors should never pass silently.
I don't see how a name resolution API returning non-ASCII bytes
would indicate an error. If the host table contains a non
David Watson added the comment:
> > I don't see how a name resolution API returning non-ASCII bytes
> > would indicate an error.
> It's in violation of RFC 952 (slightly relaxed by RFC 1123).
That's bad if it's on the public Internet, but it's not
David Watson added the comment:
OK, I still think this issue should be addressed, but here is a patch for the
part we agree on: that decoding should not return any Unicode characters except
Added file:
David Watson added the comment:
The rest of the issue could also be straightforwardly addressed by adding bytes
versions of the name lookup APIs. Attaching a patch which does that (applies
on top of decode-strict-ascii.diff).
Added file:
Changes by David Watson :
Removed file:
David Watson added the comment:
Oops, forgot to refresh the last change into that patch. This should fix it.
Added file:
New submission from David Watson :
This test requires network access as it tries to resolve a domain name. Patch attached.
components: Tests
files: idna-test-resource.diff
keywords: patch
messages: 115593
nosy: baikie
priority: normal
severity: normal
status: open
David Watson added the comment:
> > baikie: why did the test pass for you?
> The test passes (I assume) if linux-pass-unterminated.diff is applied. The
> latter patch is only meant to exhibit the issue, though, not to be checked in.
No, I meant for linux-pass-untermin
David Watson added the comment:
> baikie, coming back to your original message: what precisely makes you
> believe that sun_path does not need to be null-terminated on Linux?
That's the way I demonstrated the bug - the only way to bind to a
108-byte path is to pass it without null
David Watson added the comment:
Updated the patches for Python 3.2 - these are now simpler as
they do not support bytearray arguments, as these are no longer
used for filenames (the existing code does not support bytearrays
I've put the docs and tests in one patch, and made sep
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
David Watson added the comment:
I've updated the PEP 383 patches at issue #8373 with separate
versions for if the linux-pass-unterminated patch is applied or
If it's not essential to have unit tests for the overrun issue,
I'd suggest applying just the return-untermina
David Watson added the comment:
One of the tests got broken by the removal of sys.setfilesystemencoding().
Replaced it.
Added file:
David Watson added the comment:
> With all the effort that went into the patch, I recommend to get it right: if
> there is space for the \0, include it. If the string size is exactly 108, and
> it's linux, write it unterminated. Else fail.
> As for testing: we should th
David Watson added the comment:
I meant to say that FreeBSD provides the SUN_LEN macro, but it
turns out that Linux does as well, and its version behaves the
same as FreeBSD's. The FreeBSD man pages state that the
terminating null is not part of the address:
David Watson added the comment:
> If I understood correctly, you don't want the value to be truncated if the
> variable grows between the two calls to confstr(). Which behaviour would you
> expect? A Python exception?
A return size larger than the buffer is *supposed* to ind
David Watson added the comment:
> platform.system() fails with UnicodeEncodeError on systems that have their
> computer name set to a name containing non-ascii characters. The
> implementation of platform.system() uses at some point socket.gethostname() (
> see http://www.pastea
David Watson added the comment:
> As a further note: I think socket.gethostname() is a special case, since this
> is just about a local setting (i.e. not related to DNS).
But the hostname *is* commonly intended to be looked up in the
DNS or whatever name resolution mechanisms are used l
David Watson added the comment:
> The result from gethostname likely comes out of machine-local
> configuration. It may have non-ASCII in it, which is then likely
> encoded in the local encoding. When looking it up in DNS, IDNA
> should be applied.
I would have thought that
David Watson added the comment:
> > In fact, I would think that non-ASCII bytes in a hostname most
> > probably indicated that a name resolution mechanism other than
> > the DNS was in use, and that the byte string should be passed
> > unaltered just as a typical C pro
David Watson added the comment:
I was looking at the MSDN pages linked to above, and these two
pages seemed to suggest that Unicode characters appearing in DNS
names represented UTF-8 sequences, and that Windows allowed such
non-ASCII byte sequences in the DNS by default:
David Watson added the comment:
> > Also, if GetComputerNameEx() only offers a choice of DNS names or
> > NetBIOS names, and both are byte-oriented underneath (that was my
> > reading of the "Computer Names" page), then presumably there
> > shouldn't be
David Watson added the comment:
> On other platforms, I guess we'll just have to do some trial
> and error to see what works and what not. E.g. on Linux it is
> possible to set the hostname to a non-ASCII value, but then
> the resolver returns an error, so it
David Watson added the comment:
> FWIW, you can do the same on a Linux box, i.e. setup the host name
> and domain to some completely bogus values. And as David pointed out,
> without also updating the /etc/hosts on the Linux, you always get the
> resolver error with hostname -f
New submission from David Watson:
The error message has no newline at the end:
$ LANG=en_GB.UTF-8 python3.0 $'\xff'
Could not convert argument 2 to string$
Seriously, though: is this the intended behaviour? If the
interpreter just dies when it gets
David Watson added the comment:
Hmm, yes, I see that the open() builtin doesn't accept bytes
filenames, though still does. When I saw that you
could pass bytes filenames transparently from os.listdir() to, I assumed that this was intentional!
David Watson added the comment:
Thanks for your interest! I'm actually still working on the
patch I posted, docs and a test suite, and I'll post something
Yes, you could just use b"".join() with sendmsg() (and get
slightly annoyed because it doesn't accept buff
David Watson added the comment:
OK, here's a new version as a work in progress. A lot of the new
stuff is uncommented (particularly the support code for the
tests), but there are proper docs this time and a fairly complete
test suite (but see below).
There are a couple of changes t
David Watson added the comment:
I just found that the IPv6 tests don't get skipped when IPv6 is
available but disabled in the build - you can create IPv6
sockets, but not use them :/ This version fixes the problem.
Added file:
David Watson added the comment:
I was about to report this for the socket module - the gethostbyname(),
gethostbyname_ex() and getnameinfo() functions are the only things currently
affected in that module as far as I can see. 3.x is affected too - the
functions will pass non-ASCII Unicode
New submission from David Watson :
The makesockaddr() function in the socket module assumes that
AF_UNIX addresses have a null-terminated sun_path, but Linux
actually allows unterminated addresses using all 108 bytes of
sun_path (for normal filesystem sockets, that is, not just
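To make the termination issue concrete, here is a small Python sketch of extracting a path from a raw sun_path buffer without assuming a terminating null. The helper path_from_sun_path and its parameters are hypothetical and do not mirror makesockaddr() itself; the 108-byte size is the Linux figure given in the report, and addresses that begin with a null byte are deliberately not handled.

```python
SUN_PATH_SIZE = 108  # size of sun_path on Linux, per the report

def path_from_sun_path(buf, reported_len):
    """Return the path bytes held in buf (hypothetical helper).

    reported_len is the number of path bytes the system call reported; it is
    capped at the buffer size so a longer value cannot cause an overrun.
    """
    usable = buf[:min(reported_len, len(buf))]
    end = usable.find(b"\0")
    # Without a null byte, the path fills the whole usable region.
    return usable if end == -1 else usable[:end]

# A path using all 108 bytes, with no terminator.
full = b"/" + b"x" * (SUN_PATH_SIZE - 1)
full_path = path_from_sun_path(full, SUN_PATH_SIZE)

# A short, null-terminated path in a zero-padded buffer.
short = b"/tmp/sock".ljust(SUN_PATH_SIZE, b"\0")
short_path = path_from_sun_path(short, len(b"/tmp/sock") + 1)
```

Capping the length first and searching for a terminator second means the helper never reads past the buffer, whichever convention the platform follows.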
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
New submission from David Watson :
In 3.x, the socket module assumes that AF_UNIX addresses use
UTF-8 encoding - this means, for example, that accept() will
raise UnicodeDecodeError if the peer socket path is not valid
UTF-8, which could crash an unwary server.
Python 3.1.2 (r312:79147, Mar 23
David Watson added the comment:
This patch does the same thing without fixing issue #8372 (not
that I'd recommend that, but it may be easier to review).
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
David Watson added the comment:
Attaching the C test programs I forgot to attach yesterday -
sorry about that. I've also tried these programs, and the
patches, on FreeBSD 5.3 (an old version from late 2004). I found
that it accepted unterminated addresses as well, and unlike Linux
it di
Changes by David Watson :
Added file:
David Watson added the comment:
@ Victor Stinner: Yes, the behaviour of those functions is as you
describe - it's been changed since I filed this issue. I do
consider it an improvement.
By the password database, I mean /etc/passwd or replacements that
are accessible via getpwnam() and fr
New submission from David Watson :
The pwd (and spwd and grp) modules deal with data from
/etc/passwd (and/or other sources) that can be supplied by users
on the system. Specifically, users can often change the data in
their GECOS fields without the OS requiring that it conform to a
Changes by David Watson :
Added file:
Changes by David Watson :
Added file:
David Watson added the comment:
> baikie: Open a separated issue for the refcount error and fd leak.
OK. It does affect 2.x as well, come to think of it.
> On Ubuntu, it's not possible to create a user with a non-ASCII
> name:
> $ sudo adduser é --no-create-home
New submission from David Watson :
When investigating issue #4859 I found that when pwd.getpwall()
and grp.getgrall() fail due to decoding errors, they leave open
file descriptors referring to the passwd and group files, since
they don't call the end*ent() functions in this case. Also, th