Re: [Python-Dev] Sharing docstrings between the Python and C implementations of a module

2013-04-16 Thread Stephen J. Turnbull
Skip Montanaro writes:

 > > Would it make sense to think about adding this in the scope of
 > > the argument clinic work, or is it too unrelated? This seems like
 > > a commonly needed thing for large parts of the stdlib (where the
 > > C accelerator overrides Python code).
 > 
 > Or maybe separate doc strings from both code bases altogether and
 > insert them where appropriate as part of the build process?

Experience with gettext vs other kinds of message catalogs for
localization suggests that's a really painful approach to take.

It's not entirely clear to me that this whole effort isn't a premature
optimization.  Eventually (next 5 to 15 years? long run, anyway) we'll
probably localize, and *most* messages will be shared (from gettext
.mo files) anyway.  (Yes, I recognize that space is not the most
important aspect of sharing docstrings, and that it's likely shared
docstrings can be automatically shared by gettext.  We should take
care that that's so.)

The other thing that occurs to me is that something like gettext may
be the way to deal with these issues anyway.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] mimetypes broken on Windows

2013-04-16 Thread Terry Jan Reedy

On 4/15/2013 10:04 PM, Ben Hoyt wrote:

Hi folks,

The built-in mimetypes module is broken on Windows, and it has been
since Python 2.7 alpha 1. On all Windows systems I've tried,
guess_type() returns the wrong mime type for common types like .png and
.jpg. For example (on Python 2.7.4 and 3.3.1):

>>> import mimetypes
>>> mimetypes.guess_type('f.png')
('image/x-png', None)
>>> mimetypes.guess_type('f.jpg')
('image/pjpeg', None)

These should be 'image/png' and 'image/jpeg', respectively.

There's an open issue for this: http://bugs.python.org/issue15207.
However, it hasn't gotten any love in the last few months, so per
r.david.murray's comment, I'm posting it here.

Dave Chambers, who opened the bug, has proposed a fix, which is
significantly better (i.e., not totally broken for common types).
However, as I mentioned in http://bugs.python.org/issue15207#msg177030,
using the Windows registry for this at all is basically a bad idea, because:


The actual mapping is fixed and more or less system independent while 
the windows registry is for volatile system and user dependent mappings.



1) Important keys like .jpg and .png aren't in the registry anyway.
2) Some that do exist are wrong in the Windows registry. This includes
.zip, which is "application/x-zip-compressed" (at least in my registry)
but should be "application/zip".
3) It makes the first call to guess_type() slow (~100ms), which isn't
terrible, but with the above concerns, not worth it.
4) Perhaps most importantly: the keys in the Windows registry depend on
what programs you have installed. And the users and programs can change
registry keys at will.


And change what a given key is mapped to.


Obviously one can work around this bug, either by calling
mimetypes.init(files=[]) before any calls to guess_type, or calling
init() with your own mime types file. However, "broken out of the box"
is going to cause a lot of people headaches. :-)
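
A minimal sketch of the workaround described above. The init(files=[])
call is the one from the post; whether it actually bypasses the registry
depends on the Python version, so this sketch also pins a few types with
mimetypes.add_type(). The KNOWN_GOOD table is an assumption made for
illustration, not something the module ships.

```python
import mimetypes

# Workaround from the post: initialise with no extra files before the
# first call to guess_type(). Behaviour may differ between versions.
mimetypes.init(files=[])

# Extra step (an assumption of this sketch): pin the types the post says
# come out wrong, so they no longer depend on what the registry holds.
KNOWN_GOOD = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".zip": "application/zip",
}
for ext, mime in KNOWN_GOOD.items():
    mimetypes.add_type(mime, ext)

print(mimetypes.guess_type("f.png"))
print(mimetypes.guess_type("f.jpg"))
```

Pinning only helps for extensions you list explicitly; anything else
still comes from whatever sources the module consulted at init time.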

So my proposal is simply to get rid of read_windows_registry()
altogether, and fall back to the default type mapping in mimetypes.py on
Windows systems. This is correct and fast, even if not complete. As


I basically agree, but am not sure what to do about back-compatibility
considerations. But we do not have to reproduce buggy behavior.



always, folks can use their own mimetypes file if they want.

In summary: the current behaviour is buggy and broken, the behaviour
proposed in Issue 15207 is problematic, getting this from the Windows
registry is a bad idea, and we should revert the whole registry thing. :-)

If folks agree with my reasoning above, I can provide a patch to fix
this, along with a patch to the Windows unit tests.

-Ben

P.S. Kind of proving my point about the fragility of using the registry,
the Python 2.7.4 unit test test_registry_parsing in test_mimetypes.py
fails on my machine. It's because I've installed some SQL server, and
text/plain in my registry is mapped from .sql (instead of .txt), causing
this:

Traceback (most recent call last):
  File "C:\python27\lib\test\test_mimetypes.py", line 85, in test_registry_parsing
    eq(self.db.guess_type("foo.txt"), ("text/plain", None))
AssertionError: Tuples differ: (None, None) != ('text/plain', None)





Re: [Python-Dev] mimetypes broken on Windows

2013-04-16 Thread R. David Murray
On Tue, 16 Apr 2013 14:00:53 -0400, Terry Jan Reedy wrote:
> On 4/15/2013 10:04 PM, Ben Hoyt wrote:
> > So my proposal is simply to get rid of read_windows_registry()
> > altogether, and fall back to the default type mapping in mimetypes.py on
> > Windows systems. This is correct and fast, even if not complete. As
> 
> I basically agree, but am not sure what to do about back-compatibility
> considerations. But we do not have to reproduce buggy behavior.

I basically agree as well, but as a non-windows user I'm not willing
to commit any change without approval from a committer who actually
understands what's going on.

My understanding is that referencing the windows registry is a relatively
new feature (I'm not sure exactly how new), and that it is itself causing
more backward compatibility problems than would likely be caused by
removing it.  But as I said, I'm not enough of a Windows expert to be
comfortable making that decision.

I'm glad this was brought up on python-dev; it's been nagging at me
that this issue hasn't been getting resolved.

--David


Re: [Python-Dev] mimetypes broken on Windows

2013-04-16 Thread Ben Hoyt
(Sorry if this reply doesn't thread as I intend -- I wasn't configured
to get python-dev emails, so I'm replying to my original with
copy-n-paste.)

On Tue, 16 Apr 2013 14:00:53 -0400, Terry Jan Reedy wrote:
> On 4/15/2013 10:04 PM, Ben Hoyt wrote:
> > So my proposal is simply to get rid of read_windows_registry()
> > altogether, and fall back to the default type mapping in mimetypes.py on
> > Windows systems. This is correct and fast, even if not complete. As
>
> I basically agree, but am not sure what to do about back-compatibility
> considerations. But we do not have to reproduce buggy behavior.

Agreed. What we have is just plain wrong. Dave Chambers' fix is
better, but still problematic.

What we *could* do is implement Dave Chambers' fix in
read_windows_registry(), but not call this by default. So a user would
have to explicitly call it if they really want Windows registry.
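
To make the opt-in idea concrete, here is a rough sketch of what it
could look like from the caller's side. build_type_db() and its
use_registry flag are hypothetical names for illustration only;
MimeTypes.read_windows_registry() is the existing method. On versions
where module initialisation itself reads the registry, a fresh instance
may still inherit those entries, so this only shows the calling
convention, not a fix.

```python
import sys
import mimetypes


def build_type_db(use_registry=False):
    """Return a MimeTypes instance, reading the registry only on request.

    Hypothetical helper for illustration; not part of the mimetypes module.
    """
    db = mimetypes.MimeTypes()
    if use_registry and sys.platform == "win32":
        # Explicit opt-in: merge registry entries into this instance only.
        db.read_windows_registry()
    return db


# Default: no explicit registry read.
db = build_type_db()
print(db.guess_type("f.png"))
```

A caller who really wants the Windows registry would ask for it with
build_type_db(use_registry=True); everyone else gets the built-in table.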

But I actually don't think even that's necessary. I honestly can't see
how anyone will be "depending" on the current behaviour, as it's just
plain buggy (.png and .jpg give the wrong mime type). So I don't think
backwards-compatibility is an issue here.

As R. David Murray mentioned, reading the registry is quite new
(Python 2.7 alpha 1, I believe), and has caused several problems
already. There have been encoding issues, and there's even a duplicate of
issue 15207, "part 3" of http://bugs.python.org/issue10551

But yes, I would love to see a Windows Python committer chip in, even
if it's just with "agreed, please provide a patch".

-Ben