Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-05 Thread Nick Coghlan
Brett Cannon wrote:
> If we add a new method like get_filenames(), I would suggest going
> with Antoine's suggestion of a tuple for __compiled__ (allowing
> loaders to indicate that they actually constructed the runtime
> bytecode from multiple cached files on-disk).
> 
> 
> Does code exist out there where people are constructing bytecode from
>  multiple files for a single module?

I'm quite prepared to call YAGNI on that idea and just return a 2-tuple
of source filename and compiled filename.

The theoretical use case was for a module that was partially compiled to
native code in advance, so its "compiled" version was a combination of
a shared library and a bytecode file. It isn't really all that
compelling an idea - it would be easy enough for a loader to pick one or
the other and stick that in __compiled__.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
---
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] IO module improvements

2010-02-05 Thread Pascal Chambon

Hello

The new modular io system of Python is awesome, but I'm currently running
into some of its limits while replacing the raw FileIO with a more
advanced stream.
So here are a few ideas and questions regarding the mechanisms of this
IO system. Note that I'm speaking in Python terms, but these ideas
should also apply to the C implementation (with more programming hassle,
of course).


- some streams have specific attributes (e.g. mode, name...), but since
they'll often be wrapped inside buffering or encoding streams, these
attributes will not be available to the end user.


So wouldn't it be great to implement some "transversal inheritance",
simply by delegating attribute retrievals that fail on the current stream
to the underlying buffer/raw stream? A little __getattr__
should do it fine, shouldn't it?
By the way, I'm having trouble with the "name" attribute of raw files,
which can be a string or an integer (confusing), is ambiguous if it contains
a relative path, and can't handle the new case of my
library, i.e. opening a file from an existing file handle (which is ALSO
an integer, like C file descriptors...). I propose we deprecate it in
favor of more precise attributes, like "path" (absolute path) and
"origin" (which can be "path", "fileno", "handle" and can be extended...).
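
A minimal sketch of the __getattr__ delegation idea, assuming a
hypothetical DelegatingWrapper class and a stand-in NamedRaw stream (neither
exists in the io module); it only illustrates the lookup fall-through, not
a real buffering layer:

```python
import io


class DelegatingWrapper:
    """Hypothetical wrapper: lookups that fail here fall through to the inner stream."""

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, attr):
        # __getattr__ is only invoked after normal attribute lookup has failed,
        # so attributes defined on the wrapper itself always win.
        return getattr(self._inner, attr)


class NamedRaw(io.BytesIO):
    """Stand-in for a raw stream carrying an extra attribute (illustrative only)."""
    origin = "path"


# Two wrapper levels, as with a text layer over a buffering layer.
wrapped = DelegatingWrapper(DelegatingWrapper(NamedRaw(b"abc")))
print(wrapped.origin)
print(wrapped.read())
```

Note that this forwards every missing attribute, including standard IO
methods the wrapper forgot to define.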


Methods too would deserve some auto-forwarding. If you want to buffer
a raw stream which also offers size(), times(), lock_file() and other
methods, how can these be accessed from a top-level buffering/text
stream? So it would be interesting to have a system through which a
stream can expose its additional features to top-level streams, and at
the same time tell them whether they must flush() before calling
these new methods (e.g. asking for the inode number of a file doesn't
require flushing, but knowing its real size DOES require it).


- I feel thread-safety locking and stream status checking are
currently overly complicated. All methods are filled with locking calls
and CheckClosed() calls, which is both a performance loss (most io
streams will have 3 such levels of locking, when 1 would suffice) and
error-prone (some time ago I saw several functions in the sources in
which checks and locks seemed to be missing).
Since we're in the mood for nesting streams anyway, why not simply
add a "safety stream" on top of each stream chain returned by open()?
That layer could gracefully handle mutex locking, CheckClosed() calls,
and maybe even the attribute/method forwarding I mentioned above. I
know a pure metaprogramming solution might not suffice for
performance-seekers, but static implementations should be doable as well.


- some semantic decisions of the current system are somewhat dangerous.
For example, flushing errors occurring on close are swallowed. It seems
to me that it's of the utmost importance that the user be warned if the
bytes he wrote disappeared before reaching the kernel; shouldn't we
firmly enforce a "don't hide errors" policy everywhere in the io module?


Regards,
Pascal





Re: [Python-Dev] IO module improvements

2010-02-05 Thread Antoine Pitrou
Pascal Chambon  gmail.com> writes:
> 
> By the way, I'm having trouble with the "name" attribute of raw files, 
> which can be string or integer (confusing), ambiguous if containing a 
> relative path, and which isn't able to handle the new case of my 
> library, i.e opening a file from an existing file handle (which is ALSO 
> an integer, like C file descriptors...)

What is the difference between "file handle" and a regular C file descriptor?
Is it some Windows-specific thing?
If so, then perhaps it deserves some Windows-specific attribute ("handle"?).

> Methods too would deserve some auto-forwarding. If you want to bufferize 
> a raw stream which also offers size(), times(), lock_file() and other 
> methods, how can these be accessed from a top-level buffering/text 
> stream ?

I think it's a bad idea. If you forget to implement one of the standard IO
methods (e.g. seek()), it will get forwarded to the raw stream, but with the
wrong semantics (because it won't take buffering into account).

It's better to require the implementor to do the forwarding explicitly if
desired, IMO.
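
A small sketch of why blind forwarding gives the wrong semantics, assuming
a throwaway temporary file and a hypothetical helper name: after a short
buffered read, the raw stream's position already reflects the read-ahead,
so a tell() or seek() forwarded to the raw stream would disagree with the
buffered view.

```python
import os
import tempfile


def positions_after_small_read(payload=b"x" * 100, n=10):
    """Return (buffered position, raw position) after reading n bytes."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(payload)
        # open(..., "rb") returns an io.BufferedReader wrapping an io.FileIO.
        with open(path, "rb") as buffered:
            buffered.read(n)
            # The buffered layer reports the logical position; the raw layer
            # reports how far the read-ahead has actually advanced.
            return buffered.tell(), buffered.raw.tell()
    finally:
        os.remove(path)


logical, raw = positions_after_small_read()
print("buffered position:", logical)
print("raw position:", raw)
```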

> - I feel thread-safety locking and stream status checking are 
> currently overly complicated. All methods are filled with locking calls 
> and CheckClosed() calls, which is both a performance loss (most io 
> streams will have 3 such levels of locking, when 1 would suffice)

FileIO objects don't have a lock, so there are 2 levels of locking at worst, not
3 (and, actually, TextIOWrapper doesn't have a lock either, although perhaps it
should).
As for the checkClosed() calls, they are probably cheap, especially if they
bypass regular attribute lookup.

> Since we're anyway in a mood of imbricating streams, why not simply 
> adding a "safety stream" on top of each stream chain returned by open() 
> ? That layer could gracefully handle mutex locking, CheckClosed() calls, 
> and even, maybe, the attribute/method forwarding I evocated above.

It's an interesting idea, but it could also end up slower than the current
situation.
First because you are adding a level of indirection (i.e. additional method
lookups and method calls).
Second because currently the locks aren't always taken. For example, in
BufferedReader, we needn't take the lock when the requested data is available
in our buffer (the GIL already protects us). Having a separate "synchronizing"
wrapper would forbid such micro-optimizations.

If you want to experiment with this, you can use iobench (in the Tools
directory) to measure file IO performance.
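
For illustration, a rough sketch of what such a separate "synchronizing"
wrapper might look like, assuming a hypothetical SynchronizedStream class
(not part of the io module). It shows both costs mentioned above: every call
goes through an extra lookup and an extra function call, and the lock is
taken unconditionally, even when the data is already buffered.

```python
import io
import threading


class SynchronizedStream:
    """Hypothetical wrapper that serializes every method call on a stream."""

    def __init__(self, stream):
        self._stream = stream
        self._lock = threading.RLock()

    def __getattr__(self, attr):
        target = getattr(self._stream, attr)
        if not callable(target):
            # Plain attributes (closed, name, ...) are returned without locking.
            return target

        def locked_call(*args, **kwargs):
            # The wrapper cannot know whether the inner call really needs
            # the lock, so it always takes it.
            with self._lock:
                return target(*args, **kwargs)

        return locked_call


stream = SynchronizedStream(io.BytesIO())
stream.write(b"abc")
print(stream.getvalue())
```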

> - some semantic decisions of the current system are somehow dangerous. 
> For example, flushing errors occurring on close are swallowed. It seems 
> to me that it's of the utmost importance that the user be warned if the 
> bytes he wrote disappeared before reaching the kernel ; shouldn't we 
> decidedly enforce a "don't hide errors" everywhere in the io module ?

It may be a bug. Can you report it, along with a script or test showcasing it?
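
A possible starting point for such a test, sketched under the assumption
of a hypothetical FailingRaw stream whose writes always fail; the function
reports whether close() on the buffering layer lets the flush error reach
the caller on the interpreter it is run with:

```python
import io


class FailingRaw(io.RawIOBase):
    """Hypothetical raw stream whose writes always fail (illustrative only)."""

    def writable(self):
        return True

    def write(self, b):
        raise OSError("simulated write failure")


def close_reports_flush_error():
    """Return True if close() propagates the error raised while flushing."""
    buffered = io.BufferedWriter(FailingRaw())
    buffered.write(b"data")  # small write: stays in the buffer for now
    try:
        buffered.close()     # close() has to flush the pending bytes
    except OSError:
        return True
    return False


print("flush error reported on close:", close_reports_flush_error())
```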

Regards

Antoine.




Re: [Python-Dev] IO module improvements

2010-02-05 Thread Guido van Rossum
On Fri, Feb 5, 2010 at 5:28 AM, Antoine Pitrou  wrote:
> Pascal Chambon  gmail.com> writes:
>>
>> By the way, I'm having trouble with the "name" attribute of raw files,
>> which can be string or integer (confusing), ambiguous if containing a
>> relative path,

Why is it ambiguous? It sounds like you're using str() of the name and
then can't tell whether the file is named e.g. '1' or whether it
refers to file descriptor 1 (i.e. sys.stdout).
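
A short sketch of telling the two cases apart by type rather than by
str(), assuming a hypothetical helper name_origin() whose "path"/"fileno"
labels are borrowed from the proposal quoted above:

```python
import os
import tempfile


def name_origin(stream):
    """Classify stream.name without converting it to a string first."""
    name = stream.name
    if isinstance(name, int):
        # Opened from a file descriptor: name is the descriptor itself.
        return ("fileno", name)
    return ("path", name)


fd, path = tempfile.mkstemp()
with open(path, "rb") as by_path:
    print(name_origin(by_path))
with open(fd, "rb") as by_fd:   # closing this stream also closes fd
    print(name_origin(by_fd))
os.remove(path)
```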

>> and which isn't able to handle the new case of my
>> library, i.e opening a file from an existing file handle (which is ALSO
>> an integer, like C file descriptors...)
>
> What is the difference between "file handle" and a regular C file descriptor?
> Is it some Windows-specific thing?
> If so, then perhaps it deserves some Windows-specific attribute ("handle"?).

Make it mirror the fileno() attribute.

>> Methods too would deserve some auto-forwarding. If you want to bufferize
>> a raw stream which also offers size(), times(), lock_file() and other
>> methods, how can these be accessed from a top-level buffering/text
>> stream ?
>
> I think it's a bad idea. If you forget to implement one of the standard IO
> methods (e.g. seek()), it will get forwarded to the raw stream, but with the
> wrong semantics (because it won't take buffering into account).
>
> It's better to require the implementor to do the forwarding explicitly if
> desired, IMO.

Agreed. If an underlying stream has a certain property that doesn't
mean the above stream has the same property. Calling methods on the
underlying stream that move the file position may wreak havoc on the
buffer consistency of the above stream. Etc., etc. Please don't do
this. Antoine has the right idea.

>> - I feel thread-safety locking and stream status checking are
>> currently overly complicated. All methods are filled with locking calls
>> and CheckClosed() calls, which is both a performance loss (most io
>> streams will have 3 such levels of locking, when 1 would suffice)
>
> FileIO objects don't have a lock, so there are 2 levels of locking at worst,
> not 3 (and, actually, TextIOWrapper doesn't have a lock either, although
> perhaps it should).
> As for the checkClosed() calls, they are probably cheap, especially if they
> bypass regular attribute lookup.
>
>> Since we're anyway in a mood of imbricating streams, why not simply
>> adding a "safety stream" on top of each stream chain returned by open()
>> ? That layer could gracefully handle mutex locking, CheckClosed() calls,
>> and even, maybe, the attribute/method forwarding I evocated above.
>
> It's an interesting idea, but it could also end up slower than the current
> situation.
> First because you are adding a level of indirection (i.e. additional method
> lookups and method calls).
> Second because currently the locks aren't always taken. For example, in
> BufferedReader, we needn't take the lock when the requested data is
> available in our buffer (the GIL already protects us). Having a separate
> "synchronizing" wrapper would forbid such micro-optimizations.
>
> If you want to experiment with this, you can use iobench (in the Tools
> directory) to measure file IO performance.
>
>> - some semantic decisions of the current system are somehow dangerous.
>> For example, flushing errors occurring on close are swallowed. It seems
>> to me that it's of the utmost importance that the user be warned if the
>> bytes he wrote disappeared before reaching the kernel ; shouldn't we
>> decidedly enforce a "don't hide errors" everywhere in the io module ?
>
> It may be a bug. Can you report it, along with a script or test showcasing it?
>
> Regards
>
> Antoine.



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] IO module improvements

2010-02-05 Thread exarkun

On 03:57 pm, [email protected] wrote:
>On Fri, Feb 5, 2010 at 5:28 AM, Antoine Pitrou  wrote:
>>Pascal Chambon  gmail.com> writes:
>>>By the way, I'm having trouble with the "name" attribute of raw files,
>>>which can be string or integer (confusing), ambiguous if containing a
>>>relative path,
>
>Why is it ambiguous? It sounds like you're using str() of the name and
>then can't tell whether the file is named e.g. '1' or whether it
>refers to file descriptor 1 (i.e. sys.stdout).

I think string/integer and ambiguity were different points.  Here's the
ambiguity:

   exar...@boson:~$ python
   Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15)
   [GCC 4.4.1] on linux2
   Type "help", "copyright", "credits" or "license" for more information.
   >>> import os, io
   >>> f = io.open('.bashrc')
   >>> os.chdir('/')
   >>> f.name
   '.bashrc'
   >>> os.path.abspath(f.name)
   '/.bashrc'
   >>>

Jean-Paul


[Python-Dev] Summary of Python tracker Issues

2010-02-05 Thread Python tracker

ACTIVITY SUMMARY (01/29/10 - 02/05/10)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2602 open (+38) / 17079 closed (+15) / 19681 total (+53)

Open issues with patches:  1069

Average duration of open issues: 707 days.
Median duration of open issues: 461 days.

Open Issues Breakdown
   open    2568 (+38)
pending      33 ( +0)

Issues Created Or Reopened (57)
___

allow unicode keyword args                                       02/04/10
       http://bugs.python.org/issue4978    reopened barry
       patch, needs review

A selection of spelling errors and typos throughout source       02/04/10
       http://bugs.python.org/issue5341    reopened ezio.melotti
       patch

enable compilation of readline module on Mac OS X 10.5 and 10.6  02/04/10
       http://bugs.python.org/issue6877    reopened barry
       patch, 26backport, needs review

segfault when deleting from a list using slice with very big `st 02/03/10
       http://bugs.python.org/issue7788    reopened mark.dickinson
       patch

test_macostools fails on OS X 10.6: no attribute 'FSSpec'        01/29/10
       http://bugs.python.org/issue7807    created  mark.dickinson

test_bsddb3 leaks references                                     01/29/10
       http://bugs.python.org/issue7808    created  flox
       patch

Documentation for random module should indicate that a call to s 01/30/10
CLOSED http://bugs.python.org/issue7809    created  Justin.Lebar

fix_callable breakage                                            01/30/10
CLOSED http://bugs.python.org/issue7810    created  loewis

[decimal] ValueError -> TypeError in from_tuple                  01/30/10
       http://bugs.python.org/issue7811    created  skrah

Call to gestalt('sysu') on OSX can lead to freeze in wxPython ap 01/30/10
       http://bugs.python.org/issue7812    created  phansen

Bug in command-line module launcher                              01/30/10
       http://bugs.python.org/issue7813    created  pakal
       patch

SimpleXMLRPCServer Example uses "mul" instead of "div" in client 01/30/10
CLOSED http://bugs.python.org/issue7814    created  mnewman

Regression in unittest traceback formating extensibility         01/30/10
       http://bugs.python.org/issue7815    created  gz

test_capi crashes when run with "-R"                             01/30/10
CLOSED http://bugs.python.org/issue7816    created  flox
       patch

Pythonw.exe fails to start                                       01/31/10
CLOSED http://bugs.python.org/issue7817    created  ZDan

Improve set().test_c_api(): don't expect a set("abc"), modify th 01/31/10
       http://bugs.python.org/issue7818    created  haypo
       patch

sys.call_tracing(): check arguments type                         01/31/10
CLOSED http://bugs.python.org/issue7819    created  haypo
       patch

parser: restores all bytes in the right order if check_bom() fai 01/31/10
       http://bugs.python.org/issue7820    created  haypo
       patch

Command line option -U not documented                            01/31/10
CLOSED http://bugs.python.org/issue7821    created  stevenjd


Re: [Python-Dev] Rational for PEP 3147 (PYC Respository Directories)

2010-02-05 Thread Jan Matějek
On 3.2.2010 18:39, Antoine Pitrou wrote:
> Neil Schemenauer  arctrix.com> writes:
>>
>> Thanks for doing the work of writing a PEP.  The rationale section
>> could use some strengthening, I think.  Who is benefiting from this
>> feature?  Is it the distribution package maintainers?  Maybe people
>> who use a distribution packaged Python and install packages from
>> PyPI.  It's not clear to me, anyhow.
> 
> It would also be nice to have other packagers' take on this (Redhat, Mandriva,
> etc.). But of course you aren't responsible if they don't show up.

As the SUSE guy, I don't care either way. This simply has no benefits or
drawbacks for us.

This solution can only be beneficial in systems like Debian's
python-support, where you byte-compile when installing. We byte-compile
at build time, so if we wanted to support more than one Python within
one package, we would need to distribute an rpm full of different .pycs
for all supported Python versions. Yes, that was not possible before and
it is possible with this PEP, but there is no sense in doing it ;)

That said, I don't particularly care whether the installed pycs are in a
separate __pycache__ directory or next to their sources.
(There were very good arguments in the other thread against subdir
clutter; one more from me: each subdirectory has a separate entry in the
rpm database, so by creating subdir clutter you're also cluttering our
packaging system.)

+0 from me

regards
m.

> 
> cheers
> 
> Antoine.
> 
> 


Re: [Python-Dev] IO module improvements

2010-02-05 Thread Guido van Rossum
On Fri, Feb 5, 2010 at 8:46 AM,   wrote:
> On 03:57 pm, [email protected] wrote:
>>
>> On Fri, Feb 5, 2010 at 5:28 AM, Antoine Pitrou 
>> wrote:
>>>
>>> Pascal Chambon  gmail.com> writes:
>>>>
>>>> By the way, I'm having trouble with the "name" attribute of raw files,
>>>> which can be string or integer (confusing), ambiguous if containing a
>>>> relative path,
>>
>> Why is it ambiguous? It sounds like you're using str() of the name and
>> then can't tell whether the file is named e.g. '1' or whether it
>> refers to file descriptor 1 (i.e. sys.stdout).
>
> I think string/integer and ambiguity were different points.  Here's the
> ambiguity:
>
>   exar...@boson:~$ python
>   Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15)    [GCC 4.4.1] on linux2
>   Type "help", "copyright", "credits" or "license" for more information.
>   >>> import os, io
>   >>> f = io.open('.bashrc')
>   >>> os.chdir('/')
>   >>> f.name
>   '.bashrc'
>   >>> os.path.abspath(f.name)
>   '/.bashrc'
>   >>>
> Jean-Paul

You're right, I didn't see the OP's comma. :-)

I don't think this can be helped though -- I really don't want open()
to be slowed down or complicated by an attempt to do path
manipulation. If this matters to the app author they should use
os.path.abspath() or os.path.realpath() or whatever before calling
open().
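
A minimal sketch of that advice, assuming a hypothetical application-side
helper open_absolute(); the path is resolved once, before open() is
called, so the stream's name stays meaningful after a later os.chdir():

```python
import os
import tempfile


def open_absolute(path, *args, **kwargs):
    """Resolve path against the current directory, then open it."""
    return open(os.path.abspath(path), *args, **kwargs)


def demo():
    """Open a relative path, change directory, and return the stream's name."""
    previous = os.getcwd()
    workdir = tempfile.mkdtemp()
    try:
        os.chdir(workdir)
        with open("sample.txt", "w") as out:
            out.write("x")
        expected = os.path.abspath("sample.txt")
        with open_absolute("sample.txt") as f:
            os.chdir(previous)          # the name recorded on f is unaffected
            return f.name, expected
    finally:
        os.chdir(previous)
        os.remove(os.path.join(workdir, "sample.txt"))
        os.rmdir(workdir)


name, expected = demo()
print(name == expected)
```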

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] IO module improvements

2010-02-05 Thread Christian Heimes
Guido van Rossum wrote:
> You're right, I didn't see the OP's comma. :-)
> 
> I don't think this can be helped though -- I really don't want open()
> to be slowed down or complicated by an attempt to do path
> manipulation. If this matters to the app author they should use
> os.path.abspath() or os.path.realpath() or whatever before calling
> open().

I had the idea to add a property that returns the file name based on the
file descriptor. However, there isn't a straightforward way to look up the
file based on the fd on POSIX OSes. fstat() returns only the inode and
device. The combination of inode + device references 0 to n files due to
anonymous files and hard links. On POSIX OSes with a /proc file system
it's possible to do a reverse lookup by (ab)using /proc/self/fd/, but
that's a hack.

>>> import os
>>> f = open("/etc/passwd")
>>> fd = f.fileno()
>>> os.readlink("/proc/self/fd/%i" % fd)
'/etc/passwd'

On Windows it's possible to get the file name from the handle with
GetFileInformationByHandleEx().

This doesn't strike me as a feasible option ...
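
To illustrate the fstat() limitation mentioned above, a small sketch with a
hypothetical helper file_identity(): the (device, inode) pair identifies
the underlying file but carries no name, so two descriptors for the same
file compare equal without telling which path was used to open them.

```python
import os
import tempfile


def file_identity(fd):
    """Return the (device, inode) pair that fstat() reports for fd."""
    st = os.fstat(fd)
    return (st.st_dev, st.st_ino)


def same_file_opened_twice():
    """Open one temporary file twice and compare the two identities."""
    fd, path = tempfile.mkstemp()
    try:
        with open(path, "rb") as second:
            return file_identity(fd) == file_identity(second.fileno())
    finally:
        os.close(fd)
        os.remove(path)


print(same_file_opened_twice())
```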

Christian


Re: [Python-Dev] IO module improvements

2010-02-05 Thread Guido van Rossum
On Fri, Feb 5, 2010 at 3:47 PM, Christian Heimes  wrote:
> I had the idea to add a property that returns the file name based on the
> file descriptor. However there isn't a plain way to lookup the file
> based on the fd on POSIX OSes. fstat() returns only the inode and
> device. The combination of inode + device references 0 to n files due to
> anonymous files and hard links. On POSIX OSes with a /proc file systems
> it's possible to do a reverse lookup by (ab)using /proc/self/fd/, but
> that's a hack.
>
> >>> import os
> >>> f = open("/etc/passwd")
> >>> fd = f.fileno()
> >>> os.readlink("/proc/self/fd/%i" % fd)
> '/etc/passwd'
>
> On Windows it's possible to get the file name from the handle with
> GetFileInformationByHandleEx().
>
> This doesn't strike me as a feasible options ...

It's good to know about such options, but I really don't like to add
such brittle APIs to the standard I/O objects. So, agreed, this is not
feasible.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] IO module improvements

2010-02-05 Thread Tres Seaver
Antoine Pitrou wrote:
> Pascal Chambon  gmail.com> writes:
>> By the way, I'm having trouble with the "name" attribute of raw files, 
>> which can be string or integer (confusing), ambiguous if containing a 
>> relative path, and which isn't able to handle the new case of my 
>> library, i.e opening a file from an existing file handle (which is ALSO 
>> an integer, like C file descriptors...)
> 
> What is the difference between "file handle" and a regular C file descriptor?
> Is it some Windows-specific thing?
> If so, then perhaps it deserves some Windows-specific attribute ("handle"?).

File descriptors are integer indexes into a process-specific table.
File handles are pointers to opaque structs which contain other
information the kernel knows about the file.  MS Windows muddies the
distinction, using "file handle" to refer to the integer index.


[1] http://en.wikipedia.org/wiki/File_handle
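
A small sketch of the "integer index" view of file descriptors, using
only os-level calls and a hypothetical helper name: os.dup() hands back a
second, different integer that refers to the same open file, which can be
seen from the shared file offset.

```python
import os
import tempfile


def descriptor_demo():
    """Return (fd is an int, dup differs from fd, offset seen via the dup)."""
    fd, path = tempfile.mkstemp()
    try:
        dup_fd = os.dup(fd)   # a new table entry for the same open file
        try:
            os.write(fd, b"hello")
            # The duplicate shares the file offset with the original.
            offset = os.lseek(dup_fd, 0, os.SEEK_CUR)
            return isinstance(fd, int), fd != dup_fd, offset
        finally:
            os.close(dup_fd)
    finally:
        os.close(fd)
        os.remove(path)


print(descriptor_demo())
```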


Tres.
-- 
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"   http://palladion.com



Re: [Python-Dev] IO module improvements

2010-02-05 Thread David Cournapeau
On Sat, Feb 6, 2010 at 4:31 PM, Tres Seaver  wrote:
> Antoine Pitrou wrote:
>> Pascal Chambon  gmail.com> writes:
>>> By the way, I'm having trouble with the "name" attribute of raw files,
>>> which can be string or integer (confusing), ambiguous if containing a
>>> relative path, and which isn't able to handle the new case of my
>>> library, i.e opening a file from an existing file handle (which is ALSO
>>> an integer, like C file descriptors...)
>>
>> What is the difference between "file handle" and a regular C file descriptor?
>> Is it some Windows-specific thing?
>> If so, then perhaps it deserves some Windows-specific attribute ("handle"?).
>
> File descriptors are integer indexes into a process-specific table.

AFAIK, they aren't simple indexes on Windows, and that's partly why
even file descriptors cannot be safely passed between C runtimes on
Windows (whereas they can in most Unices).

David