Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-25 Thread Charles-François Natali
2013/11/25 Greg Ewing :
> Ben Hoyt wrote:
>>
>> However, it seems there was no further discussion about why not
>> "extension" and "extensions"? I have never heard a filename extension
>> being called a "suffix".
>
>
> You can't have read many unix man pages, then! I just
> searched for "suffix" in the gcc man page, and found
> this:
>
> For any given input file, the file name suffix determines what kind of
> compilation is done:
>
>
>> I know it is a suffix in the sense of the
>> English word, but I've never heard it called that in this context, and
>> I think context is important.
>
>
> This probably depends on your background. In my experience,
> the term "extension" arose in OSes where it was a formal
> part of the filename syntax, often highly constrained.
> E.g. RT11, CP/M, early MS-DOS.
>
> Unix has never had a formal notion of extensions like that,
> only informal conventions, and has called them suffixes at
> least some of the time for as long as I can remember.

Indeed.
Just for reference, here's an extract of POSIX basename(1) man page [1]:
"""
SYNOPSIS

basename string [suffix]

DESCRIPTION

The string operand shall be treated as a pathname, as defined in XBD
Pathname. The string string shall be converted to the filename
corresponding to the last pathname component in string and then the
suffix string suffix, if present, shall be removed.
"""

[1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/basename.html
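
For illustration only, here is a minimal Python sketch of the "suffix" terminology under discussion. It uses the `suffix`, `suffixes` and `stem` attributes of pathlib's pure paths. The helper `posix_basename` is a hypothetical name invented for this sketch; it only approximates the basename(1) behaviour quoted above and ignores edge cases such as a bare "/".

```python
from pathlib import PurePosixPath


def posix_basename(string, suffix=""):
    """Rough approximation of POSIX basename(1); hypothetical helper."""
    # Last pathname component of the operand.
    name = PurePosixPath(string).name
    # basename(1) leaves the name alone when it is identical to the suffix.
    if suffix and name != suffix and name.endswith(suffix):
        name = name[: -len(suffix)]
    return name


path = PurePosixPath("/usr/src/archive.tar.gz")
# name, stem, last suffix, and the list of all suffixes
parts = (path.name, path.stem, path.suffix, path.suffixes)

stripped = posix_basename("/usr/src/main.c", ".c")
untouched = posix_basename("/usr/src/.c", ".c")
```

Here `suffix` is only the last dotted component, while `suffixes` lists all of them, which mirrors the singular/plural naming question raised in the thread.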


cf
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-25 Thread Nick Coghlan
On 25 Nov 2013 13:18, "Ben Hoyt"  wrote:
>
> > The idea of the rich stat result object is that it has all that info
> > prepopulated, based on an initial stat call. "Caching" it amounts to
> > "keep a reference to it".
> >
> > It is suggested that it would be a subset of the pathlib.Path API:
> > http://bugs.python.org/issue19725
> >
> > If it's also a superset of the existing stat object API, then at least
> > Path.stat and Path.lstat (and perhaps the lower level APIs) can be
> > updated to return it in 3.5.
>
> Got it.
>
> >> "Worst case", we can add os.scandir() separately, which returns
> >> DirEntry, "path-like" objects.
> >
> > Indeed, we may still want such an object API, since dirent doesn't
> > provide full stat info.
>
> I'm not quite sure what you're suggesting here.
>
> In any case, I'm going to modify my scandir() so its DirEntry objects
> are closer to pathlib.Path, particularly:
>
> * isdir() -> is_dir()
> * isfile() -> is_file()
> * islink() -> is_symlink()
> * add is_socket(), is_fifo(), is_block_device(), and is_char_device()
>
> I'm considering removing DirEntry's .dirent attribute entirely. The
> above is_* functions cover everything in .dirent.d_type in a much more
> Pythonic and cross-platform way, and the only other info in .dirent is
> d_ino -- can a non-Windows dev tell me how or when d_ino would be
> useful? If it's useful, is it useful in a higher-level, cross-platform
> API such as scandir()?
>
> Hmmm, I wonder about this "rich stat object" idea in light of the
> above. Do the methods on pathlib.Path basically supersede the need for
> this? Because otherwise folks will always be wondering whether to say
> "path.is_dir()" or "path.stat().is_dir" ... two ways to do it, right
> next to each other. So I'd prefer to add the "rich" stuff on the
> higher-level Path instead of the lower-level stat.

The rich stat API proposal exists precisely to provide a clean way to do
stat result caching - path objects always give immediate data, stat objects
give cached answers.

The direct APIs on Path would just become trivial shortcuts once a rich
stat API existed - you could use the long form if you wanted to, but it
would be pointless to do so.
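
As a rough sketch of the distinction drawn above (not the API proposed in issue 19725), the following wraps a saved stat result in a hypothetical `RichStat` class. The class name and its methods are assumptions made for illustration. The path object asks the filesystem again on every call, while the wrapper keeps answering from the stat result it was given.

```python
import os
import stat
import tempfile
from pathlib import Path


class RichStat:
    """Hypothetical wrapper answering is_*() queries from a saved stat result."""

    def __init__(self, st):
        self._st = st  # "caching" is just keeping a reference

    def is_dir(self):
        return stat.S_ISDIR(self._st.st_mode)

    def is_file(self):
        return stat.S_ISREG(self._st.st_mode)

    def is_symlink(self):
        return stat.S_ISLNK(self._st.st_mode)


tmp = tempfile.mkdtemp()
path = Path(tmp)
cached = RichStat(path.stat())  # one stat call, result kept around

os.rmdir(tmp)  # the directory disappears behind our back

live_answer = path.is_dir()      # fresh query: the directory is gone
cached_answer = cached.is_dir()  # answered from the saved stat result
```

The two answers differ after the directory is removed, which is exactly the "immediate data" versus "cached answers" split described in the message.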

Cheers,
Nick.


Re: [Python-Dev] [Python-checkins] cpython (2.7): Fix test_fcntl to run properly on systems that do not support the flags

2013-11-25 Thread Nick Coghlan
On 25 Nov 2013 14:46, "gregory.p.smith"  wrote:
>
> http://hg.python.org/cpython/rev/cac7319c5972
> changeset:   87537:cac7319c5972
> branch:  2.7
> parent:  87534:3981e57a7bdc
> user:Gregory P. Smith 
> date:Mon Nov 25 04:45:27 2013 +
> summary:
>   Fix test_fcntl to run properly on systems that do not support the flags
> used in the "does the value get passed in properly" test.
>
> files:
>   Lib/test/test_fcntl.py |  3 +++
>   1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/Lib/test/test_fcntl.py b/Lib/test/test_fcntl.py
> --- a/Lib/test/test_fcntl.py
> +++ b/Lib/test/test_fcntl.py
> @@ -113,7 +113,10 @@
>              self.skipTest("F_NOTIFY or DN_MULTISHOT unavailable")
>          fd = os.open(os.path.dirname(os.path.abspath(TESTFN)), os.O_RDONLY)
>          try:
> +            # This will raise OverflowError if issue1309352 is present.
>              fcntl.fcntl(fd, cmd, flags)
> +        except IOError:
> +            pass  # Running on a system that doesn't support these flags.
>          finally:
>              os.close(fd)

Raising a skip is generally preferred to marking a test that can't be
executed as passing.
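
A minimal sketch of what "raising a skip" could look like, assuming a stand-in function rather than the real test: `unsupported_call` is hypothetical and simply raises OSError (which IOError aliases on Python 3) to simulate a platform that rejects the flags. The test then reports a skip instead of silently passing.

```python
import unittest


def unsupported_call():
    """Hypothetical stand-in for a call the platform rejects."""
    raise OSError("flags not supported on this system")


class ExampleTest(unittest.TestCase):
    def test_flags_are_passed(self):
        try:
            unsupported_call()
        except OSError as exc:
            # Report the test as skipped rather than as a pass.
            self.skipTest("flags unsupported here: %s" % exc)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
result = unittest.TestResult()
suite.run(result)
```

With this shape the run still counts as successful, but the skip and its reason show up in the test report.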

Cheers,
Nick.

>
>
> --
> Repository URL: http://hg.python.org/cpython
>
> ___
> Python-checkins mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-checkins
>


Re: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function

2013-11-25 Thread Jim Jewett
Why are these functions (get_traces and get_object_traceback) private?

(1)  Is the whole module provisional?  At one point, I had thought so, but
I don't see that in the PEP or implementation.  (I'm not sure that it
should be provisional, but I want to be sure that the decision is
intentional.)

(2)  This implementation does lock in certain choices about the nature of
traces.  (What data to include for analysis vs excluding to save memory;
which events are tracked separately and which combined into a single total;
organizing the data that is saved in a hash by certain keys; etc)

While I would prefer more flexibility, the existing code provides a
reasonable default, and I can't foresee changing traces so much that these
functions *can't* be reasonably supported unless the rest of the module API
changes too.

(3)  get_object_traceback is the killer app that justifies the specific
data-collection choices Victor made; if it isn't public, the implementation
starts to look overbuilt.

(4) get_traces is about the only way to get at even all the data that
*is* stored, prior to additional summarization.  If it isn't public, those
default summarization options become even more locked in.

-jJ

On Mon, Nov 25, 2013 at 3:34 AM, victor.stinner
wrote:

> http://hg.python.org/cpython/rev/2e2ec595dc58
> changeset:   87551:2e2ec595dc58
> user:Victor Stinner 
> date:Mon Nov 25 09:33:18 2013 +0100
> summary:
>   Close #19762: Fix name of _get_traces() and _get_object_traceback()
> function
> name in their docstring. Patch written by Vajrasky Kok.
>
> files:
>   Modules/_tracemalloc.c |  4 ++--
>   1 files changed, 2 insertions(+), 2 deletions(-)
>
>
> diff --git a/Modules/_tracemalloc.c b/Modules/_tracemalloc.c
> --- a/Modules/_tracemalloc.c
> +++ b/Modules/_tracemalloc.c
> @@ -1018,7 +1018,7 @@
>  }
>
>  PyDoc_STRVAR(tracemalloc_get_traces_doc,
> -"get_traces() -> list\n"
> +"_get_traces() -> list\n"
>  "\n"
>  "Get traces of all memory blocks allocated by Python.\n"
>  "Return a list of (size: int, traceback: tuple) tuples.\n"
> @@ -1083,7 +1083,7 @@
>  }
>
>  PyDoc_STRVAR(tracemalloc_get_object_traceback_doc,
> -"get_object_traceback(obj)\n"
> +"_get_object_traceback(obj)\n"
>  "\n"
>  "Get the traceback where the Python object obj was allocated.\n"
>  "Return a tuple of (filename: str, lineno: int) tuples.\n"
>
> --
> Repository URL: http://hg.python.org/cpython
>


Re: [Python-Dev] [Python-checkins] cpython: Close #19762: Fix name of _get_traces() and _get_object_traceback() function

2013-11-25 Thread Victor Stinner
2013/11/25 Jim Jewett :
> Why are these functions (get_traces and get_object_traceback) private?

_get_object_traceback() is wrapped to get a nice Python object:
http://hg.python.org/cpython/file/6ec6facb69ca/Lib/tracemalloc.py#l208

_get_traces() is private, it is used internally by take_snapshot():
http://hg.python.org/cpython/file/6ec6facb69ca/Lib/tracemalloc.py#l455

So it's possible to modify the low-level (private) structure used in
the C module without touching the high-level (public) Python API.

> (1)  Is the whole module provisional?  At one point, I had thought so, but I
> don't see that in the PEP or implementation.  (I'm not sure that it should
> be provisional, but I want to be sure that the decision is intentional.)

I don't know.

> (2)  This implementation does lock in certain choices about the nature of
> traces.  (What data to include for analysis vs excluding to save memory;
> which events are tracked separately and which combined into a single total;
> organizing the data that is saved in a hash by certain keys; etc)
>
> While I would prefer more flexibility, the existing code provides a
> reasonable default, and I can't foresee changing traces so much that these
> functions *can't* be reasonably supported unless the rest of the module API
> changes too.

Sorry, I don't see which kind of information is "excluded" to save memory.

Maybe you read an old version of the PEP?

About "events": tracemalloc doesn't store function calls as events;
there is no timestamp. I didn't try to implement that, and I don't
want to. If you develop it (maybe on top of tracemalloc, I mean by
modifying _tracemalloc.c), I would be interested to see your code and
test it :-)

> (3)  get_object_traceback is the killer app that justifies the specific
> data-collection choices Victor made; if it isn't public, the implementation
> starts to look overbuilt.

It is public; see the documentation and the PEP:
http://docs.python.org/dev/library/tracemalloc.html#tracemalloc.get_object_traceback
http://www.python.org/dev/peps/pep-0454/

> (4) get_traces is about the only way to get at even all the data that
> *is* stored, prior to additional summarization.  If it isn't public, those
> default summarization options become even more locked in..

Snapshot.traces contains exactly the same data as _get_traces().

In a previous version of the PEP/code, statistics were computed while
traces were collected. That's no longer the case. Now the API only
collects raw data, and then you have to call Snapshot.statistics() or
Snapshot.compare_to().

Victor


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-25 Thread Ben Hoyt
> OK, so I'm a Windows dev, but my understanding is that d_ino is useful
> to tell if two files are identical - hard links to the same physical
> file have the same d_ino value. I don't believe it's possible to do
> this on Windows at all.
>
> I've seen it used in tools like diff, to short-circuit doing the
> actual diff if you know from a stat that the 2 files are the same.

Okay, that helps -- thanks.

So the inode number on its own is probably not all that useful in this
context. Because it doesn't come with the device number, you don't know
whether it's unique (from the posixpath.samestat source, it looks like
two stat results refer to the same file only if both the inode and
device numbers are equal).
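
A short sketch of the comparison described above, assuming two ordinary temporary files: the helper `same_file` is a hypothetical name that compares the inode and device numbers together, and `os.path.samestat` is shown alongside it for reference.

```python
import os
import shutil
import tempfile


def same_file(st1, st2):
    """Hypothetical helper: an inode number only identifies a file per device."""
    return (st1.st_ino, st1.st_dev) == (st2.st_ino, st2.st_dev)


tmp = tempfile.mkdtemp()
first = os.path.join(tmp, "first.txt")
second = os.path.join(tmp, "second.txt")
for name in (first, second):
    with open(name, "w") as handle:
        handle.write("data")

st_first = os.stat(first)
st_again = os.stat(first)
st_second = os.stat(second)

same_path_twice = same_file(st_first, st_again)
different_files = same_file(st_first, st_second)
stdlib_agrees = os.path.samestat(st_first, st_again)

shutil.rmtree(tmp)
```

A bare d_ino from a directory entry lacks the device half of this pair, which is the limitation noted in the message.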

So I think I'm going to drop .dirent entirely, and just expose the
d_type information via the is_* functions.

I'm not sure about is_socket(), is_fifo(), is_block_device(),
is_char_device(). I'm tempted to just leave them off, as I think
they'll basically never be used ... their stat counterparts are
exceedingly rare in the stdlib, so if you really want that, just use
.lstat().
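
For comparison, a sketch using the `os.scandir()` that later shipped in the standard library (Python 3.5), which may differ from the draft scandir() discussed here. It classifies entries with the `is_*` methods and falls back to an lstat-style call for the rare file types. The `classify` function and the directory layout are made up for the example.

```python
import os
import shutil
import stat
import tempfile


def classify(entry):
    """Hypothetical helper mapping a DirEntry to a short type label."""
    if entry.is_symlink():
        return "symlink"
    if entry.is_dir(follow_symlinks=False):
        return "dir"
    if entry.is_file(follow_symlinks=False):
        return "file"
    # Rare types: fall back to the full stat information.
    mode = entry.stat(follow_symlinks=False).st_mode
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, "subdir"))
with open(os.path.join(tmp, "file.txt"), "w") as handle:
    handle.write("x")

with os.scandir(tmp) as entries:
    kinds = {entry.name: classify(entry) for entry in entries}

shutil.rmtree(tmp)
```

The common checks come straight from the directory entry where the platform provides that information, and only the uncommon file types need the extra stat call.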

-Ben


Re: [Python-Dev] PEP 0404 and VS 2010

2013-11-25 Thread Steve Dower
Steve Dower wrote:
> The advice I've been given on FILE* is that there's probably no way to make it
> work correctly due to its internal buffering. Unfortunately, there are more
> places where this leaks through than just the APIs using them - extensions
> that call os.dup(fd), PyNumber_AsSsize_t() and pass the result to _fdopen() (for
> example, numpy) are simply going to break with mismatched fd's and there's no
> way to detect it at compile time. It's hard to tell how wide-ranging this sort
> of issue is going to be, but it certainly has the potential to break badly...

After thinking about this and looking into it, I think the breakage caused by 
this sort of code is so bad that we should be discouraging it. The internal 
buffering, especially on stdin/stdout/stderr, will wreak havoc on any 
extensions that use them, and code that casts fds to ints within Python will 
simply crash. The loss of confidence here may be irrecoverable - I don't think 
we should be making it easy for people to get into this situation.

We could make it opt-in for extension modules, but I think that situation is 
worse than the current one. The best solution is always going to be for users 
to install VS 2008, or at least VC9 (I'm working on making this easier, but it 
requires getting the attention of a lot of busy people...).

Any thoughts?

Cheers,
Steve 


Re: [Python-Dev] [Python-checkins] cpython (2.7): Fix test_fcntl to run properly on systems that do not support the flags

2013-11-25 Thread Gregory P. Smith
On Mon, Nov 25, 2013 at 12:46 AM, Nick Coghlan  wrote:

>
> On 25 Nov 2013 14:46, "gregory.p.smith" 
> wrote:
> >
> > http://hg.python.org/cpython/rev/cac7319c5972
> > changeset:   87537:cac7319c5972
> > branch:  2.7
> > parent:  87534:3981e57a7bdc
> > user:Gregory P. Smith 
> > date:Mon Nov 25 04:45:27 2013 +
> > summary:
> >   Fix test_fcntl to run properly on systems that do not support the flags
> > used in the "does the value get passed in properly" test.
> >
> > files:
> >   Lib/test/test_fcntl.py |  3 +++
> >   1 files changed, 3 insertions(+), 0 deletions(-)
> >
> >
> > diff --git a/Lib/test/test_fcntl.py b/Lib/test/test_fcntl.py
> > --- a/Lib/test/test_fcntl.py
> > +++ b/Lib/test/test_fcntl.py
> > @@ -113,7 +113,10 @@
> >              self.skipTest("F_NOTIFY or DN_MULTISHOT unavailable")
> >          fd = os.open(os.path.dirname(os.path.abspath(TESTFN)), os.O_RDONLY)
> >          try:
> > +            # This will raise OverflowError if issue1309352 is present.
> >              fcntl.fcntl(fd, cmd, flags)
> > +        except IOError:
> > +            pass  # Running on a system that doesn't support these flags.
> >          finally:
> >              os.close(fd)
>
> Raising a skip is generally preferred to marking a test that can't be
> executed as passing.
>
For this test this is still a pass because no OverflowError was raised.