Re: [Python-Dev] PEP 428 (pathlib) now committed

2013-11-24 Thread Nick Coghlan
On 24 Nov 2013 01:21, "Antoine Pitrou"  wrote:
>
> On Sat, 23 Nov 2013 15:32:58 +0200
> Serhiy Storchaka  wrote:
> > 22.11.13 18:44, Antoine Pitrou wrote:
> > > I've pushed pathlib to the repository. I'm hopeful there won't be
> > > new buildbot failures because of it, but still, there may be some
> > > platform-specific issues I'm unaware of.
> >
> > Congratulations, Antoine!
> >
> > Does this mean that issues #11344 (Add os.path.splitpath(path) function)
> > [1] and #13968 (Support recursive globs) [2] have no chance? Both are
> > ready for commit and have been waiting for review for almost a year. Are
> > the os.path and glob modules deprecated now?
>
> They are not deprecated, no. I am not terribly interested in reviewing
> those patches, personally, but other people may be :-)

Right, pathlib is an abstraction layer on top of the lower level
implementation APIs, rather than a replacement for them.
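
As a rough illustration of that layering, here is a minimal sketch that
puts a few os.path-style calls next to their pathlib counterparts. The
example path is made up, and PurePosixPath/posixpath are used only so the
results do not depend on the platform running the code.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical example path, chosen only for illustration.
raw = "/home/user/project/setup.py"

# Lower-level, string-based API (posixpath is what os.path is on POSIX).
directory, filename = posixpath.split(raw)
stem, ext = posixpath.splitext(filename)

# Object-oriented layer: the same information as attributes of one object.
p = PurePosixPath(raw)

print(directory, str(p.parent))
print(filename, p.name)
print(ext, p.suffix)
print(posixpath.join(directory, "README"), p.parent / "README")
```

Both styles give the same answers; pathlib just bundles them on a single
object instead of passing strings between functions.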

Cheers,
Nick.

>
> Regards
>
> Antoine.
>
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] buildbot's are needlessly compiling -O0

2013-11-24 Thread Nick Coghlan
On 24 Nov 2013 17:15, "Gregory P. Smith"  wrote:
>
> our buildbots are setup to configure --with-pydebug which also
> unfortunately causes them to compile with -O0... this results in a python
> binary that is excruciatingly slow and makes the entire test suite run take
> a long time.
>
> given that nobody is ever going to run a gdb or another debugger on the
> buildbot generated transient binaries themselves how about we speed all of
> the buildbot's up by adding CFLAGS=-O2 to the configure command line?

The main problem is that doing so would disable test_gdb. Humans don't run
gdb on those binaries, but the test suite does.

I agree it would be nice to figure out a way to run most of the tests on an
optimised build, though.
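
For anyone who wants to check what kind of build they are actually
running, a minimal sketch follows. The helper name describe_build is made
up for this example, and it is not a claim about how test_gdb itself
decides anything; it only reads values the interpreter already exposes.

```python
import sys
import sysconfig

def describe_build():
    """Report how the running interpreter was configured (best effort)."""
    return {
        # sys.gettotalrefcount is normally only present in debug builds
        # such as those configured with --with-pydebug.
        "pydebug": hasattr(sys, "gettotalrefcount"),
        # Compiler flags recorded at build time. These can be None on
        # platforms that were not built through configure/make.
        "cflags": sysconfig.get_config_var("CFLAGS"),
        "opt": sysconfig.get_config_var("OPT"),
    }

info = describe_build()
for key, value in info.items():
    print(key, "=", value)
```

On a buildbot-style debug build one would expect "pydebug" to be True and
the flags to mention -O0, but the exact strings depend on the platform.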

Cheers,
Nick.

>
> Sure, the compile step will take a bit longer but that is dwarfed by the
> test time as it is:
>
> http://buildbot.python.org/all/builders/AMD64%20Ubuntu%20LTS%203.x/builds/3224
> http://buildbot.python.org/all/builders/ARMv7%203.x/builds/7
> http://buildbot.python.org/all/builders/AMD64%20Snow%20Leop%203.x/builds/639
>
> It should dramatically decrease the turnaround latency for buildbot
> results.
>
> -gps
>


Re: [Python-Dev] cpython: Disable annoying tests which doesn't work optimized pickles.

2013-11-24 Thread Antoine Pitrou
On Sun, 24 Nov 2013 05:58:24 +0100 (CET)
alexandre.vassalotti  wrote:
> http://hg.python.org/cpython/rev/a68c303eb8dc
> changeset:   87486:a68c303eb8dc
> user:        Alexandre Vassalotti
> date:        Sat Nov 23 20:58:24 2013 -0800
> summary:
>   Disable annoying tests which doesn't work optimized pickles.

We should probably disable them only on optimized pickles, then :-)

Regards

Antoine.




Re: [Python-Dev] buildbot's are needlessly compiling -O0

2013-11-24 Thread anatoly techtonik
On Sun, Nov 24, 2013 at 12:43 PM, Nick Coghlan  wrote:
>
> On 24 Nov 2013 17:15, "Gregory P. Smith"  wrote:
>>
>> our buildbots are setup to configure --with-pydebug which also
>> unfortunately causes them to compile with -O0... this results in a python
>> binary that is excruciatingly slow and makes the entire test suite run take
>> a long time.
>>
>> given that nobody is ever going to run a gdb or another debugger on the
>> buildbot generated transient binaries themselves how about we speed all of
>> the buildbot's up by adding CFLAGS=-O2 to the configure command line?
>
> The main problem is that doing so would disable test_gdb. Humans don't run
> gdb on those binaries, but the test suite does.

Is there a danger that the code tested under GDB is not tested in
"natural environment" for pythons?
--
anatoly t.


Re: [Python-Dev] buildbot's are needlessly compiling -O0

2013-11-24 Thread Eli Bendersky
On Sun, Nov 24, 2013 at 6:12 AM, anatoly techtonik wrote:

> On Sun, Nov 24, 2013 at 12:43 PM, Nick Coghlan  wrote:
> >
> > On 24 Nov 2013 17:15, "Gregory P. Smith"  wrote:
> >>
> >> our buildbots are setup to configure --with-pydebug which also
> >> unfortunately causes them to compile with -O0... this results in a python
> >> binary that is excruciatingly slow and makes the entire test suite run take
> >> a long time.
> >>
> >> given that nobody is ever going to run a gdb or another debugger on the
> >> buildbot generated transient binaries themselves how about we speed all of
> >> the buildbot's up by adding CFLAGS=-O2 to the configure command line?
> >
> > The main problem is that doing so would disable test_gdb. Humans don't run
> > gdb on those binaries, but the test suite does.
>
> Is there a danger that the code tested under GDB is not tested in
> "natural environment" for pythons?
> --
>

What are you talking about? Have you actually looked at test_gdb before
writing this email?

Eli


[Python-Dev] Python 3.4.0b1 is now tagged, feature-freeze is now in effect

2013-11-24 Thread Larry Hastings



Please refrain from checking in any new features to Python trunk until 
after the 3.4 release branch is created (which will be a few months).  
Instead, let's concentrate our efforts on polishing Python 3.4 until 
it's the best and most-defect-free release yet!


Thanks,


//arry/


Re: [Python-Dev] buildbot's are needlessly compiling -O0

2013-11-24 Thread Barry Warsaw
On Nov 23, 2013, at 11:13 PM, Gregory P. Smith wrote:

>our buildbots are setup to configure --with-pydebug which also
>unfortunately causes them to compile with -O0... this results in a python
>binary that is excruciatingly slow and makes the entire test suite run take
>a long time.

It would be fine(-ish) to add this for improved buildbot performance, but
please do not change this for default --with-pydebug builds.  When you're
debugging Python, -O0 just makes so much more sense.

-Barry


Re: [Python-Dev] can someone create a buildbot slave name & password for me?

2013-11-24 Thread Barry Warsaw
On Nov 23, 2013, at 11:01 PM, Gregory P. Smith wrote:

>http://buildbot.python.org/all/buildslaves/gps-ubuntu-exynos5-armv7l

Cool thanks.  Antoine, do you still want or need my buildbot, or can I take it
off-line?  (FWIW, because the hardware is no longer supported, it's pretty
much stuck at Ubuntu 12.10.)

-Barry




Re: [Python-Dev] can someone create a buildbot slave name & password for me?

2013-11-24 Thread Antoine Pitrou
On Sun, 24 Nov 2013 11:47:42 -0500
Barry Warsaw  wrote:
> On Nov 23, 2013, at 11:01 PM, Gregory P. Smith wrote:
> 
> >http://buildbot.python.org/all/buildslaves/gps-ubuntu-exynos5-armv7l
> 
> Cool thanks.  Antoine, do you still want or need my buildbot, or can I take it
> off-line?  (FWIW, because the hardware is no longer supported, it's pretty
> much stuck at Ubuntu 12.10.)

Well, your buildbot has already been off-line for something like a
month :-)
http://buildbot.python.org/all/buildslaves/warsaw-ubuntu-arm

If the hardware is not supported anymore, and since the machine was
rather slow, I agree it's ok to let it go.

Regards

Antoine.




Re: [Python-Dev] can someone create a buildbot slave name & password for me?

2013-11-24 Thread Barry Warsaw
On Nov 24, 2013, at 06:02 PM, Antoine Pitrou wrote:

>If the hardware is not supported anymore, and since the machine was
>rather slow, I agree it's ok to let it go.

Done!

(The machine's name was 'hope', so now we're hope-less :).

-Barry


[Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Ben Hoyt
PEP 428 looks nice. Thanks, Antoine!

I have a couple of questions about the module name and API. I think
I've read through most of the previous discussion, but may have missed
some, so please point me to the right place if there have already been
discussions about these things.

1) Someone on reddit.com/r/Python asked "Is the import going to be
'pathlib'? I thought the renaming going on of std lib things with the
transition to Python 3 sought to remove the spurious usage of
appending 'lib' to libs?" I wondered about this too. Has this been
discussed/answered?

2) I think the operation of "suffix" and "suffixes" is good, but not
so much the name. I saw Ben Finney's original suggestion about
multiple extensions etc
(https://mail.python.org/pipermail/python-ideas/2012-October/016437.html).

However, it seems there was no further discussion about why not
"extension" and "extensions"? I have never heard a filename extension
being called a "suffix". I know it is a suffix in the sense of the
English word, but I've never heard it called that in this context, and
I think context is important. Put another way, "extension" is obvious
and guessable, "suffix" isn't.

3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
wondering about writing portable code when you want the string version
of the path. In Python 3.x you'll call str(path_obj), but in Python
2.x that will fail if the path has unicode chars in it, and you'll
need to use unicode(path_obj), which of course doesn't work in 3.x. Is
this just a fact of life, or would .str() or .as_string() help for
2.x/3.x portability?

4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
pattern starts with '**', but in the pep428 branch of the code there
are both glob() and rglob() functions. I've never seen the ** syntax
before (though admittedly I'm a Windows dev), and much prefer the
explicitness of having two functions, or maybe even better,
path_obj.glob('*.py', recursive=True).

Seems much more Pythonic to provide an actual argument (or different
function) for this change in behaviour, rather than stuffing the
"recursive flag" inside the pattern string.

Has this ship already sailed with http://bugs.python.org/issue13968?
I think that should also be rglob(pattern) or glob(pattern,
recursive=True). Of course, if this ship has already sailed, it's
definitely better for pathlib's glob to match glob.glob.

Thanks,
Ben


[Python-Dev] [RELEASED] Python 3.4.0b1

2013-11-24 Thread Larry Hastings


On behalf of the Python development team, it's my privilege to announce
the first beta release of Python 3.4.

This is a preview release, and its use is not recommended for
production settings.

Python 3.4 includes a range of improvements of the 3.x series, including
hundreds of small improvements and bug fixes.  Major new features and
changes in the 3.4 release series include:

* PEP 435, a standardized "enum" module
* PEP 436, a build enhancement that will help generate introspection
   information for builtins
* PEP 442, improved semantics for object finalization
* PEP 443, adding single-dispatch generic functions to the standard library
* PEP 445, a new C API for implementing custom memory allocators
* PEP 446, changing file descriptors to not be inherited by default
   in subprocesses
* PEP 450, a new "statistics" module
* PEP 453, a bundled installer for the *pip* package manager
* PEP 456, a new hash algorithm for Python strings and binary data
* PEP 3154, a new and improved protocol for pickled objects
* PEP 3156, a new "asyncio" module, a new framework for asynchronous I/O

Python 3.4 is now in "feature freeze", meaning that no new features will be
added.  The final release is projected for late February 2014.


To download Python 3.4.0b1 visit:

http://www.python.org/download/releases/3.4.0/


Please consider trying Python 3.4.0b1 with your code and reporting any
new issues you notice to:

 http://bugs.python.org/


Enjoy!

--
Larry Hastings, Release Manager
larry at hastings.org
(on behalf of the entire python-dev team and 3.4's contributors)


[Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Ben Hoyt
Hi folks,

I decided to start another thread for my thoughts on the interaction
between pathlib (Antoine's new PEP 428), issue 11406 (proposal for a
directory iterator returning stat-like info), and my own scandir
library, which implements something along the lines of issue 11406.

My scandir library (https://github.com/benhoyt/scandir) is something
I've been working on for a while -- it provides a scandir() function
which uses the OS's directory iterator functions to expose as much
stat-like information as possible (readdir and FindFirstFile etc).
This way functions like os.walk() can use the info (particularly
"is_dir()") and not require tons of extra calls to os.stat().

This provides a huge speed boost for os.walk() in many cases: I've
seen 3-4x on Linux, and up to 20x on Windows. (It depends on various
things, not least of which is Windows' weird stat caching -- if I run
my scandir benchmark "fresh", I get os.walk() running 8-9 times as
fast as the built-in one. But if I run it after an un-hibernate,
suddenly it runs 18-20 times as fast as the built-in one. Either way,
huge gains, especially on Windows.)
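
To make the idea concrete, here is a minimal sketch of a directory walk
that relies on the directory iterator for the "is it a directory?"
answer. It uses os.scandir() as it later entered the standard library
(Python 3.5, PEP 471), not the 2013 scandir package described above,
whose DirEntry used isdir()/isfile() names. The helper count_entries is
made up for the example.

```python
import os
import tempfile

def count_entries(top):
    """Recursively count (directories, files) under top."""
    dirs = files = 0
    with os.scandir(top) as entries:
        for entry in entries:
            # is_dir() can usually be answered from the directory listing
            # itself, without an extra stat() call per entry.
            if entry.is_dir(follow_symlinks=False):
                sub_dirs, sub_files = count_entries(entry.path)
                dirs += 1 + sub_dirs
                files += sub_files
            else:
                files += 1
    return dirs, files

# Build a tiny throwaway tree: a.py, pkg/b.py and an empty pkg/sub/.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "pkg", "sub"))
    for name in ("a.py", os.path.join("pkg", "b.py")):
        open(os.path.join(root, name), "w").close()
    result = count_entries(root)
    print(result)
```

The speed-up figures quoted above come from avoiding the per-entry
os.stat() calls that a listdir()-based walk would need; this sketch does
not measure anything itself.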

scandir.scandir() returns a DirEntry object, which has .isdir(),
.isfile(), .islink(), and .lstat() attributes. Look familiar? When I
was reading PEP 428 and saw .is_file(), .is_dir(), and .stat(), I
thought -- surely I can merge this with pathlib and Path objects.

The first thing I can do to scandir is rename my isdir() type
attributes to match PEP 428's, so that DirEntry quacks like a Path
object where it can.

However, I'm wondering if I can change scandir to return actual Path
objects. Or better, because Path already helpfully provides iterdir()
which yields Path objects, and Path objects have .is_dir() etc, can
scandir()-like behaviour simply work out-of-the-box?

This mainly depends on how Path is going to cache stat information. If
it caches it, then this will just work. Sounds like Guido's opinion
was that both cached and uncached use cases are important, but that it
should be very clear which one you're getting. I personally like the
.stat() and .restat() idea.

The other related thing is that DirEntry only provides .lstat(),
because it's providing stat-like info without following links.

Note in this context that it's not just "network filesystems" on which
stat() is slow 
(https://mail.python.org/pipermail/python-dev/2013-May/125805.html).
It's quite slow in Windows under various conditions too.

See also Nick Coghlan's post about a DirEntry-style object on the
issue 11406 thread:
https://mail.python.org/pipermail/python-dev/2013-May/126148.html

Thoughts and suggestions for how to merge scandir with pathlib's
approach? It's important to me that pathlib's API doesn't cut itself
off from a more efficient implementation of the ideas from issue 11406
and scandir...

Thanks,
Ben.


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Antoine Pitrou

Hello,

On Mon, 25 Nov 2013 11:00:09 +1300
Ben Hoyt  wrote:
> 
> 1) Someone on reddit.com/r/Python asked "Is the import going to be
> 'pathlib'? I thought the renaming going on of std lib things with the
> transition to Python 3 sought to remove the spurious usage of
> appending 'lib' to libs?" I wondered about this too. Has this been
> discussed/answered?

Well, "path" is much too common already, and it's an obvious variable
name for a filesystem path, so "pathlib" is better to avoid name
clashes.

> 2) I think the operation of "suffix" and "suffixes" is good, but not
> so much the name. I saw Ben Finney's original suggestion about
> multiple extensions etc
> (https://mail.python.org/pipermail/python-ideas/2012-October/016437.html).
> 
> However, it seems there was no further discussion about why not
> "extension" and "extensions"? I have never heard a filename extension
> being called a "suffix". I know it is a suffix in the sense of the
> English word, but I've never heard it called that in this context, and
> I think context is important. Put another way, "extension" is obvious
> and guessable, "suffix" isn't.

Well, perhaps :-), but nobody opposed suffix and suffixes at the time.
Note the API is provisional, so we can still make it change, but
obviously the barrier for changes is higher now that the PEP is
accepted and the beta has been cut.
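
For readers who have not tried the attributes under discussion, a small
sketch of how suffix and suffixes behave. The file name is made up, and
PurePosixPath is used so no real file is needed.

```python
from pathlib import PurePosixPath

p = PurePosixPath("backups/archive.tar.gz")   # hypothetical file name

print(p.suffix)     # last extension only
print(p.suffixes)   # every extension, in order
print(p.stem)       # final component with the last suffix removed

no_ext = PurePosixPath("README")
print(repr(no_ext.suffix), no_ext.suffixes)
```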

> 3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
> wondering about writing portable code when you want the string version
> of the path. In Python 3.x you'll call str(path_obj), but in Python
> 2.x that will fail if the path has unicode chars in it, and you'll
> need to use unicode(path_obj), which of course doesn't work 3.x.

The behaviour of unicode paths in Python 2 is erratic
(system-dependent).  pathlib can't really fix it: Python 2 doesn't know
about a well-defined filesystem encoding.
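
By contrast, a minimal sketch of the Python 3 side, where str() always
gives text and the filesystem encoding is well defined. The file name is
made up, and the round trip through os.fsencode()/os.fsdecode() assumes
the platform's filesystem encoding can represent it (true for UTF-8).

```python
import os
import sys
from pathlib import PurePosixPath

p = PurePosixPath("docs") / "résumé.txt"   # hypothetical non-ASCII name

text = str(p)              # always a (unicode) str in Python 3
raw = os.fsencode(text)    # bytes, using the filesystem encoding

print(repr(text))
print(sys.getfilesystemencoding())
print(os.fsdecode(raw) == text)
```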

> 4) Is path_obj.glob() recursive?

This is documented:
http://docs.python.org/dev/library/pathlib.html#pathlib.Path.glob
http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob

> Seems much more Pythonic to provide an actual argument (or different
> function) for this change in behaviour, rather than stuffing the
> "recursive flag" inside the pattern string.

It's not a flag, it's a different wildcard. This allows e.g. a library
function to call glob() and users to pass a recursive or non-recursive
pattern as they wish.
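
A minimal sketch of that use case: a library-style helper (the name find
is made up) that simply forwards whatever pattern the caller supplies, so
the caller alone decides whether the search recurses. The directory tree
is a throwaway one built for the example.

```python
import tempfile
from pathlib import Path

def find(root, pattern):
    """Library-style helper: the caller's pattern decides whether to recurse."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "top.py").touch()
    (root / "pkg" / "inner.py").touch()

    flat = find(root, "*.py")       # this directory only
    deep = find(root, "**/*.py")    # this directory and everything below it
    via_rglob = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*.py")
    )
    print(flat, deep, via_rglob)
```

Here rglob("*.py") and glob("**/*.py") return the same files; the helper
never needs a separate "recursive" argument.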

> Has this ship already sailed with http://bugs.python.org/issue13968?

This issue is still open, so no :-)

Regards

Antoine.




Re: [Python-Dev] [RELEASED] Python 3.4.0b1

2013-11-24 Thread Larry Hastings

On 11/24/2013 02:00 PM, Larry Hastings wrote:

> Python 3.4 includes a range of improvements of the 3.x series, including
> hundreds of small improvements and bug fixes.  Major new features and
> changes in the 3.4 release series include:


Whoops, sorry, I missed a couple of PEPs there:

* PEP 428, a "pathlib" module providing object-oriented filesystem paths
* PEP 451, standardizing module metadata for Python's module import system
* PEP 454, a new "tracemalloc" module for tracing Python memory allocations

They're on the web site already, and they'll be in the next announcement.

Sorry for the oversight!


//arry/


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Antoine Pitrou
On Mon, 25 Nov 2013 11:20:08 +1300
Ben Hoyt  wrote:
> 
> This mainly depends on how Path is going to cache stat information. If
> it caches it, then this will just work. Sounds like Guido's opinion
> was that both cached and uncached use cases are important, but that it
> should be very clear which one you're getting. I personally like the
> .stat() and .restat() idea.

Right now, pathlib doesn't cache. Guido decided it was safer to start
off like that, and perhaps later we can add some optional caching.

One reason caching didn't go in is that it's not clear which API is
best. Working on plugging scandir() into pathlib would actually help
in choosing a stat-caching API.

(or, rather, lstat-caching...)

> The other related thing is that DirEntry only provides .lstat(),
> because it's providing stat-like info without following links.

Path.is_dir() and friends use stat(), i.e. they inform you about
whether a symlink's target is a directory (not the symlink itself).  Of
course, if the DirEntry says the path is a symlink, Path.is_dir() could
then run stat() to find out about the target.
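
The stat()/lstat() difference can be seen with a small sketch. It builds
a throwaway directory plus a symlink to it, so it assumes a platform and
account that are allowed to create symlinks (typical on POSIX, not
always on Windows).

```python
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "real_dir"
    target.mkdir()
    link = Path(tmp) / "link"
    link.symlink_to(target, target_is_directory=True)

    follows = link.is_dir()        # uses stat(): describes the target
    is_link = link.is_symlink()    # uses lstat(): describes the link itself
    lstat_says_dir = stat.S_ISDIR(link.lstat().st_mode)
    print(follows, is_link, lstat_says_dir)
```

So lstat-style information from a directory iterator is enough to answer
is_dir() for ordinary entries, while symlinks still need one extra stat()
to learn about their target.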

Do you plan to propose scandir() for inclusion in the stdlib?

Regards

Antoine.




Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Ben Hoyt
> Well, "path" is much too common already, and it's an obvious variable
> name for a filesystem path, so "pathlib" is better to avoid name
> clashes.

Yep, that makes total sense, thanks.

>> However, it seems there was no further discussion about why not
>> "extension" and "extensions"? I have never heard a filename extension
>> being called a "suffix". I know it is a suffix in the sense of the
>> English word, but I've never heard it called that in this context, and
>> I think context is important. Put another way, "extension" is obvious
>> and guessable, "suffix" isn't.
>
> Well, perhaps :-), but nobody opposed suffix and suffixes at the time.
> Note the API is provisional, so we can still make it change, but
> obviously the barrier for changes is higher now that the PEP is
> accepted and the beta has been cut.

Okay. I won't push hard :-) as "suffix" isn't terrible, but has anyone
else never (or rarely) heard the term "suffix" applied to filename
extensions?

>> 3) Obviously pathlib isn't going in the stdlib in Python 2.x, but I'm
>> wondering about writing portable code when you want the string version
>> of the path. In Python 3.x you'll call str(path_obj), but in Python
>> 2.x that will fail if the path has unicode chars in it, and you'll
>> need to use unicode(path_obj), which of course doesn't work 3.x.
>
> The behaviour of unicode paths in Python 2 is erratic
> (system-dependent).  pathlib can't really fix it: Python 2 doesn't know
> about a well-defined filesystem encoding.

Fair enough.

>> 4) Is path_obj.glob() recursive?
>
> This is documented:
> http://docs.python.org/dev/library/pathlib.html#pathlib.Path.glob
> http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob
>
>> Seems much more Pythonic to provide an actual argument (or different
>> function) for this change in behaviour, rather than stuffing the
>> "recursive flag" inside the pattern string.
>
> It's not a flag, it's a different wildcard. This allows e.g. a library
> function to call glob() and users to pass a recursive or non-recursive
> pattern as they wish.

Okay, just saw those docs now -- thanks. Fair enough re "it's a
different wildcard". At the least I don't think there should be two
ways to do it: having both rglob() and glob('**') seems very un-PEP 20
to me. My preference is rglob(), but glob(recursive=True) would be fine
too.

>> Has this ship already sailed with http://bugs.python.org/issue13968?
>
> This issue is still open, so no :-)

Same goes for this issue -- there should be OOWTDI, and my preference
is rglob() or glob(recursive=True). But maybe issue 13968's behaviour
can be determined by pathlib's now that pathlib is the one getting
done first.

Thanks,
Ben.


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Ben Hoyt
> Right now, pathlib doesn't cache. Guido decided it was safer to start
> off like that, and perhaps later we can add some optional caching.
>
> One reason caching didn't go in is that it's not clear which API is
> best. Working on plugging scandir() into pathlib would actually help
> choosing a stat-caching API.
>
> (or, rather, lstat-caching...)
>
>> The other related thing is that DirEntry only provides .lstat(),
>> because it's providing stat-like info without following links.
>
> Path.is_dir() and friends use stat(), i.e. they inform you about
> whether a symlink's target is a directory (not the symlink itself).  Of
> course, if the DirEntry says the path is a symlink, Path.is_dir() could
> then run stat() to find out about the target.
>
> Do you plan to propose scandir() for inclusion in the stdlib?

Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
objects" for inclusion into the stdlib, and also speed up os.walk() as
a result.

However, pathlib's API with .is_dir() and .lstat() etc is so close to
DirEntry, I'd be much keener to roll up the scandir functionality into
pathlib's iterdir(), as that's already going in the standard library,
and iterdir() already returns Path objects.

I'm just not sure it's possible or useful without stat caching.

We could do Path.lstat(cached=True), but we'd also really want
is_dir(cached=True), so that API kinda sucks. Alternatively you could
have iterdir(cached=True) return PathWithCachedStat style objects --
probably better, but kinda messy.

For these reasons, I would much prefer stat caching on by default in
Path -- in my experience, the cached behaviour is desired much much
more often than the non-cached. I've written directory walkers more
often than I can count, whereas I've maybe only once written a
long-running process that needs to re-stat, and if it's clearly
documented as cached, then it's super easy to call restat(), or create
a new Path instance to get new stat info.
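
To show what "cached by default, with restat()" could look like, here is
a minimal sketch. CachedStatPath is a hypothetical wrapper invented for
this example; it is not part of pathlib and only illustrates the
semantics being proposed.

```python
import stat
import tempfile
from pathlib import Path

class CachedStatPath:
    """Hypothetical wrapper: lstat() is cached until restat() is called."""

    def __init__(self, path):
        self.path = Path(path)
        self._lstat = None

    def lstat(self):
        if self._lstat is None:
            self._lstat = self.path.lstat()   # the only real system call
        return self._lstat

    def restat(self):
        """Drop the cached result and fetch fresh information."""
        self._lstat = None
        return self.lstat()

    def is_dir(self):
        # Answered from the cached result; does not follow symlinks.
        return stat.S_ISDIR(self.lstat().st_mode)

with tempfile.TemporaryDirectory() as tmp:
    f = Path(tmp) / "data.txt"
    f.write_text("abc")
    cached = CachedStatPath(f)
    size_before = cached.lstat().st_size
    f.write_text("abcdef")                    # file changes on disk
    size_cached = cached.lstat().st_size      # still the old, cached value
    size_fresh = cached.restat().st_size      # refreshed on request
    print(size_before, size_cached, size_fresh)
```

The stale middle value is exactly the behaviour that would need to be
clearly documented if caching were on by default.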

This would allow iterdir() to take advantage of the huge performance
improvements you can get when walking directories.

Guido, are you at all open to reconsidering the uncached-by-default in
light of this?

-Ben


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Greg Ewing

Ben Hoyt wrote:

> However, it seems there was no further discussion about why not
> "extension" and "extensions"? I have never heard a filename extension
> being called a "suffix".


You can't have read many unix man pages, then! I just
searched for "suffix" in the gcc man page, and found
this:

    For any given input file, the file name suffix determines what kind of
    compilation is done:


> I know it is a suffix in the sense of the
> English word, but I've never heard it called that in this context, and
> I think context is important.


This probably depends on your background. In my experience,
the term "extension" arose in OSes where it was a formal
part of the filename syntax, often highly constrained.
E.g. RT11, CP/M, early MS-DOS.

Unix has never had a formal notion of extensions like that,
only informal conventions, and has called them suffixes at
least some of the time for as long as I can remember.


> 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
> pattern starts with '**',


I don't think it has to *start* with **. Rather, the ** is
a pattern that can span directory separators. It's not a
flag that applies to the whole thing -- a pattern could have
a * in one place and a ** in another.
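
A small sketch of a pattern with ** in the middle rather than at the
start, using pathlib as committed. The directory layout is a throwaway
one created just for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src" / "pkg").mkdir(parents=True)
    (root / "docs").mkdir()
    for name in ("src/a.py", "src/pkg/b.py", "docs/c.py"):
        (root / name).touch()

    # "*" matches within one path component; "**" spans directory levels.
    matches = sorted(
        p.relative_to(root).as_posix() for p in root.glob("src/**/*.py")
    )
    print(matches)
```

Only the files under src/ are found, at any depth; docs/c.py is not
matched because the literal "src" component comes before the **.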

--
Greg


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Ben Hoyt
>> However, it seems there was no further discussion about why not
>> "extension" and "extensions"? I have never heard a filename extension
>> being called a "suffix".
>
>
> You can't have read many unix man pages, then!

Huh, no I haven't! Certainly not regularly, as I'm almost exclusively
a Windows user. :-)

> This probably depends on your background. In my experience,
> the term "extension" arose in OSes where it was a formal
> part of the filename syntax, often highly constrained.
> E.g. RT11, CP/M, early MS-DOS.
>
> Unix has never had a formal notion of extensions like that,
> only informal conventions, and has called them suffixes at
> least some of the time for as long as I can remember.

Yes, seems like it definitely is background-dependent. I'm
Windows-centric. I stand corrected, and recant my position on
"suffix". :-)

>> 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
>> pattern starts with '**',
>
>
> I don't think it has to *start* with **. Rather, the ** is
> a pattern that can span directory separators. It's not a
> flag that applies to the whole thing -- a pattern could have
> a * in one place and a ** in another.

Oh okay, that makes more sense. It definitely needs more thorough
documentation in that case. I would still prefer the simpler and more
explicit rglob() / recursive=True rather than new pattern syntax, but
I don't feel as strongly anymore.

-Ben


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Antoine Pitrou
On Mon, 25 Nov 2013 12:04:28 +1300
Ben Hoyt  wrote:
> > Right now, pathlib doesn't cache. Guido decided it was safer to start
> > off like that, and perhaps later we can add some optional caching.
> >
> > One reason caching didn't go in is that it's not clear which API is
> > best. Working on plugging scandir() into pathlib would actually help
> > choosing a stat-caching API.
> >
> > (or, rather, lstat-caching...)
> >
> >> The other related thing is that DirEntry only provides .lstat(),
> >> because it's providing stat-like info without following links.
> >
> > Path.is_dir() and friends use stat(), i.e. they inform you about
> > whether a symlink's target is a directory (not the symlink itself).  Of
> > course, if the DirEntry says the path is a symlink, Path.is_dir() could
> > then run stat() to find out about the target.
> >
> > Do you plan to propose scandir() for inclusion in the stdlib?
> 
> Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
> objects" for inclusion into the stdlib, and also speed up os.walk() as
> a result.
> 
> However, pathlib's API with .is_dir() and .lstat() etc are so close to
> DirEntry, I'd be much keener to roll up the scandir functionality into
> pathlib's iterdir(), as that's already going in the standard library,
> and iterdir() already returns Path objects.

We could still expose scandir() as a low-level API, *and* call it in
pathlib for optimizations.

> We could do Path.lstat(cached=True), but we'd also really want
> is_dir(cached=True), so that API kinda sucks. Alternatively you could
> have iterdir(cached=True) return PathWithCachedStat style objects --
> probably better, but kinda messy.

Perhaps Path.enable_caching()? It would enable caching not only on this
path object, but on all objects constructed from it.

Regards

Antoine.


Re: [Python-Dev] [RELEASED] Python 3.4.0b1

2013-11-24 Thread Ned Deily
In article <[email protected]>,
 Larry Hastings  wrote:
> On behalf of the Python development team, it's my privilege to announce
> the first beta release of Python 3.4.
> 
> This is a preview release, and its use is not recommended for
> production settings.

Note to users of the python.org Mac OS X binary installers: if you have 
installed earlier preview releases of Python 3.4, be aware that the 
batteries-included built-in Tcl/Tk library support introduced in 3.4.0a2 has 
been reverted in 3.4.0b1 because it was found to break some third-party 
packages.  As is the case with earlier releases of Python, if you use the 
python.org 64-bit installer for OS X, you will again need to have a compatible 
third-party copy of Tcl/Tk 8.5 installed, such as ActiveTcl 8.5.15.1, to avoid 
the problematic system versions shipped in OS X 10.6+.  See 
http://www.python.org/download/mac/tcltk/ for more information.

-- 
 Ned Deily,
 [email protected]



Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Nick Coghlan
On 25 Nov 2013 09:07, "Ben Hoyt"  wrote:
>
> > Right now, pathlib doesn't cache. Guido decided it was safer to start
> > off like that, and perhaps later we can add some optional caching.
> >
> > One reason caching didn't go in is that it's not clear which API is
> > best. Working on plugging scandir() into pathlib would actually help
> > choosing a stat-caching API.
> >
> > (or, rather, lstat-caching...)
> >
> >> The other related thing is that DirEntry only provides .lstat(),
> >> because it's providing stat-like info without following links.
> >
> > Path.is_dir() and friends use stat(), i.e. they inform you about
> > whether a symlink's target is a directory (not the symlink itself).  Of
> > course, if the DirEntry says the path is a symlink, Path.is_dir() could
> > then run stat() to find out about the target.
> >
> > Do you plan to propose scandir() for inclusion in the stdlib?
>
> Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
> objects" for inclusion into the stdlib, and also speed up os.walk() as
> a result.
>
> However, pathlib's API with .is_dir() and .lstat() etc are so close to
> DirEntry, I'd be much keener to roll up the scandir functionality into
> pathlib's iterdir(), as that's already going in the standard library,
> and iterdir() already returns Path objects.
>
> I'm just not sure it's possible or useful without stat caching.
>
> We could do Path.lstat(cached=True), but we'd also really want
> is_dir(cached=True), so that API kinda sucks. Alternatively you could
> have iterdir(cached=True) return PathWithCachedStat style objects --
> probably better, but kinda messy.
>
> For these reasons, I would much prefer stat caching on by default in
> Path -- in my experience, the cached behaviour is desired much much
> more often than the non-cached. I've written directory walkers more
> often than I can count, whereas I've maybe only once written a
> long-running process that needs to re-stat, and if it's clearly
> documented as cached, then it's super easy to call restat(), or create
> a new Path instance to get new stat info.
>
> This would allow iterdir() to take advantage of the huge performance
> improvements you can get when walking directories.
>
> Guido, are you at all open to reconsidering the uncached-by-default in
> light of this?

No, caching on the object is dangerously unintuitive - it means two Path
objects can compare equal, but give different answers for stat-dependent
queries.
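
A minimal sketch of that hazard, assuming a made-up CachingPath wrapper
(hypothetical, not part of pathlib) that remembers its first is_dir()
answer:

```python
import os
import tempfile
from pathlib import Path


class CachingPath:
    """Hypothetical per-object cache; NOT a real pathlib class."""

    def __init__(self, path):
        self.path = Path(path)
        self._is_dir = None  # filled in on first query

    def __eq__(self, other):
        return isinstance(other, CachingPath) and self.path == other.path

    def __hash__(self):
        return hash(self.path)

    def is_dir(self):
        if self._is_dir is None:
            self._is_dir = self.path.is_dir()
        return self._is_dir


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "entry")

    os.mkdir(target)
    first = CachingPath(target)
    first_answer = first.is_dir()    # caches the "directory" answer

    # Replace the directory with a regular file of the same name.
    os.rmdir(target)
    Path(target).touch()
    second = CachingPath(target)
    second_answer = second.is_dir()  # fresh object, fresh answer

    equal = first == second          # same path, so they compare equal
    stale_answer = first.is_dir()    # but the old object still says "dir"

print(equal, stale_answer, second_answer)
```

The two objects compare equal, yet disagree about whether the path is a
directory.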

A global string (or Path) keyed cache (rather than a per-object cache)
would actually be a safer option, since it would ensure distinct path
objects always gave the same answer. That's the approach I will likely
pursue at some point in walkdir.
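
As a rough sketch of the keyed-cache idea (the names cached_lstat and
invalidate are invented for illustration, and this is not how walkdir
does or will do it), the cache lives outside the path objects and is
keyed by the path string:

```python
import os
import tempfile
from pathlib import Path

# Hypothetical module-level cache keyed by the path string.
_stat_cache = {}


def cached_lstat(path):
    """Return a cached os.lstat() result for this path string."""
    key = str(path)
    try:
        return _stat_cache[key]
    except KeyError:
        result = _stat_cache[key] = os.lstat(key)
        return result


def invalidate(path=None):
    """Forget one path, or everything when called without arguments."""
    if path is None:
        _stat_cache.clear()
    else:
        _stat_cache.pop(str(path), None)


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "data.bin"
    first.write_bytes(b"abc")
    second = Path(str(first))  # a distinct but equal Path object

    # Both objects share one cache entry, so the answers cannot diverge.
    same_result = cached_lstat(first) is cached_lstat(second)

    first.write_bytes(b"abcdef")
    stale_size = cached_lstat(second).st_size  # still the cached value
    invalidate(first)
    fresh_size = cached_lstat(second).st_size  # re-read after invalidation

print(same_result, stale_size, fresh_size)
```

This assumes callers pass paths in a consistent form; two different
spellings of the same file would get separate entries.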

It's also quite likely the "rich stat object" API will be pursued for 3.5,
which is a much safer approach to stat result caching than trying to embed
it directly in pathlib.Path objects.

That's why we decided to punt on the caching question until 3.5 - it's
better to provide a predictable building block that doesn't provide
caching, and then work out how to provide a sensible caching layer on top
of that, rather than trying to rush a potentially flawed caching design
that leads to inconsistent behaviour.

Cheers,
Nick.

>
> -Ben


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Guido van Rossum
On Sun, Nov 24, 2013 at 3:04 PM, Ben Hoyt  wrote:

> > Right now, pathlib doesn't cache. Guido decided it was safer to start
> > off like that, and perhaps later we can add some optional caching.
> >
> > One reason caching didn't go in is that it's not clear which API is
> > best. Working on plugging scandir() into pathlib would actually help
> > choosing a stat-caching API.
> >
> > (or, rather, lstat-caching...)
> >
> >> The other related thing is that DirEntry only provides .lstat(),
> >> because it's providing stat-like info without following links.
> >
> > Path.is_dir() and friends use stat(), i.e. they inform you about
> > whether a symlink's target is a directory (not the symlink itself).  Of
> > course, if the DirEntry says the path is a symlink, Path.is_dir() could
> > then run stat() to find out about the target.
> >
> > Do you plan to propose scandir() for inclusion in the stdlib?
>
> Yes, I was hoping to propose adding "os.scandir() -> yields DirEntry
> objects" for inclusion into the stdlib, and also speed up os.walk() as
> a result.
>
> However, pathlib's API with .is_dir() and .lstat() etc are so close to
> DirEntry, I'd be much keener to roll up the scandir functionality into
> pathlib's iterdir(), as that's already going in the standard library,
> and iterdir() already returns Path objects.
>
> I'm just not sure it's possible or useful without stat caching.
>
> We could do Path.lstat(cached=True), but we'd also really want
> is_dir(cached=True), so that API kinda sucks. Alternatively you could
> have iterdir(cached=True) return PathWithCachedStat style objects --
> probably better, but kinda messy.
>
> For these reasons, I would much prefer stat caching on by default in
> Path -- in my experience, the cached behaviour is desired much much
> more often than the non-cached. I've written directory walkers more
> often than I can count, whereas I've maybe only once written a
> long-running process that needs to re-stat, and if it's clearly
> documented as cached, then it's super easy to call restat(), or create
> a new Path instance to get new stat info.
>
> This would allow iterdir() to take advantage of the huge performance
> improvements you can get when walking directories.
>
> Guido, are you at all open to reconsidering the uncached-by-default in
> light of this?


I think we should think hard and deep about all the consequences. I was
initially in favor of stat caching, but during offline review of PEP 428
Nick pointed out that there are too many different ways to do stat caching,
and convinced me that it would be wrong to rush it. Now that beta 1 is out
I really don't want to reconsider this -- we really need to stick to the
plan.

The ship has likewise sailed for adding scandir() (whether to os or
pathlib). By all means experiment and get it ready for consideration for
3.5, but I don't want to add it to 3.4.

In general I think there are some tough choices regarding stat caching. You
already brought up stat vs. lstat -- there's also the issue of what to do
if [l]stat fails -- do we cache the exception?

IMO, the current incarnation is for convenience, correctness and
cross-platform semantics -- three C's. The next incarnation can add a
fourth C, caching.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Ben Hoyt
Antoine's class-global flag seems like a bad idea.

> A global string (or Path) keyed cache (rather than a per-object cache) would
> actually be a safer option, since it would ensure distinct path objects
> always gave the same answer. That's the approach I will likely pursue at
> some point in walkdir.

Interesting approach. This wouldn't really solve the problem for
scandir / DirEntry / performance issues, but it's a fair idea in
general.

> It's also quite likely the "rich stat object" API will be pursued for 3.5,
> which is a much safer approach to stat result caching than trying to embed
> it directly in pathlib.Path objects.

As a Windows dev, I'm not sure I love the "rich stat object idea",
because stat_result objects are sooo Posixy. On Windows, (some of) the
file attribute info is stuffed into a stat_result struct. Which kinda
works, but I like how Path exposes the higher-level, cross-platform
stuff like .is_dir() so that most of the time you don't need to worry
about stat. (You still need to worry about caching, though.)

> That's why we decided to punt on the caching question until 3.5 - it's
> better to provide a predictable building block that doesn't provide caching,
> and then work out how to provide a sensible caching layer on top of that,
> rather than trying to rush a potentially flawed caching design that leads to
> inconsistent behaviour.

Yep, agreed about rushing in a potentially flawed caching design. But
I also don't want to "rush in" a design that prohibits scandir()-style
performance optimizations -- though I guess it can still go in there
one way or the other.

"Worst case", we can add os.scandir() separately, which returns
DirEntry, "path-like" objects.

-Ben


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Ben Hoyt
> I think we should think hard and deep about all the consequences. I was
> initially in favor of stat caching, but during offline review of PEP 428
> Nick pointed out that there are too many different ways to do stat caching,
> and convinced me that it would be wrong to rush it. Now that beta 1 is out I
> really don't want to reconsider this -- we really need to stick to the plan.

Fair call, and thanks for the response.

> The ship has likewise sailed for adding scandir() (whether to os or
> pathlib). By all means experiment and get it ready for consideration for
> 3.5, but I don't want to add it to 3.4.

Yes, I was definitely thinking about 3.5 at this stage. :-) What would
be the next step for getting something like os.scandir() added for 3.5
-- a PEP referencing the various issues?

> In general I think there are some tough choices regarding stat caching. You
> already brought up stat vs. lstat -- there's also the issue of what to do if
> [l]stat fails -- do we cache the exception?
>
> IMO, the current incarnation is for convenience, correctness and
> cross-platform semantics -- three C's. The next incarnation can add a fourth
> C, caching.

Three/four C's, I like it!

-Ben


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Nick Coghlan
On 25 Nov 2013 09:14, "Ben Hoyt"  wrote:
>
> >> 4) Is path_obj.glob() recursive? In the PEP it looks like it is if the
> >> pattern starts with '**',
> >
> >
> > I don't think it has to *start* with **. Rather, the ** is
> > a pattern that can span directory separators. It's not a
> > flag that applies to the whole thing -- a pattern could have
> > a * in one place and a ** in another.
>
> Oh okay, that makes more sense. It definitely needs more thorough
> documentation in that case. I would still prefer the simpler and more
> explicit rglob() / recursive=True rather than new pattern syntax, but
> I don't feel as strongly anymore.

Using "**" for directory spanning globs is also another case of us
borrowing a reasonably common idiom from *nix systems that may not be
familiar to Windows users.

Cheers,
Nick.

>
> -Ben


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Ben Hoyt
> Using "**" for directory spanning globs is also another case of us borrowing
> a reasonably common idiom from *nix systems that may not be familiar to
> Windows users.

Okay, *nix wins then. :-) Python's stdlib is already fairly
*nix-oriented (even when it's being cross-platform), so I guess it's
not a big deal.

My only remaining concern then is that there shouldn't be more than
one way to do recursive globbing in a new API like this. Why does
rglob() exist when the documentation simply says "like calling glob()
but with '**' added in front of the pattern"?

http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob

-Ben


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Nick Coghlan
On 25 Nov 2013 09:31, "Ben Hoyt"  wrote:
>
> > It's also quite likely the "rich stat object" API will be pursued for 3.5,
> > which is a much safer approach to stat result caching than trying to embed
> > it directly in pathlib.Path objects.
>
> As a Windows dev, I'm not sure I love the "rich stat object idea",
> because stat_result objects are sooo Posixy. On Windows, (some of) the
> file attribute info is stuffed into a stat_result struct. Which kinda
> works, but I like how Path exposes the higher-level, cross-platform
> stuff like .is_dir() so that most of the time you don't need to worry
> about stat. (You still need to worry about caching, though.)

The idea of the rich stat result object is that it has all that info
prepopulated, based on an initial stat call. "Caching" it amounts to "keep
a reference to it".

It is suggested that it would be a subset of the pathlib.Path API:
http://bugs.python.org/issue19725

If it's also a superset of the existing stat object API, then at least
Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated
to return it in 3.5.
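
With today's plain stat result the same "keep a reference" pattern
already works, just less conveniently. A minimal sketch, using only the
existing os/stat APIs (the rich object itself is still only a proposal):

```python
import stat
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp)

    # One stat call; holding on to the result is the "cache".
    st = path.stat()

    # Every later question is answered from the saved result,
    # without touching the filesystem again.
    is_dir = stat.S_ISDIR(st.st_mode)
    is_file = stat.S_ISREG(st.st_mode)
    is_link = stat.S_ISLNK(st.st_mode)  # always False after stat()

print(is_dir, is_file, is_link)
```

A rich stat object would presumably expose these as methods rather than
requiring the stat module helpers.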

> > That's why we decided to punt on the caching question until 3.5 - it's
> > better to provide a predictable building block that doesn't provide caching,
> > and then work out how to provide a sensible caching layer on top of that,
> > rather than trying to rush a potentially flawed caching design that leads to
> > inconsistent behaviour.
>
> Yep, agreed about rushing in a potentially flawed caching design. But
> I also don't want to "rush in" a design that prohibits scandir()-style
> performance optimizations -- though I guess it can still go in there
> one way or the other.

Yeah, the realisation that an initial non-caching approach didn't lock us
out of external caching may not have been well communicated to the list. I
was discussing the walkdir integration possibilities with Antoine and Guido
and realised I would likely still need an external cache, even if pathlib
had its own internal caching. At that point, it seemed highly desirable to
duck the caching question entirely.

> "Worst case", we can add os.scandir() separately, which returns
> DirEntry, "path-like" objects.

Indeed, we may still want such an object API, since dirent doesn't provide
full stat info.

A PEP reviewing all this for 3.5 and proposing a specific os.scandir API
would be a good thing.

Cheers,
Nick.

>
> -Ben


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Nick Coghlan
On 25 Nov 2013 09:42, "Ben Hoyt"  wrote:
>
> > Using "**" for directory spanning globs is also another case of us borrowing
> > a reasonably common idiom from *nix systems that may not be familiar to
> > Windows users.
>
> Okay, *nix wins then. :-) Python's stdlib is already fairly
> *nix-oriented (even when it's being cross-platform), so I guess it's
> not a big deal.
>
> My only remaining concern then is that there shouldn't be more than
> one way to do recursive globbing in a new API like this. Why does
> rglob() exist when the documentation simply says "like calling glob()
> but with '**' added in front of the pattern"?
>
> http://docs.python.org/dev/library/pathlib.html#pathlib.Path.rglob

Because it's a layered API - embedding ** in the pattern is a strictly more
powerful interface, but can be a little tricky to get your head around
(especially if you don't use a shell that has the feature). rglob() is
simpler, but not as flexible.
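
A small sketch of the two layers side by side (the file names are
invented for the example):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "a" / "b").mkdir(parents=True)
    for rel in ("top.txt", "a/mid.txt", "a/b/deep.txt", "a/b/skip.log"):
        (root / rel).touch()

    def names(paths):
        return sorted(p.relative_to(root).as_posix() for p in paths)

    # Simple layer: rglob() always recurses from the starting directory.
    via_rglob = names(root.rglob("*.txt"))

    # Same result spelled with the more general pattern syntax.
    via_glob = names(root.glob("**/*.txt"))

    # Something rglob() alone cannot express: recurse only below "a".
    only_under_a = names(root.glob("a/**/*.txt"))

print(via_rglob)
print(only_under_a)
```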

We offer that kind of multi-level API fairly often. For example,
subprocess.call() and friends are simpler interfaces for particular ways of
using the powerful-but-complex subprocess.Popen API. The metaprogramming
stack (functions, classes, decorators, descriptors, metaclasses) similarly
offers the ability to trade increased complexity for increases in power and
flexibility.

In these cases, the "obvious way" is to use the simplest API that covers
the use case, and only reach for the more complex API when you genuinely
need it.
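
The subprocess case looks like this in practice. This is only a sketch;
the child command is an arbitrary placeholder that runs the current
interpreter:

```python
import subprocess
import sys

# Placeholder child process: the running interpreter printing one word.
cmd = [sys.executable, "-c", "print('hello')"]

# Simple layer: run it, wait, get the exit status. Nothing else to manage.
rc_simple = subprocess.call(cmd, stdout=subprocess.DEVNULL)

# Powerful layer: same command, but we wire up the pipe ourselves
# and get access to the child's output as well as its exit status.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out, _ = proc.communicate()
rc_full = proc.returncode

print(rc_simple, rc_full, out.strip())
```

If all you need is the exit status, call() is the obvious choice; Popen
only earns its extra complexity once you need the pipes.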

Cheers,
Nick.

>
> -Ben


Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Ben Hoyt
> The idea of the rich stat result object is that it has all that info
> prepopulated, based on an initial stat call. "Caching" it amounts to "keep a
> reference to it".
>
> It is suggested that it would be a subset of the pathlib.Path API:
> http://bugs.python.org/issue19725
>
> If it's also a superset of the existing stat object API, then at least
> Path.stat and Path.lstat (and perhaps the lower level APIs) can be updated
> to return it in 3.5.

Got it.

>> "Worst case", we can add os.scandir() separately, which returns
>> DirEntry, "path-like" objects.
>
> Indeed, we may still want such an object API, since dirent doesn't provide
> full stat info.

I'm not quite sure what you're suggesting here.

In any case, I'm going to modify my scandir() so its DirEntry objects
are closer to pathlib.Path, particularly:

* isdir() -> is_dir()
* isfile() -> is_file()
* islink() -> is_symlink()
* add is_socket(), is_fifo(), is_block_device(), and is_char_device()
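
For illustration, here is how the pathlib-style names read in a loop.
This sketch assumes a Python version where os.scandir() exists in the
stdlib (3.5 or later), not the standalone scandir module discussed in
this thread; it uses only is_dir(), is_file() and is_symlink().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "subdir"))
    open(os.path.join(tmp, "file.txt"), "w").close()

    kinds = {}
    for entry in os.scandir(tmp):
        # The is_*() methods mirror the pathlib.Path spellings and can
        # often be answered from the directory listing itself.
        if entry.is_symlink():
            kinds[entry.name] = "symlink"
        elif entry.is_dir():
            kinds[entry.name] = "dir"
        elif entry.is_file():
            kinds[entry.name] = "file"
        else:
            kinds[entry.name] = "other"

print(kinds)
```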

I'm considering removing DirEntry's .dirent attribute entirely. The
above is_* functions cover everything in .dirent.d_type in a much more
Pythonic and cross-platform way, and the only other info in .dirent is
d_ino -- can a non-Windows dev tell me how or when d_ino would be
useful? If it's useful, is it useful in a higher-level, cross-platform
API such as scandir()?

Hmmm, I wonder about this "rich stat object" idea in light of the
above. Do the methods on pathlib.Path basically supersede the need for
this? Because otherwise folks will always be wondering whether to say
"path.is_dir()" or "path.stat().is_dir" ... two ways to do it, right
next to each other. So I'd prefer to add the "rich" stuff on the
higher-level Path instead of the lower-level stat.

> A PEP reviewing all this for 3.5 and proposing a specific os.scandir API
> would be a good thing.

Thanks, I'll definitely consider writing a PEP.

-Ben


Re: [Python-Dev] PEP 428 - pathlib API questions

2013-11-24 Thread Serhiy Storchaka

25.11.13 01:35, Nick Coghlan wrote:
> Using "**" for directory spanning globs is also another case of us
> borrowing a reasonably common idiom from *nix systems that may not be
> familiar to Windows users.

Rather from the Java world.




Re: [Python-Dev] pathlib and issue 11406 (a directory iterator returning stat-like info)

2013-11-24 Thread Paul Moore
On 25 November 2013 03:18, Ben Hoyt  wrote:
> d_ino -- can a non-Windows dev tell me how or when d_ino would be
> useful? If it's useful, is it useful in a higher-level, cross-platform
> API such as scandir()?

OK, so I'm a Windows dev, but my understanding is that d_ino is useful
to tell if two files are identical - hard links to the same physical
file have the same d_ino value. I don't believe it's possible to do
this on Windows at all.

I've seen it used in tools like diff, to short-circuit doing the
actual diff if you know from a stat that the 2 files are the same.
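
A minimal sketch of that check, assuming a filesystem that supports hard
links (the file names are made up). It compares the inode number from a
full stat() together with the device number, since inode numbers are
only unique per device:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    hard_link = os.path.join(tmp, "link.txt")
    copy = os.path.join(tmp, "copy.txt")

    with open(original, "w") as f:
        f.write("data")
    os.link(original, hard_link)      # second name for the same file
    with open(copy, "w") as f:
        f.write("data")               # same content, different file

    def same_file(a, b):
        # Identical (device, inode) pairs mean the same physical file.
        return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

    st_orig = os.stat(original)
    link_is_same = same_file(st_orig, os.stat(hard_link))
    copy_is_same = same_file(st_orig, os.stat(copy))

    # The stdlib already wraps this comparison.
    samefile_says = os.path.samefile(original, hard_link)

print(link_is_same, copy_is_same, samefile_says)
```

A diff-like tool can skip reading the contents entirely when the first
check comes back true.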

Paul