Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/11/2016 09:55 PM, Serhiy Storchaka wrote:


[...]  But for example os.walk() was significantly
boosted by using os.scandir(); it would be sad to make it slower
again.


scandir's speed improvement is due to not throwing away data the OS 
was already giving us.



os.path is used in a number of files, sometimes in loops, sometimes
indirectly. It is hard to find all examples.


Currently, any of these functions that already take a string have to do 
a couple pointer comparisons to make sure they have a string; any of 
these functions that take both a string and a bytes have to do a couple 
pointer comparisons to make sure they have a string or a bytes;  the 
only difference if this PEP is accepted is the fall-back path when those 
first checks fail.


As an example: if os.walk is called with a Path, it converts the Path to 
a string (once!) and then uses that string to generate more strings and 
return strings.  When os.walk calls os.path.join or os.path.split it 
will be with strings (or bytes), not with the original Path object.
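
The convert-once behaviour described above can be sketched roughly as follows. This is only an illustration, not os.walk's actual code: list_files is a hypothetical helper, and it assumes a Python version (3.6 or later) in which the os.fspath() function proposed by the PEP is available.

```python
import os
import tempfile
from pathlib import Path


def list_files(top):
    """Yield file paths under `top` as plain strings (hypothetical helper)."""
    # Convert the argument exactly once, up front.
    top = os.fspath(top)
    stack = [top]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # From here on only str values reach os.path.join().
                full = os.path.join(current, entry.name)
                if entry.is_dir(follow_symlinks=False):
                    stack.append(full)
                else:
                    yield full


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sub").mkdir()
    Path(tmp, "sub", "a.txt").write_text("hello")
    # A Path goes in, plain strings come out.
    found = list(list_files(Path(tmp)))
```

The Path object is touched only in the first line of the function; every later os.path call sees a str.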



Functions such as glob.glob() call split() and join() for every
component, but they also use string or bytes operations on paths. So
they need to convert the argument to str or bytes before starting iteration,
and always call os.path functions only with str or bytes.


Exactly.


Additional
conversion in every os.path function is redundant.


And won't happen, since the fast-path checks will confirm that the 
argument is a string or bytes object.
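
A minimal sketch of what such a fast-path check might look like, assuming a hypothetical helper named as_fs_path and a hypothetical FakePath class that counts its own conversions. Neither name comes from the PEP; they are only here to make the point measurable.

```python
class FakePath:
    """Hypothetical rich path object that counts how often it is converted."""

    def __init__(self, raw):
        self.raw = raw
        self.conversions = 0

    def __fspath__(self):
        self.conversions += 1
        return self.raw


def as_fs_path(path):
    # Fast path: str and bytes are returned immediately.
    if isinstance(path, (str, bytes)):
        return path
    # Fall-back path: only reached when the first check fails.
    fspath = getattr(type(path), "__fspath__", None)
    if fspath is None:
        raise TypeError("expected str, bytes or path object, not "
                        + type(path).__name__)
    return fspath(path)


p = FakePath("/srv/data")
base = as_fs_path(p)  # fall-back path, calls __fspath__ once
# Every later call sees a str and takes the fast path.
joined = [as_fs_path(base) + "/" + name for name in ("a", "b", "c")]
```

After the first call the rich object is out of the picture, so the loop never triggers another conversion.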



I suppose most other
high-level functions that manipulate paths in a loop should also
convert arguments once at the start and don't need the support of the path
protocol in os.path functions.


Humans can call the os.path functions; therefore, the os.path functions 
need to support the __fspath__ protocol.



I'm for adding conversions in C-implemented path-consuming APIs and maybe
in high-level path manipulation functions like os.walk(), but leaving the
low-level APIs of os.path, fnmatch and glob unchanged.


So instead of the easy to remember "Path doesn't work with the rest of 
the standard library" we'll have "Path works with some APIs, but not 
others -- guess you better look it up" ?  That is not an improvement.


--
~Ethan~
___
Python-Dev mailing list
[email protected]
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sven R. Kunze

On 12.05.2016 00:13, Brett Cannon wrote:
I see this whole discussion breaking down into a few groups which 
changes what gets done upfront and what might be done farther down the 
line:


 1. Maximum acceptance: do whatever we can to make all representations
    of paths just work, which means making all places working with a
    path in the stdlib accept path objects, str, and bytes.
 2. Safely use path objects: __fspath__() is there to signal an object
    is a file system path and to get back a lower-level representation
    so people stop calling str() on everything, providing some
    interface signaling that someone doesn't misuse an object as a
    path and only changing path consumption APIs -- e.g. open() --
    and not path manipulation APIs -- e.g. os.path -- in the stdlib.
 3. It ain't worth it: those that would rather just skip all of this
    and drop pathlib from the stdlib.



Sorry for being picky here. I think the last group needs to be split up:

3. It ain't worth it: those that would rather just skip all of this
4. drop pathlib from the stdlib.

I put myself into camp 3, mostly because I don't consider the "walled 
garden problem" a problem at all and I realized that our past issues 
with pathlib resulted from missing features in pathlib, not in the rest 
of the stdlib.



Sven


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Serhiy Storchaka

On 12.05.16 10:54, Ethan Furman wrote:

Currently, any of these functions that already take a string have to do
a couple pointer comparisons to make sure they have a string; any of
these functions that take both a string and a bytes have to do a couple
pointer comparisons to make sure they have a string or a bytes;  the
only difference if this PEP is accepted is the fall-back path when those
first checks fail.


This is cheap in C, but os.path functions are implemented in Python. 
They have to make at least one function call (os.fspath(), hasattr() or 
isinstance()), not counting the bytecode for retrieving arguments, 
resolving attributes, comparing, and jumps. Currently os.path functions 
use tricks to avoid overheads.


One more problem is that currently many os.path functions work with 
duck-typed strings (e.g. UserString). Using os.fspath() would likely limit 
the supported types to str, bytes and types that support the path protocol.
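
The duck-typing concern can be illustrated with a short sketch. It assumes a Python version (3.6 or later) where os.fspath() exists; the file name is made up for the example.

```python
import os
from collections import UserString

duck = UserString("reports/2016.txt")

# String-like operations work because UserString mimics str.
looks_like_str = duck.endswith(".txt")

# os.fspath() accepts only str, bytes, or objects with __fspath__,
# and UserString is none of those.
try:
    os.fspath(duck)
    rejected = False
except TypeError:
    rejected = True
```

So a function that starts routing its argument through os.fspath() stops accepting objects like this one, even though they behaved like strings before.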





Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sven R. Kunze

Thanks Brett for your hard work. My comments below:

On 11.05.2016 18:43, Brett Cannon wrote:

Rationale
=========

Historically in Python, file system paths have been represented as
strings or bytes. This choice of representation has stemmed from C's
own decision to represent file system paths as
``const char *`` [#libc-open]_. While that is a totally serviceable
format to use for file system paths, it's not necessarily optimal. At
issue is the fact that while all file system paths can be represented
as strings or bytes, not all strings or bytes represent a file system
path.


I can remember this argument being made during the discussion. I am not 
sure if that is 100% correct as soon as we talk about PurePaths.



This can lead to issues where any e.g. string duck-types to a
file system path whether it actually represents a path or not.

To help elevate the representation of file system paths from their
representation as strings and bytes to a more appropriate object
representation, the pathlib module [#pathlib]_ was provisionally
introduced in Python 3.4 through PEP 428. While considered by some as
an improvement over strings and bytes for file system paths, it has
suffered from a lack of adoption. Typically the key issue listed
for the low adoption rate has been the lack of support in the standard
library. This lack of support required users of pathlib to manually
convert path objects to strings by calling ``str(path)`` which many
found error-prone.

One issue in converting path objects to strings comes from
the fact that the only generic way to get a string representation of the
path was to pass the object to ``str()``. This can pose a
problem when done blindly as nearly all Python objects have some
string representation whether they are a path or not, e.g.
``str(None)`` will give a result that
``builtins.open()`` [#builtins-open]_ will happily use to create a new
file.
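
The ``str(None)`` hazard mentioned above is easy to reproduce. The sketch below works inside a temporary directory and assumes Python 3.6 or later for the os.fspath() comparison; the variable names are illustrative only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    target = None  # e.g. a lookup that failed and returned None

    # str() succeeds on any object, so the mistake goes unnoticed
    # and a file literally named "None" is created.
    with open(os.path.join(tmp, str(target)), "w") as f:
        f.write("oops")
    created = os.listdir(tmp)

    # A path-specific conversion refuses objects that are not paths.
    try:
        os.fspath(target)
        refused = False
    except TypeError:
        refused = True
```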

Exacerbating this whole situation is the
``DirEntry`` object [#os-direntry]_. While path objects have a
representation that can be extracted using ``str()``, ``DirEntry``
objects expose a ``path`` attribute instead. Having no common
interface between path objects, ``DirEntry``, and any other
third-party path library had become an issue. A solution that allowed
any path-representing object to declare that it was a path and a way
to extract a low-level representation that all path objects could
support was desired.


I think the "Rationale" section ignores the fact that Path also supports 
the .path attribute now, which indeed defines a common interface between 
path objects.




[...]

Proposal
========

This proposal is split into two parts. One part is the proposal of a
protocol for objects to declare and provide support for exposing a
file system path representation.


https://docs.python.org/3/whatsnew/changelog.html says:

"Add 'path' attribute to pathlib.Path objects, returning the same as 
str(), to make it more similar to DirEntry. Library code can now write 
getattr(p, 'path', p) to get the path as a string from a Path, a 
DirEntry, or a plain string. This is essentially a small one-off protocol."
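
A small sketch of the getattr idiom quoted above. RichPath is a hypothetical stand-in for a path object with a .path attribute, and path_string is a made-up helper name; only os.DirEntry and its .path attribute are real here.

```python
import os
import tempfile


class RichPath:
    """Hypothetical stand-in for a path object exposing .path."""

    def __init__(self, path):
        self.path = path


def path_string(p):
    # The quoted idiom: use .path if present, otherwise p itself.
    return getattr(p, "path", p)


with tempfile.TemporaryDirectory() as tmp:
    expected = os.path.join(tmp, "a.txt")
    open(expected, "w").close()
    with os.scandir(tmp) as it:
        entry = next(it)  # a real os.DirEntry
        from_entry = path_string(entry)
    from_rich = path_string(RichPath(expected))
    from_str = path_string(expected)
```

All three inputs collapse to the same string, which is what makes the attribute work as an informal protocol.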


So, in order to promote the "small one-off protocol" to a broader 
protocol, this PEP proposes a simple rename of .path to .__fspath__, is 
that correct?



The only issue I see with it is that it requires another function 
(os.fspath) to extract the "low-level representation". .path seems far 
easier to me.



The other part is changes to Python's
standard library to support the new protocol.


I think this could be another PEP unrelated to the first part.

These changes will also have the pathlib module drop its provisional 
status.


Not sure if that should be part of the PEP, maybe yes.


[...]


The remainder of the PEP unfolds as a flawless implication of the 
rationale and the proposed idea.


Unfortunately, I don't have anything to contribute to the open issues. 
All solutions have their pros and cons and everything that could be said 
has been said. I think you need to decide.


Sven



Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 11:14 AM, Serhiy Storchaka  wrote:
>
> This is cheap in C, but os.path functions are implemented in Python. They
> have to make at least one function call (os.fspath(), hasattr() or
> isinstance()), not counting a bytecode for retrieving arguments, resolving
> attributes, comparing, jumps. Currently os.path functions use tricks to
> avoid overheads
>

I suppose a C-implemented version of fspath *called from python* might
be the fastest option at least in some cases. After all, a function
call (isinstance or hasattr) is likely anyway, unless of course `try:
path.__fspath__` is used.

> One more problem is that currently many os.path functions work with
> duck-typed strings (e.g. UserString). Using os.fspath() would likely limit
> the supported types to str, bytes and types that support the path protocol.
>

Something like

path = path.__fspath__() if hasattr(path, '__fspath__') else path

as currently in the PEP, would not have this problem. However, I
wonder whether such duck string paths actually exist (although it does
remind me of my earlier experiments for solving the pathlib
compatibility problem ;-).
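
To make the trade-off concrete, here is the one-liner above wrapped in a function. The names lenient_fspath and Location are hypothetical and exist only for this sketch.

```python
from collections import UserString


class Location:
    """Hypothetical path-like class implementing the proposed protocol."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


def lenient_fspath(path):
    # The expression from the message above, unchanged.
    return path.__fspath__() if hasattr(path, "__fspath__") else path


converted = lenient_fspath(Location("/etc/hosts"))

# Duck-typed strings pass through untouched...
duck = UserString("/etc/hosts")
passed_through = lenient_fspath(duck)

# ...but so does anything else, because nothing is type-checked.
unchecked = lenient_fspath(42)
```

The approach keeps duck-typed strings working at the cost of deferring errors for objects that are not paths at all.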


-- Koos

P.S: I think it's great that you are concerned about the performance,
and I find it important. However, this does feel like premature
optimization to me at this point. We should first decide what
functions should support the protocol, and after that, find the
performance concerns and fix them. I have the feeling that the cases
where this would be a performance bottleneck are quite rare, but if
they are not, they may well be fixable.


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 11:31 AM, Sven R. Kunze  wrote:
> On 11.05.2016 18:43, Brett Cannon wrote:
>>
>> Rationale
>> =========
>>
>> Historically in Python, file system paths have been represented as
>> strings or bytes. This choice of representation has stemmed from C's
>> own decision to represent file system paths as
>> ``const char *`` [#libc-open]_. While that is a totally serviceable
>> format to use for file system paths, it's not necessarily optimal. At
>> issue is the fact that while all file system paths can be represented
>> as strings or bytes, not all strings or bytes represent a file system
>> path.
>
>
> I can remember this argument being made during the discussion. I am not sure
> if that is 100% correct as soon as we talk about PurePaths.
>

I had suggested an alternative wording for this (see my commit on the
work on Rationale).


>> Proposal
>> ========
>>
>> This proposal is split into two parts. One part is the proposal of a
>> protocol for objects to declare and provide support for exposing a
>> file system path representation.
>
>
> https://docs.python.org/3/whatsnew/changelog.html says:
>
> "Add 'path' attribute to pathlib.Path objects, returning the same as str(),
> to make it more similar to DirEntry. Library code can now write getattr(p,
> 'path', p) to get the path as a string from a Path, a DirEntry, or a plain
> string. This is essentially a small one-off protocol."
>
> So, in order to promote the "small one-off protocol" to a broader
> protocol, this PEP proposes a simple rename of .path to .__fspath__, is that
> correct?
>

Well, I have brought this up previously several times. Indeed I see
this as a further development of that duck-typing compatibility
approach.  However, while the .path attribute is prior art, it has not
been in a release yet.

> Unfortunately, I don't have anything to contribute to the open issues. All
> solutions have their pros and cons and everything that could be said has
> been said. I think you need to decide.
>

Surprisingly enough, there are new things being said all the time. But
luckily there seem to be signs of convergence.

-- Koos


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Nick Coghlan
On 12 May 2016 at 02:43, Brett Cannon  wrote:
> **deep, calming breath**
>
> Here is the PEP for __fspath__(). The draft lives at
> https://github.com/brettcannon/path-pep so feel free to send me PRs for
> spelling mistakes, grammatical errors, etc.

Thanks for putting this together :)

> C API
> '''''
>
> The C API will gain an equivalent function to ``os.fspath()`` that
> also allows bytes objects through::
>
>     /*
>        Return the file system path of the object.
>
>        If the object is str or bytes, then allow it to pass through with
>        an incremented refcount. All other types raise a TypeError.
>     */
>     PyObject *
>     PyOS_RawFSPath(PyObject *path)
>     {
>         if (PyObject_HasAttrString(path, "__fspath__")) {
>             /* PyObject_CallMethod() takes the method name as a C string. */
>             path = PyObject_CallMethod(path, "__fspath__", NULL);
>             if (path == NULL) {
>                 return NULL;
>             }
>         }
>         else {
>             Py_INCREF(path);
>         }
>
>         if (!PyUnicode_Check(path) && !PyBytes_Check(path)) {
>             /* Format the message before dropping our reference. */
>             PyErr_Format(PyExc_TypeError,
>                          "expected a string, bytes, or path object, not %S",
>                          Py_TYPE(path));
>             Py_DECREF(path);
>             return NULL;
>         }
>
>         return path;
>     }

I'd still like to see this exposed to Python code as os._raw_fspath()
(with the leading underscore just meaning "this probably isn't the API
you want" rather than indicating a private or unstable API), and then
fspath() defined as a wrapper around it which disallows bytes as
output.
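
The layering suggested here could look roughly like the sketch below. It is written as plain module-level functions rather than as additions to os, the name raw_fspath stands in for the proposed os._raw_fspath(), and BytesPath is a hypothetical class used only to exercise the bytes case.

```python
def raw_fspath(path):
    """Hypothetical Python spelling of the C function quoted above."""
    if hasattr(path, "__fspath__"):
        path = path.__fspath__()
    if not isinstance(path, (str, bytes)):
        raise TypeError("expected a string, bytes, or path object, not "
                        + type(path).__name__)
    return path


def fspath(path):
    """Str-only wrapper: same lookup, but bytes results are rejected."""
    result = raw_fspath(path)
    if isinstance(result, bytes):
        raise TypeError("expected a str path, got bytes")
    return result


class BytesPath:
    """Hypothetical path object whose low-level form is bytes."""

    def __fspath__(self):
        return b"/var/log"


raw_result = raw_fspath(BytesPath())  # bytes are allowed through
str_result = fspath("notes.txt")      # str passes both layers

try:
    fspath(BytesPath())
    bytes_rejected = False
except TypeError:
    bytes_rejected = True
```

The wrapper adds a single isinstance check, which is why adding the raw variant later would be straightforward.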

However, I don't have a specific use case, and it would be
straightforward to add later, so the overall PEP gets a +1 from me.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 3:04 PM, Nick Coghlan  wrote:
>
> I'd still like to see this exposed to Python code as os._raw_fspath()
> (with the leading underscore just meaning "this probably isn't the API
> you want" rather than indicating a private or unstable API), and then
> fspath() defined as a wrapper around it which disallows bytes as
> output.
>

I don't remember (should probably check) if you previously proposed
implementing exactly that in C, but I indeed agree with what you write
above, except that I don't like the "_raw_" prefix in the name. I
would be willing to call that simply fspath though, since as mentioned
before in this thread (I think by Brett and me), the reasons for
rejecting bytes in fspath are really quite minor.


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Nick Coghlan
On 12 May 2016 at 22:40, Koos Zevenhoven  wrote:
> On Thu, May 12, 2016 at 3:04 PM, Nick Coghlan  wrote:
>>
>> I'd still like to see this exposed to Python code as os._raw_fspath()
>> (with the leading underscore just meaning "this probably isn't the API
>> you want" rather than indicating a private or unstable API), and then
>> fspath() defined as a wrapper around it which disallows bytes as
>> output.
>
> I don't remember (should probably check) if you previously proposed
> implementing exactly that in C, but I indeed agree with what you write
> above, except that I don't like the "_raw_" prefix in the name. I
> would be willing to call that simply fspath though, since as mentioned
> before in this thread (I think by Brett and me), the reasons for
> rejecting bytes in fspath are really quite minor.

It's not unusual for me to encounter "POSIX oughtta be enough for
anyone" folks that are not yet entirely convinced that
bytes-are-not-text, so I'm actually in favour of making the default
Python-level API str-only as a healthy nudge away from the
"text-is-just-bytes-with-an-encoding!" school of thought.

However, in terms of the three groups Brett articulated (maximum
flexibility, encouraging cross-platform correctness, and forgetting
the whole idea), I'm in both camps 1 & 2 - I work with POSIX enough
that I'm entirely on board with the notion that if you're specifically
modelling *POSIX* paths, then bytes-with-an-assumed-encoding is
frequently a good enough representation, but also deal with other
environments (like Windows, the JVM and the CLR) enough to know that
that particular representation of filesystem paths breaks down the
moment you expand your scope of interest beyond *nix platforms.

Hence the suggestion of having os.fspath() be the group 2
guaranteed-to-only-emit-str API, with os._raw_fspath() as the lower
level "I know I'm potentially being POSIX-centric here, but that's OK
for my use case" group 1 interface.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] PyGC_Collect ignores state of `enabled`

2016-05-12 Thread Armin Rigo
Hi Lukasz,

On 10 May 2016 at 04:13, Łukasz Langa  wrote:
> However, because of PyGC_Collect() called in Py_Finalize(), during
> interpreter shutdown the collection is done anyway, Linux does CoW and the
> memory usage spikes. Which is ironic on process shutdown.

Try to call os._exit() to avoid doing all this work on shutdown (after
you have checked that it is indeed not doing anything interesting).
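
A rough sketch of this suggestion, run in a child process so the effect is observable. The child script is a made-up example, and subprocess.run(..., capture_output=True) assumes Python 3.7 or later. The atexit handler stands in for shutdown work in general, since os._exit() leaves the process without running interpreter finalization.

```python
import subprocess
import sys

# Hypothetical worker script: it finishes its work, then leaves without
# going through normal interpreter shutdown.
child_code = """
import atexit, gc, os, sys
gc.disable()
atexit.register(lambda: print("finalizer ran"))
print("work done")
sys.stdout.flush()   # os._exit() does not flush buffered output
os._exit(0)          # exit immediately, skipping cleanup handlers
"""

result = subprocess.run([sys.executable, "-c", child_code],
                        capture_output=True, text=True)
```

Anything that relies on shutdown hooks (flushing buffers, removing temporary files) has to be done explicitly before the call, which is the "check that it is not doing anything interesting" caveat.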


A bientôt,

Armin.


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 4:20 PM, Nick Coghlan  wrote:
>
> It's not unusual for me to encounter "POSIX oughtta be enough for
> anyone" folks that are not yet entirely convinced that
> bytes-are-not-text, so I'm actually in favour of making the default
> Python-level API str-only as a healthy nudge away from the
> "text-is-just-bytes-with-an-encoding!" school of thought.
>

This was also how I convinced myself about the default str constraint.
However, I'm afraid it would be a weak weapon against using bytes
paths, since the people using bytes paths would not be likely to call
it, regardless of whether it supports bytes or not.

The nice thing is that pathlib is str-only and *that* will push people
away from bytes paths.

> However, in terms of the three groups Brett articulated (maximum
> flexibility, encouraging cross-platform correctness, and forgetting
> the whole idea), I'm in both camps 1 & 2 - I work with POSIX enough
> that I'm entirely on board with the notion that if you're specifically
> modelling *POSIX* paths, then bytes-with-an-assumed-encoding is
> frequently a good enough representation, but also deal with other
> environments (like Windows, the JVM and the CLR) enough to know that
> that particular representation of filesystem paths breaks down the
> moment you expand your scope of interest beyond *nix platforms.
>

I also agree with parts of Brett's "camp 2".

-- Koos


Re: [Python-Dev] Slow downloads from python.org

2016-05-12 Thread Dima Tisnek
Gone now, must've been transient.
I've no idea if it was python end, my end (tail?) or something
slithery in between.

On 11 May 2016 at 20:17, Brett Cannon  wrote:
>
>
> On Wed, 11 May 2016 at 10:56 Dima Tisnek  wrote:
>>
>> Sorry, this is probably wrong place to ask, but is it only me?
>> I can't get more than 40KB/s downloading from python.org
>
>
> It's just you or the problem has passed; just downloaded much faster than
> 40KB/s.


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-12 Thread Guido van Rossum
OK, then PEP 515 is now officially accepted! Congratulations. Start the
implementation work!

--Guido (mobile)
On May 11, 2016 10:33 PM, "Georg Brandl"  wrote:

I'm happy with the latest version.

Georg

On 05/11/2016 06:46 PM, Guido van Rossum wrote:
> If the authors are happy I'll accept it right away.
>
> (I vaguely recall there's another PEP that's ready for pronouncement --
> but which one?)
>
> On Wed, May 11, 2016 at 9:34 AM, Brett Cannon wrote:
>
> Is there anything holding up PEP 515 at this point in terms of
> acceptance or implementation?
>
> On Sat, 19 Mar 2016 at 11:56 Guido van Rossum wrote:
>
> All that sounds fine!
>
> On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah wrote:
> > Guido van Rossum writes:
> >> So should the preprocessing step just be s.replace('_', ''), or should
> >> it reject underscores that don't follow the rules from the PEP
> >> (perhaps augmented so they follow the spirit of the PEP and the letter
> >> of the IBM spec)?
> >>
> >> Honestly I think it's also fine if specifying this exactly is left out
> >> of the PEP, and handled by whoever adds this to Decimal. Having a PEP
> >> to work from for the language spec and core builtins (int(), float()
> >> complex()) is more important.
> >
> > I'd keep it simple for Decimal: Remove left and right whitespace (we're
> > already doing this), then remove underscores from the remaining string
> > (which must not contain any further whitespace), then use the IBM grammar.
> >
> >
> > We could add a clause to the PEP that only those strings that follow
> > the spirit of the PEP are guaranteed to be accepted in the future.
> >
> >
> > One reason for keeping it simple is that I would not like to slow down
> > string conversion, but thinking about two grammars is also a problem --
> > part of the string conversion in libmpdec is modeled in ACL2, which
> > would be invalidated or at least complicated with two grammars.
> >
> >
> >
> > Stefan Krah
> >
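
The "keep it simple" preprocessing described for Decimal can be sketched in a few lines. parse_decimal is a hypothetical helper, not part of the decimal module, and the final step simply delegates to the Decimal constructor as a stand-in for "use the IBM grammar".

```python
from decimal import Decimal, InvalidOperation


def parse_decimal(text):
    """Hypothetical helper following the three steps described above."""
    stripped = text.strip()  # 1. drop leading and trailing whitespace
    if any(ch.isspace() for ch in stripped):
        raise ValueError("whitespace inside numeric string")
    cleaned = stripped.replace("_", "")  # 2. drop every underscore
    try:
        return Decimal(cleaned)  # 3. hand the rest to Decimal's parser
    except InvalidOperation:
        raise ValueError("not a valid decimal: %r" % text) from None


value = parse_decimal("  1_000.25\n")

try:
    parse_decimal("1 000")
    inner_space_rejected = False
except ValueError:
    inner_space_rejected = True
```

Note that this deliberately accepts underscores in places the PEP's literal grammar would not, such as "1__0"; that is the simplification being discussed, with only PEP-conforming strings guaranteed for the future.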


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/12/2016 01:31 AM, Sven R. Kunze wrote:


I think the "Rationale" section ignores the fact the Path also supports
the .path attribute now. Which indeed defines a common interface between
path objects.


The version of Python that has Path.path has not been released yet.  And 
even so, .path is not a "common interface" as neither str nor bytes have 
it, and they also are used as path objects.


And even given all that, for smoother interoperability with the rest of 
the stdlib, or at least the os.* portion, those functions would still 
need to be upgraded to check for .path on the incoming arguments -- at 
which point we may as well make a protocol to properly support file 
system paths instead of relying on the rather generic attribute name of 
'path'.


--
~Ethan~


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sven R. Kunze

On 11.05.2016 23:57, Brett Cannon wrote:
On Wed, 11 May 2016 at 14:29 Nikolaus Rath wrote:


On May 11 2016, Brett Cannon wrote:
> This PEP proposes a protocol for classes which represent a file system
> path to be able to provide a ``str`` or ``bytes`` representation.
[...]

As I said before, to me this seems like a lot of effort for a very
specific use-case.



Exactly. Especially when considering what else can be done to improve 
the situation considerably.



So let me put forward two hypothetical scenarios to
better understand your position:

- A new module for URL handling is added to the standard library (or
  urllib is suitably extended). There is a proposal to add a new
  protocol that allows classes to provide a ``str`` or ``bytes``
  representation of URLs.

- A new (third-party) library for natural language processing arises
  that exposes a specific class for representing audio data. Existing
  language processing code just uses bytes objects. To ease transition
  and interoperability, it is proposed to add a new protocol for classes
  that represent audio data to provide a bytes representation.



You can even add the timedelta-to-seconds protocol that somebody thought 
would be a good idea:


https://mail.python.org/pipermail/python-dev/2016-April/144018.html
https://mail.python.org/pipermail/python-ideas/2016-May/040226.html

The generalization is straightforward and a result of this discussion. 
If it works and is a good idea for pathlib, then there's absolutely no 
reason not to do this for the datetime lib and other rich-object libs. 
Same goes the other way round. Question still is: is it a good idea?


Maybe, it will become a successful pattern. Maybe not.


Do you think you would be in favor of adding these protocols to
the stdlib/language reference as well?


Maybe for URLs, not for audio data (at least not in the stdlib; 
community can do what they want).


If not, what's the crucial
difference to file system paths?


Nearly everyone uses file system paths on a regular basis, less so 
than URLs but still a good amount of people. Very few people work with 
audio data.


The amount of usage should be taken into account, of course. However, 
the question remains whether that suffices as a justification for the effort.



Best,
Sven


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Guido van Rossum
I am glad this is finally happening. There's quite a bit of noise in the
thread which I have to ignore. The two issues that I want to respond to are
speed and whether os.fspath() can return bytes.

- Speed: We should trust our ability to optimize the implementations where
necessary. First the API issues need to be settled.

- Bytes: I strongly believe that os.fspath() should be a thin wrapper
around the __fspath__ protocol, like next() wraps the .__next__ protocol.
It should not get into bytes vs. string politics. If your app really needs
strings, call os.fsdecode(). So this is my version (unoptimized):

def fspath(p: Union[str, bytes, PathLike]) -> Union[str, bytes]:
    if isinstance(p, (str, bytes)):
        return p
    try:
        return p.__fspath__
    except AttributeError:
        raise TypeError(...)

Other than that I think the PEP is already in fine shape.
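
The division of labour described above (a thin os.fspath() plus os.fsdecode() for code that really needs strings) can be sketched as follows. It assumes Python 3.6 or later, where both functions accept objects implementing __fspath__; RawEntry and its path are hypothetical.

```python
import os


class RawEntry:
    """Hypothetical path object whose low-level form is bytes."""

    def __fspath__(self):
        return b"/srv/data/report.csv"


# The thin wrapper returns whatever the protocol method returns.
low_level = os.fspath(RawEntry())

# Code that needs text decodes explicitly...
as_text = os.fsdecode(RawEntry())

# ...and code that needs bytes encodes explicitly.
as_bytes = os.fsencode("/srv/data/report.csv")
```

This mirrors the next()/__next__ analogy: the wrapper adds no policy of its own, and the str-versus-bytes decision stays with the caller.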

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sven R. Kunze

On 12.05.2016 17:42, Ethan Furman wrote:

On 05/12/2016 01:31 AM, Sven R. Kunze wrote:


I think the "Rationale" section ignores the fact the Path also supports
the .path attribute now. Which indeed defines a common interface between
path objects.


The version of Python that has Path.path has not been released yet.  
And even so, .path is not a "common interface" as neither str nor 
bytes have it, and they also are used as path objects.


str and bytes will receive the __fspath__ attribute when this PEP is 
accepted?


And even given all that, for smoother interoperability with the rest 
of the stdlib, or at least the os.* portion, those functions would 
still need to be upgraded to check for .path on the incoming arguments 
-- at which point we may as well make a protocol to properly support 
file system paths instead of relying on the rather generic attribute 
name of 'path'.


Just so, if you accept changing os.* as a necessary solution.

If not, keeping .path would suffice and would be much simpler.


Best,
Sven


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/12/2016 09:26 AM, Sven R. Kunze wrote:

On 12.05.2016 17:42, Ethan Furman wrote:

On 05/12/2016 01:31 AM, Sven R. Kunze wrote:



I think the "Rationale" section ignores the fact the Path also supports
the .path attribute now. Which indeed defines a common interface between
path objects.


The version of Python that has Path.path has not been released yet.
And even so, .path is not a "common interface" as neither str nor
bytes have it, and they also are used as path objects.


str and bytes will receive the __fspath__ attribute when this PEP is
accepted?


No, they won't.  The __fspath__ protocol will reduce the rich path 
object down to a str/bytes object.


One could argue that a .path attribute is similar, but consider:  if you 
are handed a random object with a .path attribute, how certain can you 
be that it represents a file system path?  Contrariwise, how certain can 
you be of an object that has __fspath__?
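To make the contrast concrete, here is a minimal sketch, assuming an interpreter where os.fspath() exists as the PEP proposes (Python 3.6 or later). `Checkout` and `HttpRequest` are hypothetical classes invented for this illustration; neither comes from the thread or the stdlib.

```python
import os


class Checkout:
    """Hypothetical rich path object that opts in to the protocol."""

    def __init__(self, location):
        self._location = location

    def __fspath__(self):
        # Explicitly declares "I represent a file system path".
        return self._location


class HttpRequest:
    """Hypothetical object whose .path is a URL path, not a file system path."""

    def __init__(self, path):
        self.path = path


print(os.fspath(Checkout("/srv/repo")))

try:
    os.fspath(HttpRequest("/index.html"))
except TypeError as exc:
    # A bare .path attribute is not enough to be treated as a path.
    print("rejected:", exc)
```

The object with only a generic `.path` attribute is rejected, while the one that implements `__fspath__()` is accepted without any guessing.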


At any rate, we seem to be down to the details of os.fspath() so I don't 
see any reason to discuss .path any further.


--
~Ethan~


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Brett Cannon
On Thu, 12 May 2016 at 09:25 Guido van Rossum  wrote:

> I am glad this is finally happening. There's quite a bit of noise in the
> thread which I have to ignore.
>

Don't worry, I'm not ignoring it on your behalf. :)


> The two issues that I want to respond to are speed and whether os.fspath()
> can return bytes.
>
> - Speed: We should trust our ability to optimize the implementations where
> necessary. First the API issues need to be settled.
>

Added a note to the PEP to say perf isn't a worry for os.path.


>
> - Bytes: I strongly believe that os.fspath() should be a thin wrapper
> around the __fspath__ protocol, like next() wraps the .__next__ protocol.
> It should not get into bytes vs. string politics. If your app really needs
> strings, call os.fsdecode(). So this is my version (unoptimized):
>
> def fspath(p: Union[str, bytes, PathLike]) -> Union[str, bytes]:
>     if isinstance(p, (str, bytes)):
>         return p
>     try:
>         return p.__fspath__
>     except AttributeError:
>         raise TypeError(...)
>

> Other than that I think the PEP is already in fine shape.
>

Just to double-check, did you mean for __fspath__ to only be an attribute
in your example, or did you leave off the `()` by accident? As of right now
the PEP is proposing a method for the protocol to follow common practice of
using methods and in case the representation is not always pre-computed and
thus not necessarily giving the wrong impression that the attribute access
is cheap. But admittedly an attribute was previously proposed and there
wasn't a terribly strong argument against it beyond "we historically
haven't done it that way", so I'm open to swapping to an attribute if
that's your preference.

>


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sven R. Kunze

On 12.05.2016 18:56, Ethan Furman wrote:

On 05/12/2016 09:26 AM, Sven R. Kunze wrote:

str and bytes will receive the __fspath__ attribute when this PEP is
accepted?


No, they won't.  The __fspath__ protocol will reduce the rich path 
object down to a str/bytes object.


Would this make the implementation of os.fspath simpler?

After all, str and bytes are to some extent path-like objects.


Best,
Sven


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/12/2016 10:21 AM, Sven R. Kunze wrote:

On 12.05.2016 18:56, Ethan Furman wrote:

On 05/12/2016 09:26 AM, Sven R. Kunze wrote:



str and bytes will receive the __fspath__ attribute when this PEP is
accepted?


No, they won't.  The __fspath__ protocol will reduce the rich path
object down to a str/bytes object.


Would this make the implementation of os.fspath simpler?


Maybe, but a bad idea for two reasons:

1) Reducing a str to the exact same str is silly; and, more importantly
2) not every str/bytes is a path

--
~Ethan~


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sjoerd Job Postmus
I would like to make just 1 comment regarding the question of accepting
(or not) bytes as output of `os.fspath`.

The whole point of adding `os.fspath` is to make it easier to use Path
objects. This is in an effort to gain greater adoption of pathlib in
libraries. Now, this is an excellent idea.

However, if it were to reject bytes, that would mean that when libraries
start to use pathlib, it would suddenly become harder for people that
actually need bytes-support to use pathlib.

Now, the claim 'if you need bytes, you should not be using pathlib' is a
reasonable one. But what if I need bytes *and* a specific library (say,
image handling, or a web framework, or ...). It's not up to me if that
library uses pathlib or plain old os.path.join.

Is using surrogate-escapes enough for this case? I myself am not sure,
(and also not affected), but it sounds to me that rejecting bytes is a
wrong approach if there is no proper workaround (assuming the use-case
of pathlib is somewhere deep in library code).
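As a rough illustration of the surrogate-escape question, the sketch below round-trips a byte string that is not valid UTF-8 through os.fsdecode() and os.fsencode(). It assumes a POSIX system, where the file system encoding uses the surrogateescape error handler; the file name is made up.

```python
import os

# A file name containing a byte that is not valid UTF-8 (made-up example).
raw = b"report-\xff.txt"

# Decoding smuggles the undecodable byte through as a lone surrogate
# instead of raising an error.
text = os.fsdecode(raw)

# Encoding with the same handler restores the original bytes.
round_tripped = os.fsencode(text)

print(isinstance(text, str), round_tripped == raw)
```

So a str-only library can, on such a system, carry arbitrary bytes paths without losing information, as long as both ends use os.fsdecode() and os.fsencode().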



Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 7:24 PM, Guido van Rossum  wrote:
> I am glad this is finally happening. There's quite a bit of noise in the
> thread which I have to ignore. The two issues that I want to respond to are
> speed and whether os.fspath() can return bytes.
>
> - Speed: We should trust our ability to optimize the implementations where
> necessary. First the API issues need to be settled.
>
> - Bytes: I strongly believe that os.fspath() should be a thin wrapper around
> the __fspath__ protocol, like next() wraps the .__next__ protocol. It should
> not get into bytes vs. string politics. If your app really needs strings,
> call os.fsdecode(). So this is my version (unoptimized):
>

:)

Thank you for this. I can breathe now.

Some questions remain:

> def fspath(p: Union[str, bytes, PathLike]) -> Union[str, bytes]:
>     if isinstance(p, (str, bytes)):
>         return p
>     try:
>         return p.__fspath__
>     except AttributeError:
>         raise TypeError(...)
>

(I know Brett already posted this question, but somehow it did not
show up in my mailbox before I had written this. I'm (re)posting
because there is some stuff here that is not in Brett's email.)

You might be suggesting that __fspath__ should be an attribute, not a
method, or did you mean something like:

def fspath(p):
    if isinstance(p, (str, bytes)):
        return p
    try:
        p.__fspath__
    except AttributeError:
        raise TypeError(...)
    return p.__fspath__()

IMO, either is fine, I suppose. As you know, it's mostly a question of
whether __fspath__ will be a property or a method (on PurePath for
instance). But if you meant the former, that would change also the ABC
and the protocol description.

-- Koos


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Koos Zevenhoven
On Thu, May 12, 2016 at 8:22 PM, Sjoerd Job Postmus
 wrote:
> I would like to make just 1 comment regarding the question of accepting
> (or not) bytes as output of `os.fspath`.
>
> The whole point of adding `os.fspath` is to make it easier to use Path
> objects. This is in an effort to gain greater adoption of pathlib in
> libraries. Now, this is an excellent idea.
>
> However, if it were to reject bytes, that would mean that when libraries
> start to use pathlib, it would suddenly become harder for people that
> actually need bytes-support to use pathlib.
>
> Now, the claim 'if you need bytes, you should not be using pathlib' is a
> reasonable one. But what if I need bytes *and* a specific library (say,
> image handling, or a web framework, or ...). It's not up to me if that
> library uses pathlib or plain old os.path.join.
>
> Is using surrogate-escapes enough for this case? I myself am not sure,
> (and also not affected), but it sounds to me that rejecting bytes is a
> wrong approach if there is no proper workaround (assuming the use-case
> of pathlib is somewhere deep in library code).
>

This is out of the scope of this PEP and probably a very insignificant
issue (luckily, this is not the pathlib PEP). Surrogates will probably
work and if not, one can "blame" broken filenames ;).

-- Koos


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Guido van Rossum
On Thu, May 12, 2016 at 10:18 AM, Brett Cannon  wrote:

>
> On Thu, 12 May 2016 at 09:25 Guido van Rossum  wrote:
>
>> def fspath(p: Union[str, bytes, PathLike]) -> Union[str, bytes]:
>>     if isinstance(p, (str, bytes)):
>>         return p
>>     try:
>>         return p.__fspath__
>>     except AttributeError:
>>         raise TypeError(...)
>>
>
>> Other than that I think the PEP is already in fine shape.
>>
>
> - Bytes: I strongly believe that os.fspath() should be a thin wrapper
> around the __fspath__ protocol, like next() wraps the .__next__ protocol.
> It should not get into bytes vs. string politics. If your app really needs
> strings, call os.fsdecode(). So this is my version (unoptimized):
>
> Just to double-check, did you mean for __fspath__ to only be an attribute
> in your example, or did you leave off the `()` by accident? As of right now
> the PEP is proposing a method for the protocol to follow common practice of
> using methods and in case the representation is not always pre-computed and
> thus not necessarily giving the wrong impression that the attribute access
> is cheap. But admittedly an attribute was previously proposed and there
> wasn't a terribly strong argument against it beyond "we historically
> haven't done it that way", so I'm open to swapping to an attribute if
> that's your preference.
>
>>
Whoops. Didn't mean to change that! Yes, __fspath__ should remain a method.
You can breathe again. :-)
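With `__fspath__` kept as a method, the thin wrapper might look roughly like the following. This is an unoptimized sketch in the spirit of the snippets in this thread, not the final stdlib implementation; the `Mount` class and the error message wording are invented for illustration.

```python
def fspath(p):
    """Pass str/bytes through unchanged; otherwise call p.__fspath__()."""
    if isinstance(p, (str, bytes)):
        return p
    try:
        method = p.__fspath__
    except AttributeError:
        raise TypeError(
            "expected str, bytes, or an object with __fspath__(), not "
            + type(p).__name__
        ) from None
    # No str-vs-bytes policy here: return whatever the object gives us.
    return method()


class Mount:
    """Hypothetical path object that happens to store bytes."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


print(fspath("/etc/hosts"))
print(fspath(Mount(b"/mnt/data")))
```

Only the attribute lookup sits inside the `try`, so an AttributeError raised inside a buggy `__fspath__()` body is not mistaken for a missing protocol.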

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Brett Cannon
On Thu, 12 May 2016 at 11:36 Guido van Rossum  wrote:

> On Thu, May 12, 2016 at 10:18 AM, Brett Cannon  wrote:
>
>> On Thu, 12 May 2016 at 09:25 Guido van Rossum  wrote:
>>
>>> def fspath(p: Union[str, bytes, PathLike]) -> Union[str, bytes]:
>>>     if isinstance(p, (str, bytes)):
>>>         return p
>>>     try:
>>>         return p.__fspath__
>>>     except AttributeError:
>>>         raise TypeError(...)
>>>
>>
>>> Other than that I think the PEP is already in fine shape.
>>>
>>
>> - Bytes: I strongly believe that os.fspath() should be a thin wrapper
>> around the __fspath__ protocol, like next() wraps the .__next__ protocol.
>> It should not get into bytes vs. string politics. If your app really needs
>> strings, call os.fsdecode(). So this is my version (unoptimized):
>>
>> Just to double-check, did you mean for __fspath__ to only be an attribute
>> in your example, or did you leave off the `()` by accident? As of right now
>> the PEP is proposing a method for the protocol to follow common practice of
>> using methods and in case the representation is not always pre-computed and
>> thus not necessarily giving the wrong impression that the attribute access
>> is cheap. But admittedly an attribute was previously proposed and there
>> wasn't a terribly strong argument against it beyond "we historically
>> haven't done it that way", so I'm open to swapping to an attribute if
>> that's your preference.
>>
>>>
> Whoops. Didn't mean to change that! Yes, __fspath__ should remain a
> method. You can breathe again. :-)
>

That's a mechanical change so not exactly the most stressful aspect of this
PEP. :) I'll add the attribute angle to the Rejected Ideas, though.

Anyway, with your strong preference of how to tweak os.fspath() what
specifically would you like to see discussed at this point? Assuming the
os.fspath() -> bytes discussion is dealt with, the only open issues listed
in the PEP are the naming and placement of the ABC and how to do type hints
for all of this (thanks to the dichotomy of path objects using the protocol
and path-like objects which is the union of path object, str, and bytes and
the joy of trying to name all of this well).


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Guido van Rossum
On Thu, May 12, 2016 at 11:51 AM, Brett Cannon  wrote:

>
> Anyway, with your strong preference of how to tweak os.fspath() what
> specifically would you like to see discussed at this point?
>

Preferably nothing. :-) There's been too much discussion already.


> Assuming the os.fspath() -> bytes discussion is dealt with, the only open
> issues listed in the PEP are the naming and placement of the ABC and how to
> do type hints for all of this (thanks to the dichotomy of path objects
> using the protocol and path-like objects which is the union of path object,
> str, and bytes and the joy of trying to name all of this well).
>

Name and placement: I think it belongs in os, and os.PathLike sounds like a
fine name. (I'm surprised that os.DirEntry doesn't exist.)

Typing: do you want it to be a generic class? If not, the types can be left
out of the stdlib and only put in the stub (though you can show them in the
PEP of course).

If you want it to be generic we have more work to do. I'm out of time now
but we can discuss that after 3pm today.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Brett Cannon
On Thu, 12 May 2016 at 12:03 Guido van Rossum  wrote:

> On Thu, May 12, 2016 at 11:51 AM, Brett Cannon  wrote:
>
>>
>> Anyway, with your strong preference of how to tweak os.fspath() what
>> specifically would you like to see discussed at this point?
>>
>
> Preferably nothing. :-) There's been too much discussion already.
>

Works for me. :) I'll update the PEP with the new semantics for os.fspath()
and send out the updated version later today.


>
>
>> Assuming the os.fspath() -> bytes discussion is dealt with, the only open
>> issues listed in the PEP are the naming and placement of the ABC and how to
>> do type hints for all of this (thanks to the dichotomy of path objects
>> using the protocol and path-like objects which is the union of path object,
>> str, and bytes and the joy of trying to name all of this well).
>>
>
> Name and placement: I think it belongs in os, and os.PathLike sounds like
> a fine name. (I'm surprised that os.DirEntry doesn't exist.)
>

SGTM. And os.DirEntry doesn't exist simply because it's posix.DirEntry
since it's implemented entirely in C. We could add an alias but since it
isn't constructed from scratch I don't think it's worth it.


>
> Typing: do you want it to be a generic class? If not, the types can be
> left out of the stdlib and only put in the stub (though you can show them
> in the PEP of course).
>

If we aren't going to restrict what os.fspath() returns then I don't see
any need to make the type generic (I mean technically a generic version
might be nice for e.g. the constructor of pathlib only taking strings, but
it's probably overkill).

I guess my real question is whether we want to create typing.PathLike to
match os.PathLike? And it sounds like you don't want to bother with a
potential second type that corresponds to Union[str, bytes, PathLike].
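For illustration, the union mentioned above can be spelled as a plain alias in user code, with no typing.PathLike involved. This is only a sketch: `AnyPath` and `basename_of` are hypothetical names, and it assumes os.PathLike and os.fspath() end up in os as discussed in this thread.

```python
import os
import pathlib
from typing import Union

# Hypothetical alias for "path object, str, or bytes".
AnyPath = Union[str, bytes, os.PathLike]


def basename_of(path: AnyPath) -> Union[str, bytes]:
    """Return the final component, keeping the str/bytes flavour of the input."""
    return os.path.basename(os.fspath(path))


print(basename_of("logs/app.txt"))
print(basename_of(b"logs/app.txt"))
print(basename_of(pathlib.PurePosixPath("logs/app.txt")))
```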


>
> If you want it to be generic we have more work to do. I'm out of time now
> but we can discuss that after 3pm today.
>

Sure thing.


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Guido van Rossum
There's no need for typing.PathLike.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/12/2016 10:22 AM, Sjoerd Job Postmus wrote:


However, if it were to reject bytes, that would mean that when libraries
start to use pathlib, it would suddenly become harder for people that
actually need bytes-support to use pathlib.


pathlib is not about bytes support.  While bytes are necessary in this 
digital world we live in, most things that look like text should be 
text, and that includes most paths.


Now, the claim 'if you need bytes, you should not be using pathlib' is a
reasonable one. But what if I need bytes *and* a specific library (say,
image handling, or a web framework, or ...). It's not up to me if that
library uses pathlib or plain old os.path.join.


If you need bytes support for your paths, there's at least one [1] that 
has that support.


--
~Ethan~


[1] Yeah, it's mine [2].  ;)  I haven't checked if the other third-party 
libs do or not.


[2] https://pypi.python.org/pypi/antipathy



[Python-Dev] File system path PEP, part 2

2016-05-12 Thread Brett Cannon
Second draft that takes Guido's comments into consideration. The biggest
change is os.fspath() now returns whatever path.__fspath__() returns
instead of restricting it to only str.

Minor changes:
- Renamed the C function to PyOS_FSPath()
- Added an Implementation section with a TODO list
- Bunch of things added to the Rejected Ideas section

--
PEP: NNN
Title: Adding a file system path protocol
Version: $Revision$
Last-Modified: $Date$
Author: Brett Cannon 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 11-May-2016
Post-History: 11-May-2016,
  12-May-2016


Abstract
========

This PEP proposes a protocol for classes which represent a file system
path to be able to provide a ``str`` or ``bytes`` representation.
Changes to Python's standard library are also proposed to utilize this
protocol where appropriate to facilitate the use of path objects where
historically only ``str`` and/or ``bytes`` file system paths are
accepted. The goal is to facilitate the migration of users towards
rich path objects while providing an easy way to work with code
expecting ``str`` or ``bytes``.


Rationale
=========

Historically in Python, file system paths have been represented as
strings or bytes. This choice of representation has stemmed from C's
own decision to represent file system paths as
``const char *`` [#libc-open]_. While that is a totally serviceable
format to use for file system paths, it's not necessarily optimal. At
issue is the fact that while all file system paths can be represented
as strings or bytes, not all strings or bytes represent a file system
path. This can lead to issues where any e.g. string duck-types to a
file system path whether it actually represents a path or not.

To help elevate the representation of file system paths from their
representation as strings and bytes to a richer object representation,
the pathlib module [#pathlib]_ was provisionally introduced in
Python 3.4 through PEP 428. While considered by some as an improvement
over strings and bytes for file system paths, it has suffered from a
lack of adoption. Typically the key issue listed for the low adoption
rate has been the lack of support in the standard library. This lack
of support required users of pathlib to manually convert path objects
to strings by calling ``str(path)`` which many found error-prone.

One issue in converting path objects to strings comes from
the fact that the only generic way to get a string representation of
the path was to pass the object to ``str()``. This can pose a
problem when done blindly as nearly all Python objects have some
string representation whether they are a path or not, e.g.
``str(None)`` will give a result that
``builtins.open()`` [#builtins-open]_ will happily use to create a new
file.

Exacerbating this whole situation is the
``DirEntry`` object [#os-direntry]_. While path objects have a
representation that can be extracted using ``str()``, ``DirEntry``
objects expose a ``path`` attribute instead. Having no common
interface between path objects, ``DirEntry``, and any other
third-party path library has become an issue. A solution that allows
any path-representing object to declare that it is a path and a way
to extract a low-level representation that all path objects could
support is desired.

This PEP then proposes to introduce a new protocol to be followed by
objects which represent file system paths. Providing a protocol allows
for explicit signaling of what objects represent file system paths as
well as a way to extract a lower-level representation that can be used
with older APIs which only support strings or bytes.

Discussions regarding path objects that led to this PEP can be found
in multiple threads on the python-ideas mailing list archive
[#python-ideas-archive]_ for the months of March and April 2016 and on
the python-dev mailing list archives [#python-dev-archive]_ during
April 2016.


Proposal
========

This proposal is split into two parts. One part is the proposal of a
protocol for objects to declare and provide support for exposing a
file system path representation. The other part deals with changes to
Python's standard library to support the new protocol. These changes
will also lead to the pathlib module dropping its provisional status.

Protocol
--------

The following abstract base class defines the protocol for an object
to be considered a path object::

    import abc
    import typing as t


    class PathLike(abc.ABC):

        """Abstract base class for implementing the file system path
        protocol."""

        @abc.abstractmethod
        def __fspath__(self) -> t.Union[str, bytes]:
            """Return the file system path representation of the object."""
            raise NotImplementedError


Objects representing file system paths will implement the
``__fspath__()`` method which will return the ``str`` or ``bytes``
representation of the path. The ``str`` representation is the
preferred low-level path representation as i

Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Sjoerd Job Postmus


> On 12 May 2016, at 21:30, Ethan Furman  wrote:
> 
> If you need bytes support for your paths, there's at least one [1] that has 
> that support.

So if I would need bytes support, I should submit a pull request to
<awesome library> which replaces usage of the stdlib pathlib with another
variant, upon which they will decline the pull request because it introduces
another "useless" dependency.

Good to know.


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Brett Cannon
On Thu, 12 May 2016 at 14:04 Sjoerd Job Postmus 
wrote:

>
>
> > On 12 May 2016, at 21:30, Ethan Furman  wrote:
> >
> > If you need bytes support for your paths, there's at least one [1] that
> > has that support.
>
> So if I would need bytes support, I should submit a pull request to
> <awesome library> which replaces usage of the stdlib pathlib
> with another variant, upon which they will decline the pull request because
> it introduces another "useless" dependency.
>

No, what you should do is ask them to create the pathlib instance lazily
and only when duck typing shows they weren't given a compatible object.
Then you could pass in some bytes-based solution like Ethan's and not worry
about pathlib's refusal to work with bytes. Or you simply ask them to work
with os.path after calling os.fspath() and be very careful to not use any
strings with the functions.
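A small sketch of the second suggestion: call os.fspath() once, then stay in os.path and make sure every literal matches the str/bytes flavour of the input. `with_backup_suffix` is a hypothetical helper written for this illustration, not code from any real library.

```python
import os


def with_backup_suffix(path):
    """Return `path` with its extension replaced by .bak, as str or bytes."""
    raw = os.fspath(path)  # str or bytes, whichever the caller supplied
    # Pick a literal of the matching type so str and bytes never get mixed.
    suffix = b".bak" if isinstance(raw, bytes) else ".bak"
    root, _ext = os.path.splitext(raw)
    return root + suffix


print(with_backup_suffix("data/report.txt"))
print(with_backup_suffix(b"data/report.txt"))
```

Because no pathlib object is created along the way, a bytes path passes through the helper unchanged in type.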


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-12 Thread Guido van Rossum
Is anyone going to mark the PEP as accepted?

On Thu, May 12, 2016 at 8:11 AM, Guido van Rossum 
wrote:

> OK, then PEP 515 is now officially accepted! Congratulations. Start the
> implementation work!
>
> --Guido (mobile)
> On May 11, 2016 10:33 PM, "Georg Brandl"  wrote:
>
> I'm happy with the latest version.
>
> Georg
>
> On 05/11/2016 06:46 PM, Guido van Rossum wrote:
> > If the authors are happy I'll accept it right away.
> >
> > (I vaguely recall there's another PEP that's ready for pronouncement --
> but
> > which one?)
> >
> > On Wed, May 11, 2016 at 9:34 AM, Brett Cannon wrote:
> >
> > Is there anything holding up PEP 515 at this point in terms of
> acceptance or
> > implementation?
> >
> > On Sat, 19 Mar 2016 at 11:56 Guido van Rossum wrote:
> >
> > All that sounds fine!
> >
> > On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah <[email protected]> wrote:
> > > Guido van Rossum writes:
> > >> So should the preprocessing step just be s.replace('_', ''),
> or should
> > >> it reject underscores that don't follow the rules from the PEP
> > >> (perhaps augmented so they follow the spirit of the PEP and
> the letter
> > >> of the IBM spec)?
> > >>
> > >> Honestly I think it's also fine if specifying this exactly is
> left out
> > >> of the PEP, and handled by whoever adds this to Decimal.
> Having a PEP
> > >> to work from for the language spec and core builtins (int(),
> float()
> > >> complex()) is more important.
> > >
> > > I'd keep it simple for Decimal: Remove left and right
> whitespace (we're
> > > already doing this), then remove underscores from the
> remaining string
> > > (which must not contain any further whitespace), then use the
> IBM grammar.
> > >
> > >
> > > We could add a clause to the PEP that only those strings that
> follow
> > > the spirit of the PEP are guaranteed to be accepted in the
> future.
> > >
> > >
> > > One reason for keeping it simple is that I would not like to
> slow down
> > > string conversion, but thinking about two grammars is also a
> problem --
> > > part of the string conversion in libmpdec is modeled in ACL2,
> which
> > > would be invalidated or at least complicated with two grammars.
> > >
> > >
> > >
> > > Stefan Krah
> > >
> > > ___
> > > Python-Dev mailing list
> > > [email protected] 
> > > https://mail.python.org/mailman/listinfo/python-dev
> > > Unsubscribe:
> >
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
> >
> >
> >
> > --
> > --Guido van Rossum (python.org/~guido)
> > ___
> > Python-Dev mailing list
> > [email protected] 
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> >
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
> >
> >
> >
> >
> > --
> > --Guido van Rossum (python.org/~guido)
> >
> >
>
>
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 515: Underscores in Numeric Literals (revision 3)

2016-05-12 Thread Brett Cannon
Marked as accepted in https://hg.python.org/peps/rev/a52066565cc2

On Thu, 12 May 2016 at 16:59 Guido van Rossum  wrote:

> Is anyone going to mark the PEP as accepted?
>
> On Thu, May 12, 2016 at 8:11 AM, Guido van Rossum 
> wrote:
>
>> OK, then PEP 515 is now officially accepted! Congratulations. Start the
>> implementation work!
>>
>> --Guido (mobile)
>> On May 11, 2016 10:33 PM, "Georg Brandl"  wrote:
>>
>> I'm happy with the latest version.
>>
>> Georg
>>
>> On 05/11/2016 06:46 PM, Guido van Rossum wrote:
>> > If the authors are happy I'll accept it right away.
>> >
>> > (I vaguely recall there's another PEP that's ready for pronouncement --
>> but
>> > which one?)
>> >
>> > On Wed, May 11, 2016 at 9:34 AM, Brett Cannon wrote:
>> >
>> > Is there anything holding up PEP 515 at this point in terms of
>> acceptance or
>> > implementation?
>> >
>> > On Sat, 19 Mar 2016 at 11:56 Guido van Rossum wrote:
>> >
>> > All that sounds fine!
>> >
>> > On Sat, Mar 19, 2016 at 11:28 AM, Stefan Krah <[email protected]> wrote:
>> > > Guido van Rossum writes:
>> > >> So should the preprocessing step just be s.replace('_', ''),
>> or should
>> > >> it reject underscores that don't follow the rules from the
>> PEP
>> > >> (perhaps augmented so they follow the spirit of the PEP and
>> the letter
>> > >> of the IBM spec)?
>> > >>
>> > >> Honestly I think it's also fine if specifying this exactly
>> is left out
>> > >> of the PEP, and handled by whoever adds this to Decimal.
>> Having a PEP
>> > >> to work from for the language spec and core builtins (int(),
>> float()
>> > >> complex()) is more important.
>> > >
>> > > I'd keep it simple for Decimal: Remove left and right
>> whitespace (we're
>> > > already doing this), then remove underscores from the
>> remaining string
>> > > (which must not contain any further whitespace), then use the
>> IBM grammar.
>> > >
>> > >
>> > > We could add a clause to the PEP that only those strings that
>> follow
>> > > the spirit of the PEP are guaranteed to be accepted in the
>> future.
>> > >
>> > >
>> > > One reason for keeping it simple is that I would not like to
>> slow down
>> > > string conversion, but thinking about two grammars is also a
>> problem --
>> > > part of the string conversion in libmpdec is modeled in ACL2,
>> which
>> > > would be invalidated or at least complicated with two
>> grammars.
>> > >
>> > >
>> > >
>> > > Stefan Krah
>> > >
>> > > ___
>> > > Python-Dev mailing list
>> > > [email protected] 
>> > > https://mail.python.org/mailman/listinfo/python-dev
>> > > Unsubscribe:
>> >
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>> >
>> >
>> >
>> > --
>> > --Guido van Rossum (python.org/~guido)
>> > ___
>> > Python-Dev mailing list
>> > [email protected] 
>> > https://mail.python.org/mailman/listinfo/python-dev
>> > Unsubscribe:
>> >
>> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>> >
>> >
>> >
>> >
>> > --
>> > --Guido van Rossum (python.org/~guido)
>> >
>> >
>>
>>
>> ___
>> Python-Dev mailing list
>> [email protected]
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>>
>>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> ___
> Python-Dev mailing list
> [email protected]
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>


Re: [Python-Dev] File system path PEP, part 2

2016-05-12 Thread Nick Coghlan
On 13 May 2016 at 06:53, Brett Cannon  wrote:
> Second draft that takes Guido's comments into consideration. The biggest
> change is os.fspath() now returns whatever path.__fspath__() returns instead
> of restricting it to only str.
>
> Minor changes:
> - Renamed the C function to PyOS_FSPath()
> - Added an Implementation section with a TODO list
> - Bunch of things added to the Rejected Ideas section

+1 for this version from me, as it means we have:

- os.fsencode(obj) as the coerce-to-bytes API
- os.fspath(obj) as the str/bytes hybrid API
- os.fsdecode(obj) as the coerce-to-str API
- os.fspath(pathlib.PurePath(obj)) as the error-on-bytes API

That more strongly nudges people towards "use pathlib if you want to
ensure cross-platform friendly path handling", which is an outcome I'm
fine with.
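The four entry points listed above can be tried side by side. The sketch assumes Python 3.6 or later, where these functions accept path objects, and uses pathlib.PurePosixPath rather than PurePath only so the separators are the same on every platform; the file name is made up.

```python
import os
import pathlib

path_obj = pathlib.PurePosixPath("docs/readme.txt")

coerced_to_bytes = os.fsencode(path_obj)           # always bytes
hybrid_str = os.fspath(path_obj)                   # str in, str out
hybrid_bytes = os.fspath(b"docs/readme.txt")       # bytes in, bytes out
coerced_to_str = os.fsdecode(b"docs/readme.txt")   # always str

# Going through pathlib is the variant that refuses bytes outright.
try:
    os.fspath(pathlib.PurePosixPath(b"docs/readme.txt"))
except TypeError:
    bytes_rejected = True
else:
    bytes_rejected = False

print(coerced_to_bytes, hybrid_str, hybrid_bytes, coerced_to_str, bytes_rejected)
```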

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] file system path protocol PEP

2016-05-12 Thread Ethan Furman

On 05/12/2016 01:59 PM, Sjoerd Job Postmus wrote:

On 12 May 2016, at 21:30, Ethan Furman  wrote:

If you need bytes support for your paths, there's at least one [1] that has 
that support.


So if I would need bytes support, I should submit a pull request to
<awesome library> which replaces usage of the stdlib pathlib with another
variant, upon which they will decline the pull request because it introduces
another "useless" dependency.


My apologies, I thought you were talking about your own code.

As far as <awesome library> -- it wouldn't be awesome to
me if I couldn't pass in the data types I need.


--
~Ethan~
