Bar Harel <bzvi7...@gmail.com> added the comment:

@rhettinger I completely understand what you're saying and at first I agreed 
with you. Before I gave it a closer look, I thought about the same thing - we 
want reliability. Reliability is important and will avoid subtle bugs, which is 
why I was against this change for the exact reasons you mentioned: it is both 
breaking and unreliable.

I then realized that this change can be a reliable replacement for 
isinstance(obj, collections.abc.Sequence) at the C level. Let's use the broken 
sre_parse.SubPattern as an example - it does not register or inherit from 
collections.abc.Sequence, and issubclass(SubPattern, collections.abc.Sequence) 
== False. We cannot know programmatically if SubPattern is a Sequence, we 
cannot type hint it as such, and apart from reading the documentation, we 
cannot deal with the type differently in dynamic code that accepts either 
sequences or mappings. I dare say that counting on it being a sequence, 
especially in an LBYL language like C, is even less reliable. While SubPattern 
"embraces" the spirit of duck typing, it is very hard to fit in given all the 
recent changes advocating for more structured and well-defined types. After 
all, this feature was requested in order to solve reliability issues in 
statically typed languages.
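
To illustrate the point above, here is a minimal sketch. DuckSequence is a 
hypothetical stand-in written for this example (it is not the real SubPattern): 
it behaves like a sequence but neither inherits from nor registers with 
collections.abc.Sequence, so the ABC check cannot see it until it is 
registered explicitly.

```python
from collections.abc import Sequence


class DuckSequence:
    """Hypothetical stand-in for a SubPattern-like class: sequence-shaped, but
    neither inheriting from nor registered with collections.abc.Sequence."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


duck = DuckSequence("abc")

# Duck typing works: indexing and len() behave like a sequence.
first, size = duck[0], len(duck)

# The ABC check does not: Sequence performs no structural check, so an
# unregistered class is not recognized.
recognized_before = isinstance(duck, Sequence)

# Explicit registration is what makes the type discoverable.
Sequence.register(DuckSequence)
recognized_after = isinstance(duck, Sequence)
```

Only after the register() call can dynamic code tell this type apart from a 
mapping without reading its documentation.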

Putting everything aside, the grand question still remains: do you think that 
there's a use for an efficient C-API isinstance check for Sequences and 
Mappings? I would presume the answer is yes. Would we encourage it? I have no 
clue. But if there's a need, we can either change this function as it has the 
same "spirit" or introduce a new one to prevent breaking existing code.

To answer your question: per specification, testing for Py_TPFLAGS_SEQUENCE 
using PyType_HasFeature does not take strings, bytes and bytearray into 
consideration, and will not suffice. It is an incorrect solution that is even 
less reliable and falls into the exact pitfall of "guesswork" (for example, 
SubPattern currently doesn't work with it either). It is not encouraged or 
easily thought of. PySequence_Check is much more intuitive, yet it doesn't 
work either, and that's where fixing it can have an edge.

A theoretical `PyIsInstance_Sequence` could check for Py_TPFLAGS_SEQUENCE and 
fall back to PyUnicode_Check, PyBytes_Check and PyByteArray_Check. If I'm not 
mistaken, doing so would be 100% reliable, identical to 
isinstance(obj, Sequence), and very efficient.

As a side note, the C-API documentation for tp_flags is not clear at the 
moment. It mentions tp_as_sequence, for example, and says "if such a flag bit 
is clear, the type fields it guards must not be accessed and must be 
considered to have a zero or NULL value instead", yet Py_TPFLAGS_SEQUENCE does 
not actually coincide with sequences per specification. I know it has a 
different explanation as well and the flag has its own description, but it is 
still a bit misleading.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46376>
_______________________________________