[issue46124] Deprecation warning in zoneinfo module

2021-12-28 Thread Paul Ganssle


Paul Ganssle  added the comment:

Jason's patch looks good to me, but I don't understand why Karthikeyan 
originally suggested using `normalize_path`. Trying to dig through exactly how 
`files().joinpath().open` is implemented has so many layers of indirection and 
abstract classes that I can't quite figure out if the two things are equivalent 
or not. Seems like `joinpath()` is the right thing to do, but does it have less 
validation than the `normalize_path` method?

--

___
Python tracker 
<https://bugs.python.org/issue46124>
___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue46319] datetime.utcnow() should return a timezone aware datetime

2022-01-09 Thread Paul Ganssle


Paul Ganssle  added the comment:

Yes, this is the documented behavior, including a warning against using `utcnow` 
in the documentation!

There is some possibility of removing utcnow entirely as an "attractive 
nuisance", but I suspect that this will lead to much consternation and 
hand-wringing, and there are some legitimate uses for `utcnow`, so I haven't 
made it a high priority to have that particular fight...

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue46319>
___



[issue12756] datetime.datetime.utcnow should return a UTC timestamp

2022-01-10 Thread Paul Ganssle

Paul Ganssle  added the comment:

> from practical experience, it is a whole lot better to not deal with 
> timezones in data processing code at all, but instead only use naive UTC 
> datetime values everywhere, except when you have to prepare reports or output 
> which has a requirement to show datetime value in local time or some specific 
> timezone.

This is not good advice. It is out of date, and has some significant pitfalls. 
See my blog post on the subject: 
https://blog.ganssle.io/articles/2019/11/utcnow.html

If you are working with `datetime` objects that represent time in a specific 
time zone other than the system local zone, you should probably have them carry 
around a tzinfo object.

> Note also that datetime.now() gives you a naive datetime.  From an API 
> consistency standpoint I think it makes sense that datetime.utcnow() gives a 
> naive datetime.

This... is not accurate. `.now()` gives you the local time; to the extent that 
they represent a date in a time zone at all, naïve datetimes represent times 
in the *system local time*, which is why it makes sense for `.now()` to default 
to returning naïve datetimes. For example, `datetime.now().timestamp()` will 
give accurate information, whereas `datetime.utcnow().timestamp()` will *not* 
(unless your local zone happens to be UTC or equivalent).
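
A minimal sketch of that difference, using a fixed, arbitrarily chosen POSIX timestamp so the result is reproducible. The naïve UTC value is built with `.replace(tzinfo=None)` as a stand-in for what `utcnow()` returns; the variable names are only illustrative:

```python
from datetime import datetime, timezone

ts = 1_600_000_000  # arbitrary fixed POSIX timestamp

# Naive local time: the same kind of value datetime.now() returns.
local_naive = datetime.fromtimestamp(ts)

# Aware UTC time, and the same wall time with tzinfo stripped
# (the kind of value datetime.utcnow() returns).
utc_aware = datetime.fromtimestamp(ts, timezone.utc)
utc_naive = utc_aware.replace(tzinfo=None)

roundtrip_local = local_naive.timestamp()    # recovers ts
roundtrip_aware = utc_aware.timestamp()      # recovers ts
# Interpreted as system local time, so it is off by the local UTC offset
# unless the local zone happens to be UTC or equivalent.
roundtrip_utc_naive = utc_naive.timestamp()
```

On a machine whose local zone is not UTC, `roundtrip_utc_naive` differs from `ts` by the local UTC offset, while the other two round-trip exactly.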

> It would actually be confusing (IMO) for it to return an aware datetime.  I 
> can see why you might disagree, but backward compatibility wins in this case 
> regardless.

As evidenced by this thread, the fact that we have some APIs that return naïve 
datetimes generated by a process that treats them as localized datetimes in 
something other than system local times is actually the source of confusion 😛

That said, from a backwards compatibility point of view, we simply cannot 
change this. It has been proposed many times and it would be a major breaking 
change for almost no reason. The best we can do is to deprecate the offending 
methods and remove them.

There is more information about the challenge that doing this would present in 
this datetime-SIG thread: 
https://mail.python.org/archives/list/datetime-...@python.org/thread/PT4JWJLYBE5R2QASVBPZLHH37ULJQR43/

I am sympathetic to the idea of removing it, but we would probably want to put 
some optimizations in place for `UTC` first, to make the transition more 
seamless in the few places where there are legitimate uses for `utcnow` and 
`utcfromtimestamp`.

> I would argue that PEP20 should win over backward compatibility, in addition 
> to the points I hinted at above, practicality beats purity

PEP 20 contains a bunch of contradictory advice, and it's not really a binding 
document anyway, so it definitely doesn't "win" over anything, but in this case 
you have it backwards. "Purity" would be making a breaking change for the 
purposes of making the function do what a lot of people think it does, but 
doing so would actually be impractical, because it would cause a lot of work 
for people, and create a lot of ambiguity in what people meant when they wrote 
a given line of code. The practical things to do here would be to either do 
nothing (not break anything that works and try and guide people away from using 
`utcnow` — maybe get a linting rule added to `pylint` to warn against it), or 
to deprecate and remove the function.

--

___
Python tracker 
<https://bugs.python.org/issue12756>
___



[issue46447] datetime.isoformat() documentation does not point to the risk of using it with naive datetime objects

2022-01-24 Thread Paul Ganssle

Paul Ganssle  added the comment:

Sorry I missed this! Thank you for taking the time to write this up and to make 
a PR.

Unfortunately, I don't think I agree with the idea of warning about this. The 
warnings about `utcnow` and `utcfromtimestamp` are a problem because `utcnow` 
and `utcfromtimestamp` are intended to represent times in UTC, but they return 
datetimes that actually represent local times in the semantics of modern 
Python. Basically, these functions are dangerous not because they are using 
naïve datetimes, but because they are *mis*using naïve datetimes.

The same can not be said of `.isoformat()`, which is doing the right thing when 
you use `datetime.now().isoformat()`. If you look at Wikipedia's article on ISO 
8601 (which is pretty much the best resource on this, since ISO 8601 is itself 
paywalled and we never should have standardized on a proprietary standard in 
the first place), you'll see it says:

> Local time (unqualified)
> If no UTC relation information is given with a time representation, the time 
> is assumed to be in local time. While it may be safe to assume local time 
> when communicating in the same time zone, it is ambiguous when used in 
> communicating across different time zones.

It may be that for the kind of programming you do, it doesn't make sense to use 
local datetimes in an interchange format — but it is a legitimate use case and 
there are definitely situations where it is very much the right thing to do, so 
I don't think we should warn against it in the `datetime.isoformat` 
documentation.

There *might* be some case for warning about this or something like it in 
the `datetime.now` documentation. The major use cases for naïve datetimes are 
things where you are working with system time or things where you are working 
with dates in the future — you don't want to specify that some event is going 
to happen at 2030-04-01T12:00Z if the actual event is planned for April 1, 2030 
at 13:00 *London time*, because if, between now and then, the UK decides to 
cancel DST or move the start back a week, the event you've stored as a UTC time 
no longer represents what it was intended to represent. In a lot of cases 
`datetime.now()` will just be used as "what time is it now", which is not 
subject to that particular problem because by the time the datetime gets stored 
or used, `datetime.now()` is a date in the *past*, and can safely be converted 
to UTC for all time.

Of course, if you are consuming a bunch of event dates stored in local time and 
you want to compare them to the current time, then `datetime.now()` would be 
appropriate. Similarly if you want to display the current time to a user on a 
given system (rather than logging or storing it), it would also make sense to 
do things like `print(datetime.now().isoformat())`, so there are definitely 
also legitimate use cases for `datetime.now()`.

I'm inclined to say that we should *not* have a warning on `datetime.now()`, 
because we will give people warning fatigue if we do, and we definitely want 
people to see `now()` as the correct alternative to `utcnow()`. I am more 
sympathetic to rewording the `.now()` documentation to make it clear that this 
will be a naïve time *representing the current time in the system local time 
zone* if `None` is passed (i.e. rewording or appending to the "If optional 
argument `tz`" paragraph).

--
nosy: +belopolsky, p-ganssle

___
Python tracker 
<https://bugs.python.org/issue46447>
___



[issue46614] Add option to output UTC datetimes as "Z" in `.isoformat()`

2022-02-02 Thread Paul Ganssle

New submission from Paul Ganssle :

As part of bpo-35829, it was suggested that we add the ability to output the 
"Z" suffix in `isoformat()`, so that `fromisoformat()` can both be the exact 
functional inverse of `isoformat()` and parse datetimes with "Z" outputs. I 
think that that's not a particularly compelling motivation for this, but I also 
see plenty of examples of `datetime.utcnow().isoformat() + "Z"` out there, so 
it seems like this is a feature that we would want to have *anyway*, 
particularly if we want to deprecate and remove `utcnow`.

I've spun this off into its own issue so that we can discuss how to implement 
the feature. The two obvious questions I see are:

1. What do we call the option? `use_utc_designator`, `allow_Z`, `utc_as_Z`?
2. What do we consider as "UTC"? Is it anything with +00:00? Just 
`timezone.utc`? Anything that seems like a fixed-offset zone with 0 offset?

For example, do we want this?

>>> LON = zoneinfo.ZoneInfo("Europe/London")
>>> datetime(2022, 3, 1, tzinfo=LON).isoformat(utc_as_z=True)
2022-03-01T00:00:00Z
>>> datetime(2022, 6, 1, tzinfo=LON).isoformat(utc_as_z=True)
2022-06-01T00:00:00+01:00

Another possible definition might be if the `tzinfo` is a fixed-offset zone 
with offset 0:

>>> datetime.timezone.utc.utcoffset(None)
timedelta(0)
>>> zoneinfo.ZoneInfo("UTC").utcoffset(None)
timedelta(0)
>>> dateutil.tz.UTC.utcoffset(None)
timedelta(0)
>>> pytz.UTC.utcoffset(None)
timedelta(0)

The only "odd man out" is `dateutil.tz.tzfile` objects representing fixed 
offsets, since all `dateutil.tz.tzfile` objects return `None` when `utcoffset` 
or `dst` are passed `None`. This can and will be changed in future versions.
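
The "fixed-offset zone with offset 0" rule could be pictured roughly as follows. This is only a sketch: `isoformat_utc_as_z` is a hypothetical helper written for illustration, not an existing or proposed `datetime` API, and it works by rewriting the string that `isoformat()` already produces.

```python
from datetime import datetime, timedelta, timezone

def isoformat_utc_as_z(dt, *args, **kwargs):
    """Hypothetical helper: emit "Z" only for fixed-offset zones at offset 0."""
    text = dt.isoformat(*args, **kwargs)
    tz = dt.tzinfo
    # A fixed-offset zone can answer utcoffset(None); zones whose offset
    # depends on the datetime typically return None here instead.
    is_fixed_utc = tz is not None and tz.utcoffset(None) == timedelta(0)
    if is_fixed_utc and text.endswith("+00:00"):
        return text[: -len("+00:00")] + "Z"
    return text

utc_text = isoformat_utc_as_z(datetime(2022, 3, 1, tzinfo=timezone.utc))
plus_one_text = isoformat_utc_as_z(
    datetime(2022, 3, 1, tzinfo=timezone(timedelta(hours=1)))
)
naive_text = isoformat_utc_as_z(datetime(2022, 3, 1))
```

Under this rule only the first call ends in `Z`; the `+01:00` and naïve cases are returned unchanged.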

I feel like "If the offset is 00:00, use Z" is the wrong rule to use 
conceptually, but considering that people will be opting into this behavior, it 
is more likely that they will be surprised by `datetime(2022, 3, 1, 
tzinfo=ZoneInfo("Europe/London")).isoformat(utc_as_z=True)` returning 
`2022-03-01T00:00:00+00:00` than by alternation between `Z` and `+00:00`.

Yet another option might be to add a completely separate function, 
`utc_isoformat(*args, **kwargs)`, which is equivalent to (in the parlance of 
the other proposal) `dt.astimezone(timezone.utc).isoformat(*args, **kwargs, 
utc_as_z=True)`.  Basically, convert any datetime to UTC and append a Z to it. 
The biggest footgun there would be people using it on naïve datetimes and not 
realizing that it would interpret them as system local times.
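
As a rough sketch of that separate-function idea, written as a free function because no such method exists today (`utc_isoformat` here is hypothetical and only mirrors the description above):

```python
from datetime import datetime, timedelta, timezone

def utc_isoformat(dt, *args, **kwargs):
    """Hypothetical: convert to UTC, then format with a trailing "Z"."""
    utc_dt = dt.astimezone(timezone.utc)
    # Drop tzinfo so isoformat() emits no offset, then append the designator.
    return utc_dt.replace(tzinfo=None).isoformat(*args, **kwargs) + "Z"

aware = datetime(2022, 3, 1, tzinfo=timezone(timedelta(hours=1)))
naive = datetime(2022, 3, 1)

aware_text = utc_isoformat(aware)   # "2022-02-28T23:00:00Z"
# The footgun: a naive datetime is silently read as system local time,
# so this result depends on the machine the code runs on.
naive_text = utc_isoformat(naive)
```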

--
assignee: p-ganssle
components: Library (Lib)
messages: 412384
nosy: belopolsky, brett.cannon, p-ganssle
priority: normal
severity: normal
stage: needs patch
status: open
title: Add option to output UTC datetimes as "Z" in `.isoformat()`
type: enhancement
versions: Python 3.11

___
Python tracker 
<https://bugs.python.org/issue46614>
___



[issue35829] datetime: parse "Z" timezone suffix in fromisoformat()

2022-02-02 Thread Paul Ganssle


Paul Ganssle  added the comment:

I don't think it's necessary to add a feature to `isoformat()` just for the 
purpose of being able to add the corresponding parser, particularly when the 
plan is to implement a much broader ISO 8601 parser for Python 3.11 (I've done 
most of the implementation in C already, I can share the progress I've made, 
particularly if someone else wants to pick up the baton there before I get back 
to it).

That said, I think it's useful for `isoformat()` to be able to output UTC times 
as "Z", so we may as well do that as part of 3.11 anyway. I think that's a 
separate issue to discuss, so I've created bpo-46614 to hammer out the details.

--
versions: +Python 3.11 -Python 3.8

___
Python tracker 
<https://bugs.python.org/issue35829>
___



[issue46614] Add option to output UTC datetimes as "Z" in `.isoformat()`

2022-04-03 Thread Paul Ganssle

Paul Ganssle  added the comment:

I think this approach is probably the best we can do, but I could also imagine 
that users might find it to be confusing behavior. I wonder if there's any 
informal user testing we can do?

I guess the ISO 8601 spec does call "Z" the "UTC designator", so 
`use_utc_designator` seems like approximately the right name. My main 
hesitation with this name is that I suspect users may think that 
`use_utc_designator` means that they *unconditionally* want to use `Z` — 
without reading the documentation (which we can assume 99% of users won't do) — 
you might assume that `dt.isoformat(use_utc_designator=True)` would translate 
to `dt.astimezone(timezone.utc).replace(tzinfo=None).isoformat() + "Z"`.

A name like `utc_as_z` is definitely less... elegant, but conveys the concept a 
bit more clearly. Would be worth throwing it to a poll or something before 
merging.

--

___
Python tracker 
<https://bugs.python.org/issue46614>
___



[issue47207] Switch datetime docstrings / documentation to using "Returns" rather than "Return"?

2022-04-03 Thread Paul Ganssle


New submission from Paul Ganssle :

In bpo-9305, Fred Drake recommends preferring `Returns ...` over the imperative 
`Return ...`: https://bugs.python.org/issue9305#msg110912

Currently we're pretty consistent about `Return ...`, which is consistent with 
PEP 257: https://peps.python.org/pep-0257/

That said, I actually think "Returns ..." sounds much better in the 
documentation, and I'd be happy to see it changed if others agree.

I have spun this off from bpo-9305 so that we can unblock 
https://github.com/python/cpython/pull/31697.

--
assignee: docs@python
components: Documentation
messages: 416628
nosy: belopolsky, docs@python, fdrake, p-ganssle, slateny
priority: normal
severity: normal
status: open
title: Switch datetime docstrings / documentation to using "Returns" rather 
than "Return"?
versions: Python 3.11

___
Python tracker 
<https://bugs.python.org/issue47207>
___



[issue47228] Document that naïve datetime objects represent local time

2022-04-05 Thread Paul Ganssle

New submission from Paul Ganssle :

Currently, the `datetime` documentation has this to say about naïve datetimes:

> A naive object does not contain enough information to unambiguously locate 
> itself relative to other date/time objects. Whether a naive object represents 
> Coordinated Universal Time (UTC), local time, or time in some other timezone 
> is purely up to the program, just like it is up to the program whether a 
> particular number represents metres, miles, or mass. Naive objects are easy 
> to understand and to work with, at the cost of ignoring some aspects of 
> reality.

This was accurate in Python 2.7, but as of Python 3, the picture is a bit more 
nuanced. `.astimezone()` and `.timestamp()` work for naïve datetimes, but they 
are treated as *system local times*. It is no longer really appropriate to use 
a naïve datetime to represent a datetime in any specific concrete time zone; instead, 
they should be considered either abstract datetimes for the purposes of 
calendar calculations, or they should be considered as representing the 
realization of that abstract datetime *in the current system locale*. This new 
behavior is referenced in, for example, the warning in `.utcnow()`: 
https://docs.python.org/3.10/library/datetime.html#datetime.datetime.utcnow
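
A small sketch of that behavior (the date is arbitrary and the variable names are only illustrative):

```python
from datetime import datetime

naive = datetime(2022, 4, 5, 12, 0)

# With no argument, astimezone() attaches the system local zone,
# treating the naive wall time as a local time.
local_aware = naive.astimezone()

# The wall-clock fields are unchanged; only tzinfo was added.
same_wall_time = local_aware.replace(tzinfo=None) == naive
# timestamp() makes the same "system local time" assumption.
same_instant = local_aware.timestamp() == naive.timestamp()
```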

We make reference to this in the documentation for `.timestamp()`: 
https://docs.python.org/3.10/library/datetime.html#datetime.datetime.timestamp 
and in `.astimezone()`: 
https://docs.python.org/3.10/library/datetime.html#datetime.datetime.astimezone,
 but the top level explanation makes no reference to it.

I have written a blog post about *why* this is the case: 
https://blog.ganssle.io/articles/2022/04/naive-local-datetimes.html and made 
reference to this behavior in an earlier blog post about `utcnow`: 
https://blog.ganssle.io/articles/2019/11/utcnow.html, but I think it would be a 
good idea to revamp the official documentation to reflect this change in status 
(12 years or so after the change…)

--
assignee: p-ganssle
components: Documentation
messages: 416778
nosy: belopolsky, p-ganssle
priority: normal
severity: normal
stage: needs patch
status: open
title: Document that naïve datetime objects represent local time
type: enhancement
versions: Python 3.10, Python 3.11, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue47228>
___



[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 27b38b99b3a154fa5c25cd67fe01fb4fc04604b0 by Paul Ganssle in 
branch '3.8':
bpo-37642: Update acceptable offsets in timezone (GH-14878) (#15227)
https://github.com/python/cpython/commit/27b38b99b3a154fa5c25cd67fe01fb4fc04604b0


--

___
Python tracker 
<https://bugs.python.org/issue37642>
___



[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset ed44b84961eb0e5b97e4866c1455ac4093d27549 by Paul Ganssle in 
branch '3.7':
bpo-37642: Update acceptable offsets in timezone (GH-14878) (#15226)
https://github.com/python/cpython/commit/ed44b84961eb0e5b97e4866c1455ac4093d27549


--

___
Python tracker 
<https://bugs.python.org/issue37642>
___



[issue37642] timezone allows no offset from range (23:59, 24:00)

2019-08-15 Thread Paul Ganssle


Change by Paul Ganssle :


--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37642>
___



[issue37914] class timedelta, support the method hours and minutes in field accessors

2019-08-22 Thread Paul Ganssle


Paul Ganssle  added the comment:

> I would support this addition. The timedelta class already has accessors for 
> days and seconds, why not for hours and minutes? 

The `timedelta.days` and `timedelta.seconds` accessors do not do what is being 
requested here. The component accessors just give you a given component of the 
timedelta in its normalized form, so:

>>> td = timedelta(hours=25, minutes=1, seconds=2)
>>> td.days
1
>>> td.seconds
3662
>>> td // timedelta(seconds=1)
90062


The reason there is no hours or minutes is that the normalized form of 
timedelta doesn't have those components. It would be inconsistent to have 
`hours` and `minutes` give the total duration of the timedelta in the chosen 
units while `.days` and `.seconds` return just the component of the normalized 
form.

What's really being asked for here are `total_hours()` and `total_minutes()` 
methods, and when that has come up in the past (including recently on the 
python-dev mailing list), we've largely decided that the "divide by the units 
you want" idiom is sufficient (and in fact better in that you can choose any 
arbitrary units rather than just the ones that have specific names).
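
For reference, a short sketch of the "divide by the units you want" idiom, reusing the timedelta from the example above:

```python
from datetime import timedelta

td = timedelta(hours=25, minutes=1, seconds=2)

# True division gives the total duration as a float in the chosen unit.
total_hours = td / timedelta(hours=1)
# Floor division gives a whole number of units.
whole_minutes = td // timedelta(minutes=1)
# divmod() splits into whole units plus a timedelta remainder.
hours, remainder = divmod(td, timedelta(hours=1))
```

Any unit works the same way, e.g. `td / timedelta(weeks=1)` or `td / timedelta(milliseconds=1)`.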

--

___
Python tracker 
<https://bugs.python.org/issue37914>
___



[issue37962] Improve ISO 8601 timezone support in the datetime.fromisoformat() method

2019-08-27 Thread Paul Ganssle


Paul Ganssle  added the comment:

This is a duplicate of #35829.

The reason that 'Z' is not supported is that `fromisoformat()` is not a general 
ISO 8601 parser, but rather is intended to be the inverse of `isoformat()`. See 
the documentation here: 
https://docs.python.org/dev/library/datetime.html#datetime.datetime.fromisoformat

The current state of #35829 is that expanding to support all of ISO 8601 is an 
option, but determining the scope and the API are a bit tricky. ISO 8601 is more 
complicated than most people think.

In the meantime, `dateutil.parser.isoparse` is intentionally scoped to parse 
all valid ISO 8601 datetimes.

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> datetime: parse "Z" timezone suffix in fromisoformat()
type:  -> enhancement

___
Python tracker 
<https://bugs.python.org/issue37962>
___



[issue35829] datetime: parse "Z" timezone suffix in fromisoformat()

2019-08-27 Thread Paul Ganssle


Paul Ganssle  added the comment:

> Defining isoformat() and fromisoformat() as functional inverses is misguided. 
> Indeed, it's not even true:

`isoformat()` is not the inverse of `fromisoformat()`; that doesn't work 
because there are multiple strings that `isoformat()` can create from any given 
datetime. There is, however, only one datetime that is represented by any given 
string (assuming you consider truncation to create a new datetime), so it is 
fine for `fromisoformat()` to be the inverse of `isoformat()`.
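
A minimal sketch of that asymmetry, using a datetime with zero seconds so that the truncated form still denotes the same value:

```python
from datetime import datetime, timezone

dt = datetime(2019, 8, 27, 12, 30, tzinfo=timezone.utc)

# One datetime, several distinct strings from isoformat() ...
strings = [
    dt.isoformat(),
    dt.isoformat(sep=" "),
    dt.isoformat(timespec="minutes"),
]

# ... but each string maps back to exactly one datetime.
parsed = [datetime.fromisoformat(s) for s in strings]
```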

I have explained the reason that was chosen for the contract in several places 
(including in this thread), so I won't bother to repeat it. I think from a 
practical point of view we should eventually grow more generalized ISO 8601 
parsing functionality, and the main question is what the API will look like. In 
dateutil.parser.isoparse, I still haven't figured out a good way to do feature 
flags.

> I'd be willing to work on a PR, but a change of this size probably needs to 
> go through python-ideas first?

I don't think it *needs* to go to python-ideas, though it's probably a good 
idea to try and work out the optimal API in a post on the discourse ( 
discuss.python.org ), and the "ideas" category seems like the right one there. 
Please CC me (pganssle) if you propose modifications to the fromisoformat API 
on the discourse.

--

___
Python tracker 
<https://bugs.python.org/issue35829>
___



[issue37979] Document an alternative to ISO 8601 parsing

2019-08-29 Thread Paul Ganssle

New submission from Paul Ganssle :

Per Antoine's comment in the discourse thread ( 
https://discuss.python.org/t/parse-z-timezone-suffix-in-datetime/2220/6 ): 

> ... the doc isn’t helpful, as it doesn’t give any alternative.

I think we can link to dateutil.parser.isoparse as an alternative. I'm happy to 
field other options for ISO 8601 parsers instead, though considering that 
fromisoformat is adapted from dateutil.parser.isoparse, it seems reasonable to 
link them.

--
assignee: p-ganssle
components: Documentation
messages: 350784
nosy: p-ganssle
priority: low
severity: normal
stage: needs patch
status: open
title: Document an alternative to ISO 8601 parsing
type: enhancement
versions: Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue37979>
___



[issue37979] Document an alternative to ISO 8601 parsing

2019-08-29 Thread Paul Ganssle


Change by Paul Ganssle :


--
keywords: +patch
pull_requests: +15272
stage: needs patch -> patch review
pull_request: https://github.com/python/cpython/pull/15596

___
Python tracker 
<https://bugs.python.org/issue37979>
___



[issue37992] Change datetime.MINYEAR to allow for negative years

2019-08-31 Thread Paul Ganssle


Paul Ganssle  added the comment:

This is only a semi-arbitrary restriction. Realistically, `datetime` is not a 
particularly good way to represent times much older than the 17th or 18th 
century (and if you're using time zones, it gets increasingly inaccurate as you 
go further back from 1970 or the further in the future you go from the current 
time). Generally, I think the choice to keep it to positive dates is due to a 
combination of the fact that 1. it introduces a lot more edge cases (there's no 
year 0, for example) 2. it may invalidate otherwise perfectly acceptable 
assumptions that people have made in code about the sign of the component 
values and 3. it would be a rarely used feature of dubious utility. I am not 
sure that adding this feature would be worth the support burden it would bring.
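
For context, a quick sketch of the current restriction as the module exposes it:

```python
import datetime

min_year = datetime.MINYEAR   # lowest year a date or datetime accepts

# Year 0 (and anything below MINYEAR) is rejected at construction time.
try:
    datetime.date(0, 1, 1)
except ValueError as exc:
    year_zero_error = exc
else:
    year_zero_error = None
```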

There was a discussion about this on the discourse in the past, there wasn't an 
obvious consensus that it would never happen, but I would not say that there 
was much support for the idea: 
https://discuss.python.org/t/bc-date-support/582/2

--

___
Python tracker 
<https://bugs.python.org/issue37992>
___



[issue24416] Return a namedtuple from date.isocalendar()

2019-09-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

Sorry for the late response after a patch, but I'm actually -1 on this patch. I 
don't think it buys us very much in terms of the API considering it only has 3 
parameters, and it adds some complexity to the code (in addition to the 
performance issue). Honestly, I think the main reason we didn't go with 
positional-only parameters in `fromisocalendar` was that the "/" syntax didn't 
exist yet.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Return a namedtuple from date.isocalendar()

2019-09-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

> But I'm wondering how the `fromisocalendar` API relates to this patch.
> Rather, wouldn't this patch contribute to improving the usability of the 
> `fromisocalendar` API?

The `fromisocalendar` API relates to this patch only insofar as it is the 
inverse function of `isocalendar` and in some sense it allows specifying the 
parameters by keyword rather than by position. I was merely bringing up that we 
didn't choose that API because we thought people would or should want or need 
to specify the individual components by keyword but because we didn't have an 
easy way to maintain the same API in the pure Python and C APIs at the time. By 
contrast, returning a plain tuple from `isocalendar()` is the easier *and* more 
performant thing to do, and given that any benefits seem marginal I'm against 
the switch.

I think that the usability of `fromisocalendar` with the output of `isocalendar` 
will be largely unchanged if we were to switch to returning a namedtuple.
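
To illustrate, a sketch of the round trip (this assumes Python 3.8 or later, where `date.fromisocalendar` is available); it reads the same whether `isocalendar()` returns a plain tuple or a named one:

```python
from datetime import date

d = date(2019, 9, 1)

# Positional unpacking works for a tuple and for a namedtuple alike.
year, week, weekday = d.isocalendar()
roundtrip = date.fromisocalendar(year, week, weekday)

# fromisocalendar also accepts its parameters by keyword.
by_keyword = date.fromisocalendar(year=year, week=week, day=weekday)
```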

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

> Dong-hee Na, if you want to make a fresh PR for this and bring it to 
> fruition, I would be happy to review and apply it.

It seems premature to say that you will accept a PR for this when there's no 
consensus for actually adding the feature, and it would be good to probably 
work out if it's even desirable before asking contributors to do more work on 
it.

It seems like it would be better to argue the point of *why* you think a 
structseq actually solves the problem here. Is a struct sequence more backwards 
compatible than a namedtuple? Less? Is it as performant? Will it make it easier 
or harder to maintain compatibility between the C and pure Python 
implementations of the datetime module?

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-03 Thread Paul Ganssle


Paul Ganssle  added the comment:

> What IS unprecedented is having a C function bend over backwards to return an 
> instance of collections.namedtuple().  

Is this an issue that anyone is currently insisting upon? From what I can tell 
the current implementation uses a structseq and none of my objections had to do 
with the properties of a structseq.

> ISTM the cross-version pickling issue is minor red herring.  We've cheerfully 
> upgraded tuples to structseqs on a number of occasions and it hasn't been an 
> issue.

I generally agree with this - it is nice to not break this compatibility when 
there's no good reason to do so, but pickling data in one version of Python and 
unpickling it in another is not something that's supported by the pickle module 
anyway.

> Tim, would you please weigh in on this so we can put this to bed, either 
> closing the request because we're too meek to make any change, 
> upgrading to structseq to provide the requested functionality, or twisting our 
> code in weird ways to have a C function become dependent on a pure python 
> module.

I must take some umbrage at this framing of the question. I don't even know 
where meekness comes into it - adding *any* new functionality brings support 
burdens and additional code complexity and changes to code that's been stable 
for a long time like `date.isocalendar` is particularly likely to cause 
breakages if only because of the time people have had to start relying on the 
specific implementation. I have merely asked for a justification and an 
argument other than your subjective judgement that this is a nice improvement.

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-03 Thread Paul Ganssle


Paul Ganssle  added the comment:

In an effort to get a sense of how useful this would actually be, I did a code 
search for `.isocalendar()` on GitHub. I saw a few doctests that will break (if 
they're even being run) if we make this change, but I also found that the 
*vast* majority of uses of `isocalendar` seem to be people pulling out a single 
component of it, like:  `return datetime.datetime.now().isocalendar()[1]`.

That is not the kind of usage pattern I was envisioning when I said that this 
was a marginal improvement, a *lot* of this code could be made considerably 
more readable with named fields. If indeed the performance is similar or the 
same and this won't impact consumers of the pure python version of the module 
unduly (I checked in #pypy and they think that it shouldn't be more than a 
minor irritation if anything), then I am changing my -1 to a +1.
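
To illustrate the readability point, here is a minimal sketch comparing index 
access with named access. It assumes Python 3.9 or later, where the result of 
`isocalendar()` exposes `year`, `week` and `weekday`; the `getattr` fallback is 
only there so the snippet also runs on older versions.

```python
import datetime

d = datetime.date(2019, 9, 3)
iso = d.isocalendar()

# Positional access, as in most of the code found in the search.
week_by_index = iso[1]

# Named access; falls back to the index on versions without named fields.
week_by_name = getattr(iso, "week", iso[1])

print(week_by_index, week_by_name)
```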

--
assignee: tim.peters -> p-ganssle

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-08 Thread Paul Ganssle

Paul Ganssle  added the comment:

I haven't had time to try this with an optimized build, but I have done a few 
more benchmarks using a standard build. I'm seeing almost a 50% regression on 
isocalendar() calls, but the picture is a little rosier if you consider the 
fact that you need to construct a `date` or `datetime` object in the first 
place and anyone concerned with performance will likely be making exactly one 
`isocalendar()` call per datetime object. The common case is that they 
construct the datetime with a datetime literal or something equivalent. The 
"worst case scenario" is that they construct a single "seed" datetime and then 
construct many new datetimes with faster operations like adding a timedelta.

I have compared the following cases:

call only: -s 'import datetime; dt = datetime.datetime(2019, 1, 1)' 'dt.isocalendar()'
constructor + call: -s 'import datetime' 'dt = datetime.datetime(2019, 1, 1); dt.isocalendar()'
timedelta + call: -s 'import datetime; dt = datetime.datetime(2019, 1, 1); td = datetime.timedelta(days=1)' '(dt + td).isocalendar()'
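
The three cases above can also be timed from Python with the `timeit` module. 
This is only a sketch of the approach, not the exact harness behind the numbers 
in this comment; the `CASES` dict and the `run_cases` helper are names made up 
for illustration.

```python
import timeit

# (setup, statement) pairs mirroring the three command-line cases.
CASES = {
    "call only": (
        "import datetime; dt = datetime.datetime(2019, 1, 1)",
        "dt.isocalendar()",
    ),
    "constructor + call": (
        "import datetime",
        "dt = datetime.datetime(2019, 1, 1); dt.isocalendar()",
    ),
    "timedelta + call": (
        "import datetime; dt = datetime.datetime(2019, 1, 1); "
        "td = datetime.timedelta(days=1)",
        "(dt + td).isocalendar()",
    ),
}


def run_cases(number=10_000, repeat=3):
    """Return the best per-call time in nanoseconds for each case."""
    results = {}
    for name, (setup, stmt) in CASES.items():
        best = min(timeit.repeat(stmt, setup=setup, number=number, repeat=repeat))
        results[name] = best / number * 1e9
    return results


results = run_cases()
for name, ns in results.items():
    print(f"{name}: {ns:.0f} ns")
```

Absolute numbers will differ between machines and builds; only the relative 
change between two interpreters is meaningful.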


The results are summarized below. The most likely real practical impact on 
performance-sensitive "hot loops" would be a 29% performance regression *if* 
isocalendar() is the bottleneck:


benchmark                     | master (ns) | PR 15633 (ns) | Δ (%)
------------------------------+-------------+---------------+------
call only (datetime)          | 349 (±14)   | 511 (±22)     |   46
constructor + call (datetime) | 989 (±48)   | 1130 (±50)    |   14
timedelta + call (datetime)   | 550 (±14)   | 711 (±22)     |   29


The numbers for `datetime.date` are basically similar:

benchmark                     | master (ns) | PR 15633 (ns) | Δ (%)
------------------------------+-------------+---------------+------
call only (date)              | 360 (±18)   | 530 (±41)     |   47
constructor + call (date)     | 824 (±17)   | 975 (±29)     |   18
timedelta + call (date)       | 534 (±20)   | 685 (±24)     |   28


This is a bit disheartening, but given that these performance-sensitive 
situations are rare and the "grab a single component" use is overwhelmingly 
the common case, I think it's worth it in the end.

If there are significant complaints about the performance regression, we may be 
able to create a variant method similar to the way that `chain` has 
`chain.from_iterable`, something like `dt.isocalendar.as_tuple()` for the 
performance-sensitive folks. That said, that's very YAGNI; we should not do 
that unless someone comes forward with a real use case that will be adversely 
affected here.

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-08 Thread Paul Ganssle

Paul Ganssle  added the comment:

I have compiled both versions with optimizations on, and it looks like the gap 
gets a bit smaller (percentage-wise) after that:

benchmark                     | master (ns) | PR 15633 (ns) | Δ (%)
------------------------------+-------------+---------------+------
call only (datetime)          | 73 (±3)     | 92.3 (±7)     |   26
constructor + call (datetime) | 228 (±9)    | 260 (±16)     |   14
timedelta + call (datetime)   | 108 (±5)    | 128 (±9)      |   18

If this were something fundamental like a performance regression in building a 
tuple or constructing a dictionary or something I'd be concerned, but this just 
reinforces my feeling that, on balance, this is worth it, and that we are 
probably not going to need a "fast path" version of this.

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue38065] Document the datetime capsule API

2019-09-09 Thread Paul Ganssle


New submission from Paul Ganssle :

The datetime module has a capsule API, which is very useful for other 
languages' bindings, but the C API documentation for datetime only covers the C 
macros: https://docs.python.org/3/c-api/datetime.html

The current extent of the documentation is that everyone who wants to bind to 
the C API (PyO3, Cython, pypy, etc.) needs to just read the struct 
definition ( 
https://github.com/python/cpython/blob/master/Include/datetime.h#L150 ) and 
read `_datetimemodule.c`. There's even some question as to whether the capsule 
is public (see, e.g. 
https://github.com/PyO3/pyo3/pull/393#issuecomment-476664650 ), when in fact 
I'm fairly certain that it's actually *preferred* to use the capsule API.

Most or many of the macros are thin wrappers around the capsule API, so we may 
need to figure out whether we want to try and use the same "documentation" for 
both versions, e.g.:

  .. c:function:: PyObject* PyDateTime_CAPI.Date_FromDate(int year, int month, int day, PyTypeObject* cls)
  .. c:function:: PyObject* PyDate_FromDate(int year, int month, int day)

     Return a :class:`datetime.date` object with the specified year, month and day.

     The version of this function in the capsule module takes an additional argument
     representing the specific subclass to construct.

Could replace:

  .. c:function:: PyObject* PyDate_FromDate(int year, int month, int day)

     Return a :class:`datetime.date` object with the specified year, month and day.

I would say that we also need a paragraph or two at the beginning of the C API 
document explaining why there are two ways to access most of these things.

A more minor bikeshedding-y issue is how we should stylize these: 
PyDateTime_CAPI.x? PyDateTime_CAPI->x? A dedicated RST directive? Something 
else?

--
assignee: docs@python
components: Documentation
messages: 351427
nosy: belopolsky, docs@python, eric.araujo, ezio.melotti, mdk, p-ganssle, 
willingc
priority: normal
severity: normal
status: open
title: Document the datetime capsule API
type: enhancement
versions: Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue38065>
___



[issue38082] datetime.time object incorrectly shows associated date in strftime() output

2019-09-10 Thread Paul Ganssle

Paul Ganssle  added the comment:

Hi Abhisek,

This is actually the expected / intended behavior, and it is documented under 
"strptime() and strftime() behavior": 
https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior 
(which is linked to by the time.strftime documentation: 
https://docs.python.org/3/library/datetime.html#datetime.datetime.strftime ). 
The relevant section is:

  For time objects, the format codes for year, month, and day should not be 
used, as time objects have no such values. If they’re used anyway, 1900 is 
substituted for the year, and 1 for the month and day.
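
A short sketch of the documented behavior, using an arbitrary time value chosen 
for illustration:

```python
import datetime

t = datetime.time(12, 30)

# Date format codes on a time object are filled in with 1900-01-01.
formatted = t.strftime("%Y-%m-%d %H:%M")
print(formatted)  # 1900-01-01 12:30
```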

If I were designing `datetime.time.strftime` from scratch, my instinct would be 
to throw an error in this case, but my philosophy on parsing interfaces tends 
towards the stricter side. At this point, I think it would do more harm than 
good to change this behavior. I imagine that the motivation is something like 
the Robustness Principle ( https://en.wikipedia.org/wiki/Robustness_principle 
), but I wasn't involved in the original design so I can't be sure.

Thank you for taking the time to make a bug report, it's very appreciated even 
when it turns out to not be a bug.

--
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue38082>
___



[issue37499] test_gdb.test_pycfunction should use dedicated test functions

2019-09-10 Thread Paul Ganssle


Paul Ganssle  added the comment:

This is done, thanks Petr and Jeroen!

I don't see a need to backport this unless we also want to backport GH-14311 or 
something else that depends on it.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37499>
___



[issue36960] Make datetime docs more user-friendly

2019-09-11 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 3fb1363fe87a24cdb2ee1dd9746f1c49046af958 by Paul Ganssle (Brad) 
in branch 'master':
Overhaul datetime documentation (GH-13410)
https://github.com/python/cpython/commit/3fb1363fe87a24cdb2ee1dd9746f1c49046af958


--

___
Python tracker 
<https://bugs.python.org/issue36960>
___



[issue36960] Make datetime docs more user-friendly

2019-09-11 Thread Paul Ganssle


Change by Paul Ganssle :


--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed
versions: +Python 3.9 -Python 3.8

___
Python tracker 
<https://bugs.python.org/issue36960>
___



[issue38096] Clean up the "struct sequence" / "named tuple" docs

2019-09-11 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 7117074410118086938044c7a4ef6846ec1662b2 by Paul Ganssle (Raymond 
Hettinger) in branch 'master':
bpo-38096: Clean up the "struct sequence" / "named tuple" docs (GH-15895)
https://github.com/python/cpython/commit/7117074410118086938044c7a4ef6846ec1662b2


--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue38096>
___



[issue38096] Clean up the "struct sequence" / "named tuple" docs

2019-09-11 Thread Paul Ganssle


Change by Paul Ganssle :


--
resolution:  -> fixed
stage: patch review -> backport needed
status: open -> pending

___
Python tracker 
<https://bugs.python.org/issue38096>
___



[issue38096] Clean up the "struct sequence" / "named tuple" docs

2019-09-11 Thread Paul Ganssle


Change by Paul Ganssle :


--
pull_requests: +15595
status: pending -> open
pull_request: https://github.com/python/cpython/pull/15961

___
Python tracker 
<https://bugs.python.org/issue38096>
___



[issue38096] Clean up the "struct sequence" / "named tuple" docs

2019-09-11 Thread Paul Ganssle


Change by Paul Ganssle :


--
pull_requests: +15596
stage: backport needed -> patch review
pull_request: https://github.com/python/cpython/pull/15962

___
Python tracker 
<https://bugs.python.org/issue38096>
___



[issue38121] Synchronize importlib.metadata with importlib_metadata 0.22

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 8ed6503eca4e3ea4949479d8d7fd9ffd54f81038 by Paul Ganssle (Jason 
R. Coombs) in branch 'master':
bpo-38121: Sync importlib.metadata with 0.22 backport (GH-15993)
https://github.com/python/cpython/commit/8ed6503eca4e3ea4949479d8d7fd9ffd54f81038


--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue38121>
___



[issue38096] Clean up the "struct sequence" / "named tuple" docs

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:

Sorry guys, my mistake. I think I was a bit caught up in the workflow at the 
sprint where I've been going through the review-cleanup-merge process a lot 
faster than I usually do (partially since I have the time and partially since 
the huge number of PRs getting merged is requiring a lot of rebases, so it's 
better to get them in quicker).

No need to worry, I will not merge any of your PRs in the future unless you 
request it for some reason.

--

___
Python tracker 
<https://bugs.python.org/issue38096>
___



[issue13927] Document time.ctime format

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 2d32bf1ef23c9e468b2e8afab3c24e7a2047ac36 by Paul Ganssle 
(Harmandeep Singh) in branch 'master':
bpo-13927: time.ctime and time.asctime return string explantion (GH-11303)
https://github.com/python/cpython/commit/2d32bf1ef23c9e468b2e8afab3c24e7a2047ac36


--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue13927>
___



[issue13927] Document time.ctime format

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:

We've merged the PR and I think it resolves this issue, so we can close this 
issue now. Please let me know if it's not fully resolved and we can re-open.

Thanks Roger for reporting this and Harmandeep for making the PR and requested 
changes.

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed
type: behavior -> enhancement

___
Python tracker 
<https://bugs.python.org/issue13927>
___



[issue22377] %Z in strptime doesn't match EST and others

2019-09-12 Thread Paul Ganssle


Change by Paul Ganssle :


--
stage:  -> needs patch
versions: +Python 3.7, Python 3.8, Python 3.9 -Python 3.5, Python 3.6

___
Python tracker 
<https://bugs.python.org/issue22377>
___



[issue38139] [BUG] datetime.strptime could not handle timezone

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:

Hi Yixing, thank you for your bug report. This issue has already been reported, 
and the discussion is in issue #22377.

In the short term I believe the solution will be to document the current 
behavior. In the long term there are some solutions, though I imagine none of 
them will be amazingly satisfying, because the output of %Z is basically 
freeform and doesn't necessarily map to an unambiguous offset. The best you'd 
be able to do would be %Z%z or %z%Z, but there will be many implementation 
challenges there.
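
For reference, a small sketch of how `%z` and `%Z` currently behave in 
`strptime`. The input strings are made up for illustration, and the "EST" case 
is deliberately left unasserted because its outcome depends on the local time 
zone names of the machine running it.

```python
from datetime import datetime, timedelta

# %z parses a numeric offset and yields an aware datetime.
aware = datetime.strptime("2019-09-12 12:00 -0500", "%Y-%m-%d %H:%M %z")

# %Z accepts "UTC" (and "GMT"), but the resulting datetime stays naive.
naive = datetime.strptime("2019-09-12 12:00 UTC", "%Y-%m-%d %H:%M %Z")

# Other abbreviations such as "EST" only match if they happen to be one of
# the local zone names, so this may or may not raise.
try:
    datetime.strptime("2019-09-12 12:00 EST", "%Y-%m-%d %H:%M %Z")
    est_parsed = True
except ValueError:
    est_parsed = False

print(aware.utcoffset(), naive.tzinfo, est_parsed)
```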

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> %Z in strptime doesn't match EST and others

___
Python tracker 
<https://bugs.python.org/issue38139>
___



[issue37488] Document the "gotcha" behaviors in utcnow() and utcfromtimestamp()

2019-09-12 Thread Paul Ganssle


Paul Ganssle  added the comment:

Thanks Joannah!

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue37488>
___



[issue38155] Add __all__ to the datetime module

2019-09-13 Thread Paul Ganssle


New submission from Paul Ganssle :

Currently the datetime module has no __all__, which means we only advertise 
what is public and private based on leading underscores. Additionally, because 
there are two implementations (Python and C), you actually get different things 
when you do `from datetime import *` depending on whether you have the C module 
installed or not.

The "easy" part is to add an __all__ variable to Lib/datetime.py for all the 
documented attributes:

  __all__ = ["date", "datetime", "time", "timedelta", "timezone", "tzinfo",
             "MINYEAR", "MAXYEAR"]

A "stretch goal" would be to add a test to ensure that `from datetime import *` 
imports the same set of symbols from the pure python module that it does from 
the C module. I haven't quite thought through how this would be achieved, 
probably something in test_datetime 
(https://github.com/python/cpython/blob/6a517c674907c195660fa9178a7b561de49cc721/Lib/test/test_datetime.py#L1),
 where we need to import both modules anyway. I think we can accept an "add 
__all__" PR without tests, though.
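
As a rough illustration of why the two implementations can differ, here is a 
sketch of which names a star-import binds. `star_import_names` and the `demo` 
module are hypothetical helpers written for this example, not anything in the 
standard library or the test suite.

```python
import datetime
import types


def star_import_names(module):
    """Return the names that `from module import *` would bind."""
    if hasattr(module, "__all__"):
        return set(module.__all__)
    # Without __all__, every name not starting with an underscore is exported.
    return {name for name in vars(module) if not name.startswith("_")}


# Stand-in for a module that defines no __all__.
demo = types.ModuleType("demo")
demo.public = 1
demo._private = 2

demo_names = star_import_names(demo)
datetime_names = star_import_names(datetime)
print(sorted(demo_names))
```

A cross-implementation test could compare the result of this kind of helper for 
the C and pure Python modules, assuming both can be imported side by side.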

--
components: Library (Lib)
keywords: newcomer friendly
messages: 352280
nosy: belopolsky, p-ganssle
priority: normal
severity: normal
stage: needs patch
status: open
title: Add __all__ to the datetime module
type: enhancement
versions: Python 3.9

___
Python tracker 
<https://bugs.python.org/issue38155>
___



[issue38155] Add __all__ to the datetime module

2019-09-13 Thread Paul Ganssle


Paul Ganssle  added the comment:

Hi Tahia: Go ahead and make a PR, no need to worry about the test.

I mainly put in the bit about tests because I was hoping to nerd-snipe someone 
into figuring out how to do it for me ;) It's not a particularly important test.

--

___
Python tracker 
<https://bugs.python.org/issue38155>
___



[issue38155] Add __all__ to the datetime module

2019-09-13 Thread Paul Ganssle


Paul Ganssle  added the comment:

Actually, how about adding this simpler test into `Lib/test/datetimetester.py`, 
right above test_name_cleanup 
(https://github.com/python/cpython/blob/ff2e18286560e981f4e09afb0d2448ea994414d8/Lib/test/datetimetester.py#L65):

    def test_all(self):
        """Test that __all__ only points to valid attributes."""
        all_attrs = dir(datetime_module)
        for attr in datetime_module.__all__:
            self.assertIn(attr, all_attrs)

This will at least test that __all__ only contains valid attributes on the 
module.

--

___
Python tracker 
<https://bugs.python.org/issue38155>
___



[issue37555] _CallList.__contains__ doesn't always respect ANY.

2019-09-13 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset d6a9d17d8b6c68073931dd8ffa213b4ac351a4ab by Paul Ganssle 
(Elizabeth Uselton) in branch 'master':
bpo-37555: Update _CallList.__contains__ to respect ANY (#14700)
https://github.com/python/cpython/commit/d6a9d17d8b6c68073931dd8ffa213b4ac351a4ab


--

___
Python tracker 
<https://bugs.python.org/issue37555>
___



[issue24416] Have date.isocalendar() return a structseq instance

2019-09-13 Thread Paul Ganssle


Paul Ganssle  added the comment:

The current state of the PR doesn't hinge on the pure Python implementation. We 
went with a very simple tuple subclass to keep the two more closely in sync and 
because we don't need any of the additional functionality that namedtuple 
brings, but if it were any more complicated than what we did we probably would 
have just gone with a namedtuple.

The only thing that's holding things up now is that we're working out a way to 
maintain the ability to pickle the object without making the class public. This 
is not really a hard requirement, but I'd like to give it an honest effort 
before calling it a day and just relying on "it's not in __all__, therefore 
it's not public." (I should note that we can only take that approach after 
issue #38155 is resolved, which is another reason for the delay).

In any case, the bulk of the conversation on the implementation has been taking 
place on GH-15633, sorry for the split discussion location, folks.

--

___
Python tracker 
<https://bugs.python.org/issue24416>
___



[issue30367] Cannot build CPython3.6 with module “testcapimodule” statically

2019-09-17 Thread Paul Ganssle


Paul Ganssle  added the comment:

Is this issue only in Python 3.6? I believe Python 3.6 is only receiving 
security fixes at the moment, so this could only be fixed in 3.7, 3.8 and 3.9.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue30367>
___



[issue38155] Add __all__ to the datetime module

2019-09-19 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 96b1c59c71534db3f0f3799cd84e2006923a5098 by Paul Ganssle (t k) in 
branch 'master':
bpo-38155: Add __all__ to datetime module (GH-16203)
https://github.com/python/cpython/commit/96b1c59c71534db3f0f3799cd84e2006923a5098


--

___
Python tracker 
<https://bugs.python.org/issue38155>
___



[issue38155] Add __all__ to the datetime module

2019-09-19 Thread Paul Ganssle


Paul Ganssle  added the comment:

Closing this as resolved. I don't think we should backport this, as it's more 
of an enhancement than a bug fix (and since no one has ever complained about it 
to my knowledge, I don't think there's any big rush to see this released).

Thanks Tahia and all the reviewers!

--
resolution:  -> fixed
stage: patch review -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue38155>
___



[issue35540] dataclasses.asdict breaks with defaultdict fields

2019-09-24 Thread Paul Ganssle


Change by Paul Ganssle :


--
pull_requests: +15935
pull_request: https://github.com/python/cpython/pull/16356

___
Python tracker 
<https://bugs.python.org/issue35540>
___



[issue35540] dataclasses.asdict breaks with defaultdict fields

2019-09-24 Thread Paul Ganssle


Paul Ganssle  added the comment:

Considering that `namedtuple` is special-cased, I think it's reasonable to 
special-case `defaultdict` as well, though it may be worth considering more 
general solutions that will also work for things other than the standard 
library. One would be to solve this the same way that other "subclasses may 
have a different constructor" problems are solved (e.g. `float`, `int`, 
formerly `datetime`) and ignore the subclass (or selectively ignore it if it's 
a problem), for example changing _asdict_inner to something like this:

    if isinstance(obj, dict):
        new_keys = tuple((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())

        try:
            return type(obj)(new_keys)
        except Exception:
            return dict(new_keys)
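
A self-contained sketch of that fallback, outside of `dataclasses`. The 
`rebuild_mapping` helper and its `convert` parameter are hypothetical names for 
illustration; the real change would live inside `_asdict_inner`.

```python
from collections import OrderedDict, defaultdict


def rebuild_mapping(obj, convert=lambda value: value):
    """Rebuild a mapping with its own type, falling back to a plain dict."""
    items = tuple((convert(k), convert(v)) for k, v in obj.items())
    try:
        # Works for subclasses whose constructor accepts an iterable of pairs.
        return type(obj)(items)
    except Exception:
        # defaultdict expects a factory as its first argument, so it ends up here.
        return dict(items)


from_defaultdict = rebuild_mapping(defaultdict(list, {"a": [1]}))
from_ordered = rebuild_mapping(OrderedDict(a=1))
print(type(from_defaultdict).__name__, type(from_ordered).__name__)
```

The trade-off is that the `defaultdict` silently loses its default factory, 
which is the "ignore the subclass" behavior described above.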

Another more general alternative would be to add a type registry for `asdict`, 
either as an additional parameter or with a new transformer class of some sort. 
I created a quick proof of concept for this in GH-16356 to see one way it could 
look.

In any case I think it's quite unfortunate that we can't easily just support 
anything that has a __deepcopy__ defined. There may be some crazy solution that 
involves passing a class with a custom __getitem__ to the `memo` argument of 
copy.deepcopy, but if it's even possible (haven't thought about it enough) I'm 
not sure it's *advisable*.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue35540>
___



[issue35540] dataclasses.asdict breaks with defaultdict fields

2019-09-24 Thread Paul Ganssle


Paul Ganssle  added the comment:

I checked and it appears that `attrs` handles this by creating *all* dicts 
using the default dict_factory (similar to my original suggestion of just using 
`dict` instead of the specific type), if I'm reading this right: 
https://github.com/python-attrs/attrs/blob/master/src/attr/_funcs.py#L102

Using `attr.asdict` seems to bear this out, as `defaultdict` attributes are 
converted to `dict` when the dict factory is not specified.

I think changing the default behavior like that would be a 
backwards-incompatible change at this point (and one that it's really hard to 
warn about, unfortunately), but we could still use the "fall back to 
dict_factory" behavior by trying to construct a `type(obj)(...)` and in the 
case of an exception return `dict_factory(...)`.

--

___
Python tracker 
<https://bugs.python.org/issue35540>
___



[issue7980] time.strptime not thread safe

2019-10-03 Thread Paul Ganssle


Paul Ganssle  added the comment:

From what I can tell, this is a Python 2.7-only bug, and it's not a security 
issue, so I think we can close the issue as either "wontfix" (because we won't 
fix it in Python 2) or "fixed" (because it is already fixed in Python 3), 
depending on your perspective.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue7980>
___



[issue37527] Timestamp conversion on windows fails with timestamps close to EPOCH

2019-11-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

This indeed seems to be a duplicate of 29097, which is fixed in Python 3.7, so 
we can close this bug. Thank you for your report Dschoni, and thank you for 
finding the duplicate Ma Lin!

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> [Windows] datetime.fromtimestamp(t) when 0 <= t <= 86399 fails 
on Python 3.6

___
Python tracker 
<https://bugs.python.org/issue37527>
___



[issue38233] datetime.datetime.fromtimestamp have different behaviour on windows and mac

2019-11-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

Changing the superseder here as I think #36439 matches better than #37527.

--
nosy: +p-ganssle
resolution: duplicate -> 
status: closed -> open
superseder: Timestamp conversion on windows fails with timestamps close to 
EPOCH -> Inconsistencies with datetime.fromtimestamp(t) when t < 0

___
Python tracker 
<https://bugs.python.org/issue38233>
___



[issue36439] Inconsistencies with datetime.fromtimestamp(t) when t < 0

2019-11-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

This has been coming up in a few different contexts lately, so I think it would 
be really good if we could get some sort of fix for it.

One option is to implement our own versions of these APIs for use in Windows, 
but a thought occurred to me recently: we have not investigated the possibility 
of seeing if Microsoft would be willing to either add support for negative 
timestamps in their localtime() or gmtime() implementations or add a new API 
that *does* support negative timestamps. It would also be good to rule out the 
possibility that such APIs already exist but we just don't know about them 
(preliminary googling doesn't show anything, though possibly something can be 
done with the Win32 APIs? Not sure how or if those work in C and how big a lift 
it would be to maintain compatibility if we can switch: 
https://docs.microsoft.com/en-us/windows/win32/sysinfo/time-functions?redirectedfrom=MSDN
 ).
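
For context, a commonly used pure-Python workaround (not a fix proposed in this 
issue) avoids the platform `localtime()` / `gmtime()` calls entirely by doing 
arithmetic from the epoch. `utc_from_timestamp` is a hypothetical helper name, 
and this sketch only covers the UTC case, not local time.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def utc_from_timestamp(t):
    """Build an aware UTC datetime from a POSIX timestamp using arithmetic only."""
    return EPOCH + timedelta(seconds=t)


before_epoch = utc_from_timestamp(-1)
print(before_epoch.isoformat())
```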

Adding Steve Dower to the nosy list in case he can shed some light onto the 
possibility of native support.

--
nosy: +steve.dower

___
Python tracker 
<https://bugs.python.org/issue36439>
___



[issue37527] Timestamp conversion on windows fails with timestamps close to EPOCH

2019-11-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

Ah, my mistake. The examples all use `datetime.fromtimestamp`, so I didn't 
notice that it was failing only on the `timestamp` side. Re-opening, thanks!

--
resolution: duplicate -> 
status: closed -> open
superseder: [Windows] datetime.fromtimestamp(t) when 0 <= t <= 86399 fails on 
Python 3.6 -> 

___
Python tracker 
<https://bugs.python.org/issue37527>
___



[issue43484] valid datetimes can become invalid if the timezone is changed

2021-03-19 Thread Paul Ganssle

Paul Ganssle  added the comment:

> That it allows creating the datetime instance looks like a bug to me, i.e. a 
> time before 0001-01-01 00:00 UTC is invalid. What am I misunderstanding?

`datetime.datetime(1, 1, 1, tzinfo=timezone(timedelta(hours=1)))` is a valid 
datetime, it's just that it cannot be converted to all other timestamps, 
because in some time zones, the same absolute time is out of datetime's range.
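
A minimal sketch of that situation: construction succeeds, and only the 
conversion to UTC fails. The exception types caught here are what I would 
expect from current CPython, so treat them as an assumption.

```python
from datetime import datetime, timedelta, timezone

# Valid to construct: the naive portion (0001-01-01 00:00) is within range.
dt = datetime(1, 1, 1, tzinfo=timezone(timedelta(hours=1)))

# The same absolute time in UTC would fall one hour before datetime.min,
# so the conversion cannot be represented.
try:
    dt.astimezone(timezone.utc)
    error = None
except (OverflowError, ValueError) as exc:
    error = exc

print(repr(error))
```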

`datetime.datetime` is a representation of an abstract datetime, and it can 
also be annotated with a time zone to basically tag the civil time with a 
function for converting it into other representations of the same *absolute* 
time. The range of valid `datetime.datetime` objects is based entirely on the 
naïve portion of the datetime, and has nothing to do with the absolute time. So 
this is indeed a natural consequence of the chosen design.

If we wanted to change things, it would cause a number of problems, and the 
cure would be much worse than the "disease". For one thing, accessing UTC 
offsets is done lazily, so `.utcoffset()` is not currently called during 
`datetime` creation. The datetime documentation lays out that this is 
consistent with the raison d'être of `datetime`: "While date and time 
arithmetic is supported, the focus of the implementation is on efficient 
attribute extraction for output formatting and manipulation." In order to 
determine whether a given `datetime` can always be converted to an equivalent 
datetime in any time zone, we'd need to actively determine its UTC offset, 
which would be a major performance regression in creating aware datetimes. We 
could avoid this performance regression by only doing the `.utcoffset()` check 
when the datetime is within 2 days of `MINYEAR` or `MAXYEAR`, but while this 
would be a more minor performance regression, it would also add new edge cases 
where `.utcoffset()` is sometimes called during the constructor and sometimes 
not, which is not ideal. Not to mention if we were to ever open up the allowed 
return values for `.utcoffset()` the logic might get hairier (depending on the 
nature of the allowed values).

Another issue with "fixing" this is that it would take currently-valid 
datetimes and turn them into invalid datetimes, which violates backwards 
compatibility. I imagine in most cases this is only done as part of test 
suites, since TZ-aware datetimes near 0 and 10,000 CE are anachronistic and not 
likely to be of much instrumental value, but the same can be said of these 
potentially "invalid" dates in the first place.

Even worse, naïve datetimes can also be converted to UTC or other time zones, 
so if we added a new constraint that 
`some_datetime.astimezone(some_timezone)` must always work, then you wouldn't 
even be able to *construct* `datetime.min` or `datetime.max`, since 
`datetime.min.astimezone(timezone(timedelta(hours=-23)))` would fail 
everywhere, and worse, the minimum datetime value you could construct would 
depend on your system's local time zone! Again, the alternative would be to 
make an exception for naïve datetimes, but given that this change is of 
dubious value to start with, I don't think it is worth it.

> So I'm pretty sure this is "not a bug" but it's a bit of a problem and I have 
> a user suggesting the "security vulnerability" bell on this one, and to be 
> honest I don't even know what any library would do to "prevent" this.

I don't really know why it would be a "security vulnerability", but presumably 
a library could convert its datetimes to UTC as soon as it gets them from the 
user, if it wants to use them as UTC in the future. Alternatively, it could 
simply refuse to accept any datetimes outside the range 
`datetime.datetime.min + timedelta(hours=48) < dt.replace(tzinfo=None) < 
datetime.datetime.max - timedelta(hours=48)`, or if the concern is only 
about UTC, then refuse datetimes outside the range 
`datetime.datetime.min.replace(tzinfo=timezone.utc) < dt < 
datetime.datetime.max.replace(tzinfo=timezone.utc)`.
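
As a rough sketch of the first range check, assuming a library-side helper 
(the name `is_safely_convertible` and the `MARGIN` constant are hypothetical), 
using the datetime-valued bounds `datetime.min` and `datetime.max`:

```python
from datetime import datetime, timedelta, timezone

# 48 hours comfortably covers any valid UTC offset (which is under 24 hours).
MARGIN = timedelta(hours=48)

def is_safely_convertible(dt):
    """Return True if dt's naive fields are more than 48h from either bound."""
    naive = dt.replace(tzinfo=None)
    return datetime.min + MARGIN < naive < datetime.max - MARGIN

print(is_safely_convertible(datetime(2021, 3, 19, tzinfo=timezone.utc)))
print(is_safely_convertible(datetime(1, 1, 1, tzinfo=timezone(timedelta(hours=1)))))
```

A library could call something like this at its input boundary and reject the 
value with its own error, rather than failing later on conversion.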

> Why's this a security problem? ish? because PostgreSQL has a data type 
> "TIMESTAMP WITH TIMEZONE" and if you take said date and INSERT it into your 
> database, then SELECT it back using any Python DBAPI that returns datetime() 
> objects like psycopg2, if your server is in a timezone with zero or negative 
> offset compared to the given date, you get an error.  So the mischievous user 
> can create that datetime for some reason and now they've broken your website 
> which can't SELECT that table anymore without crashing.

Can you clarify why this crashes? Is it because it always returns the datetime 
value in UTC?

> So, suppose you maintain the database library that helps people send data in 
> and out of psycopg

[issue15443] datetime module has no support for nanoseconds

2021-04-07 Thread Paul Ganssle


Paul Ganssle  added the comment:

> I don't think full nanosecond support is feasible to complete in the 
> remaining weeks

This may be so, but I think the important part of that question is "what work 
needs to be done and what questions need to be answered?" If the answer is that 
we need to make 3 decisions and do the C implementation, that seems feasible to 
do in under a month. If the answer is that we've got 10 contentious UI issues 
and we probably want to go through the PEP process, I agree with your 
assessment of the timing. Regardless, we'll need to know what work needs to be 
done before we do it...

> but we can try to add nanoseconds to timedelta only.  The mixed datetime + 
> timedelta ops will still truncate, but many time-related  operations will be 
> enabled. I would even argue that when nanoseconds precision is required, it 
> is more often intervals no longer than a few days and rarely a specific point 
> in time.

To be honest, I don't find this very compelling and I think it will only 
confuse people. I think most people use `timedelta` to represent something you 
add or subtract to a `datetime`. Having the `nanoseconds` part of it truncate 
seems like it would be frustrating and counter-intuitive. 

From the use cases in this thread: 
 - ns-precision timestamps: https://bugs.python.org/issue15443#msg180125
 - ns-precision timestamps: https://bugs.python.org/issue15443#msg223039
 - Your suggestion that `datetime` should be able to support what `timespec` 
does: https://bugs.python.org/issue15443#msg223042
 - ns-precision timestamps: https://bugs.python.org/issue15443#msg270266

So I don't think there's high enough demand for nanosecond-timedelta on its own 
that we need to rush it out there before datetime gets it.
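
For context, a small sketch of what happens today when a nanosecond-precision 
timestamp is turned into a `datetime` (the timestamp value is made up for 
illustration): anything below one microsecond has to be carried separately.

```python
from datetime import datetime, timedelta, timezone

ns_timestamp = 1_617_800_000_123_456_789  # hypothetical ns-precision reading

# Split into whole seconds, whole microseconds, and the leftover nanoseconds.
seconds, nanos = divmod(ns_timestamp, 1_000_000_000)
micros, lost_nanos = divmod(nanos, 1_000)

# datetime and timedelta both resolve down to one microsecond only.
dt = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(microseconds=micros)

print(dt.isoformat(), "leftover ns:", lost_nanos)
print("timedelta resolution:", timedelta.resolution)
```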

--

___
Python tracker 
<https://bugs.python.org/issue15443>
___



[issue42109] Use hypothesis for testing the standard library, falling back to stubs

2021-05-14 Thread Paul Ganssle


Paul Ganssle  added the comment:

@Terry

> The problem with random input tests is not that they are 'flakey', but that 
> they are useless unless someone is going to pay attention to failures and try 
> to find the cause.  This touches on the difference between regression testing 
> and bug-finding tests.  CPython CI is the former, and marred at that by buggy 
> randomly failing tests.

> My conclusion: bug testing would likely be a good idea, but should be done 
> separate from the CI test suite.  Such testing should only be done for 
> modules with an active maintainer who would welcome failure reports.

Are you saying that random input tests are flaky but that that is not the big 
problem? In my experience using hypothesis, in practice it is not the case that 
you get tests that fail randomly. The majority of the time, if your code 
violates one of the properties, the tests fail the first time you run the test 
suite (this is particularly true for strategies where hypothesis deliberately 
makes it more likely that you'll get a "nasty" input by biasing the random 
selection algorithm in that direction). In a smaller number of cases, I see 
failures that happen on the second, third or fourth run.

That said, if it were a concern that every run of the tests is using different 
inputs (and thus you might see a bug that only appears once in every 20 runs), 
it is possible to run hypothesis in a configuration where you specify the seed, 
making it so that hypothesis always runs the same set of inputs for the same 
tests. We can disable that on a separate non-CI run for hypothesis "fuzzing" 
that would run the test suite for longer (or indefinitely) looking for 
long-tail violations of these properties.
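
To illustrate only the general idea of a fixed seed (this is a stdlib-only 
sketch, not hypothesis's API; the generator and helper names are made up), 
seeding the random source makes every run exercise the same inputs:

```python
import json
import random

def random_json_value(rng, depth=0):
    """Build a random JSON-serializable value from the given RNG."""
    kinds = ["int", "str", "none", "bool"]
    if depth < 2:
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "int":
        return rng.randint(-10**6, 10**6)
    if kind == "str":
        return "".join(rng.choice("abc \u00e9\n") for _ in range(rng.randint(0, 5)))
    if kind == "none":
        return None
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "list":
        return [random_json_value(rng, depth + 1) for _ in range(rng.randint(0, 3))]
    return {str(i): random_json_value(rng, depth + 1) for i in range(rng.randint(0, 3))}

def generate_inputs(seed, count=50):
    """Same seed, same list of inputs, on every run."""
    rng = random.Random(seed)
    return [random_json_value(rng) for _ in range(count)]

def check_roundtrip(values):
    for value in values:
        assert json.loads(json.dumps(value)) == value

check_roundtrip(generate_inputs(seed=1234))
```

A separate, longer-running job could call `generate_inputs` with a fresh seed 
each time to go looking for the long-tail cases.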

I feel that if we don't at least run some form of the hypothesis tests in CI, 
there will likely be bit rot and the tests will decay in usefulness. Consider 
the case where someone accidentally breaks an edge case that makes it so that 
`json.loads(json.dumps(o))` no longer works for some obscure value of `o`. With 
hypothesis tests running in CI, we are MUCH more likely to find this bug / 
regression during the initial PR that would break the edge case than if we run 
it separately and report it later. If we run the hypothesis tests in a 
build-bot, the process would be:

1. Contributor makes PR with passing CI.
2. Core dev review passes, PR is merged.
3. Buildbot run occurs and the buildbot watch is notified.
4. Buildbot maintainers track down the PR responsible and either file a new bug 
or comment on the old bug.
5. Someone makes a NEW PR adding a regression test and the fix for the old PR.
6. Core dev review passes, second PR is merged.

If we run it in CI, the process would be:

1. Contributor makes PR, CI breaks.
2. If the contributor doesn't notice the broken CI, core dev points it out and 
it is fixed (or the PR is scrapped as unworkable).

Note that in the non-CI process, we need TWO core dev reviews, we need TWO PRs 
(people are not always super motivated to fix bugs that don't affect them that 
they caused when fixing a bug that does affect them), and we need time and 
effort from the buildbot maintainers (note the same applies even if the 
"buildbot" is actually a separate process run by Zac out of a github repo).

Even if the bug only appears in one out of every 4 CI runs, it's highly likely 
that it will be found and fixed before it makes it into production, or at least 
much more quickly, considering that most PRs go through a few edit cycles, and 
a good fraction of them are backported to 2-3 branches, all with separate CI 
runs. It's a much quicker feedback loop.

I think there's an argument to be made that incorporating more third-party 
libraries (in general) into our CI build might cause headaches, but I think 
that is not a problem specific to hypothesis, and I think it's one where we can 
find a reasonable balance that allows us to use hypothesis in one form or 
another in the standard library.

--

___
Python tracker 
<https://bugs.python.org/issue42109>
___



[issue24929] _strptime.TimeRE should not enforce range in regex

2021-05-18 Thread Paul Ganssle


Paul Ganssle  added the comment:

I also commented on GH-26215 ( https://github.com/python/cpython/pull/26215 ), 
but for posterity, I'll note a few things:

1. It seems that (and this may have changed since 2015), `_strptime._strptime` 
now has a stage that (unconditionally?) constructs a temporary `datetime_date`, 
which means it does do this particular validation in both `time.strptime` and 
`datetime.strptime`. That said, both flavors of `strptime` are *way* slower 
than I'd like them to be, and constructing an unnecessary `date`/`datetime` is 
a pretty good way to slow down your function, so if we ever go around 
optimizing this function, that may be one of the first bits on the chopping 
block.

2. The logic for `strptime` is very complicated and it's very hard to test the 
full input space of the function (particularly since we're not using property 
tests (yet)...). This makes me somewhat uneasy about moving the validation 
stage from the beginning of the function (in parsing the regular expression) to 
the very *end* of the function (in the datetime constructor).

It's *probably* safe to do so, but it may also be worth exploring the 
possibility of validating this directly in `_strptime` (possibly immediately 
after the string is parsed by the regex), and raising a cleaner error message 
on failure.

Probably not worth spending a ton of time on that compared to improving the 
testing around this so that we can feel confident making changes under the 
hood. `.strptime` is really quite slow, and I wouldn't be surprised if we 
pulled out its guts and replaced most of the regex stuff with a fast C parser 
at some point in the future. Having good tests will both give us confidence to 
make this change (and that making this change won't lead to regressions in the 
future) and help with any future project to replace `_strptime._strptime` with 
a faster version, so I'd say that's the most important thing to do here.
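
As a starting point for such tests, here is a minimal sketch (the helper name 
is made up) checking that out-of-range days are rejected with `ValueError` by 
both flavors, without assuming which internal stage raises it:

```python
import time
from datetime import datetime

def raises_value_error(func, *args):
    """Return True if calling func(*args) raises ValueError."""
    try:
        func(*args)
    except ValueError:
        return True
    return False

# February 30th is within the per-field range for a day (1-31) but is not
# a real date, so it can only be caught by date-level validation.
feb_30_datetime = raises_value_error(datetime.strptime, "2021-02-30", "%Y-%m-%d")
feb_30_time = raises_value_error(time.strptime, "2021-02-30", "%Y-%m-%d")

# Day 32 is outside the per-field range altogether.
day_32 = raises_value_error(datetime.strptime, "2021-01-32", "%Y-%m-%d")

print(feb_30_datetime, feb_30_time, day_32)
```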

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue24929>
___



[issue42109] Use hypothesis for testing the standard library, falling back to stubs

2021-05-19 Thread Paul Ganssle


Paul Ganssle  added the comment:

I do not want to dissuade you from figuring out how minithesis / hypothesis 
works (far from it), but I'm wondering if the question of how shrinking works 
is germane to the issue at hand, which is whether or not hypothesis / 
property-based-testing is suitable for testing the standard library.

I almost don't think it matters *if* shrinking works, much less *how*. 
Shrinking is something of a UI nicety in the sense that when hypothesis finds 
an example that violates a property, it will try to take whatever example it 
has chosen and find something like a "minimal working example" to show to the 
end user. So if we have something like:

```
from hypothesis import given, strategies

@given(x=strategies.integers())
def f(x):
    assert x >= 0
```

Hypothesis will presumably tell us that `x == -1` will violate the property 
instead of `-24948929` or some other random thing. But if it did report 
`-24948929` or something, it wouldn't be too hard to see "oh, right, integers 
can be negative". In some cases with the time zone stuff, there isn't a good 
metric for complexity at the moment, so time zones are sorted by their IANA key 
(which is to say, basically arbitrary), and I generally find it useful anyway 
(admittedly, the shrinking still is helpful because most problems affecting all 
zones will return Africa/Abuja, whereas things particular to a specific zone's 
odd time zone history will return whichever zone with that quirk comes 
alphabetically first).

Anyway, do not let me disrupt your process, I just thought it might be worth 
making the point that some of these specific details might be nice to know, but 
don't seem like they should be blockers for hypothesis' adoption in the 
standard library.

--

___
Python tracker 
<https://bugs.python.org/issue42109>
___



[issue43295] datetime.strptime emits IndexError on parsing 'z' as %z

2021-05-19 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset c87b81dcb2c22b6d151da39a0f65d5db304f59a8 by Miss Islington (bot) 
in branch '3.9':
bpo-43295: Fix error handling of datetime.strptime format string '%z' 
(GH-24627) (#25695)
https://github.com/python/cpython/commit/c87b81dcb2c22b6d151da39a0f65d5db304f59a8


--

___
Python tracker 
<https://bugs.python.org/issue43295>
___



[issue42109] Use hypothesis for testing the standard library, falling back to stubs

2021-05-25 Thread Paul Ganssle

Paul Ganssle  added the comment:

> I use hypothesis during development, but don't have a need for it in the 
> standard library.  By the time code lands there, we normally have a specific 
> idea of what edge cases need to be in the tests.

The suggestion I've made here is that we use @example decorators to take the 
hypothesis tests you would have written already and turn them into what are 
essentially parameterized tests. For anyone who doesn't want to explicitly run 
the hypothesis test suite, the tests you are apparently already writing would 
simply turn into normal tests for just the edge cases.

One added benefit of keeping the tests around in the form of property tests is 
that you can run these same tests through hypothesis to find regressions in 
bugfixes that are implemented after landing (e.g. "Oh we can add a fast path 
here", which introduces a new edge case). The segfault bug from bpo-34454, for 
example, would have been found if I had been able to carry over the 
hypothesis-based tests I was using during the initial implementation of 
fromisoformat into later stages of the development. (Either because I didn't 
run it long enough to hit that particular edge case or because that edge case 
didn't come up until after I had moved the development locus into the CPython 
repo, I'm not sure).

Another benefit of keeping them around is that they become fuzz targets, 
meaning people like oss-fuzz or anyone who wants to throw some fuzzing 
resources at CPython have an existing body of tests that are expected to pass 
on *any* input, to find especially obscure bugs.

> For the most part, hypothesis has not turned up anything useful for the 
> standard library.  Most of the reports that we've gotten reflected a 
> misunderstanding by the person running hypothesis rather than an actual bug. 
> [...]

I don't really think it's a good argument to say that it hasn't turned up 
useful bugs. Most of the bugs in a bit of code will be found during development 
or during the early stages of adoption, and we have very wide adoption. I've 
found a number of bugs in zoneinfo using hypothesis tests, and I'd love to 
continue using them in CPython rather than throwing them away or maintaining 
them in a separate repo.

I also think it is very useful for us to write tests about the properties of 
our systems for re-use in PyPy (which does use hypothesis, by the way) and 
other implementations of Python. This kind of "define the contract and maintain 
tests to enforce that" is very helpful for alternate implementations.

> For numeric code, hypothesis is difficult to use and requires many 
> restrictions on the bounds of variables and error tolerances.  [...]

I do not think that we need to make hypothesis tests mandatory. They can be 
used when someone finds them useful.

> The main area where hypothesis seems easy to use and gives some comfort is in 
> simple roundtrips:  assert zlib.decompress(zlib.compress(s)) == s.  However, 
> that is only a small fraction of our test cases.

Even if this were the only time that hypothesis were useful (I don't think it 
is), some of these round-trips can be among the trickiest and most important 
code to test, even if it's a small fraction of the tests. We have a bunch of 
functions that are basically "Load this file format" and "Dump this file 
format", usually implemented in C, which are a magnet for CVEs and often the 
target for fuzz testing for that reason. Having a small library of maintained 
tests for round tripping file formats seems like it would be very useful for 
people who want to donate compute time to fuzz test CPython (or other 
implementations!)

> Speed is another issue.  During development, it doesn't matter much if 
> Hypothesis takes a lot of time exercising one function.  But in the standard 
> library tests already run slow enough to impact development.  If hypothesis 
> were to run every time we run a test suite, it would make the situation worse.

As mentioned in the initial ticket, the current plan I'm suggesting is to have 
fallback stubs which turn your property tests into parameterized tests when 
hypothesis is not installed. If we're good about adding `@example` decorators 
(and certainly doing so is easier than writing new ad hoc tests for every edge 
case we can think of when we already have property tests written!), then I 
don't see any particular reason to run the full test suite against a full 
hypothesis run on every CI run.

My suggestion is:

1. By default, run hypothesis in "stubs" mode, where the property tests are 
simply parameterized tests.
2. Have one or two CI jobs that runs *only* the hypothesis tests, generating 
new examples — since this is just for edge case detection, it doesn't 
necessarily need to run on every combination of architectu

[issue44307] date.today() is 2x slower than datetime.now().date()

2021-06-04 Thread Paul Ganssle


Paul Ganssle  added the comment:

Yeah, I knew this was slower and it's been on my long list to look at it (tied 
to this is the fact that `datetime.today()` is basically just a slow version of 
`datetime.now()`, in defiance of user expectations).

My inclination is that we shouldn't re-implement `fromtimestamp` in 
`date.today`, but rather call `date_fromtimestamp` in the fast path. I believe 
that incurs the overhead of creating one additional Python object (an integer), 
but if it's a sufficiently significant speedup, we could possibly refactor 
`date_fromtimestamp` to a version that accepts a C integer and a version that 
accepts a Python integer, then call the version accepting a C integer.

I think this won't give any speedup to `datetime.today`, since `datetime.today` 
will still take the slow path. If we care about this, we *may* be able to 
implement `datetime.today` as an alias for `datetime.now(None)`, assuming there 
are no behavioral differences between the two.
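
For anyone who wants to check the relative timings on their own build, a 
minimal `timeit` sketch (the iteration count is arbitrary, and results will 
vary by platform and Python version):

```python
import timeit
from datetime import date, datetime

N = 20_000  # arbitrary number of iterations

t_date_today = timeit.timeit(date.today, number=N)
t_now_date = timeit.timeit(lambda: datetime.now().date(), number=N)
t_dt_today = timeit.timeit(datetime.today, number=N)
t_dt_now = timeit.timeit(datetime.now, number=N)

print(f"date.today():          {t_date_today:.4f}s")
print(f"datetime.now().date(): {t_now_date:.4f}s")
print(f"datetime.today():      {t_dt_today:.4f}s")
print(f"datetime.now():        {t_dt_now:.4f}s")
```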

--

___
Python tracker 
<https://bugs.python.org/issue44307>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-12 Thread Paul Ganssle

Paul Ganssle  added the comment:

If we want to confine the behavior to just the repl, we could possibly have the 
repl set an environment variable or something of that nature for interactive 
sessions, so that `__repr__` of `exit` can tell the difference between being 
invoked in a REPL and not — though I suppose it could cause some pretty 
frustrating and confusing behavior if some library function is doing something 
like this behind the scenes:

```
def get_all_reprs():
    return {
        v: repr(obj) for v, obj in globals().items()
    }
```

You could invoke some function and suddenly your shell quits for no apparent 
reason. And if it only happens when triggered in a REPL, you'd be doubly 
confused because you can't reproduce it with a script.
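
A harmless sketch of that hazard, using a hypothetical stand-in object whose 
`__repr__` only counts calls instead of exiting (the class name and the 
explicit `namespace` argument are made up for the illustration):

```python
class ExitOnRepr:
    """Hypothetical stand-in for an `exit` object whose repr has a side effect."""

    def __init__(self):
        self.triggered = 0

    def __repr__(self):
        # A real version would quit the interpreter here; we only count.
        self.triggered += 1
        return "<exiting...>"

def get_all_reprs(namespace):
    """Innocent-looking helper that calls repr() on everything it is given."""
    return {name: repr(obj) for name, obj in namespace.items()}

fake_exit = ExitOnRepr()
reprs = get_all_reprs({"exit": fake_exit, "answer": 42})
print(fake_exit.triggered, reprs)
```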

I do think the "type exit() to exit" is a papercut. The ideal way to fix it 
would be in the REPL layer by special-casing `exit`, but I realize that that 
may introduce unnecessary complexity that isn't worth it for this one thing.

> Second, if absolutely necessary we could ask the user to confirm that they 
> want to exit.

A thought occurs: we could simply re-word the message to make it seem like 
we're asking for confirmation:

```
>>> exit
Do you really want to exit? Press Ctrl+Z to confirm, or type exit() to exit 
without confirmation.
```

Then it won't seem as much like we know what you meant to do but aren't doing 
it, despite the fact that the behavior is exactly the same 😅.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-12 Thread Paul Ganssle


Paul Ganssle  added the comment:

I'm +1 for Pablo's approach. That's approximately what I meant by "special-case 
it in the REPL layer" anyway.

Are there any downsides to doing it this way? It seems tightly scoped and with 
minimal overhead.

--

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-13 Thread Paul Ganssle

Paul Ganssle  added the comment:

> In fact, you're proposing to use exit as a keyword, but lying about it to the 
> users. If it were really so important, then it _should_ be a keyword, and at 
> least I'd know that I can't use it for my variables anymore. (It's not the 
> first time such a thing would happen. The same thing happened with `async` a 
> few years ago.) But please don't introduce those "keywords just in a 
> particular context", they are horrible from the perspective of usability.

We already have so-called "soft keywords", e.g. `match`, so the horse is out of 
the barn at this point.

I'm not sure why this is closed as rejected — I don't see any decision one way 
or the other in this thread or on the PR, did I miss it?

I am struggling to understand how this is a user-hostile change; it is not 
unreasonable for a REPL to have some commands for interacting with the REPL 
which are not Python functions. I have accidentally typed `exit` instead of 
`exit()` many times, and one of the reasons I and many others like IPython is 
that `exit` exits the REPL. It has never once caused a problem for me, as far 
as I can tell. I cannot imagine that it is a common scenario for someone to 
type "exit" in order to inspect the "exit" object — it doesn't even have a 
useful repr!

The only reason you'd do this would be if you knew what it does and were 
demonstrating it, or if you were exploring what the various builtins are (which 
I think is very rare, and you'd probably only have to learn that lesson once).

Vedran's point, however, that you could do `exit = some_func()` and then type 
`exit` to try and inspect the `exit` object is a solid one. That said, I think 
we can get around this fairly easily (albeit with the cost of some additional 
complexity in the "handle the exit keyword" function) — if there's a single AST 
node that is a name and the name is "exit" or "quit", the REPL inspects locals 
and globals to see if the object referred to is a Quitter, and if so it exits, 
otherwise pass through the command as normal (possibly raising a warning like, 
"Did you mean to exit? You have shadowed the `exit` builtin, so use 
Ctrl-Z/Ctrl-D to exit or delete your `exit` object and try again").
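
A minimal sketch of that check, assuming the REPL hands us the source line 
and the current namespace (the function name `wants_to_exit` is hypothetical; 
`Quitter` comes from the private `_sitebuiltins` module that `site` uses to 
create `exit` and `quit`):

```python
import ast
import builtins
from _sitebuiltins import Quitter

def wants_to_exit(source, namespace):
    """Return True if source is a bare `exit`/`quit` name bound to a Quitter."""
    try:
        tree = ast.parse(source, mode="single")
    except SyntaxError:
        return False
    if len(tree.body) != 1:
        return False
    node = tree.body[0]
    # Only a lone name expression qualifies; `exit()` is a Call, not a Name.
    if not (isinstance(node, ast.Expr) and isinstance(node.value, ast.Name)):
        return False
    name = node.value.id
    if name not in ("exit", "quit"):
        return False
    # Respect shadowing: look in the user's namespace before builtins.
    target = namespace.get(name, getattr(builtins, name, None))
    return isinstance(target, Quitter)

quitter = Quitter("exit", "Ctrl-D")
print(wants_to_exit("exit", {"exit": quitter}))  # bare name, real Quitter
print(wants_to_exit("exit", {"exit": 5}))        # shadowed by the user
```

When the name is shadowed, the REPL would fall through to normal evaluation, 
which is where a "did you mean to exit?" warning could be attached.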

I understand the arguments for purity and explicability and I'm often one of 
the first people to argue for keeping things consistent and understandable, but 
this is one of those things where we could significantly improve user 
experience for no practical cost. We can identify with very high certainty the 
situations in which a user intended to exit the REPL, we should go ahead and do 
it to provide a more intuitive REPL experience.

--

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-13 Thread Paul Ganssle


Paul Ganssle  added the comment:

At this point I think we should probably start a thread on python-dev to see 
how people feel about it. I'd be happy to author or co-author a PEP for this if 
need be.

--

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-13 Thread Paul Ganssle


Paul Ganssle  added the comment:

Re-opening this because I think the discussion is not done and I don't see any 
reason why this was rejected.

> Related 2005 python-dev discussion: 
> https://mail.python.org/archives/list/python-...@python.org/thread/VNGY2DLML4QJUXE73JLVBIH5WFBZNIKG/

@Mark Thanks for digging these up! From what I can tell, that discussion ended 
up with a combination of there not being quite enough enthusiasm for the idea 
to drive it forward and no one coming up with a good way to localize the effect 
to just the case where we know the person was trying to type "exit" in a REPL.

I think Pablo's patch shows that a very limited addition to the "REPL layer" is 
actually plausible, and we *could* implement this without taking on an enormous 
amount of additional complexity or affecting non-interactive use cases.

Fernando's point about it being dangerous to generalize this additional layer 
of "interactive-use only" keywords is a good one (see: 
https://mail.python.org/archives/list/python-...@python.org/message/L37RD7SG26IOBETPI7TETKFGHPAPC75Q/),
 though it seems that it was this thread that prompted him to add exit/quit as 
auto-call magic keywords to IPython, and I think that has worked out in the 
intervening 16 years. I don't think there's much danger of us wanting to 
generalize this concept, since the only really compelling argument for doing it 
this way for exit/quit is that almost everyone seems to think it *should* work 
this way (despite it never having worked this way, and there not being any 
equivalents), and gets tripped up when it doesn't.

> and the related issue: https://bugs.python.org/issue1446372

Looks to me like that is an issue for adding the message when you type "exit". 
There's no additional discussion disqualifying the use of "exit" as an 
interactive-only keyword.

--
resolution: rejected -> 
status: closed -> open

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44603] REPL: exit when the user types exit instead of asking them to explicitly type exit()

2021-07-13 Thread Paul Ganssle


Change by Paul Ganssle :


--
stage: resolved -> 

___
Python tracker 
<https://bugs.python.org/issue44603>
___



[issue44829] zoneinfo.ZoneInfo does not check for Windows device names

2021-08-04 Thread Paul Ganssle


Paul Ganssle  added the comment:

Sorry you didn't receive a response to your security@ email, I guess my 
response just went to the PSRT, not to you as well. I believe we determined 
that this was an issue in importlib.resources generally, not specific to 
zoneinfo.

I think `importlib.resources.open_binary` should check if a resource is a file 
with `os.path.isfile` before opening it. That will solve the issue in zoneinfo 
and other similar situations.
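
The shape of that check, sketched on a plain filesystem path rather than on 
`importlib.resources` internals (the helper name `open_binary_checked` is 
hypothetical, and how device names are classified is platform-dependent):

```python
import os
import tempfile

def open_binary_checked(path):
    """Open path for binary reading, but only if it is a regular file."""
    if not os.path.isfile(path):
        raise FileNotFoundError(path)
    return open(path, "rb")

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "data.bin")
    with open(data_path, "wb") as f:
        f.write(b"abc")

    with open_binary_checked(data_path) as f:
        contents = f.read()

    # A directory is not a regular file, so it is refused up front.
    try:
        open_binary_checked(tmp)
        directory_refused = False
    except FileNotFoundError:
        directory_refused = True

print(contents, directory_refused)
```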

--

___
Python tracker 
<https://bugs.python.org/issue44829>
___



[issue44831] Inconsistency between datetime.now() and datetime.fromtimestamp(time.time(), None)

2021-08-06 Thread Paul Ganssle


Paul Ganssle  added the comment:

I think this is a rounding issue. `time.time()` returns an epoch timestamp as a 
float and at the current epoch time, floats are spaced ~240ns apart.

`datetime.datetime.now` does a floor division when rounding: 
https://github.com/python/cpython/blob/8bdf12e99a3dc7ada5f85bba79c2a9eb9931f5b0/Modules/_datetimemodule.c#L5056

`datetime.fromtimestamp` uses the standard banker's round (round above half, 
tie goes to the nearest even number): 
https://github.com/python/cpython/blob/8bdf12e99a3dc7ada5f85bba79c2a9eb9931f5b0/Modules/_datetimemodule.c#L5038-L5039

Presumably if we change these two to be consistent, this issue will go away. I 
am not entirely sure if anyone is relying on a particular rounding behavior for 
one or both of these, and I'm not sure which one is the right one to harmonize 
on.
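
A pure-Python sketch of how the two rounding modes can disagree by one 
microsecond for the same float (the helper names and the sample timestamp are 
made up; the real conversions happen in C):

```python
import math

def micros_floor(timestamp):
    """Convert seconds to whole microseconds, rounding toward -infinity."""
    return math.floor(timestamp * 1_000_000)

def micros_half_even(timestamp):
    """Convert seconds to microseconds; Python's round() sends ties to even."""
    return round(timestamp * 1_000_000)

ts = 1628208000.1234567  # an arbitrary timestamp from August 2021

print("float spacing near ts (seconds):", math.ulp(ts))
print("floor:    ", micros_floor(ts))
print("half-even:", micros_half_even(ts))
```

Since the spacing between adjacent floats at this magnitude is a fraction of 
a microsecond, the sub-microsecond part is frequently nonzero, and the two 
modes then land on different microsecond values.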

For now I'm going to say that we should target 3.11 on this, since it will 
change an existing observable behavior for at least one of these functions in a 
way that isn't necessarily going from "obviously wrong" to "obviously right", 
so I think we should be cautious and not change this in a patch release.

--
versions: +Python 3.11 -Python 3.6, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue44831>
___



[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths

2021-10-08 Thread Paul Ganssle


Change by Paul Ganssle :


--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue45414>
___



[issue45414] pathlib.Path.parents negative indexing is wrong for absolute paths

2021-10-08 Thread Paul Ganssle

Paul Ganssle  added the comment:

This is a great bug report, but for anyone else who gets a bit lost in the 
details, here's the core of the issue:

>>> p = Path("/1/2")
>>> q = Path("1/2")

>>> p.parents[-1]  # This is correct
PosixPath('/')
>>> q.parents[-1]
PosixPath('.')

>>> p.parents[-2]  # Should be PosixPath('/1')
PosixPath('/')
>>> q.parents[-2]
PosixPath('1')

>>> p.parents[-3]  # Should be PosixPath('/1/2')
PosixPath('/1')
>>> q.parents[-3]
PosixPath('1/2')

I think a refactoring where '/' doesn't appear in ._parts would be a good idea 
if we can get past Chesterton's Fence and determine that this was indeed not a 
deliberate design decision (or at least one whose concerns no longer apply), 
but at least in the short term, I agree that transforming negative indexes into 
positive indices is the right, expedient thing to do.

We'll definitely want to make sure that we're careful about bad indices (and 
add relevant tests), though, since it would be easy to get weird behavior where 
too-large negative indexes start "wrapping around" (e.g. p.parents[-4] with 
len(p._parents) == 3 → p.parents[-1]).
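
A minimal sketch of that kind of bounds-checked conversion, assuming nothing 
about pathlib's internals (`normalize_index` is a hypothetical helper, not an 
existing pathlib function):

```python
def normalize_index(idx, length):
    """Map a possibly negative sequence index to a non-negative one.

    Out-of-range values raise IndexError instead of wrapping around.
    """
    if not -length <= idx < length:
        raise IndexError(f"index {idx} out of range for length {length}")
    return idx + length if idx < 0 else idx


print(normalize_index(-1, 3))  # last element of a length-3 sequence
```

Doing the range check before the addition is what prevents something like -4 
on a length-3 sequence from quietly turning into -1.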

--
type:  -> behavior

___
Python tracker 
<https://bugs.python.org/issue45414>
___



[issue45515] Add reference to zoneinfo in the datetime module documentation

2021-10-18 Thread Paul Ganssle


New submission from Paul Ganssle :

Right now the datetime documentation recommends using `dateutil.tz` for IANA 
time zones, but we should update this to point to `zoneinfo`.

--
assignee: p-ganssle
components: Documentation
messages: 404207
nosy: p-ganssle
priority: low
severity: normal
status: open
title: Add reference to zoneinfo in the datetime module documentation
versions: Python 3.10, Python 3.11, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue45515>
___



[issue45515] Add reference to zoneinfo in the datetime module documentation

2021-10-18 Thread Paul Ganssle


Change by Paul Ganssle :


--
keywords: +patch
pull_requests: +27309
stage:  -> patch review
pull_request: https://github.com/python/cpython/pull/29038

___
Python tracker 
<https://bugs.python.org/issue45515>
___



[issue45814] datetime.time.strftime: use the same function signature for C and Python implementations

2021-11-16 Thread Paul Ganssle


Paul Ganssle  added the comment:

I think this is mostly a duplicate of bpo-41260, which has an open PR on it. I 
think that got lost in the shuffle, I'm sad we didn't fix it in Python 3.10. I 
think we should migrate all of these signatures that differ to whichever one 
the C implementation is using (I believe that's 3.11).

I'm going to close this issue and edit bpo-41260 to cover `time` and `date` 
as well. Thanks for the report, Yevhenii!

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> datetime: strftime method takes different keyword argument: fmt 
(pure) or format (C)

___
Python tracker 
<https://bugs.python.org/issue45814>
___



[issue41260] datetime, date and time: strftime method takes different keyword argument: fmt (pure) or format (C)

2021-11-16 Thread Paul Ganssle


Paul Ganssle  added the comment:

Updating this issue to cover the problem in date, time and datetime.

--
title: datetime: strftime method takes different keyword argument: fmt (pure) 
or format (C) -> datetime, date and time: strftime method takes different 
keyword argument: fmt (pure) or format (C)

___
Python tracker 
<https://bugs.python.org/issue41260>
___



[issue38812] Comparing datetime.time objects incorrect for TZ aware and unaware

2019-11-17 Thread Paul Ganssle


Paul Ganssle  added the comment:

I do not think this is a bug in pytz, but if it's a bug in Python it's one in 
reporting what the error is.

The issue is that the time zone offset for "rules-based zones" like 
America/Denver (i.e. most time zones) is *undefined* for bare times, because 
the offset that applies depends on the *date* and the *time*.

The documentation for `tzinfo.utcoffset` specifies that if the offset is 
unknown, a time zone offset should return None: 
https://docs.python.org/3/library/datetime.html#datetime.tzinfo.utcoffset

The documentation for determining whether an object is aware or naive also 
specifies that if utcoffset() returns `None`, the object is naive (even if 
tzinfo is not None): 
https://docs.python.org/3/library/datetime.html#determining-if-an-object-is-aware-or-naive

So basically, everyone is doing the right thing except the person who attached 
this `pytz` time zone to a time object (as a side note, it may be worth reading 
this blog post that explains why the way this time zone is attached to the 
`time` object is incorrect: 
https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html).

That said, we may be able to improve the error message raised here by 
distinguishing between the case where there's no `tzinfo` at all and the case 
where `utcoffset()` returns `None`. I think we can change the exception message 
to have a helpful hint like, "cannot compare offset-naive and offset-aware 
times; one of the operands is offset-naive because its offset is undefined."

We could possibly be even more specific.
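
For anyone who wants to see the mechanism without installing pytz, here is a 
small sketch. `DateDependentZone` is a made-up tzinfo whose offset is only 
defined when it is given a datetime, and the fixed -7h offset is an arbitrary 
stand-in for real zone rules.

```python
from datetime import time, timedelta, timezone, tzinfo


class DateDependentZone(tzinfo):
    """Hypothetical zone: the offset is undefined without a date."""

    def utcoffset(self, dt):
        # time objects call utcoffset(None); report "unknown" in that case.
        return None if dt is None else timedelta(hours=-7)

    def dst(self, dt):
        return None if dt is None else timedelta(0)

    def tzname(self, dt):
        return "EXAMPLE"


t_undefined = time(12, 0, tzinfo=DateDependentZone())
t_aware = time(12, 0, tzinfo=timezone.utc)

try:
    t_undefined < t_aware
    raised = False
except TypeError as exc:
    raised = True
    print(exc)  # the naive-vs-aware comparison error
```

Here `t_undefined.tzinfo` is set, but `t_undefined.utcoffset()` is `None`, so 
the object counts as naive and the ordering comparison fails.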

--
components: +Library (Lib)
versions: +Python 3.9 -Python 3.6

___
Python tracker 
<https://bugs.python.org/issue38812>
___



[issue38914] Clarify wording for warning message when checking a package

2019-11-26 Thread Paul Ganssle


Paul Ganssle  added the comment:

For the future, we generally tend to keep distutils pretty "frozen", only 
making minor changes or the changes needed to build Python itself. Instead we 
generally make changes in setuptools, which for the moment monkey-patches 
distutils (and into which distutils will eventually be merged). One of the big 
reasons is that setuptools is used across all versions of Python, so the 
changes are automatically backported, whereas changes to distutils will only be 
seen by people using the most recent Python versions.

In this case, it's not really a substantive change, so I think we can leave it 
in distutils, I just wanted to bring this up as an FYI.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue38914>
___



[issue39103] [linux] strftime renders %Y with only 3 characters

2019-12-20 Thread Paul Ganssle


Paul Ganssle  added the comment:

This is a duplicate of issue 13305.

Right now we have some shims around `strftime` to improve consistency in some 
situations and for other reasons, but mostly we just call the libc version.

There is an open issue from 2008 (#3173) to ship our own implementation of 
strftime that could smooth out some of these issues and try and make the 
behavior more consistent (though presumably some people have started to rely on 
platform-specific behaviors by now, so it may be a decent amount of work to 
roll it out).
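
In the meantime, code that needs zero-padded years regardless of platform can 
format the fields itself. A small sketch (`format_ymd` is a made-up helper; 
for this particular layout `date.isoformat()` already gives the same result):

```python
from datetime import date


def format_ymd(d):
    """Format as YYYY-MM-DD with explicit zero padding, bypassing strftime."""
    return f"{d.year:04d}-{d.month:02d}-{d.day:02d}"


early = date(33, 1, 2)
print(format_ymd(early), early.isoformat())
```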

I'm going to close this in favor of 13305, but thanks for reporting it!

--
resolution:  -> duplicate
stage:  -> resolved
status: open -> closed
superseder:  -> datetime.strftime("%Y") not consistent for years < 1000
type:  -> behavior

___
Python tracker 
<https://bugs.python.org/issue39103>
___



[issue13305] datetime.strftime("%Y") not consistent for years < 1000

2019-12-20 Thread Paul Ganssle


Change by Paul Ganssle :


--
versions: +Python 3.7, Python 3.8, Python 3.9 -Python 3.6

___
Python tracker 
<https://bugs.python.org/issue13305>
___



[issue30717] Add unicode grapheme cluster break algorithm

2020-01-06 Thread Paul Ganssle


Paul Ganssle  added the comment:

> Oh, also, if y'all are fine with binding to Rust (through a C ABI) I'd love 
> to help y'all use unicode-segmentation, which is much less work than pulling 
> in ICU. Otherwise if y'all have implementation questions I can answer them. 
> This spec is kinda tricky to implement efficiently, but it's not super hard.

Is the idea here that we'd take on a new dependency on the compiled 
`unicode-segmentation` binary, rather than adding Rust into our build system? 
Does `unicode-segmentation` support all platforms that CPython supports? I was 
under the impression that Rust requires llvm and llvm doesn't necessarily have 
the same support matrix as CPython (I'd love to be corrected if I'm wrong on 
this).

(Note: I don't actually know what the process is for taking on new dependencies 
like this, just trying to point at one possible stumbling block.)

--

___
Python tracker 
<https://bugs.python.org/issue30717>
___



[issue39280] Don't allow datetime parsing to accept non-Ascii digits

2020-01-09 Thread Paul Ganssle


Paul Ganssle  added the comment:

I don't love the inconsistency, but can you elaborate on the actual *danger* 
posed by this? What security vulnerabilities involve parsing a datetime using a 
non-ascii digit?

The reason that `fromisoformat` doesn't accept non-ASCII digits is actually 
because it's the inverse of `datetime.isoformat`, which never *emits* non-ASCII 
digits. For `strptime`, we're really going more for a general specifier for 
parsing datetime strings in a given format. I'll note that we do accept any 
valid unicode character for the date/time separator.

From my perspective, there are a few goals, some of which may be in conflict 
with the others:

1. Mitigating security vulnerabilities, if they exist.
2. Supporting international locales if possible.
3. Improving consistency in the API.

If no one ever actually specifies datetimes in non-ascii locales (and this 
gravestone that includes the date in both Latin and Chinese/Japanese characters 
seems to suggest otherwise: 
https://jbnewall.com/wp-content/uploads/2017/02/LEE-MONUMENT1-370x270.jpg ), 
then I don't see a problem dropping our patchy support, but I think we need to 
carefully consider the backwards compatibility implications if we go through 
with that.

One bit of evidence in favor of "no one uses this anyway" is that no one has 
yet complained that apparently this doesn't work for "%d" even if it works for 
"%y", so presumably it's not heavily used. If our support for this sort of 
thing is so broken that no one could possibly be using it, I suppose we may as 
well break it all the way, but it would be nice to try and identify some 
resources that the documentation can point to for how to handle international 
date parsing.


> Note the "unique and unambiguous". By accepting non-Ascii digits, we're 
> breaking the uniqueness requirement of ISO 8601.

I think in this case "but the standard says X" is probably not a very strong 
argument. Even if we were coding to the ISO 8601 standard (I don't think we 
claim to be, we're just using that convention), I don't really know how to 
interpret the "unique" portion of that claim, considering that ISO 8601 
specifies dozens of ways to represent the same datetime. Here's an example from 
[my `dateutil.parser.isoparse` test 
suite](https://github.com/dateutil/dateutil/blob/110a09b4ad46fb87ae858a14bfb5a6b92557b01d/dateutil/test/test_isoparser.py#L150):

```
'2014-04-11T00',
'2014-04-10T24',
'2014-04-11T00:00',
'2014-04-10T24:00',
'2014-04-11T00:00:00',
'2014-04-10T24:00:00',
'2014-04-11T00:00:00.000',
'2014-04-10T24:00:00.000',
'2014-04-11T00:00:00.00',
'2014-04-10T24:00:00.00'
```

All of these represent the exact same moment in time, and this doesn't even get 
into using the week-number/day-number configurations or anything with time 
zones. They also allow for the use of `,` as the subsecond-component separator 
(so add 4 more variants for that) and they allow you to leave out the dashes 
between the date components and the colons between time components, so you can 
multiply the possible variants by 4.

Just a random aside - I think there may be strong arguments for doing this even 
if we don't care about coding to a specific standard.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue39280>
___



[issue39280] Don't allow datetime parsing to accept non-Ascii digits

2020-01-10 Thread Paul Ganssle

Paul Ganssle  added the comment:

> Yes, but not within the same format. If someone were to choose the format 
> '2014-04-10T24:00:00', they would have a reasonable expectation that there is 
> only one unique string that corresponds with that datetime

That's a particularly bad example, because it's exactly the same as another 
string with the exact same format:

  2014-04-11T00:00:00

Since ISO 8601 allows you to specify midnight (and only midnight) using 
previous day + 24:00. Admittedly, that is the only ambiguity I know of offhand 
(though it's a huge spec) *for a given format*, but also ISO 8601 does not 
really have a concept of format specifiers, so it's not like there's a way to 
unambiguously specify the format you are intending to use.

Either way, I think we can explicitly dispense with "there will be an exact 
mapping between a given (format_str, datetime_str) pair and the datetime it 
produces" as a goal here. I can't think of any good reason you'd want that 
property, nor have we made any indication that I can see that we provide it 
(probably the opposite, since there are some formats that explicitly ignore 
whitespace).

> Okay, since it seems like I'm the only one who wants this change, I'll let it 
> go. Thanks for your input.

I wouldn't go that far. I think I am +0 or +1 on this change, I just wanted to 
be absolutely clear *why* we're doing this. I don't want someone pointing at 
this thread in the future and saying, "Core dev says that it's a bug in their 
code if they don't follow X standard / if more than one string produces the 
same datetime / etc".

I think the strongest argument for making this or a similar change is that I'm 
fairly certain that we don't have the bandwidth to handle internationalized 
dates and I don't think we have much to gain by doing a sort of half-assed 
version of that by accepting unicode transliterations of numerals and calling 
it a day. I think there are tons of edge cases here that could bite people, and 
if we don't support this *now* I'd rather give people an error message early in 
the process and try to point people at a library that is designed to handle 
datetime localization issues. If all we're going to do is switch [0-9] to \d 
(which won't work for the places where it's actually [1-9], mind you), I think 
people will get a better version of that with something like:

  def normalize_dt_str(dt_str):
      # isdecimal() rather than isdigit(): int() rejects digits such as "²"
      return "".join(str(int(x)) if x.isdecimal() else x
                     for x in dt_str)

There are probably more robust and/or faster versions of this, but it's 
probably roughly equivalent to what we'd be doing here *anyway*, and at least 
people would have to opt-in to this.
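
As a quick illustration of the `[0-9]` versus `\d` distinction mentioned above 
(this uses the `re` module directly and is not the `_strptime` code itself; the 
variable names are just for the example):

```python
import re

# "2020" written with full-width digits (U+FF12, U+FF10, U+FF12, U+FF10).
fullwidth_year = "\uff12\uff10\uff12\uff10"

# For str patterns, \d matches any Unicode decimal digit by default.
matches_unicode_d = re.fullmatch(r"\d{4}", fullwidth_year) is not None
# An explicit ASCII class, or \d under re.ASCII, rejects it.
matches_ascii_class = re.fullmatch(r"[0-9]{4}", fullwidth_year) is not None
matches_ascii_flag = re.fullmatch(r"\d{4}", fullwidth_year, re.ASCII) is not None

print(matches_unicode_d, matches_ascii_class, matches_ascii_flag)
print(int(fullwidth_year))  # int() also accepts Unicode decimal digits
```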

I am definitely open to us supporting non-ASCII digits in strptime if it would 
be useful at the level of support we could provide, but given that it's 
currently broken for any reasonable use case and as far as I know no one has 
complained, we're better off resolving the inconsistency by requiring ASCII 
digits and considering non-ASCII support to be a separate feature request.

CC-ing Inada on this as unicode guru and because he might have some intuition 
about how useful non-ASCII support might be. The only place I've seen non-ASCII 
dates is in Japanese graveyards, and those tend to use Chinese numerals (which 
don't match \d anyway), though Japanese and Korean also tend to make heavier 
use of the "full-width numerals" block, so maybe parsing something like 
"２０２０-０２-０２" is an actual pain point that would be improved by this change 
(though, again, I suspect that this is just the beginning of the required 
changes and we may never get a decent implementation that supports unicode 
numerals).

--
nosy: +inada.naoki

___
Python tracker 
<https://bugs.python.org/issue39280>
___



[issue39541] distutils: Remove bdist_wininst (Windows .exe installers) in favor of bdist_wheel (.whl)

2020-02-03 Thread Paul Ganssle


Paul Ganssle  added the comment:

Per my reasoning in the discourse thread, I disagree with this move. I think 
that this should be handled in setuptools, which is where we tend to handle 
breaking changes or even enhancements to distutils.

If we do this in setuptools, we'll get a backport of the deprecation and 
removal back to 3.5, and it will make it easier to maintain setuptools.

The deprecation of bdist_wininst in Python 3.8 already made it harder to 
maintain setuptools with no real benefit to the CPython project, I would prefer 
to not repeat this mistake.

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue39541>
___



[issue39550] isinstance accepts subtypes of tuples as second argument

2020-02-04 Thread Paul Ganssle


Paul Ganssle  added the comment:

Serhiy: I think at least a test for this particular corner case should be 
added, so that no implementations of `isinstance` that use the CPython test 
suite hit an infinite recursion in that event, I guess?

Though I think it's maybe an open question as to what the correct behavior is. 
Should we throw on any tuple subclass because there's no reason to support 
tuple subclasses? Should we switch to using __iter__ when it's defined because 
there are other cases where the custom behavior of the subclass is defined by 
its __iter__? Should we make it a guarantee that __iter__ is *never* called?

I can't really think of a reason why defining __iter__ on a tuple subclass 
would be anything other than a weird hack, so I would probably say either ban 
tuple subclasses or add a test like so:

def testIsinstanceIterNeverCalled(self):
    """Guarantee that __iter__ is never called when isinstance is invoked"""
    class NoIterTuple(tuple):
        def __iter__(self):  # pragma: nocover
            raise NotImplementedError("Cannot call __iter__ on this.")

    self.assertTrue(isinstance(1, NoIterTuple((int,))))

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue39550>
___



[issue39604] Document PyDateTimeAPI / PyDateTime_CAPI struct

2020-02-10 Thread Paul Ganssle


New submission from Paul Ganssle :

The entire public interface documented for the datetime C API is various C 
macros (see: https://docs.python.org/3/c-api/datetime.html) which are wrappers 
around function calls to the PyDateTimeAPI / PyDateTime_CAPI struct, but the 
struct itself is undocumented.

Unfortunately (or fortunately, depending on how you think the C API should 
look), pretty much everyone has to know the implementation details of the C API 
struct anyway. Bindings in other languages usually can't use the C preprocessor 
macros and have to directly use the C API struct so projects like PyPy, PyO3 
and Cython are using it. The struct also can do things that the macros can't 
do: consider bug #30155 which is looking for a way to create a datetime object 
with a tzinfo (which is possible using the C struct).

I think we should go ahead and make the `PyDateTimeAPI` struct "public" and 
document the functions on it. This may be a bit tougher than one would hope 
because the overlap between the macros and the struct functions isn't 100%, but 
it's pretty close, so I would think we'd want to document the two ways to do 
things rather close to one another.

nosy-ing Victor on here in case he has any strong opinions about whether these 
kinds of struct should be exposed as part of the official public interface.

--
assignee: docs@python
components: C API, Documentation
messages: 361733
nosy: belopolsky, docs@python, lemburg, p-ganssle, vstinner
priority: normal
severity: normal
status: open
title: Document PyDateTimeAPI / PyDateTime_CAPI struct
versions: Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue39604>
___



[issue30155] Add ability to get tzinfo from a datetime instance in C API

2020-02-10 Thread Paul Ganssle


Paul Ganssle  added the comment:

So this bug is asking for two things:

1. An official accessor for the `tzinfo` component of an existing datetime, 
which I think is very reasonable in light of the fact that there are official 
accessors for all the other components of a datetime.

2. An official constructor for a timezone-aware datetime, which I think 
basically exists in the form of 
PyDateTime_CAPI->PyDateTimeAPI->DateTime_FromDateAndTime / 
->DateTime_FromDateAndTimeAndFold, and we just need to document it. I think 
this is basically a separate issue, and I have opened #39604 to track it.

I'm going to rename this bug to focus only on issue #1. I think we can accept a 
PR adding two new macros. I would suggest calling them:

- PyDateTime_DATE_GET_TZINFO
- PyDateTime_TIME_GET_TZINFO

Please make sure to add tests to any PR you make. See the CapiTest case 
(https://github.com/python/cpython/blob/d68e0a8a165761604e820c8cb4f20abc735e717f/Lib/test/datetimetester.py#L5914)
 for examples. You may want to look at the git blame for a few of those tests 
to see the PRs that they were added in, since part of the tests are defined in 
a C file.

(As an aside: I don't love that the accessor methods are not available on the 
struct, since all the "macro-only" code needs to be re-implemented in all 
other-language bindings. Since the accessors are already all macro-only, 
though, might as well keep with the tradition for now :P)

--
stage:  -> needs patch
title: Add ability to get/set tzinfo on datetime instances in C API -> Add 
ability to get tzinfo from a datetime instance in C API
versions: +Python 3.9 -Python 3.6

___
Python tracker 
<https://bugs.python.org/issue30155>
___



[issue39804] timezone constants in time module inaccurate with negative DST (e.g. Ireland)

2020-02-29 Thread Paul Ganssle


New submission from Paul Ganssle :

From a report on the dateutil tracker today, I found that `time.timezone` and 
`time.altzone` are not accurate in Ireland (at least on Linux, not tested on 
other platforms): https://github.com/dateutil/dateutil/issues/1009

Europe/Dublin in the modern era has the exact same rules as Europe/London, but 
the values for `isdst` are switched, so for Ireland GMT is the "DST" zone with 
a DST offset of -1H, and IST is the standard zone, while London has GMT as the 
standard zone and BST as a DST zone of +1h.

The documentation for the timezone constants here pretty clearly says that the 
DST zone should be the *second* value in tzname, and should be the offset for 
altzone: https://docs.python.org/3/library/time.html#timezone-constants

But when setting my TZ variable to Europe/Dublin I get the same thing as for 
Europe/London:

$ TZ=Europe/Dublin python -c \
  "from time import *; print(timezone); print(altzone); print(tzname)"

0
-3600
('GMT', 'IST')
$ TZ=Europe/London python -c \
  "from time import *; print(timezone); print(altzone); print(tzname)"
0
-3600
('GMT', 'BST')

This would be less of a problem if localtime() were *also* getting isdst wrong 
in the same way, but it's not:


$ TZ=Europe/London python -c \
  "from time import *; print(localtime())"
time.struct_time(tm_year=2020, tm_mon=3, tm_mday=1, tm_hour=2, tm_min=5, 
tm_sec=6, tm_wday=6, tm_yday=61, tm_isdst=0)

$ TZ=Europe/Dublin python -c \
  "from time import *; print(localtime())"
time.struct_time(tm_year=2020, tm_mon=3, tm_mday=1, tm_hour=2, tm_min=5, 
tm_sec=18, tm_wday=6, tm_yday=61, tm_isdst=1)


So now it seems that there's no way to determine what the correct timezone 
offset and name are based on isdst. I'm not entirely sure if this is an issue 
in our code or a problem with the system APIs we're calling. This code looks 
like a *very* dicey heuristic (I expect it would also have some problems with 
Morocco in 2017, even before they were using a type of negative DST, since they 
used DST but turned it off from May 21st to July 2nd): 
https://github.com/python/cpython/blob/0b0d29fce568e61e0d7d9f4a362e6dbf1e7fb80a/Modules/timemodule.c#L1612

One option might be to deprecate these things as sort of very leaky 
abstractions *anyway* and be done with it, but it might be nice to fix it if we 
can.
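
If the goal is simply the offset and abbreviation in effect at some instant, a 
sketch that avoids `timezone` / `altzone` / `tzname` altogether is to read them 
off the `struct_time` itself. `current_offset` is a hypothetical helper; 
`tm_gmtoff` and `tm_zone` are documented `struct_time` attributes, though their 
values still come from the platform's C library, so this is not a guaranteed 
fix for the Dublin case.

```python
import time


def current_offset(epoch_seconds=None):
    """Return (seconds east of UTC, zone abbreviation) for the given instant.

    Uses the per-instant fields of struct_time instead of guessing from
    tm_isdst which of the module-level constants applies.
    """
    local = time.localtime(epoch_seconds)
    return local.tm_gmtoff, local.tm_zone


offset, abbreviation = current_offset()
print(offset, abbreviation)
```

Note that `tm_gmtoff` is positive east of UTC, the opposite sign convention 
from `time.timezone` and `time.altzone`.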

--
messages: 363037
nosy: belopolsky, lemburg, p-ganssle
priority: normal
severity: normal
status: open
title: timezone constants in time module inaccurate with negative DST (e.g. 
Ireland)
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker 
<https://bugs.python.org/issue39804>
___



[issue39763] distutils.spawn should use subprocess (hang in parallel builds on QNX)

2020-02-29 Thread Paul Ganssle


Change by Paul Ganssle :


--
nosy:  -p-ganssle

___
Python tracker 
<https://bugs.python.org/issue39763>
___



[issue26460] datetime.strptime without a year fails on Feb 29

2020-03-02 Thread Paul Ganssle


Paul Ganssle  added the comment:

I don't think adding a default_year parameter is the right solution here.

The actual problem is that `time.strptime`, and by extension 
`datetime.strptime`, has a strange and confusing interface. What should happen 
is either that `year` is set to None or some other marker of a missing value or 
datetime.strptime should raise an exception when it's being asked to construct 
something that does not contain a year.

Since there is no concept of a partial datetime, I think our best option would 
be to throw an exception, except that this has been baked into the library for 
ages and would start to throw exceptions even when the person has correctly 
handled the Feb 29th case.

I think one possible "solution" to this would be to raise a warning any time 
someone tries to use `datetime.strptime` without requesting a year to warn them 
that the thing they're doing only exists for backwards compatibility reasons. 
We could possibly eventually make that an exception, but I'm not sure it's 
totally worth a break in backwards compatibility when a warning should put 
people on notice.
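
For reference, a short sketch of the failure and of one way callers can avoid 
it today by supplying the year themselves. `parse_month_day` is a hypothetical 
helper, not a proposed API, and the `MM-DD` input layout is just an assumption 
for the example.

```python
import warnings
from datetime import datetime


def parse_month_day(text, year):
    """Parse 'MM-DD' with an explicitly supplied four-digit year."""
    return datetime.strptime(f"{year}-{text}", "%Y-%m-%d")


with warnings.catch_warnings():
    # Some Python versions may warn about day-of-month formats without a year.
    warnings.simplefilter("ignore")
    try:
        datetime.strptime("02-29", "%m-%d")  # year silently defaults to 1900
        yearless_failed = False
    except ValueError:
        yearless_failed = True  # 1900 is not a leap year

leap_day = parse_month_day("02-29", 2020)
print(yearless_failed, leap_day)
```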

--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue26460>
___



[issue39970] Combined behavior of datetime.datetime.timestamp() and datetime.datetime.utcnow() on non-UTC timezoned machines

2020-03-15 Thread Paul Ganssle


Paul Ganssle  added the comment:

This is the intended behavior of these functions, and there is actually now a 
warning on both the utcnow and utcfromtimestamp functions to reflect this:

https://docs.python.org/3/library/datetime.html#datetime.datetime.utcnow

I would say that the correct answer here is to stop using utcnow and 
utcfromtimestamp (except possibly in very limited circumstances). I have 
written about it here:

https://blog.ganssle.io/articles/2019/11/utcnow.html

The preferred way to do this is `datetime.now(tz=timezone.utc)` or 
`datetime.fromtimestamp(ts, tz=timezone.utc)`.

The main thing to internalize is that the result of `.timestamp()` always has a 
time zone, because it is an epoch time, meaning that it is the number of 
seconds in UTC since 1970-01-01T00:00:00Z.
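
A small sketch of why the naive result of `utcnow()` goes wrong once 
`.timestamp()` is involved. To avoid calling `utcnow()` itself, the naive value 
is simulated here by stripping the tzinfo from an aware UTC datetime; the 
variable names are only for illustration.

```python
import time
from datetime import datetime, timedelta, timezone

aware_now = datetime.now(tz=timezone.utc)
naive_utc = aware_now.replace(tzinfo=None)  # same fields utcnow() would give

epoch_from_aware = aware_now.timestamp()
# The naive value is treated as *local* time, so this is off by the local
# UTC offset unless the machine happens to run in UTC.
epoch_from_naive = naive_utc.timestamp()
skew_seconds = epoch_from_naive - epoch_from_aware

round_trip = datetime.fromtimestamp(epoch_from_aware, tz=timezone.utc)
print(skew_seconds, round_trip)
```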

In Python 2, any operations on naive datetimes that required them to represent 
absolute times were an error, but in Python 3 that was changed and they were 
treated as local times. Perhaps leaving that behavior as is and having a 
dedicated "local time" object would have been a good idea, but there are 
actually some serious problems with doing it that way because it's difficult to 
define "local time" in such a way that it may not change over the course of an 
interpreter lifetime, which would cause major issues for an aware datetime 
(guaranteed not to change over the course of the interpreter lifetime). 
Treating naive times as local for operations that require localization (without 
changing their equality and comparison semantics, which is what would cause the 
problems) is a neat solution to that.

Sorry this causes confusion, perhaps in the future we can look into removing 
the `.utcnow()` and `.utcfromtimestamp()` functions, or renaming them to 
something else.

I'm going to set the status of this as wontfix because this is an intended 
behavior, but feel free to continue to use the ticket for discussion.

Thank you for taking the time to file an issue!

--
resolution:  -> wont fix
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue39970>
___



[issue39970] Combined behavior of datetime.datetime.timestamp() and datetime.datetime.utcnow() on non-UTC timezoned machines

2020-03-16 Thread Paul Ganssle


Paul Ganssle  added the comment:

@Yi Luan

I think you may misunderstand what the `.timestamp()` function does - it 
returns an epoch time, which is the amount of time (in seconds) elapsed since 
the Unix epoch: https://en.wikipedia.org/wiki/Unix_time

The number is not different depending on your time zone:

>>> from datetime import *
>>> from dateutil import tz

>>> dt = datetime(2019, 1, 1, tzinfo=timezone.utc)
>>> print(f"{dt}: {dt.timestamp()}")
2019-01-01 00:00:00+00:00: 1546300800.0

>>> dt = dt.astimezone(tz.gettz("America/New_York"))
>>> print(f"{dt}: {dt.timestamp()}")
2018-12-31 19:00:00-05:00: 1546300800.0

>>> dt = dt.astimezone(tz.gettz("Asia/Tokyo"))
>>> print(f"{dt}: {dt.timestamp()}")
2019-01-01 09:00:00+09:00: 1546300800.0

Note how the timestamp number is always the same.

Alexander's suggestion of using `datetime.now(tz=timezone.utc).timestamp()` is 
slightly misleading because `datetime.now().timestamp()` and 
`datetime.now(tz=timezone.utc).timestamp()` will always return the same value. 
I think he was just using that as shorthand for "replace datetime.utcnow() with 
datetime.now(tz=timezone.utc) in all cases".

When you have a naive datetime (with no tzinfo), the only options are to pick 
the time zone it represents and convert to UTC or to throw an error and say, 
"We don't know what time zone this represents, so we cannot do this operation." 
Python 2 used to throw an exception, but in Python 3 naive datetimes represent 
local times.
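
To make the naive-versus-aware distinction concrete, here is a minimal sketch using only the standard library. The date is arbitrary, and `naive_utc` stands in for the kind of value `datetime.utcnow()` returns (UTC wall-clock fields with no tzinfo); nothing here is specific to the reporter's code.

```python
from datetime import datetime, timezone

# A fixed instant, written as an aware UTC datetime.
aware = datetime(2019, 1, 1, tzinfo=timezone.utc)

# The same wall-clock fields without tzinfo, like a utcnow() result.
naive_utc = aware.replace(tzinfo=None)

# .timestamp() on a naive datetime interprets it as local time, so this
# equals the aware value only when the local UTC offset on that date is zero.
as_local = naive_utc.timestamp()

# Attaching UTC explicitly recovers the intended epoch time on any machine.
as_utc = naive_utc.replace(tzinfo=timezone.utc).timestamp()

print(aware.timestamp())  # 1546300800.0
print(as_utc)             # 1546300800.0
print(as_local - as_utc)  # the negated local UTC offset, in seconds
```

On a machine set to UTC the last line prints 0.0, which is why the problem tends to go unnoticed until the code runs somewhere else.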

If you want "nominal number of seconds since 1970-01-01T00:00:00 *in this time 
zone*", you want something more like this:

  def seconds_since(dt, epoch=datetime(1970, 1, 1)):
      return (dt.replace(tzinfo=None) - epoch).total_seconds()

That does not take into account total elapsed time from DST transitions and the 
like - to do that, you'll want something more like this:

  def seconds_elapsed_since(dt, epoch=datetime(1970, 1, 1)):
      if epoch.tzinfo is None and dt.tzinfo is not None:
          epoch = epoch.replace(tzinfo=dt.tzinfo)
      return (dt - epoch).total_seconds()
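
As a rough illustration of how a "nominal seconds" value differs from a Unix time, here is a sketch with a fixed-offset zone from the standard library (an assumption chosen so the example needs no time zone database). `elapsed_seconds_since` is a hypothetical variant: subtracting two aware datetimes that share the same tzinfo object is documented to ignore the tzinfo, so this version converts both sides to UTC before subtracting.

```python
from datetime import datetime, timedelta, timezone

def seconds_since(dt, epoch=datetime(1970, 1, 1)):
    # Wall-clock difference: the zone is dropped before subtracting.
    return (dt.replace(tzinfo=None) - epoch).total_seconds()

def elapsed_seconds_since(dt, epoch=datetime(1970, 1, 1)):
    # Hypothetical variant: go through UTC so offset changes are counted.
    if epoch.tzinfo is None and dt.tzinfo is not None:
        epoch = epoch.replace(tzinfo=dt.tzinfo)
    delta = dt.astimezone(timezone.utc) - epoch.astimezone(timezone.utc)
    return delta.total_seconds()

est = timezone(timedelta(hours=-5))      # fixed offset, no DST
dt = datetime(2019, 1, 1, tzinfo=est)

nominal = seconds_since(dt)
elapsed = elapsed_seconds_since(dt)
unix = dt.timestamp()

print(nominal)         # 1546300800.0
print(elapsed)         # 1546300800.0 (same, since the offset never changes)
print(unix - nominal)  # 18000.0, i.e. the five-hour offset
```

With a fixed offset the two functions agree; they would diverge only for a zone whose offset differs between the epoch and `dt`.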

I urge you not to do this in any sort of interop protocol, though, because 
integer timestamps are traditionally interpreted as Unix times, and if you 
start passing around an integer timestamp that represents "unix time plus or 
minus a few hours", you are likely to create bugs when someone mistakes it for 
a unix time.

--

___
Python tracker 
<https://bugs.python.org/issue39970>
___



[issue40058] Running test_datetime twice fails with: module 'datetime' has no attribute '_divide_and_round'

2020-03-25 Thread Paul Ganssle


Paul Ganssle  added the comment:

This isn't exactly "working as intended", but I believe it's a known problem 
with either `import_fresh_module` or `datetime`, as you can see from these 
comments: 
https://github.com/python/cpython/blob/302e5a8f79514fd84bafbc44b7c97ec636302322/Lib/test/test_datetime.py#L14-L23

Based on the git blame, those TODO comments are from Georg Brandl, so I'm 
adding him to the nosy list in case he has some insight.

--
nosy: +georg.brandl

___
Python tracker 
<https://bugs.python.org/issue40058>
___



[issue40076] isoformat function drops microseconds part if its value is 000000

2020-03-26 Thread Paul Ganssle


Paul Ganssle  added the comment:

> isoformat function does not conform to the ISO 8601 and drops microseconds 
> part if its value is 000000.

I'm not sure why you think that this does not conform to ISO 8601 - ISO 8601 is 
a sprawling beast of a spec and allows some crazy formats. Some examples of 
perfectly valid ISO 8601 strings:

--03-26
2020-W13-4T03
2020-03-26T03.5
2020-03-26T03,5
2020-03-26T03:30:40.334


There are *hundreds* of valid formats encompassed by ISO 8601.

Anyway, that's an aside. The behavior of .isoformat() is pretty clearly 
documented. These are the first three lines of the documentation:

Return a string representing the date and time in ISO 8601 format:

  - YYYY-MM-DDTHH:MM:SS.ffffff, if microsecond is not 0
  - YYYY-MM-DDTHH:MM:SS, if microsecond is 0

I believe Karthikeyan has adequately explained how to get the behavior you 
want, so I am going to go ahead and close this as working as intended.
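
For readers landing here without the earlier comment, one way to always get the fractional part (assuming that is the behavior wanted) is the `timespec` argument that `isoformat()` has accepted since Python 3.6. A small sketch:

```python
from datetime import datetime

dt = datetime(2020, 3, 26, 3, 30, 40)   # microsecond == 0

short = dt.isoformat()                        # default: fraction omitted
full = dt.isoformat(timespec="microseconds")  # fraction always included

print(short)  # 2020-03-26T03:30:40
print(full)   # 2020-03-26T03:30:40.000000
```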

--

___
Python tracker 
<https://bugs.python.org/issue40076>
___



[issue33262] Deprecate shlex.split(None) to read from stdin.

2020-04-01 Thread Paul Ganssle


Paul Ganssle  added the comment:


New changeset 975ac326ffe265e63a103014fd27e9d098fe7548 by Zackery Spytz in 
branch 'master':
bpo-33262: Deprecate passing None for `s` to shlex.split() (GH-6514)
https://github.com/python/cpython/commit/975ac326ffe265e63a103014fd27e9d098fe7548


--
nosy: +p-ganssle

___
Python tracker 
<https://bugs.python.org/issue33262>
___



[issue40136] add warning to datetime.replace documentation to not use it for setting tzinfo unless UTC or None

2020-04-01 Thread Paul Ganssle


Paul Ganssle  added the comment:

That is a specific problem with the third-party library `pytz`, not a standard 
feature of the datetime module. Using `datetime.replace` is the intended way to 
set a time zone, see: 
https://blog.ganssle.io/articles/2018/03/pytz-fastest-footgun.html

As of Python 3.6, we've been recommending dateutil.tz instead of pytz, and 
assuming PEP 615 is accepted ( https://www.python.org/dev/peps/pep-0615/ ), we 
will have a built-in time zone type that supports IANA time zones.
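
A minimal sketch of the intended use of `replace`, using a fixed-offset `timezone` from the standard library as a stand-in for a real zone (that choice is an assumption made to keep the example dependency-free; it is not the pytz case from the report). `replace` attaches the zone and keeps the wall-clock fields, while `astimezone` converts an aware datetime to another zone.

```python
from datetime import datetime, timedelta, timezone

est = timezone(timedelta(hours=-5), "EST")   # fixed offset, illustrative only
naive = datetime(2020, 4, 1, 12, 0)

# Attach the zone: same wall-clock time, now aware.
attached = naive.replace(tzinfo=est)
print(attached)   # 2020-04-01 12:00:00-05:00

# Convert the aware value: same instant, different wall-clock time.
converted = attached.astimezone(timezone.utc)
print(converted)  # 2020-04-01 17:00:00+00:00
```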

I am going to close this because this is not a bug in CPython, but if you think 
otherwise feel free to continue using this ticket to make the case.

--
nosy: +p-ganssle
resolution:  -> not a bug
stage:  -> resolved
status: open -> closed

___
Python tracker 
<https://bugs.python.org/issue40136>
___



[issue40173] test.support.import_fresh_module fails to correctly block submodules when fresh is specified

2020-04-03 Thread Paul Ganssle


New submission from Paul Ganssle :

It seems that test.support.import_fresh_module gets tripped up with its module 
blocking when you attempt to get a fresh copy of a submodule of a module where 
you are also importing the module that you are trying to block (bit of a doozy 
of a sentence there...). So, for example, with the following configuration in 
mymodule/__init__.py:

from ._other import other

try:
    from ._a import attr
except ImportError:
    from ._b import attr

(Assuming _a.attr = "A" and _b.attr = "B"), if you attempt to do:

m = test.support.import_fresh_module("mymodule",
    fresh=("mymodule._other",), blocked=("mymodule._a",))

Then you'll find that m.attr is pulled from _a.attr. Here's a small script to 
demonstrate:

from test.support import import_fresh_module
import sys

def import_ab(fresh_other):
    fresh = ("mymodule._other", ) if fresh_other else ()

    mods_out = []
    for to_block in "_b", "_a":
        blocked = (f"mymodule.{to_block}",)

        mods_out.append(import_fresh_module("mymodule",
                                            fresh=fresh, blocked=blocked))
    return mods_out


for fresh_other in [True, False]:
    mymodule_a, mymodule_b = import_ab(fresh_other)

    qualifier = "With" if fresh_other else "Without"
    print(f"{qualifier} a fresh import of mymodule._other")

    print(f"a: {mymodule_a.attr}")
    print(f"b: {mymodule_b.attr}")
    print()

When you run it with a suitably configured module on Python 3.8:

$ python importer.py 
With a fresh import of mymodule._other
a: A
b: A

Without a fresh import of mymodule._other
a: A
b: B

It also happens if you add `mymodule._a` or `mymodule._b` to the fresh list 
when you are trying to block the other one.

I *think* the problem is that in the step where _save_and_remove_module is 
called on fresh_name (see here: 
https://github.com/python/cpython/blob/76db37b1d37a9daadd9e5b320f2d5a53cd1352ec/Lib/test/support/__init__.py#L328-L329),
 it's necessarily populating `sys.modules` with a fresh import of the top-level 
module we're trying to import (mymodule) *before* the blocking goes into 
effect, then the final call to importlib.import_module just hits that cache.

I think either of the following options will fix this issue:

1. Switching the order of how "fresh" and "blocked" are resolved or
2. Deleting `sys.modules[name]` if it exists immediately before calling 
`importlib.import_module(name)`

That said, I'm still having some weird statefulness problems if I block a C 
module's import and *then* block a Python module's import, so there may be some 
other underlying pathology to the current approach.
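
To show the caching effect in isolation, here is a standalone sketch that does not use test.support at all. It builds a throwaway package (the name `mymodule_demo` and both helpers are hypothetical), blocks `_a` by putting None in `sys.modules`, and compares importing with and without dropping the cached top-level module first. It only illustrates the idea behind option 2; it is not a patch for `import_fresh_module`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "mymodule_demo"  # hypothetical package, written to a temp dir below


def make_package(root):
    pkg = Path(root) / PKG
    pkg.mkdir()
    (pkg / "__init__.py").write_text(
        "try:\n"
        "    from ._a import attr\n"
        "except ImportError:\n"
        "    from ._b import attr\n"
    )
    (pkg / "_a.py").write_text('attr = "A"\n')
    (pkg / "_b.py").write_text('attr = "B"\n')


def import_blocking(name, blocked, drop_cached):
    for mod in blocked:
        sys.modules[mod] = None      # a None entry makes the import fail
    if drop_cached:
        sys.modules.pop(name, None)  # force __init__ to run again
    return importlib.import_module(name)


def demo():
    with tempfile.TemporaryDirectory() as root:
        make_package(root)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module(PKG)  # fills the cache; attr is "A"
            blocked = (f"{PKG}._a",)
            stale = import_blocking(PKG, blocked, drop_cached=False).attr
            fresh = import_blocking(PKG, blocked, drop_cached=True).attr
        finally:
            # Undo the changes to the interpreter's import state.
            sys.path.remove(root)
            for mod in [m for m in sys.modules
                        if m == PKG or m.startswith(PKG + ".")]:
                del sys.modules[mod]
    return stale, fresh


print(demo())  # ('A', 'B')
```

Without dropping the cached package, the blocked import never runs and the old `attr` is returned; once the cache entry is removed, `__init__` executes again and falls through to `_b`.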

--
components: Tests
files: test_support_repro.zip
messages: 365702
nosy: brett.cannon, eric.snow, ncoghlan, p-ganssle
priority: normal
severity: normal
status: open
title: test.support.import_fresh_module fails to correctly block submodules 
when fresh is specified
type: behavior
versions: Python 3.7, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file49031/test_support_repro.zip

___
Python tracker 
<https://bugs.python.org/issue40173>
___


