Tim Peters added the comment:
If someone opens a bug report with OpenBSD, or just for us to get more info, it
could be useful to have a larger universe of troublesome tan inputs to stare
at. So the attached tanny.py supplies them, testing all inputs within 100 ulps
of math.pi/2 (or change N
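A rough sketch of how such a set of inputs can be generated - this is my reconstruction, not tanny.py itself, and it assumes math.nextafter (Python 3.9+):

import math

def ulp_neighbors(x, n=100):
    # Hypothetical helper: yield x plus the n floats on either side of it,
    # stepping one ulp at a time.
    lo = hi = x
    yield x
    for _ in range(n):
        lo = math.nextafter(lo, -math.inf)
        hi = math.nextafter(hi, math.inf)
        yield lo
        yield hi

for v in sorted(ulp_neighbors(math.pi / 2)):   # 201 inputs in all
    print(repr(v), repr(math.tan(v)))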
Tim Peters added the comment:
Thanks for tanny-openbsd.txt, Serhiy! OpenBSD didn't get anywhere close to the
best answer on any of those 201 inputs. I was hoping we could, e.g., test
something a little more removed from pi/2 - but even its best cases in this
range are hundreds of mil
Tim Peters added the comment:
When Sun was developing fdlibm, I was (among other things) working on a
proprietary libm for Kendall Square Research. I corresponded with fdlibm's
primary author (KC Ng) often at the time. There's no way he would have left
errors this egregious s
Tim Peters added the comment:
The docs for the `time` module say:
"""
Although this module is always available, not all functions are available on
all platforms. Most of the functions defined in this module call platform C
library functions with the same name. It may sometime
Tim Peters added the comment:
Since this is a pretty common gotcha, I'd prefer to add it as an example to the
text I already quoted; e.g., add:
"""
For example, the native Windows C libraries do not support times before the
epoch, and `localtime(n)` for negative `n`
Tim Peters added the comment:
I'll just add that it may be a different issue to argue about how
`_naive_is_dst()` is implemented.
nosy: +belopolsky
Tim Peters added the comment:
Well, the problem in the regexp is this part: "\d+,? ?". You're not
_requiring_ that strings of digits be separated by a comma or blank, you're
only _allowing_ them to be so separated. A solid string of digits is matched
by this,
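To illustrate with made-up data (not the OP's input):

import re

pat = re.compile(r"\d+,? ?")
print(pat.findall("12, 34 56789"))   # ['12, ', '34 ', '56789']
print(pat.findall("123456789"))      # ['123456789'] - a solid run of digits matches too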
Tim Peters added the comment:
Sure! The OP was obviously asking about the engine that ships with Python, so
that's what I talked about.
Raphaël, Matthew develops an excellent replacement ("regex") for Python's re
module, which you can install via, e.g., "pip insta
Tim Peters added the comment:
On 16 Oct 2017, exactly the same test failures were reported on python-dev:
https://mail.python.org/pipermail/python-dev/2017-October/149880.html
From the test output posted there:
"""
== CPython 3.6.3 (default, Oct 16 2017, 14:42:21) [GCC 4.7
Tim Peters added the comment:
Segfaults are different: they usually expose an error in CPython's
implementation. We prioritize them not because the user may have to restart
their program (who cares? <0.5 wink>), but because they demonstrate the
language implementation is
Tim Peters added the comment:
BTW, has anyone tried running a tiny C program on these platforms to see what
tan(1.5707963267948961) delivers? The kind of code fdlibm uses is sensitive
not only to compiler (mis)optimization, but also to stuff like how the FPU's
"precision contr
Tim Peters added the comment:
Since fdlibm uses tan(x) ~= -1/(x-pi/2) in this range, and the reciprocals of
the bad results have a whole of bunch of trailing zero bits, my guess is that
argument reduction (the "x-pi/2" part) is screwing up (losing bits of pi/2
beyond the long
Tim Peters added the comment:
Oops! I mixed up `sin` and `cos` in that comment. If it's argument reduction
that's broken, then for x near pi/2 cos(x) will be evaluated as -sin(x - pi/2),
which is approximately -(x - pi/2), and so error in argument reduction (the "x
- pi/2"
Tim Peters added the comment:
Pass "autojunk=False" to your SequenceMatcher constructor and the ratio you get
back will continue to increase as `i` increases.
The docs:
"""
Automatic junk heuristic: SequenceMatcher supports a heuristic that
automatically treats
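A minimal illustration (made-up data, not the OP's):

from difflib import SequenceMatcher

a = b = "x" * 300
# With sequences longer than 200 items, "x" occurs in more than 1% of b,
# so the heuristic treats it as junk and identical strings score 0.0:
print(SequenceMatcher(None, a, b).ratio())                   # 0.0
print(SequenceMatcher(None, a, b, autojunk=False).ratio())   # 1.0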
Tim Peters added the comment:
`doctest` is intended to be anal - there are few things more pointlessly
confusing for a user than to see docs that don't match what they actually see
when they run the doc's examples. "Is it a bug? Did I do it wrong? Why can't
they docum
Tim Peters added the comment:
Tomáš, of course you can combine testing methods any way you like. Don't
oversell this - there's nothing actually magical about comparing objects
instead of strings ;-)
I'm only -0 on this. It grates a bit against doctest's original intent
Tim Peters added the comment:
Best I can tell, the fdlibm 5.3 on netlib was released in 2002, and essentially
stopped existing as a maintained project then. Everyone else copied the source
code, and made their own changes independently ever since :-( At least the
folks behind the Julia
Tim Peters added the comment:
I have no opinion about any version of xxxBSD, because I've never used one ;-)
If current versions of those do have this failure, has anyone opened a bug
report on _their_ tracker(s)? I've seen no reason yet to imagine these
failures are a fault
Tim Peters added the comment:
I agree the current recipe strikes a very nice balance among competing
interests, and is educational on several counts.
s/pending/numactive/
# Remove the iterator we just exhausted from the cycle.
numactive -= 1
nexts = cycle(islice(nexts, numactive
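For reference, the itertools-docs roundrobin recipe under discussion, with that rename applied:

from itertools import cycle, islice

def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    numactive = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while numactive:
        try:
            for nxt in nexts:
                yield nxt()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            numactive -= 1
            nexts = cycle(islice(nexts, numactive))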
Tim Peters added the comment:
As a comment in the referenced patch says, the intent of the patch was to make
behavior match the C99 spec. Among other things, C99's annex F (section
F.9.4.4 "The pow functions") says:
"""
— pow(−∞, y) returns −0 for y an odd int
Tim Peters added the comment:
No worries, Mark :-) Odd things happen sometimes when people are editing near
the same time. BTW, of course I agree with closing this!
Tim Peters added the comment:
Mark, indeed, in the email from Vincent Lefevre you linked to, his entire
argument was: (a) we already specified what happens when the base is a zero;
so, (b) for each of the six pow(a_zero, y) cases we specified, derive a
matching rule for an inf base via
Tim Peters added the comment:
To answer the old accusation ;-), no, this isn't my wording. I _always_
explain that Python's integer bit operations act as if the integers were stored
in 2's-complement representation but with an infinite number of sign bits.
That's
Tim Peters added the comment:
First thing: the code uses the global name `outputer` for two different
things, as the name of a module function and as the global name given to the
Process object running that function. At least on Windows under Python 3.6.4
that confusion prevents the
Tim Peters added the comment:
Right, "..." immediately after a ">>>" line is taken to indicate a code
continuation line, and there's no way to stop that short of rewriting the
parser.
The workaround you already found could be made more palatable if
Tim Peters added the comment:
And I somehow managed to unsubscribe Steven :-(
nosy: +steven.daprano
Tim Peters added the comment:
Jason, an ellipsis will match an empty string. But if your expected output is:
"""
x...
abcd
...
"""
you're asking for output that:
- starts with "x"
- followed by 0 or more of anything
- FOLLOWED BY A NEWLINE (I t
Tim Peters added the comment:
By the way, going back to your original problem, "the usual" solution to that
different platforms can list directories in different orders is simply to sort
the listing yourself. That's pretty easy in Python ;-) Then your test can
verify the h
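For example:

import os

# Sorting the names yourself removes the platform-dependent directory order:
for name in sorted(os.listdir(".")):
    print(name)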
Tim Peters added the comment:
Min, you need to give a complete example other people can actually run for
themselves.
Offhand, this part of the regexp
(.|\s)*
all by itself _can_ cause exponential-time behavior. You can run this for
yourself:
>>> import re
>>> p = r"
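A cut-down demonstration (not the OP's full pattern) of why (.|\s)* is dangerous:

import re

# Each space can be consumed by either alternative of (.|\s), so a failing
# match explores on the order of 2**n paths - keep n tiny if you try it:
slow = re.compile(r"(.|\s)*x")
# slow.match(" " * 30)   # don't run this with a large n; it may effectively never finish

# An unambiguous way to say "anything, including newlines" avoids the blowup:
fast = re.compile(r"(?s).*x")
print(fast.match(" " * 100_000))   # None, returned immediately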
Tim Peters added the comment:
I expect these docs date back to when ints, longs, and floats were the only
hashable language-supplied types for which mixed-type comparison could ever
return True.
They could stand some updates ;-) `fractions.Fraction` and `decimal.Decimal`
are more language
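For example, equal numeric values hash equal across all of these types today:

from fractions import Fraction
from decimal import Decimal

print(hash(2) == hash(2.0) == hash(Fraction(2, 1)) == hash(Decimal(2)) == hash(2 + 0j))   # True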
Tim Peters added the comment:
They both look wrong to me. Under 3.6.5 on Win10, `one` and `three` are the
same.
Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit
(AMD64)] on win32
time.struct_time(tm_year=2009, tm_mon=2, tm_mday=13, tm_hour=23, tm_min=31,
tm_sec
Tim Peters added the comment:
doctest was intended to deal with the standard CPython terminal shell. I'd
like to keep it that way, but recognize that everyone wants to change
everything into "a framework" ;-)
How many other shells are there? As Sergey linked to, IPython alre
Tim Peters added the comment:
Sergey, I understand that, but I don't care. The only people I've ever seen
_use_ this are people writing an entirely different shell interface. They're
rare. There's no value in complicating doctest to cater to theoretical use
cases that
Tim Peters added the comment:
You missed my point about IPython: forget "In/Out arrays, etc". What you
suggest is inadequate for _just_ changing PS1/PS2 for IPython. Again, read
their `parse()` function. They support _more than one_ set of PS1/PS2
conventions. So the code c
Tim Peters added the comment:
Berker Peksag's change (PR 5667) is very simple and, I think, helpful.
nosy: +tim.peters
Tim Peters added the comment:
The message isn't confusing - the definition of "aware" is confusing ;-)
"""
A datetime object d is aware if d.tzinfo is not None and d.tzinfo.utcoffset(d)
does not return None. If d.tzinfo is None, or if d.tzinfo is not None but
Tim Peters added the comment:
I copy/pasted the definitions of "aware" and "naive" from the docs. Your TZ's
.utcoffset() returns None, so, yes, any datetime using an instance of that for
its tzinfo is naive.
In
print(datetime(2000,1,1).astimezone(timezone.utc))
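A minimal illustration (the tzinfo subclass here is hypothetical, not the OP's):

from datetime import datetime, timezone, tzinfo

class NoneOffset(tzinfo):
    # utcoffset() returns None, so datetimes using this are naive by definition.
    def utcoffset(self, dt):
        return None

d = datetime(2000, 1, 1, tzinfo=NoneOffset())
print(d.tzinfo is not None, d.utcoffset())                    # True None -> naive
print(datetime(2000, 1, 1, tzinfo=timezone.utc).utcoffset())  # 0:00:00  -> aware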
Tim Peters added the comment:
Dan, your bug report is pretty much incoherent ;-) This standard Stack
Overflow advice applies here too:
https://stackoverflow.com/help/mcve
Guessing your complaint is that:
sys.getrefcount(itertools.repeat)
keeps increasing by 1 across calls to `leaks
Tim Peters added the comment:
I'd call it a bug fix, but I'm really not anal about what people call things ;-)
Tim Peters added the comment:
Raymond, I'd say scaling is vital (to prevent spurious infinities), but
complications beyond that are questionable, slowing things down for an
improvement in accuracy that may be of no actual benefit.
Note that your original "simple homework problem
Tim Peters added the comment:
There are a couple bug reports here that have been open for years, and it's
about time we closed them.
My stance: if any platform still exists on which "double rounding" is still a
potential problem, Python _configuration_ should be changed to
Tim Peters added the comment:
Mark, do you believe that 32-bit Linux uses a different libm? One that fails
if, e.g., SSE2 were used instead? I don't know, but I'd sure be surprised if it
did. Very surprised - compilers have been notoriously unpredictable in
exactly when
Tim Peters added the comment:
Mark, ya, I agree it's most prudent to let sleeping dogs lie.
In the one "real" complaint we got (issue 24546) the cause was never determined
- but double rounding was ruled out in that specific case, and no _plausible_
cause was identified (sho
Tim Peters added the comment:
[Mark]
> If we do this, can we also persuade Guido to Pronounce that
> Python implementations assume IEEE 754 format and semantics
> for floating-point?
On its own, I don't think a change to force 53-bit precision _on_ 754 boxes
would justify that
Tim Peters added the comment:
Victor, look at Raymond's patch. In Python 3, `randrange()` and friends
already use the all-integer `getrandbits()`. He's changing three other lines,
where some variant of `int(random() * someinteger)` is being used in an inner
loop for speed.
Pres
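Roughly the distinction being drawn, as I understand it:

import random

n = 10 ** 18
print(random.randrange(n))       # all-integer path (getrandbits under the hood): unbiased
print(int(random.random() * n))  # float shortcut: random() has only 53 bits, so for
                                 # large n not every value in range(n) can even occur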
Tim Peters added the comment:
[Victor]
> This method [shuffle()] has a weird API. What is
> the point of passing a random function,
> ... I proposed to deprecate this argument and remove it later.
I don't care here. This is a bug report. Making backward-incompatible API
Tim Peters added the comment:
Lucas, as Mark said you're sorting _strings_ here, not sorting integers.
Please study his reply. As strings, "10" is less than "9", because "1" is less
than "9".
>>> "10
Tim Peters added the comment:
The language doesn't define anything about this - any program relying on
accidental identity is in error itself.
Still, it's nice if a code object's co_consts vector is as short as reasonably
possible. That's a matter of pragmatics
Tim Peters added the comment:
Fine, Serhiy, so reword it a tiny bit: it's nice if a code object's co_consts
vector references as few distinct objects as possible. Still a matter of
pragmatics, not of correctness.
Tim Peters added the comment:
? I expect your code to return -1 about once per 7**4 = 2401 times, which
would be about 400 times per million tries, which is what your output shows.
If you start with -5, and randint(1, 7) returns 1 four times in a row, r5 is
left at -5 + 4 = -1
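A reconstruction of the scenario as described (assumed - the OP's code isn't shown here):

import random

trials = 1_000_000
hits = 0
for _ in range(trials):
    r5 = -5
    for _ in range(4):
        r5 += random.randint(1, 7)
    if r5 == -1:     # requires all four draws to be 1: probability (1/7)**4
        hits += 1
print(hits, "expected about", trials // 7 ** 4)   # ~416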
Tim Peters added the comment:
Nick, that seems a decent compromise. "Infinite string of sign bits" is how
Guido & I both thought of it when the semantics of longs were first defined,
and others in this report apparently find it natural enough too. It also
applies to all 6
Tim Peters added the comment:
Well, all 6 operations "are calculated as though carried out in two's
complement with an infinite number of sign bits", so I'd float that part out of
the footnote and into the main text. When, e.g., you're thinking of ints _as_
bit
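Concretely:

print(-1 & 0xFF)   # 255: the low 8 bits of ...11111111 are all ones
print(~5)          # -6:  ~x == -(x+1) for every int, regardless of magnitude
print(-8 >> 1)     # -4:  right shifts keep the (conceptually infinite) sign bits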
Tim Peters added the comment:
Ya, Mark's got a point there. Perhaps
s/the internal/a finite two's complement/
?
Tim Peters added the comment:
If your `bucket` has 30 million items, then
for element in bucket:
    executor.submit(kwargs['function']['name'], element, **kwargs)
is going to create 30 million Future objects (and all the under-the-covers
objects needed to mana
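One common way around that is to bound how many Futures exist at once - a sketch with hypothetical names, not code from the report:

import concurrent.futures as cf
from itertools import islice

def run_in_batches(executor, func, items, batch_size=10_000):
    # Keep at most batch_size Future objects alive at a time instead of
    # materializing one per element up front.
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        for fut in cf.as_completed([executor.submit(func, item) for item in batch]):
            fut.result()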
Tim Peters added the comment:
Note that you can consume multiple gigabytes of RAM with this simpler program
too, and for the same reasons:
"""
import concurrent.futures as cf
bucket = range(30_000_000)
def _dns_query(target):
    from time import sleep
    sleep(0.1)
def
Tim Peters added the comment:
I'm sure Guido designed the API to discourage subtly bug-ridden code relying on
the mistaken belief that it _can_ know the queue's current size. In the
general multi-threaded context Queue is intended to be used, the only thing
`.qsize()`'s cal
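The usual consequence in code:

import queue

q = queue.Queue()

# Racy: the queue can change size between the qsize() check and the get().
# if q.qsize() > 0:
#     item = q.get()

# Robust: just attempt the operation and handle emptiness directly.
try:
    item = q.get_nowait()
except queue.Empty:
    item = None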
Tim Peters added the comment:
@CuriousLearner, does the PR also include Nick's first suggested change? Here:
"""
1. Replace the opening paragraph of
https://docs.python.org/3/library/stdtypes.html#bitwise-operations-on-integer-types
(the one I originally quoted whe
Tim Peters added the comment:
Nick suggested two changes on 2018-07-15 (look above). Mark & I agreed about
the first change, so it wasn't mentioned again after that. All the rest has
been refining the second change.
Tim Peters added the comment:
Note: if you found a regexp like this _in_ the Python distribution, then a bug
report would be appropriate. It's certainly possible to write regexps that can
suffer catastrophic backtracking, and we've repaired a few of those, over the
years, th
Tim Peters added the comment:
Closing as not-a-bug - not enough info to reproduce, but the regexp looked
prone to exponential-time backtracking to both MRAB and me, and there's been no
response to requests for more info.
components: +Regular Expressions
nosy: +ezio.me
Tim Peters added the comment:
Yes, the assignment does "hide the global definition of g". But this
determination is made at compile time, not at run time: an assignment to `g`
_anywhere_ inside `f()` makes _every_ appearance of `g` within `f()` local to
`f`.
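A minimal illustration:

g = 10

def f():
    print(g)   # raises UnboundLocalError: the assignment below makes every
    g = 20     # reference to g inside f() local, including this earlier one

f()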
Tim Peters added the comment:
Not that it matters: "ulp" is a measure of absolute error, but the script is
computing some notion of relative error and _calling_ that "ulp". It can
understate the true ulp error by up to a factor of 2 (the "wobble" of base 2
f
Tim Peters added the comment:
Thanks for doing the "real ulp" calc, Raymond! It was intended to make the
Kahan gimmick look better, and it succeeded ;-) I don't personally care
whether adding 10K things ends up with 50 ulp error, but to each their own.
Division can be most
Tim Peters added the comment:
Sure, if we make more assumptions. For 754 doubles, e.g., scaling isn't needed
if `1e-100 < absmax < 1e100` unless there are a truly ludicrous number of
points. Because, if that holds, the true sum is between 1e-200 and
number_of_points*1e200, bo
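For example, with 754 doubles:

import math

xs = [3e200, 4e200]
naive = sum(x * x for x in xs)    # overflows to inf, although the true result is representable
scale = max(abs(x) for x in xs)
scaled = scale * math.sqrt(sum((x / scale) ** 2 for x in xs))
print(naive, scaled)              # inf 5e+200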
Tim Peters added the comment:
I agree there's pointless code now, but don't understand why the patch replaces
it with mysterious asserts. For example, what's the point of this?
assert(Py_SIZE(a) <= PY_SSIZE_T_MAX / sizeof(PyObject*));
assert(Py_SIZE(b) <= PY_SSIZE_T_
Tim Peters added the comment:
Bah - the relevant thing to assert is really
assert((size_t)Py_SIZE(a) + (size_t)Py_SIZE(b) <= (size_t)PY_SSIZE_T_MAX);
C sucks ;-)
New submission from Tim Peters :
The invariants on the run-length stack are uncomfortably subtle. There was a
flap a while back when an attempt at a formal correctness proof uncovered that
the _intended_ invariants weren't always maintained. That was easily repaired
(as the resear
Tim Peters added the comment:
The attached runstack.py models the relevant parts of timsort's current
merge_collapse and the proposed 2-merge. Barring conceptual or coding errors,
they appear to behave much the same with respect to "total cost", with no clear
overall win
Tim Peters added the comment:
Looks like all sorts of academics are exercised over the run-merging order now.
Here's a paper that's unhappy because timsort's strategy, and 2-merge too,
aren't always near-optimal with respect to the entropy of the distribution of
Tim Peters added the comment:
"Galloping" is the heart & soul of Python's sorting algorithm. It's explained
in detail here:
https://github.com/python/cpython/blob/master/Objects/listsort.txt
The Java fork of the sorting code has had repeated bugs due to reducing
Tim Peters added the comment:
A new version of the file models a version of the `powersort` merge ordering
too. It clearly dominates timsort and 2-merge in all cases tried, for this
notion of "cost".
Against it, its code is much more complex, and the algorithm is very far fro
Tim Peters added the comment:
The notion of cost is that merging runs of lengths A and B has "cost" A+B,
period. Nothing to do with logarithms. Merge runs of lengths 1 and 1000, and
it has cost 1001.
They don't care about galloping, only about how the order in which merges
Tim Peters added the comment:
No, there's no requirement that run lengths on the stack be ordered in any way
by magnitude. That's simply one rule timsort uses, as well as 2-merge and
various other schemes discussed in papers. powersort has no such rule, and
that's fine.
Re
New submission from Tim Peters :
Using Visual Studio 2017 to build the current master branch of Python
(something I'm trying for the first time in about two years - maybe I'm missing
something obvious!), with the x64 target, under both the Release and Debug
builds I get a Python
Tim Peters added the comment:
New version of runstack.py.
- Reworked code to reflect that Python's sort uses (start_offset, run_length)
pairs to record runs.
- Two unbounded-integer power implementations, one using a loop and the other
division. The loop version implies that, in Pyt
Tim Peters added the comment:
Another runstack.py adds a bad case for 2-merge, and an even worse
(percentage-wise) bad case for timsort. powersort happens to be optimal for
both.
So they all have contrived bad cases now. powersort's bad cases are the least
bad. So far ;-) But I e
Tim Peters added the comment:
FYI, I bet I didn't see a problem with the Win32 target because I followed
instructions ;-) and did my first build using build.bat. Using that for the
x64 target too makes the problem go away.
Tim Peters added the comment:
Ya, I care: `None` was always intended to be an explicit way to say "nothing
here", and using unique non-None sentinels instead for that purpose is
needlessly convoluted. `initial=None` is perfect. But then I'm old & in the
way ;
Tim Peters added the comment:
@jdemeyer, please define exactly what you mean by "Bernstein hash". Bernstein
has authored many hashes, and none on his current hash page could possibly be
called "simple":
https://cr.yp.to/hash.html
If you're talking about the
Tim Peters added the comment:
Ah! I see that the original SourceForge bug report got duplicated on this
tracker, as PR #942952. So clicking on that is a lot easier than digging thru
the mail archive.
One message there noted that replacing xor with addition made collision
statistics much
Change by Tim Peters :
nosy: +ned.deily
Tim Peters added the comment:
@jdemeyer, you didn't submit a patch, or give any hint that you _might_. It
_looked_ like you wanted other people to do all the work, based on a contrived
example and a vague suggestion.
And we already knew from history that "a simple Bernstein has
Tim Peters added the comment:
You said it yourself: "It's not hard to come up with ...". That's not what
"real life" means. Here:
>>> len(set(hash(1 << i) for i in range(100_000)))
61
Wow! Only 61 hash codes across 100 thousand distinct int
Tim Peters added the comment:
For me, it's largely because you make raw assertions with extreme confidence
that the first thing you think of off the top of your head can't possibly make
anything else worse. When it turns out it does make some things worse, you're
equally con
Tim Peters added the comment:
Oops!
"""
"j odd implies j^(-2) == -j, so that m*(j^(-2)) == -m"
"""
The tail end should say "m*(j^(-2)) == -m*j" instead.
Tim Peters added the comment:
Thank you, Vincent! I very much enjoyed - and appreciated - your paper I
referenced at the start. Way back when, I thought I had a proof of O(N log N),
but never wrote it up because some details weren't convincing - even to me ;-)
. Then I had to move
Tim Peters added the comment:
>> Why do you claim the original was "too small"? Too small for
>> what purpose?
> If the multiplier is too small, then the resulting hash values are
> small too. This causes collisions to appear for smaller numbers:
All right! An
Tim Peters added the comment:
Because the behavior of signed integer overflow isn't defined in C. Picture a
3-bit integer type, where the maximum value of the signed integer type is 3.
3+3 has no defined result. Cast them to the unsigned flavor of the integer
type, though, and the r
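Modeling the 3-bit example in Python (the mask simulates C's unsigned wraparound):

BITS = 3
MASK = (1 << BITS) - 1    # 0b111
a = b = 3                 # the maximum 3-bit signed value
u = (a + b) & MASK        # unsigned arithmetic is well defined: it wraps mod 2**3
print(u)                  # 6, whose 3-bit two's-complement reading is -2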
Tim Peters added the comment:
So you don't know of any directly relevant research either. "Offhand I can't
see anything wrong" is better than nothing, but very far from "and we know it
will be OK because [see references 1 and 2]".
That Bernstein's DJBX3
Tim Peters added the comment:
I strive not to believe anything in the absence of evidence ;-)
FNV-1a supplanted Bernstein's scheme in many projects because it works better.
Indeed, Python itself used FNV for string hashing before the security wonks got
exercised over collision attacks
Tim Peters added the comment:
Raymond, I share your concerns. There's no reason at all to make gratuitous
changes (like dropping the "post-addition of a constant and incorporating
length signature"), apart from that there's no apparent reason for them
existing to begin
Tim Peters added the comment:
Oh, I don't agree that it's "broken" either. There's still no real-world test
case here demonstrating catastrophic behavior, neither even a contrived test
case demonstrating that, nor a coherent characterization of what "the proble
Tim Peters added the comment:
Has anyone figured out the real source of the degeneration when mixing in
negative integers? I have not. XOR always permutes the hash range - it's
one-to-one. No possible outputs are lost, and XOR with a negative int isn't
"obviously degener
Tim Peters added the comment:
[Raymond, on boosting the multiplier on 64-bit boxes]
> Yes, that would be perfectly reasonable (though to some
> extent the objects in the tuple also share some of the
> responsibility for getting all bits into play).
It's of value independent of
Tim Peters added the comment:
FYI, using this for the guts of the tuple hash works well on everything we've
discussed. In particular, no collisions in the current test_tuple hash test,
and none either in the cases mixing negative and positive little ints. This
all remains so usin
Tim Peters added the comment:
BTW, those tests were all done under a 64-bit build. Some differences in a
32-bit build:
1. The test_tuple hash test started with 6 collisions. With the change, it
went down to 4. Also changing to the FNV-1a 32-bit multiplier boosted it to 8.
The test
Tim Peters added the comment:
> when you do t ^= t << 7, then you are not changing
> the lower 7 bits at all.
I want to leave low-order hash bits alone. That's deliberate.
The most important tuple component types, for tuples that are hashable, are
strings and contiguous ra
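That is:

t = 0x1234ABCD
low7 = t & 0x7F
t ^= t << 7                  # the shifted copy has zeros in its low 7 bits,
print((t & 0x7F) == low7)    # so XOR leaves those bits of t untouched: True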
Tim Peters added the comment:
Jeroen, I understood the part about -2 from your initial report ;-) That's why
the last code I posted didn't use -2 at all (neither -1, which hashes to -2).
None of the very many colliding tuples contained -2 in any form. For example,
these 8 tuple
Tim Peters added the comment:
> advantage of my approach is that high-order bits become more
> important:
I don't much care about high-order bits, beyond that we don't systematically
_lose_ them. The dict and set lookup routines have their own strategies for
incorporating
Tim Peters added the comment:
Just noting that this Bernstein-like variant appears to work as well as the
FNV-1a version in all the goofy ;-) endcase tests I've accumulated:
while (--len >= 0) {
    y = PyObject_Hash(*p++);
    if (y == -1)
        r
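A rough Python model of this kind of combining loop (the constant and structure here are illustrative stand-ins, not CPython's actual tuple hash):

def tuple_hash_model(items, mult=0x100000001B3, mask=(1 << 64) - 1):
    h = 0x345678
    for item in items:
        h = ((h ^ hash(item)) * mult) & mask   # xor in, then multiply, 64-bit wraparound
    return h

print(tuple_hash_model((1, 2, 3)))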