[issue34751] Hash collisions for tuples

2018-09-24 Thread Tim Peters
Tim Peters added the comment: And one more: x = (x * mult) ^ t; also appears to work equally well. So, way back when, it appears we _could_ have wormed around the disaster du jour just by applying a shift-xor permutation to the raw hash results. Note the implication: if we

[issue34751] Hash collisions for tuples

2018-09-25 Thread Tim Peters
Tim Peters added the comment: > Suppose that there is a hash collision, say hash((3, 3)) == > hash((-3, -3)) and you change the hashing algorithm to fix > this collision. There are _two_ hash functions at play in that collision: the tuple hash function, and the integer hash

[issue34751] Hash collisions for tuples

2018-09-25 Thread Tim Peters
Tim Peters added the comment: >> j is even implies (j ^ -3) == -(j ^ 3) > This follows from what I posted before: if j is even, then > j ^ 3 is odd, so we can apply the rule x ^ -2 = -x to x = j ^ 3 > ... Thanks! That helps a lot. I had a blind spot there. This kind of th

[issue34751] Hash collisions for tuples

2018-09-26 Thread Tim Peters
Tim Peters added the comment: High-order bit: please restore the original tuple hash test. You have the worst case of "but I didn't write it" I've ever encountered ;-) Your new test is valuable, but I've seen several cases now where it fails to detect any problem

[issue34751] Hash collisions for tuples

2018-09-26 Thread Tim Peters
Tim Peters added the comment: >> The two-liner above with the xor in the second line is >> exactly Bernstein 33A, followed by a permutation >> of 33A's _output_ space. > Not output space, but internal state ? 33A's output _is_ its internal state at the e

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: I should have spelled this out before: these are all permutations, so in general permuting the result space of `x * mult + y` (or any other permutation involving x and y) is exactly the same as not permuting it but applying a different permutation to y instead

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: Perhaps worth noting that FNV-1a works great if used as _intended_: on a stream of unsigned bytes. All tests except the new tuple hash test suffer no collisions; the new test suffers 14. Nothing is needed to try to worm around nested tuple catastrophes, or

[issue34751] Hash collisions for tuples

2018-09-27 Thread Tim Peters
Tim Peters added the comment: Also worth noting: other projects need to combine hashes too. Here's a 64-bit version of the highly regarded C++ Boost[1] library's hash_combine() function (I replaced its 32-bit int literal with "a random" 64-bit one):

[issue34751] Hash collisions for tuples

2018-09-28 Thread Tim Peters
Tim Peters added the comment: [Tim] > Perhaps worth noting that FNV-1a works great if used as > _intended_: on a stream of unsigned bytes. > ... > >Py_uhash_t t = (Py_uhash_t)y; >for (int i = 0; i < sizeof(t); ++i) { >x = (x ^ (t & 0xff)) *

[issue34751] Hash collisions for tuples

2018-09-28 Thread Tim Peters
Tim Peters added the comment: Jeroen, thanks for helping us fly slightly less blind! ;-) It's a lot of work. I'd say you may as well pick a prime. It's folklore, but "a reason" is that you've discovered that regular bit patterns in multipliers can hurt, and

[issue34751] Hash collisions for tuples

2018-09-30 Thread Tim Peters
Tim Peters added the comment: An "aha!" moment for me: I noted before that replacing the tuple hash loop with Py_uhash_t t = (Py_uhash_t)y; t ^= t << 1; x = (x * mult) + t; does OK (64-bit build): most tests had no collisions, but the original tuple test

[issue34751] Hash collisions for tuples

2018-09-30 Thread Tim Peters
Tim Peters added the comment: Noting that the failure of high-order bits to propagate is one form of "avalanche" failure. A recent 64-bit hash, SeaHash, takes this approach which is said to have provably "perfect" avalanche behavior: Py_uhash_t t = (Py_uhash_t)y;

[issue34751] Hash collisions for tuples

2018-10-01 Thread Tim Peters
Tim Peters added the comment: If SeaHash is interesting to us (it is to me), I suggest dropping our DJB/FNV combining part entirely and using a purer form, like so: Py_uhash_t t = (Py_uhash_t)y; x ^= t ^ (t << 1); x *= (Py_uhash_t)0x6eed0e9da4d94a4fULL; x ^= (x >&g

[issue34751] Hash collisions for tuples

2018-10-01 Thread Tim Peters
Tim Peters added the comment: >Py_uhash_t t = (Py_uhash_t)y; >x ^= t ^ (t << 1); >x *= (Py_uhash_t)0x6eed0e9da4d94a4fULL; >x ^= (x >> 32) >> (x >> 60); >x *= (Py_uhash_t)0x6eed0e9da4d94a4fULL; Comment out either one of the multip

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: A meta-comment: a great deal of work has been done on non-crypto hashes in recent years. I'm not interested in rolling our own if one of the "winners" from that process can be adapted. For that reason, I've only been looking at those

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: >> the author wants this transformation to be easily >> invertible, so a prime is necessary > A multiplication by any odd number modulo 2**64 is > invertible. Right! My mistake. > As I argued before, the concept of primes is > mea

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: > So my suggestion remains > > for y in INPUT: >t = hash(y) >t ^= t * SOME_LARGE_EVEN_NUMBER >h ^= t >h *= MULTIPLIER On the face of it, I'd be amazed if that passed SMHasher, because it only propagates bits "to th

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: > If we, e.g., tested tuples of little floats instead ... Speaking of which: >>> from itertools import product >>> len(set(map(hash, product([0.5, 0.25], repeat=20 32 32 hash codes out of 1048576 distinct two-tuples isn't exa

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: >> I've noted before, e.g., that sticking to a prime >> eliminates a world of regular bit patterns in the >> multiplier. > Why do you think this? 0x1fff is prime :-) Point taken ;-) But "a world of" is not the s

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: >> For that reason, I've only been looking at those that >> scored 10 (best possible) on Appleby's SMHasher[1] test suite > Do you have a list of such hash functions? A world of searching. The ones rated "Excellent" here (pre

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: > >>> from itertools import product > >>> len(set(map(hash, product([0.5, 0.25], repeat=20 > 32 > Good catch! Would you like me to add this to the testsuite? It's in mine already ;-) I've added all the "bad ex

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: > So it seems that this SMHasher test suite doesn't > catch the problem that we're seeing with negative integers. Seems to be so, but I've never run SMHasher myself. I believe it's focused on statistical properties, like aval

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: FYI, this appears to be a good overview of what SMHasher is looking for: https://github.com/aappleby/smhasher/wiki/SMHasher Someg of everything: performance, correctness, statistical measures, and specific kinds of keysets that have proved problematic for other

[issue34751] Hash collisions for tuples

2018-10-02 Thread Tim Peters
Tim Peters added the comment: > SeaHash seems to be designed for 64 bits. Absolutely. > I'm guessing that replacing the shifts by > > x ^= ((x >> 16) >> (x >> 29)) > > would be what you'd do for a 32-bit hash. My guess too. But "the pri

[issue34751] Hash collisions for tuples

2018-10-03 Thread Tim Peters
Tim Peters added the comment: Thanks for looking at xxHash! An advantage is that it already comes in 32- and 64-bit versions. > A (simplified and slightly modified version of) xxHash > seems to work very well, much better than SeaHash. ? I've posted several SeaHash cores tha

[issue34751] Hash collisions for tuples

2018-10-03 Thread Tim Peters
Tim Peters added the comment: Here's a complete xxHash-based implementation via throwing away C++-isms from https://github.com/RedSpah/xxhash_cpp/blob/master/xxhash/xxhash.hpp In the 64-bit build there are no collisions across my tests except for 11 in the new tuple test. The 32-bit

[issue34751] Hash collisions for tuples

2018-10-04 Thread Tim Peters
Tim Peters added the comment: > Note that I'm always considering parametrized > versions of the hash functions that I'm testing. > I'm replacing the fixed multiplier (all algorithms > mentioned here have such a thing) by a random > multiplier which is 3 mod 8.

[issue34751] Hash collisions for tuples

2018-10-04 Thread Tim Peters
Tim Peters added the comment: >> In the 64-bit build there are no collisions across my >> tests except for 11 in the new tuple test. > That's pretty bad actually. With 64 bits, you statistically > expect something in the order of 10**-8 collisions. So > what yo

[issue34751] Hash collisions for tuples

2018-10-04 Thread Tim Peters
Change by Tim Peters : -- components: +Interpreter Core -XML ___ Python tracker <https://bugs.python.org/issue34751> ___ ___ Python-bugs-list mailing list Unsub

[issue34751] Hash collisions for tuples

2018-10-04 Thread Tim Peters
Tim Peters added the comment: >> people already wrote substantial test suites dedicated >> to that sole purpose, and we should aim to be "mere >> consumers" of functions that pass _those_ tests. > There are hash functions that pass those tests which > are

[issue34839] doctest: Change example under warnings section

2018-10-05 Thread Tim Peters
Tim Peters added the comment: Add a comment along the lines you (Terry) suggested. Some people need to write doctests that run under many versions of Python, so the info is still supremely relevant to them. -- nosy: +tim.peters ___ Python

[issue34839] doctest: Change example under warnings section

2018-10-05 Thread Tim Peters
Tim Peters added the comment: Stephane, it's not deep. People who need to write doctests that work across N versions of Python shouldn't need to read N versions of the documentation. This is hardly unique to doctest. We routinely add "Changed in version m.n" blurb

[issue34751] Hash collisions for tuples

2018-10-06 Thread Tim Peters
Tim Peters added the comment: As a sanity check, I tried the 32-bit version of MurmurHash2 (version 3 doesn't have a 32-bit version). All of my Python tests had collisions, and while the new tuple test dropped to 15, product([0.5, 0.25], repeat=20) skyrocketed from 141 (32-bit xxHas

[issue34751] Hash collisions for tuples

2018-10-06 Thread Tim Peters
Tim Peters added the comment: Thinking about the way xxHash works prompted me to try this initialization change: x = 0x345678UL + (Py_uhash_t)len; // adding in the length is new That was a pure win in my tests, slashing the number of collisions in the new tuple test, 32-bit build, from

[issue34751] Hash collisions for tuples

2018-10-07 Thread Tim Peters
Tim Peters added the comment: Attaching htest.py so we have a common way to compare what various things do on the tests that have been suggested. unittest sucks for that. doctest too. Here's current code output from a 32-bit build; "ideally" we want "got" val

[issue34751] Hash collisions for tuples

2018-10-07 Thread Tim Peters
Tim Peters added the comment: Here's htest.py output from git pull https://github.com/jdemeyer/cpython.git bpo34751 and a variation. The number of collisions in the variation appear in square brackets at the end of each line. 32-bit build: range(100) by 3; 32-bit hash codes; mean 1

[issue34751] Hash collisions for tuples

2018-10-08 Thread Tim Peters
Tim Peters added the comment: We need to worry about timing too :-( I'm using this as a test. It's very heavy on using 3-tuples of little ints as dict keys. Getting hash codes for ints is relatively cheap, and there's not much else going on, so this is intended to be ve

[issue34751] Hash collisions for tuples

2018-10-09 Thread Tim Peters
Tim Peters added the comment: >> Changes initialization to add in the length: > What's the rationale for that change? You always asked > me to stay as close as possible to the "official" hash function > which adds in the length at the end. Is there an actual

[issue35010] sort by partially reversed key tuple

2018-10-17 Thread Tim Peters
Tim Peters added the comment: This comes up every few years, but that's about it. Here's the iteration from 2 years ago: https://mail.python.org/pipermail/python-ideas/2016-October/043039.html Follow the thread. It contains easy-to-use wrappers for both "do it in multipl

[issue35023] Missed a key when iterating over dictionary

2018-10-18 Thread Tim Peters
Tim Peters added the comment: This won't be changed. The effect on the iteration order of adding and/or removing keys from a dict while iterating over it has never been defined in Python, and never will be. The Python 3 docs for dict views say this clearly: """ I

[issue35023] Missed a key when iterating over dictionary

2018-10-18 Thread Tim Peters
Tim Peters added the comment: Questions about Python should be asked, e.g., on the Python mailing list. The short course is that it's desired that iterating over dicts be as fast as possible, and nobody knows a way to make iteration robust in the face of mutations that wouldn

[issue35023] Missed a key when iterating over dictionary

2018-10-18 Thread Tim Peters
Tim Peters added the comment: Not without more expense. Which is why it hasn't been done. But this is a wrong place to talk about it. If you want Python to change, go to the python-ideas mailing list? https://mail.python.org/mailman/listinfo/python-

[issue35079] difflib.SequenceMatcher.get_matching_blocks omits common strings

2018-10-26 Thread Tim Peters
Tim Peters added the comment: Sorry, I find this too hard to follow. At the end: > ['TA', 'CA'] # Missing the substring 'CA' the comment claims it's missing 'CA', while the output plainly shows it's _not_ missing 'CA'.

[issue35079] difflib.SequenceMatcher.get_matching_blocks omits common strings

2018-10-26 Thread Tim Peters
Tim Peters added the comment: I don't object to spelling it out more, and your (Terry's) suggestions are fine. On the other hand, this module has been around for a lng time now, and this is the first instance I've seen of someone surprised that it doesn't produce

[issue35091] Objects/listobject.c: gallop functions rely on signed integer overflow

2018-10-28 Thread Tim Peters
Tim Peters added the comment: This doesn't actually matter - the code can never trigger. It would be fine to replace it with an assert to that effect (see below for a specific suggestion). The reason: The indices in this code are into vectors of PyObject*. These vectors can'

[issue35091] Objects/listobject.c: gallop functions rely on signed integer overflow

2018-10-28 Thread Tim Peters
Tim Peters added the comment: I left the code in because it was harmless (a 100%-predictable branch), and it was easier to show that overflow was _considered_ than to explain in full why it was impossible. In the context of CPython. For example, the Java port of this code couldn't re

[issue18522] Error creating a raw string of r'\\?\'

2013-07-21 Thread Tim Peters
Tim Peters added the comment: As section 2.4.1. (String literals) of the Python reference manual says, "... (even a raw string cannot end in an odd number of backslashes). Specifically, a raw string cannot end in a single backslash (since the backslash would escape the following

[issue18607] struct.unpack

2013-07-31 Thread Tim Peters
Tim Peters added the comment: Looks fine to me. The third one uses native size and alignment (see the "Byte Order, Size, and Alignment" section of the struct docs). After the first 5 B's, a pad byte is inserted so that the next H is properly aligned (to a 2-byte boundary)

[issue18629] future division breaks timedelta division by integer

2013-08-03 Thread Tim Peters
Tim Peters added the comment: Well, a timedelta is a duration. timedelta // n is as close as possible to one n'th of that duration, but "rounding down" (if necessary) so that the result is representable as a timedelta. In the same way, if i and j are integers, i // j is as cl

[issue18647] re.error: nothing to repeat

2013-08-03 Thread Tim Peters
Changes by Tim Peters : -- nosy: +tim_one ___ Python tracker <http://bugs.python.org/issue18647> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: Serhiy, I don't see the regexp '(?:.*$\n?)*' anywhere in doctest.py. Are you talking about the _EXAMPLE_RE regexp? That's the closest I see. If that's the case, the "nothing to repeat" error is incorrect: _EXAMPLE_RE

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: Serhiy, I'm asking you to be very explicit about which regexp in doctest.py you're talking about. If you're talking about the _EXAMPLE_RE regexp, I already explained what's going on with that. If you're talking about some other regexp,

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: I'm afraid it's just too tricky for the code to deduce that a negative lookahead assertion can imply that a later match can't be empty. But I don't know how smart the re compilation code already is ;-) It occurs to me now that the docte

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: Matching an empty string an unbounded number of times isn't a case of exponential runtime, it's a case of infinite runtime, unless the regexp internals are smart enough to cut the search off (I don't know enough about re's internals to kno

[issue18652] Add itertools.coalesce

2013-08-04 Thread Tim Peters
Tim Peters added the comment: FWIW, I like this. It would be a nice addition to the itertools module. +1 The `key` argument should be renamed to `pred`, as others have said. As to the name, I like "first_true". Does what it says. Plain "first" is misleading, an

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: Matthew, yes, I agree that a regexp engine can be coded to be robust against unboundedly repeated matching of an empty string. I don't know whether Python's current engine is so coded. It's easy to trick the 2.7 engine into accepting regexps

[issue18652] Add itertools.coalesce

2013-08-04 Thread Tim Peters
Tim Peters added the comment: > skipping the underscore ("firsttrue" or "nexttrue") > would arguably be more consistent with other itertools names. Like izip_longest and combinations_with_replacement? ;-) I care about readability here more than consistency with th

[issue18647] re.error: nothing to repeat

2013-08-04 Thread Tim Peters
Tim Peters added the comment: > Python's current regex engine isn't so coded. That's > the reason for the up-front check. It's peculiar then that nobody noticed before now that the check was so badly broken ;-) Offhand, do you have an example that displays bad beh

[issue18647] re.error: nothing to repeat

2013-08-06 Thread Tim Peters
Tim Peters added the comment: Serhiy, yup, that regexp is slow, but it does finish - so the engine is doing something to avoid _unbounded_ repetitive matching of an empty string. Change it to (?:.?.+)*y and the group can no longer match an empty string, but it's still slow (although

[issue18647] re.error: nothing to repeat

2013-08-10 Thread Tim Peters
Tim Peters added the comment: Serhiy, yes, I know the regexp you gave takes exponential time. But: 1. This appears to have nothing to do with repeated 0-length matches. I gave you an example of a very similar regexp that also takes exponential time, but never makes any 0-length sub-match

[issue18647] re.error: nothing to repeat

2013-08-10 Thread Tim Peters
Tim Peters added the comment: So does anyone believe this check serves a useful purpose _now_? Doesn't seem so to me. -- ___ Python tracker <http://bugs.python.org/is

[issue18314] Have os.unlink remove junction points

2013-08-13 Thread Tim Peters
Tim Peters added the comment: Victor, Wikipedia has a readable explanation: http://en.wikipedia.org/wiki/NTFS_junction_point I haven't used them much. From cmd.exe, I've been able to delete them, not with "del" but with "rmdir". You can create one from cmd.

[issue18739] math.log of a long returns a different value of math.log of an int

2013-08-14 Thread Tim Peters
Tim Peters added the comment: Yup - 2.7 evaluates this in a less precise way, as log(10L) = log(10./16 * 2**4) = log(0.625) + log(2)*4 >>> log(10L) == log(0.625) + log(2)*4 True This patterns works well even for longs that are far too large to represent as a double; e.g., &

[issue18739] math.log of a long returns a different value of math.log of an int

2013-08-14 Thread Tim Peters
Tim Peters added the comment: FYI, the improvement was made in these 2 changesets: c8dc4b5e54fb a9349fd3d0f7 If desired, those could presumably be grafted back into the 2.7 branch. The commit messages refer to issue #9599, but that appears to be mistaken

[issue18739] math.log of a long returns a different value of math.log of an int

2013-08-14 Thread Tim Peters
Tim Peters added the comment: OK, the correct one is issue #9959. -- ___ Python tracker <http://bugs.python.org/issue18739> ___ ___ Python-bugs-list mailin

[issue18739] math.log of a long returns a different value of math.log of an int

2013-08-15 Thread Tim Peters
Tim Peters added the comment: +1 on fixing it in 2.7, for the reasons Mark gave. Way back when I introduced the original scheme, log(a_long) raised an exception, and the `int` and `long` types had a much higher wall between them. The original scheme changed an annoying failure into a "p

[issue18827] mistake in the timedelta.total_seconds docstring

2013-08-24 Thread Tim Peters
Tim Peters added the comment: The docs look correct to me. For example, >>> from datetime import * >>> td = datetime.now() - datetime(1,1,1) >>> td datetime.timedelta(735103, 61756, 484000) >>>

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-26 Thread Tim Peters
Tim Peters added the comment: Python's debug-mode memory allocators add some "magic values" before and after each allocated chunk of memory, and check them when the chunk is freed to make sure nobody overwrote them. In this case, someone did overwrite the byte at p-5, where p

[issue18845] 2.7.5-r2: Fatal Python error: Segmentation fault

2013-08-26 Thread Tim Peters
Tim Peters added the comment: Impossible to know, but since everything in the traceback comes from matplotlib, the error is most likely in matplotlib. -- nosy: +tim.peters ___ Python tracker <http://bugs.python.org/issue18

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-26 Thread Tim Peters
Tim Peters added the comment: Memory corruption can be difficult to track down. Best thing you can do is strive to find a test case as small and fast as possible that shows the same kind of error. By "rogue extension module" I just mean 3rd-party C code (like, for example, matpl

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-26 Thread Tim Peters
Tim Peters added the comment: By the way, if memory serves, compiling with --with-pydebug changes the memory layout of Python objects, so a Python compiled this way _cannot_ be used successfully with extension modules that were compiled without the same options. Did you rebuild your

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-26 Thread Tim Peters
Tim Peters added the comment: Well, if you delete a giant list, and the list held the only references remaining to the objects the list contained, then the memory for those objects will be free'd, one object at a time. A debug build would then detect the memory corruption in those ob

[issue18845] 2.7.5-r2: Fatal Python error: Segmentation fault

2013-08-26 Thread Tim Peters
Tim Peters added the comment: Note that the same poster is also reporting memory corruption in issue 18843. I suggest ignoring this one unless/until the earlier bug is resolved (memory corruption can easily cause a segfault - or any other kind of error

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-27 Thread Tim Peters
Tim Peters added the comment: Did you read Misc/README.valgrind (in the Python tree)? The warnings you've seen so far are probably all nonsense, and README.valgrind explains how to avoid getting them. -- ___ Python tracker <http://bugs.py

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-27 Thread Tim Peters
Tim Peters added the comment: I don't know why there isn't a configure switch for this - but then I've never used valgrind - LOL ;-) Other developers use valgrind on Python routinely, though. So it's unlikely you'll find a legitimate problem _in Pyt

[issue18859] README.valgrind should mention --with-valgrind

2013-08-27 Thread Tim Peters
New submission from Tim Peters: In issue 18843 a user noted that Misc/README.valgrind doesn't mention the --with-valgrind configure option. It probably should. But since I've never used valgrind, I'm not the guy to do it ;-) -- components: Build messages: 196338 n

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-27 Thread Tim Peters
Tim Peters added the comment: I opened issue 18859 about the lack of --with-valgrind info in Misc/README.valgrind. Thanks for noticing! -- ___ Python tracker <http://bugs.python.org/issue18

[issue18859] README.valgrind should mention --with-valgrind

2013-08-27 Thread Tim Peters
Changes by Tim Peters : -- keywords: +easy ___ Python tracker <http://bugs.python.org/issue18859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: Hmm. I don't quite know what you're doing: you said you're getting away from --with-pydebug, but these "bad leading pad byte" messages can't be generated unless Python is compiled with (at least) PYMALLOC_DEBUG defined. That sa

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: Yet Another Tool ;-) Python's "small object" allocator grabs memory in chunks of 256KB from the system, and carves up the space itself. Other memory tools (like Valgrind ...) only see that Python has grabbed 256KB chunks, so can't detect any

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: It would be a severely lame OS that allowed a process to overwrite another process's memory ;-) "Bad C or C++ code", in the process you're running, is still the best guess. A memory module that sometimes dropped the last bit _could_ be at fa

[issue18881] Can someone try to duplicate corruption on Gentoo?

2013-08-29 Thread Tim Peters
New submission from Tim Peters: In http://bugs.python.org/issue18843 a user reported a debug PyMalloc "bad leading pad byte" memory corruption death while running their code. After some thrashing, they decided to rebuild Python, and got the same kind of error while rebuilding Py

[issue18881] Can someone try to duplicate corruption on Gentoo?

2013-08-29 Thread Tim Peters
Changes by Tim Peters : -- resolution: -> invalid ___ Python tracker <http://bugs.python.org/issue18881> ___ ___ Python-bugs-list mailing list Unsubscri

[issue18881] Can someone try to duplicate corruption on Gentoo?

2013-08-29 Thread Tim Peters
Changes by Tim Peters : -- stage: -> committed/rejected ___ Python tracker <http://bugs.python.org/issue18881> ___ ___ Python-bugs-list mailing list Unsubscri

[issue18881] Can someone try to duplicate corruption on Gentoo?

2013-08-29 Thread Tim Peters
Changes by Tim Peters : -- status: open -> closed ___ Python tracker <http://bugs.python.org/issue18881> ___ ___ Python-bugs-list mailing list Unsubscri

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: I sent a msg to Python-Dev, asking for a Gentoo user to try to reproduce this. Thanks! -- ___ Python tracker <http://bugs.python.org/issue18

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: Thanks for chiming in, Stephen! I can't answer your questions, but agree the original poster needs to tell you exactly what he did -- else we'd just be thrashing at random. There's been enough

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-29 Thread Tim Peters
Tim Peters added the comment: Martin, can you please supply exact commands Stephen can use to try to reproduce the problem you saw running `emerge`? And try them yourself first, to make sure you can reproduce the problem too. Any detail may be important. For example, it's possible th

[issue18884] python-2.7.5-r3: 40 bytes in 1 blocks are definitely lost

2013-08-30 Thread Tim Peters
Tim Peters added the comment: The matplotlib people won't care about this one either. matplotlib allocated the memory, and the error message at the end says it's _trying_ to call _PyMem_DebugFree (which may well be trying to release the memory), but the binaries aren&#x

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-30 Thread Tim Peters
Tim Peters added the comment: Martin, would it be possible to borrow someone else's machine and try to reproduce this? If you can, that would greatly reduce the probability of this being a HW error. It would also leave us with an exact set of commands to share so others can try it on

[issue18884] python-2.7.5-r3: 40 bytes in 1 blocks are definitely lost

2013-08-30 Thread Tim Peters
Tim Peters added the comment: Now you have something to show the matplotlib folks - although they're not likely to get excited about leaking 40 bytes. There is nothing Python can do about this. matplotlib is responsible for free'ing the memory matplotlib allocates, just as

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-30 Thread Tim Peters
Tim Peters added the comment: OK, it sounds to me like you do not have a reproducible test case, of any kind. It that's true, this bug report isn't going anywhere :-( Python isn't a memory-testing program, so it would be insanely inefficient for it to (for example) read u

[issue18845] 2.7.5-r2: Fatal Python error: Segmentation fault

2013-08-30 Thread Tim Peters
Tim Peters added the comment: As issue 18843 has evolved, seems more likely now that it's flaky HW, but agreed in any case there's really no evidence of a Python problem here. So closing it. Martin, we can revisit this if there's real progress on the other issue. --

[issue18808] Thread.join returns before PyThreadState is destroyed

2013-08-30 Thread Tim Peters
Tim Peters added the comment: Nasty problem ;-) I don't understand the need for all the indirections in the second patch. Like, why use a weakref? It's not like we have to worry about an immortal tstate keeping a measly little lock object alive forever, right? Seems to me t

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-31 Thread Tim Peters
Tim Peters added the comment: Someone may find the new stress.valgrind.stderr interesting, but - since I've never used valgrind - it doesn't mean much to me. I _expected_ you'd run the little stress program under a debug Python and without valgrind, since that's the onl

[issue18808] Thread.join returns before PyThreadState is destroyed

2013-08-31 Thread Tim Peters
Tim Peters added the comment: I'm getting a headache now - LOL ;-) Thanks for the explanation! What I still don't understand: the new lock is an internal implementation detail. How would it gain a weakref with a callback? Users aren't going to mess with this lock, and if y

[issue18897] Illegal instruction at Python-2.7.5/Modules/_sre.c:1173

2013-08-31 Thread Tim Peters
Tim Peters added the comment: Note this line in your first post: DUMA Aborting: mprotect() failed: Cannot allocate memory. Python never calls mprotect(), but DUMA() probably does. Also note what it said after that: Check README section 'MEMORY USAGE AND EXECUTION SPEED'

[issue18843] Py_FatalError (msg=0x7f0e3b373232 "bad leading pad byte") at Python-2.7.5/Python/pythonrun.c:1689

2013-08-31 Thread Tim Peters
Tim Peters added the comment: Thanks for that, Stephen! I don't know of anything else you could try that would be helpful. The OP doesn't appear able to reproduce his problems either, and last I heard he was off running `emerge` under DUMA: http://duma.sourceforge.net/

[issue11798] Test cases not garbage collected after run

2013-08-31 Thread Tim Peters
Tim Peters added the comment: All the buildbots are failing due to changeset 868ad6fa8e68 - I'm going to back it out. -- nosy: +tim.peters ___ Python tracker <http://bugs.python.org/is

[issue18889] test_sax: multiple failures on Windows desktop

2013-09-01 Thread Tim Peters
Tim Peters added the comment: Suggest caution here. test_sax fails the same way for me too (Windows Vista), under both the released 3.3.2 and a Python built from the current hg default branch. However, these files (test.xml and test.xml.out) have not changed since the Python 2.7 line - the

<    4   5   6   7   8   9   10   11   12   13   >