New submission from LiarPrincess <m...@liarprincess.me>:

This one is so tiny that I'm not really sure we want to merge it…

=== Problem ===

`Objects/unicodetype_db.h` starts in the following way:

```c
/* a list of unique character type descriptors */
const _PyUnicode_TypeRecord _PyUnicode_TypeRecords[] = {
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 0},
    {0, 0, 0, 0, 0, 32},
    {0, 0, 0, 0, 0, 48},
    …
```

The 1st record (`{0, 0, 0, 0, 0, 0}`) is duplicated. This is not a problem, since the 1st occurrence is never used, but if we wanted to remove it then this is the ticket for it.

=== Detailed description ===

`Objects/unicodetype_db.h` is generated by `Tools/unicode/makeunicodedata.py` (I removed irrelevant lines):

```py
def makeunicodetype(unicode, trace):
    dummy = (0, 0, 0, 0, 0, 0)
    table = [dummy]  # (1)
    cache = {0: dummy}  # (2)

    for char in unicode.chars:
        # Things…
        item = (upper, lower, title, decimal, digit, flags)

        i = cache.get(item)  # (3)
        if i is None:
            cache[item] = i = len(table)
            table.append(item)
        index[char] = i
```

- (1) - list which contains unique character properties (as `(upper, lower, title, decimal, digit, flags)` tuples)
- (2) - intended as a mapping from character properties to their index in `table`, but improperly initialized as a mapping from index to character properties
- (3) - we check if the current tuple is in `cache`

=== Result ===

The first time we get to a character that has `(0, 0, 0, 0, 0, 0)` properties (which is code point 0 - `NULL`) we check if it is in `cache`. It is not (there is an entry that goes from index `0` to `(0, 0, 0, 0, 0, 0)` - the other way around), so we add this entry to `table` and `cache` a second time.

=== Fix ===

In line `(2)` we should have: `cache = {dummy: 0}`. Obviously after doing so we have to rerun `makeunicodedata.py` to regenerate the header - this is why this simple change modifies a lot of lines.

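To make the effect easier to see, here is a minimal standalone sketch of the deduplication loop. It is not the real `makeunicodedata.py` code: `build_table`, its `make_cache` parameter, and the four sample records are hypothetical and only mimic the first few entries of the header shown above.

```python
def build_table(records, make_cache):
    """Deduplicate records the way the loop above does (simplified sketch).

    `make_cache` is a hypothetical hook that lets us compare the current
    and the proposed initialization of `cache`.
    """
    dummy = (0, 0, 0, 0, 0, 0)
    table = [dummy]
    cache = make_cache(dummy)
    index = []
    for item in records:
        i = cache.get(item)
        if i is None:
            # Not seen before: append to the table and remember its index.
            cache[item] = i = len(table)
            table.append(item)
        index.append(i)
    return table, index


# Made-up input: the all-zero record appears first and then again later.
records = [
    (0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 0, 32),
    (0, 0, 0, 0, 0, 48),
    (0, 0, 0, 0, 0, 0),
]

# Current initialization: index -> properties, so the lookup by tuple misses.
buggy_table, buggy_index = build_table(records, lambda dummy: {0: dummy})

# Proposed initialization: properties -> index.
fixed_table, fixed_index = build_table(records, lambda dummy: {dummy: 0})

print(len(buggy_table), buggy_index)
print(len(fixed_table), fixed_index)
```

With the current initialization the all-zero record ends up in the table twice and index `0` is never referenced; with `{dummy: 0}` the table is one entry shorter and every index shifts down by one, which is consistent with the regenerated header changing in many places.
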
I will submit a PR on GitHub in just a sec…

----------
components: Unicode
messages: 416889
nosy: LiarPrincess, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: Duplicate entry in 'Objects/unicodetype_db.h'
type: enhancement

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue47243>
_______________________________________