Sorry, I made a mistake in the code. This patch is not worth reviewing yet.
29.01.2025 23:23, Alexander Borisov wrote:
Hi, hackers!

I propose to consider a simple optimization for the Unicode case tables. The main changes affect the generate-unicode_case_table.pl file.

By changing how records are looked up in the table, we managed to:

1. Remove the stored Unicode codepoints (unsigned int) from all tables.
2. Reduce the main table from 3003 to 1575 records (duplicates removed).
3. Replace a pointer (essentially uint64_t) with uint8_t in the main table.
4. Reduce the time needed to find a record in the table.
5. Reduce the size of the final object file.

The approach is generally as follows:

Group Unicode codepoints into ranges in which the difference between neighboring elements does not exceed a specified limit. For example, given the numbers 1, 2, 3, 5, 6 and limit = 1, the difference between 3 and 5 is 2, which is greater than 1, so the ranges are 1-3 and 5-6.

Then we form a table (let's call it the index table) by combining the obtained ranges. The table contains uint16_t indexes into the main table.

Then, from the previously obtained ranges, we generate a function (get_case()) that returns the index into the main table. The function consists only of IF/ELSE IF constructs imitating a binary search.

Because we no longer access the main data table directly, we can remove duplicates from it, and almost half of its records are duplicates. Also, because get_case() contains all the information about the Unicode ranges, we don't need to store Unicode codepoints in the main table.

This approach also made it possible to remove some checks, which improves performance even on the fast path (codepoints < 0x80).

casefold() test.

* macOS 15.1 (Apple M3 Pro) (Apple clang version 16.0.0)

ASCII:
Repeated characters (700kb) in the range from 0x20 to 0x7E.
Patch:   tps = 282.457745
Without: tps = 263.749652

Cyrillic:
Repeated characters (1MB) in the range from 0x0410 to 0x042F.
Patch:   tps = 82.399637
Without: tps = 48.291034

Unicode:
A query consisting of all Unicode characters from 0xA0 to 0x2FA1D (excluding 0xD800..0xDFFF).
Patch:   tps = 120.703471
Without: tps = 92.423490

* Ubuntu 24.04.1 (Intel(R) Xeon(R) Gold 6140) (gcc version 13.3.0)

ASCII:
Repeated characters (700kb) in the range from 0x20 to 0x7E.
Patch:   tps = 172.291972
Without: tps = 111.592281

Cyrillic:
Repeated characters (1MB) in the range from 0x0410 to 0x042F.
Patch:   tps = 36.487650
Without: tps = 22.537515

Unicode:
A query consisting of all Unicode characters from 0xA0 to 0x2FA1D (excluding 0xD800..0xDFFF).
Patch:   tps = 55.190635
Without: tps = 45.493104
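
To illustrate the approach described above, here is a minimal C sketch. It is not the code from the patch: it covers only a handful of Latin and Cyrillic codepoints, and everything except the name get_case() is hypothetical (cp_range, case_record, group_ranges(), case_map, case_index, toy_tolower(), toy_toupper()). Reserving index 0 for "no mapping" is also an assumption made for this example. In the patch the grouping happens in the Perl generator; it is written in C here only to keep the example in one language.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range of codepoints, inclusive on both ends. */
typedef struct
{
	uint32_t	first;
	uint32_t	last;
} cp_range;

/*
 * Split sorted codepoints into ranges. A new range starts whenever the
 * gap to the previous codepoint exceeds "limit". "out" must have room
 * for n entries. Returns the number of ranges written.
 */
size_t
group_ranges(const uint32_t *cps, size_t n, uint32_t limit, cp_range *out)
{
	size_t		nranges = 0;

	assert(cps != NULL && out != NULL);
	for (size_t i = 0; i < n; i++)
	{
		if (nranges > 0 && cps[i] - out[nranges - 1].last <= limit)
			out[nranges - 1].last = cps[i];	/* extend the current range */
		else
		{
			out[nranges].first = cps[i];	/* start a new range */
			out[nranges].last = cps[i];
			nranges++;
		}
	}
	return nranges;
}

/* Hypothetical record: note that it stores no source codepoint. */
typedef struct
{
	uint32_t	lower;
	uint32_t	upper;
} case_record;

/* Deduplicated main table: 'A' and 'a' share one record, and so on. */
static const case_record case_map[] = {
	{0, 0},						/* 0: no mapping (assumed convention) */
	{0x0061, 0x0041},			/* 1: A / a */
	{0x0062, 0x0042},			/* 2: B / b */
	{0x0063, 0x0043},			/* 3: C / c */
	{0x0430, 0x0410},			/* 4: Cyrillic A */
	{0x0431, 0x0411},			/* 5: Cyrillic BE */
};

/* Index table: ranges 0x41-0x43, 0x61-0x63, 0x410-0x411, 0x430-0x431. */
static const uint16_t case_index[] = {1, 2, 3, 1, 2, 3, 4, 5, 4, 5};

/* Hand-written IF/ELSE IF lookup over the four ranges, split in halves. */
uint16_t
get_case(uint32_t cp)
{
	if (cp < 0x0410)
	{
		if (cp >= 0x0041 && cp <= 0x0043)
			return case_index[cp - 0x0041 + 0];
		else if (cp >= 0x0061 && cp <= 0x0063)
			return case_index[cp - 0x0061 + 3];
	}
	else
	{
		if (cp <= 0x0411)
			return case_index[cp - 0x0410 + 6];
		else if (cp >= 0x0430 && cp <= 0x0431)
			return case_index[cp - 0x0430 + 8];
	}
	return 0;					/* outside every range */
}

uint32_t
toy_tolower(uint32_t cp)
{
	uint16_t	idx = get_case(cp);

	return idx != 0 ? case_map[idx].lower : cp;
}

uint32_t
toy_toupper(uint32_t cp)
{
	uint16_t	idx = get_case(cp);

	return idx != 0 ? case_map[idx].upper : cp;
}
```

With this toy data, toy_tolower(0x0410) yields 0x0430, toy_toupper(0x0061) yields 0x0041, and codepoints outside the four ranges are returned unchanged. The ten covered codepoints need only five real records in case_map, which is the same kind of deduplication as described for the main table.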
-- Alexander Borisov