Jeff Davis writes:
> On Tue, 2025-03-18 at 11:11 -0400, Tom Lane wrote:
>> Also, probably better to make it const:
>>
>> -static const pg_wchar *casekind_map[NCaseKind] =
>> +static const pg_wchar * const casekind_map[NCaseKind] =
> Was this a general suggestion, or did you see something in part
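For readers following the const suggestion quoted above, here is a minimal C sketch of what the second `const` changes. The enum values, the `demo_*` tables and `demo_lookup()` are hypothetical stand-ins for illustration, not the contents of unicode_case_table.h; only the declaration of `casekind_map` mirrors the quoted diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t pg_wchar;		/* stand-in for PostgreSQL's pg_wchar */

/* Hypothetical, simplified set of case kinds (assumption for this sketch). */
typedef enum
{
	CaseLower,
	CaseTitle,
	CaseUpper,
	NCaseKind
} CaseKind;

/* Tiny illustrative tables, not real Unicode data. */
static const pg_wchar demo_lower[] = {0x61, 0x62};
static const pg_wchar demo_title[] = {0x41, 0x42};
static const pg_wchar demo_upper[] = {0x41, 0x42};

/*
 * First const: the pg_wchar values cannot be modified through the pointers.
 * Second const: the pointers stored in the array cannot be reassigned
 * either, so the array itself is read-only as well.
 */
static const pg_wchar *const casekind_map[NCaseKind] = {
	[CaseLower] = demo_lower,
	[CaseTitle] = demo_title,
	[CaseUpper] = demo_upper,
};

/* Hypothetical accessor: returns entry idx of the table for one case kind. */
static pg_wchar
demo_lookup(CaseKind kind, size_t idx)
{
	assert(kind < NCaseKind);
	return casekind_map[kind][idx];
}
```

With only the first `const`, an assignment such as `casekind_map[CaseLower] = demo_upper;` would still compile; with both, the compiler rejects it.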
On Tue, 2025-03-18 at 11:11 -0400, Tom Lane wrote:
> It's not apparent to me why that table needs to be in a header
> file and not in the sole user .c file?
Thank you, fixed.
> Also, probably better to make it const:
>
> -static const pg_wchar *casekind_map[NCaseKind] =
> +static const pg_wchar
One more thing: I observe that headerscheck is now unhappy:
$ src/tools/pginclude/headerscheck
In file included from /tmp/headerscheck.yOpahZ/test.c:2:
./src/include/common/unicode_case_table.h:8598:24: warning: 'casekind_map'
defined but not used [-Wunused-variable]
static const pg_wchar *case
15.03.2025 23:07, Jeff Davis wrote:
On Fri, 2025-03-14 at 15:00 +0300, Alexander Borisov wrote:
I tried adding a loop to create tables, and everything looks fine
(v7).
[...]
I prefer to generalize when we have the other code in place. As it was,
it was a bit confusing why the extra arguments
Jeff Davis writes:
> On Sat, Mar 15, 2025 at 1:11 PM Tom Lane wrote:
>> crake doesn't like your perl style:
>> ./src/common/unicode/generate-unicode_case_table.pl: Loop iterator is not
>> lexical at line 638, column 2. See page 108 of PBP.
> I suppose pgperltidy didn't catch that. I will fix it
On Sat, Mar 15, 2025 at 1:11 PM Tom Lane wrote:
> Jeff Davis writes:
> > Committed. Thank you!
>
> crake doesn't like your perl style:
>
> ./src/common/unicode/generate-unicode_case_table.pl: Loop iterator is not
> lexical at line 638, column 2. See page 108 of PBP.
I suppose pgperltidy didn'
Jeff Davis writes:
> Committed. Thank you!
crake doesn't like your perl style:
./src/common/unicode/generate-unicode_case_table.pl: Loop iterator is not
lexical at line 638, column 2. See page 108 of PBP.
([Variables::RequireLexicalLoopIterators] Severity: 5)
regard
On Fri, 2025-03-14 at 15:00 +0300, Alexander Borisov wrote:
> I tried adding a loop to create tables, and everything looks fine
> (v7).
> Also removed unnecessary (hanging) global variables.
Changed. I used a loop more similar to your first one (hash of arrays),
and I left case_map_special outside
On 14/03/2025 05:43, Jeff Davis wrote:
On Wed, 2025-03-12 at 23:39 +0300, Alexander Borisov wrote:
v5 attached.
Attached v6j.
* marked arrays as "static const" rather than just "static"
* ran pgindent
* changed data types where appropriate (uint32->pg_wchar)
* modified perl code so that it pr
On Fri, 2025-03-14 at 13:16 +0200, Heikki Linnakangas wrote:
> Attached are fixes for those and some other minor things.
Thank you, I agree and I have applied your changes.
Regards,
Jeff Davis
On Wed, 2025-03-12 at 19:55 +0300, Alexander Borisov wrote:
> 1. Added static for casemap() function. Otherwise the compiler could not
> optimize the code and the performance dropped significantly.
Oops, it was static, but I made it external just to see what code it
generated. I didn't intend to
12.03.2025 19:55, Alexander Borisov wrote:
[...]
A couple questions:
* Is there a reason the fast-path for codepoints < 0x80 is in
unicode_case.c rather than unicode_case_func.h?
Yes, this is an important optimization, below are benchmarks that
[...]
I forgot to add the benchmark:
Benchm
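The fast path for codepoints < 0x80 discussed above can be pictured with a rough C sketch. This is not the code from unicode_case.c: `demo_tolower()` and `demo_table_lower()` are hypothetical names, and the slow path here is a placeholder with a single mapping instead of the generated tables.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t pg_wchar;		/* stand-in for PostgreSQL's pg_wchar */

/*
 * Hypothetical slow path. In the real code this would be a lookup in the
 * generated case tables; here only U+00C0 -> U+00E0 is mapped.
 */
static pg_wchar
demo_table_lower(pg_wchar cp)
{
	return (cp == 0x00C0) ? 0x00E0 : cp;
}

/* Lowercase one code point, handling ASCII without touching any table. */
static pg_wchar
demo_tolower(pg_wchar cp)
{
	/* Callers are expected to pass a valid Unicode code point. */
	assert(cp <= 0x10FFFF);

	/* Fast path: ASCII is resolved with two comparisons and an add. */
	if (cp < 0x80)
	{
		if (cp >= 'A' && cp <= 'Z')
			return cp + ('a' - 'A');
		return cp;
	}

	/* Everything else goes through the (placeholder) table lookup. */
	return demo_table_lower(cp);
}
```

The point of the sketch is only the ordering: the cheap range test comes first, so ASCII-heavy input never reaches the table search.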
19.02.2025 01:56, Jeff Davis wrote:
On Wed, 2025-02-19 at 01:54 +0300, Alexander Borisov wrote:
In proposing the patch for v3, I struck a balance between improving
performance and reducing binary size, without sacrificing code
clarity.
Fair enough. I will continue reviewing v3.
Did you have
On Wed, 2025-02-19 at 01:54 +0300, Alexander Borisov wrote:
> In proposing the patch for v3, I struck a balance between improving
> performance and reducing binary size, without sacrificing code
> clarity.
Fair enough. I will continue reviewing v3.
Regards,
Jeff Davis
19.02.2025 01:02, Jeff Davis wrote:
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
I tried the approach via a range table. The result was worse than
without the table. With branching in a function, the result is
better.
Patch v3 — ranges binary search by branches.
Patch v4 — ranges
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
> I tried the approach via a range table. The result was worse than
> without the table. With branching in a function, the result is
> better.
>
> Patch v3 — ranges binary search by branches.
> Patch v4 — ranges binary search by table.
T
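To make the two variants named above concrete ("ranges binary search by branches" versus "by table"), here is a small C sketch of both. The ranges, index bases and `demo_*` names are invented for illustration and are not taken from patch v3 or v4; index 0 means "no entry".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t pg_wchar;		/* stand-in for PostgreSQL's pg_wchar */

/* Hypothetical range: code points first..last map to base + (cp - first). */
typedef struct
{
	pg_wchar	first;
	pg_wchar	last;
	uint16_t	base;
} demo_range;

/* Sorted, non-overlapping ranges (illustrative values only). */
static const demo_range demo_ranges[] = {
	{0x0041, 0x005A, 1},
	{0x00C0, 0x00DE, 27},
	{0x0391, 0x03A9, 58},
};

/* "By table": an ordinary binary search over the range array. */
static uint16_t
demo_index_table(pg_wchar cp)
{
	size_t		lo = 0;
	size_t		hi = sizeof(demo_ranges) / sizeof(demo_ranges[0]);

	while (lo < hi)
	{
		size_t		mid = lo + (hi - lo) / 2;

		if (cp < demo_ranges[mid].first)
			hi = mid;
		else if (cp > demo_ranges[mid].last)
			lo = mid + 1;
		else
			return (uint16_t) (demo_ranges[mid].base +
							   (cp - demo_ranges[mid].first));
	}
	return 0;					/* not in any range */
}

/* "By branches": the same search unrolled into generated if statements. */
static uint16_t
demo_index_branches(pg_wchar cp)
{
	if (cp < 0x00C0)
	{
		if (cp >= 0x0041 && cp <= 0x005A)
			return (uint16_t) (1 + (cp - 0x0041));
		return 0;
	}
	if (cp <= 0x00DE)
		return (uint16_t) (27 + (cp - 0x00C0));
	if (cp >= 0x0391 && cp <= 0x03A9)
		return (uint16_t) (58 + (cp - 0x0391));
	return 0;
}

/* Sanity helper: both variants must agree for a given code point. */
static void
demo_check(pg_wchar cp)
{
	assert(demo_index_table(cp) == demo_index_branches(cp));
}
```

In the branch variant the range boundaries become immediate constants in the instruction stream rather than loads from an array, which may be one reason the branch version measured better in the thread; that explanation is a guess, not something stated in the messages.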
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
> What's the result?
>
> I would use Range Binary in Unicode case/normalization. The algorithm
> shows good results. Plus it can be customized (increasing/decreasing)
> the table by allowing empty values.
>
> Also, I got a strong feeling
06.02.2025 22:08, Jeff Davis wrote:
On Thu, 2025-02-06 at 18:39 +0300, Alexander Borisov wrote:
Since I started to improve Unicode Case, I used the same approach,
essentially a binary search, only not by individual values, but by
ranges.
I considered it a 4th approach because of the generated
On Thu, 2025-02-06 at 18:39 +0300, Alexander Borisov wrote:
> Since I started to improve Unicode Case, I used the same approach,
> essentially a binary search, only not by individual values, but by
> ranges.
I considered it a 4th approach because of the generated branches in
case_index(). Case_ind
Hi Jeff,
06.02.2025 00:46, Jeff Davis wrote:
On Tue, 2025-02-04 at 23:19 +0300, Alexander Borisov wrote:
I've done many different experiments and everywhere the result is within
the margin of the v2 patch result.
Great, thank you for working on this!
There doesn't appear to be a downside. Ev
On Tue, 2025-02-04 at 23:19 +0300, Alexander Borisov wrote:
> I've done many different experiments and everywhere the result is within
> the margin of the v2 patch result.
Great, thank you for working on this!
There doesn't appear to be a downside. Even though it's more complex,
we have exhaust
31.01.2025 01:43, Heikki Linnakangas wrote:
Hi Heikki,
Did you consider using a radix tree? We use that method in
src/backend/utils/mb/Unicode/convutils.pm. I'm not sure if that's better or worse
than what's proposed here, but it would seem like a more standard
technique at least. Or if this
On 30/01/2025 15:39, Alexander Borisov wrote:
The code is fixed, now the patch passes all tests.
Change from the original patch (v1):
Reduce the main table from 3003 to 1677 (duplicates removed) records.
Added records from 0x00 to 0x80 for fast path.
Renamed get_case() function to pg_unicode_cas
Sorry, I made a mistake in the code. It's not worth watching this patch yet.
29.01.2025 23:23, Alexander Borisov wrote:
Hi, hackers!
I propose to consider a simple optimization for Unicode case tables.
The main changes affect the generate-unicode_case_table.pl file.
Because of the modified app
24 matches