05.12.2024 17:59, Peter Eisentraut wrote:
On 05.12.24 15:01, Alexander Borisov wrote:
Postgres users often store URLs in the database. As an example, they
provide links to their pages on the web, analyze users' posts and get
links for further storage and analysis. Naturally, there is a need to
06.12.2024 21:04, Matthias van de Meent wrote:
On Thu, 5 Dec 2024 at 15:02, Alexander Borisov wrote:
[..]
I'd be extremely annoyed if URLs I wrote into the database didn't
return in identical manner when fetched from the database. See also
how numeric has different representations o
06.02.2025 22:08, Jeff Davis wrote:
On Thu, 2025-02-06 at 18:39 +0300, Alexander Borisov wrote:
Since I started to improve Unicode Case, I used the same approach,
essentially a binary search, only not by individual values, but by
ranges.
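The idea of searching by ranges rather than by individual values can be illustrated with a minimal sketch. This is not the code from the patch: the `case_range` struct, the `toy_lower_ranges` table, and the `toy_to_lower` function are hypothetical names invented for illustration, and the table holds only a few hand-picked ranges, not the generated Unicode data.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical range entry (not PostgreSQL's actual layout): every code
 * point in [first, last] maps to its lowercase form by adding delta.
 */
typedef struct
{
    uint32_t first;
    uint32_t last;
    int32_t  delta;
} case_range;

/* Toy table for illustration only: sorted by first, non-overlapping. */
static const case_range toy_lower_ranges[] = {
    {0x0041, 0x005A, 32},   /* LATIN CAPITAL A..Z */
    {0x00C0, 0x00D6, 32},   /* LATIN CAPITAL A WITH GRAVE..O WITH DIAERESIS */
    {0x00D8, 0x00DE, 32},   /* LATIN CAPITAL O WITH STROKE..THORN */
    {0x0410, 0x042F, 32},   /* CYRILLIC CAPITAL A..YA */
};

/*
 * Binary search over ranges: each step compares the code point against a
 * whole [first, last] interval instead of a single table value.
 * Code points outside every range are returned unchanged.
 */
static uint32_t
toy_to_lower(uint32_t cp)
{
    size_t lo = 0;
    size_t hi = sizeof(toy_lower_ranges) / sizeof(toy_lower_ranges[0]);

    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;
        const case_range *r = &toy_lower_ranges[mid];

        if (cp < r->first)
            hi = mid;           /* continue in the left half */
        else if (cp > r->last)
            lo = mid + 1;       /* continue in the right half */
        else
            return (uint32_t) ((int32_t) cp + r->delta);
    }
    return cp;
}
```

Because one entry covers a whole run of code points, the table searched this way has far fewer entries than a per-code-point table would.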
I considered it a 4th approach because of the generated
Hi Jeff,
06.02.2025 00:46, Jeff Davis wrote:
On Tue, 2025-02-04 at 23:19 +0300, Alexander Borisov wrote:
I've done many different experiments and everywhere the result is
within the margin of the v2 patch result.
Great, thank you for working on this!
There doesn't appear to be
by uint8*n.
Thanks, after the weekend I'll send an updated patch that takes into
account the comments/advice.
--
SberTech
Alexander Borisov
10.12.2024 13:59, Victor Yegorov wrote:
Thu, 5 Dec 2024 at 17:02, Alexander Borisov <lex.bori...@gmail.com>:
[..]
Hey, I had a look at this patch and found its functionality mature and
performant.
As Peter mentioned pguri, I used it to compare with the proposed
Hi Daniel,
06.12.2024 16:46, Daniel Gustafsson wrote:
On 6 Dec 2024, at 13:59, Alexander Borisov wrote:
As I've written before, there is a difference between parsing URLs
according to the RFC 3986 specification and WHATWG URLs. This is
especially true for host. Here are a couple
Sorry, I made a mistake in the code. It's not worth looking at this patch yet.
29.01.2025 23:23, Alexander Borisov wrote:
Hi, hackers!
I propose to consider a simple optimization for Unicode case tables.
The main changes affect the generate-unicode_case_table.pl file.
Because of the mod
19.02.2025 01:02, Jeff Davis wrote:
On Tue, 2025-02-11 at 23:08 +0300, Alexander Borisov wrote:
I tried the approach via a range table. The result was worse than
without the table. With branching in a function, the result is
better.
Patch v3 — ranges binary search by branches.
Patch v4
19.02.2025 01:56, Jeff Davis wrote:
On Wed, 2025-02-19 at 01:54 +0300, Alexander Borisov wrote:
In proposing the patch for v3, I struck a balance between improving
performance and reducing binary size, without sacrificing code
clarity.
Fair enough. I will continue reviewing v3.
Did you have
12.03.2025 19:55, Alexander Borisov wrote:
[...]
A couple questions:
* Is there a reason the fast-path for codepoints < 0x80 is in
unicode_case.c rather than unicode_case_func.h?
Yes, this is an important optimization, below are benchmarks that
[...]
I forgot to add the benchm
15.03.2025 23:07, Jeff Davis wrote:
On Fri, 2025-03-14 at 15:00 +0300, Alexander Borisov wrote:
I tried adding a loop to create tables, and everything looks fine
(v7).
[...]
I prefer to generalize when we have the other code in place. As it was,
it was a bit confusing why the extra
me from the commit message nor the
skimming the original thread, whether the perf improvement numbers
listed by Alexander also apply to lower() and upper(), or if they only
apply to casefold():
On Sun, 4 May 2025 at 00:32, Alexander Borisov wrote:
ASCII by ≈10%
Cyrillic by ≈80%
Unicode in general by
e in this area.
But again, I'm new to the Postgres community and I'm getting to know
what's going on here and how it works.
Thank you for paying attention to it!
--
Regards,
Alexander Borisov
d want to understand.
Commit:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=27bdec06841d1bb004ca7627eac97808b08a7ac7
I am now actively working on a major improvement to Unicode
Normalization Forms.
Thanks!
--
Regards,
Alexander Borisov
algorithms.
As a result, the functions lower(), upper(), and casefold() got a
significant boost.
--
Regards,
Alexander Borisov
u for clarifying!
Users are not interested in performance gains.
Then it's not worth considering. Sorry to interrupt.
--
Regards,
Alexander Borisov