04.05.2025 12:19, Jelte Fennema-Nio wrote:
On Sun, 4 May 2025 at 03:21, Bruce Momjian <br...@momjian.us> wrote:
So the logic is something I posted to this thread already:
So, a few things. First, these set of commits was in a group of 10
that
I added since there have been complaints in the past that optimizer
improvements were not listed and therefore patch authors were not given
sufficient credit. That means the 209 item count for PG 18 is 10
higher
than my normal filtering would produce.
Second, looking at the items, these are a case of "X is faster", which
we don't normally mention in the release notes. We normally mention
"faster" when it is so much faster that use cases which were not
possible before might be possible now, so it is recommended to retest.
That is what I saw this grouped item as, whereas I don't think the
individual items meet that criteria.
Let me start off the yearly thread of people saying they disagree with
this filtering logic. I think there's an important utility of the
Release Notes that these logic is not covering well:
Many people read the release notes to see if upgrading is worth the
hassle & risk for them specifically. The aggregate of some small
performance improvements that apply to their queries could very well
push them over the edge. These performance improvements don't need to
"allow any new use cases" for that to be the case.
The filtering that you currently do makes the release notes much less
useful for people using the release notes for this purpose. Users
might very well care more about ~10% perf improvement for a feature
they use heavily, than all of the newly added SQL syntax combined.
So, users are interested in performance in the sense it makes use cases
possible, and if your commit is making the case folding useful, we
should mention it in the release notes. I don't think making it
separate would fit though.
For this specific commit, I think if it had only changed the
performance of casefold(), then I'd agree that it should be grouped
with the casefold addition in the release notes. My reasoning would be
that there's no "diff" in performance since the previous release,
because the function did not exist in the previous release. So the
perf improvements are simply part of the "initial implementation" of
casefold from a user perspective.
However since this commit also impacts the very commonly used lower()
and upper() functions, I think that it would make sense if it got its
own entry. It's neither clear for me from the commit message nor the
skimming the original thread, whether the perf improvement numbers
listed by Alexander also apply to lower() and upper(), or if they only
apply to casefold():
On Sun, 4 May 2025 at 00:32, Alexander Borisov <lex.bori...@gmail.com> wrote:
ASCII by ≈10%
Cyrillic by ≈80%
Unicode in general by ≈30%
If they apply the lower() and upper() I definitely think this patch
deserves a place in "General Performance".
I'm actually a bit confused, and didn't expect such a heated discussion
about creating an entry about my patch in Release Notes.
I thought I had made significant Unicode improvements to Postgres.
Namely, significantly reduce the object file size for Unicode Case, and
most importantly increased performance.
Thanks to my approach/algorithm, all Unicode Case related functions got
a significant boost.
Namely: lower(), upper(), casefold(). I have already given the figures.
Why casefold() is the one that got caught here is not clear to me, I
have nothing to do with the implementation of this function.
I improved the overall Unicode Case algorithm, which caused a boost in
all of the listed functions.
I thought it would be useful for users to know that Postgres is
improving in the performance direction, especially in such functions
that are very often used by users to compare text by bringing it to a
certain case.
The discussion made me realize that it's not that important.
I'm not insisting. If it really doesn't matter to users, then it's not
worth discussing.
Thank you for your attention.
I will continue to improve Postgres.
--
Regards,
Alexander Borisov