Hi Joe,

Yes, it seems that the issue is indeed related to that glibc commit. I also found a bug report that highlights a similar problem:
https://sourceware.org/bugzilla/show_bug.cgi?id=18441

Unfortunately, the situation doesn't look promising. As Carlos O'Donell mentioned in his last comment back in 2019:

Carlos O'Donell 2019-05-09 20:44:56 UTC
> Hello. Is there any chance that the issues will be fixed? Unfortunately
> PostgreSQL Is unable to use ICU for some basic features (e.g., in the
> analyze operation).

"We haven't had anyone working on strcoll_l performance improvements. So
it's unlikely that this will get merged or reviewed any time soon."

On Mon, 7 Oct 2024 at 19:48, Joe Conway <m...@joeconway.com> wrote:

> On 10/6/24 14:13, Tom Lane wrote:
> > Joe Conway <m...@joeconway.com> writes:
> >> This is not surprising. There is a performance regression that started
> >> in glibc 2.21 with regard to sorting unicode. Test with RHEL 7.x (glibc
> >> 2.17) and I bet you will see comparable results to ICU. The best answer
> >> in the long term, IMHO, is likely to use the new built-in collation just
> >> released in Postgres 17.
> >
> > It seems unrelated to unicode though --- I also reproduced the issue
> > in a database with LATIN1 encoding.
> >
> > Whatever, it is pretty awful, but the place to be complaining to
> > is the glibc maintainers. Not much we can do about it.
>
> Yeah, my reply was imprecise.
>
> The regression was to strcoll in general. Specifically this commit which
> purports to improve performance but demonstrably causes massive
> regressions:
>
> https://sourceware.org/git/?p=glibc.git;a=commit;h=0742aef6
>
> --
> Joe Conway
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com
>

--
Best regards,
Andrey Stikheev