Re: [PING] fallocate() causes btrfs to never compress postgresql files

2025-06-02 Thread Jakub Wartak
On Sat, May 31, 2025 at 4:33 PM Tomas Vondra wrote: > > On 5/31/25 16:00, Thomas Munro wrote: > > On Fri, May 30, 2025 at 3:58 AM Dimitrios Apostolou wrote: > >> All I'm saying is that this is a regression for PostgreSQL users that keep > >> tablespaces on compressed Btrfs. What could be done fro

Re: Better HINT message for "unexpected data beyond EOF"

2025-05-26 Thread Jakub Wartak
On Fri, Apr 4, 2025 at 12:55 PM Jakub Wartak wrote: Hi [..] > OK, so attached is a small patch to eradicate this HINT: CI tested, > verified using original reproducer, registered in cf app. v2 attached, rebased and tested. -J. v2-0001-Remove-HINT-message-for-unexpected-data-beyond-EO

xlogrecovery.c:WaitForWALToBecomeAvailable() - make "switched WAL source" visible by default?

2025-05-15 Thread Jakub Wartak
Hi, In presence of restore_command being configured, physical standby can either use it to restore archive logs to get WAL OR we can stream WAL from the primary. If we are streaming from primary we get: "started streaming WAL from primary" clear log message. In case of using restore_command we do

NUMA shared memory interleaving

2025-04-16 Thread Jakub Wartak
Thanks to having pg_numa.c, we can now simply address problem#2 of NUMA imbalance from [1] pages 11-14, by interleaving shm memory in PG19 - patch attached. We do not need to call numa_set_localalloc() as we only interleave shm segments, while local allocations stay the same (well, "local" means re

Re: Draft for basic NUMA observability

2025-04-10 Thread Jakub Wartak
On Mon, Apr 7, 2025 at 9:51 PM Tomas Vondra wrote: > > So it looks like that the new way to iterate on the buffers that has been > > introduced > > in v26/v27 has some issue? > > > > Yeah, the calculations of the end pointers were wrong - we need to round > up (using TYPEALIGN()) when calculatin

Re: Draft for basic NUMA observability

2025-04-09 Thread Jakub Wartak
On Wed, Apr 9, 2025 at 12:48 AM Tomas Vondra wrote: > On 4/8/25 16:59, Andres Freund wrote: > > Hi, > > > > On 2025-04-08 09:35:37 -0400, Andres Freund wrote: > >> On April 8, 2025 9:21:57 AM EDT, Tomas Vondra wrote: > >>> On 4/8/25 15:06, Andres Freund wrote: > On 2025-04-08 17:44:19 +0500,

Re: BAS_BULKREAD vs read stream

2025-04-08 Thread Jakub Wartak
On Sun, Apr 6, 2025 at 10:15 PM Andres Freund wrote: > > Hi, [..] > The obvious solution to that would be to increase BAS_BULKREAD substantially > above 256kB. > > For quite a while was worried about increasing the size, because somewhere (I > couldn't find it while writing this email, will add th

Re: Draft for basic NUMA observability

2025-04-07 Thread Jakub Wartak
On Mon, Apr 7, 2025 at 11:27 PM Tomas Vondra wrote: > > Hi, > > I've pushed all three parts of v29, with some additional corrections > (picked lower OIDs, bumped catversion, fixed commit messages). Hi Tomas, great, awesome! (this is an awesome feeling)! Thank You for such incredible support on th

Re: Draft for basic NUMA observability

2025-04-07 Thread Jakub Wartak
On Mon, Apr 7, 2025 at 11:53 AM Bertrand Drouvot wrote: > > Hi, > > On Mon, Apr 07, 2025 at 10:09:26AM +0200, Jakub Wartak wrote: > > Bertrand noticed this first in > > https://www.postgresql.org/message-id/Z/FhOOCmTxuB2h0b%40ip-10-97-1-34.eu

Re: Draft for basic NUMA observability

2025-04-07 Thread Jakub Wartak
On Sun, Apr 6, 2025 at 3:52 PM Tomas Vondra wrote: > > On 4/6/25 14:57, Jakub Wartak wrote: > > On Sun, Apr 6, 2025 at 12:29 AM Andres Freund wrote: > >> > >> Hi, > > > > Hi Andres/Tomas, > > > > I've noticed that Tomas responded to t

Re: Draft for basic NUMA observability

2025-04-06 Thread Jakub Wartak
On Sun, Apr 6, 2025 at 12:29 AM Andres Freund wrote: > > Hi, Hi Andres/Tomas, I've noticed that Tomas responded to this while writing this, so I'm attaching git-am patches based on his v25 (no squash) and there's only one new (last one contains fixes based on this review) + slight commit amendme

Re: Draft for basic NUMA observability

2025-04-05 Thread Jakub Wartak
On Thu, Apr 3, 2025 at 10:23 AM Bertrand Drouvot wrote: > > Hi, Hi Bertrand, > On Thu, Apr 03, 2025 at 09:01:43AM +0200, Jakub Wartak wrote: [..] > === v21-0002 > While pg_buffercache_build_tuple() is not added (pg_buffercache_save_tuple() > is). Fixed > About v21-0002

Re: Draft for basic NUMA observability

2025-04-05 Thread Jakub Wartak
On Mon, Mar 17, 2025 at 5:11 PM Bertrand Drouvot wrote: > Thanks for v13! Rebased and fixes inside in the attached v14 (it passes CI too): > Looking at 0003: > > === 1 > > + NUMA mappings for shared memory allocations > > s/NUMA mappings/NUMA node mappings/ maybe? Done. > === 2 > > + >

Re: Draft for basic NUMA observability

2025-04-05 Thread Jakub Wartak
On Wed, Apr 2, 2025 at 5:27 PM Bertrand Drouvot wrote: > > Hi Jakub, Hi Bertrand, > > OK, but I still fail to grasp why pg_indent doesn't fix this stuff on > > its own... I believe original indent would fix this on its own? > > My comment was not about indention but about the fact that I think t

Re: Draft for basic NUMA observability

2025-04-04 Thread Jakub Wartak
On Wed, Apr 2, 2025 at 6:40 PM Tomas Vondra wrote: Hi Tomas, > OK, so you agree the commit messages are complete / correct? Yes. > OK. FWIW if you disagree with some of my proposed changes, feel free to > push back. I'm sure some may be more a matter of personal preference. No, it's all fine.

Re: Draft for basic NUMA observability

2025-04-04 Thread Jakub Wartak
On Fri, Apr 4, 2025 at 4:36 PM Tomas Vondra wrote: Hi Tomas, > Do you have any suggestions regarding the column names in the new view? > I'm not sure I like node_id and page_num. They actually look good to me. We've discussed earlier dropping s/numa_//g for column names (after all views contain

Re: Better HINT message for "unexpected data beyond EOF"

2025-04-04 Thread Jakub Wartak
On Tue, Apr 1, 2025 at 3:59 PM Andres Freund wrote: Hi Robert, Andres, Christoph, > On 2025-04-01 09:49:12 -0400, Robert Haas wrote: > > On Tue, Apr 1, 2025 at 7:13 AM Jakub Wartak > > wrote: > > > Thread bump. So we have the following candidates: > > > >

Re: Draft for basic NUMA observability

2025-04-04 Thread Jakub Wartak
On Fri, Apr 4, 2025 at 8:50 AM Bertrand Drouvot wrote: > > Hi, > > On Thu, Apr 03, 2025 at 08:53:57PM +0200, Tomas Vondra wrote: > > On 4/3/25 15:12, Jakub Wartak wrote: > > > On Thu, Apr 3, 2025 at 1:52 PM Tomas Vondra wrote: > > > > > >> ...

Re: Draft for basic NUMA observability

2025-04-03 Thread Jakub Wartak
On Thu, Apr 3, 2025 at 1:52 PM Tomas Vondra wrote: > On 4/3/25 09:01, Jakub Wartak wrote: > > On Wed, Apr 2, 2025 at 6:40 PM Tomas Vondra wrote: Hi Tomas, Here's v23 attached (had to rework it because you sent v22 just the moment I wanted to send it). Changes include:

Re: Draft for basic NUMA observability

2025-04-03 Thread Jakub Wartak
On Thu, Apr 3, 2025 at 2:15 PM Tomas Vondra wrote: > Ah, OK. Jakub, can you correct (and double-check) this in the next > version of the patch? Done. > > About v21-0002: > > > > === 1 > > > > I can see that the pg_buffercache_init_entries() helper comments are added > > in > > v21-0003 but I t

Re: Draft for basic NUMA observability

2025-04-03 Thread Jakub Wartak
"pg_shm_numa_allocations" OPEN_QUESTION: To be honest, I'm not attached to any of those two (or naming things in general), I can change if you want. 13) In the patch: "review: What if we get multiple pages per buffer (the default). Could we get multiple nodes per buffer?"

Re: Draft for basic NUMA observability

2025-04-02 Thread Jakub Wartak
On Tue, Apr 1, 2025 at 5:13 PM Bertrand Drouvot wrote: > > Hi Jakub, > > On Tue, Apr 01, 2025 at 12:56:06PM +0200, Jakub Wartak wrote: > > On Mon, Mar 31, 2025 at 4:59 PM Bertrand Drouvot > > wrote: > > > > > Hi, > > > > Hi Bertrand, happ

Re: Better HINT message for "unexpected data beyond EOF"

2025-04-01 Thread Jakub Wartak
On Thu, Mar 27, 2025 at 4:00 PM Christoph Berg wrote: > > Re: Robert Haas > > I think that would be better than what we have now, but I still wonder > > if we should give some kind of a hint that an external process may be > > doing something to that file. Jakub and I may be biased by having just

Re: Draft for basic NUMA observability

2025-04-01 Thread Jakub Wartak
On Mon, Mar 31, 2025 at 4:59 PM Bertrand Drouvot wrote: > Hi, Hi Bertrand, happy to see you back, thanks for review and here's v18 attached (an ideal fit for PG18 ;)) > On Mon, Mar 31, 2025 at 11:27:50AM +0200, Jakub Wartak wrote: > > On Thu, Mar 27, 2025 at 2:40 PM And

Re: Draft for basic NUMA observability

2025-03-31 Thread Jakub Wartak
On Thu, Mar 27, 2025 at 2:40 PM Andres Freund wrote: > > Hi, Hi Andres, > On 2025-03-27 14:02:03 +0100, Jakub Wartak wrote: > >setup_additional_packages_script: | > > -#apt-get update > > -#DEBIAN_FRONTEND=noninteractive apt-get -y install

Re: Draft for basic NUMA observability

2025-03-31 Thread Jakub Wartak
On Thu, Mar 27, 2025 at 2:15 PM Álvaro Herrera wrote: > Hello Good morning :) > I think you should remove numa_warn() and numa_error() from 0001. > AFAICS they are dead code (even with all your patches applied), and > furthermore would get you in trouble regarding memory allocation because > sr

Re: Draft for basic NUMA observability

2025-03-27 Thread Jakub Wartak
On Thu, Mar 27, 2025 at 12:31 PM Nazir Bilal Yavuz wrote: > > Hi, > > Thank you for working on this! > > On Wed, 19 Mar 2025 at 12:06, Jakub Wartak > wrote: > > > > On Tue, Mar 18, 2025 at 3:29 PM Bertrand Drouvot > > wrote: > > > > Hi! v15 a

Re: Better HINT message for "unexpected data beyond EOF"

2025-03-27 Thread Jakub Wartak
On Wed, Mar 26, 2025 at 4:01 PM Robert Haas wrote: [..] > > so how about: > > -HINT: This has been seen to occur with buggy kernels; consider > > updating your system. > > +HINT: This has been observed with files being overwritten, buggy > > kernels and potentially other external file system inf

Better HINT message for "unexpected data beyond EOF"

2025-03-26 Thread Jakub Wartak
I would like to propose that we tweak the following error message: ERROR: unexpected data beyond EOF in block 1472 of relation base/5/16387 HINT: This has been seen to occur with buggy kernels; consider updating your system. to something more generic and less confusing. It is coming from ffae5c

Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

2025-03-24 Thread Jakub Wartak
On Fri, Mar 14, 2025 at 11:31 AM vignesh C wrote: > > On Wed, 5 Mar 2025 at 11:46, Amit Kapila wrote: > > > > On Tue, Mar 4, 2025 at 4:44 PM Jakub Wartak > > > > I can go with the last patch as you observed that in a real-world > > case, and we can look at oth

Re: AIO v2.5

2025-03-20 Thread Jakub Wartak
On Tue, Mar 18, 2025 at 9:12 PM Andres Freund wrote: > > Hi, > > Attached is v2.10, with the following changes: > > - committed core AIO infrastructure patch Hi, yay, It's happening.jpg ;) Some thoughts about 2.10-0004: What do you think about putting there into (io_uring patch) info about the

Re: Draft for basic NUMA observability

2025-03-19 Thread Jakub Wartak
On Tue, Mar 18, 2025 at 3:29 PM Bertrand Drouvot wrote: Hi! v15 attached, rebased, CI-tested, all fixes incorporated. > > I've adjusted it all and settled on "numa_node_id" column name. > [...] > I think that we can get rid of the "numa_" stuff in column(s) name as > the column(s) are part of "n

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-03-18 Thread Jakub Wartak
On Mon, Mar 17, 2025 at 10:46 PM Melanie Plageman wrote: > > On Mon, Mar 17, 2025 at 2:55 PM Andres Freund wrote: > > > > On 2025-03-17 14:52:02 -0400, Melanie Plageman wrote: > > > I don't feel strongly that we need to be as rigorous for > > > maintenance_io_concurrency, but I'm also not sure 16

Re: Draft for basic NUMA observability

2025-03-17 Thread Jakub Wartak
On Fri, Mar 14, 2025 at 1:08 PM Bertrand Drouvot wrote: > On Fri, Mar 14, 2025 at 11:05:28AM +0100, Jakub Wartak wrote: > > On Thu, Mar 13, 2025 at 3:15 PM Bertrand Drouvot > > wrote: > > > > Hi, > > > > Thank you very much for the review! I'm ans

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-03-17 Thread Jakub Wartak
On Thu, Mar 13, 2025 at 9:34 PM Melanie Plageman wrote: > On Thu, Mar 13, 2025 at 5:46 AM Jakub Wartak > wrote: > > > > Cool, anything > 1 is just better. Just quick question, so now we have: > > > > #define DEFAULT_EFFECTIVE_IO_CONCURRENCY 16 > > #defin

Re: Draft for basic NUMA observability

2025-03-14 Thread Jakub Wartak
On Thu, Mar 13, 2025 at 3:15 PM Bertrand Drouvot wrote: Hi, Thank you very much for the review! I'm answering both reviews in one go and the result is the attached v12; it seems it all should be solved now: > > > === 2 > > > > > > +else > > > + as_fn_error $? "header file is required for --with-

Re: Draft for basic NUMA observability

2025-03-13 Thread Jakub Wartak
On Wed, Mar 12, 2025 at 4:41 PM Jakub Wartak wrote: > > On Mon, Mar 10, 2025 at 11:14 AM Bertrand Drouvot > wrote: > > > Thanks for the new version! > > v10 is attached with most fixes after review and one new thing > introduced: pg_numa_available() for run-time decisi

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-03-13 Thread Jakub Wartak
On Wed, Mar 12, 2025 at 9:02 PM Melanie Plageman wrote: > > Thanks for taking a look. I've pushed the patch to increase the > default effective_io_concurrency. Cool, anything > 1 is just better. Just quick question, so now we have: #define DEFAULT_EFFECTIVE_IO_CONCURRENCY 16 #define DEFAULT_MAIN

Re: Draft for basic NUMA observability

2025-03-12 Thread Jakub Wartak
On Mon, Mar 10, 2025 at 11:14 AM Bertrand Drouvot wrote: > Thanks for the new version! v10 is attached with most fixes after review and one new thing introduced: pg_numa_available() for run-time decision inside tests which was needed after simplifying code a little bit as you wanted. I've also f

Re: AIO v2.5

2025-03-11 Thread Jakub Wartak
On Tue, Mar 4, 2025 at 8:00 PM Andres Freund wrote: > Attached is v2.5 of the AIO patchset. [..] Hi, Thanks for working on this! > Questions: > > - My current thinking is that we'd set io_method = worker initially - so we > actually get some coverage - and then decide whether to switch to >

Re: Draft for basic NUMA observability

2025-03-07 Thread Jakub Wartak
On Fri, Mar 7, 2025 at 11:20 AM Jakub Wartak wrote: > > Hi, > On Wed, Mar 5, 2025 at 10:30 AM Jakub Wartak > wrote: > >Hi, > > > > Yeah, that's why I was mentioning to use a "shared" > > > populate_buffercache_entry() > > > or such

Re: AIO v2.5

2025-03-07 Thread Jakub Wartak
On Thu, Mar 6, 2025 at 2:13 PM Andres Freund wrote: > On 2025-03-06 12:36:43 +0100, Jakub Wartak wrote: > > On Tue, Mar 4, 2025 at 8:00 PM Andres Freund wrote: > > > Questions: > > > > > > - My current thinking is that we'd set io_method = worker init

Re: Draft for basic NUMA observability

2025-03-07 Thread Jakub Wartak
Hi, On Wed, Mar 5, 2025 at 10:30 AM Jakub Wartak wrote: >Hi, > > Yeah, that's why I was mentioning to use a "shared" > > populate_buffercache_entry() > > or such function: to put the "duplicated" code in it and then use this > > shared fu

Re: Draft for basic NUMA observability

2025-03-05 Thread Jakub Wartak
On Tue, Mar 4, 2025 at 5:02 PM Bertrand Drouvot wrote: > > Cool! Attached is v7 > Thanks for the new version! ... and another one: 7b ;) > > > === 2 [..] > > Well, I've made query_numa a parameter there simply to avoid that code > > duplication in the first place, look at those TupleDescInitEnt

Re: FileFallocate misbehaving on XFS

2025-03-04 Thread Jakub Wartak
On Fri, Jan 31, 2025 at 3:33 PM Jean-Christophe Arnu wrote: > > Hello Mike, > > We encountered the same problem with a fixed allocsize=262144k. Removing this > option seemed to fix the problem. We are now in an XFS managed allocation > heuristic way. The problem does not show up since the change

Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

2025-03-04 Thread Jakub Wartak
Hi, On Tue, Mar 4, 2025 at 4:59 AM Amit Kapila wrote: > > > > > > Sure thing. I've just added '(..) In the extreme cases this can..' as > > > it is pretty rare to hit it. Patch attached. > > > > When the clock moves forward or backward, couldn't it affect > > not only the standby but also the pr

Re: Draft for basic NUMA observability

2025-03-04 Thread Jakub Wartak
> #include "postgres.h" > > Is this new include needed? Removed, don't remember how it arrived here, must have been some artifact of earlier attempts. > #include "access/htup_details.h" > @@ -13,10 +14,12 @@ > #include "funcapi.h" >

Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

2025-03-02 Thread Jakub Wartak
Hi Amit, On Mon, Mar 3, 2025 at 6:26 AM Amit Kapila wrote: [..] OK, sure. > How about something like: "Note that if the clock on standby is moved > ahead or backward, the feedback message may not be sent at the > required interval. This can lead to prolonged risk of not removing > dead rows on

Re: Draft for basic NUMA observability

2025-02-27 Thread Jakub Wartak
On Wed, Feb 26, 2025 at 6:13 PM Bertrand Drouvot wrote: [..] > > Meanwhile v5 is attached with slight changes to try to make cfbot happy: > > Thanks for the updated version! > > FWIW, I had to do a few changes to get an error free compiling experience with > autoconf/or meson and both with or with

Re: Draft for basic NUMA observability

2025-02-26 Thread Jakub Wartak
On Wed, Feb 26, 2025 at 10:58 AM Andres Freund wrote: > > Hi, > > On 2025-02-26 09:38:20 +0100, Jakub Wartak wrote: > > > FWIW, what you posted fails on CI: > > > https://cirrus-ci.com/task/5114213770723328 > > > > > > Probably some ifdefs ar

Re: Draft for basic NUMA observability

2025-02-26 Thread Jakub Wartak
On Mon, Feb 24, 2025 at 5:11 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Feb 24, 2025 at 09:06:20AM -0500, Andres Freund wrote: > > Does the issue with "new" backends seeing pages as not present exist both > > with > > and without huge pages? > > That's a good point and from what I can see it'

Re: Draft for basic NUMA observability

2025-02-26 Thread Jakub Wartak
On Mon, Feb 24, 2025 at 3:06 PM Andres Freund wrote: > On 2025-02-24 12:57:16 +0100, Jakub Wartak wrote: Hi Andres, thanks for your review! OK first sane version attached with new src/port/pg_numa.c boilerplate in 0001. Fixed some bugs too, there is one remaining optimization to be done (

Re: Draft for basic NUMA observability

2025-02-24 Thread Jakub Wartak
Hi Bertrand, TL;DR; the main problem seems to be choosing which way to page-fault the shared memory before the backend is going to use numa_move_pages(), as the memory mappings (fresh after fork()/CoW) seem to be not ready for numa_move_pages() inquiry. On Thu, Feb 20, 2025 at 9:32 AM Bertrand Drouvot

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-02-19 Thread Jakub Wartak
On Fri, Feb 14, 2025 at 7:16 PM Andres Freund wrote: Hi, > On 2025-02-14 18:36:37 +0100, Tomas Vondra wrote: > > All of this is true, ofc, but maybe it's better to have a tool providing > > at least some advice > > I agree, a tool like that would be useful! > > One difficulty is that the relevan

Re: pg_upgrade fails for PostGIS custom SRIDs

2025-02-19 Thread Jakub Wartak
On Wed, Feb 19, 2025 at 11:27 AM Jakub Wartak wrote: > > Hi pg-hackers and postgis-hackers, sorry for cross-posting but we > think it really affects how those two products work together. This is > about pg_upgrade failure for tables with custom SRIDs. > > PostGIS is an ext

pg_upgrade fails for PostGIS custom SRIDs

2025-02-19 Thread Jakub Wartak
Hi pg-hackers and postgis-hackers, sorry for cross-posting but we think it really affects how those two products work together. This is about pg_upgrade failure for tables with custom SRIDs. PostGIS is an extension that uses a special table called public.spatial_ref_sys for handling available spat

Re: Draft for basic NUMA observability

2025-02-17 Thread Jakub Wartak
On Thu, Feb 13, 2025 at 4:28 PM Bertrand Drouvot wrote: Hi Bertrand, Thanks for playing with this! > Which makes me wonder if using numa_move_pages()/move_pages is the right > approach. Would be curious to know if you observe the same behavior though. You are correct, I'm observing identical

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-02-14 Thread Jakub Wartak
On Wed, Feb 12, 2025 at 9:57 PM Robert Haas wrote: > > On Wed, Feb 12, 2025 at 3:07 PM Tomas Vondra wrote: > > AFAICS the "1" value is simply one of the many "defensive" defaults in > > our sample config. It's much more likely to help than cause harm, even > > on smaller/older systems, but for ma

Re: Allow io_combine_limit up to 1MB

2025-02-14 Thread Jakub Wartak
On Wed, Feb 12, 2025 at 1:03 AM Andres Freund wrote: > > Hi, > > On 2025-02-11 13:12:17 +1300, Thomas Munro wrote: > > Tomas queried[1] the limit of 256kB (or really 32 blocks) for > > io_combine_limit. Yeah, I think we should increase it and allow > > experimentation with larger numbers. Note t

Re: AIO v2.3

2025-02-13 Thread Jakub Wartak
On Tue, Feb 11, 2025 at 12:10 AM Andres Freund wrote: >> TLDR; in terms of SELECTs the master vs aioworkers looks very solid! > Phew! Weee! Yay. Another good news: I've completed a full 24h pgbench run on the same machine and it did not fail or report anything suspicious. FYI, patchset didn't n

Re: hash_search_with_hash_value is high in "perf top" on a replica

2025-02-10 Thread Jakub Wartak
Hi Thomas! On Tue, Feb 4, 2025 at 10:22 PM Thomas Munro wrote: > > On Sun, Feb 2, 2025 at 3:44 AM Ants Aasma wrote: > > The other direction is to split off WAL decoding, buffer lookup and maybe > > even pinning to a separate process from the main redo loop. > > Hi Ants, > [..] > An assumption I

Draft for basic NUMA observability

2025-02-07 Thread Jakub Wartak
As I have promised to Andres on the Discord hacking server some time ago, I'm attaching the very brief (and potentially way too rushed) draft of the first step into NUMA observability on PostgreSQL that was based on his presentation [0]. It might be rough, but it is to get us started. The patches w

Re: Trim the heap free memory

2025-01-21 Thread Jakub Wartak
On Sun, Dec 8, 2024 at 7:48 PM Tomas Vondra wrote: [..] > >> I have previously encountered situations where the non-garbage-collected > >> memory of wal_sender was approximately hundreds of megabytes or even > >> exceeded 1GB, but I was unable to reproduce this situation using simple > >> SQL. The

Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)

2025-01-09 Thread Jakub Wartak
On Thu, Jan 9, 2025 at 4:12 AM Thomas Munro wrote: > > On Tue, Jan 7, 2025 at 9:54 AM Thomas Munro wrote: > > On Tue, Jan 7, 2025 at 5:23 AM Andrew Dunstan wrote: > > > Do you have a plan for moving ahead with this? > > > > I think that all looks good, and I'll go ahead and commit it in the > >

Re: AIO v2.0

2025-01-08 Thread Jakub Wartak
On Mon, Jan 6, 2025 at 5:28 PM Andres Freund wrote: > > Hi, > > On 2024-12-19 17:29:12 -0500, Andres Freund wrote: > > > Not about patch itself, but questions about related stack functionality: > > > --

Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

2025-01-08 Thread Jakub Wartak
On Wed, Dec 18, 2024 at 10:33 AM Amit Kapila wrote: Hi Amit! > On Thu, Dec 5, 2024 at 3:14 PM Jakub Wartak > wrote: > > > > One of our customers ran into a very odd case, where hot standby feedback > > backend_xmin propagation stopped working due to major (hours/days

Re: FileFallocate misbehaving on XFS

2024-12-20 Thread Jakub Wartak
On Thu, Dec 19, 2024 at 7:49 AM Michael Harris wrote: > Hello, > > I finally managed to get the patched version installed in a production > database where the error is occurring very regularly. > > Here is a sample of the output: > > 2024-12-19 01:08:50 CET [2533222]: LOG: mdzeroextend FileFall

Re: FileFallocate misbehaving on XFS

2024-12-16 Thread Jakub Wartak
On Thu, Dec 12, 2024 at 12:50 AM Andres Freund wrote: > Hi, > > FWIW, I tried fairly hard to reproduce this. > Same, but without PG and also without much success. I've also tried to push the AGs (with just one or two AGs created via mkfs) to contain only small size extents (by creating hundreds

Re: FileFallocate misbehaving on XFS

2024-12-11 Thread Jakub Wartak
On Wed, Dec 11, 2024 at 4:00 AM Michael Harris wrote: > Hi Jakub > > On Tue, 10 Dec 2024 at 22:36, Jakub Wartak > wrote: [..] > > > 3. Maybe somehow there is a bigger interaction between posix_fallocate() > and delayed XFS's dynamic speculative preallocation from

Re: FileFallocate misbehaving on XFS

2024-12-10 Thread Jakub Wartak
On Tue, Dec 10, 2024 at 7:34 AM Michael Harris wrote: Hi Michael, 1. Well it doesn't look like XFS AG fragmentation to me (we had a customer with a huge number of AGs with small space in them) reporting such errors after upgrading to 16, but not for earlier versions (somehow posix_fallocate() h

Re: doc: Mention clock synchronization recommendation for hot_standby_feedback

2024-12-09 Thread Jakub Wartak
Hi Euler!, On Thu, Dec 5, 2024 at 4:07 PM Euler Taveira wrote: > On Thu, Dec 5, 2024, at 6:43 AM, Jakub Wartak wrote: > > One of our customers ran into a very odd case, where hot standby feedback > backend_xmin propagation stopped working due to major (hours/days) clock >

Re: FileFallocate misbehaving on XFS

2024-12-09 Thread Jakub Wartak
On Mon, Dec 9, 2024 at 10:19 AM Michael Harris wrote: Hi Michael, We found this thread describing similar issues: > > > https://www.postgresql.org/message-id/flat/AS1PR05MB91059AC8B525910A5FCD6E699F9A2%40AS1PR05MB9105.eurprd05.prod.outlook.com > We've got some case in the past here in EDB, wher

doc: Mention clock synchronization recommendation for hot_standby_feedback

2024-12-05 Thread Jakub Wartak
One of our customers ran into a very odd case, where hot standby feedback backend_xmin propagation stopped working due to major (hours/days) clock time shifts on hypervisor-managed VMs. This happens (and is fully reproducible) e.g. in scenarios where standby connects and its own VM is having time f

Re: Windows pg_basebackup unable to create >2GB pg_wal.tar tarballs ("could not close file: Invalid argument" when creating pg_wal.tar of size ~ 2^31 bytes)

2024-11-22 Thread Jakub Wartak
Hi Thomas! On Thu, Nov 21, 2024 at 2:38 PM Thomas Munro wrote: > On Thu, Nov 21, 2024 at 11:44 PM Jakub Wartak > wrote: > > This literally looks like something like off_t/size_t would be limited > to 2^31 somewhere. > > off_t is 32 bits on Windows. I'd be quite susp

Re: AIO v2.0

2024-11-18 Thread Jakub Wartak
On Fri, Sep 6, 2024 at 9:38 PM Andres Freund wrote: > Hi, > > Attached is the next version of the patchset. (..) Hi Andres, Thank you for your admirable persistence on this. Please do not take it as criticism, but more as a set of questions regarding the patchset v2.1 that I finally got litt

Re: pg_combinebackup --incremental

2024-11-13 Thread Jakub Wartak
On Mon, Nov 4, 2024 at 6:53 PM Robert Haas wrote: Hi Robert, [..] 1. Well, I also have the same bug as Bertrand, which seems to be because MacOS was used for development rather than Linux (and thus MacOS doesn't have copy_file_range(2)/HAVE_COPY_FILE_RANGE) --> I've simply fallen back to undefHAVE_C

failed optimization attempt for ProcArrayGroupClearXid(): using custom PGSemaphores that use __atomics and futexes batching via IO_URING

2024-11-07 Thread Jakub Wartak
Having got a little energy boost after meeting some VSP (Very Smart People) at the recent PGConfEu, I've attempted to pursue an optimization of an apparently minor (?) inefficiency that I've spotted while researching extremely high active backends on extreme max_connections (measured in thousands active backend

Re: allowing extensions to control planner behavior

2024-10-18 Thread Jakub Wartak
Hi Andrei, On Fri, Oct 11, 2024 at 8:21 AM Andrei Lepikhov wrote: > On 10/10/24 23:51, Robert Haas wrote: > > On Wed, Sep 18, 2024 at 11:48 AM Robert Haas > wrote: > > 1. If you want to specify in-query hints using comments, how does your > > extension get access to the comments? > Having desi

Re: Syncrep and improving latency due to WAL throttling

2024-10-18 Thread Jakub Wartak
nd will be rejected/returned with feedback due to lack of feedback or work, having the record in CF should help others in future using those patches and design discussions as a good starting point. Right now I do not have plans to work on that, but maybe I'll be able to help in future. Reg

Re: allowing extensions to control planner behavior

2024-10-14 Thread Jakub Wartak
002 would be better to have than nothing (well other than pg_hint_plan), as it looks to me that the most frequent workaround for optimizer issues is to just throw 'enable_nestloop = no' into the mix quite often (so having the ability to just throw fixproblem.so into session_preload_libraries with just strstr()/regex() - to match on specific query - and disable it just there seems to be completely achievable and has much better granularity than targeting whole sessions with SET issued there). -Jakub Wartak.

Re: scalability bottlenecks with (many) partitions (and more)

2024-09-23 Thread Jakub Wartak
On Mon, Sep 16, 2024 at 4:19 PM Tomas Vondra wrote: > On 9/16/24 15:11, Jakub Wartak wrote: > > On Fri, Sep 13, 2024 at 1:45 AM Tomas Vondra wrote: > > > >> [..] > > > >> Anyway, at this point I'm quite happy with this improvement. I didn't >

Re: Configurable FP_LOCK_SLOTS_PER_BACKEND

2024-09-23 Thread Jakub Wartak
Good morning! FYI: I know many people are/were tracking this email thread rather than the newer and more recent one "scalability bottlenecks with (many) partitions (and more)", but please see [1] [2] , where Tomas committed enhanced fast-path locking to the master(18). Thanks Tomas for persisten

Re: scalability bottlenecks with (many) partitions (and more)

2024-09-16 Thread Jakub Wartak
On Fri, Sep 13, 2024 at 1:45 AM Tomas Vondra wrote: > [..] > Anyway, at this point I'm quite happy with this improvement. I didn't > have any clear plan when to commit this, but I'm considering doing so > sometime next week, unless someone objects or asks for some additional > benchmarks etc. T

Re: scalability bottlenecks with (many) partitions (and more)

2024-09-06 Thread Jakub Wartak
On Thu, Sep 5, 2024 at 7:33 PM Tomas Vondra wrote: >>> My $0.02 cents: the originating case that triggered those patches, >>> actually started with LWLock/lock_manager waits being the top#1. The >>> operator can cross check (join) that with a group by pg_locks.fastpath >>> (='f'), count(*). So, I

Re: scalability bottlenecks with (many) partitions (and more)

2024-09-04 Thread Jakub Wartak
Hi Tomas! On Tue, Sep 3, 2024 at 6:20 PM Tomas Vondra wrote: > > On 9/3/24 17:06, Robert Haas wrote: > > On Mon, Sep 2, 2024 at 1:46 PM Tomas Vondra wrote: > >> The one argument to not tie this to max_locks_per_transaction is the > >> vastly different "per element" memory requirements. If you ad

Re: Set log_lock_waits=on by default

2024-09-03 Thread Jakub Wartak
On Fri, Jul 19, 2024 at 5:13 PM Robert Haas wrote: > > On Fri, Jul 19, 2024 at 10:22 AM Christoph Berg wrote: [..] > > Let's fix the default. People who have a problem can still disable it, > > but then everyone else gets the useful messages in the first iteration. > > Reasonable. > I have feeli

Re: Redux: Throttle WAL inserts before commit

2024-08-29 Thread Jakub Wartak
ck then and were also based on SyncRepWaitForLSN(), but somehow maybe we ran out of steam and there was not that big interest back then. Maybe you could post a review there (for Tomas's more modern recent patch), if it is helping your use case even today. That way it could get some traction again? -Jakub Wartak.

Re: allowing extensions to control planner behavior

2024-08-28 Thread Jakub Wartak
Hi Robert, On Mon, Aug 26, 2024 at 6:33 PM Robert Haas wrote: > > I'm somewhat expecting to be flamed to a well-done crisp for saying > this, but I think we need better ways for extensions to control the > behavior of PostgreSQL's query planner. [..] > [..] But all that > said, as much as anythi

Re: elog/ereport VS misleading backtrace_function function address

2024-08-26 Thread Jakub Wartak
Hi Robert, On Tue, May 14, 2024 at 5:36 PM Robert Haas wrote: > > On Tue, May 14, 2024 at 6:13 AM Jakub Wartak > wrote: > > OK I'll try to explain using assembly, but I'm not an expert on this. > > Let's go to the 1st post, assume we run with backtrac

Re: Enable data checksums by default

2024-08-22 Thread Jakub Wartak
On Thu, Aug 22, 2024 at 8:11 AM Peter Eisentraut wrote: > > On 15.08.24 08:38, Peter Eisentraut wrote: > > On 08.08.24 19:42, Robert Haas wrote: > >>> I'm thinking pg_upgrade could have a mode where it adds the > >>> checksum during the upgrade as it copies the files (essentially a subset > >>> of

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2024-08-20 Thread Jakub Wartak
On Tue, Aug 20, 2024 at 9:03 AM John Naylor wrote: > This is done. Cool! Thanks John and Robert! :) -J.

debug_palloc_context_threshold - dump code paths leading to memory leaks

2024-08-19 Thread Jakub Wartak
Hi -hackers, From time to time we hit some memory leaks in PostgreSQL and on one occasion Tomas wrote: "I certainly agree it's annoying that when OOM hits, we end up with no information about what used the memory. Maybe we could have a threshold triggering a call to MemoryContextStats?" - see [1]

Re: Enable data checksums by default

2024-08-15 Thread Jakub Wartak
Hi all, On Tue, Aug 13, 2024 at 10:08 PM Robert Haas wrote: > And it's not like we have statistics anywhere that you can look at to > see how much CPU time you spent computing checksums, so if a user DOES > have a performance problem that would not have occurred if checksums > had been disabled,

Re: Enable data checksums by default

2024-08-15 Thread Jakub Wartak
Hi Greg and others On Tue, Aug 13, 2024 at 4:42 PM Greg Sabino Mullane wrote: > > On Thu, Aug 8, 2024 at 6:11 AM Peter Eisentraut wrote: > >> >> My understanding was that the reason for some hesitation about adopting data >> checksums was the performance impact. Not the checksumming itself, bu

Re: Enable data checksums by default

2024-08-15 Thread Jakub Wartak
On Wed, Aug 7, 2024 at 4:18 PM Greg Sabino Mullane wrote: > > On Wed, Aug 7, 2024 at 4:43 AM Michael Banck wrote: >> >> I think the last time we discussed this the consensus was that >> computational overhead of computing the checksums is pretty small for >> most systems (so the above change seems

Re: Doc limitation update proposal: include out-of-line OID usage per TOAST-ed columns

2024-05-20 Thread Jakub Wartak
On Tue, May 14, 2024 at 8:19 PM Robert Haas wrote: > > I looked at your version and wrote something that is shorter and > doesn't touch any existing text. Here it is. Hi Robert, you are a real tactician here - thanks for whatever references the original problem! :) Maybe just slight hint nearby e

Re: elog/ereport VS misleading backtrace_function function address

2024-05-14 Thread Jakub Wartak
Hi Peter! On Sun, May 12, 2024 at 10:33 PM Peter Eisentraut wrote: > > On 07.05.24 09:43, Jakub Wartak wrote: > > NOTE: in case one will be testing this: one cannot ./configure with > > --enable-debug as it prevents the compiler optimizations that actually > > end up

Re: elog/ereport VS misleading backtrace_function function address

2024-05-07 Thread Jakub Wartak
Hi Tom and -hackers! On Thu, Mar 28, 2024 at 7:36 PM Tom Lane wrote: > > Jakub Wartak writes: > > While chasing some other bug I've learned that backtrace_functions > > might be misleading with top elog/ereport() address. > > That was understood from the beginni

Re: apply_scanjoin_target_to_paths and partitionwise join

2024-05-06 Thread Jakub Wartak
Hi Ashutosh & hackers, On Mon, Apr 15, 2024 at 9:00 AM Ashutosh Bapat wrote: > > Here's patch with > [..] > Adding to the next commitfest but better to consider this for the next set of > minor releases. 1. The patch does not pass cfbot - https://cirrus-ci.com/task/5486258451906560 on master du

Re: GUC-ify walsender MAX_SEND_SIZE constant

2024-04-23 Thread Jakub Wartak
Hi, > My understanding of Majid's use-case for tuning MAX_SEND_SIZE is that the > bottleneck is storage, not network. The reason MAX_SEND_SIZE affects that is > that it determines the max size passed to WALRead(), which in turn determines > how much we read from the OS at once. If the storage has
