Re: pgsql: Add function to get memory context stats for processes

2025-04-26 Thread Tomas Vondra
ssGetMemoryContextInterrupt() do the same thing? In any case, if DSA happens to not be the right way to transfer this, what should we use instead? The only thing I can think of is some sort of pre-allocated chunk of shared memory. regards -- Tomas Vondra

Re: Get rid of integer divide in FAST_PATH_REL_GROUP() macro

2025-04-26 Thread Tomas Vondra
o be verifying something that the loop > condition was checking already. I thought it was better to check that > we end up with a power-of-two. > > Please see the attached patch. > Thanks. Those changes seem fine to me too. Do you intend to push these, or do you want me to do it? regards -- Tomas Vondra
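
As an illustration of the power-of-two point in this thread, here is a minimal C sketch. It assumes the divide being removed is a modulo by the group count; `is_power_of_two` and `group_for` are hypothetical names for this sketch, not the actual PostgreSQL macro or the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when n has exactly one bit set, i.e. n is a power of two. */
static bool
is_power_of_two(uint32_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical stand-in for a "which group does x belong to" macro.
 * When ngroups is a power of two, x % ngroups equals x & (ngroups - 1),
 * so the integer divide can be replaced with a cheap bitmask.
 */
static uint32_t
group_for(uint32_t x, uint32_t ngroups)
{
    assert(is_power_of_two(ngroups));
    return x & (ngroups - 1);
}
```

The mask only matches the modulo when the group count is a power of two, which is why checking that property where the count is computed is a sensible safeguard.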

Re: AIO v2.5

2025-04-22 Thread Tomas Vondra
cause of the RMT, but I'm also willing to do some of the tests, if needed - but it'd be good to get some guidance. regards -- Tomas Vondra

Re: Enable data checksums by default

2025-04-22 Thread Tomas Vondra
ecksums by default, but now I realize the thread talks about "upgrade experience" which seems fairly wide. So, what kind of data do we expect to gather in order to evaluate this? Who's expected to collect it and evaluate this? regards -- Tomas Vondra

Re: index prefetching

2025-04-22 Thread Tomas Vondra
On 4/22/25 18:26, Peter Geoghegan wrote: > On Tue, Apr 22, 2025 at 6:46 AM Tomas Vondra wrote: >> here's an improved (rebased + updated) version of the patch series, with >> some significant fixes and changes. The patch adds infrastructure and >> modifies btree index

Re: Parallel CREATE INDEX for GIN indexes

2025-04-21 Thread Tomas Vondra
approaches > to > resolve this too). > Thanks for the report. I didn't have time to look at this in detail yet, but the fix looks roughly correct. I've added this to the list of open items for PG18. regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-10 Thread Tomas Vondra
bigint, perhaps? Attached is v28, with the commit messages updated, added about allocation of the memory, etc. I'll let the CI run the tests on it, and then will push, unless someone has more comments. regards -- Tomas Vondra From 9a222c77de2ee4a0b32d97c3d8bab2bb33f066de Mon Sep 17 00:0

Re: Add os_page_num to pg_buffercache

2025-04-10 Thread Tomas Vondra
> - It's currently doing the changes in pg_buffercache v1.6 but will need to > create v1.7 for 19 (if the above stands true) > This seems like a good idea in principle, but at this point it has to wait for PG19. Please add it to the July commitfest. regards -- Tomas Vondra

Re: long-standing data loss bug in initial sync of logical replication

2025-04-10 Thread Tomas Vondra
> >> >> Seeing no responses for a long time, I am planning to push the fix >> till 14 tomorrow unless there are some opinions on the fix for 13. We >> can continue to discuss the scope of the fix for 13. >> > > Pushed till 14. > Thanks everyone who persevered and kept working on fixing this! Highly appreciated. regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-09 Thread Tomas Vondra
On 4/9/25 17:51, Andres Freund wrote: > Hi, > > On 2025-04-09 17:28:31 +0200, Tomas Vondra wrote: >> On 4/9/25 17:14, Andres Freund wrote: >>> I'd mention that the includes of postgres.h/fmgr.h is what caused missing >>> build-time dependencies and via tha

Re: Draft for basic NUMA observability

2025-04-09 Thread Tomas Vondra
On 4/9/25 17:14, Andres Freund wrote: > Hi, > > On 2025-04-09 16:33:14 +0200, Tomas Vondra wrote: >> From e1f093d091610d70fba72b2848f25ff44899ea8e Mon Sep 17 00:00:00 2001 >> From: Tomas Vondra >> Date: Tue, 8 Apr 2025 23:31:29 +0200 >> Subject: [PATCH 1/2] Clea

Re: Draft for basic NUMA observability

2025-04-09 Thread Tomas Vondra
On 4/9/25 01:29, Andres Freund wrote: > Hi, > > On 2025-04-09 01:10:09 +0200, Tomas Vondra wrote: >> On 4/8/25 15:06, Andres Freund wrote: >>> Hi, >>> >>> On 2025-04-08 17:44:19 +0500, Kirill Reshke wrote: >>>> On Mon, 7 Apr 2025 at 23:00, To

Re: Draft for basic NUMA observability

2025-04-09 Thread Tomas Vondra
Updated patches with proper commit messages etc. -- Tomas Vondra From e1f093d091610d70fba72b2848f25ff44899ea8e Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Tue, 8 Apr 2025 23:31:29 +0200 Subject: [PATCH 1/2] Cleanup of pg_numa.c This moves/renames some of the functions defined in

Re: Draft for basic NUMA observability

2025-04-09 Thread Tomas Vondra
On 4/9/25 14:07, Tomas Vondra wrote: > ... > > OK, here are two patches, where 0001 adds the missingdeps check to the > Debian meson build. It just adds that to the build script. > > 0002 leaves the NUMA stuff in src/port (i.e. it's no longer moved to > src/backen

Re: Draft for basic NUMA observability

2025-04-08 Thread Tomas Vondra
On 4/8/25 15:06, Andres Freund wrote: > Hi, > > On 2025-04-08 17:44:19 +0500, Kirill Reshke wrote: >> On Mon, 7 Apr 2025 at 23:00, Tomas Vondra wrote: >>> I'll let the CI run the tests on it, and >>> then will push, unless someone has more comments. >

Re: Draft for basic NUMA observability

2025-04-08 Thread Tomas Vondra
On 4/8/25 15:06, Andres Freund wrote: > Hi, > > On 2025-04-08 17:44:19 +0500, Kirill Reshke wrote: >> On Mon, 7 Apr 2025 at 23:00, Tomas Vondra wrote: >>> I'll let the CI run the tests on it, and >>> then will push, unless someone has more comments. >

Re: Draft for basic NUMA observability

2025-04-08 Thread Tomas Vondra
On 4/8/25 16:59, Andres Freund wrote: > Hi, > > On 2025-04-08 09:35:37 -0400, Andres Freund wrote: >> On April 8, 2025 9:21:57 AM EDT, Tomas Vondra wrote: >>> On 4/8/25 15:06, Andres Freund wrote: >>>> On 2025-04-08 17:44:19 +0500, Kirill Reshke wro

Re: Draft for basic NUMA observability

2025-04-08 Thread Tomas Vondra
> The attached small patch fixes the manual. > Thank you for noticing this and for the fix! Pushed. This also reminded me we agreed to change page_num to bigint, which I forgot to change before commit. So I adjusted that too, separately. regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
On 4/7/25 17:51, Andres Freund wrote: > Hi, > > On 2025-04-06 13:56:54 +0200, Tomas Vondra wrote: >> On 4/6/25 01:00, Andres Freund wrote: >>> On 2025-04-05 18:29:22 -0400, Andres Freund wrote: >>>> I think one thing that the docs should mention is that callin

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
On 4/7/25 23:50, Jakub Wartak wrote: > On Mon, Apr 7, 2025 at 11:27 PM Tomas Vondra wrote: >> >> Hi, >> >> I've pushed all three parts of v29, with some additional corrections >> (picked lower OIDs, bumped catversion, fixed commit messages). > > H

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
Hi, I've pushed all three parts of v29, with some additional corrections (picked lower OIDs, bumped catversion, fixed commit messages). On 4/7/25 23:01, Jakub Wartak wrote: > On Mon, Apr 7, 2025 at 9:51 PM Tomas Vondra wrote: > >>> So it looks like that the new way to it

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
On 4/7/25 20:11, Bertrand Drouvot wrote: > Hi, > > On Mon, Apr 07, 2025 at 12:42:21PM -0400, Andres Freund wrote: >> Hi, >> >> On 2025-04-07 18:36:24 +0200, Tomas Vondra wrote: >> >> I was thinking of checking if the BufferDesc indicates BM_VALID or >>

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
ent patches are good enough >> for PG18, with the current behavior, and then maybe improve that in >> PG19. > > I think as long as the docs mention this with or it's ok for > now. > OK, I'll add a warning explaining this. regards -- Tomas Vondra

Re: Improve monitoring of shared memory allocations

2025-04-07 Thread Tomas Vondra
ssion tests can't tell us much, considering it didn't fail once with the reverted patch :-( I did check the coverage in: https://coverage.postgresql.org/src/backend/utils/hash/dynahash.c.gcov.html and sure enough, dir_realloc() is not executed once. And there's a couple more p

Re: Draft for basic NUMA observability

2025-04-07 Thread Tomas Vondra
in os_page_status. I intend to push 0001 and 0002 shortly, and 0003 after a bit more review and testing, unless I hear objections. regards -- Tomas Vondra From fcc4fc2ada33cbbc962d561ddeea6966f0d55492 Mon Sep 17 00:00:00 2001 From: Jakub Wartak Date: Wed, 2 Apr 2025 12:29:22 +0200 Subject: [P

Re: Draft for basic NUMA observability

2025-04-06 Thread Tomas Vondra
>> pages. >>> + * It's a bit misleading to call that "aligned", no? */ >>> + >>> + /* Get number of OS aligned pages */ >>> + shm_ent_page_count >>> + = TYPEALIGN(os_page_size, ent->allocated_size) / >>> os_page_size; >>> + >>> + /* >>> + * If we get ever 0xff back from kernel inquiry, then we >>> probably have >>> + * bug in our buffers to OS page mapping code here. >>> + */ >>> + memset(pages_status, 0xff, sizeof(int) * shm_ent_page_count); >> >> There's obviously no guarantee that shm_ent_page_count is a multiple of >> os_page_size. I think it'd be interesting to show in the view when one shmem >> allocation shares a page with the prior allocation - that can contribute a >> bit >> to contention. What about showing a start_os_page_id and end_os_page_id or >> something? That could be a feature for later though. > > I was thinking about it, but it could be done when analyzing this > together with data from pg_shmem_allocations(?) My worry is timing :( > Anyway, we could extend this view in future revisions. > I'd leave this out for now. It's not difficult, but let's focus on the other issues. >>> +SELECT NOT(pg_numa_available()) AS skip_test \gset >>> +\if :skip_test >>> +\quit >>> +\endif >>> +-- switch to superuser >>> +\c - >>> +SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa; >>> + ok >>> + >>> + t >>> +(1 row) >> >> Could it be worthwhile to run the test if !pg_numa_available(), to test that >> we do the right thing in that case? We need an alternative output anyway, so >> that might be fine? > > Added. the meson test passes, but I'm sending it as fast as possible > to avoid a clash with Tomas. > Please keep working on this. I may have a bit of time in the evening, but in the worst case I'll merge it into your patch. regards -- Tomas Vondra
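
To make the page-counting concern above concrete, here is a small C sketch. It assumes an allocation is described by a byte offset and a non-zero size within the shared memory segment; `ALIGN_UP` is a local stand-in written for this sketch (not PostgreSQL's `TYPEALIGN` itself), and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Round len up to the next multiple of alignval (alignval must be a power of two). */
#define ALIGN_UP(alignval, len) \
    (((uintptr_t) (len) + ((alignval) - 1)) & ~((uintptr_t) ((alignval) - 1)))

/* Page count derived from the size alone, as in the quoted patch fragment. */
static size_t
pages_by_size(size_t os_page_size, size_t allocated_size)
{
    return ALIGN_UP(os_page_size, allocated_size) / os_page_size;
}

/* Index of the first OS page touched by an allocation starting at offset. */
static size_t
first_page(size_t os_page_size, size_t offset)
{
    return offset / os_page_size;
}

/* Index of the last OS page touched (size is assumed to be non-zero). */
static size_t
last_page(size_t os_page_size, size_t offset, size_t size)
{
    return (offset + size - 1) / os_page_size;
}

/* Number of OS pages actually touched, taking the start offset into account. */
static size_t
pages_touched(size_t os_page_size, size_t offset, size_t size)
{
    return last_page(os_page_size, offset, size)
        - first_page(os_page_size, offset) + 1;
}
```

With 4096-byte pages, a 5000-byte allocation starting at offset 4000 touches three pages even though the size alone rounds up to two. That gap is what a start/end page column, as suggested in the quoted review, would expose.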

Re: Draft for basic NUMA observability

2025-04-06 Thread Tomas Vondra
he current backend, so I'd bet people would not be happy with NULL, and would proceed to force the allocation in some other way (say, a large query of some sort). Which obviously causes a lot of other problems. I can imagine having a flag that makes the allocation optional, but there's no convenient way to pass that to a view, and I think most people want the allocation anyway. Especially for monitoring purposes, which usually happens in a new connection, so the backend has little opportunity to allocate the pages "naturally." regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-06 Thread Tomas Vondra
at right now, but at the very least we ought to > document it. > +1 to documenting this > > On 2025-04-05 16:33:28 +0200, Tomas Vondra wrote: >> The libnuma library is not available on 32-bit builds (there's no shared >> object for i386), so we disable it in that

Re: Improve monitoring of shared memory allocations

2025-04-05 Thread Tomas Vondra
fields. Seems a bit weird, but we always did that - the patch does not really change that. I'll now mark this as committed. I haven't done anything about the alignment. My conclusion from the discussion was we don't quite need to do that, but if we do I think it's a matter for a separate patch - perhaps something like the 0003. Thanks for the patch, reviews, etc. -- Tomas Vondra

Re: Snapshot related assert failure on skink

2025-04-05 Thread Tomas Vondra
On 3/24/25 16:25, Heikki Linnakangas wrote: > On 24/03/2025 16:56, Tomas Vondra wrote: >> >> >> On 3/23/25 17:43, Heikki Linnakangas wrote: >>> On 21/03/2025 17:16, Andres Freund wrote: >>>> Am I right in understanding that the only scenario (w

Re: Draft for basic NUMA observability

2025-04-05 Thread Tomas Vondra
On 4/5/25 15:23, Tomas Vondra wrote: > On 4/5/25 11:37, Bertrand Drouvot wrote: >> Hi, >> >> On Fri, Apr 04, 2025 at 09:25:57PM +0200, Tomas Vondra wrote: >>> OK, >>> >>> here's v25 after going through the patches once more, fixing the issues

Re: Draft for basic NUMA observability

2025-04-05 Thread Tomas Vondra
On 4/5/25 11:37, Bertrand Drouvot wrote: > Hi, > > On Fri, Apr 04, 2025 at 09:25:57PM +0200, Tomas Vondra wrote: >> OK, >> >> here's v25 after going through the patches once more, fixing the issues >> mentioned by Bertrand, etc. > > Thanks! >

Re: Proposal: Adding compression of temporary files

2025-04-04 Thread Tomas Vondra
code gets multiple loops in while (wpos < file->nbytes) { ... } because bytestowrite will be the value from the last loop? I haven't tried, but I guess writing wide tuples (more than 8k) might fail. regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-04 Thread Tomas Vondra
in the function comment, but I'm also not quite sure I understand what "output shared memory" is ... regards -- Tomas Vondra From 381c5077592e38dbcbbf6acc4f1e86a767a92957 Mon Sep 17 00:00:00 2001 From: Jakub Wartak Date: Wed, 2 Apr 2025 12:29:22 +0200 Subject: [PATCH v25 1/5]

Re: index prefetching

2025-04-04 Thread Tomas Vondra
Yes, I agree. regards -- Tomas Vondra

Re: Draft for basic NUMA observability

2025-04-04 Thread Tomas Vondra
On 4/4/25 08:50, Bertrand Drouvot wrote: > Hi, > > On Thu, Apr 03, 2025 at 08:53:57PM +0200, Tomas Vondra wrote: >> On 4/3/25 15:12, Jakub Wartak wrote: >>> On Thu, Apr 3, 2025 at 1:52 PM Tomas Vondra wrote: >>> >>>> ... >>>> >>

Re: Draft for basic NUMA observability

2025-04-04 Thread Tomas Vondra
On 4/4/25 09:35, Jakub Wartak wrote: > On Fri, Apr 4, 2025 at 8:50 AM Bertrand Drouvot > wrote: >> >> Hi, >> >> On Thu, Apr 03, 2025 at 08:53:57PM +0200, Tomas Vondra wrote: >>> On 4/3/25 15:12, Jakub Wartak wrote: >>>>

Re: Draft for basic NUMA observability

2025-04-03 Thread Tomas Vondra
On 4/3/25 15:12, Jakub Wartak wrote: > On Thu, Apr 3, 2025 at 1:52 PM Tomas Vondra wrote: > >> ... >> >> So unless someone can demonstrate a use case where this would matter, >> I'd not worry about it too much. > > OK, fine for me - just 3 cols for p

Re: Draft for basic NUMA observability

2025-04-03 Thread Tomas Vondra
On 4/3/25 10:23, Bertrand Drouvot wrote: > Hi, > > On Thu, Apr 03, 2025 at 09:01:43AM +0200, Jakub Wartak wrote: >> On Wed, Apr 2, 2025 at 6:40 PM Tomas Vondra wrote: >> >> Hi Tomas, >> >>> OK, so you agree the commit messages are complete / correct

Re: Draft for basic NUMA observability

2025-04-03 Thread Tomas Vondra
On 4/3/25 09:01, Jakub Wartak wrote: > On Wed, Apr 2, 2025 at 6:40 PM Tomas Vondra wrote: > > Hi Tomas, > >> OK, so you agree the commit messages are complete / correct? > > Yes. > >> OK. FWIW if you disagree with some of my proposed changes, feel free to

Re: BTScanOpaqueData size slows down tests

2025-04-02 Thread Tomas Vondra
On 4/2/25 17:45, Peter Geoghegan wrote: > On Wed, Apr 2, 2025 at 11:36 AM Tom Lane wrote: >> Ouch! I had no idea it had gotten that big. Yeah, we ought to >> do something about that. > > Tomas Vondra talked about this recently, in the context of his work on > prefe

Re: Parallel CREATE INDEX for GIN indexes

2025-04-02 Thread Tomas Vondra
On 4/2/25 18:43, Andres Freund wrote: > Hi, > > On 2025-03-04 20:50:43 +0100, Tomas Vondra wrote: >> I pushed the two smaller parts today. >> >> Here's the remaining two parts, to keep cfbot happy. I don't expect to >> get these into PG18, though. >

Re: Draft for basic NUMA observability

2025-04-02 Thread Tomas Vondra
On 4/2/25 16:46, Jakub Wartak wrote: > On Tue, Apr 1, 2025 at 10:17 PM Tomas Vondra wrote: >> >> Hi, >> >> I've spent a bit of time reviewing this. In general I haven't found >> anything I'd call a bug, but here's a couple comments for v1

Re: Draft for basic NUMA observability

2025-04-01 Thread Tomas Vondra
inters like this etc.). 11) This could use UINT64_FORMAT, instead of a cast: elog(DEBUG1, "NUMA: os_page_count=%lu os_page_size=%zu pages_per_blk=%.2f", (unsigned long) os_page_count, os_page_size, pages_per_blk); regards -- Tomas Vondra From 46a7801b1985a81bb8bc35fcfb2cbb74e6ea5
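
Review item 11 above can be sketched in standalone C as follows. `UINT64_FORMAT` is PostgreSQL's own macro and is not defined here; the sketch uses the standard `PRIu64` from `<inttypes.h>` as an analogue, and `format_numa_debug` is a hypothetical helper that writes into a buffer instead of calling `elog()`.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Format the debug message without casting the 64-bit counter to
 * unsigned long, which is only 32 bits wide on some platforms.
 * PRIu64 expands to the correct conversion specifier for uint64_t.
 */
static int
format_numa_debug(char *buf, size_t buflen, uint64_t os_page_count,
                  size_t os_page_size, double pages_per_blk)
{
    return snprintf(buf, buflen,
                    "NUMA: os_page_count=%" PRIu64
                    " os_page_size=%zu pages_per_blk=%.2f",
                    os_page_count, os_page_size, pages_per_blk);
}
```

The format macro is spliced into the string literal by adjacent-literal concatenation, so no cast is needed and large values are not truncated.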

Re: Improve monitoring of shared memory allocations

2025-03-31 Thread Tomas Vondra
"number of elements", but it's a simple flag. So I renamed it to "prealloc", which seems clearer to me. I also tweaked (reordered/reformatted) the conditions a bit. For the other patch, I realized we can simply MemSet() the whole chunk, instead of resetting the individual parts
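
The idea of zeroing one contiguous chunk in a single call, rather than resetting each part separately, might look like the C sketch below. The layout (a header followed by two variable-sized regions) and the names `DemoHeader` and `init_chunk` are assumptions made for illustration; the sketch uses the standard `memset()` rather than PostgreSQL's `MemSet()` macro.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical header placed at the start of the chunk. */
typedef struct DemoHeader
{
    long        nentries;
    long        nfree;
} DemoHeader;

/*
 * The chunk holds a DemoHeader, then dir_bytes of directory data, then
 * elem_bytes of element data, all allocated as one block. Because the
 * parts are contiguous, one memset over the total size replaces three
 * separate resets.
 */
static void
init_chunk(void *chunk, size_t dir_bytes, size_t elem_bytes)
{
    memset(chunk, 0, sizeof(DemoHeader) + dir_bytes + elem_bytes);
}
```

This relies on the whole region having been sized and allocated in one piece, which is the situation the message describes.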

Re: Amcheck verification of GiST and GIN

2025-03-30 Thread Tomas Vondra
On 3/30/25 06:04, Tom Lane wrote: > Tomas Vondra writes: >> I've pushed all the parts of this patch series, except for the stress >> test - which I think was not meant for commit. >> buildfarm seems happy so far, except for a minor indentation issue >> (forg

Re: Amcheck verification of GiST and GIN

2025-03-29 Thread Tomas Vondra
On 3/28/25 20:51, Kirill Reshke wrote: > On Fri, 28 Mar 2025 at 21:26, Tomas Vondra wrote: >> >> Here's a polished version of the patches. If you have any >> comments/objections, please speak now. >> -- >> Tomas Vondra > > Hi, no objections, lgtm

Re: Amcheck verification of GiST and GIN

2025-03-28 Thread Tomas Vondra
callee functions. So now it's - amcheck consistency check context - posting tree check context regards -- Tomas Vondra From 28b392b687f641b09bc79bb3bb3e61505845e6c1 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Fri, 28 Mar 2025 16:49:04 +0100 Subject: [PATCH v20250328 1/6] Fix grammar in

Re: Improve monitoring of shared memory allocations

2025-03-28 Thread Tomas Vondra
> */ > > hash_create code is confusing because the nelem_alloc named variable is used > in two different cases, In  the above case  nelem_alloc  refers to the one  > returned by choose_nelem_alloc function. > > The other nelem_alloc determines the number of elements in each partition > for a partitioned hash table. This is not what is being referred to in > the above  > comment. > > The bit "For more explanation see comments within this function" is not > great, if only because there are not many comments within the function, > so there's no "more explanation". But if there's something important, it > should be in the main comment, preferably. > >   > I will improve the comment in the next version. > OK. Do we even need to pass nelem_alloc to hash_get_init_size? It's not really used except for this bit: +if (init_size > nelem_alloc) +element_alloc = false; Can't we determine before calling the function, to make it a bit less confusing? regards -- Tomas Vondra

Re: Improve monitoring of shared memory allocations

2025-03-27 Thread Tomas Vondra
On 3/27/25 13:56, Tomas Vondra wrote: > ... > > OK, I don't have any other comments for 0001 and 0002. I'll do some > more review and polishing on those, and will get them committed soon. > Actually ... while polishing 0001 and 0002, I noticed a couple more details t

Re: Amcheck verification of GiST and GIN

2025-03-27 Thread Tomas Vondra
On 3/27/25 16:30, Mark Dilger wrote: > > > On Fri, Feb 21, 2025 at 6:29 AM Tomas Vondra <mailto:to...@vondra.me>> wrote: > > Hi, > > I see this patch didn't move since December :-( I still think these > improvements would be useful, it

Re: Improve monitoring of shared memory allocations

2025-03-27 Thread Tomas Vondra
his. >  Do you have any suggestions in mind?   > > Please find attached updated patches after merging all your review > comments except > a few discussed above. >   OK, I don't have any other comments for 0001 and 0002. I'll do some more review and polishing on those, and will get them committed soon. I don't plan to push 0003, unless someone can actually explain and demonstrate the benefits of the proposed padding, regards -- Tomas Vondra

Re: Advanced Patch Feedback Session / pgconf.dev 2025

2025-03-25 Thread Tomas Vondra
ards Tomas On 3/13/25 17:37, Tomas Vondra wrote: > Hi all, > > pgconf.dev 2025 will host "Advanced Patch Feedback Session", with the > same format as in 2024 [1]: > > Participants will work in small groups with a Postgres committer to > analyze a past contribut

Re: Snapshot related assert failure on skink

2025-03-24 Thread Tomas Vondra
's be tidy and fix both latestCompletedXid and > xactCompletionCount. > Thanks for looking into this and pushing the fix. Would it make sense to add a comment documenting this reasoning about not handling aborts? Otherwise someone will get to rediscover this in the future ... regards -- Tomas Vondra

Re: Improve monitoring of shared memory allocations

2025-03-24 Thread Tomas Vondra
ntion can we get there? I don't get it. Also, why is the patch adding padding after statusFlags (the last array allocated in InitProcGlobal) and not between allProcs and xids? regards -- Tomas Vondra From f527909dda02b4c7231db53a0fe6cecbaec55ca4 Mon Sep 17 00:00:00 2001 From: Rahila Sye

Re: Snapshot related assert failure on skink

2025-03-21 Thread Tomas Vondra
On 3/19/25 13:27, Tomas Vondra wrote: > On 3/19/25 08:17, Heikki Linnakangas wrote: >> On 19/03/2025 04:22, Tomas Vondra wrote: >>> I kept stress-testing this, and while the frequency massively increased >>> on PG18, I managed to reproduce this all the way back to

Re: Snapshot related assert failure on skink

2025-03-19 Thread Tomas Vondra
On 3/19/25 08:17, Heikki Linnakangas wrote: > On 19/03/2025 04:22, Tomas Vondra wrote: >> I kept stress-testing this, and while the frequency massively increased >> on PG18, I managed to reproduce this all the way back to PG14. I see >> ~100x more corefiles on PG18. >>

Re: Snapshot related assert failure on skink

2025-03-18 Thread Tomas Vondra
't have the same issue. None of them seems to advance the XID to 209508. regards -- Tomas Vondra

Re: Snapshot related assert failure on skink

2025-03-17 Thread Tomas Vondra
On 3/17/25 13:18, Thomas Munro wrote: > On Tue, Mar 18, 2025 at 12:59 AM Tomas Vondra wrote: >> On 3/17/25 12:36, Tomas Vondra wrote: >>> I'm still fiddling with the script, trying to increase the probability >>> of the (apparent) race condition. On one machine

Re: Snapshot related assert failure on skink

2025-03-17 Thread Tomas Vondra
On 3/17/25 12:36, Tomas Vondra wrote: > ... > > I'm still fiddling with the script, trying to increase the probability > of the (apparent) race condition. On one machine (old Xeon) I can hit it > very easily/reliably, while on a different machine (new Ryzen) it's ver

Re: Snapshot related assert failure on skink

2025-03-17 Thread Tomas Vondra
g to increase the probability of the (apparent) race condition. On one machine (old Xeon) I can hit it very easily/reliably, while on a different machine (new Ryzen) it's very rare. I don't know if that's due to a difference in CPU speed, or fewer cores, ... I guess it changes the timing just enough. I've also tried running the stress test on PG17, and I'm yet to see a single failure there. Not even on the Xeon machine, which hits it reliably on 18. So this seems to be a PG18-only issue. If needed, I can try adding more logging, or test a patch. regards -- Tomas Vondra

Assert(TransactionIdPrecedesOrEquals(TransactionXmin, RecentXmin));

2025-03-15 Thread Tomas Vondra
a while to hit it - on my laptop it takes an hour or so, but I guess it's more about the random sleeps in the script. I've only ever seen this on the standby, never on the primary. regards -- Tomas Vondra Program terminated with signal SIGABRT, Aborted. #0 __pthread_kill_implementa

Re: Changing the state of data checksums in a running cluster

2025-03-15 Thread Tomas Vondra
On 3/15/25 17:26, Andres Freund wrote: > Jo. > > On 2025-03-15 16:50:02 +0100, Tomas Vondra wrote: >> Thanks, here's an updated patch version > > FWIW, this fails in CI; > > https://cirrus-ci.com/build/4678473324691456 > On all OSs: > [16:08:36.331] #

Re: Get rid of WALBufMappingLock

2025-03-14 Thread Tomas Vondra
t may be the case that there are multiple related bottlenecks, and we'd need to fix all of them - in which case it'd be silly to block the improvements on the grounds that it alone does not help. Another thought is that this is testing the "good case". Can anyone think of a wor

Re: Changing the state of data checksums in a running cluster

2025-03-14 Thread Tomas Vondra
On 3/14/25 00:11, Tomas Vondra wrote: > ... >>>>>> One issue I ran into is the postmaster does not seem to be processing >>>>>> the barriers, and thus not getting info about the data_checksum_version >>>>>> changes. >>>>>

Re: Changing the state of data checksums in a running cluster

2025-03-13 Thread Tomas Vondra
On 3/13/25 17:26, Tomas Vondra wrote: > On 3/13/25 13:32, Daniel Gustafsson wrote: >>> On 13 Mar 2025, at 12:03, Tomas Vondra wrote: >>> >>> ... >>> >>> This also reminds me I had a question about the barrier - can't it >>> happen a

Advanced Patch Feedback Session / pgconf.dev 2025

2025-03-13 Thread Tomas Vondra
g groups, etc.) If needed, feel free to ask questions - either here, or get in touch directly with either of the organizers (me, Robert or Amit). kind regards -- Tomas Vondra

Re: Changing the state of data checksums in a running cluster

2025-03-13 Thread Tomas Vondra
On 3/13/25 13:32, Daniel Gustafsson wrote: >> On 13 Mar 2025, at 12:03, Tomas Vondra wrote: >> On 3/13/25 10:54, Daniel Gustafsson wrote: >>>> On 12 Mar 2025, at 14:16, Tomas Vondra wrote: > >>>> I believe the approach is correct, but the number of p

Re: Implement waiting for wal lsn replay: reloaded

2025-03-13 Thread Tomas Vondra
reading this comment I should understand how it all fits together. 10) WaitForLSNReplay / WaitLSNWakeup I think the function comment should document the important stuff (e.g. return values for various situations, how it groups waiters into chunks of 16 elements during wakeup, ...). 11) WaitLSNProcInfo / WaitLSNState Does this need to be exposed in xlogwait.h? These structs seem private to xlogwait.c, so maybe declare it there? regards -- Tomas Vondra

Re: Changing the state of data checksums in a running cluster

2025-03-13 Thread Tomas Vondra
On 3/13/25 10:54, Daniel Gustafsson wrote: >> On 12 Mar 2025, at 14:16, Tomas Vondra wrote: > >> I continued investigating this and experimenting with alternative >> approaches, and I think the way the patch relies on ControlFile is not >> quite rig

Re: Changing the state of data checksums in a running cluster

2025-03-12 Thread Tomas Vondra
On 3/11/25 14:07, Dagfinn Ilmari Mannsåker wrote: > As the resident perl style pedant, I'd just like to complain about the > below: > > Tomas Vondra writes: > >> diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm >> b/src/test/perl/PostgreSQL/Test

Re: Changing the state of data checksums in a running cluster

2025-03-11 Thread Tomas Vondra
es not. It flushes the control file in only a couple places, I couldn't think of a way to get it out of sync. regards [1] https://www.postgresql.org/message-id/e4dbcb2c-e04a-4ba2-bff0-8d979f55960e%40vondra.me -- Tomas Vondra test.sh Description: application/shellscript

Re: Changing the state of data checksums in a running cluster

2025-03-10 Thread Tomas Vondra
uld report the current_relation/fork, but it seems like overkill. The main fork is by far the largest one, so this seems OK. regards -- Tomas Vondra

Re: Parallel CREATE INDEX for GIN indexes

2025-03-09 Thread Tomas Vondra
On 3/9/25 17:38, Tom Lane wrote: > Tomas Vondra writes: >> I pushed the two smaller parts today. > > Coverity is a little unhappy about this business in > _gin_begin_parallel: > > bool leaderparticipates = true; > ... > #ifde

Re: strange valgrind reports about wrapper_handler on 64-bit arm

2025-03-09 Thread Tomas Vondra
On 3/9/25 03:16, Nathan Bossart wrote: > On Sat, Mar 08, 2025 at 11:48:22PM +0100, Tomas Vondra wrote: >> Shortly after restarting this I got three more reports - all of them are >> related to strcoll_l. This is on c472a18296e4, i.e. with the asserts >> added in this thread e

Re: strange valgrind reports about wrapper_handler on 64-bit arm

2025-03-08 Thread Tomas Vondra
On 3/7/25 17:32, Andres Freund wrote: > Hi, > > On 2025-03-07 00:03:47 +0100, Tomas Vondra wrote: >> while running check-world on 64-bit arm (rpi5 with Debian 12.9), I got a >> couple reports like this: >> >> ==64550== Use of uninitialised value of s

Re: strange valgrind reports about wrapper_handler on 64-bit arm

2025-03-08 Thread Tomas Vondra
On 3/8/25 21:38, Tomas Vondra wrote: > > I've restarted check-world with valgrind on my rpi5 machines, with > current master. I can try running other stuff once that finishes in a > couple hours. > Shortly after restarting this I got three more reports - all of them are

Re: Refactoring postmaster's code to cleanup after child exit

2025-03-07 Thread Tomas Vondra
On 3/7/25 16:49, Andres Freund wrote: > Hi, > > On 2025-03-07 16:25:09 +0100, Tomas Vondra wrote: >> FWIW I keep running into this (and skink seems unhappy too). I ended up >> just adding a sleep(1), right before >> >> push(@sessions, background_psql_as_user(

Re: Refactoring postmaster's code to cleanup after child exit

2025-03-07 Thread Tomas Vondra
. For the archives sake, I just want to clarify that this pump stuff is >> all about getting better error messages on a test failure. It doesn't help >> with the original issue. > > Agreed. > FWIW I keep running into this (and skink seems unhappy too). I ended up just adding a sleep(1), right before push(@sessions, background_psql_as_user('regress_superuser')); and that makes it work on all my machines (including rpi5). regards -- Tomas Vondra

strange valgrind reports about wrapper_handler on 64-bit arm

2025-03-06 Thread Tomas Vondra
an int). But that leaves just pqsignal_handlers, and why would that be uninitialized? The closest thing I found in archives is [1] from about a year ago, but we haven't found any clear explanation there either :-( [1] https://www.postgresql.org/message-id/f1a022e5-9bec-42c5-badd-cfc00b605...@en

Re: Parallel CREATE INDEX for GIN indexes

2025-03-04 Thread Tomas Vondra
I pushed the two smaller parts today. Here's the remaining two parts, to keep cfbot happy. I don't expect to get these into PG18, though. regards -- Tomas Vondra From bea52f76255830af45b7122b0fa5786997182cf5 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Tue, 25 Feb 2025 16:1

Re: scalability bottlenecks with (many) partitions (and more)

2025-03-04 Thread Tomas Vondra
On 3/4/25 15:38, Tomas Vondra wrote: > > ... > >>> >>> Attached is a patch doing this, but considering it has nothing to do >>> with the shmem sizing, I wonder if it's worth it. >> >> Yes. >> > > OK, barrin

Re: scalability bottlenecks with (many) partitions (and more)

2025-03-04 Thread Tomas Vondra
On 3/4/25 14:11, Andres Freund wrote: > Hi, > > On 2025-03-04 14:05:22 +0100, Tomas Vondra wrote: >> On 3/3/25 21:52, Andres Freund wrote: >>>> It's not a proper constant, of course, but it seemed close >>>> enough. Yes, it might confuse people int

Re: scalability bottlenecks with (many) partitions (and more)

2025-03-04 Thread Tomas Vondra
On 3/3/25 21:52, Andres Freund wrote: > Hi, > > On 2025-03-03 21:31:42 +0100, Tomas Vondra wrote: >> On 3/3/25 19:10, Andres Freund wrote: >>> On 2024-09-21 20:33:49 +0200, Tomas Vondra wrote: >>>> I've finally pushed this, after many rounds of careful t

Re: scalability bottlenecks with (many) partitions (and more)

2025-03-03 Thread Tomas Vondra
On 3/3/25 19:10, Andres Freund wrote: > Hi, > > On 2024-09-21 20:33:49 +0200, Tomas Vondra wrote: >> I've finally pushed this, after many rounds of careful testing to ensure >> no regressions, and polishing. > > One minor nit: I don't like that FP_LOCK_

Re: Parallel CREATE INDEX for GIN indexes

2025-03-03 Thread Tomas Vondra
ng the two WIP parts that are unlikely to make it into PG18 at this point. regards -- Tomas Vondra From 0541012bd9a092d0d6e4c020608d4fdea98d7ab8 Mon Sep 17 00:00:00 2001 From: Tomas Vondra Date: Sat, 15 Feb 2025 21:01:43 +0100 Subject: [PATCH v20250303 1/4] Compress TID lists when writing GIN

Re: suspicious lockup on widowbird in AdvanceXLInsertBuffer (could it be due to 6a2275b8953?)

2025-02-26 Thread Tomas Vondra
On 2/26/25 23:13, Peter Geoghegan wrote: > On Wed, Feb 26, 2025 at 5:09 PM Tomas Vondra wrote: >> So, it's stuck in AdvanceXLInsertBuffer ... interesting. Another >> interesting fact is it's testing 75dfde13639, which is just a couple >> commits after 6

suspicious lockup on widowbird in AdvanceXLInsertBuffer (could it be due to 6a2275b8953?)

2025-02-26 Thread Tomas Vondra
lready includes 6a2275b895, so maybe it's unrelated. Is there something else I could collect from the stuck instance, before I restart it? regards -- Tomas Vondra 0x007fa64b8ddc in __GI_epoll_pwait (epfd=5, events=0x55ad6285a8, maxevents=1, timeout=timeout@entry=-1, set=set@entry=0x0) at

Re: Parallel CREATE INDEX for GIN indexes

2025-02-26 Thread Tomas Vondra
ed to do something similar. My conclusion is this can be left as a future improvement, independent of the parallel builds. regards -- Tomas Vondra diff --git a/src/backend/access/gin/ginbulk.c b/src/backend/access/gin/ginbulk.c index 302cb2092a9..74cc62839cb 100644 --- a/src/backend/access/gi

Re: Adjusting hash join memory limit to handle batch explosion

2025-02-25 Thread Tomas Vondra
On 2/25/25 17:30, James Hunter wrote: > On Wed, Feb 19, 2025 at 12:22 PM Tomas Vondra wrote: >> >> I've pushed the first (and main) part of the patch series, after some >> more cleanup and comment polishing. > > Two comments on your merged patch -- > > Fir

Re: Amcheck verification of GiST and GIN

2025-02-21 Thread Tomas Vondra
On 2/21/25 18:07, Mark Dilger wrote: > > >> On Feb 21, 2025, at 6:29 AM, Tomas Vondra wrote: >> >> Hi, >> >> I see this patch didn't move since December :-( I still think >> these improvements would be useful, it certainly was very helpful >>

Re: Enhancing Memory Context Statistics Reporting

2025-02-21 Thread Tomas Vondra
e large chunk of DSA allocation. I have changed this > to use > dynamically allocated chunks with dsa_allocate0 within the same DSA. > Sounds good. Do you have any measurements how much this reduced the size of the entries written to the DSA? How many entries will fit into 1MB of shared memory? regards -- Tomas Vondra

Re: Amcheck verification of GiST and GIN

2025-02-21 Thread Tomas Vondra
ng those parts? regards -- Tomas Vondra

Re: Should heapam_estimate_rel_size consider fillfactor?

2025-02-19 Thread Tomas Vondra
On 2/18/25 13:29, Heikki Linnakangas wrote: > On 18/02/2025 14:01, Tomas Vondra wrote: >> On 2/4/25 17:54, Tomas Vondra wrote: >>> On 2/4/25 16:02, Tomas Vondra wrote: >>>> ... >>>> >>>> Thanks for the report. And yeah, clamping it to 1 se

Re: Adjusting hash join memory limit to handle batch explosion

2025-02-19 Thread Tomas Vondra
t's the best fix, or whether we need to do something about exhausting hash bits. In any case, it's not PG18 material. And it's a separate issue, so I'm marking this as committed. Thanks everyone who helped with any of the many old patch versions! -- Tomas Vondra

Re: Should heapam_estimate_rel_size consider fillfactor?

2025-02-18 Thread Tomas Vondra
On 2/4/25 17:54, Tomas Vondra wrote: > On 2/4/25 16:02, Tomas Vondra wrote: >> ... >> >> Thanks for the report. And yeah, clamping it to 1 seems like the right >> fix for this. I wonder if it's worth inventing some sort of test for >> this, shouldn't b

Re: psql: Add tab completion for ALTER USER RESET

2025-02-17 Thread Tomas Vondra
On 2/16/25 17:56, Tomas Vondra wrote: >... > > Thanks. These patches look fine to me. I'll get them committed. > I've pushed both patches. Thanks! -- Tomas Vondra

Re: psql: Add tab completion for ALTER USER RESET

2025-02-16 Thread Tomas Vondra
On 2/15/25 12:14, Robins Tharakan wrote: > Hi Tomas, > > Thanks for taking a look - apologies for the delay here. > > On Tue, 10 Dec 2024 at 09:09, Tomas Vondra <mailto:to...@vondra.me>> wrote: >> >> 1) Does it make sense to still show "ALL" whe

Re: Parallel CREATE INDEX for GIN indexes

2025-02-15 Thread Tomas Vondra
On 2/12/25 15:59, Matthias van de Meent wrote: > On Tue, 7 Jan 2025 at 12:59, Tomas Vondra wrote: >> >> ... >> >> I haven't done anything about this, but I'm not sure adding the number >> of GIN tuples to pg_stat_progress_create_index would be very u

Re: BitmapHeapScan streaming read user and prelim refactoring

2025-02-14 Thread Tomas Vondra
On 2/14/25 18:31, Melanie Plageman wrote: > io_combine_limit 1, effective_io_concurrency 16, read ahead kb 16 > > On Fri, Feb 14, 2025 at 12:18 PM Tomas Vondra wrote: >> >> Based on off-list discussion with Melanie, I ran a modified version of >> the benchmark, with
