RE: Parallel heap vacuum

2025-04-18 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, Thanks for updating the patch. I have been reviewing and below are comments for now. 01. Sorry if I forgot something, but is there a reason that parallel_vacuum_compute_workers() is mandatory? My fresh eye felt that the function can check and regards zero if the API is NULL.

Re: Parallel heap vacuum

2025-04-07 Thread Peter Smith
Hi Sawada-san, Here are some review comments for the patch v16-0002 == Commit message 1. Missing period. /cleanup, ParallelVacuumState/cleanup. ParallelVacuumState/ ~~~ 2. Typo /paralel/parallel/ ~~~ 3. Heap table AM disables the parallel heap vacuuming for now, but an upcoming patch use

Re: Parallel heap vacuum

2025-04-06 Thread Peter Smith
Hi Sawada-san. I was revisiting this thread after a long time. I found most of my previous review comments from v11-0001 were not yet addressed. I can't tell if they are deliberately left out, or if they are accidentally overlooked. Please see the details below. On Mon, Mar 10, 2025 at 3:05 PM Pe

Re: Parallel heap vacuum

2025-04-06 Thread Melanie Plageman
ially if it's in a vacuum area. While I agree that disabling > eager freeze scans during parallel heap vacuum is not very > user-friendly, there are still many cases where parallel heap vacuum > helps even without the eager freeze scan. FYI the parallel heap scan > can be disabled b

Re: Parallel heap vacuum

2025-04-05 Thread Masahiko Sawada
accept that a new feature B regresses a pre-existing > feature A, particularly not if feature B is enabled by default. Why would that > be OK here? The eager freeze scan is the pre-existing feature but it's pretty new code that was pushed just a couple months ago. I didn't want t

Re: Parallel heap vacuum

2025-04-05 Thread Andres Freund
Hi, On 2025-04-04 14:34:53 -0700, Masahiko Sawada wrote: > On Fri, Apr 4, 2025 at 11:05 AM Melanie Plageman > wrote: > > > > On Tue, Apr 1, 2025 at 5:30 PM Masahiko Sawada > > wrote: > > > > > > > > > I've attached the new version patch. There are no major changes; I > > > fixed some typos, imp

Re: Parallel heap vacuum

2025-04-05 Thread Melanie Plageman
n I initially replied to the thread saying I thought temporarily disabling eager scanning for parallel heap vacuuming was viable, I hadn't looked at the patch yet and thought that there was a separate way to enable the new parallel heap vacuum (separate from the parallel option for the e

Re: Parallel heap vacuum

2025-04-05 Thread Melanie Plageman
On Sun, Mar 23, 2025 at 4:46 AM Masahiko Sawada wrote: > > If we use ParallelBlockTableScanDesc with streaming read like the > patch did, we would also need to somehow rewind the number of blocks > allocated to workers. The problem I had with such usage was that a > parallel vacuum worker allocate

Re: Parallel heap vacuum

2025-04-04 Thread Masahiko Sawada
On Fri, Apr 4, 2025 at 11:05 AM Melanie Plageman wrote: > > On Tue, Apr 1, 2025 at 5:30 PM Masahiko Sawada wrote: > > > > > > I've attached the new version patch. There are no major changes; I > > fixed some typos, improved the comment, and removed duplicated codes. > > Also, I've updated the com

Re: Parallel heap vacuum

2025-04-04 Thread Melanie Plageman
On Tue, Apr 1, 2025 at 5:30 PM Masahiko Sawada wrote: > > > I've attached the new version patch. There are no major changes; I > fixed some typos, improved the comment, and removed duplicated codes. > Also, I've updated the commit messages. I haven't looked closely at this version but I did notic

Re: Parallel heap vacuum

2025-03-27 Thread Masahiko Sawada
On Wed, Mar 26, 2025 at 1:00 PM Melanie Plageman wrote: > > On Mon, Mar 24, 2025 at 7:58 PM Masahiko Sawada wrote: > > > > You're right. I've studied the read stream code and figured out how to > > use it. In the attached patch, we end the read stream at the end of > > phase 1 and start a new rea

Re: Parallel heap vacuum

2025-03-26 Thread Melanie Plageman
On Mon, Mar 24, 2025 at 7:58 PM Masahiko Sawada wrote: > > You're right. I've studied the read stream code and figured out how to > use it. In the attached patch, we end the read stream at the end of > phase 1 and start a new read stream, as you suggested. I've started looking at this patch set s

Re: Parallel heap vacuum

2025-03-24 Thread Masahiko Sawada
On Sun, Mar 23, 2025 at 10:13 AM Andres Freund wrote: > > Hi, > > On 2025-03-23 01:45:35 -0700, Masahiko Sawada wrote: > > Another idea is that parallel workers don't exit phase 1 until it > > consumes all pinned buffers in the queue, even if the memory usage of > > TidStore exceeds the limit. > >

Re: Parallel heap vacuum

2025-03-24 Thread Andres Freund
Hi, On 2025-03-23 01:45:35 -0700, Masahiko Sawada wrote: > Another idea is that parallel workers don't exit phase 1 until it > consumes all pinned buffers in the queue, even if the memory usage of > TidStore exceeds the limit. Yes, that seems a quite reasonable approach to me. > It would need t

Re: Parallel heap vacuum

2025-03-23 Thread Masahiko Sawada
usage of TidStore reaches 70% of the limit or so. On the other hand, it means that we would not use the streaming read for the blocks in this mode, which is not efficient. > > > One plausible solution would be that we don't use ReadStream in > > parallel heap vacuum cases but d

Re: Parallel heap vacuum

2025-03-22 Thread Andres Freund
Hi, On 2025-03-20 01:35:42 -0700, Masahiko Sawada wrote: > One plausible solution would be that we don't use ReadStream in > parallel heap vacuum cases but directly use > table_block_parallelscan_xxx() instead. It works but we end up having > two different scan methods for

Re: Parallel heap vacuum

2025-03-22 Thread Melanie Plageman
later, it can initialize vacrel->current_block to the last block processed. > One plausible solution would be that we don't use ReadStream in > parallel heap vacuum cases but directly use > table_block_parallelscan_xxx() instead. It works but we end up having > two different

Re: Parallel heap vacuum

2025-03-15 Thread Masahiko Sawada
; | | | | | > > --2.63%--alloc_set_pte > > | | | | | > > pfn_pte > > | | | | | > > | |

Re: Parallel heap vacuum

2025-03-13 Thread Masahiko Sawada
On Wed, Mar 12, 2025 at 3:05 AM Amit Kapila wrote: > > On Wed, Mar 12, 2025 at 3:12 AM Masahiko Sawada wrote: > > > > On Tue, Mar 11, 2025 at 6:00 AM Amit Kapila wrote: > > > > > > On Mon, Mar 10, 2025 at 11:57 PM Masahiko Sawada > > > wrote: > > > > > > > > On Sun, Mar 9, 2025 at 11:12 PM Ami

Re: Parallel heap vacuum

2025-03-12 Thread Amit Kapila
On Wed, Mar 12, 2025 at 11:24 AM Dilip Kumar wrote: > > On Wed, Mar 12, 2025 at 3:17 AM Masahiko Sawada wrote: > > > > On Tue, Mar 11, 2025 at 5:51 AM Amit Kapila wrote: > > > > > Some thoughts/questions on the idea > > I notice that we are always considering block-level parallelism for > heaps

Re: Parallel heap vacuum

2025-03-12 Thread Dilip Kumar
On Wed, Mar 12, 2025 at 3:40 PM Amit Kapila wrote: > > On Wed, Mar 12, 2025 at 11:24 AM Dilip Kumar wrote: > > > > On Wed, Mar 12, 2025 at 3:17 AM Masahiko Sawada > > wrote: > > > > > > On Tue, Mar 11, 2025 at 5:51 AM Amit Kapila > > > wrote: > > > > > > > > Some thoughts/questions on the ide

Re: Parallel heap vacuum

2025-03-12 Thread Amit Kapila
On Wed, Mar 12, 2025 at 3:12 AM Masahiko Sawada wrote: > > On Tue, Mar 11, 2025 at 6:00 AM Amit Kapila wrote: > > > > On Mon, Mar 10, 2025 at 11:57 PM Masahiko Sawada > > wrote: > > > > > > On Sun, Mar 9, 2025 at 11:12 PM Amit Kapila > > > wrote: > > > > > > > > > > > > > However, in the heap

Re: Parallel heap vacuum

2025-03-11 Thread Dilip Kumar
On Wed, Mar 12, 2025 at 3:17 AM Masahiko Sawada wrote: > > On Tue, Mar 11, 2025 at 5:51 AM Amit Kapila wrote: > > Some thoughts/questions on the idea I notice that we are always considering block-level parallelism for heaps and object-level parallelism for indexes. I'm wondering, when multiple

Re: Parallel heap vacuum

2025-03-11 Thread Dilip Kumar
On Wed, Mar 12, 2025 at 11:24 AM Dilip Kumar wrote: > > On Wed, Mar 12, 2025 at 3:17 AM Masahiko Sawada wrote: > > Some more specific comments on the patch set -- v11-0001 1. This introduces functions like parallel_vacuum_estimate(), parallel_vacuum_initialize(), etc., but these functions have

Re: Parallel heap vacuum

2025-03-11 Thread Masahiko Sawada
On Tue, Mar 11, 2025 at 5:51 AM Amit Kapila wrote: > > On Tue, Mar 11, 2025 at 5:00 AM Masahiko Sawada wrote: > > > > On Sun, Mar 9, 2025 at 11:28 PM Amit Kapila wrote: > > > > > > > > > Does phase 3 also use parallelism? If so, can we try to divide the > > > ring buffers among workers or at lea

Re: Parallel heap vacuum

2025-03-11 Thread Masahiko Sawada
On Tue, Mar 11, 2025 at 6:00 AM Amit Kapila wrote: > > On Mon, Mar 10, 2025 at 11:57 PM Masahiko Sawada > wrote: > > > > On Sun, Mar 9, 2025 at 11:12 PM Amit Kapila wrote: > > > > > > > > > > However, in the heap vacuum phase, the leader process needed > > > > to process all blocks, resulting i

Re: Parallel heap vacuum

2025-03-11 Thread Masahiko Sawada
indexes. > > Does that refer to something you did implement or you are saying we > could do that in the future? It referred to the parallel heap vacuum implementation that I wrote. Since the parallel degrees for parallel heap scan and parallel index vacuuming are chosen separately based on d

Re: Parallel heap vacuum

2025-03-11 Thread Amit Kapila
On Mon, Mar 10, 2025 at 11:57 PM Masahiko Sawada wrote: > > On Sun, Mar 9, 2025 at 11:12 PM Amit Kapila wrote: > > > > > > > However, in the heap vacuum phase, the leader process needed > > > to process all blocks, resulting in soft page faults while creating > > > Page Table Entries (PTEs). With

Re: Parallel heap vacuum

2025-03-11 Thread Amit Kapila
On Tue, Mar 11, 2025 at 5:00 AM Masahiko Sawada wrote: > > On Sun, Mar 9, 2025 at 11:28 PM Amit Kapila wrote: > > > > > > Does phase 3 also use parallelism? If so, can we try to divide the > > ring buffers among workers or at least try vacuum with an increased > > number of ring buffers. This wou

Re: Parallel heap vacuum

2025-03-10 Thread Melanie Plageman
On Sat, Mar 8, 2025 at 1:42 AM Masahiko Sawada wrote: > > > I've attached the updated version patches. I've started trying to review this and realized that, while I'm familiar with heap vacuuming code, I'm not familiar enough with the vacuumparallel.c machinery to be of help without much addition

Re: Parallel heap vacuum

2025-03-10 Thread Masahiko Sawada
On Sun, Mar 9, 2025 at 11:28 PM Amit Kapila wrote: > > On Fri, Mar 7, 2025 at 11:06 PM Masahiko Sawada wrote: > > > > Discussing with Amit offlist, I've run another benchmark test where no > > data is loaded on the shared buffer. In the previous test, I loaded > > all table blocks before running

Re: Parallel heap vacuum

2025-03-10 Thread Peter Smith
Hi Sawada-San. Some review comments for patch v11-0003. == src/backend/access/heap/vacuumlazy.c 1. +typedef struct LVScanData +{ + BlockNumber rel_pages; /* total number of pages */ + + BlockNumber scanned_pages; /* # pages examined (not skipped via VM) */ + + /* + * Count of all-visible blo

Re: Parallel heap vacuum

2025-03-09 Thread Amit Kapila
On Fri, Mar 7, 2025 at 11:06 PM Masahiko Sawada wrote: > > Discussing with Amit offlist, I've run another benchmark test where no > data is loaded on the shared buffer. In the previous test, I loaded > all table blocks before running vacuum, so it was the best case. The > attached test results sho

Re: Parallel heap vacuum

2025-03-09 Thread Amit Kapila
| | | | > | | | --1.99%--TransactionIdPrecedes > > I did not observe these page faults in the 'perf record' results for > the HEAD version. Furthermore, when I disabled parallel heap vacuum > while keep

Re: Parallel heap vacuum

2025-03-09 Thread Peter Smith
Hi Sawada-San, Here are some review comments for patch v11-0001. == src/backend/access/heap/vacuumlazy.c 1. +/* + * Compute the number of workers for parallel heap vacuum. + * + * Return 0 to disable parallel vacuum so far. + */ +int +heap_parallel_vacuum_compute_workers(Relation rel, int

Re: Parallel heap vacuum

2025-03-09 Thread Peter Smith
Hi Sawada-San. Here are some review comments for patch v11-0002 == Commit message. 1. Heap table AM disables the parallel heap vacuuming for now, but an upcoming patch uses it. This function implementation was moved into patch 0001, so probably this part of the commit message comment also b

Re: Parallel heap vacuum

2025-03-07 Thread Masahiko Sawada
On Thu, Mar 6, 2025 at 5:33 PM Peter Smith wrote: > > Some minor review comments for patch v10-0001. > > == > src/include/access/tableam.h > > 1. > struct IndexInfo; > +struct ParallelVacuumState; > +struct ParallelContext; > +struct ParallelWorkerContext; > struct SampleScanState; > > Use a

Re: Parallel heap vacuum

2025-03-07 Thread Masahiko Sawada
On Mon, Mar 3, 2025 at 3:24 PM Masahiko Sawada wrote: > > On Mon, Mar 3, 2025 at 1:28 AM Masahiko Sawada wrote: > > > > On Tue, Feb 25, 2025 at 4:49 PM Masahiko Sawada > > wrote: > > > > > > On Tue, Feb 25, 2025 at 2:44 PM Melanie Plageman > > > wrote: > > > > > > > > On Tue, Feb 25, 2025 at 5

Re: Parallel heap vacuum

2025-03-06 Thread Peter Smith
t and call the new function 'heapam_parallel_vacuum_compute_workers' instead of 'heap_parallel_vacuum_compute_workers'? == src/backend/access/heap/vacuumlazy.c 2. +/* + * Compute the number of workers for parallel heap vacuum. + * + * Disabled so far. + */ +int +heap_p

Re: Parallel heap vacuum

2025-03-06 Thread Peter Smith
Hi Sawada-San. FYI. I am observing the following test behaviour: I apply patch v10-0001, do a clean rebuild and run 'make check', and all tests are OK. Then, after I apply just patch v10-0002 on top of 0001, do a clean rebuild and run 'make check' there are many test fails. == Kind Regards,

Re: Parallel heap vacuum

2025-03-06 Thread Peter Smith
Some minor review comments for patch v10-0001. == src/include/access/tableam.h 1. struct IndexInfo; +struct ParallelVacuumState; +struct ParallelContext; +struct ParallelWorkerContext; struct SampleScanState; Use alphabetical order for consistency with existing code. ~~~ 2. + /* + * Esti

Re: Parallel heap vacuum

2025-03-04 Thread Masahiko Sawada
| | | --1.99%--pmd_page_vaddr | | | | | | | --1.99%--TransactionIdPrecedes I did not observe these page faults in the 'perf record' results for the HEAD version. Furthermore, when I disabled parallel heap vacuum while keeping parallel index vacuuming enabled, the regression dis

Re: Parallel heap vacuum

2025-03-03 Thread Masahiko Sawada
On Mon, Mar 3, 2025 at 1:28 AM Masahiko Sawada wrote: > > On Tue, Feb 25, 2025 at 4:49 PM Masahiko Sawada wrote: > > > > On Tue, Feb 25, 2025 at 2:44 PM Melanie Plageman > > wrote: > > > > > > On Tue, Feb 25, 2025 at 5:14 PM Masahiko Sawada > > > wrote: > > > > > > > > Given that we have only

Re: Parallel heap vacuum

2025-02-25 Thread Masahiko Sawada
On Tue, Feb 25, 2025 at 2:44 PM Melanie Plageman wrote: > > On Tue, Feb 25, 2025 at 5:14 PM Masahiko Sawada wrote: > > > > Given that we have only about one month until the feature freeze, I > > find that it's realistic to introduce either one parallelism for PG18 > > and at least we might want t

Re: Parallel heap vacuum

2025-02-25 Thread Melanie Plageman
On Tue, Feb 25, 2025 at 5:14 PM Masahiko Sawada wrote: > > Given that we have only about one month until the feature freeze, I > find that it's realistic to introduce either one parallelism for PG18 > and at least we might want to implement the one first that is more > beneficial and helpful for u

Re: Parallel heap vacuum

2025-02-25 Thread Masahiko Sawada
On Tue, Feb 25, 2025 at 9:59 AM Melanie Plageman wrote: > > On Mon, Feb 24, 2025 at 8:15 PM Masahiko Sawada wrote: > > > > What I can see from these results was that we might not benefit much > > from parallelizing phase III, unfortunately. Although in the best case > > the phase III got about 2x

Re: Parallel heap vacuum

2025-02-25 Thread Melanie Plageman
easant work > > finishes faster.) > > I think that vacuum would still need to scan a large amount of blocks > of the table especially when it is very large and heavily modified. > Parallel heap vacuum (only phase I) would be beneficial in case where > autovacuum could not cat

Re: Parallel heap vacuum

2025-02-24 Thread Masahiko Sawada
sant work > finishes faster.) I think that vacuum would still need to scan a large amount of blocks of the table especially when it is very large and heavily modified. Parallel heap vacuum (only phase I) would be beneficial in case where autovacuum could not catch up. And we might want to consid

Re: Parallel heap vacuum

2025-02-24 Thread Jim Nasby
On Mon, Feb 17, 2025 at 12:11 PM Masahiko Sawada wrote: > > If the idea is to never allow parallelism in vacuum, then I think > > disabling eager scanning during manual parallel vacuum seems > > reasonable. People could use vacuum freeze if they want more freezing. > > IIUC the purpose of paralle

Re: Parallel heap vacuum

2025-02-23 Thread John Naylor
esigned for multiple parallel workers > to iterate a shared TidStore. During an iteration, parallel workers > share the iteration state and iterate the underlying radix tree while > taking appropriate locks. Therefore, it's available only for a shared > TidStore. This is required t

Re: Parallel heap vacuum

2025-02-19 Thread Masahiko Sawada
On Tue, Feb 18, 2025 at 4:43 PM Melanie Plageman wrote: > > On Mon, Feb 17, 2025 at 1:11 PM Masahiko Sawada wrote: > > > > On Fri, Feb 14, 2025 at 2:21 PM Melanie Plageman > > wrote: > > > > > > Since the failure rate is defined as a percent, couldn't we just have > > > parallel workers set eage

Re: Parallel heap vacuum

2025-02-18 Thread Melanie Plageman
On Mon, Feb 17, 2025 at 1:11 PM Masahiko Sawada wrote: > > On Fri, Feb 14, 2025 at 2:21 PM Melanie Plageman > wrote: > > > > Since the failure rate is defined as a percent, couldn't we just have > > parallel workers set eager_scan_remaining_fails when they get their > > chunk assignment (as a per

Re: Parallel heap vacuum

2025-02-17 Thread Masahiko Sawada
On Fri, Feb 14, 2025 at 2:21 PM Melanie Plageman wrote: > > On Wed, Feb 12, 2025 at 5:37 PM Masahiko Sawada wrote: > > > > Since we introduced the eagar vacuum scan (052026c9b9), I need to > > update the parallel heap vacuum patch. After thinking of how to > > i

Re: Parallel heap vacuum

2025-02-14 Thread Melanie Plageman
On Wed, Feb 12, 2025 at 5:37 PM Masahiko Sawada wrote: > > Since we introduced the eagar vacuum scan (052026c9b9), I need to > update the parallel heap vacuum patch. After thinking of how to > integrate these two features, I find some complexities. The region > size used by eager

Re: Parallel heap vacuum

2025-02-14 Thread Masahiko Sawada
On Thu, Feb 13, 2025 at 8:16 PM John Naylor wrote: > > On Thu, Feb 13, 2025 at 5:37 AM Masahiko Sawada wrote: > > > > During eager vacuum scan, we reset the eager_scan_remaining_fails > > counter when we start to scan the new region. So if we want to make > > parall

Re: Parallel heap vacuum

2025-02-13 Thread John Naylor
On Thu, Feb 13, 2025 at 5:37 AM Masahiko Sawada wrote: > > During eager vacuum scan, we reset the eager_scan_remaining_fails > counter when we start to scan the new region. So if we want to make > parallel heap vacuum behaves exactly the same way as the > single-progress vacuum

Re: Parallel heap vacuum

2025-02-12 Thread Masahiko Sawada
s. The region size used by eager vacuum scan and the chunk size used by parallel table scan are different. While the region is fixed size the chunk becomes smaller as we scan the table. A chunk of the table that a parallel vacuum worker took could be across different regions or be within one region

Re: Parallel heap vacuum

2025-01-21 Thread Masahiko Sawada
On Sun, Jan 19, 2025 at 7:50 AM Tomas Vondra wrote: > > Hi, > > Thanks for the new patches. I've repeated my benchmarking on v8, and I > agree this looks fine - the speedups are reasonable and match what I'd > expect on this hardware. I don't see any suspicious results like with > the earlier patc

Re: Parallel heap vacuum

2025-01-18 Thread Dilip Kumar
gt;> > > > > > > > IIRC, there was one of the blocker for implementing parallel heap vacuum > was group locking, have we already resolved that issue or its being > included in this patch set? > > I recall we had some discussion on changes to group locking for >

Re: Parallel heap vacuum

2025-01-17 Thread Masahiko Sawada
On Fri, Jan 17, 2025 at 1:43 AM Dilip Kumar wrote: > > On Fri, Jan 17, 2025 at 6:37 AM Masahiko Sawada wrote: >> >> On Sun, Jan 12, 2025 at 1:34 AM Masahiko Sawada >> wrote: >> > > > > IIRC, there was one of the blocker for implementing parallel

Re: Parallel heap vacuum

2025-01-17 Thread Dilip Kumar
On Fri, Jan 17, 2025 at 6:37 AM Masahiko Sawada wrote: > On Sun, Jan 12, 2025 at 1:34 AM Masahiko Sawada > wrote: > > > IIRC, there was one of the blocker for implementing parallel heap vacuum was group locking, have we already resolved that issue or its being included in

Re: Parallel heap vacuum

2024-12-25 Thread Tomas Vondra
On 12/19/24 23:05, Masahiko Sawada wrote: > On Sat, Dec 14, 2024 at 1:24 PM Tomas Vondra wrote: >> >> On 12/13/24 00:04, Tomas Vondra wrote: >>> ... >>> >>> The main difference is here: >>> >>> >>> master / no parallel workers: >>> >>> pages: 0 removed, 221239 remain, 221239 scanned (100.00%

RE: Parallel heap vacuum

2024-12-23 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, Thanks for updating the patch. ISTM that 0001 and 0002 can be applied independently. Therefore I can firstly post some comments only for them. Comments for 0001: ``` +/* New estimated total # of tuples and total # of live tuples */ ``` There is a unnecessary blank. ``` +

Re: Parallel heap vacuum

2024-12-19 Thread Masahiko Sawada
On Sat, Dec 14, 2024 at 1:24 PM Tomas Vondra wrote: > > On 12/13/24 00:04, Tomas Vondra wrote: > > ... > > > > The main difference is here: > > > > > > master / no parallel workers: > > > > pages: 0 removed, 221239 remain, 221239 scanned (100.00% of total) > > > > 1 parallel worker: > > > > pa

Re: Parallel heap vacuum

2024-12-14 Thread Tomas Vondra
On 12/13/24 00:04, Tomas Vondra wrote: > ... > > The main difference is here: > > > master / no parallel workers: > > pages: 0 removed, 221239 remain, 221239 scanned (100.00% of total) > > 1 parallel worker: > > pages: 0 removed, 221239 remain, 10001 scanned (4.52% of total) > > > Clearl

Re: Parallel heap vacuum

2024-12-12 Thread Tomas Vondra
On 12/13/24 00:04, Tomas Vondra wrote: > > ... > Attached are results.csv with raw data, and a PDF showing the difference > between master and patched build with varying number of workers. The > columns on the right show timing relative to master (with no parallel > workers). Green means "faster"

Re: Parallel heap vacuum

2024-12-12 Thread Tomas Vondra
On 12/9/24 19:47, Tomas Vondra wrote: > Hi, > > Thanks for working on this. I took a quick look at this today, to do > some basic review. I plan to do a bunch of testing, but that's mostly to > get a better idea of what kind of improvements to expect - the initial > results look quite nice and sen

Re: Parallel heap vacuum

2024-12-11 Thread Masahiko Sawada
a for parallel heap scanning and vacuuming are stored in PHVState. > > 4) I think it would be good to have some sort of README explaining how > the parallel heap vacuum works, i.e. how it's driven by FSM. Took me a > while to realize how the workers coordinate which blocks to sc

RE: Parallel heap vacuum

2024-12-10 Thread Hayato Kuroda (Fujitsu)
I think it would be good to have some sort of README explaining how > the parallel heap vacuum works, i.e. how it's driven by FSM. Took me a > while to realize how the workers coordinate which blocks to scan. I love the idea, it is quite helpful for reviewers like me. Best regards, Hayato Kuroda FUJITSU LIMITED

Re: Parallel heap vacuum

2024-12-09 Thread Tomas Vondra
ould be good to have some sort of README explaining how the parallel heap vacuum works, i.e. how it's driven by FSM. Took me a while to realize how the workers coordinate which blocks to scan. 5) Wouldn't it be better to introduce the scan_stats (grouping some of the fields in a separate pa

Re: Parallel heap vacuum

2024-12-09 Thread Peter Smith
Hi Sawada-San. FWIW, here are the remainder of my review comments for the patch v4-0001 == src/backend/access/heap/vacuumlazy.c lazy_scan_heap: 2.1. + /* + * Do the actual work. If parallel heap vacuum is active, we scan and + * vacuum heap with parallel workers. + */ /with/using

Re: Parallel heap vacuum

2024-12-05 Thread Peter Smith
parallel vacuum workers + */ +typedef struct PHVShared The comment wording is not quite right. /that need to be shared/that needs to be shared/ ~~~ 1.3. +/* Struct for parallel heap vacuum */ +typedef struct PHVState +{ + /* Parallel scan description shared among parallel workers

Re: Parallel heap vacuum

2024-12-05 Thread Peter Smith
Hi Sawada-San, FYI, the patch 0001 fails to build stand-alone vacuumlazy.c: In function ‘parallel_heap_vacuum_gather_scan_stats’: vacuumlazy.c:3739:21: error: ‘LVRelScanStats’ has no member named ‘vacuumed_pages’ vacrel->scan_stats->vacuumed_pages += ss->vacuumed_pages; ^

RE: Parallel heap vacuum

2024-11-25 Thread Hayato Kuroda (Fujitsu)
Dear Swada-san, > > BTW while updating the patch, I found that we might want to launch > different numbers of workers for scanning heap and vacuuming heap. The > number of parallel workers is determined based on the number of blocks > in the table. However, even if this number is high, it could h

Re: Parallel heap vacuum

2024-11-18 Thread Masahiko Sawada
be shared among parallel vacuum > workers > + */ > +typedef struct PHVShared > +{ > + bool aggressive; > + boolskipwithvm; > + > > 3.c) Similarly this too: > +/* Per-worker scan state for parallel heap vacuum scan */ > +typedef struct

Re: Parallel heap vacuum

2024-11-15 Thread Masahiko Sawada
terate the underlying radix tree while > > taking appropriate locks. Therefore, it's available only for a shared > > TidStore. This is required to implement the parallel heap vacuum, > > where multiple parallel workers do the iteration on the shared > > TidStore. > >

RE: Parallel heap vacuum

2024-11-13 Thread Hayato Kuroda (Fujitsu)
ilable only for a shared > TidStore. This is required to implement the parallel heap vacuum, > where multiple parallel workers do the iteration on the shared > TidStore. > > On the other hand, TidStoreBeginIterate() is designed for a single > process to iterate a TidStore. It acce

Re: Parallel heap vacuum

2024-11-12 Thread vignesh C
rozen tuples */ 3.b) Similarly this too: +/* + * Struct for information that need to be shared among parallel vacuum workers + */ +typedef struct PHVShared +{ + boolaggressive; + boolskipwithvm; + 3.c) Similarly this too: +/* Per-worker scan state for paral

Re: Parallel heap vacuum

2024-11-11 Thread Masahiko Sawada
o iterate a shared TidStore. During an iteration, parallel workers share the iteration state and iterate the underlying radix tree while taking appropriate locks. Therefore, it's available only for a shared TidStore. This is required to implement the parallel heap vacuum, where mult

RE: Parallel heap vacuum

2024-11-11 Thread Hayato Kuroda (Fujitsu)
Dear Sawda-san, > > I've attached new version patches that fixes failures reported by > cfbot. I hope these changes make cfbot happy. Thanks for updating the patch and sorry for delaying the reply. I confirmed cfbot for Linux/Windows said ok. I'm still learning the feature so I can post only on

Re: Parallel heap vacuum

2024-10-22 Thread Masahiko Sawada
st pages are all-visible. Which might not work for other table > > AMs. > > > > Okay, thanks for confirming. I wanted to ask others as well. > > > > I'm updating the patch to implement parallel heap vacuum and will > > share the updated patch. It might take time a

Re: Parallel heap vacuum

2024-07-26 Thread Masahiko Sawada
ber of parallel workers as we don't want to launch many workers on the table where most pages are all-visible. Which might not work for other table AMs. I'm updating the patch to implement parallel heap vacuum and will share the updated patch. It might take time as it requires to implement shared iteration support in radx tree. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com

RE: Parallel heap vacuum

2024-07-25 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, > Thank you for the test! > > I could reproduce this issue and it's a bug; it skipped even > non-all-visible pages. I've attached the new version patch. > > BTW since we compute the number of parallel workers for the heap scan > based on the table size, it's possible that we lau

Re: Parallel heap vacuum

2024-07-07 Thread Masahiko Sawada
Also, I guess that the lookup > > performance of the local TidStore is better than the shared TidStore's > > lookup performance because of the differences between a bump context > > and an DSA area. I think that this difference contributed the fact > > that index vacuuming

Re: Parallel heap vacuum

2024-07-07 Thread Masahiko Sawada
t; Sounds great. IIUC, vacuuming is still one of the main weak point of postgres. > > > I've attached a PoC patch for this feature. It implements only > > parallel heap scans in lazyvacum. We can extend this feature to > > support parallel heap vacuum as well in the future or

RE: Parallel heap vacuum

2024-07-05 Thread Hayato Kuroda (Fujitsu)
r this feature. It implements only > parallel heap scans in lazyvacum. We can extend this feature to > support parallel heap vacuum as well in the future or in the same > patch. Before diving into deep, I tested your PoC but found unclear point. When the vacuuming is requested with parallel

Re: Parallel heap vacuum

2024-06-28 Thread Amit Kapila
l TidStore is better than the shared TidStore's > lookup performance because of the differences between a bump context > and an DSA area. I think that this difference contributed the fact > that index vacuuming got slower (16.74 vs. 22.04). > > There are two obvious improvem

Parallel heap vacuum

2024-06-27 Thread Masahiko Sawada
ng on the single heap table. This would be helpful to speedup vacuuming for tables without indexes or tables with INDEX_CLENAUP = off. I've attached a PoC patch for this feature. It implements only parallel heap scans in lazyvacum. We can extend this feature to support parallel heap vacuum as w