[PATCH] Avoid useless prefetches in case of recent FPI WAL records

2025-03-29 Thread Michael Zhilin
k you! -- Michael Zhilin Postgres Professional https://www.postgrespro.ru From 7367cce118e5196b7e916c1d8e8403198c5334a5 Mon Sep 17 00:00:00 2001 From: Michael Zhilin Date: Sun, 30 Mar 2025 01:30:37 +0300 Subject: [PATCH v1] Avoid useless prefetches in case of recent FPI WAL records If block is logged i

Re: Compression of bigger WAL records

2025-01-30 Thread wenhui qiu
Hi Andery I have a question ,If wal_compression_threshold is set to more than the block size of the wal log, then the FPI is not compressed, and if so, it might make sense to have a maximum value of this parameter that does not exceed the block size of the wal log? Best regards On Thu, Jan

Re: Compression of bigger WAL records

2025-01-30 Thread Andrey Borodin
> On 23 Jan 2025, at 20:13, Japin Li wrote: > > > I find this feature interesting; Thank you for your interest in the patch! > however, it cannot be applied to the current > master (b35434b134b) due to commit 32a18cc0a73. PFA a rebased version. > > I see the patch compresses the WAL recor

Re: Compression of bigger WAL records

2025-01-28 Thread Fujii Masao
On 2025/01/22 3:24, Andrey M. Borodin wrote: On 12 Jan 2025, at 17:43, Andrey M. Borodin wrote: I attach a prototype patch. Here's v2, now it passes all the tests with wal_debug. I like the idea of WAL compression more. With the current approach, each backend needs to allocate memory

Re: Compression of bigger WAL records

2025-01-23 Thread Japin Li
125 MB 97 MB > > So, for lz4 and zstd this seems to be a significant reduction. > > I'm planning to work on improving the patch quality. > > Thanks! > Hi, Andrey Borodin I find this feature interesting; however, it cannot be applied to the current master (b35434b

Re: Compression of bigger WAL records

2025-01-21 Thread Andrey M. Borodin
> On 12 Jan 2025, at 17:43, Andrey M. Borodin wrote: > > I attach a prototype patch. Here's v2, now it passes all the tests with wal_debug. Some stats. On this test create table a as select random() from generate_series(1,1e7); select pg_stat_reset_shared('wal'); create index on a(random );

Re: Compression of bigger WAL records

2025-01-12 Thread Andrey Borodin
Hi! Thanks for looking into this! > On 12 Jan 2025, at 23:36, Kirill Reshke wrote: > > initdb fails when configured with --without-zstd Yes, the patch is intended to demonstrate improvement when using Zstd. > On 12 Jan 2025, at 17:43, Andrey M. Borodin wrote: > > WAL_DEBUG and wal_compressio

Re: Compression of bigger WAL records

2025-01-12 Thread Kirill Reshke
I ./pgbin/bin/pg_waldump On Sun, 12 Jan 2025 at 17:43, Andrey M. Borodin wrote: > > Hi hackers! > > I propose a slight change to WAL compression: compress body of big records, > if it's bigger than some threshold. > Hi, initdb fails when configured with --without-zstd ``` reshke@ygp-jammy:

Compression of bigger WAL records

2025-01-12 Thread Andrey M. Borodin
Hi hackers! I propose a slight change to WAL compression: compress body of big records, if it's bigger than some threshold. ===Rationale=== 0. Better compression ratio for full page images when pages are compressed together. Consider following test: set wal_compression to 'zstd'; create table

Re: Fsync (flush) all inserted WAL records

2024-08-20 Thread Vitaly Davydov
Dear All, I would propose a new function like GetXLogInsertRecPtr(), but with some modifications (please, see the attached patch). The result LSN can be passed to XLogFLush() safely. I believe, it will not raise an error in any case. XLogFlush(GetXLogLastInsertEndRecPtr()) will flush (fsync) a

Re: Fsync (flush) all inserted WAL records

2024-08-18 Thread Michael Paquier
On Wed, Aug 07, 2024 at 06:00:45PM +0300, Aleksander Alekseev wrote: > Assuming the function has value, as you claim, I see no reason not to > expose it similarly to pg_current_wal_*(). On top of that you will > have to test-cover it anyway. The easiest way to do it will be to have > an SQL-wrapper

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Aleksander Alekseev
Hi, > I use asynchronous commit (without XLogFlush/fsync at commit). At some moment > I would like to XLogFlush (fsync) all already asynchronously committed > transactions (inserted but not flushed/fsynced yet WAL records). Assume, that > there is no any active transactions at thi

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Vitaly Davydov
plica. Your case(s) however is different and I don't fully understand it. I use asynchronous commit (without XLogFlush/fsync at commit). At some moment I would like to XLogFlush (fsync) all already asynchronously committed transactions (inserted but not flushed/fsynced yet WAL records). As

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Aleksander Alekseev
Hi Vitaly, > I would propose a new function to fulfill my requirements like this (see > below) but I prefer not to create new functions unreasonably: > > XLogRecPtr > GetXLogLastInsertEndRecPtr(void) > { > XLogCtlInsert *Insert = &XLogCtl->Insert; > uint64 current_bytepos; > SpinLockA

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Vitaly Davydov
Hi Aleksander, On Wednesday, August 07, 2024 12:19 MSK, Aleksander Alekseev wrote:  > Does pg_current_wal_flush_lsn() [1] return what you need? > > [1]: > https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL If not, take a look at its implementation and func

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Aleksander Alekseev
Hi, > > Could you please advice which way to go? > > Does pg_current_wal_flush_lsn() [1] return what you need? > > [1]: > https://www.postgresql.org/docs/current/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL If not, take a look at its implementation and functions around, GetInsertRecPtr() and

Re: Fsync (flush) all inserted WAL records

2024-08-07 Thread Aleksander Alekseev
Hi, > I use async commits. At some moment, I would like to make sure that all > inserted WAL records are fsync-ed. I can use XLogFlush function but I have > some doubts which LSN to specify. There is a number of functions which return > write or insert LSNs but they are not appli

Fsync (flush) all inserted WAL records

2024-08-05 Thread Vitaly Davydov
Hi Hackers, I use async commits. At some moment, I would like to make sure that all inserted WAL records are fsync-ed. I can use XLogFlush function but I have some doubts which LSN to specify. There is a number of functions which return write or insert LSNs but they are not applicable. I

Re: Regarding t_cid in Neon heap WAL records

2024-07-24 Thread Heikki Linnakangas
On 24/07/2024 21:44, Muhammad Malik wrote: Neon added a t_cid field to heap WAL records https://github.com/yibit/neon-postgresql/blob/main/docs/core_changes.md#add-t_cid-to-heap-wal-records <https://github.com/yibit/neon-postgresql/blob/main/docs/core_changes.md#add-t_cid-to-heap-wal-reco

Regarding t_cid in Neon heap WAL records

2024-07-24 Thread Muhammad Malik
Neon added a t_cid field to heap WAL records https://github.com/yibit/neon-postgresql/blob/main/docs/core_changes.md#add-t_cid-to-heap-wal-records. However, when replaying the delete log record, it is discarding the combo flag and storing the raw cmax on the old tuple https://github.com

Re: Recording whether Heap2/PRUNE records are from VACUUM or from opportunistic pruning (Was: Show various offset arrays for heap WAL records)

2024-03-25 Thread Peter Geoghegan
On Mon, Mar 25, 2024 at 9:04 AM Heikki Linnakangas wrote: > I included changes like this in commit f83d709760 ("Merge prune, freeze > and vacuum WAL record formats"). Marking this as Committed in the > commitfest. Thanks for making sure that that happened. I suspect that the amount of pruning per

Re: Recording whether Heap2/PRUNE records are from VACUUM or from opportunistic pruning (Was: Show various offset arrays for heap WAL records)

2024-03-25 Thread Heikki Linnakangas
On 09/12/2023 23:48, Peter Geoghegan wrote: On Tue, Mar 21, 2023 at 3:37 PM Peter Geoghegan wrote: I think that we should do something like the attached, to completely avoid this ambiguity. This patch adds a new XLOG_HEAP2 bit that's similar to XLOG_HEAP_INIT_PAGE -- XLOG_HEAP2_BYVACUUM. This a

Recording whether Heap2/PRUNE records are from VACUUM or from opportunistic pruning (Was: Show various offset arrays for heap WAL records)

2023-12-09 Thread Peter Geoghegan
On Tue, Mar 21, 2023 at 3:37 PM Peter Geoghegan wrote: > I think that we should do something like the attached, to completely > avoid this ambiguity. This patch adds a new XLOG_HEAP2 bit that's > similar to XLOG_HEAP_INIT_PAGE -- XLOG_HEAP2_BYVACUUM. This allows all > XLOG_HEAP2 record types to in

Re: Show various offset arrays for heap WAL records

2023-10-02 Thread Heikki Linnakangas
On 04/09/2023 23:02, Melanie Plageman wrote: I might phrase the last bit as "neither the description functions nor the output format should be considered part of a stable API" +Guidelines for rmgrdesc output format += I noticed you used === for both headings

Re: Show various offset arrays for heap WAL records

2023-09-04 Thread Melanie Plageman
On Mon, Jul 10, 2023 at 3:44 AM Heikki Linnakangas wrote: > I'm late to the party, but regarding commit c03c2eae0a, which added the > guidelines for writing formatting desc functions: > > You moved the comment from rmgrdesc_utils.c into rmgrdesc_utils.h, but I > don't think that was a good idea. O

Suppress generating WAL records during the upgrade

2023-08-08 Thread Hayato Kuroda (Fujitsu)
ld node as binary-upgrade mode 2. Check confirmed_lsn of all the slots, and confirm all WALs are replicated to downstream 3. Dump slot info to sql file 4. Stop old node 5. Boot new node as binary-upgrade mode ... Here, step 2 was introduced for avoiding data loss. If there are some WAL rec

Re: Show various offset arrays for heap WAL records

2023-07-25 Thread Peter Geoghegan
On Mon, Jul 10, 2023 at 10:29 PM Peter Geoghegan wrote: > > Let's add a src/backend/access/rmgrdesc/README file. We don't currently > > have any explanation anywhere why the rmgr desc functions are in a > > separate directory. The README would be a good place to explain that, > > and to have the f

Re: Show various offset arrays for heap WAL records

2023-07-10 Thread Peter Geoghegan
On Mon, Jul 10, 2023 at 12:44 AM Heikki Linnakangas wrote: > This is still listed in the July commitfest; is there some work remaining? I don't think so; not in the scope of the original patch series from Melanie, at least. > You moved the comment from rmgrdesc_utils.c into rmgrdesc_utils.h, but

Re: Show various offset arrays for heap WAL records

2023-07-10 Thread Heikki Linnakangas
On 12/04/2023 01:29, Peter Geoghegan wrote: Thanks for your help with the follow-up work. Seems like we're done with this now. This is still listed in the July commitfest; is there some work remaining? I'm late to the party, but regarding commit c03c2eae0a, which added the guidelines for writ

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Peter Geoghegan
On Tue, Apr 11, 2023 at 2:29 PM Melanie Plageman wrote: > > That doesn't seem great to me either. I don't like this ambiguity, > > because it seems like it makes the description hard to parse in a way > > that flies in the face of what we're trying to do here, in general. > > So it seems like it m

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Melanie Plageman
On Tue, Apr 11, 2023 at 1:35 PM Peter Geoghegan wrote: > > On Tue, Apr 11, 2023 at 7:40 AM Melanie Plageman > wrote: > > Not the fault of this patch, but I also noticed that heap UPDATE and > > HOT_UPDATE records have xmax twice and don't differentiate between new > > and old. I think that was p

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Peter Geoghegan
On Tue, Apr 11, 2023 at 11:48 AM Peter Geoghegan wrote: > Attached revision deals with this by spelling out the names in full > (e.g., "old_xmax" and "new_xmax"). It also reorders the output fields > to match the order from the physical UPDATE, HOT_UPDATE, and LOCK WAL > record types, on the theor

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Peter Geoghegan
On Tue, Apr 11, 2023 at 10:34 AM Peter Geoghegan wrote: > > description | off: 119, xmax: 1105, flags: 0x00, old_infobits: > > [], new off: 100, xmax 0 > > That doesn't seem great to me either. I don't like this ambiguity, > because it seems like it makes the description hard to parse in a wa

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Peter Geoghegan
On Tue, Apr 11, 2023 at 7:40 AM Melanie Plageman wrote: > static void > infobits_desc(StringInfo buf, uint8 infobits, const char *keyname) > { > appendStringInfo(buf, "%s: [", keyname); > > Why can we assume that there will be no space at the end here? I don't think that anybody is going to t

Re: Show various offset arrays for heap WAL records

2023-04-11 Thread Melanie Plageman
Hi, static void infobits_desc(StringInfo buf, uint8 infobits, const char *keyname) { appendStringInfo(buf, "%s: [", keyname); Why can we assume that there will be no space at the end here? I know we need to be able to avoid doing the comma overwriting if no flags were set. In general, we exp

Re: Show various offset arrays for heap WAL records

2023-04-10 Thread Peter Geoghegan
On Mon, Apr 10, 2023 at 5:23 PM Melanie Plageman wrote: > If you keep the name, I'd explain it briefly in a comment above the code > then -- for those of us who spend less time with btrees. It is a tool > that will be often used by developers, so it is not unreasonable to > assume they may read th

Re: Show various offset arrays for heap WAL records

2023-04-10 Thread Melanie Plageman
On Mon, Apr 10, 2023 at 04:31:44PM -0700, Peter Geoghegan wrote: > On Mon, Apr 10, 2023 at 3:04 PM Melanie Plageman > wrote: > > > > I will say that the prefix of p in "ptid" makes it sound like pointer to > > a tid, which I don't believe is what you meant. > > I was thinking of the symbol name

Re: Show various offset arrays for heap WAL records

2023-04-10 Thread Peter Geoghegan
Note that the patch makes many individual (say) HOT_UPDATE records > > have descriptions that look like this: > > > > ... old_infobits: [], ... > > > > This differs from HEAD, where the output is totally suppressed because > > there are no flag bits to show. I think that

Re: Show various offset arrays for heap WAL records

2023-04-10 Thread Melanie Plageman
On Sun, Apr 9, 2023 at 8:12 PM Peter Geoghegan wrote: > > On Fri, Apr 7, 2023 at 4:46 PM Peter Geoghegan wrote: > > Pushed that one too. > > I noticed that the nbtree VACUUM and DELETE record types have their > update/xl_btree_update arrays output incorrectly. We cannot use the > generic array_de

Re: Show various offset arrays for heap WAL records

2023-04-10 Thread Peter Geoghegan
On Sun, Apr 9, 2023 at 5:12 PM Peter Geoghegan wrote: > I noticed that the nbtree VACUUM and DELETE record types have their > update/xl_btree_update arrays output incorrectly. We cannot use the > generic array_desc() approach with xl_btree_update elements, because > they're variable-width elements

Re: Show various offset arrays for heap WAL records

2023-04-09 Thread Peter Geoghegan
On Fri, Apr 7, 2023 at 4:46 PM Peter Geoghegan wrote: > Pushed that one too. I noticed that the nbtree VACUUM and DELETE record types have their update/xl_btree_update arrays output incorrectly. We cannot use the generic array_desc() approach with xl_btree_update elements, because they're variabl

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-09 Thread Michael Paquier
u would propose to have protection around needs_data as well as the whole record length. > PostgreSQL 15 specifically would benefit from this as it supports > external rmgrs which may generate WAL records and would benefit from > these additional checks, but all suppo

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-08 Thread Matthias van de Meent
benefit from this as it supports external rmgrs which may generate WAL records and would benefit from these additional checks, but all supported releases of PostgreSQL have pg_logical_emit_message and are thus easily subject to the issue of writing oversized WAL records and subsequent recovery- and repl

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Peter Geoghegan
On Fri, Apr 7, 2023 at 4:21 PM Melanie Plageman wrote: > It's come to my attention that I forgot to include the btree patch earlier. Pushed that one too. Also removed the use of the "restrict" keyword here. Thanks -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Melanie Plageman
On Fri, Apr 7, 2023 at 7:09 PM Peter Geoghegan wrote: > > On Fri, Apr 7, 2023 at 4:01 PM Melanie Plageman > wrote: > > LGTM > > Pushed, thanks. It's come to my attention that I forgot to include the btree patch earlier. PFA From 4f502b2513ba79d738e7ed87aaf7d18ed2a2e30f Mon Sep 17 00:00:00 2001

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Peter Geoghegan
On Fri, Apr 7, 2023 at 4:01 PM Melanie Plageman wrote: > LGTM Pushed, thanks. -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Melanie Plageman
On Fri, Apr 7, 2023 at 5:43 PM Peter Geoghegan wrote: > > On Fri, Apr 7, 2023 at 1:33 PM Melanie Plageman > wrote: > > Attached v3 is cleaned up and includes a pg_walinspect docs update as > > well as some edited comments in rmgr_utils.c > > Attached v4 has some small tweaks on your v3. Mostly ju

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Peter Geoghegan
On Fri, Apr 7, 2023 at 1:33 PM Melanie Plageman wrote: > Attached v3 is cleaned up and includes a pg_walinspect docs update as > well as some edited comments in rmgr_utils.c Attached v4 has some small tweaks on your v3. Mostly just whitespace tweaks. Two slightly notable tweaks: * I changed the

Re: Show various offset arrays for heap WAL records

2023-04-07 Thread Melanie Plageman
Attached v3 is cleaned up and includes a pg_walinspect docs update as well as some edited comments in rmgr_utils.c On Mon, Mar 27, 2023 at 6:27 PM Peter Geoghegan wrote: > > On Mon, Mar 27, 2023 at 2:29 PM Melanie Plageman > wrote: > > I went to add dedup records and noticed that since the actua

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-06 Thread Michael Paquier
On Fri, Apr 07, 2023 at 08:59:22AM +0900, Michael Paquier wrote: > Okay, cool! Done this one with 8fcb32d. -- Michael signature.asc Description: PGP signature

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-06 Thread Michael Paquier
On Fri, Apr 07, 2023 at 01:50:00AM +0200, Matthias van de Meent wrote: > Yes, that was a bad oversight, which would've shown up in tests on a system > with an endianness that my computer doesn't have... I don't think that we have many bigendian animals in the buildfarm, either.. > That looks fin

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-06 Thread Matthias van de Meent
On Fri, 7 Apr 2023, 01:35 Michael Paquier, wrote: > On Fri, Apr 07, 2023 at 08:08:34AM +0900, Michael Paquier wrote: > > So bumping mainrdata_len to uint64 is actually not entirely in line > > with this code. Well, it will work because we'd still fail a couple > > of lines down, but perhaps its

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-06 Thread Michael Paquier
ase, the upper bound would need to be adjusted. Before this, it was possible for an external module to create WAL records large enough to be assembled still not replayable, causing failures when replaying such WAL records on standbys. One case mentioned where this is possible is the in-core fun

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-06 Thread Michael Paquier
On Thu, Apr 06, 2023 at 10:54:43AM +0900, Michael Paquier wrote: > 0002 can also be done before 0001, so I'd like to get that part > applied on HEAD before the feature freeze and close this thread. If > there are any objections, please feel free.. I was doing a pre-commit review of the patch, and

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-05 Thread Michael Paquier
On Wed, Apr 05, 2023 at 04:35:37PM +0200, Matthias van de Meent wrote: > I thought that the plan was to use int64 to skip checking for most > overflows and to do a single check at the end in XLogRecordAssemble, > so that the checking has minimal overhead in the performance-critical > log record ass

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-04-05 Thread Matthias van de Meent
On Tue, 28 Mar 2023 at 13:42, Michael Paquier wrote: > > On Mon, Dec 19, 2022 at 12:37:19PM +0100, Alvaro Herrera wrote: > > I have created one in the January commitfest, > > https://commitfest.postgresql.org/41/ > > and rebased the patch on current master. (I have not reviewed this.) > > I have

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2023-03-28 Thread Michael Paquier
00:00:00 2001 From: Michael Paquier Date: Tue, 28 Mar 2023 20:34:31 +0900 Subject: [PATCH v10] Add protections in xlog record APIs against overflows Before this, it was possible for an extension to create malicious WAL records that were too large to replay; or that would overflow the xl_tot_len

Re: Show various offset arrays for heap WAL records

2023-03-27 Thread Peter Geoghegan
On Mon, Mar 27, 2023 at 2:29 PM Melanie Plageman wrote: > I went to add dedup records and noticed that since the actual > BTDedupInterval struct is what is put in the xlog, I would need access > to that type from nbtdesc.c, however, including nbtree.h doesn't seem to > work because it includes fil

Re: Show various offset arrays for heap WAL records

2023-03-27 Thread Melanie Plageman
On Mon, Mar 13, 2023 at 9:41 PM Peter Geoghegan wrote: > On Mon, Mar 13, 2023 at 4:01 PM Melanie Plageman > wrote: > > > I have added detail to xl_btree_delete and xl_btree_vacuum. I have added > > the updated/deleted target offset numbers and the updated tuples > > metadata. > > > > I wondered

Re: Show various offset arrays for heap WAL records

2023-03-21 Thread Peter Geoghegan
On Tue, Mar 21, 2023 at 3:37 PM Peter Geoghegan wrote: > One problem that I often run into when performing analysis of VACUUM > using pg_walinspect is the issue of *who* pruned which heap page, for > any given PRUNE record. Was it VACUUM/autovacuum, or was it > opportunistic pruning? There is no w

Re: Show various offset arrays for heap WAL records

2023-03-21 Thread Peter Geoghegan
On Mon, Mar 13, 2023 at 6:41 PM Peter Geoghegan wrote: > There are several different things that seem important to me > personally. These are in tension with each other, to a degree. These > are: > > 1. Like Andres, I'd really like to have some way of inspecting things > like heapam PRUNE, VACUUM,

Re: Show various offset arrays for heap WAL records

2023-03-13 Thread Peter Geoghegan
On Mon, Mar 13, 2023 at 4:01 PM Melanie Plageman wrote: > On Fri, Jan 27, 2023 at 3:02 PM Robert Haas wrote: > > I'm not sure what's best in terms of formatting details but I > > definitely like the idea of making pg_waldump show more details. > If I'm not mistaken, this would be quite difficult

Re: Show various offset arrays for heap WAL records

2023-03-13 Thread Melanie Plageman
g that deserves to be called an API. The obligation > to not break tools that are scraping the output in whatever way seems > kind of onerous right now -- just not having any gratuitous > inconsistencies (e.g., fixing totally inconsistent punctuation, making > the names for fields across

Re: Show various offset arrays for heap WAL records

2023-03-02 Thread Peter Eisentraut
On 01.03.23 17:11, Melanie Plageman wrote: diff --git a/contrib/pg_walinspect/pg_walinspect--1.0.sql b/contrib/pg_walinspect/pg_walinspect--1.0.sql index 08b3dd5556..eb8ff82dd8 100644 --- a/contrib/pg_walinspect/pg_walinspect--1.0.sql +++ b/contrib/pg_walinspect/pg_walinspect--1.0.sql @@ -17,7 +

Re: Show various offset arrays for heap WAL records

2023-03-01 Thread Melanie Plageman
On Tue, Jan 31, 2023 at 5:48 PM Peter Geoghegan wrote: > > On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan wrote: > > > I would also like to see functions like XLogRecGetBlockRefInfo() pass > > > something more useful than a stringinfo buffer so that we could easily > > > extract out the relfilen

Re: Show various offset arrays for heap WAL records

2023-02-01 Thread Robert Haas
On Wed, Feb 1, 2023 at 12:47 PM Peter Geoghegan wrote: > On Wed, Feb 1, 2023 at 5:20 AM Robert Haas wrote: > > If we're dumping a lot of details out of each WAL record, we might > > want to switch to a multi-line format of some kind. No one enjoys a > > 460-character wide line, let alone 46000. >

Re: Show various offset arrays for heap WAL records

2023-02-01 Thread Peter Geoghegan
On Wed, Feb 1, 2023 at 5:20 AM Robert Haas wrote: > If we're dumping a lot of details out of each WAL record, we might > want to switch to a multi-line format of some kind. No one enjoys a > 460-character wide line, let alone 46000. I generally prefer it when I can use psql without using expanded

Re: Show various offset arrays for heap WAL records

2023-02-01 Thread Robert Haas
On Tue, Jan 31, 2023 at 6:20 PM Peter Geoghegan wrote: > Actually the really wide output comes from COMMIT records. After I run > the regression tests, and execute some of my own custom pg_walinspect > queries, I see that some individual COMMIT records have a > length(description) of over 10,000 b

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan wrote: > Obviously what you're doing here will lead to a significant increase > in the verbosity of the output for affected WAL records. I don't feel > too bad about that, though. It's really an existing problem, and on

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
On Tue, Jan 31, 2023 at 1:52 PM Peter Geoghegan wrote: > > I would also like to see functions like XLogRecGetBlockRefInfo() pass > > something more useful than a stringinfo buffer so that we could easily > > extract out the relfilenode in pgwalinspect. > > That does seem particularly important. It

Re: Show various offset arrays for heap WAL records

2023-01-31 Thread Peter Geoghegan
uitous inconsistencies (e.g., fixing totally inconsistent punctuation, making the names for fields across WAL records consistent when they serve exactly the same purpose) would be a big improvement. As I mentioned in passing already, I actually don't think that the B-Tree WAL records are all that spe

Re: Show various offset arrays for heap WAL records

2023-01-27 Thread Robert Haas
On Fri, Jan 27, 2023 at 12:24 PM Melanie Plageman wrote: > I believe I have addressed this in the attached patch. I'm not sure what's best in terms of formatting details but I definitely like the idea of making pg_waldump show more details. I'd even like to have a way to extract the tuple data, w

Re: Show various offset arrays for heap WAL records

2023-01-27 Thread Melanie Plageman
LOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM, > > XLOG_HEAP2_FREEZE_PAGE. > > I'm bound to end up doing the same in index access methods. Might make > sense for the utility routines to live somewhere more centralized, at > least when code reuse is likely. Practically every index AM has WAL

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-22 Thread Michael Paquier
On Thu, Jan 12, 2023 at 05:37:40PM +0530, Bharath Rupireddy wrote: > I understand. I don't mind discussing something like [1] with the > following behaviour and discarding till_end_of_wal functions > altogether: > If start_lsn is NULL, error out/return NULL. > If end_lsn isn't specified, default to

Re: Show various offset arrays for heap WAL records

2023-01-16 Thread Peter Geoghegan
On Wed, Jan 11, 2023 at 3:11 PM Peter Geoghegan wrote: > On Wed, Jan 11, 2023 at 3:00 PM Andres Freund wrote: > > What are your thoughts about the place for the helper functions? You're ok > > with rmgrdesc_utils.[ch]? > > Yeah, that seems okay. BTW, while playing around with this patch today, I

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-12 Thread Bharath Rupireddy
On Thu, Jan 12, 2023 at 11:23 AM Michael Paquier wrote: > > On Wed, Jan 11, 2023 at 06:59:18PM +0530, Bharath Rupireddy wrote: > > I've done it that way for pg_get_wal_fpi_info. If this format looks > > okay, I can propose to do the same for other functions (for > > backpatching too) in a separate

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-11 Thread Michael Paquier
On Wed, Jan 11, 2023 at 06:59:18PM +0530, Bharath Rupireddy wrote: > I've done it that way for pg_get_wal_fpi_info. If this format looks > okay, I can propose to do the same for other functions (for > backpatching too) in a separate thread though. My vote would be to make that happen first, to hav

Re: Show various offset arrays for heap WAL records

2023-01-11 Thread Peter Geoghegan
al of diversity that we need to be considered. For example, the WAL records used by each individual index access method are all very similar. In fact the most important index AM WAL records used by each index AM (e.g. insert, delete, vacuum) have virtually the same format as each other already. -- Peter Geoghegan

Re: Show various offset arrays for heap WAL records

2023-01-11 Thread Andres Freund
Hi, On 2023-01-11 14:53:54 -0800, Peter Geoghegan wrote: > On Tue, Jan 10, 2023 at 11:35 AM Andres Freund wrote: > > Nontrivial, I'm afraid. We don't pass any relevant parameters to rm_desc: > > void(*rm_desc) (StringInfo buf, XLogReaderState > > *record); > > > > so we'd nee

Re: Show various offset arrays for heap WAL records

2023-01-11 Thread Peter Geoghegan
On Tue, Jan 10, 2023 at 11:35 AM Andres Freund wrote: > Nontrivial, I'm afraid. We don't pass any relevant parameters to rm_desc: > void(*rm_desc) (StringInfo buf, XLogReaderState *record); > > so we'd need to patch all of them. That might be worth doing at some point, > but I

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-11 Thread Bharath Rupireddy
i++] = LSNGetDatum(record->ReadRecPtr); + values[i++] = ObjectIdGetDatum(rnode.spcOid); + values[i++] = ObjectIdGetDatum(rnode.dbOid); + values[i++] = ObjectIdGetDatum(rnode.relNumber); + values[i++] = Int64GetDatum((int64) blk); + + if (fork >= 0 && fork <= MAX_FORKNUM) +

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-10 Thread Michael Paquier
On Tue, Jan 10, 2023 at 09:29:03AM +0100, Drouvot, Bertrand wrote: > Thanks for updating the patch! > > +-- Compare FPI from WAL record and page from table, they must be same > > I think "must be the same" or "must be identical" sounds better (but not 100% > sure). > > Except this nit, V4 looks

Re: Show various offset arrays for heap WAL records

2023-01-10 Thread Andres Freund
gt; The attached patch adds details to XLOG_HEAP2_PRUNE, XLOG_HEAP2_VACUUM, > > XLOG_HEAP2_FREEZE_PAGE. > > I'm bound to end up doing the same in index access methods. Might make > sense for the utility routines to live somewhere more centralized, at > least when code reuse is likely.

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-10 Thread Drouvot, Bertrand
--- 0/1891D78 |0 | 0 |44 | 8032 |8192 | 8192 | 4 | 735 (1 row) Ugh, v2 patch missed the new file added, I'm attaching v3 patch for further review. Sorry for the noise. I took a stab at how and what gets logged as FPI in WAL records: Option 1: WAL recor

Re: Show various offset arrays for heap WAL records

2023-01-09 Thread Peter Geoghegan
to end up doing the same in index access methods. Might make sense for the utility routines to live somewhere more centralized, at least when code reuse is likely. Practically every index AM has WAL records that include a sorted page offset number array, just like these ones. It's a very standard t

Show various offset arrays for heap WAL records

2023-01-09 Thread Andres Freund
Hi, A couple times when investigating data corruption issues, the last time just yesterday in [1], I needed to see the offsets affected by PRUNE and VACUUM records. As that's probably not just me, I think we should make that change in-tree. The attached patch adds details to XLOG_HEAP2_PRUNE, XLO

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-06 Thread Bharath Rupireddy
Ptr); + values[i++] = ObjectIdGetDatum(rnode.spcOid); + values[i++] = ObjectIdGetDatum(rnode.dbOid); + values[i++] = ObjectIdGetDatum(rnode.relNumber); + values[i++] = Int64GetDatum((int64) blk); + + if (fork >= 0 && fork <= MAX_FORKNUM) + values[i++] = CStringGetTextDatum(f

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-05 Thread Bharath Rupireddy
On Thu, Jan 5, 2023 at 6:51 PM Bharath Rupireddy wrote: > > > I'm also wondering if it would make sense to extend the test coverage of it > > (and pg_waldump) to "validate" that both > > extracted images are the same and matches the one modified right after the > > checkpoint. > > > > What do yo

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-05 Thread vignesh C
CAOxo6XKjQb2bMSBRpePf3ZpzfNTwjQUc4Tafh21=jzjx6bx...@mail.gmail.com > > > > I just have a few comments: > > Thanks for reviewing. > > > + > > +/* > > + * Get full page images and their info associated with a given WAL record. > > + */ > > &

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-05 Thread Bharath Rupireddy
ing. > + > +/* > + * Get full page images and their info associated with a given WAL record. > + */ > > > + > + Gets raw full page images and their information associated with all the > + valid WAL records between start_lsn and > + end_lsn. Returns on

Re: Add a new pg_walinspect function to extract FPIs from WAL records

2023-01-04 Thread Drouvot, Bertrand
Hi, On 12/27/22 12:48 PM, Bharath Rupireddy wrote: Hi, Here's a patch that implements the idea of extracting full page images from WAL records [1] [2] with a function in pg_walinspect. This new function accepts start and end lsn and returns full page image info such as WAL recor

Add a new pg_walinspect function to extract FPIs from WAL records

2022-12-27 Thread Bharath Rupireddy
Hi, Here's a patch that implements the idea of extracting full page images from WAL records [1] [2] with a function in pg_walinspect. This new function accepts start and end lsn and returns full page image info such as WAL record lsn, tablespace oid, database oid, relfile number, block n

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-12-19 Thread Alvaro Herrera
rotections in xlog record APIs against large numbers and overflows. Before this, it was possible for an extension to create malicious WAL records that were too large to replay; or that would overflow the xl_tot_len field, causing potential corruption in WAL record IO ops. Emitting invalid records

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-12-02 Thread Andres Freund
Hi, On 2022-07-26 18:58:02 +0200, Matthias van de Meent wrote: > - updated the MaxXLogRecordSize and XLogRecordLengthIsValid(len) > macros (now in xlogrecord.h), with a max length of the somewhat > arbitrary 1020MiB. > This leaves room for approx. 4MiB of per-record allocation overhead > before y

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-12-02 Thread Andres Freund
Hi, On 2022-12-02 14:22:55 +0900, Michael Paquier wrote: > On Fri, Nov 04, 2022 at 09:52:39AM +0900, Ian Lawrence Barwick wrote: > > CommitFest 2022-11 is currently underway, so if you are interested > > in moving this patch forward, now would be a good time to update it. > > No replies after 4 w

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-12-01 Thread Michael Paquier
On Fri, Nov 04, 2022 at 09:52:39AM +0900, Ian Lawrence Barwick wrote: > CommitFest 2022-11 is currently underway, so if you are interested > in moving this patch forward, now would be a good time to update it. No replies after 4 weeks, so I have marked this entry as returned with feedback. I am s

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-11-03 Thread Ian Lawrence Barwick
2022年10月5日(水) 16:46 Michael Paquier : > > Hi Matthias, > > On Wed, Jul 27, 2022 at 02:07:05PM +0200, Matthias van de Meent wrote: > > My apologies for the time it took me to come back to this thread. > > > + * To accommodate some overhead, hhis MaxXLogRecordSize value allows for > > > s/hhis/this/.

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-10-05 Thread Michael Paquier
Hi Matthias, On Wed, Jul 27, 2022 at 02:07:05PM +0200, Matthias van de Meent wrote: My apologies for the time it took me to come back to this thread. > > + * To accommodate some overhead, hhis MaxXLogRecordSize value allows for > > s/hhis/this/. > > Will be included in the next update.. v8 fail

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-07-27 Thread Matthias van de Meent
On Wed, 27 Jul 2022 at 11:09, Michael Paquier wrote: > > On Tue, Jul 26, 2022 at 06:58:02PM +0200, Matthias van de Meent wrote: > > - Retained the check in XLogRegisterData, so that we check against > > integer overflows in the registerdata code instead of only an assert > > in XLogRecordAssemble

Re: Non-replayable WAL records through overflows and >MaxAllocSize lengths

2022-07-27 Thread Michael Paquier
On Tue, Jul 26, 2022 at 06:58:02PM +0200, Matthias van de Meent wrote: > - Retained the check in XLogRegisterData, so that we check against > integer overflows in the registerdata code instead of only an assert > in XLogRecordAssemble where it might be too late. Why? The record has not been inser

  1   2   >