Re: BUG #17434: CREATE/DROP DATABASE can be executed in the same transaction with other commands

2022-11-25 Thread Israel Barth Rubio
> Attached is a draft patch along the lines I speculated about above.
> It breaks backwards compatibility in that PreventInTransactionBlock
> commands will now be rejected if they're a non-first command in a
> pipeline.  I think that's okay, and arguably desirable, for HEAD
> but I'm pretty uncomfortable about back-patching it.

I attempted to run the following sequence of extended query protocol messages (a single pipeline, ending in one sync) using HEAD, and it fails:

parse: create temporary table t1 (a int) on commit drop
bind
execute
parse: analyze t1
bind
execute
parse: select * from t1
bind
execute
sync

It then works fine after applying your patch!

Just for some context, this was brought up by Peter E. based on an issue
reported by a customer. They are using PostgreSQL 11, and the issue
was observed after upgrading to PostgreSQL 11.17, which includes
commit 9e3e1ac458abcda5aa03fa2a136e6fa492d58bd6. As a workaround
they downgraded the binaries to 11.16.

It would be great if we could back-patch this to all supported versions,
as the issue itself currently affects them all.

Regards,
Israel.


Re: Add support for DEFAULT specification in COPY FROM

2022-12-02 Thread Israel Barth Rubio
Hello all,

I'm submitting a new version of the patch. Instead of changing the
signature of several functions to pass the defaults parameter around,
it now stores that in the cstate structure, which is already passed to
all functions that were previously modified.

Best regards,
Israel.

On Fri, Oct 7, 2022 at 5:54 PM Israel Barth Rubio
<barthisr...@gmail.com> wrote:

> Hello Zhihong,
>
> > For the last question, please take a look at:
> >
> > #define MemSetAligned(start, val, len) \
> >
> > which is called by palloc0().
>
> Oh, I totally missed that. Thanks for the heads up!
>
> I'm attaching the new patch version, which contains both the fix
> to the problem reported by Andres, and removes this useless
> MemSet call.
>
> Best regards,
> Israel.
>


v6-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-19 Thread Israel Barth Rubio
Hello Jim,

> Hi Jelte, thanks for the message. You're right, an invalid cert path
> does solve the issue - I even use it for tests. Although it solves the
> authentication issue it still looks in my eyes like a non intuitive
> workaround/hack. Perhaps a new sslmode isn't the right place for this
> "feature"? Thanks again for the suggestion!

I do not think it is worth it to change the current behavior of PostgreSQL
in that sense.

PostgreSQL looks for the cert and key under `~/.postgresql` as a convenience.
These files do not exist by default, so if PostgreSQL finds something in
there it assumes you want to use it.

I also think it is correct to prefer the certificate over password based
authentication when it finds one, as certificate based authentication
provides you with stronger checks.
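
For reference, the default files libpq looks for (per the libpq documentation) are:

```
~/.postgresql/postgresql.crt   -- client certificate
~/.postgresql/postgresql.key   -- client private key
```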

I believe that using libpq services would be a better approach if you
want to connect to several PostgreSQL clusters from the very same
source machine. That way you would specify whatever is specific to each
target cluster in a centralized configuration file, and just reference each
target cluster by its service name in the connection string. It would
require that you move the SSL cert and key from `~/.postgresql` to somewhere
else and specify `sslcert` and `sslkey` in the relevant service in the
`~/.pg_service.conf` file.

More info about that can be found at:

https://www.postgresql.org/docs/current/libpq-pgservice.html
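
For illustration only, a hypothetical service definition (service name,
host, and paths are made up) could look like this:

```
# ~/.pg_service.conf
[cluster_a]
host=db-a.example.com
port=5432
dbname=appdb
sslcert=/secure/certs/cluster_a.crt
sslkey=/secure/certs/cluster_a.key
```

The client would then connect with something like `psql "service=cluster_a"`.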

Best regards,
Israel.



Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-25 Thread Israel Barth Rubio
Hello Jim/Jacob,

> > I do not think it is worth it to change the current behavior of
> > PostgreSQL in that sense.
>
> Well, I am not suggesting to change the current behavior of PostgreSQL in
> that matter. Quite the contrary, I find this feature very convenient,
> especially when you need to deal with many different clusters. What I am
> proposing is rather the possibility to disable it on demand :) I mean,
> in case I do not want libpq to try to authenticate using the certificates
> in `~/.postgresql`.
>
> > PostgreSQL looks for the cert and key under `~/.postgresql` as a
> > convenience. These files do not exist by default, so if PostgreSQL
> > finds something in there it assumes you want to use it.
>
> Yes. I'm just trying to find an elegant way to disable this assumption
> on demand.

Right, I do understand your proposal. I was just thinking out loud and
wondering how broad the audience for such a mode in the sslmode
argument would be.

Something else that came to my mind is that sslmode itself seems more
like an argument covering the client's expectations regarding the
connection to the server, I mean, whether it expects channel encryption
and/or validation of the server identity.

I wonder, if we are willing to add some functionality around the expectations
regarding the client certificate, whether it wouldn't make more sense for
that to be controlled through something like the clientcert option of
pg_hba? If so, the downside of that is the fact that the client would still
send the certificate even if it would not be used at all by the server.
Again, just thinking out loud about what your goal is and possible ways
of accomplishing it. :)

> > I do realize that this patch is a big ask, since probably nobody except
> > me "needs it" :D
>
> I'd imagine other people have run into it too; it's just a matter of
> how palatable the workarounds were to them. :)

I imagine more people might have already hit a similar situation too. While
the workaround can seem a bit weird, in my very humble opinion the
user/client is somehow still the one to blame in this case, as it is
providing the "wrong" file in a path that is checked by libpq. With that
in mind I would be inclined to say it is an acceptable workaround.

> > I think the sslcertmode=disable option that I introduced in [1]
> > solves this issue too;
>
> Well, I see there is indeed a significant overlap between our patches -
> but yours has a much more comprehensive approach! If I got it right,
> the new sslcertmode=disable would indeed cancel the existing certs in
> '~/.postgresql/' in case they exist. Right?
>
> +if (conn->sslcertmode[0] == 'd') /* disable */
> +{
> +/* don't send a client cert even if we have one */
> +have_cert = false;
> +}
> +else if (fnbuf[0] == '\0')
>
> My idea was rather to use the existing sslmode with a new option
> "no-clientcert" that does actually the same:
>
>  /* sslmode no-clientcert */
>  if (conn->sslmode[0] == 'n')
>  {
>  fnbuf[0] = '\0';
>  }
>
>  ...
>
>  if (fnbuf[0] == '\0')
>  {
>  /* no home directory, proceed without a client cert */
>  have_cert = false;
>  }
>
> I wish I had found your patchset some months ago. Now I hate myself
> for the duplication of efforts :D

Although both patches achieve a similar goal regarding not sending the
client certificate, there is still a slight but in my opinion important
difference between them: sslmode=disable will also disable channel
encryption. That may or may not be acceptable depending on what the
connection between your client and the server looks like.

Kind regards,
Israel.


Re: Authentication fails for md5 connections if ~/.postgresql/postgresql.{crt and key} exist

2023-01-25 Thread Israel Barth Rubio
Hello Jacob,

> I'm not sure how helpful it is to assign "blame" here. I think the
> requested improvement is reasonable -- it should be possible to
> override the default for a particular connection, without having to
> pick a junk value that you hope doesn't match up with an actual file
> on the disk.

Right, I agree we can look for improvements. "blame" was likely
not the best word to express myself in that message.

> sslmode=disable isn't used in either of our proposals, though. Unless
> I'm missing what you mean?

Sorry about the noise, I misread the code snippet shared earlier
(sslmode x sslcertmode). I just took a closer read of the previously
mentioned patch about sslcertmode, and it seems a slightly more
elegant way of achieving something similar to what has
been proposed here.
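
For clarity, a hedged sketch of what that would look like on the client
side, with a made-up host name (sslcertmode being the option from Jacob's
patch):

```
psql "host=db.example.com sslmode=require sslcertmode=disable"
```

That keeps channel encryption while withholding the client certificate.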

Best regards,
Israel.

On Wed, Jan 25, 2023 at 2:09 PM Jacob Champion
<jchamp...@timescale.com> wrote:

> On Wed, Jan 25, 2023 at 7:47 AM Israel Barth Rubio
>  wrote:
> > I imagine more people might have already hit a similar situation too.
> > While the workaround can seem a bit weird, in my very humble opinion
> > the user/client is somehow still the one to blame in this case, as it
> > is providing the "wrong" file in a path that is checked by libpq. With
> > that in mind I would be inclined to say it is an acceptable workaround.
>
> I'm not sure how helpful it is to assign "blame" here. I think the
> requested improvement is reasonable -- it should be possible to
> override the default for a particular connection, without having to
> pick a junk value that you hope doesn't match up with an actual file
> on the disk.
>
> > Although both patches achieve a similar goal regarding not sending the
> > client certificate there is still a slight but in my opinion important
> difference
> > between them: sslmode=disable will also disable channel encryption. It
> > may or may not be acceptable depending on how the connection is between
> > your client and the server.
>
> sslmode=disable isn't used in either of our proposals, though. Unless
> I'm missing what you mean?
>
> --Jacob
>


Re: Add support for DEFAULT specification in COPY FROM

2022-09-26 Thread Israel Barth Rubio
Hello Andrew,

> . There needs to be a check that this is being used with COPY FROM, and
> the restriction needs to be stated in the docs and tested for. c.f.
> FORCE NULL.
>
> . There needs to be support for this in psql's tab_complete.c, and
> appropriate tests added
>
> . There needs to be support for it in contrib/file_fdw/file_fdw.c, and a
> test added
>
> . The tests should include psql's \copy as well as sql COPY
>
> . I'm not sure we need a separate regression test file for this.
> Probably these tests can go at the end of src/test/regress/sql/copy2.sql.

Thanks for your review! I have applied the suggested changes, and I'm
submitting the new patch version.

Kind regards,
Israel.


v3-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Zhihong,

> +   /* attribute is NOT to be copied from input */
>
> I think saying `is NOT copied from input` should suffice.
>
> +   /* fieldno is 0-index and attnum is 1-index */
>
> 0-index -> 0-indexed

I have applied both suggestions, thanks! I'll submit a 4th version
of the patch soon.

> +   defaults = (bool *) palloc0(num_phys_attrs * sizeof(bool));
> +   MemSet(defaults, false, num_phys_attrs * sizeof(bool));
>
> Is the MemSet() call necessary ?

I would say it is, as it initializes the array with all flags set to false.
Later, if it detects attributes that should evaluate their default
expression, it sets the flag to true.

Am I missing something?

Regards,
Israel.



Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Andres,

> cfbot shows that tests started failing with this version:
>
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/39/3822
> A full backtrace is at
https://api.cirrus-ci.com/v1/task/5354378189078528/logs/cores.log

Thanks for pointing this out. I had initially missed this as my local
runs of *make check* were working fine, sorry!

I'm attaching a new version of the patch, containing the memory
context switches.

Regards,
Israel.


v4-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-10-07 Thread Israel Barth Rubio
Hello Zhihong,

> For the last question, please take a look at:
>
> #define MemSetAligned(start, val, len) \
>
> which is called by palloc0().

Oh, I totally missed that. Thanks for the heads up!

I'm attaching the new patch version, which contains both the fix
to the problem reported by Andres, and removes this useless
MemSet call.

Best regards,
Israel.


v5-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Apparent bug in WAL summarizer process (hung state)

2024-06-24 Thread Israel Barth Rubio
Hello,

Hope you are doing well.

I've been playing a bit with the incremental backup feature which might
come as part of the 17 release, and I think I hit a possible bug in the
WAL summarizer process.

The issue that I face refers to the summarizer process getting into a
hung state. When the issue is triggered, it keeps in an infinite loop
trying to process a WAL file that no longer exists.  It apparently comes
up only when I perform changes to the `summarize_wal` GUC and reload
Postgres, while there is some load in Postgres which makes it recycle
WAL files.

I'm running Postgres 17 in a Rockylinux 9 VM. In order to have fewer WAL
files available in `pg_wal` and make it easier to reproduce the issue,
I'm using a low value for `max_wal_size` ('100MB'). You can find below
the steps that I took to reproduce this problem, assuming this small
`max_wal_size`, and `summarize_wal` initially enabled:

```bash
# Assume we initially have max_wal_size = '100MB' and summarize_wal = on

# Create a table of ~ 100MB
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Take a full backup
pg_basebackup -X none -c fast -P -D full_backup_1

# Recreate a table of ~ 100MB
psql -c "DROP TABLE test"
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Take an incremental backup
pg_basebackup -X none -c fast -P -D incremental_backup_1 --incremental
full_backup_1/backup_manifest

# Disable summarize_wal
psql -c "ALTER SYSTEM SET summarize_wal TO off"
psql -c "SELECT pg_reload_conf()"

# Recreate a table of ~ 100MB
psql -c "DROP TABLE test"
psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"

# Re-enable summarize_wal
psql -c "ALTER SYSTEM SET summarize_wal TO on"
psql -c "SELECT pg_reload_conf()"

# Take a full backup
pg_basebackup -X none -c fast -P -D full_backup_2

# Take an incremental backup
pg_basebackup -X none -c fast -P -D incremental_backup_2 --incremental
full_backup_2/backup_manifest
```

I'm able to reproduce the issue most of the time when running these steps
manually. It's harder to reproduce if I attempt to run those commands as a
bash script.

This is the sample output of a run of those commands:

```console
(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"
SELECT 300
(barman) [postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D full_backup_1
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
331785/331785 kB (100%), 1/1 tablespace
(barman) [postgres@barmandevhost ~]$ psql -c "DROP TABLE test"
DROP TABLE
(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"
SELECT 300
(barman) [postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D incremental_backup_1 --incremental full_backup_1/backup_manifest
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
111263/331720 kB (33%), 1/1 tablespace
(barman) [postgres@barmandevhost ~]$ psql -c "ALTER SYSTEM SET summarize_wal TO off"
ALTER SYSTEM
(barman) [postgres@barmandevhost ~]$ psql -c "SELECT pg_reload_conf()"
 pg_reload_conf
----------------
 t
(1 row)

(barman) [postgres@barmandevhost ~]$ psql -c "DROP TABLE test"
DROP TABLE
(barman) [postgres@barmandevhost ~]$ psql -c "CREATE TABLE test AS SELECT generate_series(1, 300)"
SELECT 300
(barman) [postgres@barmandevhost ~]$ psql -c "ALTER SYSTEM SET summarize_wal TO on"
ALTER SYSTEM
(barman) [postgres@barmandevhost ~]$ psql -c "SELECT pg_reload_conf()"
 pg_reload_conf
----------------
 t
(1 row)

(barman) [postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D full_backup_2
NOTICE:  WAL archiving is not enabled; you must ensure that all required WAL segments are copied through other means to complete the backup
331734/331734 kB (100%), 1/1 tablespace
(barman) [postgres@barmandevhost ~]$ pg_basebackup -X none -c fast -P -D incremental_backup_2 --incremental full_backup_2/backup_manifest
WARNING:  still waiting for WAL summarization through 2/C128 after 10 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and 2/B3D8 in memory.
WARNING:  still waiting for WAL summarization through 2/C128 after 20 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and 2/B3D8 in memory.
WARNING:  still waiting for WAL summarization through 2/C128 after 30 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and 2/B3D8 in memory.
WARNING:  still waiting for WAL summarization through 2/C128 after 40 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and 2/B3D8 in memory.
WARNING:  still waiting for WAL summarization through 2/C128 after 50 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and 2/B3D8 in memory.
WARNING:  still waiting for WAL summarization through 2/C128 after 60 seconds
DETAIL:  Summarization has reached 2/B3D8 on disk and
```

Re: Apparent bug in WAL summarizer process (hung state)

2024-06-24 Thread Israel Barth Rubio
I'm attaching the files which I missed in the original email.

19:34:17.437626 epoll_wait(5, [], 1, 8161) = 0 <8.171542>
19:34:25.610176 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.000334>
19:34:25.611012 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = -1 ENOENT (No such file or directory) <0.000323>
19:34:25.611657 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.000406>
19:34:25.612327 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:25.612"..., 132) = 132 <0.000203>
19:34:25.612873 epoll_wait(5, [], 1, 1) = 0 <10.010950>
19:34:35.623942 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.12>
19:34:35.624120 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = -1 ENOENT (No such file or directory) <0.18>
19:34:35.624191 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.07>
19:34:35.624237 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:35.624"..., 132) = 132 <0.52>
19:34:35.624341 epoll_wait(5, [], 1, 1) = 0 <10.010941>
19:34:45.635408 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.16>
19:34:45.635602 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = -1 ENOENT (No such file or directory) <0.62>
19:34:45.635765 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.09>
19:34:45.635830 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:45.635"..., 132) = 132 <0.000495>
19:34:45.636390 epoll_wait(5, [], 1, 1) = 0 <10.008785>
19:34:55.645305 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.16>
19:34:55.645520 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = -1 ENOENT (No such file or directory) <0.32>
19:34:55.645645 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.27>
19:34:55.646214 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:34:55.645"..., 132) = 132 <0.000136>
19:34:55.646445 epoll_wait(5, [], 1, 1) = 0 <10.011731>
19:35:05.658366 rt_sigprocmask(SIG_SETMASK, [URG], NULL, 8) = 0 <0.18>
19:35:05.658591 openat(AT_FDCWD, "pg_wal/0001000200B3", O_RDONLY) = -1 ENOENT (No such file or directory) <0.28>
19:35:05.658689 rt_sigprocmask(SIG_SETMASK, ~[QUIT ILL TRAP ABRT BUS FPE KILL SEGV CONT STOP SYS RTMIN RT_1], NULL, 8) = 0 <0.10>
19:35:05.658750 write(2, "\0\0{\0\357\23\1\0\0212024-06-24 17:35:05.658"..., 132) = 132 <0.000521>
19:35:05.659335 epoll_wait(5,
#0  0x7fb51c50e1da in epoll_wait (epfd=5, events=0x1409fd8, maxevents=1, timeout=timeout@entry=1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
#1  0x00922bad in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffe5492aba0, cur_timeout=<optimized out>, set=0x1409f70) at storage/ipc/latch.c:1570
#2  WaitEventSetWait (set=0x1409f70, timeout=1, occurred_events=0x7ffe5492aba0, nevents=1, wait_event_info=<optimized out>) at storage/ipc/latch.c:1516
#3  0x00920e65 in WaitLatch (latch=<optimized out>, wakeEvents=<optimized out>, timeout=<optimized out>, wait_event_info=150994953) at storage/ipc/latch.c:538
#4  0x008aa7e4 in WalSummarizerMain (startup_data=<optimized out>, startup_data_len=<optimized out>) at postmaster/walsummarizer.c:317
#5  0x008a46a0 in postmaster_child_launch (client_sock=0x0, startup_data_len=0, startup_data=0x0, child_type=B_WAL_SUMMARIZER) at postmaster/launch_backend.c:265
#6  postmaster_child_launch (client_sock=0x0, startup_data_len=0, startup_data=0x0, child_type=B_WAL_SUMMARIZER) at postmaster/launch_backend.c:226
#7  StartChildProcess (type=type@entry=B_WAL_SUMMARIZER) at postmaster/postmaster.c:3928
#8  0x008a4dcf in MaybeStartWalSummarizer () at postmaster/postmaster.c:4075
#9  MaybeStartWalSummarizer () at postmaster/postmaster.c:4070
#10 0x008a7f8f in ServerLoop () at postmaster/postmaster.c:1746
#11 0x0089f5e0 in PostmasterMain (argc=3, argv=0x14097b0) at postmaster/postmaster.c:1372
#12 0x005381aa in main (argc=3, argv=0x14097b0) at main/main.c:197
No symbol "full" in current context.
#0  0x7fb51c50e1da in epoll_wait (epfd=5, events=0x1409fd8, maxevents=1, timeout=timeout@entry=1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
        sc_ret = -4
        sc_ret = <optimized out>
#1  0x00922bad in WaitEventSetWaitBlock (nevents=1, occurred_events=0x7ffe5492aba0, cur_timeout=<optimized out>, set=0x1409f70) at storage/ipc/latch.c:1570
        returned_events = 0
        rc = <optimized out>
        cur_event = <optimized out>
        cur_epoll_event = <optimized out>
        returned_events = <optimized out>
        rc = <optimized out>
        cur_event = <optimized out>
        cur_epoll_event = <optimized out>
        __func__ = { }
        __errno_location = <optimized out>
#2  WaitEventSetWait (set=0x1409f70, timeout=1, occurred_events=0x7ffe5492aba0, nevents=1, wait_event_info=<optimized out>) at storage/ipc/latch.c:1516
        rc = <optimized out>
        returned_events = 0
        start_time = {ticks = 49310570173263}
        cur_time = {ticks = <optimized out>}
        cur_timeout = <optimized out>
#3  0x00920

Re: Apparent bug in WAL summarizer process (hung state)

2024-06-27 Thread Israel Barth Rubio
> Yeah, this is a bug. It seems that the WAL summarizer process, when
> restarted, wants to restart from wherever it was previously
> summarizing WAL, which is correct if that WAL is still around, but if
> summarize_wal has been turned off in the meanwhile, it might not be
> correct. Here's a patch to fix that.

Thanks for checking this!

> Thanks. New version attached.

And besides that, thanks for the patch, of course!

I compiled Postgres locally with your patch. I attempted to break it several
times, both manually and through a shell script.

No success on that -- which in this case is actually success :)
The WAL summarizer seems to always be able to resume from a valid point,
so `pg_basebackup` isn't failing anymore.


Add support for DEFAULT specification in COPY FROM

2022-08-16 Thread Israel Barth Rubio
Hello all,

With the current implementation of COPY FROM in PostgreSQL we are able to
load the DEFAULT value/expression of a column if the column is absent in
the list of specified columns. We are not able to explicitly ask that
PostgreSQL use the DEFAULT value/expression for a column that is being
fetched from the input file, though.

This patch adds support for handling DEFAULT values in COPY FROM. It works
similarly to NULL in COPY FROM: whenever the marker that was set for the
DEFAULT value/expression is read from the input stream, it will evaluate
the DEFAULT value/expression of the corresponding column.
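
As a small illustration of the intended behavior (using the \D marker that
the regression tests suggest; table and data are made up):

```sql
CREATE TABLE items (id integer, qty integer DEFAULT 1);
COPY items FROM stdin WITH (FORMAT csv, DEFAULT '\D');
-- an input row "1,5" loads qty = 5, while "2,\D" evaluates
-- the DEFAULT expression of qty and stores 1
```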

I'm currently working as a support engineer, and both I and some customers
have already faced a situation where we missed an implementation like this
in COPY FROM, and had to work around that by using an input file where the
column which has a DEFAULT value/expression was removed.

That does not solve all issues though, as it might be the case that we just
want a DEFAULT value to take place if no other value was set for the column
in the input file, meaning we would like to have a column in the input file
that sometimes assumes the DEFAULT value/expression, and sometimes assumes
an actual given value.

The implementation was performed about one month ago and included all
regression tests covering the changes that were introduced. It was just
rebased on top of the master branch before submitting this patch, and all
tests are still succeeding.

The implementation takes advantage of the logic that was already in place
to handle DEFAULT values for missing columns in COPY FROM. I just modified
it to make the DEFAULT values/expressions available for all columns instead
of only for the ones that were missing in the specification. I had to
change the variables accordingly, so it would index the correct positions
in the new array of DEFAULT values/expressions.

Besides that, I also copied and pasted most of the checks that are
performed for the NULL feature of COPY FROM, as DEFAULT behaves somewhat
similarly.

Best regards,
Israel.


v1-0001-Added-support-for-DEFAULT-in-COPY-FROM.patch
Description: Binary data


Re: Add support for DEFAULT specification in COPY FROM

2022-08-17 Thread Israel Barth Rubio
Hello Andrew,

Thanks for reviewing this patch.

It is worth noting that DEFAULT will only take place if explicitly
specified, meaning there is no default value for the DEFAULT option. The
usage of \D in the tests was only a suggestion. Also, the NULL marker is
an unquoted empty string by default in CSV mode.

In any case I have manually tested it, and the behavior is consistent with
what we see for NULL when it is defined as \N, in both text and CSV modes.

- NULL as \N:

postgres=# CREATE TEMP TABLE copy_null (id integer primary key, value text);
CREATE TABLE
postgres=# copy copy_null from stdin with (format text, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \N
>> 2 \\N
>> 3 "\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
 id | value
----+-------
  1 |
  2 | \N
  3 | "N"
(3 rows)

postgres=# TRUNCATE copy_null ;
TRUNCATE TABLE
postgres=# copy copy_null from stdin with (format csv, NULL '\N');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\N
>> 2,\\N
>> 3,"\N"
>> \.
COPY 3
postgres=# TABLE copy_null ;
 id | value
----+-------
  1 |
  2 | \\N
  3 | \N
(3 rows)

- DEFAULT as \D:

postgres=# CREATE TEMP TABLE copy_default (id integer primary key, value
text default 'test');
CREATE TABLE
postgres=# copy copy_default from stdin with (format text, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1 \D
>> 2 \\D
>> 3 "\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
 id | value
----+-------
  1 | test
  2 | \D
  3 | "D"
(3 rows)

postgres=# TRUNCATE copy_default ;
TRUNCATE TABLE
postgres=# copy copy_default from stdin with (format csv, DEFAULT '\D');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy_default ;
 id | value
----+-------
  1 | test
  2 | \\D
  3 | \D
(3 rows)

If you do not specify DEFAULT in COPY FROM, there is no default value for
that option. So, if you try to load \D in CSV mode, it will load the
literal value:

postgres=# CREATE TEMP TABLE copy (id integer primary key, value text
default 'test');
CREATE TABLE
postgres=# copy copy from stdin with (format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,\D
>> 2,\\D
>> 3,"\D"
>> \.
COPY 3
postgres=# TABLE copy ;
 id | value
----+-------
  1 | \D
  2 | \\D
  3 | \D
(3 rows)


Does that address your concerns?

I am attaching the new patch, containing the above test in the regress
suite.

Best regards,
Israel.

On Tue, Aug 16, 2022 at 5:27 PM Andrew Dunstan wrote:

>
> On 2022-08-16 Tu 14:12, Israel Barth Rubio wrote:
> > Hello all,
> >
> > With the current implementation of COPY FROM in PostgreSQL we are able to
> > load the DEFAULT value/expression of a column if the column is absent
> > in the
> > list of specified columns. We are not able to explicitly ask that
> > PostgreSQL uses
> > the DEFAULT value/expression in a column that is being fetched from
> > the input
> > file, though.
> >
> > This patch adds support for handling DEFAULT values in COPY FROM. It
> > works
> > similarly to NULL in COPY FROM: whenever the marker that was set for
> > DEFAULT
> > value/expression is read from the input stream, it will evaluate the
> > DEFAULT
> > value/expression of the corresponding column.
> >
> > I'm currently working as a support engineer, and both me and some
> > customers had
> > already faced a situation where we missed an implementation like this
> > in COPY
> > FROM, and had to work around that by using an input file where the
> > column which
> > has a DEFAULT value/expression was removed.
> >
> > That does not solve all issues though, as it might be the case that we
> > just want a
> > DEFAULT value to take place if no other value was set for the column
> > in the input
> > file, meaning we would like to have a column in the input file that
> > sometimes assume
> > the DEFAULT value/expression, and sometimes assume an actual given value.
> >
> > The implementation was performed about one month ago and included all
> > regression
> > tests regarding the changes that were introduced. It was just rebased
> > on top of the
> > master branch before submitting this pat

Re: Add support for DEFAULT specification in COPY FROM

2022-08-18 Thread Israel Barth Rubio
Hello,

Thanks for your review. I submitted the patch to the next commit fest
(https://commitfest.postgresql.org/39/3822/).

Regards,
Israel.

On Wed, Aug 17, 2022 at 6:56 PM Andrew Dunstan wrote:

>
> On 2022-08-17 We 17:12, Israel Barth Rubio wrote:
> >
> >
> > Does that address your concerns?
> >
> > I am attaching the new patch, containing the above test in the regress
> > suite.
>
>
> Thanks, yes, that all looks sane.
>
>
> Please add this to the next CommitFest if you haven't already done so.
>
>
> cheers
>
>
> andrew
>
>
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
>
>


Re: Add support for DEFAULT specification in COPY FROM

2022-08-18 Thread Israel Barth Rubio
Hello Ilmari,

Thanks for checking it, too. I can look into implementing these changes
to include a way of overriding the behavior for the given columns.

Regards,
Israel.

On Thu, Aug 18, 2022 at 6:56 AM Dagfinn Ilmari Mannsåker
<ilm...@ilmari.org> wrote:

> Andrew Dunstan  writes:
>
> > On 2022-08-16 Tu 14:12, Israel Barth Rubio wrote:
> >> Hello all,
> >>
> >> With the current implementation of COPY FROM in PostgreSQL we are
> >> able to load the DEFAULT value/expression of a column if the column
> >> is absent in the list of specified columns. We are not able to
> >> explicitly ask that PostgreSQL uses the DEFAULT value/expression in a
> >> column that is being fetched from the input file, though.
> >>
> >> This patch adds support for handling DEFAULT values in COPY FROM. It
> >> works similarly to NULL in COPY FROM: whenever the marker that was
> >> set for DEFAULT value/expression is read from the input stream, it
> >> will evaluate the DEFAULT value/expression of the corresponding
> >> column.
> […]
> > Interesting, and probably useful. I've only had a brief look, but it's
> > important that the default marker not be quoted in CSV mode (cf. NULL)
> > - if it is it should be taken as a literal rather than a special value.
>
> For the NULL marker that can be overridden for individual columns with
> the FORCE(_NOT)_NULL option. This feature should have a similar
> FORCE(_NOT)_DEFAULT option to allow the DEFAULT marker to be ignored, or
> recognised even when quoted, respectively.
>
> - ilmari
>


Re: Add -k/--link option to pg_combinebackup

2025-02-07 Thread Israel Barth Rubio
Hello Robert,

> In general, +1, although I think that the patch needs polishing in a
> bunch of areas.

Thanks for your review!

> Originally, I thought what we wanted was something like a --in-place
> option to pg_combinebackup, allowing the output directory to be the
> same as one of the input directories. However, I now think that what
> this patch does is likely better, and this patch is a lot simpler to
> write than that one would be. With the --in-place option, if I'm
> combine backup1 and backup2, I must write either "pg_combinebackup
> backup1 backup2 -o backup1" or "pg_combinebackup backup1 backup2 -o
> backup2". In either case, I can skip whole-file copies from one of the
> two source directories -- whichever one happens to be the output
> directory. However, if I write "pg_combinebackup --link backup1
> backup2 -o result", I can skip *all* whole-file copies from *every*
> input directory, which seems a lot nicer.
>
> Furthermore, if all the input directories are staging directories,
> basically copies of original backups stored elsewhere, then the fact
> that those directories get trashed is of no concern to me. In fact,
> they don't even get trashed if pg_combinebackup is interrupted partway
> through, because I can just remove the output directory and recreate
> it and everything is fine. With an --in-place option, that would be
> trickier.

I see! I was trying not to reinvent the wheel, so I replicated in
pg_combinebackup the same behavior implemented in pg_upgrade.
And it turns out the outcome sounded convincing.

> Regarding the patch itself, I think we need to rethink the test case
> changes in particular. As written, the patch won't do any testing of
> link mode unless the user sets PG_TEST_PG_COMBINEBACKUP_MODE=--link;
> and if they do set that option, then we won't test anything else.
> Also, even with that environment variable set, we'll only test --link
> mode a bit ... accidentally. The patch doesn't really do anything to
> make sure that link mode actually does what it's intended to do. It
> only adapts the existing tests not to fail. I think it would be better
> to decide that you're not supposed to set
> PG_TEST_PG_COMBINEBACKUP_MODE=--link, and then write some new test,
> not depending on PG_TEST_PG_COMBINEBACKUP_MODE, that specifically
> verifies that link mode behaves as expected.
>
> After all, link mode is noticeably different from the other copy
> modes. Those should all produce equivalent results, or fail outright,
> we suppose. This produces a different user-visible result, so we
> probably ought to test that we get that result. For example, we might
> check that certain files end up with a link count of 1 or 2, as
> appropriate.

That makes sense to me too. I'm submitting a new patch which includes
a new file, 010_links.pl. That test takes care of generating full and
incremental backups, then combines them with --link mode and ensures that
the hard link count for all files under PGDATA is as expected based on the
operations that were applied in the cluster.

I kept the "support" for PG_TEST_PG_COMBINEBACKUP_MODE=--link
around, just in case someone wants to run that and verify those outcomes
with the --link mode. At least now we have a proper test for checking
--link, which always runs independently of the env variable.
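
As an illustration of the property being checked, with hypothetical paths
(on GNU coreutils, `stat -c %h` prints the hard link count of a file):

```
$ stat -c %h output/base/5/16388   # reused via hard link from an input backup
2
$ stat -c %h output/base/5/16394   # reconstructed, newly written file
1
```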

> Does link() work on Windows?

link() is a POSIX function that is not natively available on Windows. The
equivalent function on Windows is CreateHardLink.

However, in Postgres link() works on either operating system because we
implement a link() function which wraps the call to CreateHardLink. That's
coded in the file src/port/win32link.c

> I'm not sure what to do about the issue that using --link trashes the
> input cluster if you subsequently start the database or otherwise
> modify any hard-linked files. Keep in mind that, for command-line
> arguments other than the first, these are incremental backups, and you
> already can't run postgres on those directories. Only the first input
> directory, which is a full backup not an incremental, is a potential
> target for an unexpected database start. I'm tentatively inclined to
> think we shouldn't modify the input directories and just emit a
> warning like:
>
> warning: --link mode was used; any modifications to the output
> directory may destructively modify input directories

The previous patch already had a warning. But I liked your suggested
wording, so I slightly changed the warning and the docs in the new version
of the patch.

Best regards,
Israel.


v2-0001-pg_combinebackup-add-support-for-hard-links.patch
Description: Binary data


Add -k/--link option to pg_combinebackup

2025-01-15 Thread Israel Barth Rubio
Hello all,

With the current implementation of pg_combinebackup, we have a few
copy methods: --clone, --copy and --copy-file-range. Using any of
them implies an actual file copy in the file system, i.e., among
other things, disk usage.

While discussing with some people, e.g. Robert Haas and Martín Marqués,
possible ways to improve pg_combinebackup performance and/or
reduce the disk usage taken by the synthetic backup, there was a thought
of using hard links.

I'm submitting a patch in this thread which introduces the -k/--link
option to pg_combinebackup, making it similar to the options exposed
by pg_upgrade. The patch reuses the same code flow that already exists
in pg_combinebackup, and whenever the tool judges a file should be
copied from either of the input backups to the synthetic backup, if the
link mode was chosen, a hard link is created instead of performing a
copy.

Depending on the pattern of modification of PGDATA files between
backups (based on the workload), the user might see improved
performance of pg_combinebackup as well as lower disk usage by the
synthetic backup.

That enhancement comes at a cost: the input backups may be invalidated
if the user modifies the shared files through the synthetic backup, or if
the user starts the cluster from the synthetic backup. With that in
mind, a warning has been added to pg_combinebackup, as well as to the
docs, recommending that the user move the synthetic backup to another
file system or machine before making use of it.

I've run the existing test suite with --link option and it worked fine,
except for an issue when running pg_verifybackup in t/003_timeline.pl.
The issue was expected given the description shared above. I modified
the tests to behave slightly differently depending on the copy method
selected by the runner.

Besides that, you can find below the outcomes of manual testing:

* Create some tables:

postgres=# CREATE TABLE test_1 AS SELECT generate_series(1, 100);
SELECT 100
postgres=# CREATE TABLE test_2 AS SELECT generate_series(1, 100);
SELECT 100
postgres=# CREATE TABLE test_3 AS SELECT generate_series(1, 100);
SELECT 100

* Check their OID:

postgres=# SELECT oid, relname FROM pg_class WHERE relname LIKE 'test_%';
  oid  | relname
-------+---------
 16388 | test_1
 16391 | test_2
 16394 | test_3
(3 rows)

* Enable the WAL summarizer:

postgres=# ALTER SYSTEM SET summarize_wal TO on;
ALTER SYSTEM
postgres=# SELECT pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

* Take a full backup:

$ pg_basebackup -d 'dbname=postgres' -D ~/monday -c fast -Xn
NOTICE:  WAL archiving is not enabled; you must ensure that all required
WAL segments are copied through other means to complete the backup

* Perform an update that will touch the whole file:

postgres=# UPDATE test_2 SET generate_series = generate_series + 1;
UPDATE 100

* Perform an update that will touch a single page:

postgres=# UPDATE test_3 SET generate_series = generate_series + 1 WHERE
generate_series = 1;
UPDATE 1

* Take an incremental backup:

$ pg_basebackup -d 'dbname=postgres' -D ~/tuesday -c fast -Xn --incremental
~/monday/backup_manifest
NOTICE:  WAL archiving is not enabled; you must ensure that all required
WAL segments are copied through other means to complete the backup

* Check a dry run of pg_combinebackup, focusing on the OIDs of the
  tables:

$ pg_combinebackup -d -n --link -o ~/tuesday_full ~/monday ~/tuesday/ 2>&1
| grep -P '16388|16391|16394'
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16388" to
"/home/vagrant/tuesday_full/base/5/16388" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391" to
"/home/vagrant/tuesday_full/base/5/16391" using strategy link
pg_combinebackup: would reconstruct
"/home/vagrant/tuesday_full/base/5/16394" (4480 blocks, checksum CRC32C)
pg_combinebackup: reconstruction plan:
0:/home/vagrant/tuesday//base/5/INCREMENTAL.16394@8192
1-4423:/home/vagrant/monday/base/5/16394@36233216
4424:/home/vagrant/tuesday//base/5/INCREMENTAL.16394@16384
4425-4479:/home/vagrant/monday/base/5/16394@36691968
pg_combinebackup: would have read 4478 blocks from
"/home/vagrant/monday/base/5/16394"
pg_combinebackup: would have read 2 blocks from
"/home/vagrant/tuesday//base/5/INCREMENTAL.16394"
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16388_vm" to
"/home/vagrant/tuesday_full/base/5/16388_vm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16388_fsm" to
"/home/vagrant/tuesday_full/base/5/16388_fsm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391_vm" to
"/home/vagrant/tuesday_full/base/5/16391_vm" using strategy link
pg_combinebackup: would copy "/home/vagrant/tuesday//base/5/16391_fsm" to
"/home/vagrant/tuesday_full/base/5/16391_fsm" using strategy link
pg_combinebackup: would copy "/home/vagrant/monday/base/5/16394_vm" to
"/home/vagrant/tuesday_full/base/5/16394_vm" using strategy

Re: Add -k/--link option to pg_combinebackup

2025-01-15 Thread Israel Barth Rubio
One concern that I have about this --link mode, which Euler Taveira
also raised: the fact that it can invalidate the original
backups if the user modifies or starts the synthetic backup without
moving it to another file system or machine.

At the moment, pg_combinebackup issues a warning about that problem as
soon as the execution of the tool is about to finish. There is also a
warning in the docs. But some users don't pay proper attention to the
output and/or docs.

I wonder if we have some way, like pg_upgrade does (renaming a global
file in the original cluster), to increase the protection
against that issue?


Re: Add -k/--link option to pg_combinebackup

2025-03-03 Thread Israel Barth Rubio
Thanks for the new round of reviews!

> 1) The new file 010_links.pl added should be included to meson.build also

Added 010_links.pl to src/bin/pg_combinebackup/meson.build.

> 2) Since it is a new file, "Copyright (c) 2021-2025" should be
> "Copyright (c) 2025"

Done!

> 3) Since it is a single statement, no need of braces here

I missed removing the braces along with the warning hint in
the previous version. Adjusted.

> Also give it a different number than any existing file -- if we
> already have an 010 in that directory then make this one something
> else. 012, 050, or whatever makes most sense.

Oh, t/010_links.pl was not conflicting with anything.
The snippet that Vignesh quoted in his reply was from the
meson.build file of pg_basebackup instead of from
pg_combinebackup.

Attaching the new version of the patch in this reply.

I had to slightly change 010_links.pl and copy_file.c to
properly handle Windows, as the meson tests on
Windows had complaints with the code of v3:

* copy_file.c was forcing CopyFile on Windows regardless
  of the option chosen by the user. Now, to make that behavior
  backward compatible, I'm only forcing CopyFile on Windows
  when the copy method is not --link. That allows my patch to
  work, and doesn't change the previous behavior.
* Had to normalize paths when performing string matching in
  the test that verifies the hard link count on files.

Best regards,
Israel.


v4-0001-pg_combinebackup-add-support-for-hard-links.patch
Description: Binary data


Re: Add -k/--link option to pg_combinebackup

2025-03-04 Thread Israel Barth Rubio
> With regard to the second question, why does this test need to create
> three tables with a million rows each instead of some small number of
> rows? If you need to fill multiple blocks, consider making the rows
> wider instead of inserting such a large number.

Makes sense! Changed that to use tables with wider rows, but fewer rows.

> With regard to the first question, I'm going to say that the answer is
> "no" because when I run this test locally, I get this:
>
> Use of uninitialized value $file in stat at
> /Users/robert.haas/pgsql/src/bin/pg_combinebackup/t/010_links.pl line
> 226.
>
> I'm not sure whether that "Use of uninitialized value" message at the
> end is killing the script at that point or whether it's just a
> warning, but it should be cleaned up either way.

That's unexpected. It seems somehow the function was called with a
"null" value as the argument.

> For the rest, I'm not
> a fan of testing every single file in the data directory like this. It
> seems sufficient to me to test two files, one that you expect to
> definitely be modified and one that you expect to definitely be not
> modified. That assumes that you can guarantee that the file definitely
> wasn't modified, which I'm not quite sure about. The test doesn't seem
> to disable autovacuum, so what would keep autovacuum from sometimes
> processing that table after the full backup and before the incremental
> backup, and sometimes not? But there's something else wrong here, too,
> because even when I adjusted the test to disable autovacuum, it still
> failed in the same way as shown above.

Right, maybe the previous test was trying to do much more than it
should.

I've refactored the test to focus only on the user tables that we create
and use in the test.

I've also added `autovacuum = off` to the configuration, just in case.

> Also, project style is uncuddled braces and lines less than 80 where
> possible. So don't do this:
>
> for my $test_3_segment (@test_3_segments) {
>
> The curly brace belongs on the next line. Similarly this should be
> three separate lines:
>
> } else {
>
> Regarding long lines, some of these cases are easy to fix:
>
> my $query = "SELECT pg_relation_filepath(oid) FROM pg_class WHERE
> relname = '%s';";
>
> This could be fixed by writing my $query = <
> my $pg_attribute_path = $primary->safe_psql('postgres',
> sprintf($query, 'pg_attribute'));
>
> This could be fixed by moving the sprintf() down a line and indenting
> it under 'postgres'.

Oh, I thought the styling guide from the Postgres wiki applied only to
the .c/.h files of the source code. Not sure why I came to that
conclusion, but I have applied the styling guide to the Perl files in the
new version of the patch.

> Attached is a small delta patch to fix a few other issues. It does the
> following:
>
> * adds a serial comma to pg_combinebackup.sgml. This isn't absolutely
> mandatory but we usually prefer this style.
>
> * puts the new -k option in proper alphabetical order in several places
>
> * changes the test in copy_file() to test for == COPY_METHOD_COPY
> instead of != COPY_METHOD_COPYFILE and updates the comment to reflect
> what I believe the actual reasoning to be

Thanks for the patch with the suggestions.
About the last one, related to the copy method, my first thought was to do
something like you did (in the sense of using == instead of !=). However, I
was concerned that I would change the previous behavior. But I do prefer
how it stands in your suggestion, thanks!

I'm attaching the v5 of the patch now.

Best regards,
Israel.


v5-0001-pg_combinebackup-add-support-for-hard-links.patch
Description: Binary data


Re: Add -k/--link option to pg_combinebackup

2025-03-06 Thread Israel Barth Rubio
> I'm happy with this now, so barring objections or complaints from CI,
> I plan to commit it soon.

Great, thank you!


Re: Add -k/--link option to pg_combinebackup

2025-03-05 Thread Israel Barth Rubio
> I don't think it does, because I think those options are anyway
> blocked on Windows. See the comment block that says /* Check that the
> platform supports the requested copy method. */

Yeah, you are right. I could swear I had run `pg_combinebackup --clone` on
a Windows VM and, instead of erroring out, it ran with CopyFile.
But that's not the case, it errors out (as expected).

> So, this still fails for me, in a manner quite similar to before. The
> first call to check_data_file() is for
>
/Users/robert.haas/pgsql-meson/testrun/pg_combinebackup/010_links/data/t_010_links_restore_data/pgdata/base/5/16384
> but @data_file_segments ends up zero-length. Therefore, $last_segment
> is undef when we access it at the bottom of check_data_file(). I think
> the problem is here:
>
> my $basename = (split /\./, $full_path)[0];
>
> This code assumes that no path name component other than the filename
> at the end contains a period, but in my case that's not true, because
> my home directory on this particular computer is /Users/robert.haas.

Right, well spotted. The test code had bad assumptions. Those
assumptions should have been applied to the file name, not to the full
path, and thus it broke in your environment.

> But I'm wondering why you are even using readdir() here in the first
> place. You could just test -f "16384"; if it exists test -f "16384.1";
> if that exists test -f "16384.2" and keep going until you reach a file
> that does not exist. Not only would this avoid reading the entire
> directory contents for every call to this function, but it would also
> generate the list of files in sorted order, which would let you throw
> away natural_sort().

That makes total sense. In the v6 version I threw out the natural_sort
and started using the approach you described, which is much more
efficient and less error prone.

> Overall, I feel like this could just be significantly simpler. For
> instance, I don't quite see the need to have both test_1 and test_2.
> If we just have one table and all the segments but the last have 2
> hard links while the last have 1, isn't that enough? What test
> coverage do we get by adding the second relation? All of the segments
> of test_1 behave just like the non-final segments of test_2, so why
> test both? If you go down to 1 table, you can simplify a bunch of
> things here.

I think that isn't enough because it makes assumptions about the
environment -- in this case, it assumes the Linux autoconf environment
from Cirrus CI.

The test creates a couple of tables of ~264KB. If you have a very
small segment size, like we have in the autoconf job of Cirrus CI, then
that's enough to create more than a single file, in such a way that we
are able to verify segments with 2 or 1 hard links.

However, when running through other jobs, like Windows meson, it uses
the default segment size of 1GB, and in that case we would have a single
segment with 1 hard link if we had a single table.

With that in mind, and taking into consideration that local builds from
devs might use the default segment size, having a couple of tables to
test different purposes, i.e. test_1 for testing an "unmodified table"
scenario, and test_2 for testing a "modified table" scenario, looks more
appropriate to me.

For now I'm keeping the patch as it was in that sense, i.e. with a
check_data_file function that is called both for test_1 and test_2. I
added a comment about that in the test file. I can send a new patch
version if that still sounds like overkill (having two test tables).

> Why does the insert add 10 new rows instead of just 1? Doesn't that
> introduce a significant risk that with some choice of segment size
> those inserts will be spread across two segments rather than one,
> leading to the test failing?

That was a leftover from some manual attempts on my side before sending
the patch, sorry! I started with a simple INSERT, then tried an INSERT
with a SELECT on a range, and missed reverting it before sending the patch.
The v6 version of the patch contains the simple INSERT version, which
adds only one tuple to the test_2 table and is safer, as you pointed
out.

Best regards,
Israel.


v6-0001-pg_combinebackup-add-support-for-hard-links.patch
Description: Binary data


Re: Add -k/--link option to pg_combinebackup

2025-02-20 Thread Israel Barth Rubio
Thanks for the review, Robert!
I applied all of your suggestions.

> I'm still not a fan of the changes to 010_links.pl; let's drop those.

As discussed through a side chat, 010_links.pl is the new test file
which ensures the hard links are created as expected in the output
directory.

I've removed the changes that the v2 patch had in the other test files,
which were the ones you were suggesting to drop here.

> cfbot is fairly unhappy with your patch. See
> http://cfbot.cputube.org/israel-barth.html or
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/cf%2F5508 --
> I haven't looked into what is going wrong here, and there may well be
> more than one thing, but nothing can be committed until this is all
> green.

There are some failures on compiler warnings and Windows jobs that
seem unrelated to this patch.

Besides that, there were failures when executing `010_links.pl` in the
Linux autoconf jobs. As discussed in a side chat with Euler and Robert,
and pointed out by Robert, Cirrus CI uses a very small
`--with-segsize-blocks`. My tests in the V2 patch were assuming 1GB
segments, so they were failing in Cirrus CI.

The patch V3, which I'm attaching in this reply, implements some logic
to handle any segment size.

Best regards,
Israel.


v3-0001-pg_combinebackup-add-support-for-hard-links.patch
Description: Binary data


Re: pg_basebackup and pg_switch_wal()

2025-07-29 Thread Israel Barth Rubio
Hello Fabrice,

> But I do not understand this error with the barman wrapper:
>
> Backup completed (start time: 2025-07-26 21:45:07.238223, elapsed time:
19 seconds)
> Waiting for the WAL file 00030B3F000C from server 'x_service'
(max: 600 seconds)
> Processing xlog segments from streaming for x_service
> 00030B3F000A
> Processing xlog segments from streaming for x_service
> 00030B3F000B
> ERROR: The WAL file 00030B3F000C has not been received in 600
seconds

Are you by chance running PostgreSQL 17?

If you are using PostgreSQL 17, you are very likely encountering the issue
tracked here: https://github.com/EnterpriseDB/barman/issues/1041. In that
case, a fix has already been merged and will be included in an upcoming
release of Barman.

As a quick note for the future, the Barman GitHub issues page is the
correct place to report problems like this directly to the development
team. In any case, I understand that you raised it here given the context
of the original question about pg_basebackup.

Hope that helps you.

Best regards,
Israel.