Thank you for shedding light on this. On Thu, Aug 1, 2024 at 4:04 PM Mateusz Henicz <mateuszhen...@gmail.com> wrote:
> When you are performing PITR you need to configure a timestamp before your > last committed transaction. In your case you provided timestamp after your > last commit. > > I am trying to understand this concept and correct me if I am wrong If I do STEPS: 1. Noting down the time stamp just before my table drop, then 2. Drop a table, then 3. Taking the incremental backup after this table drop on Repo server, Then 4. If I do a PITR, with --target = timestamp noted in step1 and --set = the incr backup taken in step3 Then if perform a PITR restore by pgbackrest in step4 will it succeed or fail theoretically ? Thanks, Krishane When postgtes is restoring until a specified point, it restores a > transaction from WAL, and checking if next transaction is before or after > said timestamp. If it is before it will replay it and check next > transaction. Until next transaction is after configured timestamp. > If there is no transaction after your current timestamp in current WAL, > postgres will try to restore next WAL from archive. And since there is no > next WAL, and your timestamp is past latest committed transaction, it is > unable to continue, because it does not know if there should be any other > transaction replayed or not. > > Just perform some other actions after you note down the timestamp after > drop table. Create another one, insert some data, do whatever to have > another transaction in WALs. > > Cheers, > Mateusz > > czw., 1 sie 2024 o 12:23 KK CHN <kkchn...@gmail.com> napisał(a): > >> The logs are here. >> >> https://pastecode.io/s/s5dp8ur1 >> >> >> >> On Thu, Aug 1, 2024 at 3:30 PM Kashif Zeeshan <kashi.zees...@gmail.com> >> wrote: >> >>> Hi >>> >>> On Thu, Aug 1, 2024 at 2:54 PM KK CHN <kkchn...@gmail.com> wrote: >>> >>>> List, >>>> >>>> *Not working (start EPAS server always fails):* >>>> >>>> 1. Testing PTR using PgBackRest(2.52.1) on RHEL9 EPAS-16, and RHEL9 >>>> ( Repo Server) >>>> >>>> When I do a PTR >>>> >>>> 1. After doing a table drop and then >>>> 2. Noting down the time stamp and then >>>> 3. Taking an incremental backup in hope that If I do a restore from >>>> this incr Backup, that won't contain the dropped table. >>>> 4. Correct me if I am conceptually wrong here. >>>> 5. I am *never *successful in restoring the EPAS server in this >>>> scenario. >>>> >>>> >>>> *I know the following will work for me, w*hy not the above one if I >>>> really want that state of cluster also ? >>>> >>>> *This is Working. * >>>> 1. Create table >>>> 2. Noting down the timestamp >>>> 3. Taking incremental backup on RepoServer. >>>> 4. drop the created table . >>>> 5. Then stop the EPAS server and do a PTR, by the --set=step 3 incr >>>> backup and target= step 2 time stamp .. It finished the pgaback restore >>>> and promote command >>>> 6. I am able to start back the EPAS server and see the dropped table >>>> recovered there. >>>> >>>> But If I want a PTR as in the first section it fails.. Why ? >>>> >>>> Thank you, >>>> Krishane >>>> >>>> >>>> >>>> >>>> >>>> *What I have done and results obtained: * >>>> >>>> Created a table important_table4 in my EPAS and note down the time >>>> after creation of this table it is ( t1 : "01-AUG-24 >>>> 14:08:32.447796+05:30" ) >>>> >>>> Then I performed an Incremental backup (incr backup: >>>> 20240729-160137F_20240801-141148I ) >>>> timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01 >>>> 14:11:52+05:30 >>>> >>>> >>>> Now I dropped the table table4 from the EPAS and noted down the time >>>> >>>> >>>> I want to restore the table4,, so I stopped EPAS and executed >>>> >>>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta >>>> --set=20240729-160137F_20240801-141148I --target-timeline=current >>>> --type=time --target="01-AUG-24 14:08:32.447796+05:30" >>>> --target-action=promote restore >>>> >>>> IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the >>>> important_table4 back. >>>> >>>> root@service01 ~]# sudo -u enterprisedb psql edb >>>> psql (16.3.0) >>>> Type "help" for help. >>>> >>>> edb=# \dt >>>> List of relations >>>> Schema | Name | Type | Owner >>>> --------+------------------+-------+-------------- >>>> public | important_table | table | enterprisedb >>>> public | important_table2 | table | enterprisedb >>>> public | important_table3 | table | enterprisedb >>>> public | important_table4 | table | enterprisedb >>>> (4 rows) >>>> >>>> SO all works fine !!!! . >>>> >>>> >>>> *But Now the PROBLEM Statement. * >>>> >>>> *1. I am dropping the table table 4 again * >>>> edb=# \q >>>> [root@service01 ~]# sudo -u enterprisedb psql -c "begin; drop table >>>> important_table4; commit;" edb >>>> BEGIN >>>> DROP TABLE >>>> COMMIT >>>> *2 . [root@service01 ~]#* sudo -u enterprisedb psql -Atc "select >>>> current_timestamp" edb 01-AUG-24 14:23:22.085076 +05:30 >>>> Noting the time as : (01-AUG-24 14:23:22.085076 +05:30 ) >>>> >>>> 3. Now I am performing an incremental backup after step 2 on REPO >>>> SErver ( Hoping that this latest INCR Backup is without dropped >>>> important_table4, so that a recovery of the cluster shouldn't show the >>>> table4 again. ) >>>> >>>> incr backup details. : 20240729-160137F_20240801-142433I >>>> timestamp start/stop*: 2024-08-01 14:24:33+05:30 / >>>> 2024-08-01 14:24:36+05:30* >>>> >>>> 4. Now I want to test the database recovery after dropping the table4 >>>> in step1 to verify that my EPAS restores from the backup in step 3 and time >>>> stamp (01-AUG-24 14:23:22.085076 +05:30, so that the restored EPAS >>>> cluster doesn't contain the important_table4. >>>> >>>> 5. $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta >>>> --set=20240729-160137F_20240801-142433I --target-timeline=current >>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30" >>>> --target-action=promote restore >>>> ------------ >>>> ------------- >>>> INFO: restore command end: completed successfully (1035ms) >>>> >>>> *ISSUE: I am unable to get the EPAS Server* in running state after >>>> step 5 >>>> >>>> *What am I doing wrong ? OR am I conceptually wrong ?* >>>> >>>> >>>> >>>> >>>> OUTPUT on executing step 5. >>>> >>>> [root@service01 ~]# sudo -u enterprisedb pgbackrest --stanza=Demo_Repo >>>> --delta --set=20240729-160137F_20240801-142433I --target-timeline=current >>>> --type=time --target="01-AUG-24 14:23:22.085076+05:30" >>>> --target-action=promote restore >>>> >>>> 2024-08-01 14:30:03.535 P00 INFO: restore command begin 2.52.1: >>>> --delta --exec-id=82738-b5fe7415 --log-level-console=info >>>> --log-level-file=debug --pg1-path=/var/lib/edb/as16/data >>>> --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres >>>> --set=20240729-160137F_20240801-142433I --stanza=Demo_Repo >>>> --target="01-AUG-24 14:23:22.085076+05:30" --target-action=promote >>>> --target-timeline=current --type=time >>>> 2024-08-01 14:30:03.880 P00 INFO: repo1: restore backup set >>>> 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01 >>>> 14:24:33 >>>> 2024-08-01 14:30:03.881 P00 INFO: remove invalid files/links/paths >>>> from '/var/lib/edb/as16/data' >>>> 2024-08-01 14:30:04.567 P00 INFO: write updated >>>> /var/lib/edb/as16/data/postgresql.auto.conf >>>> 2024-08-01 14:30:04.569 P00 INFO: restore global/pg_control >>>> (performed last to ensure aborted restores cannot be started) >>>> 2024-08-01 14:30:04.569 P00 INFO: restore size = 75.9MB, file total = >>>> 2171 >>>> 2024-08-01 14:30:04.569 P00 INFO: restore command end: completed >>>> successfully (1035ms) >>>> *[root@service01 ~]# systemctl start edb-as-16.service* >>>> >>>> *Now If I check the server status : Its dead * >>>> >>>> [root@service01 ~]# systemctl status edb-as-16.service >>>> × edb-as-16.service - EDB Postgres Advanced Server 16 >>>> Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled; >>>> preset: disabled) >>>> *Active: failed* (Result: exit-code) since Thu 2024-08-01 >>>> 14:30:58 IST; 4s ago >>>> Duration: 228ms >>>> Process: 82752 >>>> ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir ${PGDATA} >>>> (code=exited, status=0/SUCCESS) >>>> Process: 82757 ExecStart=/usr/edb/as16/bin/edb-postgres -D >>>> ${PGDATA} (code=exited, status=1/FAILURE) >>>> Main PID: 82757 (code=exited, status=1/FAILURE) >>>> CPU: 325ms >>>> >>>> Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced >>>> Server 16... >>>> Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST >>>> LOG: redirecting log output to logging collector process >>>> Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST >>>> HINT: Future log output will appear in directory "log". >>>> Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced >>>> Server 16. >>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process >>>> exited, code=exited, status=1/FAILURE >>>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing >>>> process 82758 (edb-postgres) with signal SIGKILL. >>>> A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with >>>> result 'exit-code'.* >>>> [root@service01 ~]# >>>> >>>> Any hints/guidance most welcome. >>>> >>>> Can you please share the DB Server log as it contains the exact error >>> which is causing the server not to start. >>> >>> Thanks >>> >>