sorry ignore the previous paste , was in complete Here the full one. https://pastecode.io/s/hya0fyvo
On Thu, Aug 1, 2024 at 4:02 PM KK CHN <kkchn...@gmail.com> wrote: > The logs are here. > > https://pastecode.io/s/s5dp8ur1 > > > > On Thu, Aug 1, 2024 at 3:30 PM Kashif Zeeshan <kashi.zees...@gmail.com> > wrote: > >> Hi >> >> On Thu, Aug 1, 2024 at 2:54 PM KK CHN <kkchn...@gmail.com> wrote: >> >>> List, >>> >>> *Not working (start EPAS server always fails):* >>> >>> 1. Testing PTR using PgBackRest(2.52.1) on RHEL9 EPAS-16, and RHEL9 ( >>> Repo Server) >>> >>> When I do a PTR >>> >>> 1. After doing a table drop and then >>> 2. Noting down the time stamp and then >>> 3. Taking an incremental backup in hope that If I do a restore from this >>> incr Backup, that won't contain the dropped table. >>> 4. Correct me if I am conceptually wrong here. >>> 5. I am *never *successful in restoring the EPAS server in this >>> scenario. >>> >>> >>> *I know the following will work for me, w*hy not the above one if I >>> really want that state of cluster also ? >>> >>> *This is Working. * >>> 1. Create table >>> 2. Noting down the timestamp >>> 3. Taking incremental backup on RepoServer. >>> 4. drop the created table . >>> 5. Then stop the EPAS server and do a PTR, by the --set=step 3 incr >>> backup and target= step 2 time stamp .. It finished the pgaback restore >>> and promote command >>> 6. I am able to start back the EPAS server and see the dropped table >>> recovered there. >>> >>> But If I want a PTR as in the first section it fails.. Why ? >>> >>> Thank you, >>> Krishane >>> >>> >>> >>> >>> >>> *What I have done and results obtained: * >>> >>> Created a table important_table4 in my EPAS and note down the time after >>> creation of this table it is ( t1 : "01-AUG-24 14:08:32.447796+05:30" ) >>> >>> Then I performed an Incremental backup (incr backup: >>> 20240729-160137F_20240801-141148I ) >>> timestamp start/stop: 2024-08-01 14:11:48+05:30 / 2024-08-01 >>> 14:11:52+05:30 >>> >>> >>> Now I dropped the table table4 from the EPAS and noted down the time >>> >>> >>> I want to restore the table4,, so I stopped EPAS and executed >>> >>> $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta >>> --set=20240729-160137F_20240801-141148I --target-timeline=current >>> --type=time --target="01-AUG-24 14:08:32.447796+05:30" >>> --target-action=promote restore >>> >>> IT WORKS AS EXPECTED .. after restarting the EPAS I am able to get the >>> important_table4 back. >>> >>> root@service01 ~]# sudo -u enterprisedb psql edb >>> psql (16.3.0) >>> Type "help" for help. >>> >>> edb=# \dt >>> List of relations >>> Schema | Name | Type | Owner >>> --------+------------------+-------+-------------- >>> public | important_table | table | enterprisedb >>> public | important_table2 | table | enterprisedb >>> public | important_table3 | table | enterprisedb >>> public | important_table4 | table | enterprisedb >>> (4 rows) >>> >>> SO all works fine !!!! . >>> >>> >>> *But Now the PROBLEM Statement. * >>> >>> *1. I am dropping the table table 4 again * >>> edb=# \q >>> [root@service01 ~]# sudo -u enterprisedb psql -c "begin; drop table >>> important_table4; commit;" edb >>> BEGIN >>> DROP TABLE >>> COMMIT >>> *2 . [root@service01 ~]#* sudo -u enterprisedb psql -Atc "select >>> current_timestamp" edb 01-AUG-24 14:23:22.085076 +05:30 >>> Noting the time as : (01-AUG-24 14:23:22.085076 +05:30 ) >>> >>> 3. Now I am performing an incremental backup after step 2 on REPO >>> SErver ( Hoping that this latest INCR Backup is without dropped >>> important_table4, so that a recovery of the cluster shouldn't show the >>> table4 again. ) >>> >>> incr backup details. : 20240729-160137F_20240801-142433I >>> timestamp start/stop*: 2024-08-01 14:24:33+05:30 / >>> 2024-08-01 14:24:36+05:30* >>> >>> 4. Now I want to test the database recovery after dropping the table4 >>> in step1 to verify that my EPAS restores from the backup in step 3 and time >>> stamp (01-AUG-24 14:23:22.085076 +05:30, so that the restored EPAS >>> cluster doesn't contain the important_table4. >>> >>> 5. $ sudo -u enterprisedb pgbackrest --stanza=Demo_Repo --delta >>> --set=20240729-160137F_20240801-142433I --target-timeline=current >>> --type=time --target="01-AUG-24 14:23:22.085076+05:30" >>> --target-action=promote restore >>> ------------ >>> ------------- >>> INFO: restore command end: completed successfully (1035ms) >>> >>> *ISSUE: I am unable to get the EPAS Server* in running state after >>> step 5 >>> >>> *What am I doing wrong ? OR am I conceptually wrong ?* >>> >>> >>> >>> >>> OUTPUT on executing step 5. >>> >>> [root@service01 ~]# sudo -u enterprisedb pgbackrest --stanza=Demo_Repo >>> --delta --set=20240729-160137F_20240801-142433I --target-timeline=current >>> --type=time --target="01-AUG-24 14:23:22.085076+05:30" >>> --target-action=promote restore >>> >>> 2024-08-01 14:30:03.535 P00 INFO: restore command begin 2.52.1: >>> --delta --exec-id=82738-b5fe7415 --log-level-console=info >>> --log-level-file=debug --pg1-path=/var/lib/edb/as16/data >>> --pg-version-force=16 --repo1-host=10.10.20.7 --repo1-host-user=postgres >>> --set=20240729-160137F_20240801-142433I --stanza=Demo_Repo >>> --target="01-AUG-24 14:23:22.085076+05:30" --target-action=promote >>> --target-timeline=current --type=time >>> 2024-08-01 14:30:03.880 P00 INFO: repo1: restore backup set >>> 20240729-160137F_20240801-142433I, recovery will start at 2024-08-01 >>> 14:24:33 >>> 2024-08-01 14:30:03.881 P00 INFO: remove invalid files/links/paths >>> from '/var/lib/edb/as16/data' >>> 2024-08-01 14:30:04.567 P00 INFO: write updated >>> /var/lib/edb/as16/data/postgresql.auto.conf >>> 2024-08-01 14:30:04.569 P00 INFO: restore global/pg_control (performed >>> last to ensure aborted restores cannot be started) >>> 2024-08-01 14:30:04.569 P00 INFO: restore size = 75.9MB, file total = >>> 2171 >>> 2024-08-01 14:30:04.569 P00 INFO: restore command end: completed >>> successfully (1035ms) >>> *[root@service01 ~]# systemctl start edb-as-16.service* >>> >>> *Now If I check the server status : Its dead * >>> >>> [root@service01 ~]# systemctl status edb-as-16.service >>> × edb-as-16.service - EDB Postgres Advanced Server 16 >>> Loaded: loaded (/etc/systemd/system/edb-as-16.service; disabled; >>> preset: disabled) >>> *Active: failed* (Result: exit-code) since Thu 2024-08-01 14:30:58 >>> IST; 4s ago >>> Duration: 228ms >>> Process: 82752 ExecStartPre=/usr/edb/as16/bin/edb-as-16-check-db-dir >>> ${PGDATA} (code=exited, status=0/SUCCESS) >>> Process: 82757 ExecStart=/usr/edb/as16/bin/edb-postgres -D ${PGDATA} >>> (code=exited, status=1/FAILURE) >>> Main PID: 82757 (code=exited, status=1/FAILURE) >>> CPU: 325ms >>> >>> Aug 01 14:30:56 service01 systemd[1]: Starting EDB Postgres Advanced >>> Server 16... >>> Aug 01 14:30:56 service01 edb-postgres[82757]: 2024-08-01 14:30:56 IST >>> LOG: redirecting log output to logging collector process >>> Aug 01 14:30:56 rservice01 edb-postgres[82757]: 2024-08-01 14:30:56 IST >>> HINT: Future log output will appear in directory "log". >>> Aug 01 14:30:58 service01 systemd[1]: Started EDB Postgres Advanced >>> Server 16. >>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Main process >>> exited, code=exited, status=1/FAILURE >>> Aug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Killing process >>> 82758 (edb-postgres) with signal SIGKILL. >>> A*ug 01 14:30:58 service01 systemd[1]: edb-as-16.service: Failed with >>> result 'exit-code'.* >>> [root@service01 ~]# >>> >>> Any hints/guidance most welcome. >>> >>> Can you please share the DB Server log as it contains the exact error >> which is causing the server not to start. >> >> Thanks >> >