Hi,

I am getting a server crash on the standby while executing the pg_logical_slot_get_changes function. Please refer to this scenario:

Master cluster (./initdb -D master):
Set wal_level='hot_standby' in the master/postgresql.conf file.
Start the server, connect to a psql terminal, and create a physical replication slot (SELECT * from pg_create_physical_replication_slot('p1');).

Perform pg_basebackup using --slot 'p1' (./pg_basebackup -D slave/ -R --slot p1 -v).
Set wal_level='logical', hot_standby_feedback=on, and primary_slot_name='p1' in the slave/postgresql.conf file.
Start the server, connect to a psql terminal, and create a logical replication slot (SELECT * from pg_create_logical_replication_slot('t','test_decoding');).

Run pgbench (./pgbench -i -s 10 postgres) on the master, then select from pg_logical_slot_get_changes on the slave database:
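
For convenience, the steps above can be collected into a script. This is only a rough sketch, not the exact commands I typed: it assumes both clusters run on the same host, that the PostgreSQL binaries are on PATH, and that the master listens on 5432 and the slave on 5555 (the ports are my assumption for this sketch; only the settings, slot names and commands come from the steps above). The helper names set_conf, configure_master, configure_standby and reproduce are made up for illustration. Nothing is executed until reproduce is called.

```shell
#!/bin/sh
# Sketch of the reproduction steps. Ports and helper names are assumptions.
MASTER_PORT=${MASTER_PORT:-5432}
STANDBY_PORT=${STANDBY_PORT:-5555}

# Append one "name = value" line to DIR/postgresql.conf.
# A later setting in the file overrides an earlier one with the same name.
set_conf() {
    printf "%s = %s\n" "$2" "$3" >> "$1/postgresql.conf"
}

# Master settings from the report.
configure_master() {
    set_conf "$1" wal_level "'hot_standby'"
}

# Slave settings from the report, plus the assumed port.
configure_standby() {
    set_conf "$1" wal_level "'logical'"
    set_conf "$1" hot_standby_feedback on
    set_conf "$1" primary_slot_name "'p1'"
    set_conf "$1" port "$STANDBY_PORT"
}

# Run the whole scenario; stops at the first failing command.
reproduce() {
    initdb -D master &&
    configure_master master &&
    pg_ctl -D master -w -o "-p $MASTER_PORT" -l master.log start &&
    psql -p "$MASTER_PORT" -d postgres \
        -c "SELECT * FROM pg_create_physical_replication_slot('p1');" &&
    pg_basebackup -p "$MASTER_PORT" -D slave -R --slot p1 -v &&
    configure_standby slave &&
    pg_ctl -D slave -w -l slave.log start &&
    psql -p "$STANDBY_PORT" -d postgres \
        -c "SELECT * FROM pg_create_logical_replication_slot('t','test_decoding');" &&
    pgbench -p "$MASTER_PORT" -i -s 10 postgres &&
    psql -p "$STANDBY_PORT" -d postgres \
        -c "SELECT * FROM pg_logical_slot_get_changes('t',NULL,NULL);"
}
```

Source the file and call reproduce from an empty working directory. Creating the logical slot on the slave only works on a build that includes the patch under test in this thread.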

postgres=# select * from pg_logical_slot_get_changes('t',null,null);
2019-03-13 20:34:50.274 IST [26817] LOG:  starting logical decoding for slot "t"
2019-03-13 20:34:50.274 IST [26817] DETAIL:  Streaming transactions committing after 0/6C000060, reading WAL from 0/6C000028.
2019-03-13 20:34:50.274 IST [26817] STATEMENT:  select * from pg_logical_slot_get_changes('t',null,null);
2019-03-13 20:34:50.275 IST [26817] LOG:  logical decoding found consistent point at 0/6C000028
2019-03-13 20:34:50.275 IST [26817] DETAIL:  There are no running transactions.
2019-03-13 20:34:50.275 IST [26817] STATEMENT:  select * from pg_logical_slot_get_changes('t',null,null);
TRAP: FailedAssertion("!(data == tupledata + tuplelen)", File: "decode.c", Line: 977)
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset:
2019-03-13 20:34:50.276 IST [26809] LOG:  server process (PID 26817) was terminated by signal 6: Aborted

Stack trace -

(gdb) bt
#0  0x00007f370e673277 in raise () from /lib64/libc.so.6
#1  0x00007f370e674968 in abort () from /lib64/libc.so.6
#2  0x0000000000a30edf in ExceptionalCondition (conditionName=0xc36090 "!(data == tupledata + tuplelen)", errorType=0xc35f5c "FailedAssertion", fileName=0xc35d70 "decode.c",
    lineNumber=977) at assert.c:54
#3  0x0000000000843c6f in DecodeMultiInsert (ctx=0x2ba1ac8, buf=0x7ffd7a5136d0) at decode.c:977
#4  0x0000000000842b32 in DecodeHeap2Op (ctx=0x2ba1ac8, buf=0x7ffd7a5136d0) at decode.c:375
#5  0x00000000008424dd in LogicalDecodingProcessRecord (ctx=0x2ba1ac8, record=0x2ba1d88) at decode.c:125
#6  0x000000000084830d in pg_logical_slot_get_changes_guts (fcinfo=0x2b95838, confirm=true, binary=false) at logicalfuncs.c:307
#7  0x000000000084846a in pg_logical_slot_get_changes (fcinfo=0x2b95838) at logicalfuncs.c:376
#8  0x00000000006e5b9f in ExecMakeTableFunctionResult (setexpr=0x2b93ee8, econtext=0x2b93d98, argContext=0x2b99940, expectedDesc=0x2b97970, randomAccess=false) at execSRF.c:233
#9  0x00000000006fb738 in FunctionNext (node=0x2b93c80) at nodeFunctionscan.c:94
#10 0x00000000006e52b1 in ExecScanFetch (node=0x2b93c80, accessMtd=0x6fb67b <FunctionNext>, recheckMtd=0x6fba77 <FunctionRecheck>) at execScan.c:93
#11 0x00000000006e5326 in ExecScan (node=0x2b93c80, accessMtd=0x6fb67b <FunctionNext>, recheckMtd=0x6fba77 <FunctionRecheck>) at execScan.c:143
#12 0x00000000006fbac1 in ExecFunctionScan (pstate=0x2b93c80) at nodeFunctionscan.c:270
#13 0x00000000006e3293 in ExecProcNodeFirst (node=0x2b93c80) at execProcnode.c:445
#14 0x00000000006d8253 in ExecProcNode (node=0x2b93c80) at ../../../src/include/executor/executor.h:241
#15 0x00000000006daa4e in ExecutePlan (estate=0x2b93a28, planstate=0x2b93c80, use_parallel_mode=false, operation=CMD_SELECT, sendTuples=true, numberTuples=0,
    direction=ForwardScanDirection, dest=0x2b907e0, execute_once=true) at execMain.c:1643
#16 0x00000000006d8865 in standard_ExecutorRun (queryDesc=0x2afff28, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:362
#17 0x00000000006d869b in ExecutorRun (queryDesc=0x2afff28, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:306
#18 0x00000000008ccef1 in PortalRunSelect (portal=0x2b36168, forward=true, count=0, dest=0x2b907e0) at pquery.c:929
#19 0x00000000008ccb90 in PortalRun (portal=0x2b36168, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x2b907e0, altdest=0x2b907e0, completionTag=0x7ffd7a513e90 "")
    at pquery.c:770
#20 0x00000000008c6b58 in exec_simple_query (query_string=0x2adc1e8 "select * from pg_logical_slot_get_changes('t',null,null);") at postgres.c:1215
#21 0x00000000008cae88 in PostgresMain (argc=1, argv=0x2b06590, dbname=0x2b063d0 "postgres", username=0x2ad8da8 "centos") at postgres.c:4256
#22 0x0000000000828464 in BackendRun (port=0x2afe3b0) at postmaster.c:4399
#23 0x0000000000827c42 in BackendStartup (port=0x2afe3b0) at postmaster.c:4090
#24 0x0000000000824036 in ServerLoop () at postmaster.c:1703
#25 0x00000000008238ec in PostmasterMain (argc=3, argv=0x2ad6d00) at postmaster.c:1376
#26 0x0000000000748aab in main (argc=3, argv=0x2ad6d00) at main.c:228
(gdb)

regards,


On 03/07/2019 09:03 PM, tushar wrote:
There is another issue, where I am getting an error while executing "pg_logical_slot_get_changes" on the SLAVE.

Master (running on port=5432) - run "make installcheck" from the regress/ folder after setting PATH=<installation>/bin:$PATH and export PGDATABASE=postgres
Slave (running on port=5555) - connect to the regression database and select from pg_logical_slot_get_changes

[centos@mail-arts bin]$ ./psql postgres -p 5555 -f t.sql
You are now connected to database "regression" as user "centos".
 slot_name |    lsn
-----------+-----------
 m61       | 1/D437AD8
(1 row)

psql:t.sql:3: ERROR:  could not resolve cmin/cmax of catalog tuple

[centos@mail-arts bin]$ cat t.sql
\c regression
SELECT * from   pg_create_logical_replication_slot('m61', 'test_decoding');
select * from pg_logical_slot_get_changes('m61',null,null);

regards,

On 03/04/2019 10:57 PM, Andres Freund wrote:
Hi,

On 2019-03-04 16:54:32 +0530, tushar wrote:
On 03/01/2019 11:16 PM, Andres Freund wrote:
So, if I understand correctly you do *not* have a physical replication
slot for this standby? For the feature to work reliably that needs to
exist, and you need to have hot_standby_feedback enabled. Does having
that fix the issue?
OK, this time around I performed it like this:

.) Master cluster (set wal_level=logical and hot_standby_feedback=on in
postgresql.conf), start the server and create a physical replication slot
Note that hot_standby_feedback=on needs to be set on a standby, not on
the primary (although it doesn't do any harm there).

Thanks,

Andres



--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

