Hello All,

I have recently been using sysbench with PostgreSQL 9.0 and 8.4.5, doing some tests on 8-core systems with SSDs. I seem to be hitting a problem with the read-write tests and am hoping to find out whether it is a concurrency bug or expected behavior.

Running sysbench against a 1M-row table with 80+ threads, using

  --oltp-dist-type=special --oltp-table-size=1000000 --oltp-read-only=off
  --oltp-test-mode=complex --max-requests=0 --max-time=300 --num-threads=80
  --test=oltp

produces:

  sysbench 0.4.12: multi-threaded system evaluation benchmark
  Running the test with following options:
  Number of threads: 80
  Doing OLTP test.
  Running mixed OLTP test
  Using Special distribution (12 iterations, 1 pct of values are returned in 75 pct cases)
  Using "BEGIN" for starting transactions
  Using auto_inc on the id column
  Threads started!
  FATAL: query execution failed: 8386864
  FATAL: database error, exiting...
  Done.

with this error reported in pg_log:

  ERROR: duplicate key value violates unique constraint "sbtest_pkey"

The way sysbench works, each transaction deletes a row and then inserts the same row back, so by the transaction's own logic the INSERT should never hit this error: the same transaction has just deleted the row. What is actually happening is a timing interaction with a transaction in some other backend working on the same id. The error is easily reproduced. Here are condensed logs for the two backends involved (pids 14244 and 14307):

  445373 tm:2010-11-16 16:06:55.526 UTC db:sbtest pid:14244 LOG: duration: 0.004 ms execute sbstmt13377649542683423: BEGIN
  441773 tm:2010-11-16 16:06:55.527 UTC db:sbtest pid:14307 LOG: duration: 0.003 ms execute sbstmt899357721350501620: BEGIN
  ...other statements removed...
  445440 tm:2010-11-16 16:06:55.829 UTC db:sbtest pid:14244 LOG: duration: 16.683 ms execute sbstmt610923974845906481: UPDATE sbtest set k=k+1 where id=$1
  445441 tm:2010-11-16 16:06:55.829 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
  445444 tm:2010-11-16 16:06:55.830 UTC db:sbtest pid:14244 LOG: duration: 0.058 ms execute sbstmt81255943842609644: DELETE from sbtest where id=$1
  445445 tm:2010-11-16 16:06:55.830 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
  441840 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14307 LOG: duration: 17.812 ms execute sbstmt375950798157093453: UPDATE sbtest set k=k+1 where id=$1
  441841 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
  445448 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14244 LOG: duration: 0.084 ms execute sbstmt388529267384136315: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')
  445449 tm:2010-11-16 16:06:55.831 UTC db:sbtest pid:14244 DETAIL: parameters: $1 = '500865'
  445451 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14244 LOG: duration: 48.647 ms execute sbstmt271576009267161387: COMMIT
  441844 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14307 LOG: duration: 48.889 ms execute sbstmt718364989767970105: DELETE from sbtest where id=$1
  441845 tm:2010-11-16 16:06:55.880 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
  445452 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14244 LOG: duration: 0.021 ms bind sbstmt13377649542683423: BEGIN
  445453 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14244 LOG: duration: 0.005 ms execute sbstmt13377649542683423: BEGIN
  441846 tm:2010-11-16 16:06:55.881 UTC db:sbtest pid:14307 LOG: duration: 0.036 ms bind sbstmt16151529281767205: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')
  441847 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 DETAIL: parameters: $1 = '500865'
  441848 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 ERROR: duplicate key value violates unique constraint "sbtest_pkey"
  441849 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 DETAIL: Key (id)=(500865) already exists.
  441850 tm:2010-11-16 16:06:55.882 UTC db:sbtest pid:14307 STATEMENT: INSERT INTO sbtest values($1,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy')

(The logs are condensed; rerun sysbench to get the full logs for all 80+ backends if required.)

Reducing the above, in Read Committed isolation the apparent sequence is:

  Session 1 (pid 14244)    Session 2 (pid 14307)
  BEGIN                    BEGIN
  DELETE row
  INSERT row
  COMMIT
                           DELETE row
                           INSERT row
                           COMMIT

If I read this right, then since the COMMIT happens before the second DELETE, that DELETE should have deleted the row again and the second INSERT should have succeeded. So I tried this manually with two psql sessions:

  Session 1: BEGIN;
  Session 2: BEGIN;
  Session 1: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2566416 |       0 | 500815
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2576468 |       0 | 500815
  Session 1: DELETE FROM sbtest WHERE id=500815;
               DELETE 1
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2576468 | 2576518 | 500815
  Session 1: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
               INSERT 0 1
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2576468 | 2576518 | 500815
  Session 1: END;
               COMMIT
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2576518 |       0 | 500815
  Session 2: DELETE FROM sbtest WHERE id=500815;
               DELETE 1
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> (0 rows)
  Session 2: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
               INSERT 0 1
  Session 2: SELECT xmin,xmax,id FROM sbtest WHERE id=500815;
               -> 2576519 |       0 | 500815
  Session 2: END;
               COMMIT

So this interleaving creates no problem, since the second session's DELETE and INSERT happen after the first has committed. Now let's do the same with the second session's DELETE issued before the first session's COMMIT, i.e. in between its DELETE/INSERT and its COMMIT:

  Session 1: BEGIN;
  Session 2: BEGIN;
  Session 1: DELETE FROM sbtest WHERE id=500815;
  Session 1: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
  Session 2: DELETE FROM sbtest WHERE id=500815;
               (blocks, waiting on session 1)
  Session 1: END;
               COMMIT
  Session 2:   DELETE 0
               (the blocked DELETE returns, deletes nothing, but does not abort the transaction)
  Session 2: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
               ERROR: duplicate key value violates unique constraint "sbtest_pkey"
  Session 2: END;
               ROLLBACK
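For what it's worth, at SERIALIZABLE isolation (which on 8.4/9.0 is snapshot isolation) I would expect the same interleaving to fail earlier and more explicitly: the blocked DELETE should raise a serialization failure instead of returning DELETE 0. A sketch of the expected behavior (my reading of the documented semantics, not an actual sysbench trace):

  Session 1: BEGIN ISOLATION LEVEL SERIALIZABLE;
  Session 2: BEGIN ISOLATION LEVEL SERIALIZABLE;
  Session 1: DELETE FROM sbtest WHERE id=500815;
  Session 1: INSERT INTO sbtest VALUES (500815,0,' ','aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
  Session 2: DELETE FROM sbtest WHERE id=500815;
               (blocks, waiting on session 1)
  Session 1: END;
               COMMIT
  Session 2:   ERROR: could not serialize access due to concurrent update

That at least makes the conflict explicit and obviously retryable, but sysbench runs at the default Read Committed level.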
Coming back to the sysbench logs: they are probably just missing the point that the second session's DELETE started before the first transaction committed, since the timestamps are identical even though the lines appear in a different order. The question is whether that blocked DELETE should fail outright once the row version it targeted is gone, or succeed with DELETE 0 as it does now. I also tried issuing the second session's DELETE before the first session's INSERT, with the same result. So irrespective of the exact ordering, if the second DELETE starts before the first transaction commits, the outcome is "ERROR: duplicate key": the DELETE ends up targeting the old row version, and the INSERT then collides with the new one. I am not sure which behavior is right under Read Committed. The end result is that sysbench keeps failing at 64+ threads, and I am trying to figure out what to fix (sysbench or PostgreSQL) and why.
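If the conclusion is that this is simply how Read Committed behaves and the client has to cope, one direction would be to retry on unique_violation. A minimal sketch of the idea (hypothetical illustration, not sysbench's actual code; DO requires 9.0, and the EXCEPTION clause runs each attempt in a subtransaction):

  -- Retry the delete/insert pair when a concurrent session re-creates the row first.
  DO $$
  BEGIN
    LOOP
      BEGIN
        DELETE FROM sbtest WHERE id = 500815;
        INSERT INTO sbtest
          VALUES (500815, 0, ' ', 'aaaaaaaaaaffffffffffrrrrrrrrrreeeeeeeeeeyyyyyyyyyy');
        EXIT;  -- both statements succeeded, leave the loop
      EXCEPTION WHEN unique_violation THEN
        -- Another backend committed an INSERT for this id between our DELETE
        -- and INSERT; under Read Committed the next DELETE attempt takes a
        -- fresh snapshot and will see (and remove) that row, so just loop.
        NULL;
      END;
    END LOOP;
  END
  $$;

In sysbench itself the equivalent would be to check for SQLSTATE 23505 (unique_violation), ROLLBACK, and re-run the transaction, instead of treating the error as fatal and exiting as it does now.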
Thanks.

Regards,
Jignesh