Try using SERIALIZABLE transactions instead of the default READ COMMITTED.
On 1/2/19 9:28 AM, Mark wrote:
Hi Ron,
Yes, my process will commit the transaction (or roll it back) eventually.
It's the window where
one transaction has deleted all the rows (before committing) and an
ANALYSE has ran.
The deleted rows won't make it into the sample even though the transaction
has NOT been commited.
During this time I will get bad row estimates on rows for that ID. You can
see this in the example below with two
database connections (labelled 1 and 2).
I would have expected the DELETE to have no effect until it was committed.
connection 1=> create table test_table(school_id integer, note text);
CREATE TABLE
connection 1=> insert into test_table(school_id, note) SELECT g.id
<http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> insert into test_table(school_id, note) SELECT g.id
<http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> insert into test_table(school_id, note) SELECT g.id
<http://g.id>,md5(random()::text) FROM generate_series(1,2) as g(id);
INSERT 0 2
connection 1=> select * from test_table ;
school_id | note
-----------+----------------------------------
1 | 0e08cf3990a04f0e943584517c564d31
2 | 96bf83ae5f8eb9342e8408b1ac25cb14
1 | f8fd943012edfe42a03a421df660bc33
2 | cd5d7ff0abca61f18857df9b21d234e0
1 | 60d731f430cb68c7285ddbcd9186eaa0
2 | 635e6c9cf305147ad8684213f0a9299c
(6 rows)
connection 1=> analyse verbose test_table ;
INFO: analyzing "public.test_table"
INFO: "test_table": scanned 1 of 1 pages, containing 6 live rows and 0
dead rows; 6 rows in sample, 6 estimated total rows
ANALYZE
connection 1=> explain analyse select * from test_table where school_id = 2 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Seq Scan on test_table (cost=0.00..1.07 rows=3 width=37) (actual
time=0.011..0.015 rows=3 loops=1)
Filter: (school_id = 2)
Rows Removed by Filter: 3
Planning time: 0.164 ms
Execution time: 0.043 ms
(5 rows)
connection 2=> BEGIN ;
BEGIN
connection 2=> delete from test_table where school_id = 2 ;
DELETE 3
connection 2=> /* This connection is now idle */
connection 1=> analyse verbose test_table ;
INFO: analyzing "public.test_table"
INFO: "test_table": scanned 1 of 1 pages, containing 6 live rows and 0
dead rows; 3 rows in sample, 6 estimated total rows
ANALYZE
connection 1=> explain analyse select * from test_table where school_id = 2 ;
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Seq Scan on test_table (cost=0.00..1.07 rows=1 width=37) (actual
time=0.009..0.014 rows=3 loops=1)
Filter: (school_id = 2)
Rows Removed by Filter: 3
Planning time: 0.095 ms
Execution time: 0.039 ms
(5 rows)
On Wed, 2 Jan 2019 at 14:04 Ron <ronljohnso...@gmail.com
<mailto:ronljohnso...@gmail.com>> wrote:
But Jeff said "left open indefinitely without either committing or
rolling back". Your process is committing the transaction.
On 1/2/19 6:15 AM, Mark wrote:
Hi Jeff,
Thanks for your help. That is exactly what is happening.
I have a long running job which deletes all of the common_student
table and then repopulates it. It takes long time to load all the
other data and commit the transaction. I didn't think the delete
inside the transaction would have any effect until it is commited or
rolled back.
I will have to rewrite the application so it updates the existing
rows rather than deleting all and then inserting.
Thanks again for helping me understand what's happening here.
Proof:
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_student (cost=88.30..3846.49 rows=1533
width=384) (actual time=4.852..7.065 rows=1388 loops=1)
Recheck Cond: (school_id = 36)
Heap Blocks: exact=67
-> Bitmap Index Scan on idx_common_student_sid (cost=0.00..87.91
rows=1533 width=0) (actual time=4.817..4.817 rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.097 ms
Execution time: 8.084 ms
(7 rows)
db=> /* At this point I have started a long running transaction that
deletes all of common_student for school_id 36 */ ;
db=> analyse verbose common_student(school_id);
INFO: analyzing "public.common_student"
INFO: "common_student": scanned 7322 of 7322 pages, containing 65431
live rows and 8060 dead rows; 56818 rows in sample, 65431 estimated
total rows
ANALYZE
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using idx_common_student_sid on common_student
(cost=0.41..8.43 rows=1 width=384) (actual time=0.017..1.498
rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.098 ms
Execution time: 2.583 ms
(4 rows)
db=> /* At this point I have killed the long running transaction that
deletes all of common_student for school_id 36 */ ;
db=> vacuum analyze common_student;
VACUUM
db=> explain analyze select * from common_student where school_id = 36;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on common_student (cost=79.17..3357.79 rows=1388
width=383) (actual time=0.088..1.302 rows=1388 loops=1)
Recheck Cond: (school_id = 36)
Heap Blocks: exact=67
-> Bitmap Index Scan on idx_common_student_sid (cost=0.00..78.83
rows=1388 width=0) (actual time=0.077..0.077 rows=1388 loops=1)
Index Cond: (school_id = 36)
Planning time: 0.327 ms
Execution time: 2.311 ms
(7 rows)
On Sun, 23 Dec 2018 at 02:57 Jeff Janes <jeff.ja...@gmail.com
<mailto:jeff.ja...@gmail.com>> wrote:
- Does the analyse output below mean that it only scanned
51538 of 65463 rows in the table? Is school_id 36 just being
missed in the sample? (This happens when the analyse is
repeated )
Is there a transaction which had deleted all of school_id=36, and
then was just left open indefinitely without either committing or
rolling back?
That would explain it, and I don't know of anything else that
could. The deleted but not committed tuples are still live, but
don't get sampled.
Cheers,
Jeff
--
Angular momentum makes the world go 'round.
--
Angular momentum makes the world go 'round.