2017-09-14 14:59 GMT+02:00 Frank Millman <fr...@chagford.com>: > Pavel Stehule wrote: > > 2017-09-14 10:14 GMT+02:00 Frank Millman <fr...@chagford.com>: > >> Hi all >> >> This is a follow-up to a recent question I posted regarding a slow query. >> I thought that the slowness was caused by the number of JOINs in the query, >> but with your assistance I have found the true reason. I said in the >> previous thread that the question had become academic, but now that I >> understand things better, it is no longer academic as it casts doubt on my >> whole approach. >> >> I have split my AR transaction table into three physical tables – >> ar_tran_inv, ar_tran_crn, ar_tran_rec. I will probably add others at some >> point, such as ar_tran_jnl. >> >> I then create a VIEW to view all transactions combined. The view is >> created like this - >> >> CREATE VIEW ar_trans AS >> SELECT ‘ar_inv’ AS tran_type, row_id AS tran_row_id, tran_number ... >> FROM ar_tran_inv WHERE posted = ‘1’ >> UNION ALL >> SELECT ‘ar_crn’ AS tran_type, row_id AS tran_row_id, tran_number ... >> FROM ar_tran_crn WHERE posted = ‘1’ >> UNION ALL >> SELECT ‘ar_rec’ AS tran_type, row_id AS tran_row_id, tran_number ... >> FROM ar_tran_rec WHERE posted = ‘1’ >> >> I have another table called ‘ar_trans_due’, to keep track of outstanding >> transactions. All of the three transaction types generate entries into this >> table. To identify the source of the transaction, I have created columns in >> ar_trans_due called ‘tran_type’ and ‘tran_row_id’. After inserting a row >> into ‘ar_tran_inv’, I invoke this - >> >> INSERT INTO ar_trans_due (tran_type, tran_row_id, ...) VALUES >> (‘ar_inv’, ar_tran_inv.row_id, ...), and similar for the other transaction >> types. It is handled by a Python program, and it all happens within a >> transaction. >> >> When I view a row in ar_trans_due, I want to retrieve data from the >> source transaction, so I have this - >> >> SELECT * FROM ar_trans_due a >> LEFT JOIN ar_trans b ON b.tran_type = a.tran_type AND b.tran_row_id = >> a.tran_row_id >> >> I understand that PostgreSQL must somehow follow a path from the view >> ‘ar_trans’ to the physical table ‘ar_tran_inv’, but I assumed it would >> execute the equivalent of SELECT * FROM ar_tran_inv WHERE row_id = >> a.tran_row_id AND posted = ‘1’. >> >> If this was the case, it would be an indexed read, and very fast. >> Instead, according to EXPLAIN, it performs a sequential scan of the >> ‘ar_tran_inv’ table. >> >> It also scans ‘ar_tran_crn’ and ‘ar_tran_rec’, but EXPLAIN shows that it >> uses a Bitmap Heap Scan on those. I assume that is because the tables are >> currently empty. >> >> Is this analysis correct? >> > > please, send EXPLAIN ANALYZE result :) > >> >> > > I tried to reduce this to its simplest form. > > Here is a SQL statement - > > SELECT * > FROM ccc.ar_trans_due a > LEFT JOIN ccc.ar_trans b ON > b.tran_type = a.tran_type AND b.tran_row_id = a.tran_row_id > WHERE a.row_id = 1 > > ar_trans_due is a physical table, ar_trans is a view. > > It takes about 28ms. Here is the explain - https://explain.depesz.com/s/ > 8YY >
T > > Then I changed it to join each of the physical tables, instead of the view > - > > SELECT * > FROM ccc.ar_trans_due a > LEFT JOIN ccc.ar_tran_inv b ON > b.row_id = a.tran_row_id > LEFT JOIN ccc.ar_tran_crn c ON > c.row_id = a.tran_row_id > LEFT JOIN ccc.ar_tran_rec d ON > d.row_id = a.tran_row_id > WHERE a.row_id = 1 > > This takes just over 1ms. Here is the explain - > https://explain.depesz.com/s/U29h > > I tried setting enable_seq_scan to off – it ran even slower! > > Frank > >