I would like to propose a patch to improve the cost of bitmap heap scan that is sensitive to work_mem. Currently, in bitmap scan, we don't consider work_mem. Now, in cases when there are a lot of lossy pages bitmap scan gets selected that eventually leads to degraded performance.
While evaluating parallel bitmap heap scan on TPCH we noticed that in many queries selecting bitmap heap scan gives good performance high work_mem but at low work_mem it slows the query compared to sequence scan or index scan. This shows that bitmap heap scan is a better alternative when most of the heap pages fit into work_mem. Attached POC patch fixes the problem by considering work_mem for bitmap costing. Performance numbers with this patch on different values of work_mem are as follows, workload: TPCH scale factor 20 machine: POWER 8 work_mem = 4MB Query Head(ms) Patch(ms) Improvement Change in plan 4 13759.632 14464.491 0.95x PBHS -> PSS 5 47581.558 41888.853 1.14x BHS -> SS 6 14051.553 13853.449 1.01x PBHS -> PSS 8 21529.98 11289.25 1.91x PBHS -> PSS 10 37844.51 34460.669 1.10x BHS -> SS 14 10131.49 15281.49 0.66x BHS -> SS 15 43579.833 34971.051 1.25x BHS -> SS work_mem = 20MB Query Head(ms) Patch(ms) Improvement Change in plan 6 14592 13521.06 1.08x PBHS -> PSS 8 20223.106 10716.062 1.89x PBHS -> PSS 15 40486.957 33687.706 1.20x BHS -> PSS work_mem = 64MB Query Head(ms) Patch(ms) Improvement Change in plan 15 40904.572 25750.873 1.59x BHS -> PBHS work_mem = 1GB No plan got changed Most of the queries show decent improvement, however, Q14 shows regression at work_mem = 4MB. On analysing this case, I found that number of pages_fetched calculated by "Mackert and Lohman formula" is very high (1112817) compared to the actual unique heap pages fetched (293314). Therefore, while costing bitmap scan using 1112817 pages and 4MB of work_mem, we predicted that even after we lossify all the pages it can not fit into work_mem, hence bitmap scan was not selected. -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
improve_bitmap_cost_v1.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers