Hello, and thank you again for your example !

Sorry for my late answer, I was working on a patch for our requests. I am 
though not completely understanding what is happening. Here is a plan of a 
query where I splitted the calls with OR as you suggested, what seemed to have 
enabled parallel scans.

https://explain.dalibo.com/plan/gfa1cf9fffd01bcg#plan/node/1

But, I still wonder, why was my request that slow ? My current understanding of 
what happened is :


  *   When postgresql does an Index Scan, it goes through a loop (which is not 
parallel) of asking for a chunk of data, and then processing it. It wait for 
having processed the data to ask the next chunk, instead of loading the whole 
index in RAM (which, I suppose, would be much faster, but also not feasible if 
the index is too big and the RAM too small, so postgresql does not do it). 
Thus, the 2MB/s.
  *   When it does a Bitmap Index Scan, it can parallelize disk interactions, 
and does not use the processor to discard lines, thus a much faster index load 
and processing.

I might be completely wrong, and would really like to understand the details, 
in order to explain them to my team, and to other who might encounter the same 
problem.

Again, thank you very much for your help, we were really struggling with those 
slow queries !

Simon FREYBURGER



Interne

De : Andrei Lepikhov <lepi...@gmail.com>
Envoyé : vendredi 5 juillet 2024 04:05
À : FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD 
MANAGEMENT) <simon.freybur...@sncf.fr>; pgsql-performance@lists.postgresql.org; 
Peter Geoghegan <p...@bowt.ie>
Objet : Re: How to solve my slow disk i/o throughput during index scan

On 7/4/24 22: 23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / 
DM RMP YIELD MANAGEMENT) wrote: > Hello, > > Thank you, splitting in “OR” query 
definitely enables bitmap heap scans, > and thus parallelized read to disk


On 7/4/24 22:23, FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE

TGV / DM RMP YIELD MANAGEMENT) wrote:

> Hello,

>

> Thank you, splitting in “OR” query definitely enables bitmap heap scans,

> and thus parallelized read to disk 😃! I though did not understand your

> second point, what is parallel append, and how to enable it ?

Just for example:



DROP TABLE IF EXISTS t CASCADE;

CREATE TABLE t (id int not null, payload text) PARTITION BY RANGE (id);

CREATE TABLE p1 PARTITION OF t FOR VALUES FROM (0) TO (1000);

CREATE TABLE p2 PARTITION OF t FOR VALUES FROM (1000) TO (2000);

CREATE TABLE p3 PARTITION OF t FOR VALUES FROM (2000) TO (3000);

CREATE TABLE p4 PARTITION OF t FOR VALUES FROM (3000) TO (4000);

INSERT INTO t SELECT x % 4000, repeat('a',128) || x FROM

generate_series(1,1E5) AS x;

ANALYZE t;



SET enable_parallel_append = on;

SET parallel_setup_cost = 0.00001;

SET parallel_tuple_cost = 0.00001;

SET max_parallel_workers_per_gather = 8;

SET min_parallel_table_scan_size = 0;

SET min_parallel_index_scan_size = 0;



EXPLAIN (COSTS OFF)

SELECT t.id, t.payload FROM t WHERE t.id % 2 = 0

GROUP BY t.id, t.payload;



  Group

    Group Key: t.id, t.payload

    ->  Gather Merge

          Workers Planned: 6

          ->  Sort

                Sort Key: t.id, t.payload

                ->  Parallel Append

                      ->  Parallel Seq Scan on p1 t_1

                            Filter: ((id % 2) = 0)

                      ->  Parallel Seq Scan on p2 t_2

                            Filter: ((id % 2) = 0)

                      ->  Parallel Seq Scan on p3 t_3

                            Filter: ((id % 2) = 0)

                      ->  Parallel Seq Scan on p4 t_4

                            Filter: ((id % 2) = 0)



Here the table is scanned in parallel. It also works with IndexScan.



--

regards, Andrei Lepikhov


-------
Ce message et toutes les pièces jointes sont établis à l'intention exclusive de 
ses destinataires et sont confidentiels. L'intégrité de ce message n'étant pas 
assurée sur Internet, la SNCF ne peut être tenue responsable des altérations 
qui pourraient se produire sur son contenu. Toute publication, utilisation, 
reproduction, ou diffusion, même partielle, non autorisée préalablement par la 
SNCF, est strictement interdite. Si vous n'êtes pas le destinataire de ce 
message, merci d'en avertir immédiatement l'expéditeur et de le détruire.
-------
This message and any attachments are intended solely for the addressees and are 
confidential. SNCF may not be held responsible for their contents whose 
accuracy and completeness cannot be guaranteed over the Internet. Unauthorized 
use, disclosure, distribution, copying, or any part thereof is strictly 
prohibited. If you are not the intended recipient of this message, please 
notify the sender immediately and delete it.
  • ... Andrei Lepikhov
    • ... FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)
      • ... Andrei Lepikhov
        • ... FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)
          • ... FREYBURGER Simon (SNCF VOYAGEURS / DIRECTION GENERALE TGV / DM RMP YIELD MANAGEMENT)
            • ... David G. Johnston
            • ... Andrei Lepikhov
          • ... Andrei Lepikhov

Reply via email to