Hello everyone,

> It should be noted that such workload does not really represents 'real life' 
> results,
> as all contention goes into certain partitions (in case of hashtable) and 
> subtrees (in case of tree).
> Next goal is to compare data structures on some kind of realistic benchmark 
> or just create
> multiple databases inside cluster and run corresponding number of pgbench 
> instances.
>

I am still working on a better way to allocate and recycle the different parts 
of the tree (nodes, subtrees, etc.),
but would like to share the latest benchmark results.

Here is a link to the google sheet:
https://docs.google.com/spreadsheets/d/1VfVY0NUnPQYqgxMEXkpxhHvspbT9uZPRV9mflu8UhLQ/edit?usp=sharing

(Excuse the link; it is convenient to accumulate and check results in a 
spreadsheet.)

The comparison was done against PostgreSQL 11.3 (0616aed243).
Each TPC-H query was run 12 times; the server was not restarted between 
queries.
The average is calculated over 10 runs, with the first 2 skipped as warm-up.
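The averaging scheme above can be sketched as follows; this is an illustrative helper, not part of the actual test harness, and the timing values are made up:

```python
def warm_average(timings_ms, skip=2):
    """Mean over the runs that remain after dropping the first
    `skip` warm-up runs (here: 12 runs, first 2 discarded)."""
    kept = timings_ms[skip:]
    return sum(kept) / len(kept)

# Two slow warm-up runs (cold cache), then ten stable runs.
runs = [180.0, 150.0] + [100.0] * 10
print(warm_average(runs))  # -> 100.0
```

Skipping the warm-up runs keeps cold-cache effects out of the reported averages, which matters here since the server is not restarted between queries.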

Generally speaking, the current shared tree performs worse than the hashtable 
on the majority of the TPC-H queries,
especially with 4GB of shared buffers. With a larger buffer cache of 16GB the 
situation looks better,
but there is still a 1-6% performance drop in most of the queries. I haven't 
profiled any of the queries yet, but I suppose
there are a couple of places worth optimizing.

It is interesting to note that the pgbench results show the same pattern as 
with the 128MB and 1GB buffer cache sizes:
the hashtable performs slightly better on the select-only workload, while the 
tree has better tps throughput on the tpcb-like workload.
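For reference, the two pgbench workloads mentioned above can be run like this; the scale factor, client/thread counts, and database name are illustrative, not taken from the tests described here:

```shell
# Initialize a pgbench database (scale factor chosen for illustration).
pgbench -i -s 100 bench

# Select-only workload (-S): read-only point lookups on pgbench_accounts.
pgbench -S -c 32 -j 8 -T 60 bench

# Default tpcb-like workload: mixed read/write transactions.
pgbench -c 32 -j 8 -T 60 bench
```

The select-only workload stresses buffer-mapping lookups almost exclusively, while tpcb-like mixes in writes, which may explain why the two structures rank differently on the two workloads.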


