Tomas' e-mail from earlier today, which I've already replied to directly, seems to have been lost on its way to the mailing list. This must be due to its 1MB attachment (the results spreadsheet), which seems a bit aggressive as a reason to withhold it IMV.

Here is a link to his results, converted to a Google docs spreadsheet:

https://docs.google.com/spreadsheets/d/1SL_IIkPdiJUZ9BHUgROcYkEKWZUch4wRax2SyOmMuT8/edit?usp=sharing

Here is his e-mail, in full:

On Fri, Sep 15, 2017 at 6:34 AM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:
> Attached is a spreadsheet with updated data (using the more accurate
> timing, and comparing master with the default replacement_sort_tuples
> value (150k) and increased per Peter's instructions (to 1B).
>
> There are 6 sheets - one for each combination of a dataset size (100k
> and 1M) and machine where I ran the tests (with different CPU models).
>
> There are 5 columns - first three are medians for each of the tested
> PostgreSQL configurations:
>
> - master (default)
> - master (1billion)
> - no-replacement-sort (with patch applied)
>
> The numbers are medians from 5 runs (perhaps minimums would be better in
> this case).
>
> The last two columns are comparisons / speed-ups
>
> - master(1B) / master(default)
> - no-replacement-sort / master(default)
>
> Green numbers (below 1.0) mean speed-up, red (above 1.0) slow-down.
>
> Firstly, the master with r_s_t=1B setting performs either the same or
> worse compared to a default values, in almost every test. On the 100k
> data set the results are a bit noisy (particularly on the oldest CPU),
> but on the 1M data set the difference is quite clear. So I've only
> compared results for master(default) and patched.
>
> Quick summary, for each CPU model (which clearly affects the behavior).
>
>
> e5-2620-v4
> ----------
> - probably the CPU we should be looking at, as it's the current model
> - in some cases this gives us 3-5x speedup with low work_mem settings
> - consider for example the very last line, which shows that
>
>   SELECT DISTINCT a FROM text_test_padding ORDER BY a
>
> completed in ~5531ms on work_mem=8MB and 1067ms on 32MB, but with the
> patch it completes in 1784ms (1MB), 1211ms (4MB) and 1104 (8MB).
>
> - Of course, this is for already-sorted data, and for other data sets
> the improvement is more modest. It's difficult to summarize this into a
> single number, but there are plenty of improvements in 20-30% range.
>
> - Some cases are a bit slower, of course, but overall I'd say the chart
> is much more green than red. Also the slow-downs are much smaller,
> compared to speed-ups (generally within 1-5%).
>
> i5-2500k
> --------
> - same story, but this CPU gives more stable results (less noise)
>
> e5-5450
> -------
> - rather noisy CPU, particularly on the small (100k) dataset
> - the larger data set mostly matches the other CPUs, although the
> regressions are somewhat more significant
> - I wouldn't really worry about this too much, it's clearly an obsolete
> CPU and not something performance-conscious person would use nowadays
> (the other CPUs are often 2-3x faster).
>
> If needed, full data is available here (each machine is pushing data to
> a separate git repository):
>
> * https://bitbucket.org/tvondra/sort-bench-i5-2500k
> * https://bitbucket.org/tvondra/sort-bench-e5-2620-v4
> * https://bitbucket.org/tvondra/sort-bench-e5-5450
>
> At this point the 10M row tests are running, but I don't expect anything
> particularly surprising from those results. That is, it's not something
> that should block getting this patch committed, if the agreement is to
> commit otherwise.
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
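
For anyone who wants to recompute the comparison columns from the raw timings, here is a rough Python sketch of the arithmetic the e-mail describes: take the median of the runs for each configuration, then divide by the master(default) median. The function names and the five per-run timings are made up for illustration; only the 5531ms and 1104ms figures come from the e-mail, and pairing them as the work_mem=8MB master and patched results is my reading of it. This is not the script that produced the spreadsheet.

```python
from statistics import median


def median_ms(runs_ms):
    """Median of repeated timings (in ms) for one query/configuration."""
    if not runs_ms:
        raise ValueError("need at least one timing")
    return median(runs_ms)


def ratio(candidate_ms, baseline_ms):
    """candidate / baseline: below 1.0 is a speed-up, above 1.0 a slow-down."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return candidate_ms / baseline_ms


def colour(r):
    """Colouring as described in the e-mail: green below 1.0, red above 1.0."""
    if r < 1.0:
        return "green"
    if r > 1.0:
        return "red"
    return "neutral"


# Hypothetical timings for five runs, only to illustrate the median step.
example_runs = [1120.0, 1098.0, 1104.0, 1131.0, 1101.0]
patched = median_ms(example_runs)

# Median quoted in the e-mail for the DISTINCT query at work_mem=8MB on master.
master_default = 5531.0

r = ratio(patched, master_default)
print(f"no-replacement-sort / master(default) = {r:.3f} ({colour(r)})")
```

Swapping `median` for `min` in `median_ms` would give the "minimums" variant Tomas mentions as a possible alternative.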