On Thursday, October 22, 2020 3:15 PM, Kyotaro Horiguchi 
<horikyota....@gmail.com> wrote:
> I'm not sure about the exact steps of the test, but it can be expected if we
> have many small relations to truncate.
> 
> Currently BUF_DROP_FULL_SCAN_THRESHOLD is set to Nbuffers / 512,
> which is quite arbitrary that comes from a wild guess.
> 
> Perhaps we need to run benchmarks that drops one relation with several
> different ratios between the number of buffers to-be-dropped and Nbuffers,
> and preferably both on spinning rust and SSD.

Sorry to get back to you on this just now.
Since we're prioritizing the vacuum patch, we also need to finalize which 
threshold value to use.
I proceeded testing with my latest set of patches because Amit-san's comments 
on the code, the ones we addressed,
don't really affect the performance. I'll post the updated patches for 0002 & 
0003 after we come up with the proper
boolean parameter name for smgrnblocks and the buffer full scan threshold value.

Test the VACUUM performance with the following thresholds: 
   NBuffers/512, NBuffers/256, NBuffers/128,
and determine which of the ratio has the best performance in terms of speed.

I tested this on my machine (CPU 4v, 8GB memory, ext4) running on SSD.
Configure streaming replication environment.
shared_buffers = 100GB
autovacuum = off
full_page_writes = off
checkpoint_timeout = 30min

[Steps]
1. Create TABLE
2. INSERT data
3. DELETE from TABLE
4. Pause WAL replay on standby
5. VACUUM. Stop the primary.
6. Resume WAL replay and promote standby.

With 1 relation, there were no significant changes that we can observe:
(In seconds)
| s_b   | Master | NBuffers/512 | NBuffers/256 | NBuffers/128 | 
|-------|--------|--------------|--------------|--------------| 
| 128MB | 0.106  | 0.105        | 0.105        | 0.105        | 
| 100GB | 0.106  | 0.105        | 0.105        | 0.105        |

So I tested with 100 tables and got more convincing measurements:

| s_b   | Master | NBuffers/512 | NBuffers/256 | NBuffers/128 | 
|-------|--------|--------------|--------------|--------------| 
| 128MB | 1.006  | 1.007        | 1.006        | 0.107        | 
| 1GB   | 0.706  | 0.606        | 0.606        | 0.605        | 
| 20GB  | 1.907  | 0.606        | 0.606        | 0.605        | 
| 100GB | 7.013  | 0.706        | 0.606        | 0.607        |

The threshold NBuffers/128 has the best performance for default shared_buffers 
(128MB)
with 0.107 s, and equally performing with large shared_buffers up to 100GB.

We can use NBuffers/128 for the threshold, although I don't have a measurement 
for HDD yet. 
However, I wonder if the above method would suffice to determine the final 
threshold that we
can use. If anyone has suggestions on how we can come up with the final value, 
like if I need
to modify some steps above, I'd appreciate it.

Regarding the parameter name. Instead of accurate, we can use "cached" as 
originally intended from
the early versions of the patch since it is the smgr that handles smgrnblocks 
to get the the block
size of smgr_cached_nblocks.. "accurate" may confuse us because the cached 
value may not
be actually accurate..

Regards,
Kirk Jamison



Reply via email to