On Wed, 20 Apr 2022 at 14:56, Bruce Momjian <br...@momjian.us> wrote:
> NVMe devices have a maximum queue length of 64k:
>
> Should we increase its maximum to 64k?  Backpatched?  (SATA has a
> maximum queue length of 256.)

I have a machine here with 1 x PCIe 3.0 NVMe SSD and also 1 x PCIe 4.0
NVMe SSD. I ran a few tests to see how different values of
effective_io_concurrency would affect performance.

I tried to come up with a query that did little enough CPU processing
to ensure that I/O was the clear bottleneck.

The test was with a 128GB table on a machine with 64GB of RAM. I
padded the tuples out so there were 4 per page so that the aggregation
didn't have much work to do.

The query I ran was:

    explain (analyze, buffers, timing off) select count(p) from r where a = 1;

Here's what I saw:

NVME PCIe 3.0 (Samsung 970 Evo 1TB)
e_i_c    query_time_ms
0            88627.221
1           652915.192
5           271536.054
10          141168.986
100          67340.026
1000         70686.596
10000        70027.938
100000       70106.661

Saw a max of 991 MB/sec in iotop

NVME PCIe 4.0 (Samsung 980 Pro 1TB)
e_i_c    query_time_ms
0            59306.960
1           956170.704
5           237879.121
10          135004.111
100          55662.030
1000         51513.717
10000        59807.824
100000       53443.291

Saw a max of 1126 MB/sec in iotop

I'm not pretending that this is the best query and table size to show
it, but at least this test shows that there's not much to gain by
prefetching further. I imagine going further than we need to is
likely to have negative consequences due to populating the kernel page
cache with buffers that won't be used for a while. I also imagine
going too far out likely increases the risk that buffers we've
prefetched are evicted before they're used.

This does also highlight that an effective_io_concurrency of 1 (the
default) is pretty terrible in this test. The bitmap contained every
2nd page. I imagine that would break normal page prefetching by the
kernel. If that's true, then it does not explain why e_i_c = 0 was so
fast.

I've attached the test setup that I did. I'm open to modifying the
test and running again if someone has an idea that might show benefits
to larger values for effective_io_concurrency.

David
setup.sql
Description: Binary data
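
The timings in the message can be put on a common scale by dividing each one by the e_i_c = 0 time for the same drive. The following is a minimal sketch of that arithmetic only. It is not part of the attached setup.sql, and the names TIMINGS_MS, relative_to_baseline and fastest_setting are illustrative assumptions rather than anything from the original test.

```python
# Query times in milliseconds, copied from the two tables in the message.
# The dictionary layout and all names here are illustrative assumptions.
TIMINGS_MS = {
    "PCIe 3.0 (Samsung 970 Evo 1TB)": {
        0: 88627.221, 1: 652915.192, 5: 271536.054, 10: 141168.986,
        100: 67340.026, 1000: 70686.596, 10000: 70027.938, 100000: 70106.661,
    },
    "PCIe 4.0 (Samsung 980 Pro 1TB)": {
        0: 59306.960, 1: 956170.704, 5: 237879.121, 10: 135004.111,
        100: 55662.030, 1000: 51513.717, 10000: 59807.824, 100000: 53443.291,
    },
}


def relative_to_baseline(timings, baseline=0):
    """Return {e_i_c: query time divided by the time at `baseline`}."""
    base = timings[baseline]
    return {eic: ms / base for eic, ms in timings.items()}


def fastest_setting(timings):
    """Return the e_i_c value with the lowest query time."""
    return min(timings, key=timings.get)


if __name__ == "__main__":
    for device, timings in TIMINGS_MS.items():
        print(device)
        for eic, ratio in relative_to_baseline(timings).items():
            # A ratio above 1.0 means slower than e_i_c = 0.
            print(f"  e_i_c={eic:<7} {ratio:6.2f}x the e_i_c=0 time")
        print(f"  fastest setting: e_i_c={fastest_setting(timings)}")
```

On these numbers, e_i_c = 1 takes roughly 7 times as long as e_i_c = 0 on the PCIe 3.0 drive and roughly 16 times as long on the PCIe 4.0 drive, and the fastest runs are at e_i_c = 100 and e_i_c = 1000 respectively.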