On Wed, Sep 11, 2024 at 7:19 AM Nazir Bilal Yavuz <byavu...@gmail.com> wrote: > > Currently, in the pg_stat_io view, IOs are counted as blocks. However, there > are two issues with this approach: > > 1- The actual number of IO requests to the kernel is lower because IO > requests can be merged before sending the final request. Additionally, it > appears that all IOs are counted in block size.
I think this is a great idea. It will allow people to tune io_combine_limit as you mention below. > 2- Some IOs may not align with block size. For example, WAL read IOs are done > in variable bytes and it is not possible to correctly show these IOs in the > pg_stat_io view [1]. Yep, this makes a lot of sense as a solution. > To address this, I propose showing the total number of IO requests to the > kernel (as smgr function calls) and the total number of bytes in the IO. To > implement this change, the op_bytes column will be removed from the > pg_stat_io view. Instead, the [reads | writes | extends] columns will track > the total number of IO requests, and newly added [read | write | > extend]_bytes columns will track the total number of bytes in the IO. smgr API seems like the right place for this. > Example benefit of this change: > > Running query [2], the result is: > > ╔═══════════════════╦══════════╦══════════╦═══════════════╗ > ║ backend_type ║ object ║ context ║ avg_io_blocks ║ > ╠═══════════════════╬══════════╬══════════╬═══════════════╣ > ║ client backend ║ relation ║ bulkread ║ 15.99 ║ > ╠═══════════════════╬══════════╬══════════╬═══════════════╣ > ║ background worker ║ relation ║ bulkread ║ 15.99 ║ > ╚═══════════════════╩══════════╩══════════╩═══════════════╝ I don't understand why background worker is listed here. > You can rerun the same query [2] after setting io_combine_limit to 32 [3]. > The result is: > > ╔═══════════════════╦══════════╦══════════╦═══════════════╗ > ║ backend_type ║ object ║ context ║ avg_io_blocks ║ > ╠═══════════════════╬══════════╬══════════╬═══════════════╣ > ║ client backend ║ relation ║ bulkread ║ 31.70 ║ > ╠═══════════════════╬══════════╬══════════╬═══════════════╣ > ║ background worker ║ relation ║ bulkread ║ 31.60 ║ > ╚═══════════════════╩══════════╩══════════╩═══════════════╝ > > I believe that having visibility into avg_io_[bytes | blocks] is valuable > information that could help optimize Postgres. In general, for this example, I think it would be more clear if you compared what visibility we have in pg_stat_io on master with what visibility we have with your patch. I like that you show how io_combine_limit can be tuned using this, but I don't think the problem statement is clear nor is the full narrative. > CREATE TABLE t as select i, repeat('a', 600) as filler from > generate_series(1, 10000000) as i; > SELECT pg_stat_reset_shared('io'); > SELECT * FROM t WHERE i = 0; > SELECT backend_type, object, context, TRUNC((read_bytes / reads / (SELECT > current_setting('block_size')::numeric)), 2) as avg_io_blocks FROM pg_stat_io > WHERE reads > 0; I like that you calculate the avg_io_blocks, but I think it is good to show the raw columns as well. - Melanie