On Fri, 14 Nov 2025 at 09:34, Hannu Krosing <[email protected]> wrote:
> Added to https://commitfest.postgresql.org/patch/6219/
I think this could be useful, but I think you'll need to find a way to not do this for non-heap tables. Per the comments in TableAmRoutine, both scan_set_tidrange and scan_getnextslot_tidrange are optional callback functions, and the planner won't produce TIDRangePaths if either of those doesn't exist. Maybe that means you need to consult pg_class.relam to ensure the amname is 'heap', or at least that relam = 2. Testing with Citus's columnar AM, I get:

postgres=# select * from t where ctid between '(0,1)' and '(10,0)';
ERROR:  UPDATE and CTID scans not supported for ColumnarScan

1. For the patch, I think you should tighten the new option up to mean the maximum segment size that a table will be dumped in. I see you have comments like:

/* TODO: add hysteresis here, maybe < 1.1 * huge_table_chunk_pages */

You *have* to put the cutoff *somewhere*, so I think it very much should be exactly the specified threshold. If anyone is unhappy that some segments consist of a single page, then it's on them to adjust the parameter accordingly. Otherwise, someone complains that they got a 1-page segment because the table was 10.0001% bigger than the cutoff, and then we're tempted to add a new setting to control the 1.1 factor, which is just silly. If there's a 1-page segment, so what? It's not a big deal.

Perhaps --max-table-segment-pages is a better name than --huge-table-chunk-pages, as it's quite subjective how many pages it takes to make a table "huge".

2. I'm not sure if you're going to get away with using relpages for this. Is it really that bad to query pg_relation_size() when this option is set? If it really is a problem, then maybe let the user choose with another option. I understand we're using relpages for sorting table sizes so we prefer dumping larger tables first, but there it just matters far less if it's not perfectly accurate.

3. You should be able to simplify the code in dumpTableData() so you're not adding any extra cases.
You could use InvalidBlockNumber to indicate an unbounded ctid range and only add a ctid qual to the WHERE clause when you have a bounded range (i.e. not InvalidBlockNumber). That way the first segment needs only WHERE ctid <= '...' and the final one needs only WHERE ctid >= '...'. Everything in between has both an upper and a lower bound. That results in no ctid quals being added when both ends are set to InvalidBlockNumber, which you should use for all tables not large enough to be segmented, thus no special case. TID Range scans are perfectly capable of working when only bounded at one side.

4. I think using "int" here is a future complaint waiting to happen.

+ if (!option_parse_int(optarg, "--huge-table-chunk-pages", 1, INT32_MAX,
+ &dopt.huge_table_chunk_pages))

I bet we'll eventually see a complaint that someone can't make the segment size larger than 16TB. I think option_parse_uint32() might be called for.

David
