On Tue, Mar 28, 2023 at 5:08 PM Peter J. Holzer <hjp-pg...@hjp.at> wrote:
> On 2023-03-28 07:26:47 -0700, Adrian Klaver wrote: > > On 3/28/23 06:17, Dominique Devienne wrote: > > > PS: At this point, I don't even know how much cluster affects > performance. > > I think that this depends a lot on your access patterns As I wrote, per-parent access to child and grandchild rows is typical. So w/o parent-based clustering of grandchild table(s), access those rows could potential seek to several (~50, see below) smaller clusters with arbitrary gaps. Cardinality is a few to ~20K on parent, x10-x50 on child, x20 - x100 on grandchild. So total row count rarely exceeds the 1M - 10M range. But there are LOBs/BYTEa... > (especially on > how much you update the grandchild table and whether those updates can > be HOT), so you will probably have to measure it yourself with a > realistic work load. > In this particular case, there aren't much UPDATEs, because of a deficiency of the client applications, which mostly do DELETE+INSERT instead of UPDATEs. Although we have to cascade modified dates up the parent hierarchy, so some UPDATEs do occur, but mostly on the less numerous child and parent tables. > (Personally I doubt the impact is large, but I don't know your data or > your access patterns.) > OK. > > > But because it can affect the schema structure (by denormalizing), > > > i'd rather know early. > > > > You will need to explain to me how it denormalizes? It reorders rows by > > index definition and does not maintain that order over updates and > inserts. > > I think he means that in order to cluster the grandchild table by the > parent.id > he would have to denormalize the table. > exactly. Thanks for your input.