Hi,

Please find attached a patch that reduces the output size of nodeToString by more than 50% in most cases (measured on pg_rewrite). On my system this reduces the total size of pg_rewrite by 33%, to 472KiB. The patch keeps the textual pg_node_tree format alive, but reduces its size significantly.
The basic techniques used are:

- Don't emit scalar fields when they contain a default value, and make the reading code aware of this.
- Reasonable defaults are set for most datatypes, and overrides can be added with new pg_node_attr() attributes. No introspection into non-null Node/Array/etc. is being done though.
- Reset more fields to their default values before storing the values.
- Don't write trailing 0s in outDatum calls for by-ref types. This saves many bytes for Name fields, but also some other pre-existing entry points.

Future work will probably have to be on a significantly different storage format, as the textual format is about to hit its entropy limits.

See also [0], [1] and [2], where complaints about the verbosity of nodeToString were vocalized.

Kind regards,

Matthias van de Meent

[0] https://www.postgresql.org/message-id/CAEze2WgGexDM63dOvndLdAWwA6uSmSsc97jmrCuNmrF1JEDK7w%40mail.gmail.com
[1] https://www.postgresql.org/message-id/flat/CACxu%3DvL_SD%3DWJiFSJyyBuZAp_2v_XBqb1x9JBiqz52a_g9z3jA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/4b27fc50-8cd6-46f5-ab20-88dbaadca645%40eisentraut.org
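PS: for readers who have not looked at the patch, here is a rough, standalone sketch of two of the techniques listed above: skipping scalar fields that hold their default value, and dropping trailing zero bytes from a by-reference value. This is not code from the patch. DemoNode, the demo_* functions, the field names and the chosen defaults are all hypothetical, and the output only loosely imitates the nodeToString text format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical node with three scalar fields; not a PostgreSQL type. */
typedef struct DemoNode
{
	int			location;		/* assumed default: -1 */
	int			levelsup;		/* assumed default: 0 */
	int			is_distinct;	/* assumed default: 0 */
} DemoNode;

/*
 * Append " :name value" to buf only when value differs from its default.
 * Returns the number of bytes appended.  The caller is assumed to pass a
 * buffer large enough that snprintf never truncates.
 */
static size_t
demo_write_int(char *buf, size_t buflen, const char *name,
			   int value, int defval)
{
	if (value == defval)
		return 0;				/* the reader restores defval on its own */
	return (size_t) snprintf(buf, buflen, " :%s %d", name, value);
}

/* Serialize a DemoNode, omitting every field that holds its default. */
static void
demo_node_to_string(const DemoNode *n, char *buf, size_t buflen)
{
	size_t		off = 0;

	off += (size_t) snprintf(buf + off, buflen - off, "{DEMO");
	off += demo_write_int(buf + off, buflen - off, "location",
						  n->location, -1);
	off += demo_write_int(buf + off, buflen - off, "levelsup",
						  n->levelsup, 0);
	off += demo_write_int(buf + off, buflen - off, "is_distinct",
						  n->is_distinct, 0);
	snprintf(buf + off, buflen - off, "}");
}

/*
 * Length of data once trailing zero bytes are dropped.  A reader that
 * knows the full length can pad the value back out with zeros.
 */
static size_t
demo_trimmed_length(const unsigned char *data, size_t len)
{
	while (len > 0 && data[len - 1] == 0)
		len--;
	return len;
}

/* Small usage example; 64 mirrors the default NAMEDATALEN. */
static void
demo_self_check(void)
{
	DemoNode	n = {-1, 0, 1};
	unsigned char name[64] = "foo";
	char		buf[128];

	demo_node_to_string(&n, buf, sizeof(buf));
	assert(strcmp(buf, "{DEMO :is_distinct 1}") == 0);
	assert(demo_trimmed_length(name, sizeof(name)) == 3);
}
```

In this sketch, a node whose fields are mostly at their defaults collapses to little more than its label, and a short name padded to 64 bytes is written as only its non-zero prefix. The matching read side has to initialize every field to the same default before parsing, which is why the reading code must be aware of the defaults.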
v0-0001-Reduce-the-size-of-serialized-nodes-in-nodeToStri.patch
Description: Binary data