On Wed, May 4, 2011 at 8:03 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvhe...@commandprompt.com> writes:
>> As a followup idea there exists the desire to store records as records
>> and not text representation of same (given differing record types, of
>> course), for which it'd be more worthwhile.
>
> Maybe. The conventional wisdom is that text representation of data is
> more compact than PG's internal representation by a significant factor
> --- our FAQ says up to 5x, in fact. I know that that's including row
> overhead and indexes and so on, but I still don't find it to be a given
> that you're going to win on space with this sort of trick.
I've done a lot of testing of the text vs binary wire formats...not exactly the same set of issues, but pretty close, since you have to send all the oids, lengths, etc. Conventional wisdom is correct, although overstated for this topic. Even in truly pathological cases for text, for example multiple levels of redundant escaping in complex structures, the text format will almost always be smaller. For 'typical' data it can be significantly smaller. The two exceptions most people will run into are bytea, obviously, and the timestamp family of types, where binary-style manipulation is a huge win in terms of both space and performance.

For complex data (say 3+ levels of composites stacked in arrays), the binary format is much *faster*, albeit larger, as long as you are not bandwidth constrained, and presumably it would be as well for variants. Perhaps even more so, because some of the manipulations made when converting tuple storage to the binary wire format don't have to happen.

That said, while there are use cases for sending highly structured data over the wire, I can't think of any for direct storage on a table in variant type scenarios, at least not yet :-).

merlin
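
To put rough numbers on the size comparison above, here is a minimal Python sketch that builds the text and binary encodings of a small int4[] and of a timestamp by hand. It is only an illustration, not libpq or server code: the helper names are made up, the array layout covers just the one-dimensional, no-NULLs case, and the timestamp encoding assumes a server built with integer datetimes (microseconds since 2000-01-01 as a 64-bit integer).

```python
import struct
from datetime import datetime

PG_EPOCH = datetime(2000, 1, 1)  # PostgreSQL timestamps count from this date
INT4_OID = 23                    # OID of int4 in pg_type


def int4_array_text(values):
    """Text form of a one-dimensional int4[] value, e.g. {1,2,3}."""
    return ("{" + ",".join(str(v) for v in values) + "}").encode("ascii")


def int4_array_binary(values):
    """Binary form of a one-dimensional int4[] with no NULLs (assumed layout)."""
    # Header: number of dimensions, has-null flag, element type OID.
    out = struct.pack("!iii", 1, 0, INT4_OID)
    # Per dimension: element count and lower bound.
    out += struct.pack("!ii", len(values), 1)
    # Per element: 4-byte length prefix, then the 4-byte value.
    for v in values:
        out += struct.pack("!ii", 4, v)
    return out


def timestamp_text(ts):
    """Text form in ISO style, e.g. 2011-05-04 20:03:00.123456."""
    return ts.isoformat(sep=" ").encode("ascii")


def timestamp_binary(ts):
    """Binary form assuming integer datetimes: int64 microseconds since 2000-01-01."""
    delta = ts - PG_EPOCH
    micros = (delta.days * 86400 + delta.seconds) * 1_000_000 + delta.microseconds
    return struct.pack("!q", micros)


if __name__ == "__main__":
    arr = [1, 2, 3]
    ts = datetime(2011, 5, 4, 20, 3, 0, 123456)
    print("int4[]    text:", len(int4_array_text(arr)), "binary:", len(int4_array_binary(arr)))
    print("timestamp text:", len(timestamp_text(ts)), "binary:", len(timestamp_binary(ts)))
```

Under these assumptions the three-element array is 7 bytes as text and 44 bytes in binary (the per-element length prefixes and the header dominate), while the timestamp is 26 bytes as text and 8 bytes in binary, which matches the pattern described above.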