On Tue, Sep 20, 2022 at 1:00 PM Alexander Korotkov <aekorot...@gmail.com> wrote:
> On Tue, Sep 20, 2022 at 7:48 AM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > Peter Geoghegan <p...@bowt.ie> writes:
> > > On Mon, Sep 19, 2022 at 8:39 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > >> Our existing format is certainly not great on those metrics, but
> > >> I do not see how "let's use JSON!" is a route to improvement.
> >
> > > The existing format was designed with developer convenience as a goal,
> > > though -- despite my complaints, and in spite of your objections.
> >
> > As Munro adduces nearby, it'd be a stretch to conclude that the current
> > format was designed with any Postgres-related goals in mind at all.
> > I think he's right that it's a variant of some Lisp-y dump format that's
> > probably far hoarier than even Berkeley Postgres.
> >
> > > If it didn't have to be easy (or even practical) for developers to
> > > directly work with the output format, then presumably the format used
> > > internally could be replaced with something lower level and faster. So
> > > it seems like the two goals (developer ergonomics and faster
> > > interchange format for users) might actually be complementary.
> >
> > I think the principal mistake in what we have now is that the storage
> > format is identical to the "developer friendly" text format (plus or
> > minus some whitespace). First we need to separate those. We could
> > have more than one equivalent text format perhaps, and I don't have
> > any strong objection to basing the text format (or one of them) on
> > JSON.
>
> +1 for considering storage format and text format separately.
>
> Let's consider what our criteria could be for the storage format.
>
> 1) Storage effectiveness (shorter is better) and
> serialization/deserialization effectiveness (faster is better). On
> this criterion, the custom binary format looks perfect.
> 2) Robustness in the case of corruption. It seems much easier to
> detect the data corruption and possibly make some partial manual
> recovery for textual format.
> 3) Standartness. It's better to use something known worldwide or at
> least used in other parts of PostgreSQL than something completely
> custom. From this perspective, JSON/JSONB is better than custom
> things.

(sorry, I've accidentally cut the last paragraph from the message)

It seems that there is no perfect fit for this multi-criteria
optimization, and we should pick what is more important. Any
thoughts?

------
Regards,
Alexander Korotkov
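
To make criteria 1 and 2 a bit more concrete, here is a minimal Python sketch of the trade-off. Everything in it is an assumption made for illustration: the `node` dictionary is a made-up stand-in for a stored node, the fixed four-field binary layout is invented, and neither encoding is PostgreSQL's actual node format or a proposal for one.

```python
import json
import struct

# Hypothetical stand-in for a stored node: a tag, an operator id, two arguments.
node = {"tag": 7, "opno": 96, "args": [1, 2]}


def to_text(n):
    """Textual encoding: compact JSON as ASCII bytes."""
    return json.dumps(n, separators=(",", ":")).encode("ascii")


def to_binary(n):
    """Invented binary encoding: four little-endian uint32 fields."""
    return struct.pack("<4I", n["tag"], n["opno"], *n["args"])


def from_binary(b):
    """Decode the invented binary layout back into a dictionary."""
    tag, opno, a0, a1 = struct.unpack("<4I", b)
    return {"tag": tag, "opno": opno, "args": [a0, a1]}


def corrupt(data, pos):
    """Simulate corruption by overwriting one byte with 0xFF."""
    return data[:pos] + b"\xff" + data[pos + 1:]


def text_corruption_detected(data):
    """Return True if the textual form no longer decodes or parses."""
    try:
        json.loads(data.decode("ascii"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return True
    return False


text = to_text(node)
binary = to_binary(node)

# Criterion 1: the binary form is shorter for this node.
print("text bytes:", len(text), "binary bytes:", len(binary))

# Criterion 2: a damaged byte in the text form is noticed by the parser ...
print("text corruption detected:", text_corruption_detected(corrupt(text, 0)))

# ... while the same kind of damage in this binary layout decodes silently
# into a different, wrong value for opno.
print("binary decoded after corruption:", from_binary(corrupt(binary, 4)))
```

This toy case is tilted toward the point in criterion 2: a 0xFF byte can never be valid ASCII, so the text form always notices it here, whereas a corruption that swaps one digit for another would pass unnoticed in JSON too. A binary format could also carry a checksum, at some cost in size.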