Hi,

first, thanks a lot for your input!

Am 19.05.20 16:16 schrieb(en) Stephen Frost:
That's only going to work up to a certain size for that text and blob value 
too, of course..

These three columns (stripped-down code, there are more in the table) are a 
“lazy” import from various python scripts.  The text and bytea (actually a 
sha256 hash) fields always have a fixed, short size, but the json varies 
between a single item and a quite large dict, so extracting the data into 
columns seems to be cumbersome.

This is looking like it might be some kind of KV store which is generally 
discouraged.

Well, yes, somehow…

"jsonb" uses an internal binary representation that reorders the attributes in 
a deterministic fashin, so yes, that is guaranteed.

Sure- but the md5() isn't going to run on the jsonb data directly, it'll run on 
the text representation that's returned, and I'm not at all convinced that the 
project has agreed to make that text form always be canonical and identical 
forever, including across major version upgrades..

Ok, I see.

Further, there are some other cases to consider- such as: do you think that 
0.00001230 is different from 0.0000123?  Since we'll store numbers in jsonb as 
numeric, we preserve trailing franctional zeroes.

Good point; currently no problem for me (strings and ints only in the json 
dicts), but might be a serious issue in the future.

Overall, if what you need is uniqueness here, I'd strongly reconsider the 
choice to use jsonb to store this data and instead see if you can break the 
data out into proper columns with a proper unique constraint across them (or 
across some hash of the combination of them that's done in a way that is clear 
and unambiguous).

Got your point, I hope – probably the clean solution would be a defined, sorted 
serialisation plus hashing of the json (and the text and bytea columns) either 
in the python producers, or in the (already existing) db function called by the 
python scripts for inserting data…

Thanks again,
Albrecht.

Attachment: pgpXHPj3X2xNI.pgp
Description: PGP signature

Reply via email to