> That’s certainly a fair point and my initial reaction (which could
> certainly be wrong) is that it’s unlikely to be an issue- but also, if you
> feel you could make it work with an array and passing all the attribute
> info in with one call, which I suspect would be possible but just a bit
> more complex to build, then sure, go for it. If it ends up being overly
> unwieldy then perhaps the per-attribute call would be better and we could
> perhaps acquire the lock before the function calls..? Doing a check to see
> if we have already locked it would be cheaper than trying to acquire a new
> lock, I’m fairly sure.
Well, the do_analyze() code was already ok with acquiring the lock once for
non-inherited stats and again for inherited stats, so the locks were already
not the end of the world. However, that's at most 2x the locking required,
whereas a per-attribute call would be natts * x, quite a bit more. Having the
procedures check for a pre-existing lock seems like a good compromise (rough
sketch of what I mean at the end of this mail).

> Also per our prior discussion- this makes sense to include in post-data
> section, imv, and also because then we have the indexes we may wish to load
> stats for, but further that also means it’ll be in the parallelizable part
> of the process, making me a bit less concerned overall about the individual
> timing.

The ability to parallelize is pretty persuasive. But is that per-statement
parallelization, or do we get transaction blocks? i.e. if we ended up
importing stats like this:

BEGIN;
LOCK TABLE schema.relation IN SHARE UPDATE EXCLUSIVE MODE;
LOCK TABLE pg_catalog.pg_statistic IN ROW EXCLUSIVE MODE;
SELECT pg_import_rel_stats('schema.relation', ntuples, npages);
SELECT pg_import_pg_statistic('schema.relation', 'id', ...);
SELECT pg_import_pg_statistic('schema.relation', 'name', ...);
SELECT pg_import_pg_statistic('schema.relation', 'description', ...);
...
COMMIT;
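
For the pre-existing lock check, roughly what I have in mind is something
like this (just a sketch; lock_rel_for_stats_import() is a made-up name,
while CheckRelationLockedByMe() and LockRelationOid() are the existing lmgr
functions):

#include "postgres.h"
#include "storage/lmgr.h"
#include "utils/rel.h"

/*
 * Take ShareUpdateExclusiveLock on the target relation only if this backend
 * doesn't already hold it (or something stronger), e.g. because the restore
 * script issued an explicit LOCK TABLE beforehand.  The check only consults
 * the backend-local lock table, so it's much cheaper than going to the
 * shared lock manager for a fresh acquisition.
 */
static void
lock_rel_for_stats_import(Relation rel)
{
    if (!CheckRelationLockedByMe(rel, ShareUpdateExclusiveLock, true))
        LockRelationOid(RelationGetRelid(rel), ShareUpdateExclusiveLock);
}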