On Fri, Feb 24, 2023 at 08:54:00PM +0000, Imseih (AWS), Sami wrote:
> I think the only thing to do here is to call this out in docs with a
> suggestion to increase pg_stat_statements.max to reduce the
> likelihood.  I also attached the suggested doc enhancement as well.
Improving the docs about that sounds like a good idea; it would be
less surprising for users if we documented some of these details.

> Any thoughts?

The risk of an entry being deallocated between the post-analyze hook
and the planner/utility hook (the two calls of pgss_store()) means
that this can never work correctly as long as the second code path
(planner, utility execution), which updates the entry with the call
information, does not know the shape of the normalized query.  And we
only want to pay the cost of normalization once, in the post-analyze
hook where we do the query jumbling.

Could things be done in a more stable way?  For example, imagine that
Query had an extra field, say "void *private_data", that extensions
could use to store custom data associated with a query ID.  Then we
could do something like this:
- In the post-analyze hook, check if an entry with the calculated
  query ID already exists.
-- If the entry exists, grab a copy of its existing query string,
   which may or may not be normalized, and save it into
   Query->private_data.
-- If the entry does not exist, normalize the query and store the
   result in Query->private_data, but do not yet create an entry in
   the hash table.
- In the planner/utility hook, fetch the normalized query from
  private_data and use it if an entry needs to be created in the hash
  table.  The entry may have been deallocated since the post-analyze
  hook, in which case it is re-created with the normalized copy saved
  in the first phase.
--
Michael