On Thu, Apr 01, 2021 at 01:59:15PM -0400, Bruce Momjian wrote: > On Thu, Apr 1, 2021 at 01:56:42PM -0400, Bruce Momjian wrote: > > You are using: > > > > /* ---------- > > * pgstat_get_my_queryid() - > > * > > * Return current backend's query identifier. > > */ > > uint64 > > pgstat_get_my_queryid(void) > > { > > if (!MyBEEntry) > > return 0; > > > > return MyBEEntry->st_queryid; > > } > > > > Looking at log_statement: > > > > /* Log immediately if dictated by log_statement */ > > if (check_log_statement(parsetree_list)) > > { > > ereport(LOG, > > (errmsg("statement: %s", query_string), > > errhidestmt(true), > > errdetail_execute(parsetree_list))); > > was_logged = true; > > } > > > > it uses the global variable query_string.
Unless I'm missing something query_string isn't a global variable, it's a parameter passed to exec_simple_query() from postgresMain(). It's then passed to the stats collector to be able to be displayed in pg_stat_activity through pgstat_report_activity() a bit like what I do for the queryid. There's a global variable debug_query_string, but it's only for debugging purpose. > > I wonder if the query hash > > should be a global variable too --- this would more clearly match how we > > handle top-level info like query_string. Digging into the stats system > > to get top-level info does seem odd. The main difference is that there's a single top level query_string, even if it contains multiple statements. But there would be multiple queryid calculated in that case and we don't want to change it during a top level multi-statements execution, so we can't use the same approach. Also, the query_string is directly logged from this code path, while the queryid is logged as a log_line_prefix, and almost all the code there also retrieve information from some shared structure. And since it also has to be available in pg_stat_activity, having a single source of truth looked like a better approach. > Also, if you go in that direction, make sure the hash it set in the same > places the query string is set, though I am unclear how extensions would > handle that. It should be transparent for application, it's extracting the first queryid seen for each top level statement and export it. The rest of the code still continue to see the queryid that corresponds to the really executed single statement.