On Mon, Dec 7, 2015 at 2:41 AM, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote: > On 2015/12/03 19:05, Kyotaro HORIGUCHI wrote: >> At Thu, 3 Dec 2015 16:24:32 +0900, Amit Langote >> <langote_amit...@lab.ntt.co.jp> wrote >>> By the way, there are some non-st_progress_* fields that I was talking >>> about in my previous message. For example, st_activity_flag (which I have >>> suggested to rename to st_command instead). It needs to be set once at the >>> beginning of the command processing and need not be touched again. I think >>> it may be a better idea to do the same for table name or OID. It also >>> won't change over the duration of the command execution. So, we could set >>> it once at the beginning where that becomes known. Currently in the patch, >>> it's reported in st_progress_message[0] by every pgstat_report_progress() >>> invocation. So, the table name will be strcpy()'d to shared memory for >>> every scanned block that's reported. >> >> If we don't have dedicate reporting functions for each phase >> (that is, static data phase and progress phase), the one function >> pgstat_report_progress does that by having some instruction from >> the caller and it would be a bitfield. >> >> Reading the flags for making decision whether every field to copy >> or not and branching by that seems too much for int32 (or maybe >> 64?) fields. Alhtough it would be in effective when we have >> progress_message fields, I don't think it is a good idea without >> having progress_message. >> >>> pgstat_report_progress(uint32 *param1, uint16 param1_bitmap, >>> char param2[][..], uint16 param2_bitmap) >>> { >>> ... >>> for(i = 0; i < 16 ; i++) >>> { >>> if (param1_bitmap & (1 << i)) >>> beentry->st_progress_param[i] = param1[i]; >>> if (param2_bitmap & (1 << i)) >>> strcpy(beentry->..., param2[i]); >>> } >> >> Thoughts? > > Hm, I guess progress messages would not change that much over the course > of a long-running command. How about we pass only the array(s) that we > change and NULL for others: > > With the following prototype for pgstat_report_progress: > > void pgstat_report_progress(uint32 *uint32_param, int num_uint32_param, > bool *message_param[], int num_message_param); > > If we just wanted to change, say scanned_heap_pages, then we report that > using: > > uint32_param[1] = scanned_heap_pages; > pgstat_report_progress(uint32_param, 3, NULL, 0); > > Also, pgstat_report_progress() should check each of its parameters for > NULL before iterating over to copy. So in most reports over the course of > a command, the message_param would be NULL and hence not copied.
It's going to be *really* important that this facility provides a lightweight way of updating progress, so I think this whole API is badly designed. VACUUM, for example, is going to want a way to update the individual counter for the number of pages it's scanned after every page. It should not have to pass all of the other information that is part of a complete report. It should just be able to say pgstat_report_progress_update_counter(1, pages_scanned) or something of this sort. Don't marshal all of the data and then make pgstat_report_progress figure out what's changed. Provide a series of narrow APIs where the caller tells you exactly what they want to update, and you update only that. It's fine to have a pgstat_report_progress() function to update everything at once, for the use at the beginning of a command, but the incremental updates within the command should do something lighter-weight. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers