Re: Split index and table statistics into different types of stats

Drouvot, Bertrand Mon, 14 Nov 2022 02:38:25 -0800

Hi,

On 11/4/22 9:51 PM, Melanie Plageman wrote:

Hi Bertrand,


I'm glad you are working on this.

I had a few thoughts/ideas


Thanks for looking at it!

It seems better to have all of the counts in the various stats structs
not be prefixed with n_, i_, t_

typedef struct PgStat_StatDBEntry
{
...
     PgStat_Counter n_blocks_fetched;
     PgStat_Counter n_blocks_hit;
     PgStat_Counter n_tuples_returned;
     PgStat_Counter n_tuples_fetched;
...

I've attached a patch (0002) to change this in case you are interested
in making such a change


I did not renamed initially the fields/columns to ease the review.

Indeed, I think we should go further than removing the n_, i_ and t_prefixes so that the fields actually match the view's columns.

For example, currently pg_stat_all_indexes.idx_tup_read is linked to"tuples_returned", so that it would make sense to rename"tuples_returned" to "tuples_read" or even "tup_read" in the indexescounters.

That's why I had in mind to do this fields/columns renaming into aseparate patch (once this one is committed), so that the current onefocus only on splitting the stats: what do you think?

(I've attached all of my suggestions in patches
along with your original patch so that cfbot still passes).

On Wed, Nov 2, 2022 at 5:00 AM Drouvot, Bertrand
<[email protected]> wrote:

On 11/1/22 1:30 AM, Andres Freund wrote:

On 2022-10-31 14:14:15 +0100, Drouvot, Bertrand wrote:

@@ -240,9 +293,23 @@ pg_stat_get_blocks_fetched(PG_FUNCTION_ARGS)
      PG_RETURN_INT64(result);
   }

+Datum
+pg_stat_get_index_blocks_fetched(PG_FUNCTION_ARGS)
+{
+    Oid                     relid = PG_GETARG_OID(0);
+    int64           result;
+    PgStat_StatIndEntry *indentry;
+
+    if ((indentry = pgstat_fetch_stat_indentry(relid)) == NULL)
+            result = 0;
+    else
+            result = (int64) (indentry->blocks_fetched);
+
+    PG_RETURN_INT64(result);
+}


We have so many copies of this by now - perhaps we first should deduplicate
them somehow? Even if it's just a macro or such.


Yeah good point, a new macro has been defined for the "int64" return
case in V3 attached.


I looked for other opportunities to de-duplicate, but most of the functions
that were added that are identical except the return type and
PgStat_Kind are short enough that it doesn't make sense to make wrappers
or macros.


Yeah, agree.

I do think it makes sense to reorder the members of the two structs
PgStat_IndexCounts and PgStat_TableCounts so that they have a common
header. I've done that in the attached patch (0003).

That's a good idea, thanks! But for that we would need to have the samefields names, means:


- Remove the prefixes (as you've done in 0002)

- And probably reduce the number of fields in the newPgStat_RelationCounts that 003 is introducing (for exampletuples_returned should be excluded if we're going to rename it later onto "tuples_read" for the indexes to map thepg_stat_all_indexes.idx_tup_read column).

ISTM that we should do it in the "renaming" effort, after this patch iscommitted.

In the flush functions, I was also thinking it might be nice to use the
same pattern as is used in [1] and [2] to add the counts together. It
makes the lines a bit easier to read, IMO. If we remove the prefixes
from the count fields, this works for many of the fields. I've attached
a patch (0004) that does something like this, in case you wanted to go
in this direction.

I like it too but same remarks as previously. I think it should be partof the "renaming" effort.


Since you have made new parallel functions for indexes and tables for
many of the functions in pgstat_relation.c, perhaps it makes sense to
split it into pgstat_table.c and pgstat_index.c?


Good point, thanks, I'll work on it.


One question I had about the original code (not related to your
refactor) is why the pending stats aren't memset in the flush functions
after aggregating them into the shared stats.

Not sure I'm getting your point, do you think something is not rightwith the flush functions?


Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Re: Split index and table statistics into different types of stats

Reply via email to