Re: PATCH: Attempt to make dbsize a bit more consistent

gkokolatos Wed, 24 Feb 2021 06:38:04 -0800



‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, February 24, 2021 3:34 PM, <gkokola...@pm.me> wrote:

>
>
> ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> On Friday, February 19, 2021 4:57 PM, gkokola...@pm.me wrote:
>
> > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > On Monday, February 1, 2021 1:18 PM, Masahiko Sawada sawada.m...@gmail.com 
> > wrote:
> >
> > > On Thu, Nov 12, 2020 at 2:54 AM Soumyadeep Chakraborty
> > > soumyadeep2...@gmail.com wrote:
> > >
> > > > Hey Georgios,
> > > > On Tue, Nov 10, 2020 at 6:20 AM gkokola...@pm.me wrote:
> > > >
> > > > > ‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
> > > > > On Monday, November 9, 2020 7:50 PM, Soumyadeep Chakraborty 
> > > > > soumyadeep2...@gmail.com wrote:
> > > > >
> > > > > > Hey Georgios,
> > > > > > Thanks for looking for more avenues to invoke tableAM APIS! Please 
> > > > > > find
> > > > > > my review below:
> > > > >
> > > > > A great review Soumyadeep, it is much appreciated.
> > > > > Please remember to add yourself as a reviewer in the commitfest
> > > > > [https://commitfest.postgresql.org/30/2701/]
> > > >
> > > > Ah yes. Sorry, I forgot that!
> > > >
> > > > > > On Tue, Oct 13, 2020 at 6:28 AM gkokola...@pm.me wrote:
> > > > > >
> > > > > > 1.
> > > > > >
> > > > > > > /*
> > > > > > >
> > > > > > > -   -   heap size, including FSM and VM
> > > > > > > -   -   table size, including FSM and VM
> > > > > > >         */
> > > > > > >
> > > > > >
> > > > > > We should not mention FSM and VM in dbsize.c at all as these are
> > > > > > heap AM specific. We can say:
> > > > > > table size, excluding TOAST relation
> > > > >
> > > > > Yeah, I was thinking that the notion that FSM and VM are still taken
> > > > > into account should be stated. We are iterating over ForkNumber
> > > > > after all.
> > > > > How about phrasing it as:
> > > > >
> > > > > -   table size, including all implemented forks from the AM (e.g. 
> > > > > FSM, VM)
> > > > > -   excluding TOAST relations
> > > > >
> > > > > Thoughts?
> > > >
> > > > Yes, I was thinking along the same lines. The concept of a "fork" forced
> > > > should not be forced down into the tableAM. But that is a discussion for
> > > > another day. We can probably say:
> > > >
> > > > -   table size, including all implemented forks from the AM (e.g. FSM, 
> > > > VM
> > > > -   for the heap AM) excluding TOAST relations
> > > >
> > > > > > 2.
> > > > > >
> > > > > > > /*
> > > > > > >
> > > > > > > -   Size of toast relation
> > > > > > >     */
> > > > > > >     if (OidIsValid(rel->rd_rel->reltoastrelid))
> > > > > > >
> > > > > > > -   size += 
> > > > > > > calculate_toast_table_size(rel->rd_rel->reltoastrelid);
> > > > > > >
> > > > > > > -   {
> > > > > > >
> > > > > > > -   Relation toastRel;
> > > > > > >
> > > > > > > -
> > > > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, 
> > > > > > > AccessShareLock);
> > > > > > >
> > > > > >
> > > > > > We can replace the OidIsValid check with 
> > > > > > relation_needs_toast_table()
> > > > > > and then have the OidIsValid() check in an Assert. Perhaps, that 
> > > > > > will
> > > > > > make things more readable.
> > > > >
> > > > > Please, allow me to kindly disagree.
> > > > > Relation is already open at this stage. Even create_toast_table(), the
> > > > > internal workhorse for creating toast relations, does check 
> > > > > reltoastrelid
> > > > > to test if the relation is already toasted.
> > > > > Furthermore, we do call:
> > > > >
> > > > > -   toastRel = relation_open(rel->rd_rel->reltoastrelid, 
> > > > > AccessShareLock);
> > > > >
> > > > > and in order to avoid elog() errors underneath us, we ought to have
> > > > > verified the validity of reltoastrelid.
> > > > > In short, I think that the code in the proposal is not too unreadable
> > > > > nor that it breaks the coding patterns throughout the codebase.
> > > > > Am I too wrong?
> > > >
> > > > No not at all. The code in the proposal is indeed adhering to the
> > > > codebase. What I was going for here was to increase the usage of
> > > > relation_needs_toast_table(). What I meant was:
> > > > if (table_relation_needs_toast_table(rel))
> > > > {
> > > > if (!OidIsValid(rel->rd_rel->reltoastrelid))
> > > > {
> > > > elog(ERROR, <errmsg that toast table wasn't found>);
> > > > }
> > > > //size calculation here..
> > > > }
> > > > We want to error out if the toast table can't be found and the relation
> > > > is expected to have one, which the existing code doesn't handle.
> > > >
> > > > > > 3.
> > > > > >
> > > > > > > -   if (rel->rd_rel->relkind == RELKIND_RELATION ||
> > > > > > > -   rel->rd_rel->relkind == RELKIND_TOASTVALUE ||
> > > > > > > -   rel->rd_rel->relkind == RELKIND_MATVIEW)
> > > > > > > -   size = calculate_table_size(rel);
> > > > > > > -   else
> > > > > > > -   {
> > > > > > > -   relation_close(rel, AccessShareLock);
> > > > > > > -   PG_RETURN_NULL();
> > > > > > > -   }
> > > > > >
> > > > > > This leads to behavioral changes:
> > > > > > I am talking about the case where one calls pg_table_size() on an 
> > > > > > index.
> > > > > > W/o your patch, it returns the size of the index. W/ your patch it
> > > > > > returns NULL. Judging by the documentation, this function should not
> > > > > > ideally apply to indexes but it does! I have a sinking feeling that 
> > > > > > lots
> > > > > > of users use it for this purpose, as there is no function to 
> > > > > > calculate
> > > > > > the size of a single specific index (except pg_relation_size()).
> > > > > > The same argument I have made above applies to sequences. Users may 
> > > > > > have
> > > > > > trial-and-errored their way into finding that pg_table_size() can 
> > > > > > tell
> > > > > > them the size of a specific index/sequence! I don't know how 
> > > > > > widespread
> > > > > > the use is in the user community, so IMO maybe we should be 
> > > > > > conservative
> > > > > > and not introduce this change? Alternatively, we could call out that
> > > > > > pg_table_size() is only for tables by throwing an error if anything
> > > > > > other than a table is passed in.
> > > > > > If we decide to preserve the existing behavior of the 
> > > > > > pg_table_size():
> > > > > > It seems that for things not backed by the tableAM (indexes and
> > > > > > sequences), they should still go through calculate_relation_size().
> > > > > > We can call table_relation_size() based on if relkind is
> > > > > > RELKIND_RELATION, RELKIND_TOASTVALUE or RELKIND_MATVIEW. Perhaps it
> > > > > > might be worthwhile making a new macro RELKIND_HAS_TABLE_STORAGE to
> > > > > > capture these three cases (See RELKIND_HAS_STORAGE). This also 
> > > > > > ensures
> > > > > > that we return 0 for things that don't qualify as 
> > > > > > RELKIND_HAS_STORAGE,
> > > > > > such as a partitioned table (Currently w/ the patch applied, we 
> > > > > > return
> > > > > > NULL for those cases, which is another behavior change)
> > > > >
> > > > > Excellent point. This is the discussion I was longing to have.
> > > > > I stand by the decision coded in the patch, that pg_table_size() 
> > > > > should
> > > > > return NULL for other kinds of relations, such as indexes, sequences
> > > > > etc.
> > > > > It is a conscious decision based on the following:
> > > > >
> > > > > -   Supported by the documentation, pg_table_size() applies to tables 
> > > > > only.
> > > > >     For other uses the higher-level functions 
> > > > > pg_total_relation_size() or
> > > > >     pg_relation_size() should be used.
> > > > >
> > > > > -   Commit fa352d662e taught pg_relation_size() and friends to return 
> > > > > NULL if the object doesn't exist. This makes perfect sense for the
> > > > >     scenarios described in the commit:
> > > > >     That avoids errors when the functions are used in queries like
> > > > >     "SELECT pg_relation_size(oid) FROM pg_class",
> > > > >     and a table is dropped concurrently.
> > > > >
> > > > >
> > > > > IMHO: It is more consistent to return NULL when the relation does 
> > > > > exist
> > > > > OR it is not a table kind.
> > > > >
> > > > > -   Returning 0 for things that do not have storage, is nonsensical 
> > > > > because
> > > > >     it implies that it can be NON zero at some point. Things that do 
> > > > > not
> > > > >     have storage have an unknown size.
> > > > >
> > > >
> > > > Fair. We will have to document the behavior change.
> > > >
> > > > > As far as for the argument that users might have trialed and errored
> > > > > their way into undocumented behaviour, I do not think it is strong
> > > > > enough to stop us from implementing the documented behaviour.
> > > >
> > > > Fair. I would strongly vote for having two additional functions
> > > > (pg_index_size() and pg_sequence_size()) to strongly signal our intent
> > > > of banning that kind of use of pg_table_size(). I think it would help
> > > > users a lot. It is not easy to find what function to call when you want
> > > > the size of a single index/sequence.
> > > >
> > > > > > 4.
> > > > > >
> > > > > > > @@ -3776,10 +3776,24 @@ listTables(const char *tabtypes, const 
> > > > > > > char *pattern, bool verbose, bool showSys
> > > > > > > gettext_noop("Access Method"));
> > > > > > >
> > > > > > >          /*
> > > > > > >
> > > > > > >
> > > > > > > -                 * As of PostgreSQL 14, do not use 
> > > > > > > pg_table_size() for indexes and
> > > > > > >
> > > > > > >
> > > > > > > -                 * sequences as it does not behave sanely for 
> > > > > > > those.
> > > > > > >
> > > > > > >
> > > > > > > -                 *
> > > > > > >                   * As of PostgreSQL 9.0, use pg_table_size() to 
> > > > > > > show a more accurate
> > > > > > >                   * size of a table, including FSM, VM and TOAST 
> > > > > > > tables.
> > > > > > >                   */
> > > > > > >
> > > > > > >
> > > > > > > -                if (pset.sversion >= 90000)
> > > > > > >
> > > > > > >
> > > > > > > -                if (pset.sversion >= 140000)
> > > > > > >
> > > > > > >
> > > > > > > -                    appendPQExpBuffer(&buf,
> > > > > > >
> > > > > > >
> > > > > > > -                                      ",\\\\\\\\\\\\\\\\n CASE"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " WHEN c.relkind in 
> > > > > > > ("CppAsString2(RELKIND_INDEX)", "
> > > > > > >
> > > > > > >
> > > > > > > -                                                            
> > > > > > > CppAsString2(RELKIND_PARTITIONED_INDEX)", "
> > > > > > >
> > > > > > >
> > > > > > > -                                                            
> > > > > > > CppAsString2(RELKIND_SEQUENCE)") THEN"
> > > > > > >
> > > > > > >
> > > > > > > -                                      "     
> > > > > > > pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid))"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " ELSE"
> > > > > > >
> > > > > > >
> > > > > > > -                                      "     
> > > > > > > pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid))"
> > > > > > >
> > > > > > >
> > > > > > > -                                      " END as 
> > > > > > > \\\\\\\\\\\\\\\\"%s\\\\\\\\\\\\\\\\"",
> > > > > > >
> > > > > > >
> > > > > > > -                                      gettext_noop("Size"));
> > > > > > >
> > > > > > >
> > > > > > > -                else if (pset.sversion >= 90000)
> > > > > > >                      appendPQExpBuffer(&buf,
> > > > > > >                                        ",\\\\\\\\\\\\\\\\n  
> > > > > > > pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as 
> > > > > > > \\\\\\\\\\\\\\\\"%s\\\\\\\\\\\\\\\\"",
> > > > > > >                                        gettext_noop("Size"));
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > Following on from point 3, if we decide to preserve the existing 
> > > > > > behavior,
> > > > > > we wouldn't need this change, as it would be internalized within
> > > > > > pg_table_size().
> > > > >
> > > > > We really should not decide to preserve the existing behaviour.
> > > > > I will reiterate my point: Returning 0 for things that do not have
> > > > > storage, implies that it can be NON zero at some point. We should not
> > > > > treat pg_table_size() as an alias for pg_relation_size().
> > > >
> > > > +1
> > > >
> > > > > > 4.
> > > > > >
> > > > > > > > -   return size;
> > > > > > > >
> > > > > > > > -   Assert(size < PG_INT64_MAX);
> > > > > > > >
> > > > > > > > -
> > > > > > > > -   return (int64)size;
> > > > > > > >     I assume that this change, and the other ones like that, 
> > > > > > > > aim to handle int64
> > > > > > > >     overflow? Using the extra legroom of uint64 can still lead 
> > > > > > > > to an overflow,
> > > > > > > >     however theoretical it may be. Wouldn't it be better to 
> > > > > > > > check for overflow
> > > > > > > >     before adding to size rather than after the fact? Something 
> > > > > > > > along the lines of
> > > > > > > >     checking for headroom left:
> > > > > > > >     rel_size = table_relation_size(..);
> > > > > > > >     if (rel_size > (PG_INT64_MAX - total_size))
> > > > > > > >     < error codepath >
> > > > > > > >
> > > > > > > >
> > > > > > > > total_size += rel_size;
> > > > > > >
> > > > > > > Actually not, the intention is not to handle overflow. The 
> > > > > > > table_relation_size() returns a uint64 and the calling function 
> > > > > > > returns int64.
> > > > > > > The Assert() has been placed in order to denote that it is 
> > > > > > > acknowledged that the two functions return different types. I was 
> > > > > > > of the opinion that a run time check will not be needed as even 
> > > > > > > the > smaller type can cover more than 9200 PetaByte tables.
> > > > > > > If we were to change anything, then I would prefer to change the 
> > > > > > > signature of the pg_*_size family of functions to return uint64 
> > > > > > > instead.
> > > > > >
> > > > > > Changing the signature would be the ideal change for all of this 
> > > > > > here.
> > > > > > But Postgres does not have support for an unsigned 64bit integer 
> > > > > > (bigint
> > > > > > is signed). One would need to turn to extensions such as [1]. Thus, 
> > > > > > +1
> > > > > > to what Daniel said above.
> > > > >
> > > > > Apologies, I do not follow. Are you suggesting that we should
> > > > > introduce overflow tests?
> > > >
> > > > Yes, to introduce the same overflow test that Daniel suggested above as
> > > > returning a uint64 in PG does not really return a uint64 AFAIU (since
> > > > the pg_**size() functions all return bigint which is signed and there
> > > > is no uint64 user-facing type).
> > >
> > > Status update for a commitfest entry.
> > > This patch gets review comments and there was some discussion. It
> > > seems we're waiting for the patch update. So I've moved this patch to
> > > the next commitfest and set it to "Waiting on Author".
> >
> > Thank you for your patience.
> > Please find v3 attached. I will reset the status to 'Ready for review'.
>
> Version 3 of the patch broke the cfbot. Please find v4 attached.
>
> This patch requires a catalog version bump.

Now with attachment. Apologies for the chatter.

>
> Cheers,
> //Georgios -- https://www.vmware.com
>
> > > Regards,
> > > Masahiko Sawada
> > > EDB: https://www.enterprisedb.com/
v4-0001-Attempt-to-make-dbsize-a-bit-more-consistent.patch
Description: Binary data
Re: PATCH: Attempt to make dbsize a bit more consistent

Reply via email to