RE: Failed transaction statistics to measure the logical replication progress

2021-12-04 Thread osumi.takami...@fujitsu.com
On Friday, December 3, 2021 3:12 PM vignesh C  wrote:
> Thanks for the updated patch.
> Currently we are storing the commit count, error_count and abort_count for
> each table of the table sync operation. If we have thousands of tables, we
> will be storing the information for each of the tables.
> Shouldn't we be storing the consolidated information in this case?
> diff --git a/src/backend/replication/logical/tablesync.c
> b/src/backend/replication/logical/tablesync.c
> index f07983a..02e9486 100644
> --- a/src/backend/replication/logical/tablesync.c
> +++ b/src/backend/replication/logical/tablesync.c
> @@ -1149,6 +1149,11 @@ copy_table_done:
> MyLogicalRepWorker->relstate_lsn = *origin_startpos;
> SpinLockRelease(&MyLogicalRepWorker->relmutex);
> 
> +   /* Report the success of table sync. */
> +   pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
> +                                    MyLogicalRepWorker->relid,
> +                                    0 /* no logical message type */ );
Okay.

I consolidated all the table sync worker stats into those of the apply worker.
In line with this change, I also fixed the TAP tests
to cover the stats updates done by table sync workers.

Also, during my self-review, I noticed that
I should call pgstat_report_subworker_xact_end() before
process_syncing_tables(), because the latter can lead to process
exit, which would result in missing one increment of the stats columns.
I noted this point in a comment as well.
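
A minimal sketch of the intended ordering, roughly as it might look in
apply_handle_commit() (names follow the patch and the existing worker
code, but the exact call site and arguments here are illustrative):

    /*
     * Report the transaction end BEFORE process_syncing_tables(), since
     * the latter can call proc_exit() for a table sync worker that has
     * caught up, and the stats increment would then be lost.
     */
    pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
                                     MyLogicalRepWorker->relid,
                                     LOGICAL_REP_MSG_COMMIT);

    process_syncing_tables(commit_data.end_lsn);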

Best Regards,
Takamichi Osumi



v15-0001-Extend-pg_stat_subscription_workers-to-include-g.patch


Re: A test for replay of regression tests

2021-12-04 Thread Andrew Dunstan


On 12/3/21 23:21, Thomas Munro wrote:
>
> Next problem:  The below is clearly not the right way to find the
> pg_regress binary and bindir, and doesn't work on Windows or VPATH.
> Any suggestions for how to do this?  I feel like something like
> $node->installed_command() or similar logic is needed...
>
> # Run the regression tests against the primary.
> # XXX How should we find the pg_regress binary and bindir?
> system_or_bail("../regress/pg_regress",
>"--bindir=../../bin/psql",
>"--port=" . $node_primary->port,
>"--schedule=../regress/parallel_schedule",
>"--dlpath=../regress",
>"--inputdir=../regress");
>

TAP tests are passed a path to pg_regress as $ENV{PG_REGRESS}; you
should be using that. On non-MSVC, the path to a non-installed psql in
this case is "$TESTDIR/../../bin/psql" - this should work for VPATH
builds as well as non-VPATH ones. On MSVC it's a bit harder - it's
"$top_builddir/$releasetype/psql", but we don't expose that. Perhaps we
should; cf. commit f4ce6c4d3a.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: The "char" type versus non-ASCII characters

2021-12-04 Thread Tom Lane
Chapman Flack  writes:
> On 12/03/21 14:12, Tom Lane wrote:
>> This reminds me of something I've been intending to bring up, which
>> is that the "char" type is not very encoding-safe.  charout() for
>> example just regurgitates the single byte as-is.

> I wonder if maybe what to do about that lies downstream of some other
> thought about encoding-related type properties.

As you mentioned upthread, it's probably wrong to think of "char" as
character data at all.  The catalogs use it as a poor man's enum type,
and it's just for convenience that we assign readable ASCII codes for
the enum values of a given column.  The only reason to think of it as
encoding-dependent would be if you have ambitions to store a non-ASCII
character in a "char".  But I think that's something we want to
strongly discourage, even if we don't prohibit it altogether.  The
whole point of the type is to be one byte, so only in legacy encodings
can it possibly represent a non-ASCII character.

So I'm visualizing it as a uint8 that we happen to like to store
ASCII codes in, and that's what prompts the thought of using a
numeric representation for non-ASCII values.  I think you're just
in for pain if you want to consider such values as character data
rather than numbers.
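
For illustration, a hedged sketch of what charout() could look like
under that view: ASCII bytes pass through as-is, while high-bit-set
bytes are rendered numerically (backslash-octal is just one possible
notation, and this is not the actual implementation):

    /* assumes postgres.h and fmgr.h */
    Datum
    charout(PG_FUNCTION_ARGS)
    {
        char        ch = PG_GETARG_CHAR(0);
        char       *result = palloc(5);     /* "\ooo" plus terminator */

        if ((unsigned char) ch < 0x80)
        {
            result[0] = ch;
            result[1] = '\0';
        }
        else
            snprintf(result, 5, "\\%03o", (unsigned char) ch);

        PG_RETURN_CSTRING(result);
    }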

> ... "char" is an existing
> example, because its current behavior is exactly as if it declared
> "I am one byte of SQL_ASCII regardless of server setting".

But it's not quite that.  If we treated it as SQL_ASCII, we'd refuse
to convert it to some other encoding unless the value passes encoding
verification, which is exactly what charout() is not doing.

> Indeed, cstring behaves completely as if it is a type with the server
> encoding.

Yup, cstring is definitely presumed to be in the server's encoding.

> So, is the current "char" situation so urgent that it demands some
> one-off solution be chosen for it, or could it be neglected with minimal
> risk until someday we've defined what "this datatype has encoding X that's
> different from the server encoding" means, and that takes care of it?

I'm not willing to leave it broken in the rather faint hope that
someday there will be a more general solution, especially since
I don't buy the premise that "char" ought to participate in any
such solution.

regards, tom lane




Re: The "char" type versus non-ASCII characters

2021-12-04 Thread Chapman Flack
On 12/04/21 11:34, Tom Lane wrote:
> Chapman Flack  writes:
>> "I am one byte of SQL_ASCII regardless of server setting".
> 
> But it's not quite that.  If we treated it as SQL_ASCII, we'd refuse
> to convert it to some other encoding unless the value passes encoding
> verification, which is exactly what charout() is not doing.

Ah, good point. I remembered noticing pg_do_encoding_conversion returning
the src pointer unchanged when SQL_ASCII is involved, but I see that it does
verify the dest_encoding when SQL_ASCII is the source.

> encoding-dependent would be if you have ambitions to store a non-ASCII
> character in a "char".  But I think that's something we want to
> strongly discourage, even if we don't prohibit it altogether. ...
> So I'm visualizing it as a uint8 that we happen to like to store
> ASCII codes in, and that's what prompts the thought of using a
> numeric representation for non-ASCII values.

I'm in substantial agreement, though I also see that it is nearly always
set from a quoted literal, tested against a quoted literal, and calls
itself "char", so for consistency's sake I am thinking it might be
better not to invent an all-new convention for its text representation,
but to adopt something that's already familiar, like bytea's escape
format. It would then always look and act like a one-octet bytea. Maybe
have charin accept either the bytea-escape or bytea-hex form too. (Or,
never mind; when restricted to one octet, bytea-hex and the \xhh
bytea-escape form are indistinguishable anyway.)
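
A hedged sketch of a charin() that accepts the one-octet bytea-hex form
alongside a plain byte (purely illustrative, not proposal-quality code):

    /* assumes postgres.h, fmgr.h, and <ctype.h> */
    Datum
    charin(PG_FUNCTION_ARGS)
    {
        char   *s = PG_GETARG_CSTRING(0);

        /* accept "\xhh", as in bytea's hex format */
        if (s[0] == '\\' && (s[1] == 'x' || s[1] == 'X') &&
            isxdigit((unsigned char) s[2]) &&
            isxdigit((unsigned char) s[3]) && s[4] == '\0')
            PG_RETURN_CHAR((char) strtol(s + 2, NULL, 16));

        PG_RETURN_CHAR(s[0]);   /* otherwise the first byte, as today */
    }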

Then for free we get the property that if somebody today uses 'ű' as
an enum value, it might start appearing as '\xfb' now in dumps, etc.,
but their existing CASE WHEN thing = 'ű' code doesn't stop working
(as long as they haven't done something silly like change the encoding),
and they have the flexibility to update it to WHEN thing = '\xfb' as
time permits if they choose. If they don't, they accept the risk that
by switching to another encoding in the future, they may either see
their existing tests stop matching, or their existing literals fail
to parse, but there won't be invalidly-encoded strings created.

> Yup, cstring is definitely presumed to be in the server's encoding.

Without proposing to change it, I observe that by defining both cstring
and unknown in this way (with the latter being expressly the type of
any literal from the client destined for a type we don't know yet), we're
a bit painted into the corner as far as supporting types like NCHAR.
(I suppose clients could be banned from sending such values as literals,
and required to use extended form and bind them with a binary message.)
It's analogous to the way format-0 and format-1 both act as filters that
no encoding-dependent data can squish through without surviving both the
client and the server encoding, even if it is of a type that's defined
to be independent of either.

Regards,
-Chap




Re: SPI TupTable memory context

2021-12-04 Thread Chapman Flack
On 12/03/21 20:22, Tom Lane wrote:
> Seems kinda dangerous to me ...
> 
>> AtEOSubXact_SPI can free tuptables retail, but only in the rollback case.
> 
> ... precisely because of that.  If you wanted to take control of
> the TupTable, you'd really need to unhook it from the SPI context's
> tuptables list, and that *really* seems like undue familiarity
> with the implementation.

Fair enough. I didn't have an immediate use in mind, but I had been
reading through DestReceiver code and noticed it worked that way. It
looked as if an SPI_keeptuptable() could have been implemented in
probably no more than the ~25 lines of SPI_keepplan(), and I wasn't sure
whether that was because the idea had been considered and deemed a Bad
Thing, or because it simply hadn't come up.
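
For concreteness, a rough sketch of what such an SPI_keeptuptable()
might look like, by analogy with SPI_keepplan() -- purely hypothetical
(it would have to live in spi.c, since _SPI_current is static there),
and it shows exactly the "undue familiarity" concern above:

    int
    SPI_keeptuptable(SPITupleTable *tuptable)
    {
        if (tuptable == NULL)
            return SPI_ERROR_ARGUMENT;

        /* unhook from the current SPI context's tuptables list */
        slist_delete(&_SPI_current->tuptables, &tuptable->next);

        /* reparent its context so it survives SPI_finish() */
        MemoryContextSetParent(tuptable->tuptabcxt, CacheMemoryContext);

        return 0;
    }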

The equivalent with a custom DestReceiver would certainly work, but with
a lot more ceremony.

So that was why I asked. If the response had been more like "hmm, no clear
reason a patch to do that would be bad", and if such a patch got accepted
for PG release n, that could also implicitly assuage worries about undue
familiarity for implementing the compatible behavior when building on < n.

Regards,
-Chap




Re: RecoveryInProgress() has critical side effects

2021-12-04 Thread Michael Paquier
On Tue, Nov 16, 2021 at 09:41:58AM -0500, Robert Haas wrote:
> Maybe I'm not understanding you properly here, but it seems like you
> might be forgetting that this is a local variable and thus every
> backend is going to have something different. In the startup process,
> it will be initialized by StartupXLOG(); in other processes, it's
> currently initialized by RecoveryInProgress(), but with this patch it
> wouldn't be. Either way, it's then updated by future calls to
> XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect
> the new value that gets set the next time XLogInsertRecord(), but they
> don't directly affect doPageWrites.

My main worry here is that this slightly changes the definition of
doPageWrites across stable branches at the end of recovery, as there
could be records generated there.  Note that GetFullPageWriteInfo()
gets called in XLogInsert(), while Insert->fullPageWrites gets updated
before CleanupAfterArchiveRecovery(), and that may influence
the value of doPageWrites in the startup process.
--
Michael




Re: Remove extra Logging_collector check before calling SysLogger_Start()

2021-12-04 Thread Michael Paquier
On Fri, Dec 03, 2021 at 12:45:34PM -0500, Tom Lane wrote:
> I think it's fine as-is; good belt-and-suspenders-too programming.
> It's not like the extra check inside SysLogger_Start() is costly.

+1.
--
Michael




Re: Do sys logger and stats collector need wait events WAIT_EVENT_SYSLOGGER_MAIN/_PGSTAT_MAIN?

2021-12-04 Thread Michael Paquier
On Fri, Dec 03, 2021 at 12:42:47PM -0500, Tom Lane wrote:
> On the whole I'd be inclined to leave it alone.  Even if the reporting
> mechanisms aren't able to report these events today, maybe they'll
> be able to do so in future.  The context of the stats collector
> process, in particular, seems likely to change.

These come from 6f3bd98, where I am pretty sure that I (or Robert)
defined those values so that users would not have to guess what to pass
down to WaitLatch(), on the assumption that this may become useful in
the future.  So I'd rather leave them be.
--
Michael




Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

2021-12-04 Thread Peter Geoghegan
I wonder why we're counting the number of dead tuples (or LP_DEAD stub
items) in the relation as a whole in ANALYZE's acquire_sample_rows()
function. Wouldn't it make more sense to focus on the "live vs dead
tuple properties" of heap pages that are not known to be all-visible
when we generate statistics for our pgstat_report_analyze() report?
These statistics collector stats are only for the benefit of autovacuum
scheduling -- and so they're *consumed* in a way that is totally
different from the nearby pg_statistic stats.

There is no good reason for the pgstat_report_analyze() stats to be
based on the same pg_class.relpages "denominator" as the pg_statistic
stats (it's just slightly easier to do it that way in
acquire_sample_rows(), I suppose). On the other hand, an alternative
behavior involving counting totaldeadrows against sampled
not-all-visible pages (but not otherwise) has a big benefit: doing so
would remove any risk that older/earlier PageIsAllVisible() pages will
bias ANALYZE in the direction of underestimating the count. This isn't
a theoretical benefit -- I have tied it to an issue with the
BenchmarkSQL TPC-C implementation [1].

This approach just seems natural to me. VACUUM intrinsically only
expects dead tuples/line pointers in not-all-visible pages. So
PageIsAllVisible() pages should not be counted here -- they are simply
irrelevant, because these stats are for autovacuum, and autovacuum
thinks they're irrelevant. What's more, VACUUM currently uses
vac_estimate_reltuples() to compensate for the fact that it skips some
pages using the visibility map -- pgstat_report_vacuum() expects a
whole-relation estimate. But if
pgstat_report_vacuum()/pgstat_report_analyze() expected statistics
about the general properties of live vs dead tuples (or LP_DEAD items)
on not-all-visible pages in the first place, then we wouldn't need to
compensate like this.

This new approach also buys us the ability to extrapolate a new
estimated number of dead tuples using old, stale stats. The stats can
be combined with the authoritative/known number of not-all-visible
pages right this second, since it's cheap enough to *accurately*
determine the total number of not-all-visible pages for a heap
relation by calling visibilitymap_count(). My guess is that this would
be much more accurate in practice: provided the original average
number of dead/live tuples (tuples per not-all-visible block) was
still reasonably accurate, the extrapolated "total dead tuples right
now" values would also be accurate.

I'm glossing over some obvious wrinkles here, such as: what happens to
totaldeadrows when 100% of all the pages ANALYZE samples are
PageIsAllVisible() pages? I think that it shouldn't be too hard to
come up with solutions to those problems (the extrapolation idea
already hints at a solution), but for now I'd like to keep the
discussion high level.

[1] 
https://postgr.es/m/CAH2-Wz=9r83wcwzcpuh4fvpedm4znzbzmvp3rt21+xhqwmu...@mail.gmail.com
-- 
Peter Geoghegan