RE: Failed transaction statistics to measure the logical replication progress
On Friday, December 3, 2021 3:12 PM vignesh C wrote:
> Thanks for the updated patch.
> Currently we are storing the commit count, error_count and abort_count
> for each table of the table sync operation. If we have thousands of
> tables, we will be storing the information for each of the tables.
> Shouldn't we be storing the consolidated information in this case?
>
> diff --git a/src/backend/replication/logical/tablesync.c
> b/src/backend/replication/logical/tablesync.c
> index f07983a..02e9486 100644
> --- a/src/backend/replication/logical/tablesync.c
> +++ b/src/backend/replication/logical/tablesync.c
> @@ -1149,6 +1149,11 @@ copy_table_done:
>  	MyLogicalRepWorker->relstate_lsn = *origin_startpos;
>  	SpinLockRelease(&MyLogicalRepWorker->relmutex);
>
> +	/* Report the success of table sync. */
> +	pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
> +									 MyLogicalRepWorker->relid,
> +									 0 /* no logical message type */ );

Okay. I have united all the stats into those of the apply worker. In line with this change, I fixed the TAP tests as well, to cover the updates of stats done by table sync workers.

Also, during my self-review, I noticed that I should call pgstat_report_subworker_xact_end() before process_syncing_tables(), because the latter can lead to process exit, which would result in missing one increment of the stats columns. I noted this point in a comment as well. (A simplified sketch of the ordering follows below.)

Best Regards,
	Takamichi Osumi

Attachment: v15-0001-Extend-pg_stat_subscription_workers-to-include-g.patch
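The ordering point, as a simplified sketch against the patch's API (illustrative only, not the actual patch hunk; current_lsn stands in for whatever LSN the caller has at hand):

    /*
     * Report the transaction's outcome first: process_syncing_tables()
     * can call proc_exit(), and the increment would otherwise be lost.
     */
    pgstat_report_subworker_xact_end(MyLogicalRepWorker->subid,
                                     MyLogicalRepWorker->relid,
                                     0 /* no logical message type */ );

    /* Only after the report: this may not return. */
    process_syncing_tables(current_lsn);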
Re: A test for replay of regression tests
On 12/3/21 23:21, Thomas Munro wrote:
> Next problem: The below is clearly not the right way to find the
> pg_regress binary and bindir, and doesn't work on Windows or VPATH.
> Any suggestions for how to do this? I feel like something like
> $node->installed_command() or similar logic is needed...
>
> # Run the regression tests against the primary.
> # XXX How should we find the pg_regress binary and bindir?
> system_or_bail("../regress/pg_regress",
>     "--bindir=../../bin/psql",
>     "--port=" . $node_primary->port,
>     "--schedule=../regress/parallel_schedule",
>     "--dlpath=../regress",
>     "--inputdir=../regress");

TAP tests are passed a path to pg_regress as $ENV{PG_REGRESS}. You should be using that.

On non-MSVC, the path to a non-installed psql is in this case "$TESTDIR/../../bin/psql"; this should work for VPATH builds as well as non-VPATH. On MSVC it's a bit harder: it's "$top_builddir/$releasetype/psql", but we don't expose that. Perhaps we should. cf. commit f4ce6c4d3a

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
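PS: a minimal, untested sketch of the call rewritten along those lines, assuming the usual TESTDIR and PG_REGRESS environment variables exported by the TAP framework, and covering only the non-MSVC psql layout:

    # Untested sketch: use the pg_regress path the TAP framework exports,
    # and point --bindir at the not-yet-installed psql (non-MSVC layout).
    my $bindir = "$ENV{TESTDIR}/../../bin/psql";
    system_or_bail($ENV{PG_REGRESS},
        "--bindir=$bindir",
        "--port=" . $node_primary->port,
        "--schedule=../regress/parallel_schedule",
        "--dlpath=../regress",
        "--inputdir=../regress");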
Re: The "char" type versus non-ASCII characters
Chapman Flack writes:
> On 12/03/21 14:12, Tom Lane wrote:
>> This reminds me of something I've been intending to bring up, which
>> is that the "char" type is not very encoding-safe. charout() for
>> example just regurgitates the single byte as-is.

> I wonder if maybe what to do about that lies downstream of some other
> thought about encoding-related type properties.

As you mentioned upthread, it's probably wrong to think of "char" as character data at all. The catalogs use it as a poor man's enum type, and it's just for convenience that we assign readable ASCII codes for the enum values of a given column.

The only reason to think of it as encoding-dependent would be if you have ambitions to store a non-ASCII character in a "char". But I think that's something we want to strongly discourage, even if we don't prohibit it altogether. The whole point of the type is to be one byte, so only in legacy encodings can it possibly represent a non-ASCII character.

So I'm visualizing it as a uint8 that we happen to like to store ASCII codes in, and that's what prompts the thought of using a numeric representation for non-ASCII values. I think you're just in for pain if you want to consider such values as character data rather than numbers.

> ... "char" is an existing
> example, because its current behavior is exactly as if it declared
> "I am one byte of SQL_ASCII regardless of server setting".

But it's not quite that. If we treated it as SQL_ASCII, we'd refuse to convert it to some other encoding unless the value passes encoding verification, which is exactly what charout() is not doing.

> Indeed, cstring behaves completely as if it is a type with the server
> encoding.

Yup, cstring is definitely presumed to be in the server's encoding.

> So, is the current "char" situation so urgent that it demands some
> one-off solution be chosen for it, or could it be neglected with minimal
> risk until someday we've defined what "this datatype has encoding X
> that's different from the server encoding" means, and that takes care
> of it?

I'm not willing to leave it broken in the rather faint hope that someday there will be a more general solution, especially since I don't buy the premise that "char" ought to participate in any such solution.

			regards, tom lane
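For reference, charout() today amounts to little more than the following (a simplified paraphrase of src/backend/utils/adt/char.c from memory, not a verbatim quote): the raw byte is handed back with no encoding verification whatsoever.

    /* Simplified paraphrase of charout(); not verbatim source. */
    Datum
    charout(PG_FUNCTION_ARGS)
    {
        char        ch = PG_GETARG_CHAR(0);
        char       *result = (char *) palloc(2);

        /* Emit the single byte as-is, valid in the server encoding or not. */
        result[0] = ch;
        result[1] = '\0';
        PG_RETURN_CSTRING(result);
    }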
Re: The "char" type versus non-ASCII characters
On 12/04/21 11:34, Tom Lane wrote:
> Chapman Flack writes:
>> "I am one byte of SQL_ASCII regardless of server setting".
>
> But it's not quite that. If we treated it as SQL_ASCII, we'd refuse
> to convert it to some other encoding unless the value passes encoding
> verification, which is exactly what charout() is not doing.

Ah, good point. I remembered noticing pg_do_encoding_conversion returning the src pointer unchanged when SQL_ASCII is involved, but I see that it does verify the dest_encoding when SQL_ASCII is the source.

> encoding-dependent would be if you have ambitions to store a non-ASCII
> character in a "char". But I think that's something we want to
> strongly discourage, even if we don't prohibit it altogether. ...
> So I'm visualizing it as a uint8 that we happen to like to store
> ASCII codes in, and that's what prompts the thought of using a
> numeric representation for non-ASCII values.

I'm in substantial agreement, though I also see that it is nearly always set from a quoted literal, and tested against a quoted literal, and calls itself "char". So, for consistency's sake, I am thinking it might be better not to invent some all-new convention for its text representation, but to adopt something that's already familiar, like bytea escaped format. It would then always look and act like a one-octet bytea. Maybe have charin accept either bytea-escaped or bytea-hex form too. (Or, never mind; when restricted to one octet, bytea-hex and the \xhh bytea-escape form are indistinguishable anyway.)

Then for free we get the property that if somebody today uses 'ű' as an enum value, it might start appearing as '\xfb' now in dumps, etc., but their existing CASE WHEN thing = 'ű' code doesn't stop working (as long as they haven't done something silly like change the encoding), and they have the flexibility to update it to WHEN thing = '\xfb' as time permits, if they choose. If they don't, they accept the risk that by switching to another encoding in the future they may either see their existing tests stop matching, or their existing literals fail to parse, but there won't be invalidly-encoded strings created.

> Yup, cstring is definitely presumed to be in the server's encoding.

Without proposing to change it, I observe that by defining both cstring and unknown in this way (with the latter being expressly the type of any literal from the client destined for a type we don't know yet), we're a bit painted into a corner as far as supporting types like NCHAR. (I suppose clients could be banned from sending such values as literals, and required to use extended form and bind them with a binary message.) It's analogous to the way format-0 and format-1 both act as filters that no encoding-dependent data can squish through without surviving both the client and the server encoding, even if it is of a type that's defined to be independent of either.

Regards,
-Chap
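As a purely hypothetical sketch of that proposal (not an actual or proposed patch), charout might end up looking something like this, with ASCII bytes printed as themselves and everything else in the one-octet bytea-hex form:

    /*
     * Hypothetical charout under the proposal above -- a sketch only.
     * ASCII comes out as-is; any high-bit-set byte comes out as \xhh,
     * so no invalidly-encoded text is ever produced.
     */
    Datum
    charout(PG_FUNCTION_ARGS)
    {
        unsigned char ch = (unsigned char) PG_GETARG_CHAR(0);
        char         *result = (char *) palloc(8);

        if (ch < 0x80)
        {
            result[0] = (char) ch;
            result[1] = '\0';
        }
        else
            snprintf(result, 8, "\\x%02x", ch);

        PG_RETURN_CSTRING(result);
    }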
Re: SPI TupTable memory context
On 12/03/21 20:22, Tom Lane wrote:
> Seems kinda dangerous to me ...
>
>> AtEOSubXact_SPI can free tuptables retail, but only in the rollback case.
>
> ... precisely because of that. If you wanted to take control of
> the TupTable, you'd really need to unhook it from the SPI context's
> tuptables list, and that *really* seems like undue familiarity
> with the implementation.

Fair enough. I didn't have an immediate use in mind, but I had been reading through DestReceiver code and noticed it worked that way, and it looked as if an SPI_keeptuptable could have been implemented in probably no more than the 25 lines of SPI_keepplan. I wasn't sure whether that was because it had been considered and deemed a Bad Thing, or because the idea simply hadn't come up. The equivalent with a custom DestReceiver would certainly work, but with a lot more ceremony. So that was why I asked.

If the response had been more like "hmm, no clear reason a patch to do that would be bad", and if such a patch got accepted for PG release n, that could also implicitly assuage worries about undue familiarity when implementing the compatible behavior on versions < n.

Regards,
-Chap
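For the record, the hypothetical SPI_keeptuptable, patterned after SPI_keepplan, might have looked roughly like this; the "unhook it from the list" step is exactly the undue familiarity being objected to. A sketch only -- the field and list names come from my reading of spi.c and may not be exact:

    /*
     * Hypothetical SPI_keeptuptable -- NOT an actual SPI API, just a
     * sketch of the idea discussed above.
     */
    int
    SPI_keeptuptable(SPITupleTable *tuptable)
    {
        if (tuptable == NULL)
            return SPI_ERROR_ARGUMENT;

        /*
         * Unhook from the current SPI context's tuptables list, so that
         * SPI_finish / AtEOSubXact_SPI won't free it.  This is the part
         * that pokes at SPI's internals.
         */
        slist_delete(&_SPI_current->tuptables, &tuptable->next);

        /* Reparent its memory context so the tuples outlive SPI_finish. */
        MemoryContextSetParent(tuptable->tuptabcxt, CacheMemoryContext);

        return 0;
    }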
Re: RecoveryInProgress() has critical side effects
On Tue, Nov 16, 2021 at 09:41:58AM -0500, Robert Haas wrote:
> Maybe I'm not understanding you properly here, but it seems like you
> might be forgetting that this is a local variable and thus every
> backend is going to have something different. In the startup process,
> it will be initialized by StartupXLOG(); in other processes, it's
> currently initialized by RecoveryInProgress(), but with this patch it
> wouldn't be. Either way, it's then updated by future calls to
> XLogInsertRecord() as required. XLOG_FPW_CHANGE records might affect
> the new value that gets set the next time XLogInsertRecord() runs,
> but they don't directly affect doPageWrites.

My main worry here is that this slightly changes the definition of doPageWrites across stable branches at the end of recovery, as there could be records generated there. Note that GetFullPageWriteInfo() gets called in XLogInsert(), while Insert->fullPageWrites gets updated before CleanupAfterArchiveRecovery(), and that may influence the value of doPageWrites in the startup process.
--
Michael
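To spell out the mechanics under discussion (a simplified paraphrase from memory of xlog.c, not verbatim source): each backend keeps local copies of RedoRecPtr and doPageWrites that are refreshed only when the backend itself inserts WAL, and GetFullPageWriteInfo() just hands those local copies back.

    /* Simplified paraphrase, not verbatim xlog.c. */
    static XLogRecPtr RedoRecPtr;   /* backend-local copy */
    static bool doPageWrites;       /* backend-local copy */

    /*
     * Inside XLogInsertRecord(), the local copy is refreshed from the
     * shared insert state, roughly:
     *
     *     doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
     */

    void
    GetFullPageWriteInfo(XLogRecPtr *RedoRecPtr_p, bool *doPageWrites_p)
    {
        /* Returns possibly-stale local copies; no shared-memory access. */
        *RedoRecPtr_p = RedoRecPtr;
        *doPageWrites_p = doPageWrites;
    }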
Re: Remove extra Logging_collector check before calling SysLogger_Start()
On Fri, Dec 03, 2021 at 12:45:34PM -0500, Tom Lane wrote:
> I think it's fine as-is; good belt-and-suspenders-too programming.
> It's not like the extra check inside SysLogger_Start() is costly.

+1.
--
Michael
Re: Do sys logger and stats collector need wait events WAIT_EVENT_SYSLOGGER_MAIN/_PGSTAT_MAIN?
On Fri, Dec 03, 2021 at 12:42:47PM -0500, Tom Lane wrote:
> On the whole I'd be inclined to leave it alone. Even if the reporting
> mechanisms aren't able to report these events today, maybe they'll
> be able to do so in the future. The context of the stats collector
> process, in particular, seems likely to change.

These come from 6f3bd98, where I am pretty sure that I (or Robert) defined those values so that users don't have to guess what to pass down to WaitLatch(), on the assumption that they may become useful in the future. So I'd rather leave them be.
--
Michael
Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?
I wonder why we're counting the number of dead tuples (or LP_DEAD stub items) in the relation as a whole in ANALYZE's acquire_sample_rows() function. Wouldn't it make more sense to focus on the "live vs dead tuple properties" of heap pages that are not known to be all-visible when we generate statistics for our pgstat_report_analyze() report?

These statistics-collector stats are only for the benefit of autovacuum scheduling -- and so they're *consumed* in a way that is totally different from the nearby pg_statistic stats. There is no good reason for the pgstat_report_analyze() stats to be based on the same pg_class.relpages "denominator" as the pg_statistic stats (it's just slightly easier to do it that way in acquire_sample_rows(), I suppose).

On the other hand, an alternative behavior involving counting totaldeadrows against sampled not-all-visible pages (but not otherwise) has a big benefit: doing so would remove any risk that older/earlier PageIsAllVisible() pages will bias ANALYZE in the direction of underestimating the count. This isn't a theoretical benefit -- I have tied it to an issue with the BenchmarkSQL TPC-C implementation [1].

This approach just seems natural to me. VACUUM intrinsically only expects dead tuples/line pointers in not-all-visible pages, so PageIsAllVisible() pages should not be counted here -- they are simply irrelevant, because these stats are for autovacuum, and autovacuum thinks they're irrelevant.

What's more, VACUUM currently uses vac_estimate_reltuples() to compensate for the fact that it skips some pages using the visibility map -- pgstat_report_vacuum() expects a whole-relation estimate. But if pgstat_report_vacuum()/pgstat_report_analyze() expected statistics about the general properties of live vs dead tuples (or LP_DEAD items) on not-all-visible pages in the first place, then we wouldn't need to compensate like this.

This new approach also buys us the ability to extrapolate a new estimated number of dead tuples from old, stale stats. The stats can be combined with the authoritative/known number of not-all-visible pages right this second, since it's cheap enough to *accurately* determine the total number of not-all-visible pages for a heap relation by calling visibilitymap_count(). My guess is that this would be much more accurate in practice: provided the original average number of dead/live tuples (tuples per not-all-visible block) was still reasonably accurate, the extrapolated "total dead tuples right now" value would also be accurate. A sketch of the arithmetic appears below.

I'm glossing over some obvious wrinkles here, such as: what happens to totaldeadrows when 100% of the pages ANALYZE samples are PageIsAllVisible() pages? I think it shouldn't be too hard to come up with solutions to those problems (the extrapolation idea already hints at a solution), but for now I'd like to keep the discussion at a high level.

[1] https://postgr.es/m/CAH2-Wz=9r83wcwzcpuh4fvpedm4znzbzmvp3rt21+xhqwmu...@mail.gmail.com
--
Peter Geoghegan
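A minimal sketch of that extrapolation, under the assumption that the stats system remembered a "dead tuples per not-all-visible page" figure from the last ANALYZE (the function and variable names here are illustrative, not actual PostgreSQL code):

    /*
     * Hypothetical sketch of the extrapolation idea.  Assumes a stored
     * dead-tuples-per-not-all-visible-page figure from the last ANALYZE;
     * names are illustrative only.
     */
    static double
    estimate_dead_tuples_now(Relation rel, double dead_per_nav_page)
    {
        BlockNumber total_pages = RelationGetNumberOfBlocks(rel);
        BlockNumber all_visible;
        BlockNumber all_frozen;
        BlockNumber nav_pages_now;

        /* Cheap and accurate: count all-visible pages via the VM. */
        visibilitymap_count(rel, &all_visible, &all_frozen);
        nav_pages_now = total_pages - all_visible;

        /* Scale the stale per-page average by today's page count. */
        return dead_per_nav_page * nav_pages_now;
    }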