Re: Improve WALRead() to suck data directly from WAL buffers when possible
On Fri, Dec 23, 2022 at 3:46 PM Bharath Rupireddy wrote:
> On Mon, Dec 12, 2022 at 8:27 AM Kyotaro Horiguchi wrote:
> > Thanks for providing thoughts.
> > At Fri, 9 Dec 2022 14:33:39 +0530, Bharath Rupireddy wrote in
> > > The patch introduces concurrent readers for the WAL buffers; so far
> > > there are only concurrent writers. In the patch, WALRead() takes just
> > > one lock (WALBufMappingLock) in shared mode to enable concurrent
> > > readers and does minimal things - checks if the requested WAL page is
> > > present in WAL buffers, if so, copies the page and releases the lock.
> > > I think taking just WALBufMappingLock is enough here as the concurrent
> > > writers depend on it to initialize and replace a page in WAL buffers.
> > >
> > > I'll add this to the next commitfest.
> > >
> > > Thoughts?
> >
> > This adds copying of the whole page (at least) at every WAL *record*
> > read,
>
> In the worst case yes, but that may not always be true. On a typical
> production server with decent write traffic, it happens that the
> callers of WALRead() read a full WAL page of size XLOG_BLCKSZ bytes or
> MAX_SEND_SIZE bytes.

I agree with this.

> > This patch copies the bleeding edge WAL page without recording the
> > (next) insertion point nor checking whether all in-progress insertions
> > behind the target LSN have finished. Thus the copied page may have
> > holes. That being said, the sequential-reading nature and the fact
> > that WAL buffers are zero-initialized may make it work for recovery,
> > but I don't think this also works for replication.
>
> WALRead() callers are smart enough to take the flushed bytes only.
> Although they read the whole WAL page, they calculate the valid bytes.

Right.

On first read the patch looks good, although it needs some more thoughts
on the 'XXX' comments in the patch.

Also, I do not like that XLogReadFromBuffers() uses 3 bools for
hit/partial hit/miss; instead of this we could use an enum or some
tristate variable, which I think would be cleaner.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.
Hi,

On 2022-11-23 21:13:04 +1100, Alex Fan wrote:
> > @@ -241,6 +246,40 @@ llvm_mutable_module(LLVMJitContext *context)
> >  	context->module = LLVMModuleCreateWithName("pg");
> >  	LLVMSetTarget(context->module, llvm_triple);
> >  	LLVMSetDataLayout(context->module, llvm_layout);
> > +#ifdef __riscv
> > +#if __riscv_xlen == 64
> > +#ifdef __riscv_float_abi_double
> > +	abiname = "lp64d";
> > +#elif defined(__riscv_float_abi_single)
> > +	abiname = "lp64f";
> > +#else
> > +	abiname = "lp64";
> > +#endif
> > +#elif __riscv_xlen == 32
> > +#ifdef __riscv_float_abi_double
> > +	abiname = "ilp32d";
> > +#elif defined(__riscv_float_abi_single)
> > +	abiname = "ilp32f";
> > +#else
> > +	abiname = "ilp32";
> > +#endif
> > +#else
> > +	elog(ERROR, "unsupported riscv xlen %d", __riscv_xlen);
> > +#endif
> > +	/*
> > +	 * set this manually to avoid llvm defaulting to soft float and
> > +	 * resulting in linker error: `can't link double-float modules
> > +	 * with soft-float modules`
> > +	 * we could set this for TargetMachine via MCOptions, but there
> > +	 * is no C API for it
> > +	 * ref:

I think this is something that should go into the llvm code, rather than postgres.

> > @@ -820,16 +861,21 @@ llvm_session_initialize(void)
> >  	elog(DEBUG2, "LLVMJIT detected CPU \"%s\", with features \"%s\"",
> >  		 cpu, features);
> >
> > +#ifdef __riscv
> > +	reloc = LLVMRelocPIC;
> > +	codemodel = LLVMCodeModelMedium;
> > +#endif

Same.

> > +#ifdef USE_JITLINK
> > +/*
> > + * There is no public C API to create ObjectLinkingLayer for JITLINK,
> > + * create our own
> > + */
> > +DEFINE_SIMPLE_CONVERSION_FUNCTIONS(llvm::orc::ExecutionSession, LLVMOrcExecutionSessionRef)
> > +DEFINE_SIMPLE_CONVERSION_FUNCTIONS(llvm::orc::ObjectLayer, LLVMOrcObjectLayerRef)

I recommend proposing a patch for adding such an API to LLVM.

Greetings,

Andres Freund
Re: daitch_mokotoff module
Hello

On 2022-Dec-23, Dag Lem wrote:
> It seems to me like you're trying to use soundex coding for something it
> was never designed for.

I'm not trying to use it for anything, actually. I'm just reading the pages your patch links to, to try and understand how this algorithm can best be implemented in Postgres. So I got to this page
https://www.avotaynu.com/soundex.htm
which explains that Daitch figured it would be best if a letter that can have two possible encodings were encoded in both ways:

> 5. If a combination of letters could have two possible sounds, then it
> is coded in both manners. For example, the letters ch can have a soft
> sound such as in Chicago or a hard sound as in Christmas.

which I understand as meaning that a single name returns two possible encodings, which is why these three names

    Barca  Barco  Parco

have two possible encodings, 795000 and 794000, which is what your algorithm returns. In fact, using the word Christmas we do get alternative codes for the first letter (either 4 or 5), precisely as in Daitch's example:

=# select daitch_mokotoff('christmas');
 daitch_mokotoff
─────────────────
 594364 494364
(1 fila)

and if we take out the ambiguous 'ch', we get a single one:

=# select daitch_mokotoff('ristmas');
 daitch_mokotoff
─────────────────
 943640
(1 fila)

and if we add another 'ch', we get the codes for each possibility at each position of the ambiguous 'ch':

=# select daitch_mokotoff('christmach');
       daitch_mokotoff
─────────────────────────────
 594365 594364 494365 494364
(1 fila)

So, yes, I'm proposing that we return those as array elements and that @> is used to match them.

> Daitch-Mokotoff Soundex indexes alternative sounds for the same name,
> however if I understand correctly, you want to index names by single
> sounds, linking all alike sounding names to the same soundex code. I
> fail to see how that is useful - if you want to find matches for a name,
> you simply match against all indexed names. If you only consider one
> sound, you won't find all names that match.

Hmm, I think we're saying the same thing, but from opposite points of view. No, I want each name to return multiple codes, but that those multiple codes be treated as a multiple-value array of codes, rather than as a single string of space-separated codes.

> In any case, as explained in the documentation, the implementation is
> intended to be a companion to Full Text Search, thus text is the natural
> representation for the soundex codes.

Hmm, I don't agree with this point. The numbers are representations of the strings, but they don't necessarily have to be strings themselves.

> BTW Vera 79 does not match Veras 794000, because they don't sound
> the same (up to the maximum soundex code length).

No, and maybe that's okay because they have different codes. But they are both similar, in Daitch-Mokotoff, to Borja, which has two codes, 79 and 794000. (Any Spanish speaker will readily tell you that neither Vera nor Veras is similar in any way to Borja, but D-M has chosen to say that each of them matches one of Borja's codes. So they *are* related, even though indirectly, and as a genealogist you *may* be interested in getting a match for a person called Vera when looking for relatives of a person called Veras. And, as a Spanish speaker, that would make a lot of sense to me.)

Now, it's true that I've chosen to use Spanish names for my silly little experiment. Maybe this isn't terribly useful as a practical example, because this algorithm seems to have been designed for Jewish surnames and perhaps not many (or not any) Jews had Spanish surnames. I don't know; I'm not Jewish myself (though Noah Gordon tells the tale of a Spanish Jew called Josep Álvarez in his book "The Winemaker", so I guess it's not impossible). Anyway, I suspect that if you repeat the experiment with names of other origins, you'll find pretty much the same results apply there, and that is the whole reason D-M returns multiple codes and not just one.
Merry Christmas :-) -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Todo: Teach planner to evaluate multiple windows in the optimal order
Hi,

While looking at one of the todo items for window functions, namely:

    Teach planner to evaluate multiple windows in the optimal order
    Currently windows are always evaluated in the query-specified order.

these are the relevant points from the threads.

Point #1:

    In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform
    3 sorts. We also perform 3.

and Point #2:

    Teach planner to decide which window to evaluate first based on costs.
    Currently the first window in the query is evaluated first, there may
    be no index to help sort the first window, but perhaps there are for
    other windows in the query. This may allow an index scan instead of a
    seqscan -> sort.

Repro:

select pg_catalog.version();
                                       version
--------------------------------------------------------------------------------------
 PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0, 64-bit
(1 row)

create table empsalary(depname text, empno int, salary int);
insert into empsalary
select substr(md5(random()::text), 0, 25),
       generate_series(1,10),
       generate_series(1,12000);

explain SELECT depname,
               SUM(salary) OVER (ORDER BY salary),
               SUM(salary) OVER (ORDER BY empno)
        FROM empsalary ORDER BY salary;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 WindowAgg  (cost=289.47..324.48 rows=2001 width=49)
   ->  Sort  (cost=289.47..294.47 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=144.73..179.75 rows=2001 width=41)
               ->  Sort  (cost=144.73..149.73 rows=2001 width=33)
                     Sort Key: empno
                     ->  Seq Scan on empsalary  (cost=0.00..35.01 rows=2001 width=33)
(7 rows)

As can be seen, for case #1 the issue looks resolved and only 2 sorts are performed. For #2, the index column ordering is changed.

create index idx_emp on empsalary (empno);

explain SELECT depname,
               SUM(salary) OVER (ORDER BY salary),
               SUM(salary) OVER (ORDER BY empno)
        FROM empsalary ORDER BY salary;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 WindowAgg  (cost=204.03..239.04 rows=2001 width=49)
   ->  Sort  (cost=204.03..209.03 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=0.28..94.31 rows=2001 width=41)
               ->  Index Scan using idx_emp on empsalary  (cost=0.28..64.29 rows=2001 width=33)
(5 rows)

explain SELECT depname,
               SUM(salary) OVER (ORDER BY empno),
               SUM(salary) OVER (ORDER BY salary)
        FROM empsalary ORDER BY salary;
                                   QUERY PLAN
---------------------------------------------------------------------------------
 WindowAgg  (cost=204.03..239.04 rows=2001 width=49)
   ->  Sort  (cost=204.03..209.03 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=0.28..94.31 rows=2001 width=41)
               ->  Index Scan using idx_emp on empsalary  (cost=0.28..64.29 rows=2001 width=33)
(5 rows)

In both cases an index scan is performed, which means this issue is resolved as well. Is this todo still relevant?

Further down the threads:

    I do think the patch has probably left some low-hanging fruit on the
    simpler end of the difficulty spectrum, namely when the window stuff
    requires only one ordering that could be done either explicitly or by
    an indexscan. That choice should ideally be done with a proper cost
    comparison taking any LIMIT into account. I think right now the LIMIT
    might not be accounted for, or might be considered even when it
    shouldn't be because another sort is needed anyway.

I am not sure if I understand this fully, but does it mean the proposed todo (to use indexes) should be refined to teach the planner to take the cost model into account while deciding whether to use an index in window functions? Meaning, not always go with the index route (modify point #2)?

--
Regards,
Ankit Kumar Pandey
[PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
Hi all,

This is a patch for the TODO item:

    Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

As per a suggestion on the mailing list, which was to replace the current mechanism of getting the optimizer log via the OPTIMIZER_DEBUG macro with something more configurable (which doesn't require rebuilding postgres from source code), this patch replaces OPTIMIZER_DEBUG by introducing a GUC, show_optimizer_log, which can be switched on and off to display (or hide) the previously generated log, moving the output from stdout to the postmaster's log.

Please check the attached patch; any feedback is appreciated.

--
Regards,
Ankit Kumar Pandey

From 420f0ed81d233bf63a10104d1164726b138bce83 Mon Sep 17 00:00:00 2001
From: Ankit Kumar Pandey
Date: Sun, 25 Dec 2022 21:26:32 +0530
Subject: [PATCH] Improve ability to display optimizer analysis using
 OPTIMIZER_DEBUG

Currently optimizer logs are enabled as a build flag. This patch adds
a GUC variable to enable or disable optimizer logs and thus does away
with build requirements.
---
 src/backend/nodes/print.c                     | 107 ++
 src/backend/optimizer/path/allpaths.c         | 134 ++
 src/backend/optimizer/plan/planner.c          |  11 +-
 src/backend/utils/misc/guc_tables.c           |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/nodes/print.h                     |   1 +
 src/include/optimizer/optimizer.h             |   1 +
 src/include/optimizer/paths.h                 |   2 -
 8 files changed, 200 insertions(+), 68 deletions(-)

diff --git a/src/backend/nodes/print.c b/src/backend/nodes/print.c
index a5c44adc6c..c33db27142 100644
--- a/src/backend/nodes/print.c
+++ b/src/backend/nodes/print.c
@@ -418,6 +418,113 @@ print_expr(const Node *expr, const List *rtable)
 		printf("unknown expr");
 }

+
+/*
+ * format_expr
+ *	  stores an expression to a given string for logging
+ */
+void
+format_expr(const Node *expr, const List *rtable, StringInfoData *str)
+{
+	if (expr == NULL)
+	{
+		appendStringInfo(str, "<>");
+		return;
+	}
+
+	if (IsA(expr, Var))
+	{
+		const Var  *var = (const Var *) expr;
+		char	   *relname,
+				   *attname;
+
+		switch (var->varno)
+		{
+			case INNER_VAR:
+				relname = "INNER";
+				attname = "?";
+				break;
+			case OUTER_VAR:
+				relname = "OUTER";
+				attname = "?";
+				break;
+			case INDEX_VAR:
+				relname = "INDEX";
+				attname = "?";
+				break;
+			default:
+				{
+					RangeTblEntry *rte;
+
+					Assert(var->varno > 0 &&
+						   (int) var->varno <= list_length(rtable));
+					rte = rt_fetch(var->varno, rtable);
+					relname = rte->eref->aliasname;
+					attname = get_rte_attribute_name(rte, var->varattno);
+				}
+				break;
+		}
+		appendStringInfo(str, "%s.%s", relname, attname);
+	}
+	else if (IsA(expr, Const))
+	{
+		const Const *c = (const Const *) expr;
+		Oid			typoutput;
+		bool		typIsVarlena;
+		char	   *outputstr;
+
+		if (c->constisnull)
+		{
+			appendStringInfo(str, "NULL");
+			return;
+		}
+
+		getTypeOutputInfo(c->consttype,
+						  &typoutput, &typIsVarlena);
+
+		outputstr = OidOutputFunctionCall(typoutput, c->constvalue);
+		appendStringInfo(str, "%s", outputstr);
+		pfree(outputstr);
+	}
+	else if (IsA(expr, OpExpr))
+	{
+		const OpExpr *e = (const OpExpr *) expr;
+		char	   *opname;
+
+		opname = get_opname(e->opno);
+		if (list_length(e->args) > 1)
+		{
+			print_expr(get_leftop((const Expr *) e), rtable);
+			appendStringInfo(str, " %s ", ((opname != NULL) ? opname : "(invalid operator)"));
+			print_expr(get_rightop((const Expr *) e), rtable);
+		}
+		else
+		{
+			appendStringInfo(str, "%s ", ((opname != NULL) ? opname : "(invalid operator)"));
+			print_expr(get_leftop((const Expr *) e), rtable);
+		}
+	}
+	else if (IsA(expr, FuncExpr))
+	{
+		const FuncExpr *e = (const FuncExpr *) expr;
+		char	   *funcname;
+		ListCell   *l;
+
+		funcname = get_func_name(e->funcid);
+		appendStringInfo(str, "%s(", ((funcname != NULL) ? funcname : "(invalid function)"));
+		foreach(l, e->args)
+		{
+			print_expr(lfirst(l), rtable);
+			if (lnext(e->args, l))
+				appendStringInfo(str, ",");
+		}
+		appendStringInfo(str, ")");
+	}
+	else
+		appendStringInfo(str, "unknown expr");
+}
+
 /*
  * print_pathkeys -
  *	  pathkeys list of PathKeys

diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0cfa3a1d6c..38cbc365b3 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -28,9 +28,7 @@
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
-#ifdef OPTIMIZER_DEBUG
 #include "nodes/print.h"
-#endif
 #include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
@@ -65,6 +63,7 @@
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
+bool		show_optimizer_log = false;

 /* Hook for plugins to get contro
Re: Error-safe user functions
Robert Haas writes:
> On Fri, Dec 16, 2022 at 1:31 PM Tom Lane wrote:
>> The reg* functions probably need a unified plan as to how far
>> down we want to push non-error behavior.
> I would be in favor of an aggressive approach.

Here's a proposed patch for converting regprocin and friends to soft error reporting. I'll say at the outset that it's an engineering compromise, and it may be worth going further in future. But I doubt it's worth doing more than this for v16, because the next steps would be pretty invasive.

I've converted all the errors thrown directly within regproc.c, and also converted parseTypeString, typeStringToTypeName, and stringToQualifiedNameList to report their own errors softly. This affected some outside callers, but not so many of them that I think it's worth inventing compatibility wrappers.

I dealt with lookup failures by just changing the input functions to call the respective lookup functions with missing_ok = true, and then throw their own error softly on failure. Also, I've changed to_regproc() and friends to return NULL in exactly the same cases that are now soft errors for the input functions. Previously they were a bit inconsistent about what triggered hard errors vs. returning NULL.

(Perhaps we should go further than this, and convert all these functions to just be DirectInputFunctionCallSafe wrappers around the corresponding input functions? That would save some duplicative code, but I've not done it here.)

What's not fixed here:

1. As previously discussed, parse errors in type names are thrown by the main grammar, so getting those to not be hard errors seems like too big a lift for today.

2. Errors about invalid type modifiers (reported by typenameTypeMod or type-specific typmodin routines) are not trapped either. Fixing this would require extending the soft-error conventions to typmodin routines, which maybe will be worth doing someday but it seems pretty far down the priority list. Specifying a typmod is surely not main-line usage for regtypein.

3. Throwing our own error has the demerit that it might be different from what the underlying lookup function would have reported. This is visible in some changes in existing regression test cases, such as

-ERROR:  schema "ng_catalog" does not exist
+ERROR:  relation "ng_catalog.pg_class" does not exist

This isn't wrong, exactly, but the loss of specificity is a bit annoying.

4. This still fails to trap errors about "too many dotted names" and "cross-database references are not implemented", which are thrown in DeconstructQualifiedName, LookupTypeName, RangeVarGetRelid, and maybe some other places.

5. We also don't trap errors about "the schema exists but you don't have USAGE permission to do a lookup in it", because LookupExplicitNamespace still throws that even when passed missing_ok = true.

The obvious way to fix #3, #4 and #5 is to change pretty much all of the catalog lookup infrastructure to deal in escontext arguments instead of "missing_ok" booleans. That might be worth doing --- it'd have benefits beyond the immediate problem, I think --- but I feel it's a bigger lift than we want to undertake for v16. It'd be better to spend the time we have left for v16 on building features that use soft error reporting than on refining corner cases in the reg* functions. So I think we should stop more or less here, possibly after changing the to_regfoo functions to be simple wrappers around the soft input functions.
regards, tom lane diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c index 109bdfb33f..a1df8b1ddc 100644 --- a/src/backend/catalog/objectaddress.c +++ b/src/backend/catalog/objectaddress.c @@ -2182,7 +2182,7 @@ pg_get_object_address(PG_FUNCTION_ARGS) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("name or argument lists may not contain nulls"))); - typename = typeStringToTypeName(TextDatumGetCString(elems[0])); + typename = typeStringToTypeName(TextDatumGetCString(elems[0]), NULL); } else if (type == OBJECT_LARGEOBJECT) { @@ -2238,7 +2238,8 @@ pg_get_object_address(PG_FUNCTION_ARGS) (errcode(ERRCODE_INVALID_PARAMETER_VALUE), errmsg("name or argument lists may not contain nulls"))); args = lappend(args, - typeStringToTypeName(TextDatumGetCString(elems[i]))); + typeStringToTypeName(TextDatumGetCString(elems[i]), +NULL)); } } else diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c index f7ad689459..8f3850aa4e 100644 --- a/src/backend/parser/parse_type.c +++ b/src/backend/parser/parse_type.c @@ -727,10 +727,15 @@ pts_error_callback(void *arg) * Given a string that is supposed to be a SQL-compatible type declaration, * such as "int4" or "integer" or "character varying(32)", parse * the string and return the result as a TypeName. - * If the string cannot be parsed as a type, an error is raised. + * + * If the string ca
Re: [RFC] Add jit deform_counter
Hi

On Sun, 11 Dec 2022 at 5:44, Pavel Stehule wrote:
> Hi
>
> On Sun, 11 Dec 2022 at 1:14, Ian Lawrence Barwick wrote:
>> On Sun, 12 Jun 2022 at 18:14, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > I've noticed that the JIT performance counter generation_counter seems
>> > to include actions relevant to both the jit_expressions and
>> > jit_tuple_deforming options. It means one can't directly see what the
>> > influence of jit_tuple_deforming alone is, which would be helpful when
>> > adjusting JIT options. To make it better a new counter can be
>> > introduced; does it make sense?
>>
>> Hi Pavel
>>
>> I see you are added as reviewer in the CF app; have you been able to
>> take a look at this?
>
> I hope so, yes.

There are some problems with the stability of the regress tests:
http://cfbot.cputube.org/dmitry-dolgov.html

Regards

Pavel

> Regards
>
> Pavel
>
>> Regards
>>
>> Ian Barwick
Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
Ankit Kumar Pandey writes:
> As per a suggestion in the mailing list which is to replace the current
> mechanism of getting the optimizer log via the OPTIMIZER_DEBUG macro
> with something more configurable (which doesn't require rebuilding
> postgres from source code). This patch replaces OPTIMIZER_DEBUG
> by introducing a GUC show_optimizer_log which can be configured on
> and off to display (or hide) the previously generated log, from stdout
> to the postmaster's log.

The problem with OPTIMIZER_DEBUG is that it's useless. I've been hacking on the PG planner for well more than twenty years, and I do not think I've ever made any use of that code --- certainly not since the turn of the century or so. Making it GUC-accessible isn't going to make it more useful.

There certainly could be value in having some better trace of what the planner is doing, but I honestly don't have much of an idea of what that would look like. debug_print_rel isn't that, however, because it won't show you anything about paths that were considered and immediately rejected by add_path (or never got to add_path at all, because of the approximate-initial-cost filters in joinpath.c). We've sometimes speculated about logging every path submitted to add_path, but that would likely be too verbose to be very helpful.

Also, because OPTIMIZER_DEBUG is such a backwater, it's never gotten much polishing for usability. There are large gaps in what it can deal with (print_expr is pretty much a joke for example), yet also lots of redundancy in the output ... and dumping stuff to stdout isn't terribly helpful to the average user in the first place. These days people would probably also wish that the output could be machine-readable in some way (JSON-formatted, perhaps).

So I think this whole area would need some pretty serious rethinking and attention to detail before we should consider claiming that it's a useful feature.

regards, tom lane
Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG
On 25/12/22 23:54, Tom Lane wrote:
> These days people would probably also wish that the output could be
> machine-readable in some way (JSON-formatted, perhaps).

Perhaps the switch to enable the logs could take values like on (standard logging style), json, xml, etc., and print the output as required?

> however, because it won't show you anything about paths that were
> considered and immediately rejected by add_path (or never got to
> add_path at all, because of the approximate-initial-cost filters in
> joinpath.c). We've sometimes speculated about logging every path
> submitted to add_path, but that would likely be too verbose to be very
> helpful.

Maybe we could add a verbose option too, and this could be one of its outputs.

Is there any more relevant information you can think of which could be included here? Also, inputs from other hackers are welcome here.

--
Regards,
Ankit Kumar Pandey
Re: Error-safe user functions
I got annoyed by the fact that types cid, xid, xid8 don't throw error even for obvious garbage, because they just believe the result of strtoul or strtoull without any checking. That was probably up to project standards when cidin and xidin were written; but surely it's not anymore, especially when we can piggyback on work already done for type oid. Anybody have an objection to the following? One note is that because we already had test cases checking that xid would accept hex input, I made the common subroutines use "0" not "10" for strtoul's last argument, meaning that oid will accept hex now too. regards, tom lane diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c index 7cded73e6e..c67a79344a 100644 --- a/src/backend/utils/adt/numutils.c +++ b/src/backend/utils/adt/numutils.c @@ -477,6 +477,180 @@ invalid_syntax: "bigint", s))); } +/* + * Convert input string to an unsigned 32 bit integer. + * + * Allows any number of leading or trailing whitespace characters. + * + * If endloc isn't NULL, store a pointer to the rest of the string there, + * so that caller can parse the rest. Otherwise, it's an error if anything + * but whitespace follows. + * + * typname is what is reported in error messges. + * + * If escontext points to an ErrorSaveContext node, that is filled instead + * of throwing an error; the caller must check SOFT_ERROR_OCCURRED() + * to detect errors. + */ +uint32 +uint32in_subr(const char *s, char **endloc, + const char *typname, Node *escontext) +{ + uint32 result; + unsigned long cvt; + char *endptr; + + /* Ensure that empty-input is handled consistently across platforms */ + if (*s == '\0') + ereturn(escontext, 0, +(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg("invalid input syntax for type %s: \"%s\"", + typname, s))); + + errno = 0; + cvt = strtoul(s, &endptr, 0); + + /* + * strtoul() normally only sets ERANGE. 
On some systems it also may set + * EINVAL, which simply means it couldn't parse the input string. This is + * handled by the second "if" consistent across platforms. + */ + if (errno && errno != ERANGE && errno != EINVAL) + ereturn(escontext, 0, +(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg("invalid input syntax for type %s: \"%s\"", + typname, s))); + + if (endptr == s) + ereturn(escontext, 0, +(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg("invalid input syntax for type %s: \"%s\"", + typname, s))); + + if (errno == ERANGE) + ereturn(escontext, 0, +(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("value \"%s\" is out of range for type %s", + s, typname))); + + if (endloc) + { + /* caller wants to deal with rest of string */ + *endloc = endptr; + } + else + { + /* allow only whitespace after number */ + while (*endptr && isspace((unsigned char) *endptr)) + endptr++; + if (*endptr) + ereturn(escontext, 0, + (errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg("invalid input syntax for type %s: \"%s\"", + typname, s))); + } + + result = (uint32) cvt; + + /* + * Cope with possibility that unsigned long is wider than uint32, in which + * case strtoul will not raise an error for some values that are out of + * the range of uint32. + * + * For backwards compatibility, we want to accept inputs that are given + * with a minus sign, so allow the input value if it matches after either + * signed or unsigned extension to long. + * + * To ensure consistent results on 32-bit and 64-bit platforms, make sure + * the error message is the same as if strtoul() had returned ERANGE. + */ +#if PG_UINT32_MAX != ULONG_MAX + if (cvt != (unsigned long) result && + cvt != (unsigned long) ((int) result)) + ereturn(escontext, 0, +(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE), + errmsg("value \"%s\" is out of range for type %s", + s, typname))); +#endif + + return result; +} + +/* + * Convert input string to an unsigned 64 bit integer. 
+ * + * Allows any number of leading or trailing whitespace characters. + * + * If endloc isn't NULL, store a pointer to the rest of the string there, + * so that caller can parse the rest. Otherwise, it's an error if anything + * but whitespace follows. + * + * typname is what is reported in error messges. + * + * If escontext points to an ErrorSaveContext node, that is filled instead + * of throwing an error; the caller must check SOFT_ERROR_OCCURRED() + * to detect errors. + */ +uint64 +uint64in_subr(const char *s, char **endloc, + const char *typname, Node *escontext) +{ + uint64 result; + char *endptr; + + /* Ensure that empty-input is handled consistently across platforms */ + if (*s == '\0') + ereturn(escontext, 0, +(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION), + errmsg("invalid input syntax for type %s: \"%s\"", + typname, s))); + + errno = 0; + result = strtou64(s, &endptr, 0); + + /* + * strtoul[l]() normally only sets
Re: pg_upgrade: Make testing different transfer modes easier
> On 19 Dec 2022, at 01:39, Shinoda, Noriyoshi (PN Japan FSIP) wrote:
>
> With the addition of the --copy option, pg_upgrade now has three possible
> transfer mode options. Currently, an error does not occur even if multiple
> transfer modes are specified. For example, we can also run the
> "pg_upgrade --link --copy --clone" command. As discussed in Horiguchi-san's
> previous email, options like "--mode=link|copy|clone" can prevent this
> problem.
> The attached patch uses the current implementation and performs a minimum
> check to prevent multiple transfer modes from being specified.

We typically allow multiple invocations of the same parameter with a last-one-wins strategy, and only error out when competing *different* parameters are present. A --mode= parameter can still be added as syntactic sugar, but multiple-choice parameters are not a commonly used pattern in postgres utilities (pg_dump/restore and pg_basebackup are ones that come to mind).

--
Daniel Gustafsson https://vmware.com/
Re: [BUG] pg_upgrade test fails from older versions.
On Fri, Dec 23, 2022 at 10:39:25AM -0600, Justin Pryzby wrote:
> LGTM.

Thanks. Done as of d3c0cc4.
--
Michael
Re: [BUG] pg_upgrade test fails from older versions.
On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote:
> Sorry, didn't get to see the last letter!

No worries, the result is the same :)

I was looking at 0002 to add a callback to provide custom filtering rules.

+ my @ext_filter = split('\/', $_);

Are you sure that enforcing a separation with a slash is a good idea? What if the filters include directory paths or characters that are escaped, for example? Rather than introducing a filter.regex, I would tend to just document that in the README with a small example. I have been considering a few alternatives while making this useful in most cases; still, my mind always comes back to the simplest thing, which is to just read each line of the file, chomp it, and apply the pattern to the log file.
--
Michael
Re: [BUG] pg_upgrade test fails from older versions.
Hello! On 23.12.2022 05:42, Michael Paquier wrote: On Thu, Dec 22, 2022 at 09:59:18AM +0300, Anton A. Melnikov wrote: 2) v2-0002-Additional-dumps-filtering.patch + # Replace specific privileges with ALL + $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx; This should not be in 0002, I guess.. Made a separate patch for it: v3-0001-Fix-dumps-filtering.patch On 26.12.2022 05:52, Michael Paquier wrote: On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote: I was looking at 0002 to add a callback to provide custom filtering rules. + my @ext_filter = split('\/', $_); Are you sure that enforcing a separation with a slash is a good idea? What if the filters include directory paths or characters that are escaped, for example? Rather than introducing a filter.regex, I would tend to just document that in the README with a small example. I have been considering a few alternatives while making this useful in most cases, still my mind always comes back to the simplest thing: just read each line of the file, chomp it and apply the pattern to the dump file. Thanks for your attention! Yes, indeed, that will be simpler. Made it in v3-0002-Add-external-dumps-filtering.patch With the best wishes, -- Anton A. Melnikov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
commit 602627daf32a6d33ae96ce8fbae918c9f00f0633 Author: Anton A. Melnikov Date: Mon Dec 26 06:21:32 2022 +0300 Fix dumps filtering in pg_upgrade test from older versions. Replace specific privileges in dumps with ALL due to b5d63824 and 60684dd8.
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 4cc1469306..d23c4b2253 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -44,6 +44,8 @@ sub filter_dump $dump_contents =~ s/^\-\-.*//mgx; # Remove empty lines.
$dump_contents =~ s/^\n//mgx; + # Replace specific privileges with ALL + $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx; my $dump_file_filtered = "${dump_file}_filtered"; open(my $dh, '>', $dump_file_filtered)
commit 3870fb8263dc7e3fa240753b4eb89945b8d229be Author: Anton A. Melnikov Date: Thu Dec 22 08:01:40 2022 +0300 Add external dumps filtering during upgrade test.
diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING index 127dc30bbb..b8cce2bae1 100644 --- a/src/bin/pg_upgrade/TESTING +++ b/src/bin/pg_upgrade/TESTING @@ -56,3 +56,12 @@ Once the dump is created, it can be repeatedly used with $olddump and the dump out of the new database and the comparison of the dumps between the old and new databases. The contents of the dumps can also be manually compared. + +You can use additional dump filtering. To do this, you need to define the +'filter' environment variable and specify the path to the file with +filtering rules in it. Here is an example contents of such a file: +# examples: +# Remove all CREATE POLICY statements +s/^CREATE\sPOLICY.*//mgx +# Replace REFRESH with DROP for materialized views +s/^REFRESH\s(MATERIALIZED\sVIEW)/DROP $1/mgx diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl index 4cc1469306..e094fd08c3 100644 --- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl +++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl @@ -45,6 +45,30 @@ sub filter_dump # Remove empty lines. $dump_contents =~ s/^\n//mgx; + if (defined($ENV{filter})) + { + # Use the external filter + my $ext_filter_file = $ENV{filter}; + die "no file with external filter found!"
unless -e $ext_filter_file; + + open my $ext_filter_handle, '<', $ext_filter_file; + chomp(my @ext_filter_lines = <$ext_filter_handle>); + close $ext_filter_handle; + + foreach (@ext_filter_lines) + { + # omit lines with comments + if (substr($_, 0, 1) eq '#') + { + next; + } + + # apply lines with filters + my $filter = "\$dump_contents =~ $_"; + eval $filter; + } + } + my $dump_file_filtered = "${dump_file}_filtered"; open(my $dh, '>', $dump_file_filtered) || die "opening $dump_file_filtered";
Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
On Sat, Dec 24, 2022 at 06:23:29PM +0530, Bharath Rupireddy wrote: > Thanks for the patch. I've made the above change as well as renamed > the test file name to be save_fpi.pl, everything else remains the same > as v11. Here's the v12 patch which LGTM. I'll mark it as RfC - > https://commitfest.postgresql.org/41/3628/. I have done a review of that, and here are my notes:
- The variable names were a bit inconsistent, so I have switched most of the new code to use "fullpage".
- The code was not able to handle the case of a target directory existing but empty, so I have added a wrapper on pg_check_dir().
- XLogRecordHasFPW() could be checked directly in the function saving the blocks. Still, there is no need for it as we apply the same checks again in the inner loop of the routine.
- The new test has been renamed.
- RestoreBlockImage() would report a failure and the code would just skip it and continue its work. This could point to a compression failure, for example, so like any other code path calling this routine I think we'd better do a pg_fatal() and fail hard.
- I did not understand why there is a reason to make this option conditional on the record prints or even the stats, so I have moved the FPW save routine into a separate code path. The other two could be silenced (or not) using --quiet for example, for the same result as v12 without impacting the usability of this feature.
- A few tweaks to the docs, the --help output, the comments and the tests.
- Indentation applied.
Being able to filter the blocks saved using start/end LSNs or just --relation is really cool, especially as the file names use the same order as what's needed for this option. Comments? -- Michael
From 66eedbb96858a4f6804509900b116d8a63b39925 Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 26 Dec 2022 16:28:00 +0900 Subject: [PATCH v13] Teach pg_waldump to extract FPIs from the WAL stream Extracts full-page images from the WAL stream into a given target directory.
These images are subject to the same filtering rules as normal display in pg_waldump, which means that you can isolate the full page writes to a target relation, among other things. Files are saved with the filename <lsn>.<tablespace>.<database>.<relation>.<block>_<fork>, with formatting to make things somewhat sortable; for instance:
-01C0.1663.1.6117.0_main
-01000150.1664.0.6115.0_main
-010001E0.1664.0.6114.0_main
-01000270.1663.1.6116.0_main
-01000300.1663.1.6113.0_main
-01000390.1663.1.6112.0_main
-01000420.1663.1.8903.0_main
-010004B0.1663.1.8902.0_main
-01000540.1663.1.6111.0_main
-010005D0.1663.1.6110.0_main
It's noteworthy that the raw block images do not have the current LSN stored with them in the WAL stream (as would be true for on-heap versions of the blocks), nor would the checksum be updated in them (though WAL itself has checksums, so there is some protection there). These images could be loaded/inspected via `pg_read_binary_file()` and used in the `pageinspect` suite of tools to perform detailed analysis on the pages in question, based on historical information, and may come in handy for forensics work.
--- src/bin/pg_waldump/meson.build| 1 + src/bin/pg_waldump/pg_waldump.c | 107 + src/bin/pg_waldump/t/002_save_fullpage.pl | 111 ++ doc/src/sgml/ref/pg_waldump.sgml | 66 + 4 files changed, 285 insertions(+) create mode 100644 src/bin/pg_waldump/t/002_save_fullpage.pl diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build index 3fa1b53e71..0428998350 100644 --- a/src/bin/pg_waldump/meson.build +++ b/src/bin/pg_waldump/meson.build @@ -31,6 +31,7 @@ tests += { 'tap': { 'tests': [ 't/001_basic.pl', + 't/002_save_fullpage.pl', ], }, } diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c index 9993378ca5..d90a142c68 100644 --- a/src/bin/pg_waldump/pg_waldump.c +++ b/src/bin/pg_waldump/pg_waldump.c @@ -23,9 +23,13 @@ #include "access/xlogrecord.h" #include "access/xlogstats.h" #include "common/fe_memutils.h" +#include "common/file_perm.h" +#include "common/file_utils.h" #include "common/logging.h" +#include "common/relpath.h" #include "getopt_long.h" #include "rmgrdesc.h" +#include "storage/bufpage.h" /* * NOTE: For any code change or issue fix here, it is highly recommended to @@ -70,6 +74,9 @@ typedef struct XLogDumpConfig bool filter_by_relation_block_enabled; ForkNumber filter_by_relation_forknum; bool filter_by_fpw; + + /* save options */ + char *save_fullpage_path; } XLogDumpConfig; @@ -112,6 +119,37 @@ verify_directory(const char *directory) return true; } +/* + * Create if necessary the directory storing the full-page images extracted + * from the WAL records read. + */ +static void +create_fullpage_directory(char *path) +{ + int ret; + + switch ((ret = pg_check_dir(path))) + { + case 0: + /* Does no