Re: Improve WALRead() to suck data directly from WAL buffers when possible

2022-12-25 Thread Dilip Kumar
On Fri, Dec 23, 2022 at 3:46 PM Bharath Rupireddy
 wrote:
>
> On Mon, Dec 12, 2022 at 8:27 AM Kyotaro Horiguchi
>  wrote:
> >
>
> Thanks for providing thoughts.
>
> > At Fri, 9 Dec 2022 14:33:39 +0530, Bharath Rupireddy 
> >  wrote in
> > > The patch introduces concurrent readers for the WAL buffers, so far
> > > only there are concurrent writers. In the patch, WALRead() takes just
> > > one lock (WALBufMappingLock) in shared mode to enable concurrent
> > > readers and does minimal things - checks if the requested WAL page is
> > > present in WAL buffers, if so, copies the page and releases the lock.
> > > I think taking just WALBufMappingLock is enough here as the concurrent
> > > writers depend on it to initialize and replace a page in WAL buffers.
> > >
> > > I'll add this to the next commitfest.
> > >
> > > Thoughts?
> >
> > This adds copying of the whole page (at least) at every WAL *record*
> > read,
>
> In the worst case yes, but that may not always be true. On a typical
> production server with decent write traffic, it happens that the
> callers of WALRead() read a full WAL page of size XLOG_BLCKSZ bytes or
> MAX_SEND_SIZE bytes.

I agree with this.

> > This patch copies the bleeding edge WAL page without recording the
> > (next) insertion point nor checking whether all in-progress insertion
> > behind the target LSN have finished. Thus the copied page may have
> > holes.  That being said, the sequential-reading nature and the fact
> > that WAL buffers are zero-initialized may make it work for recovery,
> > but I don't think this also works for replication.
>
> WALRead() callers are smart enough to take the flushed bytes only.
> Although they read the whole WAL page, they calculate the valid bytes.

Right
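The "valid bytes" logic the callers apply amounts to clamping the requested read to the flushed LSN. Here is a minimal sketch under simplified assumptions (plain 64-bit integers instead of XLogRecPtr, and a made-up helper name -- the real callers consult GetFlushRecPtr() or equivalent):

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t SimpleLsn;     /* stand-in for XLogRecPtr */

/*
 * Clamp a read request so it never goes past the flushed point.
 * Returns the number of bytes that are actually valid to read.
 */
static inline uint64_t
valid_read_bytes(SimpleLsn start, uint64_t requested, SimpleLsn flushed)
{
    if (start >= flushed)
        return 0;               /* nothing flushed at or after start */
    if (start + requested > flushed)
        return flushed - start; /* request straddles the flush point */
    return requested;           /* whole request is flushed */
}
```

A request entirely past the flush point yields zero valid bytes, and a request straddling it is truncated, which is why reading a whole (possibly not-yet-complete) page is safe for these callers.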

On first read the patch looks good, although it needs some more
thought on the 'XXX' comments in the patch.

Also, I do not like that XLogReadFromBuffers() uses 3 bools for
hit/partial hit/miss; instead of this we could use an enum or some
tristate variable, which I think would be cleaner.
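For illustration, the three booleans could collapse into a single result type along these lines (the enum and function names here are hypothetical, not taken from the patch):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for a WAL-buffer read attempt */
typedef enum WALReadFromBuffersResult
{
    WALREAD_BUFFERS_MISS,        /* page no longer in WAL buffers */
    WALREAD_BUFFERS_PARTIAL_HIT, /* only a prefix could be copied */
    WALREAD_BUFFERS_HIT          /* full request served from buffers */
} WALReadFromBuffersResult;

/* Classify by how many of the requested bytes were copied */
static inline WALReadFromBuffersResult
classify_buffer_read(size_t copied, size_t requested)
{
    if (copied == 0)
        return WALREAD_BUFFERS_MISS;
    if (copied < requested)
        return WALREAD_BUFFERS_PARTIAL_HIT;
    return WALREAD_BUFFERS_HIT;
}
```

A single enum makes the three outcomes mutually exclusive by construction, which three independent booleans do not.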

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Enable using llvm jitlink as an alternative llvm jit linker of old Rtdyld.

2022-12-25 Thread Andres Freund
Hi,

On 2022-11-23 21:13:04 +1100, Alex Fan wrote:
> > @@ -241,6 +246,40 @@ llvm_mutable_module(LLVMJitContext *context)
> > context->module = LLVMModuleCreateWithName("pg");
> > LLVMSetTarget(context->module, llvm_triple);
> > LLVMSetDataLayout(context->module, llvm_layout);
> > +#ifdef __riscv
> > +#if __riscv_xlen == 64
> > +#ifdef __riscv_float_abi_double
> > +   abiname = "lp64d";
> > +#elif defined(__riscv_float_abi_single)
> > +   abiname = "lp64f";
> > +#else
> > +   abiname = "lp64";
> > +#endif
> > +#elif __riscv_xlen == 32
> > +#ifdef __riscv_float_abi_double
> > +   abiname = "ilp32d";
> > +#elif defined(__riscv_float_abi_single)
> > +   abiname = "ilp32f";
> > +#else
> > +   abiname = "ilp32";
> > +#endif
> > +#else
> > +   elog(ERROR, "unsupported riscv xlen %d", __riscv_xlen);
> > +#endif
> > +   /*
> > +    * set this manually to avoid llvm defaulting to soft float and
> > +    * resulting in linker error: `can't link double-float modules
> > +    * with soft-float modules`
> > +    * we could set this for TargetMachine via MCOptions, but there
> > +    * is no C API for it
> > +    * ref:

I think this is something that should go into the llvm code, rather than
postgres.


> > @@ -820,16 +861,21 @@ llvm_session_initialize(void)
> > elog(DEBUG2, "LLVMJIT detected CPU \"%s\", with features \"%s\"",
> >  cpu, features);
> >
> > +#ifdef __riscv
> > +   reloc=LLVMRelocPIC;
> > +   codemodel=LLVMCodeModelMedium;
> > +#endif

Same.




> > +#ifdef USE_JITLINK
> > +/*
> > + * There is no public C API to create ObjectLinkingLayer for JITLINK,
> > create our own
> > + */
> > +DEFINE_SIMPLE_CONVERSION_FUNCTIONS(llvm::orc::ExecutionSession,
> > LLVMOrcExecutionSessionRef)
> > +DEFINE_SIMPLE_CONVERSION_FUNCTIONS(llvm::orc::ObjectLayer,
> > LLVMOrcObjectLayerRef)

I recommend proposing a patch for adding such an API to LLVM.



Greetings,

Andres Freund




Re: daitch_mokotoff module

2022-12-25 Thread Alvaro Herrera
Hello

On 2022-Dec-23, Dag Lem wrote:

> It seems to me like you're trying to use soundex coding for something it
> was never designed for.

I'm not trying to use it for anything, actually.  I'm just reading the
pages your patch links to, to try and understand how this algorithm can
be best implemented in Postgres.

So I got to this page
https://www.avotaynu.com/soundex.htm
which explains that Daitch figured that it would be best if a letter
that can have two possible encodings would be encoded in both ways:

> 5. If a combination of letters could have two possible sounds, then it
> is coded in both manners. For example, the letters ch can have a soft
> sound such as in Chicago or a hard sound as in Christmas.

which I understand as meaning that a single name returns two possible
encodings, which is why these three names
 Barca Barco Parco
have two possible encodings
 795000 and 794000
which is what your algorithm returns.

In fact, using the word Christmas we do get alternative codes for the first
letter (either 4 or 5), precisely as in Daitch's example:

=# select daitch_mokotoff('christmas');
 daitch_mokotoff 
─
 594364 494364
(1 fila)

and if we take out the ambiguous 'ch', we get a single one:

=# select daitch_mokotoff('ristmas');
 daitch_mokotoff 
─
 943640
(1 fila)

and if we add another 'ch', we get the codes for each possibility at each
position of the ambiguous 'ch':

=# select daitch_mokotoff('christmach');
   daitch_mokotoff   
─
 594365 594364 494365 494364
(1 fila)


So, yes, I'm proposing that we return those as array elements and that
@> is used to match them.

> Daitch-Mokotoff Soundex indexes alternative sounds for the same name,
> however if I understand correctly, you want to index names by single
> sounds, linking all alike sounding names to the same soundex code. I
> fail to see how that is useful - if you want to find matches for a name,
> you simply match against all indexed names. If you only consider one
> sound, you won't find all names that match.

Hmm, I think we're saying the same thing, but from opposite points of
view.  No, I want each name to return multiple codes, but that those
multiple codes can be treated as a multiple-value array of codes, rather
than as a single string of space-separated codes.
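Under that proposal, matching two names reduces to asking whether one name's code array contains (or overlaps) a code of the other -- the semantics that @> or && would give on a text[] column. A toy sketch of that rule outside SQL, using the codes from the christmas example above (the function name is made up):

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if any code in probe[] appears in codes[], else 0.
 * Models matching one name against the D-M code array of another,
 * as the text[] + array-operator proposal would do.
 */
static int
dm_codes_overlap(const char **codes, int ncodes,
                 const char **probe, int nprobe)
{
    for (int i = 0; i < nprobe; i++)
        for (int j = 0; j < ncodes; j++)
            if (strcmp(probe[i], codes[j]) == 0)
                return 1;
    return 0;
}
```

With arrays instead of a space-separated string, this check is exactly what a GIN index on the code column could accelerate.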

> In any case, as explained in the documentation, the implementation is
> intended to be a companion to Full Text Search, thus text is the natural
> representation for the soundex codes.

Hmm, I don't agree with this point.  The numbers are representations of
the strings, but they don't necessarily have to be strings themselves.


> BTW Vera 79 does not match Veras 794000, because they don't sound
> the same (up to the maximum soundex code length).

No, and maybe that's okay because they have different codes.  But they
are both similar, in Daitch-Mokotoff, to Borja, which has two codes,
79 and 794000.  (Any Spanish speaker will readily tell you that
neither Vera nor Veras are similar in any way to Borja, but D-M has
chosen to say that each of them matches one of Borjas' codes.  So they
*are* related, even though indirectly, and as a genealogist you *may* be
interested in getting a match for a person called Vera when looking for
relatives to a person called Veras.  And, as a Spanish speaker, that
would make a lot of sense to me.)


Now, it's true that I've chosen to use Spanish names for my silly little
experiment.  Maybe this isn't terribly useful as a practical example,
because this algorithm seems to have been designed for Jewish surnames and
perhaps not many (or not any) Jews had Spanish surnames.  I don't know;
I'm not a Jew myself (though Noah Gordon tells the tale of a Spanish Jew
called Josep Álvarez in his book "The Winemaker", so I guess it's not
impossible).  Anyway, I suspect if you repeat the experiment with names
of other origins, you'll find pretty much the same results apply there,
and that is the whole reason D-M returns multiple codes and not just
one.


Merry Christmas :-)

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Todo: Teach planner to evaluate multiple windows in the optimal order

2022-12-25 Thread Ankit Kumar Pandey

Hi,

While looking at one of the todo items for window functions, namely:

Teach planner to evaluate multiple windows in the optimal order.
Currently windows are always evaluated in the query-specified order.


From the threads, the relevant points are:

Point #1


In the above query Oracle 10g performs 2 sorts, DB2 and Sybase perform 3
sorts. We also perform 3.

and Point #2

Teach planner to decide which window to evaluate first based on costs.
Currently the first window in the query is evaluated first, there may 
be no
index to help sort the first window, but perhaps there are for other 
windows

in the query. This may allow an index scan instead of a seqscan -> sort.

Repro:

select pg_catalog.version();

                                              version
----------------------------------------------------------------------------------------------------
 PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 12.2.0-3ubuntu1) 12.2.0, 64-bit
(1 row)

create table empsalary(depname text, empno int, salary int);
insert into empsalary select substr(md5(random()::text), 0, 25),
generate_series(1,10), generate_series(1,12000);


explain SELECT depname, SUM(salary) OVER (ORDER BY salary), SUM(salary) 
OVER (ORDER BY empno) FROM empsalary ORDER BY salary;


                                      QUERY PLAN
--------------------------------------------------------------------------------------
 WindowAgg  (cost=289.47..324.48 rows=2001 width=49)
   ->  Sort  (cost=289.47..294.47 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=144.73..179.75 rows=2001 width=41)
               ->  Sort  (cost=144.73..149.73 rows=2001 width=33)
                     Sort Key: empno
                     ->  Seq Scan on empsalary  (cost=0.00..35.01 rows=2001 width=33)
(7 rows)

As can be seen, for case #1 the issue appears to be resolved: only 2
sorts are performed.


For #2, index col ordering is changed.

create index idx_emp on empsalary (empno);
explain SELECT depname, SUM(salary) OVER (ORDER BY salary), SUM(salary) 
OVER (ORDER BY empno) FROM empsalary ORDER BY salary;

                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 WindowAgg  (cost=204.03..239.04 rows=2001 width=49)
   ->  Sort  (cost=204.03..209.03 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=0.28..94.31 rows=2001 width=41)
               ->  Index Scan using idx_emp on empsalary  (cost=0.28..64.29 rows=2001 width=33)
(5 rows)

explain SELECT depname, SUM(salary) OVER (ORDER BY empno), SUM(salary) 
OVER (ORDER BY salary) FROM empsalary ORDER BY salary;

                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 WindowAgg  (cost=204.03..239.04 rows=2001 width=49)
   ->  Sort  (cost=204.03..209.03 rows=2001 width=41)
         Sort Key: salary
         ->  WindowAgg  (cost=0.28..94.31 rows=2001 width=41)
               ->  Index Scan using idx_emp on empsalary  (cost=0.28..64.29 rows=2001 width=33)
(5 rows)

In both cases an index scan is performed, which suggests this issue is
resolved as well.


Is this todo still relevant?

Further down the threads:


I do think the patch has probably left some low-hanging fruit on the
simpler end of the difficulty spectrum, namely when the window stuff
requires only one ordering that could be done either explicitly or
by an indexscan. That choice should ideally be done with a proper
cost comparison taking any LIMIT into account. I think right now
the LIMIT might not be accounted for, or might be considered even
when it shouldn't be because another sort is needed anyway.


I am not sure I understand this fully, but does it mean the proposed todo
(to use indexes) should be refined to: teach the planner to take the cost
model into account while deciding whether or not to use an index for
window functions? Meaning, not always go with the index route (modifying
point #2)?
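The point above about taking LIMIT into account is the planner's usual fractional-cost arithmetic: with a LIMIT, only part of a path's total cost is actually paid, so a low-startup index scan can beat an explicit sort whose total cost is lower. A simplified sketch of that comparison (illustrative only, not the actual planner code):

```c
#include <assert.h>

/*
 * Simplified fractional cost: the cost to fetch limit_rows out of
 * total_rows from a path with the given startup and total cost.
 * An explicit sort pays nearly all of its cost up front (high startup),
 * while an index scan can stop early once the LIMIT is satisfied.
 */
static double
fractional_cost(double startup, double total, double total_rows,
                double limit_rows)
{
    if (limit_rows >= total_rows || total_rows <= 0)
        return total;           /* no early stop possible */
    return startup + (total - startup) * (limit_rows / total_rows);
}
```

With LIMIT 10 of 1000 rows, an index scan costed (startup 0, total 100) pays about 1 unit, while a sort costed (startup 90, total 95) still pays about 90 -- which is why the choice should be cost-based rather than always preferring one route.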


--
Regards,
Ankit Kumar Pandey


[PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2022-12-25 Thread Ankit Kumar Pandey

Hi all,


This is a patch for the TODO item: Improve ability to display optimizer
analysis using OPTIMIZER_DEBUG.

As per the suggestion in the mailing list, which was to replace the current
mechanism of getting the optimizer log via the OPTIMIZER_DEBUG macro with
something more configurable (which doesn't require rebuilding postgres from
source code), this patch replaces OPTIMIZER_DEBUG by introducing a GUC
show_optimizer_log which can be turned on and off to display (or hide) the
previously generated log, sending it to the postmaster's log instead of
stdout.


Please check attached patch, any feedback is appreciated.

--
Regards,
Ankit Kumar Pandey
From 420f0ed81d233bf63a10104d1164726b138bce83 Mon Sep 17 00:00:00 2001
From: Ankit Kumar Pandey 
Date: Sun, 25 Dec 2022 21:26:32 +0530
Subject: [PATCH] Improve ability to display optimizer analysis using
 OPTIMIZER_DEBUG

Currently optimizer logs are enabled via a build flag. This patch adds a
GUC variable to enable or disable optimizer logs and thus does away with
the rebuild requirement.
---
 src/backend/nodes/print.c | 107 ++
 src/backend/optimizer/path/allpaths.c | 134 ++
 src/backend/optimizer/plan/planner.c  |  11 +-
 src/backend/utils/misc/guc_tables.c   |  11 ++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/nodes/print.h |   1 +
 src/include/optimizer/optimizer.h |   1 +
 src/include/optimizer/paths.h |   2 -
 8 files changed, 200 insertions(+), 68 deletions(-)

diff --git a/src/backend/nodes/print.c b/src/backend/nodes/print.c
index a5c44adc6c..c33db27142 100644
--- a/src/backend/nodes/print.c
+++ b/src/backend/nodes/print.c
@@ -418,6 +418,113 @@ print_expr(const Node *expr, const List *rtable)
 		printf("unknown expr");
 }
 
+
+
+/*
+ * format_expr
+ *	  stores an expression to given string for logging
+ */
+void
+format_expr(const Node *expr, const List *rtable, StringInfoData *str)
+{
+	if (expr == NULL)
+	{
+		appendStringInfo(str, "<>");
+		return;
+	}
+
+	if (IsA(expr, Var))
+	{
+		const Var  *var = (const Var *) expr;
+		char	   *relname,
+   *attname;
+
+		switch (var->varno)
+		{
+			case INNER_VAR:
+relname = "INNER";
+attname = "?";
+break;
+			case OUTER_VAR:
+relname = "OUTER";
+attname = "?";
+break;
+			case INDEX_VAR:
+relname = "INDEX";
+attname = "?";
+break;
+			default:
+{
+	RangeTblEntry *rte;
+
+	Assert(var->varno > 0 &&
+		   (int) var->varno <= list_length(rtable));
+	rte = rt_fetch(var->varno, rtable);
+	relname = rte->eref->aliasname;
+	attname = get_rte_attribute_name(rte, var->varattno);
+}
+break;
+		}
+		appendStringInfo(str, "%s.%s", relname, attname);
+	}
+	else if (IsA(expr, Const))
+	{
+		const Const *c = (const Const *) expr;
+		Oid			typoutput;
+		bool		typIsVarlena;
+		char	   *outputstr;
+
+		if (c->constisnull)
+		{
+			appendStringInfo(str, "NULL");
+			return;
+		}
+
+		getTypeOutputInfo(c->consttype,
+		  &typoutput, &typIsVarlena);
+
+		outputstr = OidOutputFunctionCall(typoutput, c->constvalue);
+		appendStringInfo(str, "%s", outputstr);
+		pfree(outputstr);
+	}
+	else if (IsA(expr, OpExpr))
+	{
+		const OpExpr *e = (const OpExpr *) expr;
+		char	   *opname;
+
+		opname = get_opname(e->opno);
+		if (list_length(e->args) > 1)
+		{
+			format_expr(get_leftop((const Expr *) e), rtable, str);
+			appendStringInfo(str, " %s ", ((opname != NULL) ? opname : "(invalid operator)"));
+			format_expr(get_rightop((const Expr *) e), rtable, str);
+		}
+		else
+		{
+			appendStringInfo(str, "%s ", ((opname != NULL) ? opname : "(invalid operator)"));
+			format_expr(get_leftop((const Expr *) e), rtable, str);
+		}
+	}
+	else if (IsA(expr, FuncExpr))
+	{
+		const FuncExpr *e = (const FuncExpr *) expr;
+		char	   *funcname;
+		ListCell   *l;
+
+		funcname = get_func_name(e->funcid);
+		appendStringInfo(str, "%s(", ((funcname != NULL) ? funcname : "(invalid function)"));
+		foreach(l, e->args)
+		{
+			format_expr(lfirst(l), rtable, str);
+			if (lnext(e->args, l))
+appendStringInfo(str, ",");
+		}
+		appendStringInfo(str, ")");
+	}
+	else
+		appendStringInfo(str, "unknown expr");
+}
+
 /*
  * print_pathkeys -
  *	  pathkeys list of PathKeys
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0cfa3a1d6c..38cbc365b3 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -28,9 +28,7 @@
 #include "nodes/makefuncs.h"
 #include "nodes/nodeFuncs.h"
 #include "nodes/supportnodes.h"
-#ifdef OPTIMIZER_DEBUG
 #include "nodes/print.h"
-#endif
 #include "optimizer/appendinfo.h"
 #include "optimizer/clauses.h"
 #include "optimizer/cost.h"
@@ -65,6 +63,7 @@ bool		enable_geqo = false;	/* just in case GUC doesn't set it */
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
+bool		show_optimizer_log = false;
 
 /* Hook for plugins to get contro

Re: Error-safe user functions

2022-12-25 Thread Tom Lane
Robert Haas  writes:
> On Fri, Dec 16, 2022 at 1:31 PM Tom Lane  wrote:
>> The reg* functions probably need a unified plan as to how far
>> down we want to push non-error behavior.

> I would be in favor of an aggressive approach.

Here's a proposed patch for converting regprocin and friends
to soft error reporting.  I'll say at the outset that it's an
engineering compromise, and it may be worth going further in
future.  But I doubt it's worth doing more than this for v16,
because the next steps would be pretty invasive.

I've converted all the errors thrown directly within regproc.c,
and also converted parseTypeString, typeStringToTypeName, and
stringToQualifiedNameList to report their own errors softly.
This affected some outside callers, but not so many of them
that I think it's worth inventing compatibility wrappers.

I dealt with lookup failures by just changing the input functions
to call the respective lookup functions with missing_ok = true,
and then throw their own error softly on failure.

Also, I've changed to_regproc() and friends to return NULL
in exactly the same cases that are now soft errors for the
input functions.  Previously they were a bit inconsistent
about what triggered hard errors vs. returning NULL.
(Perhaps we should go further than this, and convert all these
functions to just be DirectInputFunctionCallSafe wrappers
around the corresponding input functions?  That would save
some duplicative code, but I've not done it here.)

What's not fixed here:

1. As previously discussed, parse errors in type names are
thrown by the main grammar, so getting those to not be
hard errors seems like too big a lift for today.

2. Errors about invalid type modifiers (reported by
typenameTypeMod or type-specific typmodin routines) are not
trapped either.  Fixing this would require extending the
soft-error conventions to typmodin routines, which maybe will
be worth doing someday but it seems pretty far down the
priority list.  Specifying a typmod is surely not main-line
usage for regtypein.

3. Throwing our own error has the demerit that it might be
different from what the underlying lookup function would have
reported.  This is visible in some changes in existing
regression test cases, such as

-ERROR:  schema "ng_catalog" does not exist
+ERROR:  relation "ng_catalog.pg_class" does not exist

This isn't wrong, exactly, but the loss of specificity is
a bit annoying.

4. This still fails to trap errors about "too many dotted names"
and "cross-database references are not implemented", which are
thrown in DeconstructQualifiedName, LookupTypeName,
RangeVarGetRelid, and maybe some other places.

5. We also don't trap errors about "the schema exists but
you don't have USAGE permission to do a lookup in it",
because LookupExplicitNamespace still throws that even
when passed missing_ok = true.

The obvious way to fix #3,#4,#5 is to change pretty much all
of the catalog lookup infrastructure to deal in escontext
arguments instead of "missing_ok" booleans.  That might be
worth doing --- it'd have benefits beyond the immediate
problem, I think --- but I feel it's a bigger lift than we
want to undertake for v16.  It'd be better to spend the time
we have left for v16 on building features that use soft error
reporting than on refining corner cases in the reg* functions.

So I think we should stop more or less here, possibly after
changing the to_regfoo functions to be simple wrappers
around the soft input functions.

regards, tom lane

diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index 109bdfb33f..a1df8b1ddc 100644
--- a/src/backend/catalog/objectaddress.c
+++ b/src/backend/catalog/objectaddress.c
@@ -2182,7 +2182,7 @@ pg_get_object_address(PG_FUNCTION_ARGS)
 			ereport(ERROR,
 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 	 errmsg("name or argument lists may not contain nulls")));
-		typename = typeStringToTypeName(TextDatumGetCString(elems[0]));
+		typename = typeStringToTypeName(TextDatumGetCString(elems[0]), NULL);
 	}
 	else if (type == OBJECT_LARGEOBJECT)
 	{
@@ -2238,7 +2238,8 @@ pg_get_object_address(PG_FUNCTION_ARGS)
 		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 		 errmsg("name or argument lists may not contain nulls")));
 			args = lappend(args,
-		   typeStringToTypeName(TextDatumGetCString(elems[i])));
+		   typeStringToTypeName(TextDatumGetCString(elems[i]),
+NULL));
 		}
 	}
 	else
diff --git a/src/backend/parser/parse_type.c b/src/backend/parser/parse_type.c
index f7ad689459..8f3850aa4e 100644
--- a/src/backend/parser/parse_type.c
+++ b/src/backend/parser/parse_type.c
@@ -727,10 +727,15 @@ pts_error_callback(void *arg)
  * Given a string that is supposed to be a SQL-compatible type declaration,
  * such as "int4" or "integer" or "character varying(32)", parse
  * the string and return the result as a TypeName.
- * If the string cannot be parsed as a type, an error is raised.
+ *
+ * If the string ca

Re: [RFC] Add jit deform_counter

2022-12-25 Thread Pavel Stehule
Hi


ne 11. 12. 2022 v 5:44 odesílatel Pavel Stehule 
napsal:

> Hi
>
> ne 11. 12. 2022 v 1:14 odesílatel Ian Lawrence Barwick 
> napsal:
>
>> 2022年6月12日(日) 18:14 Dmitry Dolgov <9erthali...@gmail.com>:
>> >
>> > Hi,
>> >
>> > I've noticed that JIT performance counter generation_counter seems to
>> include
>> > actions, relevant for both jit_expressions and jit_tuple_deforming
>> options. It
>> > means one can't directly see what is the influence of
>> jit_tuple_deforming
>> > alone, which would be helpful when adjusting JIT options. To make it
>> better a
>> > new counter can be introduced, does it make sense?
>>
>> Hi Pavel
>>
>> I see you are added as reviewer in the CF app; have you been able to take
>> a look
>> at this?
>>
>
> I hope so yes
>

there are some problems with the stability of the regression tests:

http://cfbot.cputube.org/dmitry-dolgov.html

Regards

Pavel


> Regards
>
> Pavel
>
>
>>
>> Regards
>>
>> Ian Barwick
>>
>


Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2022-12-25 Thread Tom Lane
Ankit Kumar Pandey  writes:
> As per the suggestion in the mailing list, which was to replace the current
> mechanism of getting the optimizer log via the OPTIMIZER_DEBUG macro with
> something more configurable (which doesn't require rebuilding postgres from
> source code), this patch replaces OPTIMIZER_DEBUG by introducing a GUC
> show_optimizer_log which can be turned on and off to display (or hide) the
> previously generated log, sending it to the postmaster's log instead of
> stdout.

The problem with OPTIMIZER_DEBUG is that it's useless.  I've
been hacking on the PG planner for well more than twenty years,
and I do not think I've ever made any use of that code ---
certainly not since the turn of the century or so.
Making it GUC-accessible isn't going to make it more useful.

There certainly could be value in having some better trace
of what the planner is doing, but I honestly don't have much
of an idea of what that would look like.  debug_print_rel
isn't that, however, because it won't show you anything about
paths that were considered and immediately rejected by
add_path (or never got to add_path at all, because of the
approximate-initial-cost filters in joinpath.c).  We've
sometimes speculated about logging every path submitted to
add_path, but that would likely be too verbose to be very
helpful.

Also, because OPTIMIZER_DEBUG is such a backwater, it's
never gotten much polishing for usability.  There are large
gaps in what it can deal with (print_expr is pretty much
a joke for example), yet also lots of redundancy in the
output ... and dumping stuff to stdout isn't terribly helpful
to the average user in the first place.  These days people
would probably also wish that the output could be machine-
readable in some way (JSON-formatted, perhaps).

So I think this whole area would need some pretty serious
rethinking and attention to detail before we should consider
claiming that it's a useful feature.

regards, tom lane




Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG

2022-12-25 Thread Ankit Kumar Pandey



On 25/12/22 23:54, Tom Lane wrote:


These days people
would probably also wish that the output could be machine-
readable in some way (JSON-formatted, perhaps).


Perhaps the switch to enable the log could take values like on (standard
logging style), json, xml etc., and print the output in the requested
format?



  however, because it won't show you anything about
paths that were considered and immediately rejected by
add_path (or never got to add_path at all, because of the
approximate-initial-cost filters in joinpath.c).  We've
sometimes speculated about logging every path submitted to
add_path, but that would likely be too verbose to be very
helpful.

Maybe we could add a verbose option too, and this could be one of its
outputs. Is there any more relevant information you can think of that
could be included here?


Also, inputs from other hackers are welcome here.


--
Regards,
Ankit Kumar Pandey





Re: Error-safe user functions

2022-12-25 Thread Tom Lane
I got annoyed by the fact that types cid, xid, xid8 don't throw
error even for obvious garbage, because they just believe the
result of strtoul or strtoull without any checking.  That was
probably up to project standards when cidin and xidin were
written; but surely it's not anymore, especially when we can
piggyback on work already done for type oid.

Anybody have an objection to the following?  One note is that
because we already had test cases checking that xid would
accept hex input, I made the common subroutines use "0" not
"10" for strtoul's last argument, meaning that oid will accept
hex now too.
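The hex acceptance falls straight out of strtoul()'s base argument: base 0 honors the usual C prefixes, while base 10 stops parsing at the 'x'. A quick illustration (the helper name is made up; the whole-string check roughly mirrors what the input functions do):

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse with the given base; set *ok to whether the whole string was
 * consumed, and return the parsed value.
 */
static unsigned long
parse_ul(const char *s, int base, int *ok)
{
    char *end;
    unsigned long v = strtoul(s, &end, base);

    *ok = (end != s && *end == '\0');
    return v;
}
```

With base 0, "0x2A" parses fully to 42; with base 10, strtoul consumes only the leading "0" and the trailing garbage check fails.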

regards, tom lane

diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 7cded73e6e..c67a79344a 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -477,6 +477,180 @@ invalid_syntax:
 	"bigint", s)));
 }
 
+/*
+ * Convert input string to an unsigned 32 bit integer.
+ *
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * If endloc isn't NULL, store a pointer to the rest of the string there,
+ * so that caller can parse the rest.  Otherwise, it's an error if anything
+ * but whitespace follows.
+ *
+ * typname is what is reported in error messages.
+ *
+ * If escontext points to an ErrorSaveContext node, that is filled instead
+ * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
+ * to detect errors.
+ */
+uint32
+uint32in_subr(const char *s, char **endloc,
+			  const char *typname, Node *escontext)
+{
+	uint32		result;
+	unsigned long cvt;
+	char	   *endptr;
+
+	/* Ensure that empty-input is handled consistently across platforms */
+	if (*s == '\0')
+		ereturn(escontext, 0,
+(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input syntax for type %s: \"%s\"",
+		typname, s)));
+
+	errno = 0;
+	cvt = strtoul(s, &endptr, 0);
+
+	/*
+	 * strtoul() normally only sets ERANGE.  On some systems it also may set
+	 * EINVAL, which simply means it couldn't parse the input string.  This is
+	 * handled by the second "if", keeping behavior consistent across platforms.
+	 */
+	if (errno && errno != ERANGE && errno != EINVAL)
+		ereturn(escontext, 0,
+(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input syntax for type %s: \"%s\"",
+		typname, s)));
+
+	if (endptr == s)
+		ereturn(escontext, 0,
+(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input syntax for type %s: \"%s\"",
+		typname, s)));
+
+	if (errno == ERANGE)
+		ereturn(escontext, 0,
+(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("value \"%s\" is out of range for type %s",
+		s, typname)));
+
+	if (endloc)
+	{
+		/* caller wants to deal with rest of string */
+		*endloc = endptr;
+	}
+	else
+	{
+		/* allow only whitespace after number */
+		while (*endptr && isspace((unsigned char) *endptr))
+			endptr++;
+		if (*endptr)
+			ereturn(escontext, 0,
+	(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+	 errmsg("invalid input syntax for type %s: \"%s\"",
+			typname, s)));
+	}
+
+	result = (uint32) cvt;
+
+	/*
+	 * Cope with possibility that unsigned long is wider than uint32, in which
+	 * case strtoul will not raise an error for some values that are out of
+	 * the range of uint32.
+	 *
+	 * For backwards compatibility, we want to accept inputs that are given
+	 * with a minus sign, so allow the input value if it matches after either
+	 * signed or unsigned extension to long.
+	 *
+	 * To ensure consistent results on 32-bit and 64-bit platforms, make sure
+	 * the error message is the same as if strtoul() had returned ERANGE.
+	 */
+#if PG_UINT32_MAX != ULONG_MAX
+	if (cvt != (unsigned long) result &&
+		cvt != (unsigned long) ((int) result))
+		ereturn(escontext, 0,
+(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("value \"%s\" is out of range for type %s",
+		s, typname)));
+#endif
+
+	return result;
+}
+
+/*
+ * Convert input string to an unsigned 64 bit integer.
+ *
+ * Allows any number of leading or trailing whitespace characters.
+ *
+ * If endloc isn't NULL, store a pointer to the rest of the string there,
+ * so that caller can parse the rest.  Otherwise, it's an error if anything
+ * but whitespace follows.
+ *
+ * typname is what is reported in error messages.
+ *
+ * If escontext points to an ErrorSaveContext node, that is filled instead
+ * of throwing an error; the caller must check SOFT_ERROR_OCCURRED()
+ * to detect errors.
+ */
+uint64
+uint64in_subr(const char *s, char **endloc,
+			  const char *typname, Node *escontext)
+{
+	uint64		result;
+	char	   *endptr;
+
+	/* Ensure that empty-input is handled consistently across platforms */
+	if (*s == '\0')
+		ereturn(escontext, 0,
+(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input syntax for type %s: \"%s\"",
+		typname, s)));
+
+	errno = 0;
+	result = strtou64(s, &endptr, 0);
+
+	/*
+	 * strtoul[l]() normally only sets 

Re: pg_upgrade: Make testing different transfer modes easier

2022-12-25 Thread Daniel Gustafsson
> On 19 Dec 2022, at 01:39, Shinoda, Noriyoshi (PN Japan FSIP) 
>  wrote:

> With the addition of --copy option, pg_upgrade now has three possible 
> transfer mode options. Currently, an error does not occur even if multiple 
> transfer modes are specified. For example, we can also run "pg_upgrade --link 
> --copy --clone" command. As discussed in Horiguchi-san's previous email, 
> options like "--mode=link|copy|clone" can prevent this problem.
> The attached patch uses the current implementation and performs a minimum 
> check to prevent multiple transfer modes from being specified.

We typically allow multiple invocations of the same parameter with a
last-one-wins strategy, and only error out when competing *different*
parameters are present.  A --mode= parameter can still be added as
syntactic sugar; multiple-choice parameters are not an uncommon pattern
in postgres utilities (pg_dump/restore and pg_basebackup are ones that come to
mind).
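A sketch of that convention -- repeating the same flag is harmless, and an error is raised only when two *different* transfer-mode flags compete (names and structure are illustrative, not pg_upgrade's actual code):

```c
#include <assert.h>
#include <string.h>

enum xfer_mode { MODE_UNSET, MODE_COPY, MODE_LINK, MODE_CLONE };

/*
 * Scan argv-style flags.  Repeating the same mode is fine (last one
 * wins trivially); mixing different modes returns -1 as an error.
 */
static int
resolve_mode(const char **args, int nargs)
{
    enum xfer_mode mode = MODE_UNSET;

    for (int i = 0; i < nargs; i++)
    {
        enum xfer_mode m;

        if (strcmp(args[i], "--copy") == 0)
            m = MODE_COPY;
        else if (strcmp(args[i], "--link") == 0)
            m = MODE_LINK;
        else if (strcmp(args[i], "--clone") == 0)
            m = MODE_CLONE;
        else
            continue;           /* not a transfer-mode flag */

        if (mode != MODE_UNSET && mode != m)
            return -1;          /* competing different modes */
        mode = m;
    }
    return (int) mode;
}
```

So "--link --link" resolves to link mode, while "--link --copy" is rejected, matching the convention described above.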

--
Daniel Gustafsson   https://vmware.com/





Re: [BUG] pg_upgrade test fails from older versions.

2022-12-25 Thread Michael Paquier
On Fri, Dec 23, 2022 at 10:39:25AM -0600, Justin Pryzby wrote:
> LGTM.  Thanks.

Done as of d3c0cc4.
--
Michael


signature.asc
Description: PGP signature


Re: [BUG] pg_upgrade test fails from older versions.

2022-12-25 Thread Michael Paquier
On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote:
> Sorry, didn't get to see the last letter!

No worries, the result is the same :)

I was looking at 0002 to add a callback to provide custom filtering
rules.

+   my @ext_filter = split('\/', $_);
Are you sure that enforcing a separation with a slash is a good idea?
What if the filters include directory paths or characters that are
escaped, for example?

Rather than introducing a filter.regex, I would tend to just document
that in the README with a small example.  I have been considering a
few alternatives while making this useful in most cases, still my mind
always comes back to the simplest thing, which would be to just read each
line of the file, chomp it and apply the pattern to the log file.
--
Michael




Re: [BUG] pg_upgrade test fails from older versions.

2022-12-25 Thread Anton A. Melnikov

Hello!

On 23.12.2022 05:42, Michael Paquier wrote:

On Thu, Dec 22, 2022 at 09:59:18AM +0300, Anton A. Melnikov wrote:

2) v2-0002-Additional-dumps-filtering.patch


+   # Replace specific privileges with ALL
+   $dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx;
This should not be in 0002, I guess..


Made a separate patch for it: v3-0001-Fix-dumps-filtering.patch

On 26.12.2022 05:52, Michael Paquier wrote:

On Fri, Dec 23, 2022 at 12:43:00PM +0300, Anton A. Melnikov wrote:
I was looking at 0002 to add a callback to provide custom filtering
rules.

+   my @ext_filter = split('\/', $_);
Are you sure that enforcing a separation with a slash is a good idea?
What if the filters include directory paths or characters that are
escaped, for example?

Rather than introducing a filter.regex, I would tend to just document
that in the README with a small example.  I have been considering a
few alternatives while making this useful in most cases, still my mind
always comes back to the simplest thing, which would be to just read each
line of the file, chomp it and apply the pattern to the log file.


Thanks for your attention!
Yes, indeed. That will be much simpler.
Made it in the v3-0002-Add-external-dumps-filtering.patch

With the best wishes,

--
Anton A. Melnikov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 602627daf32a6d33ae96ce8fbae918c9f00f0633
Author: Anton A. Melnikov 
Date:   Mon Dec 26 06:21:32 2022 +0300

Fix dumps filtering in pg_upgrade test from older versions. Replace
specific privileges in dumps with ALL due to b5d63824
and 60684dd8.

diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 4cc1469306..d23c4b2253 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -44,6 +44,8 @@ sub filter_dump
 	$dump_contents =~ s/^\-\-.*//mgx;
 	# Remove empty lines.
 	$dump_contents =~ s/^\n//mgx;
+	# Replace specific privileges with ALL
+	$dump_contents =~ s/^(GRANT\s|REVOKE\s)(\S*)\s/$1ALL /mgx;
 
 	my $dump_file_filtered = "${dump_file}_filtered";
 	open(my $dh, '>', $dump_file_filtered)
commit 3870fb8263dc7e3fa240753b4eb89945b8d229be
Author: Anton A. Melnikov 
Date:   Thu Dec 22 08:01:40 2022 +0300

Add external dump filtering during the upgrade test.

diff --git a/src/bin/pg_upgrade/TESTING b/src/bin/pg_upgrade/TESTING
index 127dc30bbb..b8cce2bae1 100644
--- a/src/bin/pg_upgrade/TESTING
+++ b/src/bin/pg_upgrade/TESTING
@@ -56,3 +56,12 @@ Once the dump is created, it can be repeatedly used with $olddump and
 the dump out of the new database and the comparison of the dumps between
 the old and new databases.  The contents of the dumps can also be manually
 compared.
+
+You can apply additional dump filtering. To do this, define the 'filter'
+environment variable with the path to a file containing filtering rules.
+Here is an example of the contents of such a file:
+# examples:
+# Remove all CREATE POLICY statements
+s/^CREATE\sPOLICY.*//mgx
+# Replace REFRESH with DROP for materialized views
+s/^REFRESH\s(MATERIALIZED\sVIEW)/DROP $1/mgx
diff --git a/src/bin/pg_upgrade/t/002_pg_upgrade.pl b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
index 4cc1469306..e094fd08c3 100644
--- a/src/bin/pg_upgrade/t/002_pg_upgrade.pl
+++ b/src/bin/pg_upgrade/t/002_pg_upgrade.pl
@@ -45,6 +45,30 @@ sub filter_dump
 	# Remove empty lines.
 	$dump_contents =~ s/^\n//mgx;
 
+	if (defined($ENV{filter}))
+	{
+		# Use the external filter
+		my $ext_filter_file = $ENV{filter};
+		die "no file with external filter found!" unless -e $ext_filter_file;
+
+		open my $ext_filter_handle, '<', $ext_filter_file;
+		chomp(my @ext_filter_lines = <$ext_filter_handle>);
+		close $ext_filter_handle;
+
+		foreach (@ext_filter_lines)
+		{
+			# omit lines with comments
+			if (substr($_, 0, 1) eq '#')
+			{
+next;
+			}
+
+			# apply lines with filters
+			my $filter = "\$dump_contents =~ $_";
+			eval $filter;
+		}
+	}
+
 	my $dump_file_filtered = "${dump_file}_filtered";
 	open(my $dh, '>', $dump_file_filtered)
 	  || die "opening $dump_file_filtered";


Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL

2022-12-25 Thread Michael Paquier
On Sat, Dec 24, 2022 at 06:23:29PM +0530, Bharath Rupireddy wrote:
> Thanks for the patch. I've made the above change as well as renamed
> the test file name to be save_fpi.pl, everything else remains the same
> as v11. Here's the v12 patch which LGTM. I'll mark it as RfC -
> https://commitfest.postgresql.org/41/3628/.

I have done a review of that, and here are my notes:
- The variable names were a bit inconsistent, so I have switched most
of the new code to use "fullpage".
- The code was not able to handle the case of a target directory
existing but empty, so I have added a wrapper around pg_check_dir().
- XLogRecordHasFPW() could be checked directly in the function saving
the blocks.  Still, there is no need for it as we apply the same
checks again in the inner loop of the routine.
- The new test has been renamed.
- RestoreBlockImage() would report a failure and the code would just
skip it and continue its work.  This could point to a compression
failure for example, so like any code paths calling this routine I
think that we'd better do a pg_fatal() and fail hard.
- I did not understand why there is a reason to make this option
conditional on the record prints or even the stats, so I have moved
the FPW save routine into a separate code path.  The other two could
be silenced (or not) using --quiet for example, for the same result as
v12 without impacting the usability of this feature.
- Few tweaks to the docs, the --help output, the comments and the
tests.
- Indentation applied.

Being able to filter the blocks saved using start/end LSNs or just
--relation is really cool, especially as the file names use the same
order as what's needed for this option.

Comments?
--
Michael
From 66eedbb96858a4f6804509900b116d8a63b39925 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 26 Dec 2022 16:28:00 +0900
Subject: [PATCH v13] Teach pg_waldump to extract FPIs from the WAL stream

Extracts full-page images from the WAL stream into a given target directory.
These images are subject to the same filtering rules as normal display in
pg_waldump, which means that you can isolate the full page writes to a
target relation, among other things.

Files are saved with the filename: <lsn>.<tblspc>.<db>.<rel>.<blk>_<fork>, with
formatting to make things somewhat sortable; for instance:

-01C0.1663.1.6117.0_main
-01000150.1664.0.6115.0_main
-010001E0.1664.0.6114.0_main
-01000270.1663.1.6116.0_main
-01000300.1663.1.6113.0_main
-01000390.1663.1.6112.0_main
-01000420.1663.1.8903.0_main
-010004B0.1663.1.8902.0_main
-01000540.1663.1.6111.0_main
-010005D0.1663.1.6110.0_main

It's noteworthy that the raw block images do not have the current LSN
stored with them in the WAL stream (as would be true for on-heap
versions of the blocks), nor would the checksum be updated in
them (though WAL itself has checksums, so there is some protection
there).

These images could be loaded/inspected via `pg_read_binary_file()` and
used in the `pageinspect` suite of tools to perform detailed analysis on
the pages in question, based on historical information, and may come in
handy for forensics work.
---
 src/bin/pg_waldump/meson.build|   1 +
 src/bin/pg_waldump/pg_waldump.c   | 107 +
 src/bin/pg_waldump/t/002_save_fullpage.pl | 111 ++
 doc/src/sgml/ref/pg_waldump.sgml  |  66 +
 4 files changed, 285 insertions(+)
 create mode 100644 src/bin/pg_waldump/t/002_save_fullpage.pl

diff --git a/src/bin/pg_waldump/meson.build b/src/bin/pg_waldump/meson.build
index 3fa1b53e71..0428998350 100644
--- a/src/bin/pg_waldump/meson.build
+++ b/src/bin/pg_waldump/meson.build
@@ -31,6 +31,7 @@ tests += {
   'tap': {
 'tests': [
   't/001_basic.pl',
+  't/002_save_fullpage.pl',
 ],
   },
 }
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index 9993378ca5..d90a142c68 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -23,9 +23,13 @@
 #include "access/xlogrecord.h"
 #include "access/xlogstats.h"
 #include "common/fe_memutils.h"
+#include "common/file_perm.h"
+#include "common/file_utils.h"
 #include "common/logging.h"
+#include "common/relpath.h"
 #include "getopt_long.h"
 #include "rmgrdesc.h"
+#include "storage/bufpage.h"
 
 /*
  * NOTE: For any code change or issue fix here, it is highly recommended to
@@ -70,6 +74,9 @@ typedef struct XLogDumpConfig
 	bool		filter_by_relation_block_enabled;
 	ForkNumber	filter_by_relation_forknum;
 	bool		filter_by_fpw;
+
+	/* save options */
+	char	   *save_fullpage_path;
 } XLogDumpConfig;
 
 
@@ -112,6 +119,37 @@ verify_directory(const char *directory)
 	return true;
 }
 
+/*
+ * Create if necessary the directory storing the full-page images extracted
+ * from the WAL records read.
+ */
+static void
+create_fullpage_directory(char *path)
+{
+	int			ret;
+
+	switch ((ret = pg_check_dir(path)))
+	{
+		case 0:
+			/* Does no