Re: Allow escape in application_name

2021-09-09 Thread Fujii Masao




On 2021/09/08 21:32, kuroda.hay...@fujitsu.com wrote:

Dear Horiguchi-san,

Thank you for reviewing! I attached the fixed version.


Thanks for updating the patch!

+   for (i = n - 1; i >= 0; i--)
+   {
+   if (strcmp(keywords[i], "application_name") == 0)
+   {
+   parse_pgfdw_appname(&buf, values[i]);
+   values[i] = buf.data;
+   break;
+   }

postgres_fdw gets out of the loop after processing appname even
when buf.data is '\0'. Is this expected behavior? Because of this,
when postgres_fdw.application_name = '%b', unprocessed appname
of the server object is used.


+CREATE FUNCTION for_escapes() RETURNS bool AS $$
+DECLARE
+appname text;
+c bool;
+BEGIN
+SHOW application_name INTO appname;
+EXECUTE 'SELECT COUNT(*) FROM pg_stat_activity WHERE application_name 
LIKE '''

Could you tell me why the statement checking
application_name should be wrapped in a function?
Instead, we can just execute something like the following?

SELECT COUNT(*) FROM pg_stat_activity WHERE application_name = 
current_setting('application_name') || current_user || current_database() || 
pg_backend_pid() || '%';


+   char *endptr = NULL;
+   padding = (int)strtol(p, &endptr, 10);

strtol() seems to work differently from process_log_prefix_padding(),
for example, when the input string is "%-p".
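
To make the difference concrete, here is a tiny standalone sketch (not code from the patch) of what strtol() does with the padding text of "%-p": it consumes no digits, returns 0, and leaves endptr pointing back at the '-', whereas process_log_prefix_padding() in elog.c (as far as I can see) treats a leading '-' as part of the padding specification.

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    /* the text right after '%' when the format string contains "%-p" */
    const char *p = "-p";
    char       *endptr = NULL;
    long        padding = strtol(p, &endptr, 10);

    /*
     * No digits were found, so strtol() returns 0 and sets endptr back to
     * p; the '-' is not consumed and would later be mistaken for the
     * escape character itself.
     */
    printf("padding = %ld, chars consumed = %d, next char = '%c'\n",
           padding, (int) (endptr - p), *endptr);
    return 0;
}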


+   case 'a':
+   {
+   const char *appname = application_name;

When log_line_prefix() processes the "%a", "%u" or "%d" characters in
log_line_prefix, it checks whether MyProcPort is NULL or not.
Likewise, shouldn't parse_pgfdw_appname() do the same thing?
For now it's OK not to check that, because only a process that has
MyProcPort can use postgres_fdw. But in the future a process
without it may use postgres_fdw, for example the 2PC resolver
process that Sawada-san is proposing to add to support automatic
2PC among multiple remote servers.
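For illustration, a minimal sketch of the kind of guard being suggested, loosely modeled on log_line_prefix()'s handling of %a (the buf variable and the surrounding switch come from the patch, so the exact shape here is only indicative):

    case 'a':
        if (MyProcPort)
        {
            const char *appname = application_name;

            /* use a placeholder when the GUC is unset or empty */
            if (appname == NULL || *appname == '\0')
                appname = "[unknown]";
            appendStringInfoString(&buf, appname);
        }
        /* a process without a Port would simply append nothing here */
        break;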


+You can use escape sequences for application_name even 
if
+it is set as a connection or a user mapping option.
+Please refer the later section.

I was thinking that application_name cannot be set in a user mapping.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Remove Value node struct

2021-09-09 Thread Peter Eisentraut

On 08.09.21 04:04, Kyotaro Horiguchi wrote:

At Tue, 7 Sep 2021 11:22:24 +0200, Peter Eisentraut 
 wrote in

On 30.08.21 04:13, Kyotaro Horiguchi wrote:

+   else if (IsA(obj, Integer))
+   _outInteger(str, (Integer *) obj);
+   else if (IsA(obj, Float))
+   _outFloat(str, (Float *) obj);
I felt that the type names are a bit confusing, as they might be too
generic, or too close to the corresponding binary types.
-   Node   *arg;/* a (Value *) or a (TypeName *) */
+   Node   *arg;
Mmm. It's a bit of a pity that we lose the generic name for the value
nodes.


Not sure what you mean here.


The member arg loses the information on what kind of nodes are to be
stored there. Concretely, it just removes the comment "a (Value *) or a
(TypeName *)". If the (Value *) were expanded in a straightforward way, the
comment would be "an (Integer *), (Float *), (String *), (BitString *),
or (TypeName *)". I supposed that the member lost the comment because
it became too long.


Ok, I added the comment back in in a modified form.

The patches have been committed now.  Thanks.




Re: [PROPOSAL] new diagnostic items for the dynamic sql

2021-09-09 Thread Pavel Stehule
On Thu, 9 Sep 2021 at 8:23, Dinesh Chemuduru wrote:

>
>
> On Thu, 9 Sept 2021 at 11:07, Pavel Stehule 
> wrote:
>
>> Hi
>>
>> I tested the last patch, and I think I found unwanted behavior.
>>
>> The value of PG_SQL_TEXT is not empty only when the error is related to
>> the parser stage. When the error is raised in the query evaluation stage,
>> then the value is empty.
>> I think this is too confusing. PL/pgSQL is a high level language, and the
>> behaviour should be consistent independent of internal implementation. I am
>> afraid this feature requires much more work.
>>
>> postgres=# DO $$
>> DECLARE
>>   err_sql_stmt TEXT;
>>   err_sql_pos INT;
>> BEGIN
>>   EXECUTE 'SELECT 1/0';
>> EXCEPTION
>>   WHEN OTHERS THEN
>> GET STACKED DIAGNOSTICS
>>   err_sql_stmt = PG_SQL_TEXT,
>>   err_sql_pos = PG_ERROR_LOCATION;
>> RAISE NOTICE 'exception sql "%"', err_sql_stmt;
>> RAISE NOTICE 'exception sql position %', err_sql_pos;
>> END;
>> $$;
>> NOTICE:  exception sql ""
>> NOTICE:  exception sql position 0
>> DO
>>
>> For this case, the empty result is not acceptable in this language. It is
>> too confusing. The implemented behaviour is well described in regress
>> tests, but I don't think it is user (developer) friendly. The location
>> field is not important, and can be 0 some times. But query text should be
>> not empty in all possible cases related to any query evaluation. I think
>> this can be a nice and useful feature, but the behavior should be
>> consistent.
>>
>> Thanks for your time in evaluating this patch.
> Let me try to fix the suggested case (I tried to fix this case in the past,
> but this time I will try to spend more time on this), and will submit a new
> patch.
>

sure

Pavel


>
>
>> My second, more minor, objection to this patch is the complete lack of
>> formatting in the regression test.
>>
>> Regards
>>
>> Pavel
>>
>>


RE: Failed transaction statistics to measure the logical replication progress

2021-09-09 Thread osumi.takami...@fujitsu.com
Hi

On Thursday, September 2, 2021 11:23 AM I wrote:
> I've made a new patch to extend pg_stat_subscription as suggested in [1] to
> have columns xact_commit, xact_error and independent xact_abort mentioned
> in [2].
> Also, during discussion, we touched a topic if we should include data sizes 
> for
> each column above and concluded that it's better to have ones. Accordingly,
> I've implemented corresponding columns to show the data sizes as well.
I've updated my previous patch for the subscriber's stats.
The main change in this version is
the calculation of the byte counts that are exported by pg_stat_subscription.
I prepared a new function, equivalent to ReorderBufferChangeSize,
on the subscriber side to calculate the resource consumption,
because that is in line with the actual processing on the subscriber.

Other changes are minor and cosmetic.

Also, this patch is based on the v12 patch set of skip xid,
as described in my previous email.
Note that you need to use a past commit id if you want to apply the v12 set.
I'll update mine as well when the latest patch set v13 is shared on hackers.

Best Regards,
Takamichi Osumi
 

extend_subscription_stats_of_transaction_v03.patch
Description: extend_subscription_stats_of_transaction_v03.patch


Re: row filtering for logical replication

2021-09-09 Thread Amit Kapila
On Thu, Sep 9, 2021 at 11:43 AM Peter Smith  wrote:
>
> PSA my new incremental patch (v28-0002) that introduces row filter
> validation for the publish mode "delete". The validation requires that
> any columns referred to in the filter expression must also be part of
> REPLICA IDENTITY or PK.
>
> [This v28-0001 is identical to the most recently posted rebased base
> patch. It is included again here only so the cfbot will be happy]
>
> ~~
>
> A requirement for some filter validation like this has been mentioned
> several times in this thread [1][2][3][4][5].
>
> I also added some test code for various kinds of replica identity.
>
> A couple of existing tests had to be modified so they could continue
> to work  (e.g. changed publish = "insert" or REPLICA IDENTITY FULL)
>
> Feedback is welcome.
>
> ~~
>
> NOTE: This validation currently only checks when the filters are first
> created. Probably there are many other scenarios that need to be
> properly handled. What to do if something which impacts the existing
> filter is changed?
>
> e.g.
> - what if the user changes the publish parameter using ALTER
> PUBLICATION set (publish="delete") etc?
> - what if the user changes the replication identity?
> - what if the user changes the filter using ALTER PUBLICATION in a way
> that is no longer compatible with the necessary cols?
> - what if the user changes the table (e.g. removes a column referred
> to by a filter)?
> - what if the user changes a referred column name?
> - more...
>
> (None of those are addressed yet - thoughts?)
>

I think we need to remove the filter or the table from publication in
such cases. Now, one can think of just removing the condition related
to the column being removed/changed in some way but I think that won't
be appropriate because it would change the meaning of the filter. We
are discussing similar stuff in the column filter thread and we might
want to do the same for row filters as well. I would prefer to remove
the table in both cases as Rahila has proposed in the column filter
patch.

-- 
With Regards,
Amit Kapila.




Re: Proposal: More structured logging

2021-09-09 Thread Ronan Dunklau
On Wednesday, September 8, 2021, 11:51:31 CEST, Peter Eisentraut wrote:
> On 01.09.21 10:00, Ronan Dunklau wrote:
> > In-core it would open up the possibility to split log messages into
> > different fields, for example the different statistics reported in the
> > logs by VACUUM / ANALYZE VERBOSE and make it easier to consume the output
> > without having to parse the message. Parsing the message also means that
> > any tool parsing it needs to either be aware of the locale, or to force
> > the user to use a specific one.
> 
> I think those messages would themselves have substructure.  For example,
> the current message
> 
>  "I/O timings: read: %.3f ms, write: %.3f ms\n"
> 
> might be
> 
>  {"I/O timings": {"read": ..., "write": ...}}
> 
> and that in turn is already part of a larger message.
> 
> So just having a single level of tags wouldn't really work for this.

I agree having a nested structure may feel more natural, but I don't think it
would matter too much if we standardized on ini-style property names (i.e.,
iotimings.read, iotimings.write and so on).

Regards,

-- 
Ronan Dunklau






Re: Queries that should be canceled will get stuck on secure_write function

2021-09-09 Thread 蔡梦娟(玊于)


I changed the implementation for this problem:
a) if the cancel query interrupt comes from the db for some reason, such as a
recovery conflict, then handle it immediately, and the cancel request is treated
as a terminate request;
b) if the cancel query interrupt comes from the client, then ignore it as before.

In the attached patch, I also add a TAP test that generates a recovery conflict on
a standby while the backend process is stuck on a client write, and checks whether
it can handle the cancel query request due to the recovery conflict.

What do you think of it? I hope to get your reply.

Thanks & Best Regards




0001-Handle-cancel-interrupts-during-client-readwrite.patch
Description: Binary data


Re: [PATCH] Proof of concept for GUC improvements

2021-09-09 Thread Aleksander Alekseev
It looks like this patch rotted a little and needs to be rebased. Please see 
http://cfbot.cputube.org/

The new status of this patch is: Waiting on Author


Re: Asymmetric partition-wise JOIN

2021-09-09 Thread Aleksander Alekseev
It looks like this patch needs to be updated. According to 
http://cfbot.cputube.org/ it applies but doesn't pass any tests. Changing the 
status to save time for reviewers.

The new status of this patch is: Waiting on Author


Re: Regression in PG14 LookupFuncName

2021-09-09 Thread Sven Klemm
> It was intentional, because all internal callers of LookupFuncName only
> want to see functions.  See the last few messages in the referenced
> discussion thread:
>
> https://www.postgresql.org/message-id/flat/3742981.1621533210%40sss.pgh.pa.us

Thank you for the clarification.

--
Regards, Sven Klemm




Re: EXPLAIN(VERBOSE) to CTE with SEARCH BREADTH FIRST fails

2021-09-09 Thread Peter Eisentraut

On 07.09.21 20:31, Tom Lane wrote:

torikoshia  writes:

While working on [1], we found that EXPLAIN(VERBOSE) to CTE with SEARCH
BREADTH FIRST ends up ERROR.


Yeah.  It's failing here:

  * We're deparsing a Plan tree so we don't have a CTE
  * list.  But the only place we'd see a Var directly
  * referencing a CTE RTE is in a CteScan plan node, and we
  * can look into the subplan's tlist instead.

 if (!dpns->inner_plan)
 elog(ERROR, "failed to find plan for CTE %s",
  rte->eref->aliasname);

The problematic Var is *not* in a CteScan plan node; it's in a
WorkTableScan node.  It's not clear to me whether this is a bug
in the planner's handling of SEARCH BREADTH FIRST, or if the plan
is as-intended and ruleutils.c is failing to cope.


The search clause is resolved by the rewriter, so it's unlikely that the 
planner is doing something wrong.  Either the rewriting produces 
something incorrect (but then one might expect that the query results 
would be wrong), or the structures constructed by rewriting are not 
easily handled by ruleutils.c.


If we start from the example in the documentation:


"""
WITH RECURSIVE search_tree(id, link, data, depth) AS (
SELECT t.id, t.link, t.data, 0
FROM tree t
  UNION ALL
SELECT t.id, t.link, t.data, depth + 1
FROM tree t, search_tree st
WHERE t.id = st.link
)
SELECT * FROM search_tree ORDER BY depth;

To get a stable sort, add data columns as secondary sorting columns.
"""

In order to handle that part about the stable sort, the query 
constructed internally is something like


WITH RECURSIVE search_tree(id, link, data, seq) AS (
SELECT t.id, t.link, t.data, ROW(0, id, link)
FROM tree t
  UNION ALL
SELECT t.id, t.link, t.data, ROW(seq.depth + 1, id, link)
FROM tree t, search_tree st
WHERE t.id = st.link
)
SELECT * FROM search_tree ORDER BY seq;

The bit "seq.depth" isn't really valid when typed in like that, I think, 
but of course internally this is all wired together with numbers rather 
than identifiers.  I suspect that that is what ruleutils.c trips over.





Re: Schema variables - new implementation for Postgres 15

2021-09-09 Thread Erik Rijkers

> [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the 
assert-enabled fails on 131 out of 210 tests.


(while compiling HEAD checks run without errors for both assert-disabled 
and assert-enabled)



Erik Rijkers


test tablespace   ... ok  303 ms
parallel group (20 tests):  oid char pg_lsn int2 varchar txid int4 
regproc uuid float4 text name money boolean bit float8 int8 enum numeric 
rangetypes

 boolean  ... ok  112 ms
 char ... ok   57 ms
 name ... ok  106 ms
 varchar  ... ok   74 ms
 text ... ok  106 ms
 int2 ... ok   73 ms
 int4 ... ok   92 ms
 int8 ... ok  130 ms
 oid  ... ok   55 ms
 float4   ... ok  102 ms
 float8   ... ok  126 ms
 bit  ... ok  124 ms
 numeric  ... ok  362 ms
 txid ... ok   87 ms
 uuid ... ok  100 ms
 enum ... ok  142 ms
 money... ok  109 ms
 rangetypes   ... ok  433 ms
 pg_lsn   ... ok   64 ms
 regproc  ... ok   91 ms
parallel group (20 tests):  lseg path circle time macaddr 
create_function_0 timetz line macaddr8 numerology point interval inet 
date strings polygon box multirangetypes timestamp timestamptz

 strings  ... ok  166 ms
 numerology   ... ok   89 ms
 point... ok   96 ms
 lseg ... ok   35 ms
 line ... ok   70 ms
 box  ... ok  255 ms
 path ... ok   50 ms
 polygon  ... ok  237 ms
 circle   ... ok   53 ms
 date ... ok  127 ms
 time ... ok   60 ms
 timetz   ... ok   67 ms
 timestamp... ok  379 ms
 timestamptz  ... ok  413 ms
 interval ... ok   97 ms
 inet ... ok  118 ms
 macaddr  ... ok   60 ms
 macaddr8 ... ok   80 ms
 multirangetypes  ... ok  307 ms
 create_function_0... ok   63 ms
parallel group (12 tests):  comments unicode misc_sanity tstypes xid 
expressions horology geometry mvcc type_sanity regex opr_sanity

 geometry ... ok  140 ms
 horology ... ok  120 ms
 tstypes  ... ok   53 ms
 regex... ok  335 ms
 type_sanity  ... ok  155 ms
 opr_sanity   ... ok  355 ms
 misc_sanity  ... ok   43 ms
 comments ... ok   20 ms
 expressions  ... ok  100 ms
 unicode  ... ok   25 ms
 xid  ... ok   56 ms
 mvcc ... ok  146 ms
test create_function_1... ok   10 ms
test create_type  ... ok   30 ms
test create_table ... ok  333 ms
test create_function_2... ok   11 ms
parallel group (5 tests):  copydml copyselect insert_conflict insert copy
 copy ... ok  336 ms
 copyselect   ... ok   34 ms
 copydml  ... ok   28 ms
 insert   ... ok  291 ms
 insert_conflict  ... FAILED (test process exited with 
exit code 2)  239 ms

parallel group (3 tests):  create_operator create_procedure create_misc
 create_misc  ... ok  131 ms
 create_operator  ... ok   29 ms
 create_procedure ... ok   52 ms
parallel group (5 tests):  create_view create_index_spgist 
index_including create_index index_including_gist
 create_index ... FAILED (test process exited with 
exit code 2) 3801 ms

 create_index_spgist  ... ok  523 ms
 create_view  ... FAILED (test process exited with

Re: Schema variables - new implementation for Postgres 15

2021-09-09 Thread Pavel Stehule
Hi

On Thu, 9 Sep 2021 at 12:21, Erik Rijkers wrote:

>  > [schema-variables-20210909.patch]
>
> Hi Pavel,
>
> The patch applies and compiles fine but 'make check' for the
> assert-enabled fails on 131 out of 210 tests.
>

This morning I tested it. I'll recheck it.

Pavel



Re: [UNVERIFIED SENDER] Re: [BUG] Failed Assertion in ReorderBufferChangeMemoryUpdate()

2021-09-09 Thread Amit Kapila
On Tue, Sep 7, 2021 at 2:02 PM Drouvot, Bertrand  wrote:
>
> On 9/7/21 9:11 AM, Drouvot, Bertrand wrote:
> >
>
> Please find enclosed patch v2 (for the master branch) implementing the
> modified approach of option 2) proposed by Amit.
>

The patch looks good to me. I have made a minor modification in the
comment to make it explicit why we are not subtracting the size
immediately and ran pgindent. Kindly prepare back-branch patches and
test the same.

-- 
With Regards,
Amit Kapila.


v3-0001-Fix-reorder-buffer-memory-accounting-for-toast-ch.patch
Description: Binary data


Re: Schema variables - new implementation for Postgres 15

2021-09-09 Thread Gilles Darold

On 09/09/2021 at 11:40, Pavel Stehule wrote:

Hi

On Thu, 9 Sep 2021 at 12:21, Erik Rijkers <e...@xs4all.nl> wrote:


 > [schema-variables-20210909.patch]

Hi Pavel,

The patch applies and compiles fine but 'make check' for the
assert-enabled fails on 131 out of 210 tests.


This morning I tested it. I'll recheck it.

Pavel



I did not have this problem yesterday.


--
Gilles Darold
http://www.darold.net/



TAP test for recovery_end_command

2021-09-09 Thread Amul Sul
Hi,

The attached patch adds a small test for recovery_end_command execution.

Currently, the patch tests execution of recovery_end_command by creating a
dummy file. I am not wedded to this approach; other suggestions are
also welcome.

Also, we don't have a good test for archive_cleanup_command either; I
am not sure how we could test that, given that it executes at every
restart point.

Thanks to my colleague Neha Sharma for confirming the test execution on Windows.

Regards,
Amul
diff --git a/src/test/recovery/t/002_archiving.pl b/src/test/recovery/t/002_archiving.pl
index ce60159f036..f90c698ba47 100644
--- a/src/test/recovery/t/002_archiving.pl
+++ b/src/test/recovery/t/002_archiving.pl
@@ -6,7 +6,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 3;
+use Test::More tests => 4;
 use File::Copy;
 
 # Initialize primary node, doing archives
@@ -65,13 +65,23 @@ $node_standby->promote;
 my $node_standby2 = PostgresNode->new('standby2');
 $node_standby2->init_from_backup($node_primary, $backup_name,
 	has_restoring => 1);
+my $node_standby2_data = $node_standby2->data_dir;
+
+# Also, test recovery_end_command by creating empty file.
+my $recovery_end_command_file = "$node_standby2_data/recovery_end_command.done";
+$node_standby2->append_conf('postgresql.conf',
+	"recovery_end_command='echo recovery_ended > $recovery_end_command_file'");
+
 $node_standby2->start;
 
 # Now promote standby2, and check that temporary files specifically
 # generated during archive recovery are removed by the end of recovery.
 $node_standby2->promote;
-my $node_standby2_data = $node_standby2->data_dir;
 ok( !-f "$node_standby2_data/pg_wal/RECOVERYHISTORY",
 	"RECOVERYHISTORY removed after promotion");
 ok( !-f "$node_standby2_data/pg_wal/RECOVERYXLOG",
 	"RECOVERYXLOG removed after promotion");
+
+# Also, check recovery_end_command execution.
+ok(-f "$recovery_end_command_file",
+	'recovery_end_command executed after promotion');


Re: missing warning in pg_import_system_collations

2021-09-09 Thread Ranier Vilela
On Thu, Sep 9, 2021 at 03:45, Anton Voloshin <a.volos...@postgrespro.ru> wrote:

> Hello hackers,
>
> In pg_import_system_collations() there is this fragment of code:
>
> enc = pg_get_encoding_from_locale(localebuf, false);
> if (enc < 0)
> {
> /* error message printed by pg_get_encoding_from_locale() */
> continue;
> }
>
> However, false passed to pg_get_encoding_from_locale() means
> write_message argument is false, so no error message is ever printed.
> I propose an obvious patch (see attachment).
>
Yeah, seems correct to me.
The comment clearly expresses the intention.


> Introduced in aa17c06fb in January 2017 when debug was replaced by
> false, so I guess back-patching through 10 would be appropriate.
>
This is an oversight.

+1 from me.

Ranier Vilela


trap instead of error on 32 TiB table

2021-09-09 Thread Christoph Berg
I was wondering what happens when the 32 TiB per table limit is
reached, so I faked 32767 1 GiB sparse files using dd and then tried
inserting more rows.

On a cassert-enabled build I got:

TRAP: FailedAssertion("tagPtr->blockNum != P_NEW", File: 
"./build/../src/backend/storage/buffer/buf_table.c", Line: 125)

On a normal build, I got:

ERROR:  cannot extend file "base/18635/53332" beyond 4294967295 blocks
ORT:  mdextend, md.c:443

Shouldn't the cassert build raise the ERROR instead as well?

PostgreSQL 13.4.

Christoph
-- 
Senior Consultant, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Sascha Heuer, Geoff Richardson,
Peter Lilley; Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz




Re: Why does bootstrap and later initdb stages happen via client?

2021-09-09 Thread Peter Eisentraut

On 08.09.21 21:07, Andres Freund wrote:

There of course is historical raisins for things happening in initdb - the
setup logic didn't use to be C. But now that it is C, it seems a bit absurd to
read bootstrap data in initdb, write the data to a pipe, and then read it
again in the backend. It for sure doesn't make things faster.


A couple of things I was looking into a while ago:  We could probably 
get a bit of performance by replacing the line-by-line substitutions 
(replace_token()) by processing the whole buffer at once.  And we could 
get even more performance by not doing any post-processing of the files 
at all.  For example, we don't need to replace_token() SIZEOF_POINTER, 
which is known at compile time.  Handling ENCODING, LC_COLLATE, etc. is 
not quite as obvious, but moving some of that logic into the backend 
could be helpful in that direction.





Re: drop tablespace failed when location contains .. on win32

2021-09-09 Thread Andrew Dunstan


On 9/8/21 11:44 PM, Michael Paquier wrote:
> On Thu, Sep 09, 2021 at 02:35:52AM +, wangsh.f...@fujitsu.com wrote:
>> Do you mean changing the action of canonicalize_path(), like remove all the 
>> (..) ?
>>
>> I'm willing to fix this problem.
> Looking at canonicalize_path(), we have already some logic around
> pending_strips to remove paths when we find a "/.." in the path, so
> that's a matter of adjusting this area to trim properly the previous
> directory.
>
> On *nix platforms, we don't apply this much caution either, say a
> simple /tmp/path/../path/ results in this same path used in the link
> from pg_tblspc.  But we are speaking about Windows here, and junction
> points.
>
> Based on the lack of complains over the years, that does not seem
> really worth backpatching.  Just my 2c on this point.



Maybe, although it's arguably a bug.


I think I would say that we should remove any "." or ".." element in the
path except at the beginning, and in the case of ".." also remove the
preceding element, unless someone can convince me that there's a problem
with that.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: create table like: ACCESS METHOD

2021-09-09 Thread Peter Eisentraut

On 27.08.21 12:37, Vik Fearing wrote:

It seems like this should error to me:

CREATE ACCESS METHOD heapdup TYPE TABLE HANDLER heap_tableam_handler;
CREATE TABLE likeam1() USING heap;
CREATE TABLE likeam2() USING heapdup;
CREATE TABLE likeamlike(
 LIKE likeam1 INCLUDING ACCESS METHOD,
 LIKE likeam2 INCLUDING ACCESS METHOD
);

At the very least, the documentation should say that the last one wins.


Hmm.  The problem is that the LIKE clause is really a macro that expands 
to the column definitions of the other table.  So there is, so far, no 
such thing as two LIKE clauses contradicting.  Whereas the access
method is a table property.  So I don't think this syntax is the right 
approach for this feature.


You might think about something like

CREATE TABLE t2 (...) USING (LIKE t1);

At least in terms of how the syntax should be structured.




RE: Allow escape in application_name

2021-09-09 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thank you for reviewing!

> postgres_fdw gets out of the loop after processing appname even
> when buf.data is '\0'. Is this expected behavior? Because of this,
> when postgres_fdw.application_name = '%b', unprocessed appname
> of the server object is used.

In this case, I expected fallback_application_name to be used instead of appname,
but I missed the case where appname was set as a server option.
Maybe we should do a continue when buf.data is '\0'.
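Something along these lines, perhaps (only a sketch against the loop quoted above; whether continue or some other fallback interacts correctly with how the keywords/values array is assembled still needs checking):

    for (i = n - 1; i >= 0; i--)
    {
        if (strcmp(keywords[i], "application_name") == 0)
        {
            parse_pgfdw_appname(&buf, values[i]);

            /*
             * If every escape expanded to nothing (e.g. "%b"), discard the
             * result and keep looking, so that an appname supplied elsewhere,
             * or fallback_application_name, still applies.
             */
            if (buf.data[0] == '\0')
            {
                resetStringInfo(&buf);
                continue;
            }
            values[i] = buf.data;
            break;
        }
    }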

> + char *endptr = NULL;
> + padding = (int)strtol(p, &endptr, 10);
> 
> strtol() seems to work differently from process_log_prefix_padding(),
> for example, when the input string is "%-p".

Yeah, and I found that "%+5p" is another example.
Our padding function does not understand a plus sign
(and maybe that follows the specification of printf()), but strtol() accepts it.

I did not fix this here because I was not sure which way to go. Do we handle these
cases and continue to use strtol(), or use process_padding()?

> Could you tell me why the statement checking
> application_name should be wrapped in a function?
> Instead, we can just execute something like the following?
> 
> SELECT COUNT(*) FROM pg_stat_activity WHERE application_name =
> current_setting('application_name') || current_user || current_database() ||
> pg_backend_pid() || '%';

I did not know about the current_setting() function...
The function was replaced as you suggested.

> When log_line_prefix() processes the "%a", "%u" or "%d" characters in
> log_line_prefix, it checks whether MyProcPort is NULL or not.
> Likewise, shouldn't parse_pgfdw_appname() do the same thing?
> For now it's OK not to check that, because only a process that has
> MyProcPort can use postgres_fdw. But in the future a process
> without it may use postgres_fdw, for example the 2PC resolver
> process that Sawada-san is proposing to add to support automatic
> 2PC among multiple remote servers.

Sure, I did not consider other patches. I added the checks.

> +You can use escape sequences for application_name
> even if
> +it is set as a connection or a user mapping option.
> +Please refer the later section.
> 
> I was thinking that application_name cannot be set in a user mapping.

I checked the code and found that is_valid_option() returns false
for libpq options, so I removed that sentence.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v12_0002_allow_escapes.patch
Description: v12_0002_allow_escapes.patch


Re: trap instead of error on 32 TiB table

2021-09-09 Thread Robert Haas
On Thu, Sep 9, 2021 at 7:52 AM Christoph Berg
 wrote:
> Shouldn't the cassert build raise the ERROR instead as well?

We should definitely get an ERROR in both cases, not an assertion
failure. Exactly which ERROR we get seems negotiable.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: trap instead of error on 32 TiB table

2021-09-09 Thread Tom Lane
Christoph Berg  writes:
> I was wondering what happens when the 32 TiB per table limit is
> reached, so I faked 32767 1 GiB sparse files using dd and then tried
> inserting more rows.

> On a cassert-enabled build I got:

> TRAP: FailedAssertion("tagPtr->blockNum != P_NEW", File: 
> "./build/../src/backend/storage/buffer/buf_table.c", Line: 125)

Can you provide a stack trace from that?

(or else a recipe for reproducing the bug ... I'm not excited
about reverse-engineering the details of the method)

regards, tom lane




Re: trap instead of error on 32 TiB table

2021-09-09 Thread Christoph Berg
Re: Tom Lane
> Can you provide a stack trace from that?

PG log:

TRAP: FailedAssertion("tagPtr->blockNum != P_NEW", File: 
"./build/../src/backend/storage/buffer/buf_table.c", Line: 125)
postgres: 13/main: cbe postgres [local] 
INSERT(ExceptionalCondition+0x7d)[0x558b6223d44d]
postgres: 13/main: cbe postgres [local] 
INSERT(BufTableInsert+0x89)[0x558b620bafb9]
postgres: 13/main: cbe postgres [local] INSERT(+0x441827)[0x558b620bf827]
postgres: 13/main: cbe postgres [local] 
INSERT(ReadBufferExtended+0x7a)[0x558b620c021a]
postgres: 13/main: cbe postgres [local] 
INSERT(RelationGetBufferForTuple+0x250)[0x558b61da7850]
postgres: 13/main: cbe postgres [local] INSERT(heap_insert+0x8b)[0x558b61d965cb]
postgres: 13/main: cbe postgres [local] INSERT(+0x123b89)[0x558b61da1b89]
postgres: 13/main: cbe postgres [local] INSERT(+0x30294c)[0x558b61f8094c]
postgres: 13/main: cbe postgres [local] INSERT(+0x303660)[0x558b61f81660]
postgres: 13/main: cbe postgres [local] 
INSERT(standard_ExecutorRun+0x115)[0x558b61f4eaa5]
postgres: 13/main: cbe postgres [local] INSERT(+0x47fb72)[0x558b620fdb72]
postgres: 13/main: cbe postgres [local] INSERT(+0x4809be)[0x558b620fe9be]
postgres: 13/main: cbe postgres [local] INSERT(PortalRun+0x1c2)[0x558b620fee72]
postgres: 13/main: cbe postgres [local] INSERT(+0x47c2e0)[0x558b620fa2e0]
postgres: 13/main: cbe postgres [local] 
INSERT(PostgresMain+0x1a53)[0x558b620fc153]
postgres: 13/main: cbe postgres [local] INSERT(+0x3e7f74)[0x558b62065f74]
postgres: 13/main: cbe postgres [local] 
INSERT(PostmasterMain+0xd78)[0x558b62066e68]
postgres: 13/main: cbe postgres [local] INSERT(main+0x796)[0x558b61d40356]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f57071fad0a]
postgres: 13/main: cbe postgres [local] INSERT(_start+0x2a)[0x558b61d403fa]
2021-09-09 13:19:31.024 CEST [1533530] LOG:  server process (PID 1534108) was 
terminated by signal 6: Aborted
2021-09-09 13:19:31.024 CEST [1533530] DETAIL:  Failed process was running: 
insert into huge select generate_series(1,10);

gdb bt:

Core was generated by `postgres: 13/main: cbe postgres [local] INSERT   
 '.
Program terminated with signal SIGABRT, Aborted.
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50  ../sysdeps/unix/sysv/linux/raise.c: Datei oder Verzeichnis nicht 
gefunden.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x7f57071f9537 in __GI_abort () at abort.c:79
#2  0x558b6223d46e in ExceptionalCondition 
(conditionName=conditionName@entry=0x558b623b2577 "tagPtr->blockNum != P_NEW",
errorType=errorType@entry=0x558b6229b016 "FailedAssertion",
fileName=fileName@entry=0x558b623b2598 
"./build/../src/backend/storage/buffer/buf_table.c", 
lineNumber=lineNumber@entry=125)
at ./build/../src/backend/utils/error/assert.c:67
#3  0x558b620bafb9 in BufTableInsert (tagPtr=tagPtr@entry=0x7ffec8919330, 
hashcode=hashcode@entry=960067002, buf_id=)
at ./build/../src/backend/storage/buffer/buf_table.c:125
#4  0x558b620bf827 in BufferAlloc (foundPtr=0x7ffec891932b, strategy=0x0, 
blockNum=4294967295, forkNum=MAIN_FORKNUM,
relpersistence=112 'p', smgr=0x558b62ed4b38) at 
./build/../src/backend/storage/buffer/bufmgr.c:1234
#5  ReadBuffer_common (smgr=0x558b62ed4b38, relpersistence=, 
forkNum=forkNum@entry=MAIN_FORKNUM,
blockNum=blockNum@entry=4294967295, mode=mode@entry=RBM_ZERO_AND_LOCK, 
strategy=0x0, hit=0x7ffec89193d7)
at ./build/../src/backend/storage/buffer/bufmgr.c:761
#6  0x558b620c021a in ReadBufferExtended (reln=0x7f56fb4f8120, 
forkNum=forkNum@entry=MAIN_FORKNUM,
blockNum=blockNum@entry=4294967295, mode=mode@entry=RBM_ZERO_AND_LOCK, 
strategy=strategy@entry=0x0)
at ./build/../src/backend/storage/buffer/bufmgr.c:677
#7  0x558b61da7056 in ReadBufferBI (relation=relation@entry=0x7f56fb4f8120, 
targetBlock=targetBlock@entry=4294967295,
mode=mode@entry=RBM_ZERO_AND_LOCK, bistate=bistate@entry=0x0) at 
./build/../src/backend/access/heap/hio.c:87
#8  0x558b61da7850 in RelationGetBufferForTuple 
(relation=relation@entry=0x7f56fb4f8120, len=32, 
otherBuffer=otherBuffer@entry=0,
options=options@entry=0, bistate=bistate@entry=0x0, 
vmbuffer=vmbuffer@entry=0x7ffec89194c8, vmbuffer_other=0x0)
at ./build/../src/backend/access/heap/hio.c:598
#9  0x558b61d965cb in heap_insert (relation=relation@entry=0x7f56fb4f8120, 
tup=tup@entry=0x558b62f0b780, cid=cid@entry=0,
options=options@entry=0, bistate=bistate@entry=0x0) at 
./build/../src/backend/access/heap/heapam.c:1868
#10 0x558b61da1b89 in heapam_tuple_insert (relation=0x7f56fb4f8120, 
slot=0x558b62f0b6b0, cid=0, options=0, bistate=0x0)
at ./build/../src/backend/access/heap/heapam_handler.c:251
#11 0x558b61f8094c in table_tuple_insert (bistate=0x0, options=0, 
cid=, slot=0x558b62f0b6b0, rel=0x7f56fb4f8120)
at ./build/../src/include/access/tableam.h:1156
#12 ExecInsert (mtstate=0x558b62f0ab50, slot=0x558b62f0b6b0,

Re: Non-decimal integer literals

2021-09-09 Thread Vik Fearing
On 9/8/21 3:14 PM, Tom Lane wrote:
> Vik Fearing  writes:
> 
>> Is there any hope of adding the optional underscores?  I see a potential
>> problem there as SELECT 1_a; is currently parsed as SELECT 1 AS _a; when
>> it should be parsed as SELECT 1_ AS a; or perhaps even as an error since
>> 0x1_a would be a valid number with no alias.
> 
> Even without that point, this patch *is* going to break valid queries,
> because every one of those cases is a valid number-followed-by-identifier
> today,

Ah, true that.  So if this does go in, we may as well add the
underscores at the same time.

> AFAIR we've seen exactly zero field demand for this feature,

I have often wanted something like this, even if I didn't bring it up on
this list.  I have had customers who have wanted this, too.  My response
has always been to show these exact problems to explain why it's not
possible, but if it's going to be in the standard then I favor doing it.

I have never really had a use for octal, but sometimes binary and hex
make things much clearer.  Having a grouping separator for large numbers
is even more useful.

> so I kind of wonder why we're in such a hurry to adopt something
> that hasn't even made it past draft-standard status.

I don't really see a hurry here.  I am fine with waiting until the draft
becomes final.
-- 
Vik Fearing




Re: trap instead of error on 32 TiB table

2021-09-09 Thread Tom Lane
Christoph Berg  writes:
>> Can you provide a stack trace from that?

> #2  0x558b6223d46e in ExceptionalCondition 
> (conditionName=conditionName@entry=0x558b623b2577 "tagPtr->blockNum != P_NEW",
> errorType=errorType@entry=0x558b6229b016 "FailedAssertion",
> fileName=fileName@entry=0x558b623b2598 
> "./build/../src/backend/storage/buffer/buf_table.c", 
> lineNumber=lineNumber@entry=125)
> at ./build/../src/backend/utils/error/assert.c:67
> #3  0x558b620bafb9 in BufTableInsert (tagPtr=tagPtr@entry=0x7ffec8919330, 
> hashcode=hashcode@entry=960067002, buf_id=)
> at ./build/../src/backend/storage/buffer/buf_table.c:125
> #4  0x558b620bf827 in BufferAlloc (foundPtr=0x7ffec891932b, strategy=0x0, 
> blockNum=4294967295, forkNum=MAIN_FORKNUM,
> relpersistence=112 'p', smgr=0x558b62ed4b38) at 
> ./build/../src/backend/storage/buffer/bufmgr.c:1234

Ah, thanks.  I don't think it's unreasonable for BufTableInsert to contain
that assertion --- we shouldn't be trying to allocate a buffer for an
illegal block number.

The regular error comes from mdextend, but that is too late under this
worldview, because smgrextend expects to be given a zero-filled buffer
to write out.  I think where we ought to be making the check is right
where ReadBuffer_common replaces P_NEW:

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
+   {
        blockNum = smgrnblocks(smgr, forkNum);
+       if (blockNum == InvalidBlockNumber)
+           ereport(ERROR,
+                   (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+                    errmsg("cannot extend file \"%s\" beyond %u blocks",
+                           relpath(smgr->smgr_rnode, forkNum),
+                           InvalidBlockNumber)));
+   }

Having done that, the check in md.c could be reduced to an Assert,
although there's something to be said for leaving it as-is as an
extra layer of defense.
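For reference, the reduced check in mdextend() might then be no more than something like this sketch, once the user-facing error is raised earlier in ReadBuffer_common:

    /* callers must never try to extend past the last legal block number */
    Assert(blocknum != InvalidBlockNumber);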

regards, tom lane




Re: Skipping logical replication transactions on subscriber side

2021-09-09 Thread Masahiko Sawada
On Sun, Sep 5, 2021 at 10:58 PM Masahiko Sawada  wrote:
>
> On Fri, Sep 3, 2021 at 3:46 PM Greg Nancarrow  wrote:
> >
> > On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada  
> > wrote:
> > >
> > > I've attached rebased patches. 0004 patch is not the scope of this
> > > patch. It's borrowed from another thread[1] to fix the assertion
> > > failure for newly added tests. Please review them.
> > >
> >
> > BTW, these patches need rebasing (broken by recent commits, patches
> > 0001, 0003 and 0004 no longer apply, and it's failing in the cfbot).
>
> Thanks! I'll submit the updated patches early this week.
>

Sorry for the late response. I've attached the updated patches that
incorporate all comments unless I missed something. Please review
them.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v13-0003-Add-skip_xid-option-to-ALTER-SUBSCRIPTION.patch
Description: Binary data


v13-0002-Add-RESET-command-to-ALTER-SUBSCRIPTION-command.patch
Description: Binary data


v13-0001-Add-pg_stat_subscription_errors-statistics-view.patch
Description: Binary data


Re: Queries that should be canceled will get stuck on secure_write function

2021-09-09 Thread Robert Haas
On Fri, Aug 27, 2021 at 4:10 PM Andres Freund  wrote:
> On Fri, Aug 27, 2021, at 13:07, Robert Haas wrote:
> > On Fri, Aug 27, 2021 at 3:24 PM Andres Freund  wrote:
> > > I wonder if we could improve the situation somewhat by using the 
> > > non-blocking
> > > pqcomm functions in a few select places. E.g. if elog.c's
> > > send_message_to_frontend() sent its message via a new 
> > > pq_endmessage_noblock()
> > > (which'd use the existing pq_putmessage_noblock()) and used
> > > pq_flush_if_writable() instead of pq_flush(), we'd a) not block sending 
> > > to the
> > > client before AbortCurrentTransaction(), b) able to queue further error
> > > messages safely.
> >
> > pq_flush_if_writable() could succeed in sending only part of the data,
> > so I don't see how this works.
>
> All the data is buffered though, so I don't see that problem that causes?

OK, I guess I'm confused here.

If we're always buffering the data, then I suppose there's no risk of
injecting a protocol message into the middle of some other protocol
message, assuming that we don't have a non-local transfer of control
halfway through putting a message in the buffer. But there's still the
risk of getting out of step with the client. Suppose the client does
SELECT 1/0 and we send an ErrorResponse complaining about the division
by zero. But as we're trying to send that response, we block. Later, a
query cancel occurs. We can't queue another ErrorResponse because the
client will interpret that as the response to the next query, since
the division by zero error is the response to the current one. I don't
think that changing pq_flush() to pq_flush_if_writable() in elog.c or
anywhere else can fix that problem.

But that doesn't mean that it isn't a good idea. Any place where we're
doing a pq_flush() and know that another one will happen soon
afterward, before we wait for data from the client, can be changed to
pq_flush_if_writable() without harm, and it's beneficial to do so,
because like you say, it avoids blocking in places that users may find
inconvenient - e.g. while holding locks, as Fujii-san said. The
comment here claims that "postgres.c will flush out waiting data when
control returns to the main loop" but the only pq_flush() call that's
directly present in postgres.c is in response to receiving a Flush
message, so I suppose this is actually talking about the pq_flush()
inside ReadyForQuery. It's not 100% clear to me that we do that in all
relevant cases, though. Suppose we hit an error while processing some
protocol message that does not set send_ready_for_query = true, like
for example Describe ('D'). I think in that case the flush in elog.c
is the only one. Perhaps we ought to change postgres.c so that if we
don't enter the block guarded by "if (send_ready_for_query)" we
instead pq_flush().
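For illustration, a sketch of what that could look like near the bottom of the main loop in PostgresMain() (heavily abbreviated, just to show the idea):

        if (send_ready_for_query)
        {
            /* ... existing transaction-status handling ... */
            ReadyForQuery(whereToSendOutput);
            send_ready_for_query = false;
        }
        else
        {
            /*
             * No ReadyForQuery (and hence no flush inside it) is coming, e.g.
             * after an error while handling a Describe message, so push out
             * any buffered error or notice data before waiting for the next
             * client message.
             */
            pq_flush();
        }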

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: missing warning in pg_import_system_collations

2021-09-09 Thread Tom Lane
Anton Voloshin  writes:
> In pg_import_system_collations() there is this fragment of code:

> enc = pg_get_encoding_from_locale(localebuf, false);
> if (enc < 0)
> {
>   /* error message printed by pg_get_encoding_from_locale() */
>   continue;
> }

> However, false passed to pg_get_encoding_from_locale() means 
> write_message argument is false, so no error message is ever printed.
> I propose an obvious patch (see attachment).
> Introduced in aa17c06fb in January 2017 when debug was replaced by 
> false, so I guess back-patching through 10 would be appropriate.

I don't think this is obvious at all.

In the original coding (before aa17c06fb, when this code was in initdb),
we printed a warning if "debug" was true and otherwise printed nothing.
The other "if (debug)" cases in the code that got moved over were
translated to "elog(DEBUG1)", but there isn't any API to make
pg_get_encoding_from_locale() log at that level.

What you propose to do here would promote this case from
ought-to-be-DEBUG1 to WARNING, which seems to me to be way too much in the
user's face.  Or, if there actually is a case for complaining, then all
those messages ought to be WARNING not DEBUG1.  But I'm inclined to think
that having pg_import_system_collations silently ignore unusable locales
is the right thing most of the time.

Assuming we don't want to change pg_get_encoding_from_locale()'s API,
the simplest fix is to duplicate its error message, so more or less

 if (enc < 0)
 {
-/* error message printed by pg_get_encoding_from_locale() */
+elog(DEBUG1, "could not determine encoding for locale \"%s\"",
+ localebuf);
 continue;
 }

regards, tom lane




Re: Migração Postgresql 8.3 para versão Postgresql 9.3

2021-09-09 Thread Robert Haas
On Wed, Sep 8, 2021 at 12:20 PM Ranier Vilela  wrote:
> Hello Oswaldo,
> I'm sorry, but this list is not the appropriate place for these questions.
> Please send these questions to:
> https://www.postgresql.org/list/pgsql-general/

The Portuguese list might be a better choice:

https://www.postgresql.org/list/pgsql-pt-geral/

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Possible missing segments in archiving on standby

2021-09-09 Thread Fujii Masao



On 2021/09/08 22:41, Fujii Masao wrote:

So probably we reached the consensus that something like the attached patch
(XLogArchiveCheckDone_walfile_xlog_switch.patch) is enough for the corner
case where walreceiver fails to create the .ready file of a WAL file including
XLOG_SWITCH?


I attached the patch again, just in case.



Sounds convincing.  Ok, I agree to that.


So barring any objection, I will commit the patch
and back-patch it to all supported versions.


Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index e51a7a749d..6046e24f0f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7392,6 +7392,27 @@ StartupXLOG(void)
 			/* Handle interrupt signals of startup process */
 			HandleStartupProcInterrupts();
 
+			/*
+			 * In standby mode, create an archive notification file of a
+			 * WAL segment if it includes an XLOG_SWITCH record and its
+			 * notification file has not been created yet. This is
+			 * necessary to handle the corner case that walreceiver may
+			 * fail to create such notification file if it exits after it
+			 * receives XLOG_SWITCH record but while it's receiving the
+			 * remaining bytes in the segment. Without this handling, WAL
+			 * archiving of the segment will be delayed until subsequent
+			 * checkpoint creates its notification file when removing it
+			 * even though it can be archived soon.
+			 */
+			if (StandbyMode && record->xl_rmid == RM_XLOG_ID &&
+				(record->xl_info & ~XLR_INFO_MASK) == XLOG_SWITCH)
+			{
+				char		xlogfilename[MAXFNAMELEN];
+
+				XLogFileName(xlogfilename, curFileTLI, readSegNo, wal_segment_size);
+				XLogArchiveCheckDone(xlogfilename);
+			}
+
 			/*
 			 * Pause WAL replay, if requested by a hot-standby session via
 			 * SetRecoveryPause().


Re: missing warning in pg_import_system_collations

2021-09-09 Thread Anton Voloshin

On 09/09/2021 21:51, Tom Lane wrote:

What you propose to do here would promote this case from
ought-to-be-DEBUG1 to WARNING, which seems to me to be way too much in the
user's face.  Or, if there actually is a case for complaining, then all
those messages ought to be WARNING not DEBUG1. ...

Assuming we don't want to change pg_get_encoding_from_locale()'s API,
the simplest fix is to duplicate its error message, so more or less

  if (enc < 0)
  {
-/* error message printed by pg_get_encoding_from_locale() */
+elog(DEBUG1, "could not determine encoding for locale \"%s\"",
+ localebuf);
  continue;
  }


Upon thinking a little more, I agree.
The warnings I happen to get from initdb on my current machine (with
many different locales installed, more than on a typical box) are:


performing post-bootstrap initialization ... 2021-09-09 22:04:01.678 +07 
[482312] WARNING:  could not determine encoding for locale 
"hy_AM.armscii8": codeset is "ARMSCII-8"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "ka_GE": codeset is "GEORGIAN-PS"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "kk_KZ": codeset is "PT154"
2021-09-09 22:04:01.679 +07 [482312] WARNING:  could not determine 
encoding for locale "kk_KZ.rk1048": codeset is "RK1048"
2021-09-09 22:04:01.686 +07 [482312] WARNING:  could not determine 
encoding for locale "tg_TJ": codeset is "KOI8-T"
2021-09-09 22:04:01.686 +07 [482312] WARNING:  could not determine 
encoding for locale "th_TH": codeset is "TIS-620"

ok

While they are definitely interesting as DEBUG1, not so as a WARNING.

So, +1 from me for your proposed elog(DEBUG1, ...); patch. Thank you.

--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: Asymmetric partition-wise JOIN

2021-09-09 Thread Jaime Casanova
On Thu, Sep 09, 2021 at 09:50:46AM +, Aleksander Alekseev wrote:
> It looks like this patch needs to be updated. According to 
> http://cfbot.cputube.org/ it applies but doesn't pass any tests. Changing the 
> status to save time for reviewers.
> 
> The new status of this patch is: Waiting on Author

Just to give some more info to work on: I found that this patch makes postgres
crash with a segmentation fault.

"""
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x556e37ef1b55 in bms_equal (a=0x7f6e37a9c5b0, b=0x7f6e37a9c5b0) at 
bitmapset.c:126
126 if (shorter->words[i] != longer->words[i])
"""

attached are the query that triggers the crash and the backtrace.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL


bug.sql
Description: application/sql
#0  0x556e37ef1b55 in bms_equal (a=0x7f6e37a9c5b0, b=0x7f6e37a9c5b0) at 
bitmapset.c:126
shorter = 0x7f6e37a9c5b0
longer = 0x7f6e37a9c5b0
shortlen = 2139062143
longlen = 32622
i = 138057
#1  0x556e37fc9f5d in find_param_path_info (rel=0x556e3944c490, 
required_outer=0x7f6e37a9c5b0) at relnode.c:1634
ppi = 0x7f6e37a9cc08
lc__state = {l = 0x7f6e37a9cc40, i = 0}
lc = 0x7f6e37a5f898
#2  0x556e37fbeaeb in reparameterize_path_by_child (root=0x556e394340e0, 
path=0x7f6e37a9c6c8, child_rel=0x7f6e37a77ac0) at pathnode.c:4205
new_path = 0x7f6e37a89408
new_ppi = 0x7f6e37a77d30
old_ppi = 0x7f6e37a9cc08
required_outer = 0x7f6e37a9c5b0
#3  0x556e37f66545 in try_nestloop_path (root=0x556e394340e0, 
joinrel=0x7f6e37a88a80, outer_path=0x7f6e37a788e8, inner_path=0x7f6e37a9c6c8, 
pathkeys=0x0, jointype=JOIN_LEFT, extra=0x7ffde36f3e50)
at joinpath.c:662
required_outer = 0x0
workspace = {startup_cost = 46733.4674, total_cost = 
1195882.0625, run_cost = 1149148.595, inner_run_cost = 0.020375816993464052, 
inner_rescan_run_cost = 0.020375816993464052, 
  outer_rows = 6.9224208015025731e-310, inner_rows = 
6.9224208087218603e-310, outer_skip_rows = 6.9224208013235237e-310, 
inner_skip_rows = 4.640852265067508e-310, numbuckets = 960708832, 
  numbatches = 21870, inner_rows_total = 6.9224208047606396e-310}
innerrel = 0x556e3944c490
outerrel = 0x7f6e37a77ac0
innerrelids = 0x556e3944c2c8
outerrelids = 0x7f6e37a77d30
inner_paramrels = 0x7f6e37a9c5b0
outer_paramrels = 0x0
#4  0x556e37f67bf9 in match_unsorted_outer (root=0x556e394340e0, 
joinrel=0x7f6e37a88a80, outerrel=0x7f6e37a77ac0, innerrel=0x556e3944c490, 
jointype=JOIN_LEFT, extra=0x7ffde36f3e50) at joinpath.c:1702
innerpath = 0x7f6e37a9c6c8
mpath = 0x0
lc2__state = {l = 0x7f6e37a9d0a0, i = 1}
lc2 = 0x7f6e37a9d0c0
outerpath = 0x7f6e37a788e8
merge_pathkeys = 0x0
lc1__state = {l = 0x7f6e37a786c8, i = 0}
save_jointype = JOIN_LEFT
nestjoinOK = true
useallclauses = false
inner_cheapest_total = 0x7f6e37a9c3b0
matpath = 0x7f6e37a89370
lc1 = 0x7f6e37a786e0
__func__ = "match_unsorted_outer"
#5  0x556e37f65c14 in add_paths_to_joinrel (root=0x556e394340e0, 
joinrel=0x7f6e37a88a80, outerrel=0x7f6e37a77ac0, innerrel=0x556e3944c490, 
jointype=JOIN_LEFT, sjinfo=0x7f6e37a88630, 
restrictlist=0x7f6e37a88a28) at joinpath.c:291
extra = {restrictlist = 0x7f6e37a88a28, mergeclause_list = 
0x7f6e37a88e28, inner_unique = true, sjinfo = 0x7f6e37a88630, semifactors = 
{outer_match_frac = 0.00039215686274509802, match_count = 2550}, 
  param_source_rels = 0x0}
mergejoin_allowed = true
lc = 0x0
joinrelids = 0x7f6e37a886e8
#6  0x556e37f6a3d9 in populate_joinrel_with_paths (root=0x556e394340e0, 
rel1=0x7f6e37a77ac0, rel2=0x556e3944c490, joinrel=0x7f6e37a88a80, 
sjinfo=0x7f6e37a88630, restrictlist=0x7f6e37a88a28)
at joinrels.c:825
__func__ = "populate_joinrel_with_paths"
#7  0x556e37f6bca5 in extract_asymmetric_partitionwise_subjoin 
(root=0x556e394340e0, joinrel=0x7f6e37a86bd8, append_path=0x7f6e37a7b6c8, 
inner_rel=0x556e3944c490, jointype=JOIN_LEFT, extra=0x7ffde36f4150)
at joinrels.c:1617
child_path = 0x7f6e37a788e8
child_joinrelids = 0x7f6e37a885e0
child_sjinfo = 0x7f6e37a88630
child_rel = 0x7f6e37a77ac0
parent_relids = 0x7f6e37a88608
child_joinrel = 0x7f6e37a88a80
child_restrictlist = 0x7f6e37a88a28
lc__state = {l = 0x7f6e37a7b8a8, i = 0}
result = 0x0
lc = 0x7f6e37a7b8c0
#8  0x556e37f6bfb8 in try_asymmetric_partitionwise_join 
(root=0x556e394340e0, joinrel=0x7f6e37a86bd8, outer_rel=0x7f6e37a65f68, 
inner_rel=0x556e3944c490, jointype=JOIN_LEFT, extra=0x7ffde36f4150)
at joinrels.c:1713
_save_exception_stack = 0x7ffde36f4b80
_save_context_stack = 0x0
_local_sigjmp_buf = {{__jmpbuf = {0,

Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-09-09 Thread Etsuro Fujita
On Thu, Sep 2, 2021 at 5:42 AM Tom Lane  wrote:
> The real reason that this hasn't gotten committed is that I remain
> pretty uncomfortable about whether it's an acceptable solution to
> the problem.  Suddenly asking people to plaster COLLATE clauses
> on all their textual remote columns seems like a big compatibility
> gotcha.

I think so too.  I reviewed the patch:

/*
 * If the Var is from the foreign table, we consider its
-* collation (if any) safe to use.  If it is from another
+* collation (if any) safe to use, *unless* it's
+* DEFAULT_COLLATION_OID.  We treat that as meaning "we don't
+* know which collation this is".  If it is from another
 * table, we treat its collation the same way as we would a
 * Param's collation, ie it's not safe for it to have a
 * non-default collation.
@@ -350,7 +352,12 @@ foreign_expr_walker(Node *node,

/* Else check the collation */
collation = var->varcollid;
-   state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE;
+   if (collation == InvalidOid)
+   state = FDW_COLLATE_NONE;
+   else if (collation == DEFAULT_COLLATION_OID)
+   state = FDW_COLLATE_UNSAFE;
+   else
+   state = FDW_COLLATE_SAFE;

One thing I noticed about this change is:

explain (verbose, costs off) select * from ft3 order by f2;
   QUERY PLAN
-------------------------------------------------
 Sort
   Output: f1, f2, f3
   Sort Key: ft3.f2
   ->  Foreign Scan on public.ft3
 Output: f1, f2, f3
 Remote SQL: SELECT f1, f2, f3 FROM public.loct3
(6 rows)

where ft3 is defined as in the postgres_fdw regression test (see the
section “test handling of collations”).  For this query, the sort is
done locally, but I think it should be done remotely, or an error
should be raised, as we don’t know the collation assigned to the
column “f2”.  So I think we need to do something about this.

Having said that, I think another option for this would be to leave the
code as-is; assume that 1) the foreign var has "COLLATE default", not
an unknown collation, when labeled with "COLLATE default", and 2)
"COLLATE default" on the local database matches "COLLATE default" on
the remote database.  This would be the same as before, so we could
avoid the concern mentioned above.  I agree with the
postgresImportForeignSchema() change, except that it silently creates a
local column with "COLLATE default" if that function can't find a remote
collation matching the database's datcollate/datctype when seeing
"COLLATE default"; in that case I think an error should be raised to
prompt the user to check the settings for the remote server and/or
define foreign tables manually with collations that match the remote
side.  Maybe I'm missing something, though.

Anyway, here is a patch created on top of your patch to modify
async-related test cases to work as intended.  I’m also attaching your
patch to make the cfbot quiet.

Sorry for the delay.

Best regards,
Etsuro Fujita


0001-fix-postgres-fdw-collation-handling-4.patch
Description: Binary data


0002-modify-postgres-fdw-async-test-cases.patch
Description: Binary data


Re: Why does bootstrap and later initdb stages happen via client?

2021-09-09 Thread Andrew Dunstan


On 9/8/21 5:48 PM, Andres Freund wrote:
> Hi,
>
> On September 8, 2021 1:24:00 PM PDT, Andrew Dunstan  
> wrote:
>> On 9/8/21 3:07 PM, Andres Freund wrote:
>>> There of course is historical raisins for things happening in initdb - the
>>> setup logic didn't use to be C. But now that it is C, it seems a bit absurd 
>>> to
>>> read bootstrap data in initdb, write the data to a pipe, and then read it
>>> again in the backend. It for sure doesn't make things faster.
>>
>> I guess the downside would be that we'd need to teach the backend how to
>> do more stuff that only needs to be done once per cluster, and then that
>> code would be dead space for the rest of the lifetime of the cluster.
>>
>>
>> Maybe the difference is sufficiently small that it doesn't matter.
> Unused code doesn't itself cost much - the OS won't even page it in. And disk 
> space wise, there's not much difference between code in initdb and code in 
> postgres. It's callsites to the code that can be problematic. But there were 
> already paying the price via --boot and a fair number of if (bootstrap) 
> blocks.
>

Fair enough. You're quite right, of course, the original design of
initdb.c was to do what the preceding shell script did as closely as
possible. It does leak a bit of memory, which doesn't matter in the
context of a short-lived program - but that shouldn't be too hard to
manage in the backend.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-09-09 Thread Tom Lane
Etsuro Fujita  writes:
> On Thu, Sep 2, 2021 at 5:42 AM Tom Lane  wrote:
>> The real reason that this hasn't gotten committed is that I remain
>> pretty uncomfortable about whether it's an acceptable solution to
>> the problem.  Suddenly asking people to plaster COLLATE clauses
>> on all their textual remote columns seems like a big compatibility
>> gotcha.

> I think so too.

Yeah :-(.  It seems like a very unpleasant change.

> Having said that, I think another option for this would be to leave the
> code as-is; assume that 1) the foreign var has "COLLATE default”, not
> an unknown collation, when labeled with "COLLATE default”, and 2)
> "COLLATE default” on the local database matches "COLLATE default” on
> the remote database.

The fundamental complaint that started this thread was exactly that
assumption (2) isn't safe.  So it sounds to me like you're proposing
that we do nothing, which isn't a great answer either.  I suppose
we could try documenting our way out of this, but people will
continue to get bit because they won't read or won't understand
the limitation.

I'd be happier if we had a way to check whether the local and remote
default collations are compatible.  But it seems like that's a big ask,
especially in cross-operating-system situations.

regards, tom lane




Re: Schema variables - new implementation for Postgres 15

2021-09-09 Thread Pavel Stehule
On Thu, Sep 9, 2021 at 13:17, Gilles Darold  wrote:

> On 09/09/2021 at 11:40, Pavel Stehule wrote:
>
> Hi
>
> On Thu, Sep 9, 2021 at 12:21, Erik Rijkers  wrote:
>
>>  > [schema-variables-20210909.patch]
>>
>> Hi Pavel,
>>
>> The patch applies and compiles fine but 'make check' for the
>> assert-enabled build fails on 131 out of 210 tests.
>>
>
> This morning I tested it. I'll recheck it.
>
> Pavel
>
>
> I did not have this problem yesterday.
>

I am able to reproduce it. It looks like some recent changes to Nodes don't
work with this patch. I have to investigate it.

Regards

Pavel


> --
> Gilles Darold
> http://www.darold.net/
>
>


Re: Avoid stuck of pgbench due to skipped transactions

2021-09-09 Thread Fujii Masao




On 2021/09/08 23:40, Fujii Masao wrote:

Agreed. Since it's hard to imagine the issue happening in practice,
we don't need to bother back-patching to the stable branches.
So I'm thinking of committing the patch to 15dev and 14.


Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




We don't enforce NO SCROLL cursor restrictions

2021-09-09 Thread Tom Lane
We don't actually prevent you from scrolling a NO SCROLL cursor:

regression=# begin;
BEGIN
regression=*# declare c no scroll cursor for select * from int8_tbl;
DECLARE CURSOR
regression=*# fetch all from c;
q1|q2 
--+---
  123 |   456
  123 |  4567890123456789
 4567890123456789 |   123
 4567890123456789 |  4567890123456789
 4567890123456789 | -4567890123456789
(5 rows)

regression=*# fetch absolute 2 from c;
 q1  |q2
-+--
 123 | 4567890123456789
(1 row)

There are probably specific cases where you do get an error,
but we don't have a blanket you-can't-do-that check.  Should we?

The reason this came to mind is that while poking at [1]
I noticed that commit ba2c6d6ce has created some user-visible
anomalies for non-scrollable cursors WITH HOLD.  If you advance
the cursor a bit, commit, and then try to scroll the cursor,
it will work but the part of the output that you advanced over
will be missing.  I think we should probably throw an error
to prevent that from being visible.  I'm worried though that
putting in a generic prohibition may break applications that
used to get away with this kind of thing.

regards, tom lane

[1] 
https://www.postgresql.org/message-id/CAPV2KRjd%3DErgVGbvO2Ty20tKTEZZr6cYsYLxgN_W3eAo9pf5sw%40mail.gmail.com




Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-09 Thread Mark Dilger



> On Sep 8, 2021, at 6:44 AM, Amul Sul  wrote:
> 
> Here is the rebased version.

v33-0004 

This patch moves the include of "catalog/pg_control.h" from transam/xlog.c into 
access/xlog.h, making pg_control.h indirectly included from a much larger set 
of files.  Maybe that's ok.  I don't know.  But it seems you are doing this 
merely to get the symbol (not even the definition) for struct DBState.  I'd 
recommend rearranging the code so this isn't necessary, but otherwise you'd at 
least want to remove the now redundant includes of catalog/pg_control.h from 
xlogdesc.c, xloginsert.c, auth-scram.c, postmaster.c, misc/pg_controldata.c, 
and pg_controldata/pg_controldata.c.

v33-0005 

This patch makes bool XLogInsertAllowed() more complicated than before.  The 
result used to depend mostly on the value of LocalXLogInsertAllowed except that 
when that value was negative, the result was determined by 
RecoveryInProgress().  There was an arcane rule that LocalXLogInsertAllowed 
must have the non-negative values binary coercible to boolean "true" and 
"false", with the basis for that rule being the coding of XLogInsertAllowed().  
Now that the function is more complicated, this rule seems even more arcane.  
Can we change the logic to not depend on casting an integer to bool?
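
To illustrate what I mean (the names below are invented for the sake of the
example; they are not from the patch):

    /* Hypothetical sketch only -- make the tri-state explicit */
    typedef enum LocalXLogInsertState
    {
        LOCAL_XLOG_INSERT_FORBIDDEN,    /* was: LocalXLogInsertAllowed == 0 */
        LOCAL_XLOG_INSERT_ALLOWED,      /* was: LocalXLogInsertAllowed == 1 */
        LOCAL_XLOG_INSERT_UNKNOWN       /* was: negative, defer to recovery state */
    } LocalXLogInsertState;

    static bool
    XLogInsertAllowedSketch(LocalXLogInsertState state)
    {
        if (state == LOCAL_XLOG_INSERT_UNKNOWN)
            return !RecoveryInProgress();
        return state == LOCAL_XLOG_INSERT_ALLOWED;
    }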

The code comment change in autovacuum.c introduces a non-grammatical sentence: 
"First, the system is not read only i.e. wal writes permitted".

The function comment in checkpointer.c reads more like it toggles the system 
into allowing something, rather than actually doing that same something: 
"SendSignalToCheckpointer allows a process to send a signal to the checkpoint 
process".

The new code comment in ipci.c contains a typo, but more importantly, it 
doesn't impart any knowledge beyond what a reader of the function name could 
already surmise.  Perhaps the comment can better clarify what is happening: 
"Set up wal probibit shared state"

The new code comment in sync.c copies and changes a nearby comment but drops 
part of the verb phrase:  "As in ProcessSyncRequests, we don't want to stop wal 
prohibit change requests".  The nearby comment reads "stop absorbing".  I think 
this one should read "stop processing".  This same comment is used again below. 
  Then a third comment reads "For the same reason mentioned previously for the 
wal prohibit state change request check."  That third comment is too glib.

tcop/utility.c needlessly includes "access/walprohibit.h"

wait_event.h extends enum WaitEventIO with new values 
WAIT_EVENT_WALPROHIBIT_STATE and WAIT_EVENT_WALPROHIBIT_STATE_CHANGE.  I don't 
find the difference between these two names at all clear.  Waiting for a state 
change is clear enough.  But how is waiting on a state different?

xlog.h defines a new enum.  I don't find any of it clear; not the comment, nor 
the name of the enum, nor the names of the values:

/* State of work that enables wal writes */
typedef enum XLogAcceptWritesState
{
XLOG_ACCEPT_WRITES_PENDING = 0, /* initial state, not started */
XLOG_ACCEPT_WRITES_SKIPPED, /* skipped wal writes */
XLOG_ACCEPT_WRITES_DONE /* wal writes are enabled */
} XLogAcceptWritesState; 

This enum seems to have been written from the point of view of someone who 
already knew what it was for.  It needs to be written in a way that will be 
clear to people who have no idea what it is for.

v33-0006:

The new code comments in brin.c and elsewhere should use the verb "require" 
rather than "have", otherwise "building indexes" reads as a noun phrase rather 
than as a gerund: /* Building indexes will have an XID */

The new function CheckWALPermitted() seems to test the current state of 
variables but not lock any of them, and the new function comment says:

/*
 * In opposite to the above assertion if a transaction doesn't have valid XID
 * (e.g. VACUUM) then it won't be killed while changing the system state to WAL
 * prohibited.  Therefore, we need to explicitly error out before entering into
 * the critical section.
 */

This suggests to me that a vacuum process can check whether wal is prohibited, 
then begin a critical section which needs wal to be allowed, and concurrently 
somebody else might disable wal without killing the vacuum process.  I'm given 
to wonder what horrors await when the vacuum process does something that needs 
to be wal logged but cannot be.  Does it trigger a panic?  I don't like the 
idea that calling pg_prohibit_wal during a vacuum might panic the cluster.  If 
there is some reason this is not a problem, I think the comment should explain 
it.  In particular, why is it sufficient to check whether wal is prohibited 
before entering the critical section and not necessary to be sure it remains 
allowed through the lifetime of that critical section?

v33-0007:

I don't really like what the documentation has to say about pg_prohibit_wal.  
Why should pg_prohibit_wal differ from other signal sending functions in 
whether it returns a 

Re: We don't enforce NO SCROLL cursor restrictions

2021-09-09 Thread Vik Fearing
On 9/9/21 7:10 PM, Tom Lane wrote:
> We don't actually prevent you from scrolling a NO SCROLL cursor:
> 
> There are probably specific cases where you do get an error,
> but we don't have a blanket you-can't-do-that check.  Should we?


I would say yes.  NO SCROLL means no scrolling; or at least should.

On the other hand, if there is no optimization or other meaningful
difference between SCROLL and NO SCROLL, then we can just document it as
a no-op that is only provided for standard syntax compliance.
-- 
Vik Fearing




Re: We don't enforce NO SCROLL cursor restrictions

2021-09-09 Thread Tom Lane
Vik Fearing  writes:
> On 9/9/21 7:10 PM, Tom Lane wrote:
>> There are probably specific cases where you do get an error,
>> but we don't have a blanket you-can't-do-that check.  Should we?

> I would say yes.  NO SCROLL means no scrolling; or at least should.
> On the other hand, if there is no optimization or other meaningful
> difference between SCROLL and NO SCROLL, then we can just document it as
> a no-op that is only provided for standard syntax compliance.

There are definitely optimizations that happen or don't happen
depending on the SCROLL option.  I think ba2c6d6ce may be the
first patch that introduces any user-visible semantic difference,
but I'm not completely sure about that.

[ pokes at it some more ... ]  Hm, we let you do this:

regression=# begin;
BEGIN
regression=*# declare c cursor for select * from int8_tbl for update;
DECLARE CURSOR
regression=*# fetch all from c;
q1|q2 
--+---
  123 |   456
  123 |  4567890123456789
 4567890123456789 |   123
 4567890123456789 |  4567890123456789
 4567890123456789 | -4567890123456789
(5 rows)

regression=*# fetch absolute 2 from c;
 q1  |q2
-+--
 123 | 4567890123456789
(1 row)

which definitely flies in the face of the fact that we disallow
combining SCROLL and FOR UPDATE:

regression=*# declare c scroll cursor for select * from int8_tbl for update;
ERROR:  DECLARE SCROLL CURSOR ... FOR UPDATE is not supported
DETAIL:  Scrollable cursors must be READ ONLY.

I don't recall the exact reason for that prohibition, but I wonder
if there aren't user-visible anomalies reachable from the fact that
you can bypass it.

regards, tom lane




Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-09 Thread Robert Haas
On Thu, Sep 9, 2021 at 1:42 PM Mark Dilger  wrote:
> v33-0006:
>
> The new code comments in brin.c and elsewhere should use the verb "require" 
> rather than "have", otherwise "building indexes" reads as a noun phrase 
> rather than as a gerund: /* Building indexes will have an XID */

Honestly that sentence doesn't sound very clear even with a different verb.

> This suggests to me that a vacuum process can check whether wal is 
> prohibited, then begin a critical section which needs wal to be allowed, and 
> concurrently somebody else might disable wal without killing the vacuum 
> process.  I'm given to wonder what horrors await when the vacuum process does 
> something that needs to be wal logged but cannot be.  Does it trigger a 
> panic?  I don't like the idea that calling pg_prohibit_wal durning a vacuum 
> might panic the cluster.  If there is some reason this is not a problem, I 
> think the comment should explain it.  In particular, why is it sufficient to 
> check whether wal is prohibited before entering the critical section and not 
> necessary to be sure it remains allowed through the lifetime of that critical 
> section?

The idea here is that if a transaction already has an XID assigned, we
have to kill it off before we can declare the system read-only,
because it will definitely write WAL when the transaction ends: either
a commit record, or an abort record, but definitely something. So
cases where we write WAL without necessarily having an XID require
special handling. They have to check whether WAL has become prohibited
and error out if so, and they need to do so before entering the
critical section - because if the problem were detected for the first
time inside the critical section it would escalate to a PANIC, which
we do not want. Places where we're guaranteed to have an XID - e.g.
inserting a heap tuple - don't need a run-time check before entering
the critical section, because the code can't be reached in the first
place if the system is WAL-read-only.
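
To illustrate the pattern (just a sketch, not code from the patch; the record
being built here is a placeholder):

    /*
     * No XID is assigned on this path, so check up front and fail with
     * ERROR; inside the critical section the same condition would
     * escalate to a PANIC.
     */
    CheckWALPermitted();

    START_CRIT_SECTION();

    /* ... apply the in-memory changes ... */

    XLogBeginInsert();
    XLogRegisterData((char *) &xlrec, sizeof(xlrec));
    (void) XLogInsert(rmid, info);

    END_CRIT_SECTION();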

> Why is this function defined to take a boolean such that 
> pg_prohibit_wal(true) means to prohibit wal and pg_prohibit_wal(false) means 
> to allow wal.  Wouldn't a different function named pg_allow_wal() make it 
> more clear?  This also would be a better interface if taking the system 
> read-only had a timeout as I suggested above, as such a timeout parameter 
> when allowing wal is less clearly useful.

Hmm, I find pg_prohibit_wal(true/false) better than pg_prohibit_wal()
and pg_allow_wal(), and would prefer pg_prohibit_wal(true/false,
timeout) over pg_prohibit_wal(timeout) and pg_allow_wal(), because I
think then once you find that one function you know how to do
everything about that feature, whereas the other way you need to find
both functions to have the whole story. That said, I can see why
somebody else might prefer something else.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Feedback on table expansion hook (including patch)

2021-09-09 Thread Jaime Casanova
On Wed, May 12, 2021 at 10:19:17PM +0900, Amit Langote wrote:
> (Sorry about being very late to this thread.)
> 
> > Would it be unreasonable of us to ask for a worked-out example making
> > use of the proposed hook?  That'd go a long way towards resolving the
> > question of whether you can do anything useful without duplicating
> > lots of code.
> >
> > I've also been wondering, given the table-AM projects that are
> > going on, whether we shouldn't refactor things to give partitioned
> > tables a special access method, and then shove most of the planner
> > and executor's hard-wired partitioning logic into access method
> > callbacks.  That would make it a lot more feasible for extensions
> > to implement custom partitioning-like behavior ... or so I guess.
> 
> Interesting proposition...
> 

Since there is no clear definition here, we seem to be expecting an
example of how the hook will be used, and there has been no activity
since May.

I suggest we move this to Returned with Feedback, which I'll do in a
couple of hours.

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: missing warning in pg_import_system_collations

2021-09-09 Thread Tom Lane
Anton Voloshin  writes:
> On 09/09/2021 21:51, Tom Lane wrote:
>> Assuming we don't want to change pg_get_encoding_from_locale()'s API,
>> the simplest fix is to duplicate its error message, so more or less
>> 
>> if (enc < 0)
>> {
>> -/* error message printed by pg_get_encoding_from_locale() */
>> +elog(DEBUG1, "could not determine encoding for locale \"%s\"",
+ localebuf);
>> continue;
>> }

> Upon thinking a little more, I agree.

Another approach we could take is to deem the comment incorrect and
just remove it, codifying the current behavior of silently ignoring
unrecognized encodings.  The reason that seems like it might be
appropriate is that the logic immediately below this bit silently
ignores encodings that are known but are frontend-only:

if (!PG_VALID_BE_ENCODING(enc))
continue;            /* ignore locales for client-only encodings */

It's sure not very clear to me why one case deserves a message and the
other not.  Perhaps they both do, which would lead to adding another
DEBUG1 message here.
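
Concretely, that last option would look something like this (the wording of
the second message is only illustrative):

    if (enc < 0)
    {
        elog(DEBUG1, "could not determine encoding for locale \"%s\"",
             localebuf);
        continue;
    }

    if (!PG_VALID_BE_ENCODING(enc))
    {
        elog(DEBUG1, "skipping locale \"%s\" with client-only encoding",
             localebuf);
        continue;
    }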

regards, tom lane




Re: [Patch] ALTER SYSTEM READ ONLY

2021-09-09 Thread Mark Dilger



> On Sep 9, 2021, at 11:21 AM, Robert Haas  wrote:
> 
> They have to check whether WAL has become prohibited
> and error out if so, and they need to do so before entering the
> critical section - because if the problem were detected for the first
> time inside the critical section it would escalate to a PANIC, which
> we do not want.

But that is the part that is still not clear.  Should the comment say that a 
concurrent change to prohibit wal after the current process checks but before 
the current process exits the critical section will result in a panic?  What 
is unclear about the comment is that it implies that a check before the 
critical section is sufficient, but ordinarily one would expect a lock to be 
held and the check-and-lock dance to carefully avoid any race condition.  If 
somehow this is safe, the logic for why it is safe should be spelled out.  If 
not, a mea culpa saying, "hey, we're not terribly safe about this" should be 
explicit in the comment.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: We don't enforce NO SCROLL cursor restrictions

2021-09-09 Thread Tom Lane
I wrote:
> [ pokes at it some more ... ]  Hm, we let you do this:
> ...
> which definitely flies in the face of the fact that we disallow
> combining SCROLL and FOR UPDATE:
> regression=*# declare c scroll cursor for select * from int8_tbl for update;
> ERROR:  DECLARE SCROLL CURSOR ... FOR UPDATE is not supported
> DETAIL:  Scrollable cursors must be READ ONLY.
> 
> I don't recall the exact reason for that prohibition, but I wonder
> if there aren't user-visible anomalies reachable from the fact that
> you can bypass it.

I dug in the archives.  The above-quoted error message was added by
me in 048efc25e, responding to Heikki's complaint here:

https://www.postgresql.org/message-id/471F69FE.5000500%40enterprisedb.com

What I now see is that I put that check at the wrong level.  It
successfully blocks off the case Heikki complained of:

DROP TABLE IF EXISTS foo;
CREATE TABLE foo (id integer);
INSERT INTO foo SELECT a from generate_series(1,10) a;
BEGIN;
DECLARE c CURSOR FOR SELECT id FROM foo FOR UPDATE;
FETCH 2 FROM c;
UPDATE foo set ID=20 WHERE CURRENT OF c;
FETCH RELATIVE 0 FROM c;
COMMIT;

The FETCH RELATIVE 0 fails with

ERROR:  cursor can only scan forward
HINT:  Declare it with SCROLL option to enable backward scan.

However, if you replace that with the should-be-equivalent

FETCH ABSOLUTE 2 FROM c;

then what you get is not an error but

 id 

  3
(1 row)

which is for certain anomalous, because that is not the row you
saw as being row 2 in the initial FETCH.

The reason for this behavior is that the only-scan-forward check
is in the relatively low-level function PortalRunSelect, which
is passed a "forward" flag and a count.  The place that interprets
FETCH_ABSOLUTE and friends is DoPortalRunFetch, and what it's doing
in this particular scenario is to rewind to start and fetch forwards,
thus bypassing PortalRunSelect's error check.  And, since the query
is using FOR UPDATE, this table scan sees the row with ID=2 as already
dead.  (Its replacement with ID=20 has been installed at the end of
the table, so while it would be visible to the cursor, it's not at
the same position as before.)

So basically, we *do* have this check and have done since 2007,
but it's not water-tight for all variants of FETCH.  I think
tightening it up in HEAD and v14 is a no-brainer, but I'm a bit
more hesitant about whether to back-patch into stable branches.

regards, tom lane




Re: trap instead of error on 32 TiB table

2021-09-09 Thread Heikki Linnakangas

On 09/09/2021 17:25, Tom Lane wrote:

Having done that, the check in md.c could be reduced to an Assert,
although there's something to be said for leaving it as-is as an
extra layer of defense.


Some operations call smgrextend() directly, like B-tree index creation. 
We don't want those operations to hit an assertion either.


- Heikki




Re: .ready and .done files considered harmful

2021-09-09 Thread Bossart, Nathan
On 9/8/21, 10:49 AM, "Dipesh Pandit"  wrote:
> Updated log level to DEBUG3 and rebased the patch. PFA patch.

Thanks for the new patch.

+ * by checking the availability of next WAL file. "xlogState" specifies the
+ * segment number and timeline ID corresponding to the next WAL file.

"xlogState" probably needs to be updated here.

As noted before [0], I think we need to force a directory scan at the
beginning of pgarch_MainLoop() and when pgarch_ArchiverCopyLoop()
returns before we exit the "while" loop.  Else, there's probably a
risk that we skip archiving a file until the next directory scan.  IMO
forcing a directory scan at the beginning of pgarch_ArchiverCopyLoop()
is a simpler way to do roughly the same thing.  I'm skeptical that
persisting the next-anticipated state between calls to
pgarch_ArchiverCopyLoop() is worth the complexity.

Nathan

[0] 
https://www.postgresql.org/message-id/AC78607B-9DA6-41F4-B253-840D3DD964BF%40amazon.com



Re: parallelizing the archiver

2021-09-09 Thread Bossart, Nathan
On 9/7/21, 11:38 PM, "Julien Rouhaud"  wrote:
> On Wed, Sep 8, 2021 at 6:36 AM Bossart, Nathan  wrote:
>>
>> I'd like to gauge interest in parallelizing the archiver process.
>> [...]
>> Based on previous threads I've seen, I believe many in the community
>> would like to replace archive_command entirely, but what I'm proposing
>> here would build on the existing tools.
>
> Having a new implementation that would remove the archive_command is
> probably a better long term solution, but I don't know of anyone
> working on that and it's probably gonna take some time.  Right now we
> have a lot of users that face archiving bottleneck so I think it would
> be a good thing to implement parallel archiving, fully compatible with
> current archive_command, as a short term solution.

Thanks for chiming in.  I'm planning to work on a patch next week.

Nathan



Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2021-09-09 Thread Bossart, Nathan
On 9/8/21, 6:49 AM, "Bharath Rupireddy" 
 wrote:
> How about we create a new extension, called pg_walinspect (synonymous
> to pageinspect extension) with a bunch of SQL-callable functions that
> get the raw WAL records or stats out of a running postgres database
> instance in a more structured way that is easily consumable by all the
> users or DBAs or developers? We can also provide these functionalities
> into the core postgres (in xlogfuncs.c) instead of a new extension,
> but we would like it to be pluggable so that the functions will be
> used only if required.

+1

Nathan



Re: We don't enforce NO SCROLL cursor restrictions

2021-09-09 Thread Tom Lane
I wrote:
> The reason for this behavior is that the only-scan-forward check
> is in the relatively low-level function PortalRunSelect, which
> is passed a "forward" flag and a count.  The place that interprets
> FETCH_ABSOLUTE and friends is DoPortalRunFetch, and what it's doing
> in this particular scenario is to rewind to start and fetch forwards,
> thus bypassing PortalRunSelect's error check.

After some further study, I've reached a few conclusions:

* The missing bit in pquery.c is exactly that we'll allow a portal
rewind even with a no-scroll cursor.  I think that the reason it's
like that is that the code was mainly interested in closing off
cases where we'd attempt to run a plan backwards, to protect plan
node types that can't do that.  As far as the executor is concerned,
rewind-to-start is okay in any case.  However, as we see from this
thread, that definition doesn't protect us against anomalous results
from volatile queries.  So putting an error check in DoPortalRewind
seems to be enough to fix this, as in patch 0001 below.  (This also
fixes one bogus copied-and-pasted comment, and adjusts the one
regression test case that breaks.)

* The anomaly for held cursors boils down to ba2c6d6ce having ignored
this good advice in portal.h:

 * ... Also note that various code inspects atStart and atEnd, but
 * only the portal movement routines should touch portalPos.

Thus, PersistHoldablePortal has no business changing the cursor's
atStart/atEnd/portalPos.  The only thing that resetting portalPos
actually bought us was to make the tuplestore_skiptuples call a bit
further down into a no-op, but we can just bypass that call for a
no-scroll cursor, as in 0002 below.  However, 0002 does have a
dependency on 0001, because if we allow tuplestore_rescan on the
holdStore it will expose the fact that the tuplestore doesn't contain
the whole cursor result.  (I was a bit surprised to find that those
were the only two places where we weren't positioning in the holdStore
by dead reckoning, but it seems to be the case.)

I was feeling nervous about back-patching 0001 already, and finding
that one of our own regression tests was dependent on the omission
of this check doesn't make me any more confident.  However, I'd really
like to be able to back-patch 0002 to get rid of the held-cursor
positioning anomaly.  What I think might be an acceptable compromise
in the back branches is to have DoPortalRewind complain only if
(a) it needs to reposition a no-scroll cursor AND (b) the cursor has
a holdStore, ie it's held over from some previous transaction.
The extra restriction (b) should prevent most people from running into
the error check, even if they've been sloppy about marking cursors
scrollable.  In HEAD we'd drop the restriction (b) and commit 0001 as
shown.  I'm kind of inclined to do that in v14 too, but there's an
argument to be made that it's too late in the beta process to be
changing user-visible semantics without great need.

Thoughts?

regards, tom lane

diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a3c27d9d74..a9e1056057 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -1472,9 +1472,8 @@ PortalRunFetch(Portal portal,
  * DoPortalRunFetch
  *		Guts of PortalRunFetch --- the portal context is already set up
  *
- * count <= 0 is interpreted as a no-op: the destination gets started up
- * and shut down, but nothing else happens.  Also, count == FETCH_ALL is
- * interpreted as "all rows".  (cf FetchStmt.howMany)
+ * Here, count < 0 typically reverses the direction.  Also, count == FETCH_ALL
+ * is interpreted as "all rows".  (cf FetchStmt.howMany)
  *
  * Returns number of rows processed (suitable for use in result tag)
  */
@@ -1491,6 +1490,15 @@ DoPortalRunFetch(Portal portal,
 		   portal->strategy == PORTAL_ONE_MOD_WITH ||
 		   portal->strategy == PORTAL_UTIL_SELECT);
 
+	/*
+	 * Note: we disallow backwards fetch (including re-fetch of current row)
+	 * for NO SCROLL cursors, but we interpret that very loosely: you can use
+	 * any of the FetchDirection options, so long as the end result is to move
+	 * forwards by at least one row.  Currently it's sufficient to check for
+	 * NO SCROLL in DoPortalRewind() and in the forward == false path in
+	 * PortalRunSelect(); but someday we might prefer to account for that
+	 * restriction explicitly here.
+	 */
 	switch (fdirection)
 	{
 		case FETCH_FORWARD:
@@ -1668,6 +1676,19 @@ DoPortalRewind(Portal portal)
 {
 	QueryDesc  *queryDesc;
 
+	/*
+	 * No work is needed if we've not advanced the cursor (and we don't want
+	 * to throw a NO SCROLL error in this case).
+	 */
+	if (portal->atStart && !portal->atEnd)
+		return;
+
+	if (portal->cursorOptions & CURSOR_OPT_NO_SCROLL)
+		ereport(ERROR,
+(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cursor can only scan forward"),
+ errhint("Declare it with SCROLL option to enable backward scan.")));
+
 	/* Re

Re: PROXY protocol support

2021-09-09 Thread Jacob Champion
On Wed, 2021-09-08 at 18:51 +, Jacob Champion wrote:
> I still owe you that overall review. Hoping to get to it this week.

And here it is. I focused on things other than UnwrapProxyConnection()
for this round, since I think that piece is looking solid.

> +   if (port->isProxy)
> +   {
> +   ereport(LOG,
> +   (errcode_for_socket_access(),
> +errmsg("Ident authentication cannot be used 
> over PROXY connections")));

What are the rules on COMMERROR vs LOG when dealing with authentication
code? I always thought COMMERROR was required, but I see now that LOG
(among others) is suppressed to the client during authentication.

>  #ifdef USE_SSL
> /* No SSL when disabled or on Unix sockets */
> -   if (!LoadedSSL || IS_AF_UNIX(port->laddr.addr.ss_family))
> +   if (!LoadedSSL || (IS_AF_UNIX(port->laddr.addr.ss_family) && 
> !port->isProxy))
> SSLok = 'N';
> else
> SSLok = 'S';/* Support for SSL */
> @@ -2087,7 +2414,7 @@ retry1:
>  
>  #ifdef ENABLE_GSS
> /* No GSSAPI encryption when on Unix socket */
> -   if (!IS_AF_UNIX(port->laddr.addr.ss_family))
> +   if (!IS_AF_UNIX(port->laddr.addr.ss_family) || port->isProxy)
> GSSok = 'G';

Now that we have port->daddr, could these checks be simplified to just
IS_AF_UNIX(port->daddr...)? Or is there a corner case I'm missing for
the port->isProxy case?

> +* Note: AuthenticationTimeout is applied here while waiting for the
> +* startup packet, and then again in InitPostgres for the duration of any
> +* authentication operations.  So a hostile client could tie up the
> +* process for nearly twice AuthenticationTimeout before we kick him off.

This comment needs to be adjusted after the move; waiting for the
startup packet comes later, and it looks like we can now tie up 3x the
timeout for the proxy case.

> +   /* Check if this is a proxy connection and if so unwrap the proxying 
> */
> +   if (port->isProxy)
> +   {
> +   enable_timeout_after(STARTUP_PACKET_TIMEOUT, 
> AuthenticationTimeout * 1000);
> +   if (UnwrapProxyConnection(port) != STATUS_OK)
> +   proc_exit(0);

I think the timeout here could comfortably be substantially less than
the overall authentication timeout, since the proxy should send its
header immediately even if the client takes its time with the startup
packet. The spec suggests allowing 3 seconds minimum to cover a
retransmission. Maybe something to tune in the future?

> +   /* Also listen to the PROXY port on this address, if 
> configured */
> +   if (ProxyPortNumber)
> +   {
> +   if (strcmp(curhost, "*") == 0)
> +   socket = StreamServerPort(AF_UNSPEC, 
> NULL,
> + 
> (unsigned short) ProxyPortNumber,
> + 
> NULL,
> + 
> ListenSocket, MAXLISTEN);

Sorry if you already talked about this upthread somewhere, but it looks
like another downside of treating "proxy mode" as a server-wide on/off
switch is that it cuts the effective MAXLISTEN in half, from 64 to 32,
since we're opening two sockets for every address. If I've understood
that correctly, it might be worth mentioning in the docs.

> -   if (!success && elemlist != NIL)
> +   if (socket == NULL && elemlist != NIL)
> ereport(FATAL,
> (errmsg("could not create any TCP/IP 
> sockets")));

With this change in PostmasterMain, it looks like `success` is no
longer a useful variable. But I'm not convinced that this is the
correct logic -- this is just checking to see if the last socket
creation succeeded, as opposed to seeing if any of them succeeded. Is
that what you intended?

> +plan tests => 25;
> +
> +my $node = get_new_node('node');

The TAP test will need to be rebased over the changes in 201a76183e.

--Jacob


Re: Allow escape in application_name

2021-09-09 Thread Fujii Masao




On 2021/09/09 21:36, kuroda.hay...@fujitsu.com wrote:

In this case, I expected to use fallback_application_name instead of appname,
but I missed the case where appname was set as a server option.
Maybe we should do continue when buf.data is \0.


Yes.

+   /*
+* If the input format is wrong and the string 
becomes '\0',
+* this parameter is no longer used.
+* We conitnue searching application_name.
+*/
+   if (values[i] == '\0')
+   continue;
+   else
+   break;

We can simplify the code as follows.

if (values[i] != '\0')
break;


Yeah, and I found that "%+5p" is another example.
Our padding function does not understand plus sign
(and maybe this is the specification of printf()), but strtol() reads.

> I did not fix it here because I lost my way. Do we treat these cases
and continue to use strtol(), or use process_padding()?


IMO it's better to use process_padding() to process log_line_prefix and
postgres_fdw.application in the same way as possible.
Which would be less confusing.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add jsonlog log_destination for JSON server logs

2021-09-09 Thread Michael Paquier
On Thu, Sep 09, 2021 at 11:17:01AM +0900, Michael Paquier wrote:
> I was thinking to track stderr as a case where no bits are set in the
> flags for the area of the destinations, but that's a bit crazy if we
> have a lot of margin in what can be stored.  I have looked at that and
> finished with the attached which is an improvement IMO, especially
> when it comes to the header validation.

One part that was a bit flaky on further consideration is that the
header validation would treat as correct the case where both stderr
and csvlog are set in the flags.  I have finished by just using
pg_popcount() on one byte with a filter on the log destinations,
making the whole thing more robust.  If there are any objections, please let
me know.
--
Michael
diff --git a/src/include/postmaster/syslogger.h 
b/src/include/postmaster/syslogger.h
index 1491eecb0f..c79dfbeba2 100644
--- a/src/include/postmaster/syslogger.h
+++ b/src/include/postmaster/syslogger.h
@@ -46,8 +46,7 @@ typedef struct
charnuls[2];/* always \0\0 */
uint16  len;/* size of this chunk (counts 
data only) */
int32   pid;/* writer's pid */
-   charis_last;/* last chunk of message? 't' 
or 'f' ('T' or
-* 'F' for CSV 
case) */
+   bits8   flags;  /* bitmask of PIPE_PROTO_* */
chardata[FLEXIBLE_ARRAY_MEMBER];/* data payload starts 
here */
 } PipeProtoHeader;
 
@@ -60,6 +59,11 @@ typedef union
 #define PIPE_HEADER_SIZE  offsetof(PipeProtoHeader, data)
 #define PIPE_MAX_PAYLOAD  ((int) (PIPE_CHUNK_SIZE - PIPE_HEADER_SIZE))
 
+/* flag bits for PipeProtoHeader->flags */
+#define PIPE_PROTO_IS_LAST 0x01/* last chunk of message? */
+/* log destinations */
+#define PIPE_PROTO_DEST_STDERR 0x10
+#define PIPE_PROTO_DEST_CSVLOG 0x20
 
 /* GUC options */
 extern bool Logging_collector;
diff --git a/src/backend/postmaster/syslogger.c 
b/src/backend/postmaster/syslogger.c
index cad43bdef2..bca3883572 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -38,6 +38,7 @@
 #include "nodes/pg_list.h"
 #include "pgstat.h"
 #include "pgtime.h"
+#include "port/pg_bitutils.h"
 #include "postmaster/fork_process.h"
 #include "postmaster/interrupt.h"
 #include "postmaster/postmaster.h"
@@ -885,14 +886,15 @@ process_pipe_input(char *logbuffer, int 
*bytes_in_logbuffer)
{
PipeProtoHeader p;
int chunklen;
+   bits8   dest_flags;
 
/* Do we have a valid header? */
memcpy(&p, cursor, offsetof(PipeProtoHeader, data));
+   dest_flags = p.flags & (PIPE_PROTO_DEST_STDERR | 
PIPE_PROTO_DEST_CSVLOG);
if (p.nuls[0] == '\0' && p.nuls[1] == '\0' &&
p.len > 0 && p.len <= PIPE_MAX_PAYLOAD &&
p.pid != 0 &&
-   (p.is_last == 't' || p.is_last == 'f' ||
-p.is_last == 'T' || p.is_last == 'F'))
+   pg_popcount((char *) &dest_flags, 1) == 1)
{
List   *buffer_list;
ListCell   *cell;
@@ -906,8 +908,15 @@ process_pipe_input(char *logbuffer, int 
*bytes_in_logbuffer)
if (count < chunklen)
break;
 
-   dest = (p.is_last == 'T' || p.is_last == 'F') ?
-   LOG_DESTINATION_CSVLOG : LOG_DESTINATION_STDERR;
+   if ((p.flags & PIPE_PROTO_DEST_STDERR) != 0)
+   dest = LOG_DESTINATION_STDERR;
+   else if ((p.flags & PIPE_PROTO_DEST_CSVLOG) != 0)
+   dest = LOG_DESTINATION_CSVLOG;
+   else
+   {
+   /* this should never happen as of the header 
validation */
+   Assert(false);
+   }
 
/* Locate any existing buffer for this source pid */
buffer_list = buffer_lists[p.pid % NBUFFER_LISTS];
@@ -924,7 +933,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
free_slot = buf;
}
 
-   if (p.is_last == 'f' || p.is_last == 'F')
+   if ((p.flags & PIPE_PROTO_IS_LAST) == 0)
{
/*
 * Save a complete non-final chunk in a per-pid 
buffer
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 816b071afa..2af87ee3bd 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3250,11 +3250,16 @@ write_pipe_chunk

Re: corruption of WAL page header is never reported

2021-09-09 Thread Fujii Masao



On 2021/09/07 2:02, Fujii Masao wrote:

Even if we do this while NOT in standby mode, ISTM that this function doesn't
return with a valid errmsg_buf because it's reset. So probably the comment
should be updated as follows?

-
We don't do this while not in standby mode because we don't need to retry
immediately if the page header is not valid. Instead, XLogReadRecord() is
responsible to check the page header.
-


I updated the comment as above. Patch attached.

-* it's not valid. This may seem unnecessary, because XLogReadRecord()
+* it's not valid. This may seem unnecessary, because ReadPageInternal()
 * validates the page header anyway, and would propagate the failure up 
to

I also applied this change because ReadPageInternal() not XLogReadRecord()
calls XLogReaderValidatePageHeader().

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/src/backend/access/transam/xlog.c 
b/src/backend/access/transam/xlog.c
index e51a7a749d..35817d9d1e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -12274,7 +12274,7 @@ retry:
 
/*
 * Check the page header immediately, so that we can retry immediately 
if
-* it's not valid. This may seem unnecessary, because XLogReadRecord()
+* it's not valid. This may seem unnecessary, because ReadPageInternal()
 * validates the page header anyway, and would propagate the failure up 
to
 * ReadRecord(), which would retry. However, there's a corner case with
 * continuation records, if a record is split across two pages such that
@@ -12298,9 +12298,22 @@ retry:
 *
 * Validating the page header is cheap enough that doing it twice
 * shouldn't be a big deal from a performance point of view.
+*
+* Note that we don't do this while not in standby mode because we don't
+* need to retry immediately if the page header is not valid. Instead,
+* ReadPageInternal() is responsible for validating the page header.
 */
-   if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
+   if (StandbyMode &&
+   !XLogReaderValidatePageHeader(xlogreader, targetPagePtr, 
readBuf))
{
+   /*
+* Emit this error right now then retry this page immediately. 
Use
+* errmsg_internal() because the message was already translated.
+*/
+   if (xlogreader->errormsg_buf[0])
+   ereport(emode_for_corrupt_record(emode, EndRecPtr),
+   (errmsg_internal("%s", 
xlogreader->errormsg_buf)));
+
/* reset any error XLogReaderValidatePageHeader() might have 
set */
xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;


Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2021-09-09 Thread Michael Paquier
On Thu, Sep 09, 2021 at 10:49:46PM +, Bossart, Nathan wrote:
> +1

A backend approach has the advantage that you can use the proper locks
to make sure that a segment is not recycled or removed by a concurrent
checkpoint, so that would be reliable.
--
Michael


signature.asc
Description: PGP signature


Re: Estimating HugePages Requirements?

2021-09-09 Thread Michael Paquier
On Thu, Sep 09, 2021 at 09:53:22PM +, Bossart, Nathan wrote:
> For 0002, I have two small concerns.  My first concern is that it
> might be confusing to customers when the runtime GUCs cannot be
> returned for a running server.  We have the note in the docs, but if
> you're encountering it on the command line, it's not totally clear
> what the problem is.

Yeah, that's true.  There are more unlikely-to-happen errors that
could be triggered and prevent the command from working.  I have never
tried using error_context_stack in a code path as early as that, to be
honest.

> Running these commands with log_min_messages=debug5 emits way more
> information for the runtime-computed GUCs than for others, but IMO
> that is alright.  However, perhaps we should adjust the logging in
> 0002 to improve the default user experience.  I attached an attempt at
> that.

Registered bgworkers would generate a DEBUG entry, for one.

> I'm not tremendously happy with the patch, but I hope that it at least
> helps with the discussion.

As long as the behavior is documented, I'd be fine with the approach of
keeping the code in its simplest shape.  I agree that the message is
confusing; still, it is not wrong either, as we are trying to query a
run-time parameter but need the lock to do so.
--
Michael


signature.asc
Description: PGP signature


Re: Estimating HugePages Requirements?

2021-09-09 Thread Bossart, Nathan
On 9/9/21, 7:03 PM, "Michael Paquier"  wrote:
> As long as the behavior is documented, I'd be fine with the approach of
> keeping the code in its simplest shape.  I agree that the message is
> confusing; still, it is not wrong either, as we are trying to query a
> run-time parameter but need the lock to do so.

That seems alright to me.

Nathan



RE: Added schema level support for publication.

2021-09-09 Thread houzj.f...@fujitsu.com
From Wed, Sept 8, 2021 7:44 PM vignesh C  wrote:
> Modified
> Thanks for the comments, the attached v26 patch has the changes for the same.

Hi,

Thanks for updating the patch, I have a suggestion for the gram.y.

Currently, we have the following two members in PublicationObjSpec to 
distinguish
between names of different objects for the post-analysis phase.

>bool   inh;
>bool   spl_rel_type_syn;

I was wondering whether we have another way to distinguish these, which could
make the code smaller. I tried several approaches and found a possibly better one.

First, I referred to the design of the Grant/Revoke syntax; it uses two members,
'targtype' and 'objtype', to mark different types. 'targtype' can be
ACL_TARGET_OBJECT (single target) or ACL_TARGET_ALL_IN_SCHEMA (schema level).
'objtype' is the actual object type, which can be OBJECT_SEQUENCE, OBJECT_TABLE,
etc. I think if we follow this approach, the code could be cleaner.

Second, we can move the special relation expressions into a separate rule
'special_relation_expr' like the following; this can remove duplicate code
used by pubobj_expr.
--
relation_expr:
qualified_name
{}
| special_relation_expr
{
$$ = $1;
}
;
special_relation_expr:
qualified_name '*'
{}
| ONLY qualified_name
{}
| ONLY '(' qualified_name ')'
{}
;
--

Finally, the gram.y will look like the following.
Personally, the code looks cleaner in this approach.

-
pubobj_expr:
any_name
{
/* inheritance query, implicitly */
$$ = makeNode(PublicationObjSpec);
$$->targettype = TARGETOBJ_UNKNOWN;
$$->object = $1;
}
| special_relation_expr
{
$$ = makeNode(PublicationObjSpec);
$$->targettype = TARGETOBJ_TABLE;
$$->object = $1;
}
| CURRENT_SCHEMA
{
$$ = makeNode(PublicationObjSpec);
$$->object = 
list_make1(makeString("CURRENT_SCHEMA"));
$$->targettype = TARGETOBJ_SCHEMA;
}
;

/* FOR TABLE and FOR ALL TABLES IN SCHEMA specifications */
PublicationObjSpec: TABLE pubobj_expr
{
$$ = $2;
$$->pubobjtype = 
PUBLICATIONOBJ_TABLE_LIST;
$$->location = @1;
}

| ALL TABLES IN_P SCHEMA pubobj_expr
{
$$ = $5;
$$->pubobjtype = 
PUBLICATIONOBJ_SCHEMA_LIST;
$$->location = @1;
}
| pubobj_expr
{
$$ = $1;
$$->pubobjtype = 
PUBLICATIONOBJ_UNKNOWN;
$$->location = @1;
}
;

Attached is a diff patch based on the v26-0002 patch, which changes the
corresponding code based on the design above. Please have a look.

Best regards,
Hou zj


0001-refactor_patch
Description: 0001-refactor_patch


RE: Added schema level support for publication.

2021-09-09 Thread houzj.f...@fujitsu.com
From Friday, September 10, 2021 10:33 AM Hou Zhijie 
wrote:
> From Wed, Sept 8, 2021 7:44 PM vignesh C  wrote:
> > Modified
> > Thanks for the comments, the attached v26 patch has the changes for the
> same.
> 
> Hi,
> 
> Thanks for updating the patch, I have a suggestion for the gram.y.
> 
> Currently, we have the following two members in PublicationObjSpec to
> distinguish
> between names of different objects for the post-analysis phase.
> 
> >bool inh;
> >bool spl_rel_type_syn;
> 
> I was thinking do we have another way to distinguish that which can make code
> smaller. I tried serval approaches and found a possible better approach.
> 
> First, I refer to the design of Grant/Revoke syntax, it use two members
> 'targtype' and 'objtype' to mark different type. 'targtype' can be
> ACL_TARGET_OBJECT(single target) or ACL_TARGET_ALL_IN_SCHEMA(schema
> level)
> .'objtype' is the actual type which can be OBJECT_SEQUENCE or OBJECT_TABLE
> or
> ... . I think if we follow this way, the code could be cleaner.
> 
> Second, we can move the special relation expression into a separate rule
> 'speical_relation_expr' like the following, this can remove duplicate code
> used by pubobj_expr.
> --
> relation_expr:
> qualified_name
> {}
> | speical_relation_expr
> {
> $$ = $1;
> }
> ;
> speical_relation_expr:
> qualified_name '*'
> {}
>   | ONLY qualified_name
>   {}
>   | ONLY '(' qualified_name ')'
>   {}
>   ;
> --
> 
> Finally, the gram.y will look like the following.
> Personally, the code looks cleaner in this approach.

Besides, if we don't want to use a new flag to distinguish table names and
schema names, we can just check the NodeTag to tell the difference.

Attached are two diff patches based on the latest schema patch,
which change the code with and without a flag.

Best regards,
Hou zj


refactor-with-flag_patch
Description: refactor-with-flag_patch


refactor-without-flag_patch
Description: refactor-without-flag_patch


RE: Allow escape in application_name

2021-09-09 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

Thank you for quick reviewing!

> We can simplify the code as follows.
> 
>  if (values[i] != '\0')
>  break;

Fixed. And type mismatching was also fixed.

> IMO it's better to use process_padding() to process log_line_prefix and
> postgres_fdw.application in the same way as possible.
> Which would be less confusing.

OK, I followed that. Some comments were added above the function.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v13_0002_allow_escapes.patch
Description: v13_0002_allow_escapes.patch


RE: Allow escape in application_name

2021-09-09 Thread houzj.f...@fujitsu.com
From Friday, September 10, 2021 11:24 AM kuroda.hay...@fujitsu.com  wrote:
> > We can simplify the code as follows.
> >
> >  if (values[i] != '\0')
> >  break;
> 
> Fixed. And type mismatching was also fixed.
> 
> > IMO it's better to use process_padding() to process log_line_prefix
> > and postgres_fdw.application in the same way as possible.
> > Which would be less confusing.
> 
> OK, I followed that. Some comments were added above the function.

Hi Kuroda-san,

I found one minor thing in the patch.

+   appendStringInfoSpaces(buf,
+   
   padding > 0 ? padding : -padding);

Is it better to use Abs(padding) here?
Although some existing code uses " padding > 0 ? padding : -padding ".
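
That is, something like:

    appendStringInfoSpaces(buf, Abs(padding));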

Best regards,
Hou zj




Re: Add jsonlog log_destination for JSON server logs

2021-09-09 Thread Michael Paquier
On Wed, Sep 01, 2021 at 04:39:43PM -0400, Sehrope Sarkuni wrote:
> This version splits out the existing csvlog code into its own file and
> centralizes the common helpers into a new elog-internal.h so that they're
> only included by the actual write_xyz sources.
> 
> That makes the elog.c changes in the JSON logging patch minimal as all it's
> really doing is invoking the new write_jsonlog(...) function.
> 
> It also adds those missing fields to the JSON logger output.

Forking this thread a bit while looking at 0002, which adds new tests
for csvlog.  While I agree that it would be useful to have more
coverage with the syslogger message chunk protocol in this area, I
think that having a separate test is a waste of resources.  Creating a
new node is not cheap either, and this adds more wait phases, making
the tests take longer.  It would be much better to extend
004_logrotate.pl and update it to use log_destination = 'stderr,
csvlog', to minimize the number of nodes we create as well as the
additional amount of time we'd spend on those tests.  Plugging
JSON into that would not be complicated either once we have in place a
set of small routines that limit the code duplication between the
checks for each log destination type.
--
Michael


signature.asc
Description: PGP signature


RE: Allow escape in application_name

2021-09-09 Thread kuroda.hay...@fujitsu.com
Dear Hou,

> I found one minor thing in the patch.

Thanks!

> Is it better to use Abs(padding) here ?
> Although some existing codes are using " padding > 0 ? padding : -padding ".

+1, fixed.

And I found that some comments were missing above the padding calculation,
so I added them.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v14_0002_allow_escapes.patch
Description: v14_0002_allow_escapes.patch


Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-09 Thread Thomas Munro
On Tue, Sep 7, 2021 at 7:00 PM Alexander Lakhin  wrote:
> 07.09.2021 09:05, Michael Paquier wrote:
> > On Tue, Sep 07, 2021 at 09:00:01AM +0300, Alexander Lakhin wrote:
> >> The new approach looks very promising. Knowing that the file is really
> >> in the DELETE_PENDING state simplifies a lot.
> >> I've tested the patch v2_0001_Check... with my demo tests [1] and [2],
> >> and it definitely works.

> > Oho, nice.  Just to be sure.  You are referring to
> > v2-0001-Check*.patch posted here, right?
> > https://www.postgresql.org/message-id/ca+hukgkj3p+2acibgaccf_cxe0jlcyevwhexvopk6ul1+v-...@mail.gmail.com

> Yes, i've tested that one, on the master branch (my tests needed a minor
> modification due to PostgresNode changes).

Thanks very much!

Time to tidy up some loose ends.  There are a couple of judgement
calls involved.  Here's what Andres and I came up with in an off-list
chat.  Any different suggestions?

1.  I abandoned the "direct NtCreateFile()" version for now.  I guess
using more and wider unstable interfaces might expose us to greater
risk of silent API/behavior changes or have subtle bugs.  If we ever
have a concrete reason to believe that RtlGetLastNtStatus() is not
reliable here, we could reconsider.

2.  I dropped the assertion that the signal event has been created
before the first call to the open() wrapper.  Instead, I taught
pg_usleep() to fall back to plain old SleepEx() if the signal stuff
isn't up yet.  Other solutions are possible of course, but it struck
me as a bad idea to place initialisation ordering constraints on very
basic facilities like open() and stat().

I should point out explicitly that with this patch, stat() benefits
from open()'s tolerance for sharing violations, as a side effect.
That is, it'll retry for a short time in the hope that whoever opened
our file without allowing sharing will soon go away.  I don't know how
useful that bandaid loop really is in practice, but I don't see why
we'd want that for open() and not stat(), so this change seems good to
me on consistency grounds at the very least.

3.  We fixed the warnings about macro redefinition with #define
UMDF_USING_NTSTATUS and #include  in win32_port.h.  (Other
ideas considered: (1) Andres reported that it also works to move the
#include to ~12 files that need things from it, ie things that were
suppressed from windows.h by that macro and must now be had from
ntstatus.h, but the files you have to change are probably different in
back branches if we decide to do that, (2) I tried defining that macro
locally in files that need it, *before* including c.h/postgres.h, and
then locally include ntstatus.h afterwards, but that seems to violate
project style and generally seems weird.)

Another thing to point out explicitly is that I added a new file
src/port/win32ntdll.c, which is responsible for fishing out the NT
function pointers.  It was useful to be able to do that in the
abandoned NtCreateFile() variant because it needed three of them and I
could reduce boiler-plate noise with a static array of function names
to loop over.  In this version the array has just one element, but I'd
still rather centralise this stuff in one place and make it easy to
add any more of these that we eventually find a need for.
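
For anyone who hasn't seen that pattern before, the lookup boils down to
something like the following (a simplified sketch, not the actual contents of
win32ntdll.c):

    typedef LONG (__stdcall *RtlGetLastNtStatus_t) (void);

    static RtlGetLastNtStatus_t pg_RtlGetLastNtStatus = NULL;

    static int
    initialize_ntdll(void)
    {
        /* ntdll.dll is already mapped into every Windows process */
        HMODULE     ntdll = GetModuleHandleA("ntdll.dll");

        if (ntdll == NULL)
            return -1;

        pg_RtlGetLastNtStatus = (RtlGetLastNtStatus_t)
            GetProcAddress(ntdll, "RtlGetLastNtStatus");

        return pg_RtlGetLastNtStatus != NULL ? 0 : -1;
    }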

BTW, I also plan to help Victor get his "POSIX semantics" patch[1]
into the tree (and extend it to cover more ops).  That should make
these problems go away in a more complete way IIUC, but doesn't work
everywhere (not sure if we have any build farm animals where it
doesn't work, if so it might be nice to change that), so it's
complementary to this patch.  (My earlier idea that that stuff would
magically start happening for free on all relevant systems some time
soon has faded.)

[1] 
https://www.postgresql.org/message-id/flat/a529b660-da15-5b62-21a0-9936768210fd%40postgrespro.ru
From 6ab7d5e6b5cc6bf62513a5264641e13fc007ebc7 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Sun, 5 Sep 2021 23:49:23 +1200
Subject: [PATCH v3] Check for STATUS_DELETE_PENDING on Windows.

1.  Update our open() wrapper to check for NT's STATUS_DELETE_PENDING
and translate it to appropriate errors.  This is done with
RtlGetLastNtStatus(), which is dynamically loaded from ntdll.  A new
file win32ntdll.c centralizes lookup of NT functions, in case we decide
to add more in the future.

2.  Remove non-working code that was trying to do something similar for
stat(), and just reuse the open() wrapper code.  As a side effect,
stat() also gains resilience against "sharing violation" errors.

3.  Since stat() is used very early in process startup, remove the
requirement that the Win32 signal event has been created before
pgwin32_open_handle() is reached.  Instead, teach pg_usleep() to fall
back to a non-interruptible sleep if reached before the signal event is
available.

Reviewed-by: Andres Freund 
Reviewed-by: Alexander Lakhin 
Discussion: https://postgr.es/m/CA%2BhUKGJz_pZTF9mckn6XgSv69%2BjGwdgLkxZ6b3NWGLBCVjqUZA%40mail.gmail.com
---
 con

Re: parallelizing the archiver

2021-09-09 Thread Julien Rouhaud
On Fri, Sep 10, 2021 at 6:30 AM Bossart, Nathan  wrote:
>
> Thanks for chiming in.  I'm planning to work on a patch next week.

Great news!

About the technical concerns:

> I'm currently thinking of
> something a bit like autovacuum_max_workers, but the archive workers
> would be created once and would follow a competing consumers model.

In this approach, would the launched archiver workers be kept around as
long as the instance is up, or should they be stopped once they're no
longer required, e.g. after a temporary spike in write activity?
I think we should make sure that at least one worker is always up.

> Another approach I'm looking at is to use background worker processes,
> although I'm not sure if linking such a critical piece of
> functionality to max_worker_processes is a good idea.  However, I do
> see that logical replication uses background workers.

I think that using background workers is a good approach, and the
various GUCs in that area should allow users to properly configure
archiving too.  If that's not the case, it might be an opportunity to
add some new infrastructure that could benefit all bgworker users.
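
For reference, plugging an archive worker into that machinery would be the
standard registration pattern sketched below; ArchiveWorkerMain and the
specific flags are purely hypothetical here:

#include "postgres.h"
#include "postmaster/bgworker.h"

void
register_archive_worker(void)
{
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS;
    worker.bgw_start_time = BgWorkerStart_RecoveryFinished;
    worker.bgw_restart_time = 5;    /* restart after 5s on crash */
    snprintf(worker.bgw_name, BGW_MAXLEN, "archive worker");
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "postgres");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "ArchiveWorkerMain");

    RegisterBackgroundWorker(&worker);  /* must run at postmaster startup */
}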




Re: Added schema level support for publication.

2021-09-09 Thread Amit Kapila
On Fri, Sep 10, 2021 at 8:54 AM houzj.f...@fujitsu.com
 wrote:
>
> From Friday, September 10, 2021 10:33 AM Hou Zhijie 
> wrote:
>
> Besides, If we don't want to use a new flag to distinguish tablename and 
> schemaname,
> We can only check the NodeTag to distinguish the difference.
>
> Attach two diff patches based on the latest schema patch
> which change the code with a flag and without a flag.
>

I would prefer a version without additional flags unless you think it
is difficult to extend it in the future for other objects like
sequences, which as far as I can see shouldn't be the case. Is there a
reason to define pubobj_name similar to any_name? If so, then please
do add the comments. One reason I could think of is that any_name is
not used for schema names currently, which might have motivated you to
define a separate naming convention for publication.

-- 
With Regards,
Amit Kapila.




Re: Toast compression method options

2021-09-09 Thread Jaime Casanova
On Mon, Jul 19, 2021 at 01:24:03PM +0530, Dilip Kumar wrote:
> On Sun, Jul 18, 2021 at 9:15 PM Dilip Kumar  wrote:
> >
> > On Wed, Jul 14, 2021 at 5:35 PM vignesh C  wrote:
> > >
> > > On Thu, May 6, 2021 at 7:24 PM Dilip Kumar  wrote
> > >
> > > The patch does not apply on Head anymore, could you rebase and post a
> > > patch. I'm changing the status to "Waiting for Author".
> >
> > Okay, I will rebase and send it by next week.
> 
> I have rebased the patch.
> 

Hi,

This doesn't apply cleanly nor compile.
Are you planning to send a rebased version?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: Toast compression method options

2021-09-09 Thread Dilip Kumar
On Fri, 10 Sep 2021 at 10:40 AM, Jaime Casanova <jcasa...@systemguards.com.ec> wrote:

> On Mon, Jul 19, 2021 at 01:24:03PM +0530, Dilip Kumar wrote:
> > On Sun, Jul 18, 2021 at 9:15 PM Dilip Kumar 
> wrote:
> > >
> > > On Wed, Jul 14, 2021 at 5:35 PM vignesh C  wrote:
> > > >
> > > > On Thu, May 6, 2021 at 7:24 PM Dilip Kumar 
> wrote
> > > >
> > > > The patch does not apply on Head anymore, could you rebase and post a
> > > > patch. I'm changing the status to "Waiting for Author".
> > >
> > > Okay, I will rebase and send it by next week.
> >
> > I have rebased the patch.
> >
>
> Hi,
>
> This doesn't apply cleanly nor compile.
> Are you planning to send a rebased version?


I will do that early next week.

> --
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: parallelizing the archiver

2021-09-09 Thread Andrey Borodin



> On 8 Sep 2021, at 03:36, Bossart, Nathan wrote:
> 
> Anyway, I'm curious what folks think about this.  I think it'd help
> simplify server administration for many users.

BTW this thread is also related [0].

My 2 cents.
It's OK if an external tool is responsible for concurrency. Do we want this 
complexity in core? Many users do not enable archiving at all.
Maybe just add a parallelism API for external tools?
It's much easier to control concurrency in an external tool than in PostgreSQL 
core. Maintaining a parallel worker is tremendously harder than spawning a 
goroutine, thread, task or whatever.
The external tool needs to know when an xlog segment is ready and needs to 
report when it's done. Postgres should just ensure that the external 
archiver/restorer is running.
For example, the external tool could read xlog names from stdin and report 
finished files on stdout. I can prototype such a tool swiftly :)
E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment filenames 
to its stdin. And no more listing of archive_status and hacky algorithms to 
predict the next WAL name and completion time!
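
From the tool's side it could be as small as the toy below (purely an
illustration of the proposed protocol; the names and framing are made up and
this is not wal-g's actual interface): read one ready segment name per line
from stdin, archive it, and print the name back once it is safely stored.

#include <stdio.h>
#include <string.h>

static int
archive_one(const char *name)
{
    /* a real tool would copy pg_wal/<name> to archive storage here */
    return 0;
}

int
main(void)
{
    char    line[1024];

    while (fgets(line, sizeof(line), stdin) != NULL)
    {
        line[strcspn(line, "\r\n")] = '\0';     /* strip the newline */
        if (line[0] == '\0')
            continue;
        if (archive_one(line) == 0)
        {
            printf("%s\n", line);   /* report the segment as archived */
            fflush(stdout);
        }
    }
    return 0;
}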

Thoughts?

Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/flat/CA%2BTgmobhAbs2yabTuTRkJTq_kkC80-%2Bjw%3DpfpypdOJ7%2BgAbQbw%40mail.gmail.com



RE: Added schema level support for publication.

2021-09-09 Thread houzj.f...@fujitsu.com
From Friday, September 10, 2021 1:10 PM Amit Kapila  
wrote:
> On Fri, Sep 10, 2021 at 8:54 AM Hou zhijie  wrote:
> >
> > From Friday, September 10, 2021 10:33 AM Hou
> Zhijie wrote:
> >
> > Besides, If we don't want to use a new flag to distinguish tablename and
> schemaname,
> > We can only check the NodeTag to distinguish the difference.
> >
> > Attach two diff patches based on the latest schema patch
> > which change the code with a flag and without a flag.
> >
> 
> I would prefer a version without additional flags unless you think it
> is difficult to extend it in the future for other objects like
> sequences which as far as I can see shouldn't be the case.

OK, I agree.

> Is there a
> reason to define pubobj_name similar to any_name? If so, then please
> do add the comments. One reason I could think of is that any_name is
> not used for schema names currently which might have motivated you to
> define a separate naming convention for publication.

When I used any_name, Bison reported that the dot ('.') in the rule attr
would have a shift/reduce conflict with the dot ('.') in the rule
indirection_el, which is also used in pubobj_expr. So, I declared a new
rule which directly uses indirection_el to resolve the conflicts.

Attached is the without-flag version, with comments added about pubobj_name.

Best regards,
Hou zj


refactor-without-flag_patch
Description: refactor-without-flag_patch


Re: parallelizing the archiver

2021-09-09 Thread Julien Rouhaud
On Fri, Sep 10, 2021 at 1:28 PM Andrey Borodin  wrote:
>
> It's OK if an external tool is responsible for concurrency. Do we want this 
> complexity in core? Many users do not enable archiving at all.
> Maybe just add a parallelism API for external tools?
> It's much easier to control concurrency in an external tool than in PostgreSQL 
> core. Maintaining a parallel worker is tremendously harder than spawning a 
> goroutine, thread, task or whatever.

Yes, but it also means that it's up to every single archiving tool to
implement a somewhat hackish parallel version of an archive_command,
hoping that core won't break it.  If this problem is solved in
postgres core without API changes, then all existing tools will
automatically benefit from it (maybe not the ones that already had
hacks to make it parallel, but it seems easier to disable such a hack
than to implement one).

> The external tool needs to know when an xlog segment is ready and needs to 
> report when it's done. Postgres should just ensure that the external 
> archiver/restorer is running.
> For example, the external tool could read xlog names from stdin and report 
> finished files on stdout. I can prototype such a tool swiftly :)
> E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment 
> filenames to its stdin. And no more listing of archive_status and hacky 
> algorithms to predict the next WAL name and completion time!

Yes, but that requires fundamental design changes for the archive
commands, right?  So while I agree it could be a better approach
overall, it seems like a longer-term option.  As far as I understand,
what Nathan suggested seems more likely to be achieved in pg15 and
could benefit a larger set of backup solutions.  This can give us
enough time to properly design a new archiving approach.




Re: parallelizing the archiver

2021-09-09 Thread Andrey Borodin



> On 10 Sep 2021, at 10:52, Julien Rouhaud wrote:
> 
> On Fri, Sep 10, 2021 at 1:28 PM Andrey Borodin  wrote:
>> 
>> It's OK if an external tool is responsible for concurrency. Do we want this 
>> complexity in core? Many users do not enable archiving at all.
>> Maybe just add a parallelism API for external tools?
>> It's much easier to control concurrency in an external tool than in PostgreSQL 
>> core. Maintaining a parallel worker is tremendously harder than spawning a 
>> goroutine, thread, task or whatever.
> 
> Yes, but it also means that it's up to every single archiving tool to
> implement a somewhat hackish parallel version of an archive_command,
> hoping that core won't break it.
I'm not proposing to remove the existing archive_command. Just deprecate its 
one-WAL-per-call form.

>  If this problem is solved in
> postgres core without API changes, then all existing tools will
> automatically benefit from it (maybe not the ones that already had
> hacks to make it parallel, but it seems easier to disable such a hack
> than to implement one).
True, hacky tools can already coordinate a swarm of their processes and are 
prepared to be called multiple times concurrently :)

>> The external tool needs to know when an xlog segment is ready and needs to 
>> report when it's done. Postgres should just ensure that the external 
>> archiver/restorer is running.
>> For example, the external tool could read xlog names from stdin and report 
>> finished files on stdout. I can prototype such a tool swiftly :)
>> E.g. postgres runs ```wal-g wal-archiver``` and pushes ready segment 
>> filenames to its stdin. And no more listing of archive_status and hacky 
>> algorithms to predict the next WAL name and completion time!
> 
> Yes, but that requires fundamental design changes for the archive
> commands, right?  So while I agree it could be a better approach
> overall, it seems like a longer-term option.  As far as I understand,
> what Nathan suggested seems more likely to be achieved in pg15 and
> could benefit a larger set of backup solutions.  This can give us
> enough time to properly design a new archiving approach.

It's a very simplistic approach. If some GUC is set, the archiver will just 
feed ready files to the stdin of the archive command. What fundamental design 
changes do we need?

Best regards, Andrey Borodin.



Re: parallelizing the archiver

2021-09-09 Thread Julien Rouhaud
On Fri, Sep 10, 2021 at 2:03 PM Andrey Borodin  wrote:
>
> > On 10 Sep 2021, at 10:52, Julien Rouhaud wrote:
> >
> > Yes, but it also means that it's up to every single archiving tool to
> > implement a somewhat hackish parallel version of an archive_command,
> > hoping that core won't break it.
> I'm not proposing to remove the existing archive_command. Just deprecate its 
> one-WAL-per-call form.

Which is a big API break.

> It's a very simplistic approach. If some GUC is set, the archiver will just 
> feed ready files to the stdin of the archive command. What fundamental design 
> changes do we need?

I'm talking about the commands themselves.  Your suggestion is to
change archive_command to be able to spawn a daemon, and it looks like
a totally different approach.  I'm not saying that having a daemon
based approach to take care of archiving is a bad idea, I'm saying
that trying to fit that with the current archive_command + some new
GUC looks like a bad idea.




Re: parallelizing the archiver

2021-09-09 Thread Andrey Borodin



> On 10 Sep 2021, at 11:11, Julien Rouhaud wrote:
> 
> On Fri, Sep 10, 2021 at 2:03 PM Andrey Borodin  wrote:
>> 
>>> On 10 Sep 2021, at 10:52, Julien Rouhaud wrote:
>>> 
>>> Yes, but it also means that it's up to every single archiving tool to
>>> implement a somewhat hackish parallel version of an archive_command,
>>> hoping that core won't break it.
>> I'm not proposing to remove the existing archive_command. Just deprecate its 
>> one-WAL-per-call form.
> 
> Which is a big API break.
Huge extension, not a break.

>> It's a very simplistic approach. If some GUC is set, the archiver will just 
>> feed ready files to the stdin of the archive command. What fundamental 
>> design changes do we need?
> 
> I'm talking about the commands themselves.  Your suggestion is to
> change archive_command to be able to spawn a daemon, and it looks like
> a totally different approach.  I'm not saying that having a daemon
> based approach to take care of archiving is a bad idea, I'm saying
> that trying to fit that with the current archive_command + some new
> GUC looks like a bad idea.
It fits nicely, even in corner cases. E.g. restore_command run from pg_rewind 
seems compatible with this approach.
One more example: after a failover the DBA can just run ```ls | wal-g wal-push``` 
to archive all the WALs left unarchived before the network partition.

This is a simple yet powerful approach, without any contradiction to the 
existing archive_command API.
Why is it a bad idea?

Best regards, Andrey Borodin.



Re: missing warning in pg_import_system_collations

2021-09-09 Thread Anton Voloshin

On 10/09/2021 01:37, Tom Lane wrote:

Another approach we could take is to deem the comment incorrect and
just remove it, codifying the current behavior of silently ignoring
unrecognized encodings.  The reason that seems like it might be
appropriate is that the logic immediately below this bit silently
ignores encodings that are known but are frontend-only:

 if (!PG_VALID_BE_ENCODING(enc))
 continue;/* ignore locales for client-only encodings */

It's sure not very clear to me why one case deserves a message and the
other not.  Perhaps they both do, which would lead to adding another
DEBUG1 message here.


I'm not an expert in locales, but I think it makes some sense to be 
silent about encodings we have consciously decided to ignore: we have 
them in our tables, but have marked them as frontend-only 
(!PG_VALID_BE_ENCODING(enc)).
Just as it makes sense to give a debug-level warning about 
encodings seen in the locale -a output but not recognized by us at all 
(pg_get_encoding_from_locale(localebuf, false) < 0).
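
(For what it's worth, the shape being discussed would be roughly the 
following inside pg_import_system_collations() -- illustrative only, not the 
committed wording:)

        enc = pg_get_encoding_from_locale(localebuf, false);
        if (enc < 0)
        {
            /* debug-level notice for completely unrecognized encodings */
            elog(DEBUG1, "skipping locale with unrecognized encoding: \"%s\"",
                 localebuf);
            continue;
        }
        if (!PG_VALID_BE_ENCODING(enc))
            continue;           /* ignore locales for client-only encodings */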


Therefore I think your patch with the duplicated error message is better 
than what we have currently. I don't see how adding debug-level messages 
about skipping frontend-only encodings would be of any significant use here.


Unless someone more experienced in locales' subtleties would like to 
chime in.


--
Anton Voloshin
Postgres Professional, The Russian Postgres Company
https://postgrespro.ru




Re: Add jsonlog log_destination for JSON server logs

2021-09-09 Thread Michael Paquier
On Fri, Sep 10, 2021 at 01:07:00PM +0900, Michael Paquier wrote:
> Forking a bit this thread while looking at 0002 that adds new tests
> for csvlog.  While I agree that it would be useful to have more
> coverage with the syslogger message chunk protocol in this area, I
> think that having a separate test is a waste of resources.  Creating a
> new node is not cheap either, and this adds more wait phases, making
> the tests take longer.  It would be much better to extend
> 004_logrotate.pl and update it to use log_destination = 'stderr,
> csvlog', to minimize the number of nodes we create as well as the
> additional amount of time we'd spend for those tests.  Plugging in
> JSON into that would not be complicated either once we have in place a
> set of small routines that limit the code duplication between the
> checks for each log destination type.

And this part leads me to the attached, where the addition of the JSON
format would result in the addition of a couple of lines.
--
Michael
diff --git a/src/bin/pg_ctl/t/004_logrotate.pl b/src/bin/pg_ctl/t/004_logrotate.pl
index fa14b98c7d..aa0d64a4f7 100644
--- a/src/bin/pg_ctl/t/004_logrotate.pl
+++ b/src/bin/pg_ctl/t/004_logrotate.pl
@@ -6,15 +6,64 @@ use warnings;
 
 use PostgresNode;
 use TestLib;
-use Test::More tests => 5;
+use Test::More tests => 10;
 use Time::HiRes qw(usleep);
 
+# Extract the file name of a $format from the contents of
+# current_logfiles.
+sub fetch_file_name
+{
+   my $logfiles = shift;
+   my $format   = shift;
+   my @lines= split(/\n/, $logfiles);
+   my $filename = undef;
+   foreach my $line (@lines)
+   {
+   if ($line =~ /$format (.*)$/gm)
+   {
+   $filename = $1;
+   }
+   }
+
+   return $filename;
+}
+
+# Check for a pattern in the logs associated to one format.
+sub check_log_pattern
+{
+   my $format   = shift;
+   my $logfiles = shift;
+   my $pattern  = shift;
+   my $node = shift;
+   my $lfname   = fetch_file_name($logfiles, $format);
+
+   my $max_attempts = 180 * 10;
+
+   my $logcontents;
+   for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
+   {
+   $logcontents = slurp_file($node->data_dir . '/' . $lfname);
+   last if $logcontents =~ m/$pattern/;
+   usleep(100_000);
+   }
+
+   like($logcontents, qr/$pattern/,
+   "found expected log file content for $format");
+
+   # While we're at it, test pg_current_logfile() function
+   is( $node->safe_psql('postgres', "SELECT pg_current_logfile('$format')"),
+   $lfname,
+   "pg_current_logfile() gives correct answer with $format");
+   return;
+}
+
 # Set up node with logging collector
 my $node = PostgresNode->new('primary');
 $node->init();
 $node->append_conf(
'postgresql.conf', qq(
 logging_collector = on
+log_destination = 'stderr, csvlog'
 # these ensure stability of test results:
 log_rotation_age = 0
 lc_messages = 'C'
@@ -44,26 +93,12 @@ note "current_logfiles = $current_logfiles";
 
 like(
$current_logfiles,
-   qr|^stderr log/postgresql-.*log$|,
+   qr|^stderr log/postgresql-.*log
+csvlog log/postgresql-.*csv$|,
'current_logfiles is sane');
 
-my $lfname = $current_logfiles;
-$lfname =~ s/^stderr //;
-chomp $lfname;
-
-my $first_logfile;
-for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
-{
-   $first_logfile = slurp_file($node->data_dir . '/' . $lfname);
-   last if $first_logfile =~ m/division by zero/;
-   usleep(100_000);
-}
-
-like($first_logfile, qr/division by zero/, 'found expected log file content');
-
-# While we're at it, test pg_current_logfile() function
-is($node->safe_psql('postgres', "SELECT pg_current_logfile('stderr')"),
-   $lfname, 'pg_current_logfile() gives correct answer');
+check_log_pattern('stderr', $current_logfiles, 'division by zero', $node);
+check_log_pattern('csvlog', $current_logfiles, 'division by zero', $node);
 
 # Sleep 2 seconds and ask for log rotation; this should result in
 # output into a different log file name.
@@ -84,28 +119,14 @@ note "now current_logfiles = $new_current_logfiles";
 
 like(
$new_current_logfiles,
-   qr|^stderr log/postgresql-.*log$|,
+   qr|^stderr log/postgresql-.*log
+csvlog log/postgresql-.*csv$|,
'new current_logfiles is sane');
 
-$lfname = $new_current_logfiles;
-$lfname =~ s/^stderr //;
-chomp $lfname;
-
 # Verify that log output gets to this file, too
-
 $node->psql('postgres', 'fee fi fo fum');
 
-my $second_logfile;
-for (my $attempts = 0; $attempts < $max_attempts; $attempts++)
-{
-   $second_logfile = slurp_file($node->data_dir . '/' . $lfname);
-   last if $second_logfile =~ m/syntax error/;
-   usleep(100_000);
-}
-
-like(
-   $second_logfile,
-   qr/syntax error/,
-   'found expected log file content in new log file');
+check_log_pattern('