Re: Next Steps with Hash Indexes

2021-10-14 Thread Simon Riggs
On Wed, 13 Oct 2021 at 20:16, Peter Geoghegan  wrote:
>
> On Wed, Oct 13, 2021 at 3:44 AM Simon Riggs
>  wrote:
> > > IMO it'd be nice to show some numbers to support the claims that storing
> > > the extra hashes and/or 8B hashes is not worth it ...
> >
> > Using an 8-byte hash is possible, but only becomes effective when
> > 4-byte hash collisions get hard to manage. 8-byte hash also makes the
> > index 20% bigger, so it is not a good default.
>
> Are you sure? I know that nbtree index tuples for a single-column int8
> index are exactly the same size as those from a single column int4
> index, due to alignment overhead at the tuple level. So my guess is
> that hash index tuples (which use the same basic IndexTuple
> representation) work in the same way.

The hash index tuples are 20-bytes each. If that were rounded up to
8-byte alignment, then that would be 24 bytes.

Using pageinspect, the max(live_items) on any data page (bucket or
overflow) is 407 items, so they can't be 24 bytes long.


Other stats of interest would be that the current bucket design/page
splitting is very effective at maintaining distribution. On a hash
index for a table with 2 billion rows in it, with integer values from
1 to 2billion, there are 3670016 bucket pages and 524286 overflow
pages, distributed so that 87.5% of buckets have no overflow pages,
and 12.5% of buckets have only one overflow page; there are no buckets
with >1 overflow page. The most heavily populated overflow page has
209 items.

The CREATE INDEX time is fairly poor at present, but that can be
optimized easily enough, but I expect to do that after uniqueness is
added, since it would complicate the code to do that work in a
different order.

-- 
Simon Riggshttp://www.EnterpriseDB.com/




Re: GIN pending list cleanup during autoanalyze blocks cleanup by VACUUM

2021-10-14 Thread Masahiko Sawada
On Thu, Oct 14, 2021 at 7:58 AM Peter Geoghegan  wrote:
>
> On Sat, Oct 9, 2021 at 5:51 PM Peter Geoghegan  wrote:
> > While I'm no closer to a backpatchable fix than I was on Thursday, I
> > do have some more ideas about what to do on HEAD. I now lean towards
> > completely ripping analyze_only calls out, there -- the whole idea of
> > calling amvacuumcleanup() routines during autoanalyze (but not plain
> > ANALYZE) seems bolted on. It's not just the risk of similar problems
> > cropping up in the future -- it's that the whole approach seems
> > obsolete. We now generally expect autovacuum to run against
> > insert-only tables.
>
> Attached patch removes calls to each index's amvacuumcleanup() routine
> that take place during ANALYZE. These days we can just rely on
> autovacuum to run against insert-only tables (assuming the user didn't
> go out of their way to disable that behavior).

Looking at the original commit, as you mentioned, ISTM performing
pending list cleanup during (auto)analyze (and analyze_only) was
introduced to perform the pending list cleanup on insert-only tables.
Now that we have autovacuum_vacuum_insert_threshold, we don’t
necessarily need to rely on that.

On the other hand, I still see a little value in performing the
pending list cleanup during autoanalyze. For example, if the user
wants to clean up the pending list frequently in the background (so
that it's not triggered in the INSERT path), it might be better to do
that during autoanalyze than autovacuum. If the table has garbage,
autovacuum has to vacuum all indexes and the table, taking a very long
time. But autoanalyze can be done in a shorter time. If we trigger
autoanalyze frequently and perform pending list cleanup, the pending
list cleanup can also be done in a relatively short time, preventing
MVCC snapshots from being held for a long time.

Therefore, I personally think that it's better to eliminate
analyze_only code after introducing a way that allows us to perform
the pending list cleanup more frequently. I think that the idea of
Jaime Casanova's patch is a good solution.

>
> Having thought about it some more, I have arrived at the conclusion
> that we should backpatch this to Postgres 13, the first version that
> had insert-driven autovacuums (following commit b07642db). This
> approach is unorthodox, because it amounts to disabling a
> theoretically-working feature in the backbranches. Also, I'd be
> drawing the line at Postgres 13, due only to the quite accidental fact
> that that's the first major release that clearly doesn't need this
> mechanism. (As it happens Nikolay was on 12 anyway, so this won't work
> for him, but he already has a workaround IIUC.)
>
> I reached this conclusion because I can't think of a non-invasive fix,
> and I really don't want to go there. At the same time, this behavior
> is barely documented, and is potentially very harmful indeed. I'm sure
> that we should get rid of it on HEAD, but getting rid of it a couple
> of years earlier seems prudent.
>
> Does anybody have any opinion on this, either in favor or against my
> backpatch-to-13 proposal?

I'm not very positive about back-patching. The first reason is what I
described above; I still see little value in performing pending list
cleanup during autoanalyze. Another reason is that if the user relies
on autoanalyze to perform pending list cleanup, they have to enable
autovacuum_vacuum_insert_threshold instead during the minor upgrade.
Since it also means to trigger autovacuum in more cases I think it
will have a profound impact on the existing system.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()

2021-10-14 Thread Kyotaro Horiguchi
At Wed, 13 Oct 2021 19:52:52 +0900 (JST), Kyotaro Horiguchi 
 wrote in 
> Although needing a bit of care for the difference of invalid values
> for both though, BackendId can be easily replaced with pgprocno almost
> mechanically except sinvaladt. Therefore, we can confine the current
> backend ID within sinvaladt isolating from other part.  The ids
> dedicated for sinvaladt can be packed to small range and perfomance
> won't be damaged.

Since I said it "mechanically doable", did I that. FWIW the attached is that.  
All behavioral differences come from the difrence of the valid range nad the 
invaild value between old BackendId and pgprpcno.

- Only sinvaladt uses the packed old backendid internally.
  So procsignal can be sent to auxiliary processes.

- All other part uses pgprocno but it is named as backendid so as to
  reduce the diff size.

- vxid's backendid part start from 0 not 1, and invalid backendid
  becomes -1 to PG_INT32_MAX. Since it is mere a cosmetic issue, we
  can replace PG_INT32_MAX as -1 on printing.

- The name of an exported snapshot changes.  The first part start from
  0, not 1.

- Prepared transactions' vxid is changed so that it's backendid part
  has a valid value. Previously it was invalid id (-1).  I'm not sure
  it doesn't harm but I faced no trouble with make check-world.
  (MarkAsPreparingGuts)

- With only 0002, backendid starts from 99 (when max_connection is
  100) then decremented to 0, which is quite odd.  0001 reverses the
  order of freelist.

> In the future, if we can get rid of looping over the procState array,
> sinvaladt - the last user of the current backend ID - can move to
> pgprocno and we will say good-bye to the current backend ID.

The attached files are named as *.txt so that bots don't recognize
them as a patch.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From dd8187954eb72edc3fbe0a807ce531b438412030 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi 
Date: Wed, 13 Oct 2021 10:31:26 +0900
Subject: [PATCH 1/2] procfreelist in ascending order

---
 src/backend/storage/lmgr/proc.c | 62 +++--
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index b7d9da0aa9..78e05976a4 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -163,6 +163,9 @@ InitProcGlobal(void)
j;
boolfound;
uint32  TotalProcs = MaxBackends + NUM_AUXILIARY_PROCS + 
max_prepared_xacts;
+   PGPROC**freelist;
+   int switchpoint;
+   int arraykind;
 
/* Create the ProcGlobal shared structure */
ProcGlobal = (PROC_HDR *)
@@ -214,6 +217,8 @@ InitProcGlobal(void)
ProcGlobal->statusFlags = (uint8 *) ShmemAlloc(TotalProcs * 
sizeof(*ProcGlobal->statusFlags));
MemSet(ProcGlobal->statusFlags, 0, TotalProcs * 
sizeof(*ProcGlobal->statusFlags));
 
+   switchpoint = 0;
+   arraykind = 0;
for (i = 0; i < TotalProcs; i++)
{
/* Common initialization for all PGPROCs, regardless of type. */
@@ -239,33 +244,38 @@ InitProcGlobal(void)
 * linear search.   PGPROCs for prepared transactions are added 
to a
 * free list by TwoPhaseShmemInit().
 */
-   if (i < MaxConnections)
+   if (i < MaxBackends)
{
-   /* PGPROC for normal backend, add to freeProcs list */
-   procs[i].links.next = (SHM_QUEUE *) 
ProcGlobal->freeProcs;
-   ProcGlobal->freeProcs = &procs[i];
-   procs[i].procgloballist = &ProcGlobal->freeProcs;
-   }
-   else if (i < MaxConnections + autovacuum_max_workers + 1)
-   {
-   /* PGPROC for AV launcher/worker, add to 
autovacFreeProcs list */
-   procs[i].links.next = (SHM_QUEUE *) 
ProcGlobal->autovacFreeProcs;
-   ProcGlobal->autovacFreeProcs = &procs[i];
-   procs[i].procgloballist = &ProcGlobal->autovacFreeProcs;
-   }
-   else if (i < MaxConnections + autovacuum_max_workers + 1 + 
max_worker_processes)
-   {
-   /* PGPROC for bgworker, add to bgworkerFreeProcs list */
-   procs[i].links.next = (SHM_QUEUE *) 
ProcGlobal->bgworkerFreeProcs;
-   ProcGlobal->bgworkerFreeProcs = &procs[i];
-   procs[i].procgloballist = 
&ProcGlobal->bgworkerFreeProcs;
-   }
-   else if (i < MaxBackends)
-   {
-   /* PGPROC for walsender, add to walsenderFreeProcs list 
*/
-   procs[i].links.next = (SHM_QUEUE *) 
ProcGlobal->walsenderFreeProcs;
-   ProcGlobal->walsenderFreeProcs = &procs[i];

Re: [Proposal] Global temporary tables

2021-10-14 Thread wenjing zeng


> 2021年10月13日 13:08,Andrew Bille  写道:
> 
> Thanks for the fix. It works for me.
> 
> Now I'm exploring another crash related to GTT, but I need a few days to 
> present a simple repro.

Be deeply grateful.
Perhaps you can give the stack of problems so that you can start analyzing them 
as soon as possible.


Wenjing

> 
> On Sat, Oct 9, 2021 at 2:41 PM wenjing  > wrote:
> 
> Thank you for pointing it out. 
> This is a bug that occurs during transaction rollback and process exit, I 
> fixed it, please confirm it.
> 
> Wenjing 



Re: Skipping logical replication transactions on subscriber side

2021-10-14 Thread Greg Nancarrow
On Tue, Oct 12, 2021 at 4:00 PM Masahiko Sawada  wrote:
>
> I've attached updated patches.
>

A couple more comments for some issues that I noticed in the v16 patches:

v16-0002

doc/src/sgml/ref/alter_subscription.sgml

(1) Order of parameters that can be reset doesn't match those that can be set.
Also, it doesn't match the order specified in the documentation
updates in the v16-0003 patch.

Suggested change:

BEFORE:
+   The parameters that can be reset are: streaming,
+   binary, synchronous_commit.
AFTER:
+   The parameters that can be reset are:
synchronous_commit,
+   binary, streaming.


v16-0003

doc/src/sgml/ref/alter_subscription.sgml

(1) Documentation update says "slot_name" is a parameter that can be
reset, but this is not correct, it can't be reset.
Also, the doc update is missing "the" before "parameter".

Suggested change:

BEFORE:
+  The parameters that can be reset are: slot_name,
+  synchronous_commit, binary,
+  streaming, and following parameter:
AFTER:
+  The parameters that can be reset are:
synchronous_commit,
+  binary, streaming, and
the following
+  parameter:


Regards,
Greg Nancarrow
Fujitsu Australia




Re: a misbehavior of partition row movement (?)

2021-10-14 Thread Amit Langote
On Mon, Sep 20, 2021 at 3:32 PM Amit Langote  wrote:
> The problem was that the tuplestore
> (afterTriggers.query_stack[query_level].tuplestore) that I decided to
> use to store the AFTER trigger tuples of a partitioned table that is
> the target of an cross-partition update lives only for the duration of
> a given query.  So that approach wouldn't work if the foreign key
> pointing into that partitioned table is marked INITIALLY DEFERRED.  To
> fix, I added a List field to AfterTriggersData that stores the
> tuplestores to store the tuples of partitioned tables that undergo
> cross-partition updates in a transaction and are pointed to by
> INITIALLY DEFERRED foreign key constraints.  I couldn't understand one
> comment about why using a tuplestore for such cases *might not* work,
> which as follows:
>
>  * Note that we need triggers on foreign tables to be fired in exactly the
>  * order they were queued, so that the tuples come out of the tuplestore in
>  * the right order.  To ensure that, we forbid deferrable (constraint)
>  * triggers on foreign tables.  This also ensures that such triggers do not
>  * get deferred into outer trigger query levels, meaning that it's okay to
>  * destroy the tuplestore at the end of the query level.
>
> I tried to break the approach using various test cases (some can be
> seen added by the patch to foreign_key.sql), but couldn't see the
> issue alluded to in the above comment.  So I've marked the comment
> with an XXX note as follows:
>
> - * Note that we need triggers on foreign tables to be fired in exactly the
> - * order they were queued, so that the tuples come out of the tuplestore in
> - * the right order.  To ensure that, we forbid deferrable (constraint)
> - * triggers on foreign tables.  This also ensures that such triggers do not
> - * get deferred into outer trigger query levels, meaning that it's okay to
> - * destroy the tuplestore at the end of the query level.
> + * Note that we need triggers on foreign and partitioned tables to be fired 
> in
> + * exactly the order they were queued, so that the tuples come out of the
> + * tuplestore in the right order.  To ensure that, we forbid deferrable
> + * (constraint) triggers on foreign tables.  This also ensures that such
> + * triggers do not get deferred into outer trigger query levels, meaning that
> + * it's okay to destroy the tuplestore at the end of the query level.
> + * XXX - update this paragraph if the new approach, whereby tuplestores in
> + * afterTriggers.deferred_tuplestores outlive any given query, can be proven
> + * to not really break any assumptions mentioned here.
>
> If anyone reading this can think of the issue the original comment
> seems to be talking about, please let me know.

I brought this up in an off-list discussion with Robert and he asked
why I hadn't considered not using tuplestores to remember the tuples
in the first place, specifically pointing out that it may lead to
unnecessarily consuming a lot of memory when such updates move a bunch
of rows around.  Like, we could simply remember the tuples to be
passed to the trigger function using their CTIDs as is done for normal
(single-heap-relation) updates, though in this case also remember the
OIDs of the source and the destination partitions of a particular
cross-partition update.

I had considered that idea before but I think I had overestimated the
complexity of that approach so didn't go with it.  I tried and the
resulting patch doesn't look all that complicated, with the added
bonus that the bulk update case shouldn't consume so much memory with
this approach like the previous tuplestore version would have.

Updated patches attached.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


v10-0001-Create-foreign-key-triggers-in-partitioned-table.patch
Description: Binary data


v10-0002-Enforce-foreign-key-correctly-during-cross-parti.patch
Description: Binary data


Re: Reset snapshot export state on the transaction abort

2021-10-14 Thread Dilip Kumar
On Thu, Oct 14, 2021 at 12:24 PM Michael Paquier  wrote:
>
> On Wed, Oct 13, 2021 at 10:53:24AM +0530, Dilip Kumar wrote:
> > Actually, it is not required because 1) Snapshot export can not be
> > allowed within a transaction block, basically, it starts its own
> > transaction block and aborts that while executing any next replication
> > command see SnapBuildClearExportedSnapshot().  So our problem is only
> > if the transaction block internally started for exporting, gets
> > aborted before any next command arrives.  So there is no possibility
> > of starting any sub transaction.
>
> Yes, you are right here.  I did not remember the semantics this relies
> on.  I have played more with the patch, reviewed the whole, and the
> fields you are resetting as part of the snapshot builds seem correct
> to me.  So let's fix this.

Great, thanks!

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Some doubts about streaming startpoint in WaitForWALToBecomeAvailable()

2021-10-14 Thread 蔡梦娟(玊于)

Hi, all
I have some doubts about the request xlog streaming startpoint in 
WaitForWALToBecomeAvailable(). In this function, RecPtr is the endpoint lsn we 
are waiting for, tliRecPtr is the position of the WAL record we are interested 
in, I want to know why use RecPtr rather than tliRecPtr as the startpoint when 
call RequestXLogStreaming, although the start position will be set as the 
beginning of the corresponding segment in RequestXLogStreaming.

Thanks & Best Regard

Fwd: [Bug] Logical Replication failing if the DateStyle is different in Publisher & Subscriber

2021-10-14 Thread Sadhuprasad Patro
Hi All,

Publisher 'DateStyle' is set as "SQL, MDY", whereas in Subscriber as
"SQL, DMY", the logical replication is not working...

>From Publisher:
postgres=# INSERT INTO calendar VALUES ('07-18-1036', '1'), ('05-15-1135', '1');
INSERT 0 2

Getting below error in the subscriber log file,
2021-10-14 00:59:23.067 PDT [38262] ERROR:  date/time field value out
of range: "07/18/1036"
2021-10-14 00:59:23.067 PDT [38262] HINT:  Perhaps you need a
different "datestyle" setting.

Is this an expected behavior?

Thanks & Regards
SadhuPrasad
http://www.EnterpriseDB.com/




Re: Added schema level support for publication.

2021-10-14 Thread Amit Kapila
On Wed, Oct 13, 2021 at 1:40 PM Greg Nancarrow  wrote:
>
> On Wed, Oct 13, 2021 at 12:15 AM vignesh C  wrote:
> >
> > Attached v40 patch has the fix for the above comments.
> >
>
> [Maybe this has some overlap with what Hou-san reported, and I have
> not tested this against his proposed fixes]
>
> If partitions belong to a different schema than the parent partitioned
> table, then the current patch implementation allows the partitions to
> (optionally) be explicitly added to a publication that includes the
> parent partitioned table (and for the most part, it doesn't seem to
> make any difference to the publication behavior). Should this be
> allowed?
>
> e.g.
>
> CREATE SCHEMA sch;
> CREATE SCHEMA sch1;
> CREATE TABLE sch.sale (sale_date date not null, country_code text,
> product_sku text, units integer) PARTITION BY RANGE (sale_date);
> CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-01-01') TO ('2019-02-01');
> CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> ('2019-02-01') TO ('2019-03-01');
>
> postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
> sch1.sale_201901, TABLE sch1.sale_201902;
> CREATE PUBLICATION
> postgres=# \dRp+
>  Publication pub
>  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> ---++-+-+-+---+--
>  gregn | f  | t   | t   | t   | t | f
> Tables:
> "sch1.sale_201901"
> "sch1.sale_201902"
> Tables from schemas:
> "sch"
>

I don't see any problem with this. Do you have a specific problem in
mind due to this?

-- 
With Regards,
Amit Kapila.




Re: Inconsistent behavior of pg_dump/pg_restore on DEFAULT PRIVILEGES

2021-10-14 Thread Tom Lane
Masahiko Sawada  writes:
>> I agree with your analysis on this bug. For non-default
>> (defaclnamespace != 0) entries, their acl should be compared to NULL.
>> 
>> The fix also looks good to me. But I think it'd be better to add tests for 
>> this.

> Since the patch conflicts with the current HEAD, I've rebased and
> slightly updated the patch, adding the regression tests.

Hmmm ... if we're adding a comment about the defaclnamespace check,
seems like it would also be a nice idea to explain the S-to-s
transformation, because the reason for that is surely not apparent.

regards, tom lane




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Aleksander Alekseev
Hi Gilles,

> Any though and interest in this feature?

Personally, I wouldn't call this feature particularly useful. `SELECT
*` is intended for people who are working with DBMS directly e.g. via
psql and want to see ALL columns. The applications should never use
`SELECT *`. So I can't see any real benefits of adding this feature to
PostgreSQL. It will only make the existing code and the existing user
interface even more complicated than they are now.

Also, every yet another feature is x N corner cases when this feature
works with other N features of PostgreSQL. How should it work with
partitioned or inherited tables? Or with logical replication? With
pg_dump? With COPY?

So all in all, -1. This being said, I very much appreciate your
attempt to improve PostgreSQL. However next time before writing the
code I suggest submitting an RFC first.

-- 
Best regards,
Aleksander Alekseev




Re: [Bug] Logical Replication failing if the DateStyle is different in Publisher & Subscriber

2021-10-14 Thread Dilip Kumar
On Thu, Oct 14, 2021 at 3:48 PM Sadhuprasad Patro  wrote:
>
> Hi All,
>
> Publisher 'DateStyle' is set as "SQL, MDY", whereas in Subscriber as
> "SQL, DMY", the logical replication is not working...
>
> From Publisher:
> postgres=# INSERT INTO calendar VALUES ('07-18-1036', '1'), ('05-15-1135', 
> '1');
> INSERT 0 2
>
> Getting below error in the subscriber log file,
> 2021-10-14 00:59:23.067 PDT [38262] ERROR:  date/time field value out
> of range: "07/18/1036"
> 2021-10-14 00:59:23.067 PDT [38262] HINT:  Perhaps you need a
> different "datestyle" setting.
>
> Is this an expected behavior?

Looks like a problem to me, I think for fixing this, on logical
replication connection always set subscriber's DateStlyle, with that
the walsender will always send the data in the same DateStyle that
worker understands and then we are good.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Make query ID more portable

2021-10-14 Thread Andrey Lepikhov

On 14/10/21 10:40, Julien Rouhaud wrote:

On Thu, Oct 14, 2021 at 12:37 PM Andrey Lepikhov
 wrote:


On 12/10/21 18:45, Bruce Momjian wrote:

On Tue, Oct 12, 2021 at 09:40:47AM -0400, Tom Lane wrote:

Andrey Lepikhov  writes:

I think that there are just too many arbitrary decisions that could be
made on what exactly should be a query identifier to have a single
in-core implementation.
Yes, and I use such custom decision too. But core jumbling code 
implements good idea and can be generalized for reuse. Patch from 
previous letter and breaking down of JumbleState can allow coders to 
implement their codes based on queryjumble.c module with smaller changes.



 If you do sharding, you already have to
properly configure each node, so configuring your custom query id
extension shouldn't be a big problem.
My project is about adaptive query optimization techniques. It is not 
obvious how to match (without a field in Query struct) a post parse and 
an execution phases because of nested queries.
Also, if we use queryId in an extension, we interfere with 
pg_stat_statements.


--
regards,
Andrey Lepikhov
Postgres Professional




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Aleksander Alekseev
Hi again,

> So all in all, -1. [...]

Here is something I would like to add:

1. As far as I know, "all the rest of DBMS have this" was never a good
argument in the PostgreSQL community. Generally, using it will turn people
against you.
2. I recall there was a proposal of making the SQL syntax itself
extendable. To my knowledge, this is still a wanted feature [1]. In theory,
that would allow you to implement the feature you want in an extension.

[1]: https://wiki.postgresql.org/wiki/Todo#Exotic_Features

-- 
Best regards,
Aleksander Alekseev


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Vik Fearing
On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> Hi Gilles,
> 
>> Any though and interest in this feature?
> 
> Personally, I wouldn't call this feature particularly useful. `SELECT
> *` is intended for people who are working with DBMS directly e.g. via
> psql and want to see ALL columns.

I disagree strongly with this.  It is really annoying when working
interactively with psql on a table that has a PostGIS geometry column,
or any other large blobby type column.

I have not looked at the patch, but +1 for the feature.
-- 
Vik Fearing




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Aleksander Alekseev
Hi Vik,

> I have not looked at the patch, but +1 for the feature.

Maybe you could describe your use case in a little more detail? How
did you end up working with PostGIS geometry via psql on regular
basis? What exactly do you find of annoyance? How will the proposed
patch help?

I find it great that we have people with polar opinions in the
discussion. But to reach any consensus you should make the opponent
understand your situation. Also, please don't simply discard the
disadvantages stated above. If you don't believe these are significant
disadvantages, please explain why do you think so.

-- 
Best regards,
Aleksander Alekseev




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Isaac Morland
On Thu, 14 Oct 2021 at 07:17, Gilles Darold  wrote:


> The user defined columns are always visible in the PostgreSQL. If user
> wants to hide some column(s) from a SELECT * returned values then the
> hidden columns feature is useful. Hidden column can always be used and
> returned by explicitly referring it in the query.
>

It seems to me we've gone in the reverse direction recently. It used to be
that the oid columns of the system tables were hidden (hardcoded, as far as
I know), but as of Postgres 12 I believe there are no more hidden columns:
SELECT * from a table always gives all the columns.

I think a "select all columns except …" would be more useful; or another
approach would be to use a display tool that defaults to displaying only a
portion of large fields.


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Pavel Stehule
čt 14. 10. 2021 v 14:13 odesílatel Vik Fearing 
napsal:

> On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> > Hi Gilles,
> >
> >> Any though and interest in this feature?
> >
> > Personally, I wouldn't call this feature particularly useful. `SELECT
> > *` is intended for people who are working with DBMS directly e.g. via
> > psql and want to see ALL columns.
>
> I disagree strongly with this.  It is really annoying when working
> interactively with psql on a table that has a PostGIS geometry column,
> or any other large blobby type column.
>
> I have not looked at the patch, but +1 for the feature.
>

Cannot be better to redefine some strategies for output for some types.

I can agree so sometimes in some environments proposed features can be
nice, but it can be a strong footgun too.

Maybe some strange data can be filtered in psql and it can be better
solution. I agree, so usually print long geometry in psql is useless.

Regards

Pavel



-- 
> Vik Fearing
>
>
>


Re: prevent immature WAL streaming

2021-10-14 Thread Amul Sul
On Wed, Oct 13, 2021 at 10:58 PM Alvaro Herrera  wrote:
>
> On 2021-Oct-13, Amul Sul wrote:
>
> > I have one more question, regarding the need for other global
> > variables i.e. abortedRecPtr.  (Sorry for coming back after so long.)
> >
> > Instead of abortedRecPtr point, isn't enough to write
> > overwrite-contrecord at XLogCtl->lastReplayedEndRecPtr?  I think both
> > are pointing to the same location then can't we use
> > lastReplayedEndRecPtr instead of abortedRecPtr to write
> > overwrite-contrecord and remove need of extra global variable, like
> > attached?
>
> I'm a bit fuzzy on the difference "the end+1" and "the start of the next
> record".  Are they always the same?  We do have XLogRecPtrToBytePos()
> and XLogBytePosToEndRecPtr() to convert unadorned XLogRecPtr values to
> "usable byte positions", which suggests to me that the proposed patch
> may fail if end+1 is a page or segment boundary.
>

Yes, you are correct, that could be a possible failure.

How about calculating that from the lastReplayedEndRecPtr by
converting it first to "usable byte positions" and then recalculating
the record pointer from that, like attached?

> The other difference is that abortedRecPtr is set if we fail to read a
> record, but XLogCtl->lastReplayedEndRecPtr is set even if we read the
> record successfully.  So you'd have need a bool flag that the overwrite
> contrecord record needs to be written.  Your patch is using the fact
> that missingContrecPtr is non-invalid as such a flag ... I can't see
> anything wrong with that.  So maybe your patch is okay in this aspect.
>
> > You might wonder why I am so concerned about the global variable. The
> > reason is that I am working on another thread[1] where we are trying
> > to club all the WAL write operations that happen at the end of
> > StartupXLOG into a separate function. In the future, we might want to
> > allow executing this function from other processes (e.g.
> > Checkpointer). For that, we need to remove the dependency of those WAL
> > write operations having on the global variables which are mostly valid
> > in the startup process.
>
> Seems a fine goal.

Thanks for looking at the patch.

Regards,
Amul


v2_remove-abortedRecPtr-global-variable.patch
Description: Binary data


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 13:47, Aleksander Alekseev a écrit :

Hi Gilles,


Any though and interest in this feature?

Personally, I wouldn't call this feature particularly useful. `SELECT
*` is intended for people who are working with DBMS directly e.g. via
psql and want to see ALL columns. The applications should never use
`SELECT *`. So I can't see any real benefits of adding this feature to
PostgreSQL. It will only make the existing code and the existing user
interface even more complicated than they are now.



Thanks for your comments Aleksander. This was also my thougth at 
begining but unfortunately there is cases where things are not so simple 
and just relying on SELECT * is dirty or forbidden.  The hidden column 
are not only useful for SELECT * but also for INSERT without column 
list, but INSERT without column list is also a bad practice.




Also, every yet another feature is x N corner cases when this feature
works with other N features of PostgreSQL. How should it work with
partitioned or inherited tables? Or with logical replication? With
pg_dump? With COPY?



I recommand you to have look to my patch because the partitioned and 
inherited case are covered, you can have a . For logical replication I 
guess that any change in pg_attribute is also replicated so I I would 
said that it is fully supported. But obviously I may miss something. 
pg_dump and COPY are also supported.



Actually the patch only prevent an hidden column to be part of a star 
expansion for the returned column, I don't think there is corner case 
with the other part of the code outside that we need to prevent a table 
to have all columns hidden. But I could miss something, I agree.




So all in all, -1. This being said, I very much appreciate your
attempt to improve PostgreSQL. However next time before writing the
code I suggest submitting an RFC first.



Don't worry about my time spent for the PG community, this patch is a 
dust in my contribution to open source :-) If I have provided the patch 
to show the concept and how it can be easily implemented.  Also it can 
be used in some PostgreSQL forks if one is interested by this feature.



--

Gilles Darold





Re: pg14 psql broke \d datname.nspname.relname

2021-10-14 Thread Robert Haas
On Wed, Oct 13, 2021 at 4:43 PM Mark Dilger
 wrote:
> The function where the processing occurs is processSQLNamePattern, which is 
> called by pg_dump, pg_dumpall, and psql.  All three callers expect 
> processSQLNamePattern to append where-clauses to a buffer, not to execute any 
> sql of its own.  I propose that processSQLNamePattern return an error code if 
> the pattern contains more than three parts, but otherwise insert the database 
> portion into the buffer as a "pg_catalog.current_database() 
> OPERATOR(pg_catalog.=) ", where  is a properly escaped 
> representation of the database portion.  Maybe someday we can change that to 
> OPERATOR(pg_catalog.~), but for now we lack the sufficient logic for handling 
> multiple matching database names.  (The situation is different for 
> pg_dumpall, as it's using the normal logic for matching a relation name, not 
> for matching a database, and we'd still be fine matching that against a 
> pattern.)

I agree with matching using OPERATOR(pg_catalog.=) but I think it
should be an error, not a silently-return-nothing case.

> In pg_dumpall, --exclude-database=more.than.one.part would give an error 
> about too many dotted parts rather than simply trying to exclude the last 
> "part" and silently ignoring the prefix, which I think is what v13's 
> pg_dumpall would do.  --exclude-database=db?? would work to exclude four 
> character database names beginning in "db".

Those things sound good.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Robert Haas
On Wed, Oct 13, 2021 at 7:45 PM Jeff Davis  wrote:
> users to be relying on that undocumented function. Is there a good way
> to define a view kind of like a SECURITY DEFINER function so that the
> superuser would only need to issue a GRANT statement on the view?

According to https://www.postgresql.org/docs/current/sql-createview.html
it always works like that: "Access to tables referenced in the view is
determined by permissions of the view owner. In some cases, this can
be used to provide secure but restricted access to the underlying
tables."

Hmm, unless that rule is only being applied for *tables* and not for
*functions*? I guess that could be true, but if so, it sure seems
inconsistent.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [RFC] building postgres with meson

2021-10-14 Thread Dagfinn Ilmari Mannsåker
Josef Šimánek  writes:

> The only problem I do have currently is auto-detection of perl. I'm
> getting error related to missing "Opcode.pm". PERL is autodetected and
> enabled (https://pastebin.com/xfRRrDcU).

Your Perl (not PERL) installation seems to be incomplete. Opcode.pm is a
core module, and should be in /usr/lib64/perl5, judging by the paths in
the error message.

Which OS is this?  Some Linux distributions have separate packages for
the interpreter itself and the included modules, and the packages can be
named confusingly.  E.g. on older Redhat/Fedora versions you have to
install the 'perl-core' package to get all the modules, 'perl' is just
the interpreter and the bare minimum set of strictily necessary modules.

They've fixed this in recent versions (Fedora 34 and Redhat 8, IIRC), so
that 'perl' gives you the hole bundle, and 'perl-interpeter' is the
minimal one.


- ilmari




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 14:09, Aleksander Alekseev a écrit :

Hi again,

> So all in all, -1. [...]

Here is something I would like to add:

1. As far as I know, "all the rest of DBMS have this" was never a good 
argument in the PostgreSQL community. Generally, using it will turn 
people against you.



I have cited the implementation in the other RDBMS because it helps to 
understand the feature, it shows the state of the art on it and 
illustrates my needs. If making references to other implementation turns 
people against me I think that they have the wrong approach on this 
proposal and if we refuse feature because they are implemented in other 
RDBMS this is even worst. I'm not agree with this comment.



2. I recall there was a proposal of making the SQL syntax itself 
extendable. To my knowledge, this is still a wanted feature [1]. In 
theory, that would allow you to implement the feature you want in an 
extension.



For what I've read in this thread 
https://www.postgresql.org/message-id/flat/20210501072458.adqjoaqnmhg4l34l%40nol 
there is no real consensus in how implementing this feature should be 
done. But I agree that if the implementation through an extension was 
possible I would not write a patch to core but an extension, this is my 
common behavior.



Best regards,

--
Gilles Darold
http://www.darold.net/





Re: [RFC] building postgres with meson

2021-10-14 Thread Josef Šimánek
čt 14. 10. 2021 v 15:14 odesílatel Dagfinn Ilmari Mannsåker
 napsal:
>
> Josef Šimánek  writes:
>
> > The only problem I do have currently is auto-detection of perl. I'm
> > getting error related to missing "Opcode.pm". PERL is autodetected and
> > enabled (https://pastebin.com/xfRRrDcU).
>
> Your Perl (not PERL) installation seems to be incomplete. Opcode.pm is a
> core module, and should be in /usr/lib64/perl5, judging by the paths in
> the error message.
>
> Which OS is this?  Some Linux distributions have separate packages for
> the interpreter itself and the included modules, and the packages can be
> named confusingly.  E.g. on older Redhat/Fedora versions you have to
> install the 'perl-core' package to get all the modules, 'perl' is just
> the interpreter and the bare minimum set of strictily necessary modules.
>
> They've fixed this in recent versions (Fedora 34 and Redhat 8, IIRC), so
> that 'perl' gives you the hole bundle, and 'perl-interpeter' is the
> minimal one.

I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
package. Anyway it behaves differently with autoconf tools and the
meson build system. Is perl disabled by default in the current build
system?

>
> - ilmari




Re: [RFC] building postgres with meson

2021-10-14 Thread Alvaro Herrera
On 2021-Oct-14, Josef Šimánek wrote:

> I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
> package. Anyway it behaves differently with autoconf tools and the
> meson build system. Is perl disabled by default in the current build
> system?

Yes, you have to use --with-perl in order to get it.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Puedes vivir sólo una vez, pero si lo haces bien, una vez es suficiente"




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 14:28, Pavel Stehule a écrit :



čt 14. 10. 2021 v 14:13 odesílatel Vik Fearing 
mailto:v...@postgresfriends.org>> napsal:


On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
> Hi Gilles,
>
>> Any though and interest in this feature?
>
> Personally, I wouldn't call this feature particularly useful.
`SELECT
> *` is intended for people who are working with DBMS directly
e.g. via
> psql and want to see ALL columns.

I disagree strongly with this.  It is really annoying when working
interactively with psql on a table that has a PostGIS geometry column,
or any other large blobby type column.

I have not looked at the patch, but +1 for the feature.


Cannot be better to redefine some strategies for output for some types.

I can agree so sometimes in some environments proposed features can be 
nice, but it can be a strong footgun too.


Maybe some strange data can be filtered in psql and it can be better 
solution. I agree, so usually print long geometry in psql is useless.



Pavel this doesn't concern only output but input too, think about the 
INSERT or COPY without a column list. We can add such filter in psql but 
how about other clients? They all have to implement their own filtering 
method. I think the HIDDEN attribute provide a common and basic way to 
implement that in all client application.



--
Gilles Darold
http://www.darold.net/



Re: [RFC] building postgres with meson

2021-10-14 Thread Dagfinn Ilmari Mannsåker
Josef Šimánek  writes:

> čt 14. 10. 2021 v 15:14 odesílatel Dagfinn Ilmari Mannsåker
>  napsal:
>>
>> Josef Šimánek  writes:
>>
>> > The only problem I do have currently is auto-detection of perl. I'm
>> > getting error related to missing "Opcode.pm". PERL is autodetected and
>> > enabled (https://pastebin.com/xfRRrDcU).
>>
>> Your Perl (not PERL) installation seems to be incomplete. Opcode.pm is a
>> core module, and should be in /usr/lib64/perl5, judging by the paths in
>> the error message.
>>
>> Which OS is this?  Some Linux distributions have separate packages for
>> the interpreter itself and the included modules, and the packages can be
>> named confusingly.  E.g. on older Redhat/Fedora versions you have to
>> install the 'perl-core' package to get all the modules, 'perl' is just
>> the interpreter and the bare minimum set of strictily necessary modules.
>>
>> They've fixed this in recent versions (Fedora 34 and Redhat 8, IIRC), so
>> that 'perl' gives you the hole bundle, and 'perl-interpeter' is the
>> minimal one.
>
> I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
> package.`

Yes, it's a separate package, but the 'perl' package depends on all the
core module packages, so installing that should fix things.  You appear
to only have 'perl-interpreter' installed.

> Anyway it behaves differently with autoconf tools and the meson build
> system. Is perl disabled by default in the current build system?

configure doesn't auto-detect any optional features, they have to be
explicitly enabled using --with-foo switches.

- ilmari




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Dave Page
On Thu, Oct 14, 2021 at 2:32 PM Gilles Darold  wrote:

> Le 14/10/2021 à 14:28, Pavel Stehule a écrit :
>
>
>
> čt 14. 10. 2021 v 14:13 odesílatel Vik Fearing 
> napsal:
>
>> On 10/14/21 1:47 PM, Aleksander Alekseev wrote:
>> > Hi Gilles,
>> >
>> >> Any though and interest in this feature?
>> >
>> > Personally, I wouldn't call this feature particularly useful. `SELECT
>> > *` is intended for people who are working with DBMS directly e.g. via
>> > psql and want to see ALL columns.
>>
>> I disagree strongly with this.  It is really annoying when working
>> interactively with psql on a table that has a PostGIS geometry column,
>> or any other large blobby type column.
>>
>> I have not looked at the patch, but +1 for the feature.
>>
>
> Cannot be better to redefine some strategies for output for some types.
>
> I can agree so sometimes in some environments proposed features can be
> nice, but it can be a strong footgun too.
>
> Maybe some strange data can be filtered in psql and it can be better
> solution. I agree, so usually print long geometry in psql is useless.
>
>
> Pavel this doesn't concern only output but input too, think about the
> INSERT or COPY without a column list. We can add such filter in psql but
> how about other clients? They all have to implement their own filtering
> method. I think the HIDDEN attribute provide a common and basic way to
> implement that in all client application.
>

I like the idea - being able to hide computed columns such as tsvectors
from CRUD queries by default seems like it would be very nice for example.

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: Next Steps with Hash Indexes

2021-10-14 Thread Peter Geoghegan
On Thu, Oct 14, 2021 at 12:48 AM Simon Riggs
 wrote:
> The hash index tuples are 20-bytes each. If that were rounded up to
> 8-byte alignment, then that would be 24 bytes.
>
> Using pageinspect, the max(live_items) on any data page (bucket or
> overflow) is 407 items, so they can't be 24 bytes long.

That's the same as an nbtree page, which confirms my suspicion. The 20
bytes consists of a 16 byte tuple, plus a 4 byte line pointer. The
tuple-level alignment overhead gets you from 12 bytes to 16 bytes with
a single int4 column. So the padding is there for the taking.

-- 
Peter Geoghegan




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Jaime Casanova
On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote:
> Hi,
> 
> 
> Here is a proposal to implement HIDDEN columns feature in PostgreSQL.
> 

Great! Actually I found this very useful, especially for those people
using big fields (geometry, files, large texts).

> The user defined columns are always visible in the PostgreSQL. If user
> wants to hide some column(s) from a SELECT * returned values then the
> hidden columns feature is useful. Hidden column can always be used and
> returned by explicitly referring it in the query.
> 
> I agree that views are done for that or that using a SELECT * is a bad
> practice

An a common one, even if we want to think otherwise. I have found that
in almost every customer I have the bad luck to get to see code or
SELECTs.

Not counting that sometimes we have columns for optimization like Dave
saved about hidden a ts_vector column.

Another use case I can think of is not covered in this patch, but it
could be (I hope!) or even if not I would like opinions on this idea. 
What about a boolean GUC log_hidden_column that throws a LOG message when 
a hidden column is used directly?

The intention is to mark a to-be-deleted column as HIDDEN and then check
the logs to understand if is still being used somewhere. I know systems
where they carry the baggage of deprecated columns only because they
don't know if some system is still using them.

I know this would be extending your original proposal, and understand if
you decide is not a first patch material. 

Anyway, a +1 to your proposal. 

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 17:38, Jaime Casanova a écrit :

On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote:

Hi,


Here is a proposal to implement HIDDEN columns feature in PostgreSQL.


Great! Actually I found this very useful, especially for those people
using big fields (geometry, files, large texts).


The user defined columns are always visible in the PostgreSQL. If user
wants to hide some column(s) from a SELECT * returned values then the
hidden columns feature is useful. Hidden column can always be used and
returned by explicitly referring it in the query.

I agree that views are done for that or that using a SELECT * is a bad
practice

An a common one, even if we want to think otherwise. I have found that
in almost every customer I have the bad luck to get to see code or
SELECTs.

Not counting that sometimes we have columns for optimization like Dave
saved about hidden a ts_vector column.

Another use case I can think of is not covered in this patch, but it
could be (I hope!) or even if not I would like opinions on this idea.
What about a boolean GUC log_hidden_column that throws a LOG message when
a hidden column is used directly?

The intention is to mark a to-be-deleted column as HIDDEN and then check
the logs to understand if is still being used somewhere. I know systems
where they carry the baggage of deprecated columns only because they
don't know if some system is still using them.

I know this would be extending your original proposal, and understand if
you decide is not a first patch material.



Why not, I will add it if there is a consencus about logging hidden 
column use, this is not a big work.



--
Gilles Darold





Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Isaac Morland
On Thu, 14 Oct 2021 at 09:11, Robert Haas  wrote:

>
> According to https://www.postgresql.org/docs/current/sql-createview.html
> it always works like that: "Access to tables referenced in the view is
> determined by permissions of the view owner. In some cases, this can
> be used to provide secure but restricted access to the underlying
> tables."
>
> Hmm, unless that rule is only being applied for *tables* and not for
> *functions*? I guess that could be true, but if so, it sure seems
> inconsistent.
>

Yes, I think this has come up before. It seems obvious to me that a view
should execute entirely in the context of its owner. I should be able to
use functions to define view columns without requiring that access to those
functions be handed out to users of the view.

I feel this might relate to the discussion of triggers, which I claim
should execute in the context of the table owner (or maybe the trigger
owner, if that were a separate concept). There are lots of triggers one
might want to write that cannot be written because they execute in the
context of the user of the table; my recollection is that it is harder to
find examples of non-malware triggers that depend on executing in the
context of the user of the table.


Re: refactoring basebackup.c

2021-10-14 Thread Jeevan Ladhe
Thanks, Robert for reviewing the patch.


On Tue, Oct 12, 2021 at 11:09 PM Robert Haas  wrote:

This is the only place where CHUNK_SIZE gets used, and I don't think I
> see any point to it. I think the 5th argument to LZ4F_compressUpdate
> could just be avail_in. And as soon as you do that then I think
> bbsink_lz4_archive_contents() no longer needs to be a loop.
>

Agree. Removed the CHUNK_SIZE and the loop.


>
> + /* First of all write the frame header to destination buffer. */
> + headerSize = LZ4F_compressBegin(mysink->ctx,
> + mysink->base.bbs_next->bbs_buffer,
> + mysink->base.bbs_next->bbs_buffer_length,
> + &mysink->prefs);
>
> + compressedSize = LZ4F_compressEnd(mysink->ctx,
> + mysink->base.bbs_next->bbs_buffer + mysink->bytes_written,
> + mysink->base.bbs_next->bbs_buffer_length - mysink->bytes_written,
> + NULL);
>
> I think there's some issue with these two chunks of code. What happens
> if one of these functions wants to write more data than will fit in
> the output buffer? It seems like either there needs to be some code
> someplace that ensures adequate space in the output buffer at the time
> of these calls, or else there needs to be a retry loop that writes as
> much of the data as possible, flushes the output buffer, and then
> loops to generate more output data. But there's clearly no retry loop
> here, and I don't see any code that guarantees that the output buffer
> has to be large enough (and in the case of LZ4F_compressEnd, have
> enough remaining space) either. In other words, all the same concerns
> that apply to LZ4F_compressUpdate() also apply here ... but in
> LZ4F_compressUpdate() you seem to BOTH have a retry loop and ALSO code
> to make sure that the buffer is certain to be large enough (which is
> more than you need, you only need one of those) and here you seem to
> have NEITHER of those things (which is not enough, you need one or the
> other).
>

Fair enough. I have made the change in the bbsink_lz4_begin_backup() to
make sure we reserve enough extra bytes for the header and the footer those
are written by LZ4F_compressBegin() and LZ4F_compressEnd() respectively.
The LZ4F_compressBound() when passed the input size as "0", would give
the upper bound for output buffer needed by the LZ4F_compressEnd().

How about instead using memset() to zero the whole thing and then
> omitting the zero initializations? That seems like it would be less
> fragile, if the upstream structure definition ever changes.
>

Made this change.

Please review the patch, and let me know your comments.

Regards,
Jeevan Ladhe


lz4_compress_v5.patch
Description: Binary data


Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 10:29:42 -0300, Alvaro Herrera wrote:
> On 2021-Oct-14, Josef Šimánek wrote:
>
> > I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
> > package. Anyway it behaves differently with autoconf tools and the
> > meson build system. Is perl disabled by default in the current build
> > system?

Hm, so it seems we should make the test separately verify that perl -M{Opcode,
ExtUtils::Embed, ExtUtils::ParseXS} doesn't fail, so that we can fail perl
detection with a useful message?


> Yes, you have to use --with-perl in order to get it.

With the meson prototype I set most optional features to "auto", except for
LLVM, as that increases compile times noticeably.

For configure we didn't/don't want to do much auto-detection, because that
makes life harder for distributors. But meson has one switch controlling all
features set to 'auto' and not explicitly enabled/disabled:
  --auto-features {enabled,disabled,auto}   
Override value of all 'auto' features (default: auto).
so the argument doesn't apply to the same degree there. We could default
auto-features to something else too.

There were two other reasons:

1) I got tired of needing to disable zlib, readline to be able to build on
   windows.
2) Exercising all the dependency detection / checking seems important at this
   stage

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-13 23:58:12 +0200, Josef Šimánek wrote:
> st 13. 10. 2021 v 1:54 odesílatel Andres Freund  napsal:
> > This *very* likely is related to building in a source tree that also 
> > contains
> > a "non-meson" build "in place". The problem is that the meson build picks up
> > the pg_config.h generated by ./configure in the "normal" build, rather than
> > the one meson generated itself.
> >
> > You'd need to execute make distclean or such, or use a separate git 
> > checkout.
> >
> > I forgot about this issue because I only ever build postgres from outside 
> > the
> > source-tree (by invoking configure from a separate directory), so there's
> > never build products in it. I think at least I need to make the build emit a
> > warning / error if there's a pg_config.h in the source tree...
> 
> Hello, thanks for the hint. I can finally build using meson and run
> regress tests.

I yesterday pushed code that should detect this case (with an error). Should
now detect the situation both when you first run configure in tree, and then
meson, and the other way round (by the dirty hack of ./configure touch'ing
meson.build at the end for in-tree builds).


> The only problem I do have currently is auto-detection of perl. I'm
> getting error related to missing "Opcode.pm". PERL is autodetected and
> enabled (https://pastebin.com/xfRRrDcU).
> 
> I do get the same error when I enforce perl for current master build
> (./configure --with-perl). Using ./configure with perl autodetection
> skips plperl extension on my system.
> 
> Disabling perl manually for meson build (meson setup build
> --reconfigure --buildtype debug -Dperl=disabled) works for me.

Yay, thanks for testing!

Greetings,

Andres Freund




Re: [Patch] ALTER SYSTEM READ ONLY

2021-10-14 Thread Robert Haas
On Tue, Oct 12, 2021 at 8:18 AM Amul Sul  wrote:
> In the attached version I have fixed this issue by restoring 
> missingContrecPtr.
>
> To handle abortedRecPtr and missingContrecPtr newly added global
> variables thought the commit # ff9f111bce24, we don't need to store
> them in the shared memory separately, instead, we need a flag that
> indicates a broken record found previously, at the end of recovery, so
> that we can overwrite contrecord.
>
> The missingContrecPtr is assigned to the EndOfLog, and we have handled
> EndOfLog previously in the 0004 patch, and the abortedRecPtr is the
> same as the lastReplayedEndRecPtr, AFAICS.  I have added an assert to
> ensure that the lastReplayedEndRecPtr value is the same as the
> abortedRecPtr, but I think that is not needed, we can go ahead and
> write an overwrite-contrecord starting at lastReplayedEndRecPtr.

I thought that it made sense to commit 0001 and 0002 at this point, so
I have done that. I think that the treatment of missingContrecPtr and
abortedRecPtr may need more thought yet, so at least for that reason I
don't think it's a good idea to proceed with 0004 yet. 0003 is just
code movement so I guess that can be committed whenever we're
confident that we know exactly which things we want to end up inside
XLogAcceptWrites().

I do have a few ideas after studying this a bit more:

- I wonder whether, in addition to moving a few things later as 0002
did, we also ought to think about moving one thing earlier,
specifically XLogReportParameters(). Right now, we have, I believe,
four things that write WAL at the end of recovery:
CreateOverwriteContrecordRecord(), UpdateFullPageWrites(),
PerformRecoveryXLogAction(), and XLogReportParameters(). As the code
is structured now, we do the first three of those things, and then do
a bunch of other stuff inside CleanupAfterArchiveRecovery() like
running recovery_end_command, and removing non-parent xlog files, and
archiving the partial segment, and then come back and do the fourth
one. Is there any good reason for that? If not, I think doing them all
together would be cleaner, and would propose to reverse the order of
CleanupAfterArchiveRecovery() and XLogReportParameters().

- If we did that, then I would further propose to adjust things so
that we remove the call to LocalSetXLogInsertAllowed() and the
assignment LocalXLogInsertAllowed = -1 from inside
CreateEndOfRecoveryRecord(), the LocalXLogInsertAllowed = -1 from just
after UpdateFullPageWrites(), and the call to
LocalSetXLogInsertAllowed() just before XLogReportParameters().
Instead, just let the call to LocalSetXLogInsertAllowed() right before
CreateOverwriteContrecordRecord() remain in effect. There doesn't seem
to be much point in flipping that switch off and on again, and the
fact that we have been doing so is in my view just evidence that
StartupXLOG() doesn't do a very good job of getting related code all
into one place.

- It seems really tempting to invent a fourth RecoveryState value that
indicates that we are done with REDO but not yet in production, and
maybe also to rename RecoveryState to something like WALState. I'm
thinking of something like WAL_STATE_CRASH_RECOVERY,
WAL_STATE_ARCHIVE_RECOVERY, WAL_STATE_REDO_COMPLETE, and
WAL_STATE_PRODUCTION. Then, instead of having
LocalSetXLogInsertAllowed(), we could teach XLogInsertAllowed() that
the startup process and the checkpointer are allowed to insert WAL
when the state is WAL_STATE_REDO_COMPLETE, but other processes only
once we reach WAL_STATE_PRODUCTION. We would set
WAL_STATE_REDO_COMPLETE where we now call LocalSetXLogInsertAllowed().
It's necessary to include the checkpointer, or at least I think it is,
because PerformRecoveryXLogAction() might call RequestCheckpoint(),
and that's got to work. If we did this, then I think it would also
solve another problem which the overall patch set has to address
somehow. Say that we eventually move responsibility for the
to-be-created XLogAcceptWrites() function from the startup process to
the checkpointer, as proposed. The checkpointer needs to know when to
call it ... and the answer with this change is simple: when we reach
WAL_STATE_REDO_COMPLETE, it's time!

But this idea is not completely problem-free. I spent some time poking
at it and I think it's a little hard to come up with a satisfying way
to code XLogInsertAllowed(). Right now that function calls
RecoveryInProgress(), and if RecoveryInProgress() decides that
recovery is no longer in progress, it calls InitXLOGAccess(). However,
that presumes that the only reason you'd call RecoveryInProgress() is
to figure out whether you should write WAL, which I don't think is
really true, and it also means that, when the wal state is
WAL_STATE_REDO_COMPLETE, RecoveryInProgress() would need to return
true in the checkpointer and startup process and false everywhere
else, which does not sound like a great idea. It seems fine to say
that xlog insertion is allowed in some processes but not others,
because not al

Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Stephen Frost
Greetings,

* Isaac Morland (isaac.morl...@gmail.com) wrote:
> On Thu, 14 Oct 2021 at 09:11, Robert Haas  wrote:
> > According to https://www.postgresql.org/docs/current/sql-createview.html
> > it always works like that: "Access to tables referenced in the view is
> > determined by permissions of the view owner. In some cases, this can
> > be used to provide secure but restricted access to the underlying
> > tables."
> >
> > Hmm, unless that rule is only being applied for *tables* and not for
> > *functions*? I guess that could be true, but if so, it sure seems
> > inconsistent.

I'm not sure that it's really inconsistent- if you want the function to
run as someone else, define it as SECURITY DEFINER and it will.  If the
function is defined as SECURITY INVOKER then it'll run with the
privileges of the user invoking the function- which can be pretty handy
if, say, the function references CURRENT_USER.  Note that RLS policies
work in the same way.

> Yes, I think this has come up before. It seems obvious to me that a view
> should execute entirely in the context of its owner. I should be able to
> use functions to define view columns without requiring that access to those
> functions be handed out to users of the view.

I don't know that it's all that obvious, particularly when you consider
that the function owner has the option of having the function run as the
invoker of the function or as the owner of the function.

> I feel this might relate to the discussion of triggers, which I claim
> should execute in the context of the table owner (or maybe the trigger
> owner, if that were a separate concept). There are lots of triggers one
> might want to write that cannot be written because they execute in the
> context of the user of the table; my recollection is that it is harder to
> find examples of non-malware triggers that depend on executing in the
> context of the user of the table.

Triggers can call security definer functions, so I'm not quite sure I
understand what the issue here is.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Tom Lane
Gilles Darold  writes:
> Le 14/10/2021 à 17:38, Jaime Casanova a écrit :
>> On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote:
>>> Here is a proposal to implement HIDDEN columns feature in PostgreSQL.

>> Another use case I can think of is not covered in this patch, but it
>> could be (I hope!) or even if not I would like opinions on this idea.
>> What about a boolean GUC log_hidden_column that throws a LOG message when
>> a hidden column is used directly?

> Why not, I will add it if there is a consencus about logging hidden 
> column use, this is not a big work.

This seems like a completely orthogonal idea.  If you are trying
to figure out whether you have any applications that depend on
column X (without breaking anything), you should absolutely not
start by marking the column "hidden", because that'll break the
case where the apps are expecting "SELECT *" to return the column.
But if you're okay with breaking things, you might as well just
drop the column, or else revoke SELECT privilege on it, and see
what happens.

I'm not sure about the utility of logging explicit references to a
specific column --- seems like grepping the results of "log_statement"
would serve.  But in any case I think it is not a good idea to tie
it to this proposal.

As for the proposal itself, I'm kind of allergic to the terminology
you've suggested, because the column is in no way hidden.  It's
still visible in the catalogs, you can still select it explicitly,
etc.  Anybody who thinks this is useful from a security standpoint
is mistaken, but these words suggest that it is.  Perhaps some
terminology like "not expanded" or "unexpanded" would serve better
to indicate that "SELECT *" doesn't expand to include the column.
Or STAR versus NO STAR, maybe.

I also do not care for the syntax you propose: AFAICS the only reason
you've gotten away with making HIDDEN not fully reserved is that you
require it to be the last attribute of a column, which is something
that will trip users up all the time.  Plus, it does not scale to the
next thing we might want to add.  So if you can't make it a regular,
position-independent element of the ColQualList you shouldn't do it
at all.

What I think is actually important is the ALTER COLUMN syntax.
We could easily get away with having that be the only syntax for
this --- compare the precedent of ALTER COLUMN SET STATISTICS.

BTW, you do NOT get to add an information_schema column for
this.  The information_schema is defined in the SQL standard.
Yes, I'm aware that mysql feels free to "extend" the standard
in that area; but our policy is that the only point of having the
information_schema views at all is if they're standard-compliant.

regards, tom lane




Re: Inconsistency in startup process's MyBackendId and procsignal array registration with ProcSignalInit()

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 17:28:34 +0900, Kyotaro Horiguchi wrote:
> At Wed, 13 Oct 2021 19:52:52 +0900 (JST), Kyotaro Horiguchi 
>  wrote in
> > Although needing a bit of care for the difference of invalid values
> > for both though, BackendId can be easily replaced with pgprocno almost
> > mechanically except sinvaladt. Therefore, we can confine the current
> > backend ID within sinvaladt isolating from other part.  The ids
> > dedicated for sinvaladt can be packed to small range and perfomance
> > won't be damaged.

FWIW, I don't actually think there's necessarily that strong a need for
density in sinvaladt. With a few relatively changes we can get rid of the O(n)
work in the most crucial paths.

In 
https://www.postgresql.org/message-id/20210802171255.k4yv5cfqaqbuuy6f%40alap3.anarazel.de
I wrote:
> Another approach to deal with this could be to simply not do the O(n) work in
> SIInsertDataEntries(). It's not obvious that ->hasMessages is actually
> necessary - we could atomically read maxMsgNum without acquiring a lock
> instead of needing the per-backend ->hasMessages.  I don't the density would
> be a relevant factor in SICleanupQueue().

This'd get rid of the need of density *and* make SIInsertDataEntries()
cheaper.

Greetings,

Andres Freund




Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Stephen Frost
Greetings,

* Jeff Davis (pg...@j-davis.com) wrote:
> On Wed, 2021-10-13 at 10:03 -0400, Robert Haas wrote:
> > Yeah. I think we should really only use predefined roles where it's
> > not practical to have people use GRANT/REVOKE.
> 
> That sounds like a good rule.
> 
> A minor complaint though: to grant on pg_backend_memory_contexts, you
> need two grant statements:
> 
>grant select on pg_backend_memory_contexts to foo;
>grant execute on function pg_get_backend_memory_contexts() to foo;
> 
> The second is more of an internal detail, and we don't really want
> users to be relying on that undocumented function. Is there a good way
> to define a view kind of like a SECURITY DEFINER function so that the
> superuser would only need to issue a GRANT statement on the view?

Erm, surely the function should be documented...

Other than that, grouping of privileges is generally done using roles.
We could possibly create a predefined role to assist with this but I
don't think it's a huge issue for users to do that themselves.,
particularly since they're likely to grant other accesses to that role
too.  In some instances, it might make sense to grant such access to
other predefined roles too (pg_monitor or the other ones), of course.

I don't think we really want to be doing privilege checks with one role
(view owner) for who is allowed to run the function, and then actually
running the function with some other role when it's a security invoker
function..

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Josef Šimánek
čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold  napsal:
>
> Hi,
>
>
> Here is a proposal to implement HIDDEN columns feature in PostgreSQL.
>
> The user defined columns are always visible in the PostgreSQL. If user
> wants to hide some column(s) from a SELECT * returned values then the
> hidden columns feature is useful. Hidden column can always be used and
> returned by explicitly referring it in the query.
>
> I agree that views are done for that or that using a SELECT * is a bad
> practice
> but sometime we could need to "technically" prevent some columns to be part
> of a star expansion and nbot be forced to use view+rules.

Just to remind here, there was recently a proposal to handle this
problem another way - provide a list of columns to skip for "star
selection" aka "SELECT * EXCEPT col1...".

https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4da...@gmail.com

> For example when
> upgrading a database schema where a column have been added to a table,
> this will break any old version of the application that is using a
> SELECT * on
> this table. Being able to "hide" this column to such query will make
> migration
> easier.
>
> An other common use case for this feature is to implements temporal tables
> or row versionning. On my side I see a direct interest in Oracle to
> PostgreSQL
> migration to emulate the ROWID system column without the hassle of creating
> views, it will save lot of time.
>
> The other advantage over views is that the hidden column can still be used
> in JOIN, WHERE, ORDER BY or GROUP BY clause which is not possible otherwise.
> I don't talk about writing to complex view which would require a RULE.
>
> Hidden column is not part of the SQL standard but is implemented in all
> other
> RDBMS which is also called invisible columns [1] [2] [3] [4]. In all
> these RDBMS
> the feature is quite the same.
>
>[1] https://www.ibm.com/docs/en/db2/10.5?topic=concepts-hidden-columns
>[2] https://oracle-base.com/articles/12c/invisible-columns-12cr1
>[3]
> https://docs.microsoft.com/en-us/sql/t-sql/statements/create-table-transact-sql?view=sql-server-ver15
>[4] https://dev.mysql.com/doc/refman/8.0/en/invisible-columns.html
>
>
> Here is the full description of the proposal with a patch attached that
> implements
> the feature:
>
>1) Creating hidden columns:
>
>   A column visibility attribute is added to the column definition
>   of CREATE TABLE and ALTER TABLE statements. For example:
>
>   CREATE TABLE htest1 (a bigserial HIDDEN, b text);
>
>   ALTER TABLE htest1 ADD COLUMN c integer HIDDEN;
>
>   Columns are visible by default.
>
>2) Altering column visibility attribute:
>
>   The ALTER TABLE statement can be used to change hidden columns to not
>   hidden and the opposite. Example:
>
>   ALTER TABLE htest1 ALTER COLUMN c DROP HIDDEN;
>
>3) Insert and hidden columns:
>
>   If the column list of INSERT or COPY statements is empty
>   then while expanding column list hidden columns are NOT
>   included. DEFAULT or NULL values are inserted for hidden
>   columns in this case. Hidden column should be explicitly
>   referenced in the column list of INSERT and COPY statement
>   to insert a value.
>
>   Example:
>
> -- Value 'one' is stored in column b and 1 in hidden column.
> INSERT INTO t1 VALUES ('one');
>
> -- Value 2 is stored in hidden column and 'two' in b.
> INSERT INTO htest1 (a, b) VALUES (2, 'two');
>
>4) Star expansion for SELECT * statements:
>
>   Hidden columns are not included in a column list while
>   expanding wild card '*' in the SELECT statement.
>
>   Example:
>
>   SELECT * FROM htest1;
> b
>   --
>one
>two
>
>Hidden columns are accessible when explicitly referenced
>in the query.
>
>Example:
>   SELECT f1, f2 FROM t1;
>  a  |  b
>   --+--
> 1   | one
> 2   | two
>
>5) psql extended describe lists hidden columns.
>
>postgres=# \d+ htest1
>Table "public.htest1"
> Column |  Type  | Collation | Nullable |  Default   | Visible | ...
> ++---+--++-+ ...
> a  | bigint |   | not null | nextval... | hidden  | ...
> b  | text   |   |  | | | ...
>
>6) When a column is flagged as hidden the attishidden column value of
>   table pg_attribute is set to true.
>
>7) For hidden attributes, column is_hidden of table
> information_schema.columns
>   is set to YES. By default the column is visible and the value is 'NO'.
>
> For a complete description of the feature, see chapter "Hidden columns" in
> file doc/src/sgml/ddl.sgml after applying the patch.
>
>
> The patch is a full implementation of this feture except tha

Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Isaac Morland
On Thu, 14 Oct 2021 at 13:43, Stephen Frost  wrote:

> I feel this might relate to the discussion of triggers, which I claim
> > should execute in the context of the table owner (or maybe the trigger
> > owner, if that were a separate concept). There are lots of triggers one
> > might want to write that cannot be written because they execute in the
> > context of the user of the table; my recollection is that it is harder to
> > find examples of non-malware triggers that depend on executing in the
> > context of the user of the table.
>
> Triggers can call security definer functions, so I'm not quite sure I
> understand what the issue here is.
>

Even something as simple as a "log all table updates" cannot be implemented
as far as I can tell.

So you have table T and T_log. Trigger on T causes all INSERT/UPDATE/DELETE
actions to be logged to T_log. The only changes to T_log should be inserts
resulting from the trigger. But now in order to make changes to T the user
also needs INSERT on T_log. OK, so use a security definer function. That
doesn't help; now instead of needing INSERT on T_log they need EXECUTE on
the function. Either way, two privilege grants are required, and one of
them allows the user to make spurious entries in T_log.

But the desired behaviour is that the user has access *only* to T, and no
access whatsoever to T_log other than indirect changes by causing the
trigger to execute.


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread David G. Johnston
On Thursday, October 14, 2021, Tom Lane  wrote:

> Gilles Darold  writes:
> > Le 14/10/2021 à 17:38, Jaime Casanova a écrit :
> >> On Thu, Oct 14, 2021 at 01:16:45PM +0200, Gilles Darold wrote:
>
> > Why not, I will add it if there is a consencus about logging hidden
> > column use, this is not a big work.
>
> This seems like a completely orthogonal idea.


>
+1


> As for the proposal itself, I'm kind of allergic to the terminology
> you've suggested, because the column is in no way hidden.  It's
> still visible in the catalogs, you can still select it explicitly,
> etc.  Anybody who thinks this is useful from a security standpoint
> is mistaken, but these words suggest that it is.  Perhaps some
> terminology like "not expanded" or "unexpanded" would serve better
> to indicate that "SELECT *" doesn't expand to include the column.
> Or STAR versus NO STAR, maybe.


Taking this a bit further, I dislike tying the suppression of the column
from the select-list star to the behavior of insert without a column list
provided.  I’m not fully on board with having an attribute that is not
fundamental to the data model but rather an instruction about how that
column interacts with SQL; separating the two aspects, though, would help.
I accept the desire to avoid star expansion much more than default columns
for insert.  Especially since the most compelling example of the later, not
having to specify generated columns on insert, would directly conflict with
the fact that it is those generated columns that are most likely to be
useful to display when specifying a star in the select query.



> What I think is actually important is the ALTER COLUMN syntax.
> We could easily get away with having that be the only syntax for
> this --- compare the precedent of ALTER COLUMN SET STATISTICS.


+1


>
> BTW, you do NOT get to add an information_schema column for
> this.


FWIW, +1, though the project policy reminder does stand on its own.

David J.


Re: refactoring basebackup.c

2021-10-14 Thread Robert Haas
On Thu, Oct 14, 2021 at 1:21 PM Jeevan Ladhe
 wrote:
> Agree. Removed the CHUNK_SIZE and the loop.

Try harder. :-)

The loop is gone, but CHUNK_SIZE itself seems to have evaded the executioner.

> Fair enough. I have made the change in the bbsink_lz4_begin_backup() to
> make sure we reserve enough extra bytes for the header and the footer those
> are written by LZ4F_compressBegin() and LZ4F_compressEnd() respectively.
> The LZ4F_compressBound() when passed the input size as "0", would give
> the upper bound for output buffer needed by the LZ4F_compressEnd().

I think this is not the best way to accomplish the goal. Adding
LZ4F_compressBound(0) to next_buf_len makes the buffer substantially
bigger for something that's only going to happen once. We are assuming
in any case, I think, that LZ4F_compressBound(0) <=
LZ4F_compressBound(mysink->base.bbs_buffer_length), so all you need to
do is have bbsink_end_archive() empty the buffer, if necessary, before
calling LZ4F_compressEnd(). With just that change, you can set
next_buf_len = LZ4F_HEADER_SIZE_MAX + mysink->output_buffer_bound --
but that's also more than you need. You can instead do next_buf_len =
Min(LZ4F_HEADER_SIZE_MAX, mysink->output_buffer_bound). Now, you're
probably thinking that won't work, because bbsink_lz4_begin_archive()
could fill up the buffer partway, and then the first call to
bbsink_lz4_archive_contents() could overrun it. But that problem can
be solved by reversing the order of operations in
bbsink_lz4_archive_contents(): before you call LZ4F_compressUpdate(),
test whether you need to empty the buffer first, and if so, do it.

That's actually less confusing than the way you've got it, because as
you have it written, we don't really know why we're emptying the
buffer -- is it to prepare for the next call to LZ4F_compressUpdate(),
or is it to prepare for the call to LZ4F_compressEnd()? How do we know
now how much space the next person writing into the buffer is going to
need? It seems better if bbsink_lz4_archive_contents() empties the
buffer before calling LZ4F_compressUpdate() if that call might not
have enough space, and likewise bbsink_lz4_end_archive() empties the
buffer before calling LZ4F_compressEnd() if that's needed. That way,
each callback makes the space *it* needs, not the space the *next*
caller needs. (bbsink_lz4_end_archive() still needs to ALSO empty the
buffer after LZ4F_compressEnd(), so we don't orphan any data.)

On another note, if the call to LZ4F_freeCompressionContext() is
required in bbsink_lz4_end_archive(), then I think this code is going
to just leak the memory used by the compression context if an error
occurs before this code is reached. That kind of sucks. The way to fix
it, I suppose, is a TRY/CATCH block, but I don't think that can be
something internal to basebackup_lz4.c: I think the bbsink stuff would
need to provide some kind of infrastructure for basebackup_lz4.c to
use. It would be a lot better if we could instead get LZ4 to allocate
memory using palloc(), but a quick Google search suggests that you
can't accomplish that without recompiling liblz4, and that's not
workable since we don't want to require a liblz4 built specifically
for PostgreSQL. Do you see any other solution?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Jeff Davis
On Thu, 2021-10-14 at 13:43 -0400, Stephen Frost wrote:
> I'm not sure that it's really inconsistent- if you want the function
> to
> run as someone else, define it as SECURITY DEFINER and it will.

There are two issues:

1. Does having permissions to read a view give the reader the ability
to execute the function as a part of reading the view?

Here it seems like we should allow the user to execute the function
that's a part of the view. If it's doing something that performs
another permission check, then it could fail, but at least they'd be
able to execute it. That seems consistent with the ability to read
tables as a part of reading the view.

2. If the function is executed, is it SECURITY INVOKER or SECURITY
DEFINER?

I think here the answer is SECURITY INVOKER. SECURITY DEFINER doesn't
even really make sense, because the definer might not be the owner of
the view. Maybe we need a concept where the function is executed as
neither the invoker or the definer, but as the owner of the view (or
something else), which sounds appealing, but sounds more like a new
feature.

Regards,
Jeff Davis






Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Robert Haas
On Thu, Oct 14, 2021 at 1:43 PM Stephen Frost  wrote:
> I'm not sure that it's really inconsistent- if you want the function to
> run as someone else, define it as SECURITY DEFINER and it will.  If the
> function is defined as SECURITY INVOKER then it'll run with the
> privileges of the user invoking the function- which can be pretty handy
> if, say, the function references CURRENT_USER.

That presumes that (1) the user who owns the view also owns the
function and (2) the user who created the view and the function wants
to permit people who query the view to call the function with any
arguments, rather than only those arguments that would be passed by
querying the view. Neither of those things is necessarily true.

I am not really sure that we can get away with changing this, since it
is long-established behavior. At least, if we do, we are going to have
to warn people to watch out for backward-compatibility issues, some of
which may not be things breaking functionally but rather having a
different security profile. But, in a green field, I don't know why
it's sane to suppose that if you query a view, the things in the view
behave partly as if the user querying the view were running them, and
partly as if the user owning the view were one of them. It seems much
more logical for it to be one or the other.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Tom Lane
"David G. Johnston"  writes:
> Taking this a bit further, I dislike tying the suppression of the column
> from the select-list star to the behavior of insert without a column list
> provided.  I’m not fully on board with having an attribute that is not
> fundamental to the data model but rather an instruction about how that
> column interacts with SQL; separating the two aspects, though, would help.
> I accept the desire to avoid star expansion much more than default columns
> for insert.

Yeah, me too.  I think it would add a lot of clarity if we defined this
as "this affects the behavior of SELECT * and nothing else" ... although
even then, there are squishy questions about how much it affects the
behavior of composite datums that are using the column's rowtype.
But as soon as you want it to bleed into INSERT, you start having a
lot of questions about what else it should bleed into, as Aleksander
already mentioned.

regards, tom lane




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 19:44, Tom Lane a écrit :

As for the proposal itself, I'm kind of allergic to the terminology
you've suggested, because the column is in no way hidden.  It's
still visible in the catalogs, you can still select it explicitly,
etc.  Anybody who thinks this is useful from a security standpoint
is mistaken, but these words suggest that it is.  Perhaps some
terminology like "not expanded" or "unexpanded" would serve better
to indicate that "SELECT *" doesn't expand to include the column.
Or STAR versus NO STAR, maybe.



Agree, I also had this feeling. I decide to use HIDDEN like in DB2 just 
because UNEXPANDED looks to me difficult to understand by users and that 
hidden or Invisible column are well known. This is a kind of "vendor 
standard" now. But I agree that it can confuse uninformed people and 
doesn't reflect the real feature. I will rename the keyword as 
"UNEXPANDED", will do.




I also do not care for the syntax you propose: AFAICS the only reason
you've gotten away with making HIDDEN not fully reserved is that you
require it to be the last attribute of a column, which is something
that will trip users up all the time.  Plus, it does not scale to the
next thing we might want to add.  So if you can't make it a regular,
position-independent element of the ColQualList you shouldn't do it
at all.



Yes I have also noted that and wanted to improve this later if the 
proposal was accepted.




What I think is actually important is the ALTER COLUMN syntax.
We could easily get away with having that be the only syntax for
this --- compare the precedent of ALTER COLUMN SET STATISTICS.



Ok great, I'm fine with that, especially for the previous point :-) I 
will remove it from the CREATE TABLE syntax except in the INCLUDING like 
option.




BTW, you do NOT get to add an information_schema column for
this.  The information_schema is defined in the SQL standard.
Yes, I'm aware that mysql feels free to "extend" the standard
in that area; but our policy is that the only point of having the
information_schema views at all is if they're standard-compliant.


Ok, I will remove it.


--
Gilles Darold





Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Tom Lane
I wrote:
> Yeah, me too.  I think it would add a lot of clarity if we defined this
> as "this affects the behavior of SELECT * and nothing else" ... although
> even then, there are squishy questions about how much it affects the
> behavior of composite datums that are using the column's rowtype.

Re-reading that, I realize I probably left way too much unstated,
so let me spell it out.

Should this feature affect
SELECT * FROM my_table t;
?  Yes, absolutely.

How about
SELECT t.* FROM my_table t;
?  Yup, one would think so.

Now how about
SELECT row_to_json(t.*) FROM my_table t;
?  All of a sudden, I'm a lot less sure --- not least because we *can't*
simply omit some columns, without the composite datum suddenly not being
of the table's rowtype anymore, which could have unexpected effects on
query semantics.  In particular, if we have a user-defined function
that's defined to accept composite type my_table, I don't think we can
suppress columns in
SELECT myfunction(t.*) FROM my_table t;

And don't forget that these can also be spelled like
SELECT row_to_json(t) FROM my_table t;
without any star visible anywhere.

So the more I think about this, the squishier it gets.  I'm now sharing
the fears expressed upthread about whether it's even possible to define
this in a way that won't have a lot of gotchas.

regards, tom lane




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 20:43, Tom Lane a écrit :

Re-reading that, I realize I probably left way too much unstated,
so let me spell it out.

Should this feature affect
SELECT * FROM my_table t;
?  Yes, absolutely.

How about
SELECT t.* FROM my_table t;
?  Yup, one would think so.

Now how about
SELECT row_to_json(t.*) FROM my_table t;
?  All of a sudden, I'm a lot less sure --- not least because we *can't*
simply omit some columns, without the composite datum suddenly not being
of the table's rowtype anymore, which could have unexpected effects on
query semantics.  In particular, if we have a user-defined function
that's defined to accept composite type my_table, I don't think we can
suppress columns in
SELECT myfunction(t.*) FROM my_table t;

And don't forget that these can also be spelled like
SELECT row_to_json(t) FROM my_table t;
without any star visible anywhere.

So the more I think about this, the squishier it gets.  I'm now sharing
the fears expressed upthread about whether it's even possible to define
this in a way that won't have a lot of gotchas.

regards, tom lane



You mean this ? :-)


gilles=# CREATE TABLE htest0 (a int PRIMARY KEY, b text NOT NULL HIDDEN);
CREATE TABLE
gilles=# INSERT INTO htest0 (a, b) VALUES (1, 'htest0 one');
INSERT 0 1
gilles=# INSERT INTO htest0 (a, b) VALUES (2, 'htest0 two');
INSERT 0 1

gilles=# SELECT * FROM htest0 t;
 a
---
 1
 2
(2 rows)

gilles=# SELECT t.* FROM htest0 t;
 a
---
 1
 2
(2 rows)

gilles=# SELECT row_to_json(t.*) FROM htest0 t;
   row_to_json
--
 {"a":1,"b":"htest0 one"}
 {"a":2,"b":"htest0 two"}
(2 rows)

gilles=# SELECT row_to_json(t) FROM htest0 t;
   row_to_json
--
 {"a":1,"b":"htest0 one"}
 {"a":2,"b":"htest0 two"}
(2 rows)


You should have a look at the patch, I don't think that the way it is 
done there could have gotchas.


--
Gilles Darold





Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 20:26, Tom Lane a écrit :

"David G. Johnston"  writes:

Taking this a bit further, I dislike tying the suppression of the column
from the select-list star to the behavior of insert without a column list
provided.  I’m not fully on board with having an attribute that is not
fundamental to the data model but rather an instruction about how that
column interacts with SQL; separating the two aspects, though, would help.
I accept the desire to avoid star expansion much more than default columns
for insert.

Yeah, me too.  I think it would add a lot of clarity if we defined this
as "this affects the behavior of SELECT * and nothing else" ... although
even then, there are squishy questions about how much it affects the
behavior of composite datums that are using the column's rowtype.
But as soon as you want it to bleed into INSERT, you start having a
lot of questions about what else it should bleed into, as Aleksander
already mentioned.



I not agree, expansion in executed when there is no column list provided 
and this affect SELECT and INSERT. It cover the same needs: being able 
to remove a column for the target list when it is not explicitly set. 
This feature is known like this and I'm not in favor to tear off a leg.



--
Gilles Darold





Re: should we allow users with a predefined role to access pg_backend_memory_contexts view and pg_log_backend_memory_contexts function?

2021-10-14 Thread Jeff Davis
On Thu, 2021-10-14 at 14:22 -0400, Robert Haas wrote:
> I am not really sure that we can get away with changing this, since
> it
> is long-established behavior. At least, if we do, we are going to
> have
> to warn people to watch out for backward-compatibility issues, some
> of
> which may not be things breaking functionally but rather having a
> different security profile. But, in a green field, I don't know why
> it's sane to suppose that if you query a view, the things in the view
> behave partly as if the user querying the view were running them, and
> partly as if the user owning the view were one of them. It seems much
> more logical for it to be one or the other.

How do you feel about at least allowing the functions to execute (and
if it's SECURITY INVOKER, possibly encountering a permissions failure
during execution)?

There are of course security implications with any change like that,
but it seems like a fairly minor one unless I'm missing something. Why
would an admin give someone the privileges to read a view if it will
always fail due to lack of execute privilege?

Regards,
Jeff Davis






Re: [RFC] building postgres with meson

2021-10-14 Thread John Naylor
I wrote:

> Ok great, it builds now! :-) Now something's off with dynamic loading.
There are libraries in ./tmp_install/usr/local/lib/ but apparently initdb
doesn't know to look for them there:
>
> $ cat /Users/john/pgdev/meson/build/testrun/main/pg_regress/log/initdb.log
> dyld: Library not loaded: /usr/local/lib/libpq.5.dylib
>   Referenced from:
/Users/john/pgdev/meson/build/tmp_install/usr/local/bin/initdb
>   Reason: image not found

After poking a bit more, this only happens when trying to run the tests. If
I specify a prefix, I can install, init, and start the server just fine, so
that much works.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [RFC] building postgres with meson

2021-10-14 Thread Josef Šimánek
čt 14. 10. 2021 v 15:32 odesílatel Dagfinn Ilmari Mannsåker
 napsal:
>
> Josef Šimánek  writes:
>
> > čt 14. 10. 2021 v 15:14 odesílatel Dagfinn Ilmari Mannsåker
> >  napsal:
> >>
> >> Josef Šimánek  writes:
> >>
> >> > The only problem I do have currently is auto-detection of perl. I'm
> >> > getting error related to missing "Opcode.pm". PERL is autodetected and
> >> > enabled (https://pastebin.com/xfRRrDcU).
> >>
> >> Your Perl (not PERL) installation seems to be incomplete. Opcode.pm is a
> >> core module, and should be in /usr/lib64/perl5, judging by the paths in
> >> the error message.
> >>
> >> Which OS is this?  Some Linux distributions have separate packages for
> >> the interpreter itself and the included modules, and the packages can be
> >> named confusingly.  E.g. on older Redhat/Fedora versions you have to
> >> install the 'perl-core' package to get all the modules, 'perl' is just
> >> the interpreter and the bare minimum set of strictily necessary modules.
> >>
> >> They've fixed this in recent versions (Fedora 34 and Redhat 8, IIRC), so
> >> that 'perl' gives you the hole bundle, and 'perl-interpeter' is the
> >> minimal one.
> >
> > I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
> > package.`
>
> Yes, it's a separate package, but the 'perl' package depends on all the
> core module packages, so installing that should fix things.  You appear
> to only have 'perl-interpreter' installed.

You're right. Installing "perl" or "perl-Opcode" manually fixes this
problem. Currently I only have "perl-interpreter" installed.

> > Anyway it behaves differently with autoconf tools and the meson build
> > system. Is perl disabled by default in the current build system?
>
> configure doesn't auto-detect any optional features, they have to be
> explicitly enabled using --with-foo switches.
>
> - ilmari




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold

Le 14/10/2021 à 20:55, Gilles Darold a écrit :


gilles=# SELECT row_to_json(t.*) FROM htest0 t;
   row_to_json
--
 {"a":1,"b":"htest0 one"}
 {"a":2,"b":"htest0 two"}
(2 rows)

gilles=# SELECT row_to_json(t) FROM htest0 t;
   row_to_json
--
 {"a":1,"b":"htest0 one"}
 {"a":2,"b":"htest0 two"}
(2 rows)



Tom, I have probably not well understood what you said about do the 
cases above. Do you mean that the column should not be visible too? I 
have though not but maybe I'm wrong, I will fix that.



--
Gilles Darold





Re: [RFC] building postgres with meson

2021-10-14 Thread Josef Šimánek
čt 14. 10. 2021 v 19:24 odesílatel Andres Freund  napsal:
>
> Hi,
>
> On 2021-10-14 10:29:42 -0300, Alvaro Herrera wrote:
> > On 2021-Oct-14, Josef Šimánek wrote:
> >
> > > I'm using Fedora 34 and I still see perl-Opcode.x86_64 as a separate
> > > package. Anyway it behaves differently with autoconf tools and the
> > > meson build system. Is perl disabled by default in the current build
> > > system?
>
> Hm, so it seems we should make the test separately verify that perl -M{Opcode,
> ExtUtils::Embed, ExtUtils::ParseXS} doesn't fail, so that we can fail perl
> detection with a useful message?

I can confirm "perl -MOpcode" fails. ExtUtils::Embed and
ExtUtils::ParseXS are present. Looking at the local system history of
perl-interpreter package, it seems to be installed by default on
Fedora 34. Friendly error message would be welcomed.

>
> > Yes, you have to use --with-perl in order to get it.
>
> With the meson prototype I set most optional features to "auto", except for
> LLVM, as that increases compile times noticeably.
>
> For configure we didn't/don't want to do much auto-detection, because that
> makes life harder for distributors. But meson has one switch controlling all
> features set to 'auto' and not explicitly enabled/disabled:
>   --auto-features {enabled,disabled,auto}   
> Override value of all 'auto' features (default: auto).
> so the argument doesn't apply to the same degree there. We could default
> auto-features to something else too.
>
> There were two other reasons:
>
> 1) I got tired of needing to disable zlib, readline to be able to build on
>windows.
> 2) Exercising all the dependency detection / checking seems important at this
>stage

Clear, thanks for the info.

> Greetings,
>
> Andres Freund




Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gavin Flower

On 15/10/21 07:01, Josef Šimánek wrote:

čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold  napsal:

Hi,


Here is a proposal to implement HIDDEN columns feature in PostgreSQL.

The user defined columns are always visible in the PostgreSQL. If user
wants to hide some column(s) from a SELECT * returned values then the
hidden columns feature is useful. Hidden column can always be used and
returned by explicitly referring it in the query.

I agree that views are done for that or that using a SELECT * is a bad
practice
but sometime we could need to "technically" prevent some columns to be part
of a star expansion and nbot be forced to use view+rules.

Just to remind here, there was recently a proposal to handle this
problem another way - provide a list of columns to skip for "star
selection" aka "SELECT * EXCEPT col1...".

https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4da...@gmail.com


[...]

I feel using EXCEPT would be a lot clearer, no one is likely to be 
mislead into thinking that its is a security feature unlike 'HIDDEN'.  
Also you know that SELECT * will select all columns.


If this kind of feature were to be added, then I'd give a +1 to use the 
EXCEPT syntax.



Cheers,
Gavin






Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Rod Taylor
On Thu, 14 Oct 2021 at 07:16, Gilles Darold  wrote:

> Hi,
>
> Here is a proposal to implement HIDDEN columns feature in PostgreSQL.
>
> The user defined columns are always visible in the PostgreSQL. If user
> wants to hide some column(s) from a SELECT * returned values then the
> hidden columns feature is useful. Hidden column can always be used and
> returned by explicitly referring it in the query.
>

The behaviour of SELECT * is well defined and consistent across many
databases, so I don't like changing the behaviour of it.

I would be in favour of a different symbol which expands to a more
selective column set. Perhaps by default it picks up short textish columns;
skip bytea or long text fields for example but can be adjusted with HIDDEN.
Perhaps "SELECT +"?


-- 
Rod Taylor


Re: Gather performance analysis

2021-10-14 Thread Robert Haas
On Tue, Oct 12, 2021 at 10:14 AM Dilip Kumar  wrote:
> Thanks, yeah now it looks in line with other results.

Since it seems there are no remaining concerns here, and we have
benchmarking results showing that the patch helps, I have committed
the patch.

I wonder whether the new code in shm_mq_send_bytes() should guard
against calling shm_mq_inc_bytes_written() with a second argument of
0, or alternatively whether shm_mq_inc_bytes_written() should have an
internal defense against that. It might save some writes to shared
memory, but it would also add a branch, which isn't free, either.

I also think that, as a followup action item, we need to reassess
parallel_tuple_cost.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On October 14, 2021 12:14:16 PM PDT, John Naylor  
wrote:
>I wrote:
>
>> Ok great, it builds now! :-) Now something's off with dynamic loading.
>There are libraries in ./tmp_install/usr/local/lib/ but apparently initdb
>doesn't know to look for them there:
>>
>> $ cat /Users/john/pgdev/meson/build/testrun/main/pg_regress/log/initdb.log
>> dyld: Library not loaded: /usr/local/lib/libpq.5.dylib
>>   Referenced from:
>/Users/john/pgdev/meson/build/tmp_install/usr/local/bin/initdb
>>   Reason: image not found
>
>After poking a bit more, this only happens when trying to run the tests. If
>I specify a prefix, I can install, init, and start the server just fine, so
>that much works.

Is this a Mac with SIP enabled? The Mac CI presumably has that disabled, which 
is why I didn't see this issue there. Probably need to implement whatever Tom 
figured out to do about that for the current way of running tests.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Andrew Dunstan


On 10/14/21 12:15 AM, Tom Lane wrote:
> Peter Geoghegan  writes:
>> Any idea what the problems on drongo are?
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2021-10-14%2001%3A27%3A19
> It says
>
> # pg_ctl start failed; logfile:
> 2021-10-14 02:10:33.996 UTC [491848:1] LOG:  starting PostgreSQL 14.0, 
> compiled by Visual C++ build 1923, 64-bit
> 2021-10-14 02:10:33.999 UTC [491848:2] LOG:  could not bind IPv4 address 
> "127.0.0.1": Only one usage of each socket address (protocol/network 
> address/port) is normally permitted.
> 2021-10-14 02:10:33.999 UTC [491848:3] HINT:  Is another postmaster already 
> running on port 54407? If not, wait a few seconds and retry.
> 2021-10-14 02:10:33.999 UTC [491848:4] WARNING:  could not create listen 
> socket for "127.0.0.1"
> 2021-10-14 02:10:33.999 UTC [491848:5] FATAL:  could not create any TCP/IP 
> sockets
> 2021-10-14 02:10:34.000 UTC [491848:6] LOG:  database system is shut down
> Bail out!  pg_ctl start failed
>
> Looks like a transient/phase of the moon issue to me.
>
>   



Bowerbird is having similar issues, so I don't think this is just a
transient.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [RFC] building postgres with meson

2021-10-14 Thread John Naylor
On Thu, Oct 14, 2021 at 4:34 PM Andres Freund  wrote:

> Is this a Mac with SIP enabled? The Mac CI presumably has that disabled,
which is why I didn't see this issue there. Probably need to implement
whatever Tom figured out to do about that for the current way of running
tests.

System Information says it's disabled. Running "csrutil status" complains
of an unsupported configuration, which doesn't sound good, so I should
probably go fix that independent of anything else. :-/

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [PATCH] Proposal for HIDDEN/INVISIBLE column

2021-10-14 Thread Gilles Darold
Le 14/10/2021 à 22:01, Gavin Flower a écrit :
> On 15/10/21 07:01, Josef Šimánek wrote:
>> čt 14. 10. 2021 v 13:17 odesílatel Gilles Darold 
>> napsal:
>>> Hi,
>>>
>>>
>>> Here is a proposal to implement HIDDEN columns feature in PostgreSQL.
>>>
>>> The user defined columns are always visible in the PostgreSQL. If user
>>> wants to hide some column(s) from a SELECT * returned values then the
>>> hidden columns feature is useful. Hidden column can always be used and
>>> returned by explicitly referring it in the query.
>>>
>>> I agree that views are done for that or that using a SELECT * is a bad
>>> practice
>>> but sometime we could need to "technically" prevent some columns to
>>> be part
>>> of a star expansion and nbot be forced to use view+rules.
>> Just to remind here, there was recently a proposal to handle this
>> problem another way - provide a list of columns to skip for "star
>> selection" aka "SELECT * EXCEPT col1...".
>>
>> https://postgrespro.com/list/id/d51371a2-f221-1cf3-4a7d-b2242d4da...@gmail.com
>>
>
> [...]
>
> I feel using EXCEPT would be a lot clearer, no one is likely to be
> mislead into thinking that its is a security feature unlike 'HIDDEN'. 
> Also you know that SELECT * will select all columns.
>
> If this kind of feature were to be added, then I'd give a +1 to use
> the EXCEPT syntax.


I don't think that the EXCEPT syntax will be adopted as it change the
SQL syntax for SELECT in a non standard way. This is not the case of the
hidden column feature which doesn't touch of the SELECT or INSERT syntax.





Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Mark Dilger



> On Oct 14, 2021, at 1:50 PM, Andrew Dunstan  wrote:
> 
> Bowerbird is having similar issues, so I don't think this is just a
> transient.

The pg_amcheck patch Peter committed for me adds a new test, 
src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode objects 
(a primary and a standby) and uses PostgresNode::background_psql().  It doesn't 
bother to "finish" the returned harness, which may be the cause of an 
installation hanging around long enough to be in the way when another test 
tries to start.

Assuming this is right, the fix is just one line.  Thoughts? 

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Tom Lane
Andrew Dunstan  writes:
> On 10/14/21 12:15 AM, Tom Lane wrote:
>> Looks like a transient/phase of the moon issue to me.

> Bowerbird is having similar issues, so I don't think this is just a
> transient.

Yeah, I noticed that too today, and poked around a bit.  But I don't
see what this test is doing differently from other tests that
bowerbird is running successfully.  It's failing while trying to crank
up a replica using init_from_backup, which has lots of precedent.

I do see that bowerbird is skipping some comparable tests due
to using "--skip-steps misc-check".  But it's not skipping,
eg, pg_rewind's 008_min_recovery_point.pl; and the setup steps
in that sure look just the same.  What's different?

(BTW, I wondered if PostgresNode->new's own_host hack could fix this.
But AFAICS that's dead code, with no existing test using it.  Seems
like we should nuke it.)

regards, tom lane




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Tom Lane
Mark Dilger  writes:
> The pg_amcheck patch Peter committed for me adds a new test, 
> src/bin/pg_amcheck/t/006_bad_targets.pl, which creates two PostgresNode 
> objects (a primary and a standby) and uses PostgresNode::background_psql().  
> It doesn't bother to "finish" the returned harness, which may be the cause of 
> an installation hanging around long enough to be in the way when another test 
> tries to start.

(a) Isn't that just holding open one connection, not the whole instance?

(b) Wouldn't finish()ing that connection cause the temp tables to be
dropped, negating the entire point of the test?

TBH, I seriously doubt this test case is worth expending buildfarm
cycles on forevermore.  I'm more than a bit tempted to just drop
it, rather than also expending developer time figuring out why it's
not as portable as it looks.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread John Naylor
I wrote:

> > Is this a Mac with SIP enabled? The Mac CI presumably has that
disabled, which is why I didn't see this issue there. Probably need to
implement whatever Tom figured out to do about that for the current way of
running tests.
>
> System Information says it's disabled. Running "csrutil status" complains
of an unsupported configuration, which doesn't sound good, so I should
probably go fix that independent of anything else. :-/

Looking online, I wonder if the "unsupported" message might be overly
cautious. In any case, I do remember turning something off to allow a
debugger to run. Here are all the settings, in case it matters:

Apple Internal: disabled
Kext Signing: enabled
Filesystem Protections: enabled
Debugging Restrictions: disabled
DTrace Restrictions: enabled
NVRAM Protections: enabled
BaseSystem Verification: enabled

--
John Naylor
EDB: http://www.enterprisedb.com


Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Mark Dilger



> On Oct 14, 2021, at 2:13 PM, Tom Lane  wrote:
> 
> (a) Isn't that just holding open one connection, not the whole instance?

Yes.

> (b) Wouldn't finish()ing that connection cause the temp tables to be
> dropped, negating the entire point of the test?

The finish() would have to be the last line of the test.

> TBH, I seriously doubt this test case is worth expending buildfarm
> cycles on forevermore.  I'm more than a bit tempted to just drop
> it, rather than also expending developer time figuring out why it's
> not as portable as it looks.

I'm curious if the test is indicating something about the underlying test 
system.  Only one other test in the tree uses background_psql().  I was hoping 
Andrew would have something to say about whether this is a bug with that 
function or just user error on my part.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Peter Geoghegan
On Thu, Oct 14, 2021 at 2:13 PM Tom Lane  wrote:
> TBH, I seriously doubt this test case is worth expending buildfarm
> cycles on forevermore.  I'm more than a bit tempted to just drop
> it, rather than also expending developer time figuring out why it's
> not as portable as it looks.

I agree. I can go remove the whole file now, and will.

Mark: Any objections?

-- 
Peter Geoghegan




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Mark Dilger



> On Oct 14, 2021, at 2:21 PM, Peter Geoghegan  wrote:
> 
> I agree. I can go remove the whole file now, and will.
> 
> Mark: Any objections?

None of the "pride of ownership" type, but I would like to see something more 
about the limitations of background_psql().  It's the closest thing we have to 
being able to run things in parallel from TAP tests.  There's no 
isolationtester equivalent, and PostgresNode doesn't allow you to fork() in 
tests without hacking PostgresNodes END{} block.  So if we don't debug this, we 
never get any further towards parallel testing from perl.  Or do you have a 
different way forward for that?

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Peter Geoghegan
On Thu, Oct 14, 2021 at 2:24 PM Mark Dilger
 wrote:
> None of the "pride of ownership" type, but I would like to see something more 
> about the limitations of background_psql().

I'm not sure what that means for the buildfarm. Are you suggesting
that we leave things as-is pending an investigation on affected BF
animals, or something else?

> Or do you have a different way forward for that?

I don't know enough about this stuff to be able to comment.

-- 
Peter Geoghegan




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Mark Dilger



> On Oct 14, 2021, at 2:28 PM, Peter Geoghegan  wrote:
> 
> I'm not sure what that means for the buildfarm. Are you suggesting
> that we leave things as-is pending an investigation on affected BF
> animals, or something else?

I was just waiting a couple minutes to see if Andrew wanted to jump in.  Having 
heard nothing, I guess you can revert it.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Tom Lane
Mark Dilger  writes:
>> On Oct 14, 2021, at 2:13 PM, Tom Lane  wrote:
>> (b) Wouldn't finish()ing that connection cause the temp tables to be
>> dropped, negating the entire point of the test?

> The finish() would have to be the last line of the test.
> ...
> I'm curious if the test is indicating something about the underlying test 
> system.  Only one other test in the tree uses background_psql().  I was 
> hoping Andrew would have something to say about whether this is a bug with 
> that function or just user error on my part.

Neither of these things could explain the problem at hand, AFAICS,
because it's failing to start up the standby.

regards, tom lane




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Andrew Dunstan


On 10/14/21 5:09 PM, Tom Lane wrote:
> Andrew Dunstan  writes:
>> On 10/14/21 12:15 AM, Tom Lane wrote:
>>> Looks like a transient/phase of the moon issue to me.
>> Bowerbird is having similar issues, so I don't think this is just a
>> transient.
> Yeah, I noticed that too today, and poked around a bit.  But I don't
> see what this test is doing differently from other tests that
> bowerbird is running successfully.  It's failing while trying to crank
> up a replica using init_from_backup, which has lots of precedent.


Yes, that's been puzzling me too. I've just been staring at it again and
nothing jumps out. But maybe we can investigate that offline if this
test is deemed not worth keeping.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [RFC] building postgres with meson

2021-10-14 Thread Sergey Shinderuk

Hi,

On 14.10.2021 23:54, John Naylor wrote:
On Thu, Oct 14, 2021 at 4:34 PM Andres Freund > wrote:


 > Is this a Mac with SIP enabled? The Mac CI presumably has that 
disabled, which is why I didn't see this issue there. Probably need to 
implement whatever Tom figured out to do about that for the current way 
of running tests.


System Information says it's disabled. Running "csrutil status" 
complains of an unsupported configuration, which doesn't sound good, so 
I should probably go fix that independent of anything else. :-/



Maybe you could check that DYLD_LIBRARY_PATH is working for you?

% DYLD_FALLBACK_LIBRARY_PATH= 
DYLD_LIBRARY_PATH=./tmp_install/usr/local/lib 
./tmp_install/usr/local/bin/psql --version

psql (PostgreSQL) 15devel


Without DYLD_LIBRARY_PATH I get the error, as expected:

% DYLD_FALLBACK_LIBRARY_PATH= ./tmp_install/usr/local/bin/psql --version
dyld: Library not loaded: /usr/local/lib/libpq.5.dylib
  Referenced from: 
/Users/shinderuk/src/postgres-meson/build/./tmp_install/usr/local/bin/psql

  Reason: image not found


I add "DYLD_FALLBACK_LIBRARY_PATH=" because otherwise dyld falls back to 
/usr/lib/libpq.5.dylib provided by Apple (I am testing on Catalina).


% DYLD_PRINT_LIBRARIES=1 ./tmp_install/usr/local/bin/psql --version 2>&1 
| grep libpq

dyld: loaded: <4EDF735E-2104-32AD-BE7B-B400ABFCF57C> /usr/lib/libpq.5.dylib


Regards,

--
Sergey Shinderukhttps://postgrespro.com/




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
John Naylor  writes:
>> System Information says it's disabled. Running "csrutil status" complains
>> of an unsupported configuration, which doesn't sound good, so I should
>> probably go fix that independent of anything else. :-/

> Looking online, I wonder if the "unsupported" message might be overly
> cautious. In any case, I do remember turning something off to allow a
> debugger to run. Here are all the settings, in case it matters:

> Apple Internal: disabled
> Kext Signing: enabled
> Filesystem Protections: enabled
> Debugging Restrictions: disabled
> DTrace Restrictions: enabled
> NVRAM Protections: enabled
> BaseSystem Verification: enabled

I remember having seen that report too, after some previous software
upgrade that had started from a "SIP disabled" status.  I'm mostly
guessing here, but my guess is that

(a) csrutil only considers the all-enabled and all-disabled states
of these individual flags to be "supported" cases.

(b) some one or more of these flags came along in a macOS update,
and if you did the update starting from a "disabled" state, you
nonetheless ended up with the new flags enabled, leading to the
mixed state that csrutil complains about.

I've lost count of the number of times I've seen macOS updates
be sloppy about preserving non-default settings, so I don't find
theory (b) to be even slightly surprising.

Whether the mixed state is actually problematic in any way,
I dunno.  I don't recall having had any problems before noticing
that that was what I had.

regards, tom lane




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Peter Geoghegan
On Thu, Oct 14, 2021 at 2:31 PM Mark Dilger
 wrote:
> I was just waiting a couple minutes to see if Andrew wanted to jump in.  
> Having heard nothing, I guess you can revert it.

Okay. Pushed a commit removing the test case just now.

Thanks
-- 
Peter Geoghegan




Re: BUG #17212: pg_amcheck fails on checking temporary relations

2021-10-14 Thread Tom Lane
Andrew Dunstan  writes:
> Yes, that's been puzzling me too. I've just been staring at it again and
> nothing jumps out. But maybe we can investigate that offline if this
> test is deemed not worth keeping.

As Mark says, it'd be interesting to know whether the use of
background_psql is related, because if it is, we'd want to debug that.
(I don't really see how it could be related, but maybe I just lack
sufficient imagination today.)

Beyond that, ISTM this is blocking all TAP testing on the Windows
machines, which is pretty bad to leave in place for long.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Andres Freund  writes:
> Is this a Mac with SIP enabled? The Mac CI presumably has that disabled, 
> which is why I didn't see this issue there. Probably need to implement 
> whatever Tom figured out to do about that for the current way of running 
> tests.

AFAIR the only cases we've made work are

(1) disable SIP

(2) avoid the need for (1) by always doing "make install" before
"make check".

Peter E. did some hacking towards another solution awhile ago,
but IIRC it involved changing the built binaries, and I think
we concluded that the benefits didn't justify that.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Andres Freund  writes:
> Hm, so it seems we should make the test separately verify that perl -M{Opcode,
> ExtUtils::Embed, ExtUtils::ParseXS} doesn't fail, so that we can fail perl
> detection with a useful message?

Our existing policy is that we should check this at configure time,
not later.  Since plperl won't work at all without Opcode, it seems
appropriate to add a check there if you say --with-perl.  I wasn't
aware that Red Hat had unbundled that from the minimal perl
installation :-(.

OTOH, if they've not unbundled ExtUtils::Embed or ExtUtils::ParseXS,
I doubt it's worth the configure cycles to check for those separately.

regards, tom lane




Re: relation OID in ReorderBufferToastReplace error message

2021-10-14 Thread Bossart, Nathan
On 9/23/21, 11:26 AM, "Alvaro Herrera"  wrote:
> On 2021-Sep-23, Jeremy Schneider wrote:
>
>> On 9/22/21 20:11, Amit Kapila wrote:
>> >
>> > On Thu, Sep 23, 2021 at 3:06 AM Jeremy Schneider  
>> > wrote:
>> >>
>> >> Any chance of back-patching this?
>> >
>> > Normally, we don't back-patch code improvements unless they fix some
>> > bug or avoid future back-patch efforts. So, I am not inclined to
>> > back-patch this but if others also feel strongly about this then we
>> > can consider it.
>>
>> The original thread about the logical replication bugs spawned a few
>> different threads and code changes. The other code changes coming out of
>> those threads were all back-patched, but I guess I can see arguments
>> both ways on this one.
>
> I think that for patches that are simple debugging aids we do
> backpatch, with the intent to get them deployed in users' systems as
> soon and as widely possible.  I did that in this one, for example

+1 for back-patching

Nathan



Re: [RFC] building postgres with meson

2021-10-14 Thread Andrew Dunstan


On 10/13/21 7:11 PM, Andrew Dunstan wrote:
> On 10/13/21 5:46 PM, Andres Freund wrote:
>> Hi,
>>
>> On 2021-10-13 16:06:32 -0400, Andrew Dunstan wrote:
>>> If you prep a patch I'll test it.
>> Well, right now I'm wondering if the better fix is to just remove the whole
>> win32 block. I don't know how far back, but afaict it's not needed. Seems to
>> have been needed for narwhal at some point, according to 02b61dd08f99. But
>> narwhal is long dead.
>>
> Ok, I'll test it out.
>

confirmed that jacana doesn't need this code to build or test plperl
(all I did was change the test from win32 to win32x). There would still
be work to do to fix the contrib bool_plperl, jsonb_plperl and
hstore_plperl modules.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: [RFC] building postgres with meson

2021-10-14 Thread Thomas Munro
On Fri, Oct 15, 2021 at 11:00 AM Tom Lane  wrote:
> Peter E. did some hacking towards another solution awhile ago,
> but IIRC it involved changing the built binaries, and I think
> we concluded that the benefits didn't justify that.

Yeah, by now there are lots of useful blogs from various projects
figuring out that you can use the install_name_tool to adjust the
paths it uses to be absolute or relative to certain magic words, like
@executable_path/../lib/blah.dylib, which is tempting, but...
realistically, for serious hacking on a Mac, SIP is so annoying that
it isn't the only reason you'll want to turn it off: it stops
dtrace/dtruss/... from working, and somehow prevents debuggers from
working when you've ssh'd in from a remote machine with a proper
keyboard, and probably more things that I'm forgetting.

I wish I could find the Xnu source that shows exactly how and when the
environment is suppressed in this way to understand better, but it
doesn't jump out of Apple's github; maybe it's hiding in closed source
machinery...




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Thomas Munro  writes:
> I wish I could find the Xnu source that shows exactly how and when the
> environment is suppressed in this way to understand better, but it
> doesn't jump out of Apple's github; maybe it's hiding in closed source
> machinery...

I recall that we figured out awhile ago that the environment gets trimmed
when make (or whatever) executes some command via the shell; seemingly,
Apple has decided that /bin/sh is a security-critical program that mustn't
be run with a non-default DYLD_LIBRARY_PATH.  Dunno if that helps you
find where the damage is done exactly.

(The silliness of this policy, when you pair it with the fact that they
don't reset PATH at the same time, seems blindingly obvious to me.  But
apparently not to Apple.)

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 18:00:49 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Is this a Mac with SIP enabled? The Mac CI presumably has that disabled, 
> > which is why I didn't see this issue there. Probably need to implement 
> > whatever Tom figured out to do about that for the current way of running 
> > tests.
> 
> AFAIR the only cases we've made work are
> 
> (1) disable SIP
> 
> (2) avoid the need for (1) by always doing "make install" before
> "make check".

Ah, I thought it was more than that. In that case, John, does meson's test
succeed after you did the "proper" install? Assuming it's in a path that's
allowed to provide shared libraries?

Greetings,

Andres Freund




[PATCH] Prefer getenv("HOME") to find the UNIX home directory

2021-10-14 Thread Anders Kaseorg
According to getpwnam(3):

  An application that wants to determine its user's home directory
  should inspect the value of HOME (rather than the value
  getpwuid(getuid())->pw_dir) since this allows the user to modify
  their notion of "the home directory" during a login session.

This is important for systems where many users share the same UID, and for test 
systems that change HOME to avoid interference with the user’s real home 
directory.  It matches what most applications do, as well as what glibc does 
for glob("~", GLOB_TILDE, …) and wordexp("~", …).

There was some previous discussion of this in 2016, where although there were 
some questions about the use case, there seemed to be general support for the 
concept:

https://www.postgresql.org/message-id/flat/CAEH6cQqbdbXoUHJBbX9ixwfjFFsUC-a8hFntKcci%3DdiWgBb3fQ%40mail.gmail.com

Regardless of whether one thinks modifying HOME is a good idea, if we happen to 
find ourselves in that case, we should respect the modified HOME, so that when 
the user creates (say) a ~/.pgpass file, we’ll look for it at the same place 
the user’s editor created it.  getenv() also skips the overhead of reading 
/etc/passwd as an added bonus.

The way I ran into this issue myself was in a test suite that runs on GitHub 
Actions, which automatically sets HOME=/github/home.

Anders
From df9c435c759fffe77c9c92f70e7c095ffb6556ae Mon Sep 17 00:00:00 2001
From: Anders Kaseorg 
Date: Thu, 14 Oct 2021 15:20:13 -0700
Subject: [PATCH v1] Prefer getenv("HOME") to find the UNIX home directory

According to getpwnam(3):

  An application that wants to determine its user's home directory
  should inspect the value of HOME (rather than the value
  getpwuid(getuid())->pw_dir) since this allows the user to modify
  their notion of "the home directory" during a login session.

Signed-off-by: Anders Kaseorg 
---
 src/bin/psql/command.c| 20 
 src/interfaces/libpq/fe-connect.c | 14 ++
 src/port/path.c   | 14 ++
 3 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 49d4c0e3ce..cc2fe6ba0e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -15,6 +15,7 @@
 #include 			/* for stat() */
 #include 			/* for setitimer() */
 #include /* open() flags */
+#include /* for getenv() */
 #include /* for geteuid(), getpid(), stat() */
 #else
 #include 
@@ -558,15 +559,18 @@ exec_command_cd(PsqlScanState scan_state, bool active_branch, const char *cmd)
 			uid_t		user_id = geteuid();
 
 			errno = 0;			/* clear errno before call */
-			pw = getpwuid(user_id);
-			if (!pw)
-			{
-pg_log_error("could not get home directory for user ID %ld: %s",
-			 (long) user_id,
-			 errno ? strerror(errno) : _("user does not exist"));
-exit(EXIT_FAILURE);
+			dir = getenv("HOME");
+			if (dir == NULL || dir[0] == '\0') {
+pw = getpwuid(user_id);
+if (!pw)
+{
+	pg_log_error("could not get home directory for user ID %ld: %s",
+ (long) user_id,
+ errno ? strerror(errno) : _("user does not exist"));
+	exit(EXIT_FAILURE);
+}
+dir = pw->pw_dir;
 			}
-			dir = pw->pw_dir;
 #else			/* WIN32 */
 
 			/*
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index b288d346f9..ebdd815c73 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "common/ip.h"
@@ -7242,14 +7243,19 @@ bool
 pqGetHomeDirectory(char *buf, int bufsize)
 {
 #ifndef WIN32
+	const char *home;
 	char		pwdbuf[BUFSIZ];
 	struct passwd pwdstr;
 	struct passwd *pwd = NULL;
 
-	(void) pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pwd);
-	if (pwd == NULL)
-		return false;
-	strlcpy(buf, pwd->pw_dir, bufsize);
+	home = getenv("HOME");
+	if (home == NULL || home[0] == '\0') {
+		(void) pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pwd);
+		if (pwd == NULL)
+			return false;
+		home = pwd->pw_dir;
+	}
+	strlcpy(buf, home, bufsize);
 	return true;
 #else
 	char		tmppath[MAX_PATH];
diff --git a/src/port/path.c b/src/port/path.c
index c39d4688cd..607bd16c23 100644
--- a/src/port/path.c
+++ b/src/port/path.c
@@ -32,6 +32,7 @@
 #define near
 #include 
 #else
+#include 
 #include 
 #endif
 
@@ -807,14 +808,19 @@ bool
 get_home_path(char *ret_path)
 {
 #ifndef WIN32
+	const char *home;
 	char		pwdbuf[BUFSIZ];
 	struct passwd pwdstr;
 	struct passwd *pwd = NULL;
 
-	(void) pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pwd);
-	if (pwd == NULL)
-		return false;
-	strlcpy(ret_path, pwd->pw_dir, MAXPGPATH);
+	home = getenv("HOME");
+	if (home == NULL || home[0] == '\0') {
+		(void) pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pwd);
+		if (pwd == NULL)
+			return false;
+		home = pwd->pw_dir;
+	}
+	strlcpy(ret_path, home, MAXPGPATH);
 	return true;
 #else
 	char	   *tmppath;

Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
I wrote:
> I recall that we figured out awhile ago that the environment gets trimmed
> when make (or whatever) executes some command via the shell; seemingly,
> Apple has decided that /bin/sh is a security-critical program that mustn't
> be run with a non-default DYLD_LIBRARY_PATH.  Dunno if that helps you
> find where the damage is done exactly.

BTW, here's the evidence for this theory:

[tgl@pro ~]$ cat checkenv.c
#include 
#include 

int
main(int argc, char **argv)
{
  char *pth = getenv("DYLD_LIBRARY_PATH");

  if (pth)
printf("DYLD_LIBRARY_PATH = %s\n", pth);
  else
printf("DYLD_LIBRARY_PATH is unset\n");

  return 0;
}
[tgl@pro ~]$ gcc checkenv.c
[tgl@pro ~]$ ./a.out
DYLD_LIBRARY_PATH is unset
[tgl@pro ~]$ export DYLD_LIBRARY_PATH=/Users/tgl/pginstall/lib
[tgl@pro ~]$ ./a.out
DYLD_LIBRARY_PATH = /Users/tgl/pginstall/lib
[tgl@pro ~]$ sh -c ./a.out
DYLD_LIBRARY_PATH is unset
[tgl@pro ~]$ ./a.out
DYLD_LIBRARY_PATH = /Users/tgl/pginstall/lib
[tgl@pro ~]$ bash -c ./a.out
DYLD_LIBRARY_PATH is unset

You have to check the environment using an "unprivileged" program.
If you try to examine the environment using, say, "env", you will get
very misleading results.  AFAICT, /usr/bin/env is *also* considered
security-critical, because I cannot get it to ever report that
DYLD_LIBRARY_PATH is set.

Hmm ... /usr/bin/perl seems to act the same way.  It can see
ENV{'PATH'} but not ENV{'DYLD_LIBRARY_PATH'}.

This may indicate that they've applied this policy on a blanket
basis to everything in /bin and /usr/bin (and other system
directories, maybe), rather than singling out the shell.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-15 11:23:00 +1300, Thomas Munro wrote:
> On Fri, Oct 15, 2021 at 11:00 AM Tom Lane  wrote:
> > Peter E. did some hacking towards another solution awhile ago,
> > but IIRC it involved changing the built binaries, and I think
> > we concluded that the benefits didn't justify that.
> 
> Yeah, by now there are lots of useful blogs from various projects
> figuring out that you can use the install_name_tool to adjust the
> paths it uses to be absolute or relative to certain magic words, like
> @executable_path/../lib/blah.dylib, which is tempting, but...
> realistically, for serious hacking on a Mac, SIP is so annoying that
> it isn't the only reason you'll want to turn it off: it stops
> dtrace/dtruss/... from working, and somehow prevents debuggers from
> working when you've ssh'd in from a remote machine with a proper
> keyboard, and probably more things that I'm forgetting.

Meson has support for using install_name_tool to remove "build time" rpaths
and set "install time" rpaths during the installation process - which uses
install_name_tool on mac.

If, and perhaps that's too big an if, relative rpaths actually work despite
SIP, it might be worth setting a relative install_rpath, because afaict that
should then work both for a "real" installation and our temporary test one.

If absolute rpaths are required, it'd make the process a bit more expensive,
because we'd probably need to change a configure time option during the 
temporary
install. No actual rebuilds would be required, but still.

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Andres Freund  writes:
> If, and perhaps that's too big an if, relative rpaths actually work despite
> SIP, it might be worth setting a relative install_rpath, because afaict that
> should then work both for a "real" installation and our temporary test one.

>From what we know so far, it seems like SIP wouldn't interfere with
that (if it works at all).  I think what SIP desires to prevent is
messing with a program's execution by setting DYLD_LIBRARY_PATH.
As long as the program executable itself is saying where to find
the library, I don't see why they should interfere with that.

(Again, it seems blindingly stupid to forbid this while not blocking
PATH or any of the other environment variables that have always affected
execution.  But what do I know.)

> If absolute rpaths are required, it'd make the process a bit more expensive,

It'd also put the kibosh on relocatable install trees, though I dunno how
much people really care about that.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread John Naylor
On Thu, Oct 14, 2021 at 6:55 PM Andres Freund  wrote:
> Ah, I thought it was more than that. In that case, John, does meson's test
> succeed after you did the "proper" install? Assuming it's in a path that's
> allowed to provide shared libraries?

Oh, it can act like installcheck? [checks] Yep, "meson test" ran fine (*).
It still ran the temp install first, but in any case it worked. The full
"configure step"  was

meson setup build --buildtype debug -Dldap=disabled -Dcassert=true
-Dprefix=$(pwd)/inst

* (all passed but skipped subscription/t/012_collation.pl)

--
John Naylor
EDB: http://www.enterprisedb.com


Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 18:08:58 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Hm, so it seems we should make the test separately verify that perl 
> > -M{Opcode,
> > ExtUtils::Embed, ExtUtils::ParseXS} doesn't fail, so that we can fail perl
> > detection with a useful message?
>
> Our existing policy is that we should check this at configure time,
> not later.

Yea, I was thinking of configure (and meson's equivalent) as well.


> Since plperl won't work at all without Opcode, it seems
> appropriate to add a check there if you say --with-perl.  I wasn't
> aware that Red Hat had unbundled that from the minimal perl
> installation :-(.
>
> OTOH, if they've not unbundled ExtUtils::Embed or ExtUtils::ParseXS,
> I doubt it's worth the configure cycles to check for those separately.

On debian the perl binary, with a sparse set of modules is in
perl-base. ExtUtils::Embed and ExtUtils::ParseXS are in
perl-modules-x.yy. Whereas Opcode is in libperlx.yy. But libperlx.yy depends
on perl-modules-x.yy so I guess an Opcode.pm check would suffice.

Seems we can just check all of them at once with with something like

perl -MOpcode -MExtUtils::Embed -MExtUtils::ParseXSNotAvailable -e ''
Can't locate ExtUtils/ParseXSNotAvailable.pm in @INC (you may need to install 
the ExtUtils::ParseXS3 module) (@INC contains: 
/home/andres/bin/perl5/lib/perl5/x86_64-linux-gnu-thread-multi 
/home/andres/bin/perl5/lib/perl5 /etc/perl 
/usr/local/lib/x86_64-linux-gnu/perl/5.32.1 /usr/local/share/perl/5.32.1 
/usr/lib/x86_64-linux-gnu/perl5/5.32 /usr/share/perl5 
/usr/lib/x86_64-linux-gnu/perl-base /usr/lib/x86_64-linux-gnu/perl/5.32 
/usr/share/perl/5.32 /usr/local/lib/site_perl).

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Andres Freund  writes:
> On 2021-10-14 18:08:58 -0400, Tom Lane wrote:
>> Andres Freund  writes:
>>> Hm, so it seems we should make the test separately verify that perl 
>>> -M{Opcode,
>>> ExtUtils::Embed, ExtUtils::ParseXS} doesn't fail, so that we can fail perl
>>> detection with a useful message?

>> Our existing policy is that we should check this at configure time,
>> not later.

> Yea, I was thinking of configure (and meson's equivalent) as well.

Ah, sorry, I misunderstood what you meant by "test".

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 19:27:17 -0400, John Naylor wrote:
> On Thu, Oct 14, 2021 at 6:55 PM Andres Freund  wrote:
> > Ah, I thought it was more than that. In that case, John, does meson's test
> > succeed after you did the "proper" install? Assuming it's in a path that's
> > allowed to provide shared libraries?
> 
> Oh, it can act like installcheck? [checks] Yep, "meson test" ran fine (*).
> It still ran the temp install first, but in any case it worked.

As far as I understand Tom, our normal make check only works on OSX if
previously you ran make install. Which will have installed libpq into the
"proper" install location. Because all our binaries will, by default, have an
rpath to the library directory embedded, that then allows binaries in the
temporary install to work. But using the wrong libpq - which most of the time
turns out to be harmless, because libpq doesn't change that rapidly.


> * (all passed but skipped subscription/t/012_collation.pl)

That test requires ICU, so that's fine. I guess we could prevent the test from
being executed in the first place, but I don't think we've done that for cases
where it's one specific test in a t/ directory, where others in the same
directory do not have such dependencies.

Greetings,

Andres Freund




Re: [RFC] building postgres with meson

2021-10-14 Thread Thomas Munro
On Fri, Oct 15, 2021 at 12:04 PM Tom Lane  wrote:
> [tgl@pro ~]$ cat checkenv.c
> #include 
> #include 
>
> int
> main(int argc, char **argv)
> {
>   char *pth = getenv("DYLD_LIBRARY_PATH");
>
>   if (pth)
> printf("DYLD_LIBRARY_PATH = %s\n", pth);
>   else
> printf("DYLD_LIBRARY_PATH is unset\n");
>
>   return 0;
> }
> [tgl@pro ~]$ gcc checkenv.c
> [tgl@pro ~]$ ./a.out
> DYLD_LIBRARY_PATH is unset
> [tgl@pro ~]$ export DYLD_LIBRARY_PATH=/Users/tgl/pginstall/lib
> [tgl@pro ~]$ ./a.out
> DYLD_LIBRARY_PATH = /Users/tgl/pginstall/lib
> [tgl@pro ~]$ sh -c ./a.out
> DYLD_LIBRARY_PATH is unset
> [tgl@pro ~]$ ./a.out
> DYLD_LIBRARY_PATH = /Users/tgl/pginstall/lib
> [tgl@pro ~]$ bash -c ./a.out
> DYLD_LIBRARY_PATH is unset
>
> You have to check the environment using an "unprivileged" program.
> If you try to examine the environment using, say, "env", you will get
> very misleading results.  AFAICT, /usr/bin/env is *also* considered
> security-critical, because I cannot get it to ever report that
> DYLD_LIBRARY_PATH is set.
>
> Hmm ... /usr/bin/perl seems to act the same way.  It can see
> ENV{'PATH'} but not ENV{'DYLD_LIBRARY_PATH'}.
>
> This may indicate that they've applied this policy on a blanket
> basis to everything in /bin and /usr/bin (and other system
> directories, maybe), rather than singling out the shell.

Looks like it.  If I've found the right code here, it looks like where
any common-or-garden Unix runtime linker would ignore LD_LIBRARY_PATH
for a setuid binary, they've trained theirs to whack DYLD_*, and also
for code-signed and __RESTRICT-marked executables.

https://github.com/opensource-apple/dyld/blob/master/src/dyld.cpp#L1681

I suppose you could point SHELL at an unsigned copy of sh (codesign
--remove-signature, or something from brew/ports/x) so that GNU make
should respect, but I don't know how many other exec("/bin/sh") calls
might be hiding around the place (I guess perl calls system()?) and
might require some kind of LD_PRELOAD hackery... not much fun.




Re: Added schema level support for publication.

2021-10-14 Thread Greg Nancarrow
On Thu, Oct 14, 2021 at 9:59 PM Amit Kapila  wrote:
>
> > If partitions belong to a different schema than the parent partitioned
> > table, then the current patch implementation allows the partitions to
> > (optionally) be explicitly added to a publication that includes the
> > parent partitioned table (and for the most part, it doesn't seem to
> > make any difference to the publication behavior). Should this be
> > allowed?
> >
> > e.g.
> >
> > CREATE SCHEMA sch;
> > CREATE SCHEMA sch1;
> > CREATE TABLE sch.sale (sale_date date not null, country_code text,
> > product_sku text, units integer) PARTITION BY RANGE (sale_date);
> > CREATE TABLE sch1.sale_201901 PARTITION OF sch.sale FOR VALUES FROM
> > ('2019-01-01') TO ('2019-02-01');
> > CREATE TABLE sch1.sale_201902 PARTITION OF sch.sale FOR VALUES FROM
> > ('2019-02-01') TO ('2019-03-01');
> >
> > postgres=# CREATE PUBLICATION pub FOR ALL TABLES IN SCHEMA sch, TABLE
> > sch1.sale_201901, TABLE sch1.sale_201902;
> > CREATE PUBLICATION
> > postgres=# \dRp+
> >  Publication pub
> >  Owner | All tables | Inserts | Updates | Deletes | Truncates | Via root
> > ---++-+-+-+---+--
> >  gregn | f  | t   | t   | t   | t | f
> > Tables:
> > "sch1.sale_201901"
> > "sch1.sale_201902"
> > Tables from schemas:
> > "sch"
> >
>
> I don't see any problem with this. Do you have a specific problem in
> mind due to this?
>

I'm not sure if it's a problem as such, really just a query from me as
to whether it should be allowed to also (redundantly) add partitions
to the publication, in addition to the partitioned table, since the
current documentation says: "When a partitioned table is added to a
publication, all of its existing and future partitions are implicitly
considered to be part of the publication".
I guess it should be allowed, as I find I can do it in the current
implementation just with TABLE.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: [RFC] building postgres with meson

2021-10-14 Thread Tom Lane
Thomas Munro  writes:
> On Fri, Oct 15, 2021 at 12:04 PM Tom Lane  wrote:
>> This may indicate that they've applied this policy on a blanket
>> basis to everything in /bin and /usr/bin (and other system
>> directories, maybe), rather than singling out the shell.

> Looks like it.  If I've found the right code here, it looks like where
> any common-or-garden Unix runtime linker would ignore LD_LIBRARY_PATH
> for a setuid binary, they've trained theirs to whack DYLD_*, and also
> for code-signed and __RESTRICT-marked executables.
> https://github.com/opensource-apple/dyld/blob/master/src/dyld.cpp#L1681

Ugh.  That explains it, all right.

> I suppose you could point SHELL at an unsigned copy of sh (codesign
> --remove-signature, or something from brew/ports/x) so that GNU make
> should respect, but I don't know how many other exec("/bin/sh") calls
> might be hiding around the place (I guess perl calls system()?) and
> might require some kind of LD_PRELOAD hackery... not much fun.

Yeah.  I thought about invoking everything via a small wrapper
that restores the correct setting of DYLD_LIBRARY_PATH.  We could
perhaps make that work for the invocations of test postmasters
and psqls from "make" and TAP scripts, but hacking up our code's
sundry uses of system(3) like that seems like it'd be very messy,
if feasible at all.

BTW, the POSIX spec explicitly discourages letting SHELL affect the
behavior of system(3), so I bet that wouldn't help.

regards, tom lane




Re: [RFC] building postgres with meson

2021-10-14 Thread Andres Freund
Hi,

On 2021-10-14 22:46:07 -0400, Tom Lane wrote:
> Thomas Munro  writes:
> > I suppose you could point SHELL at an unsigned copy of sh (codesign
> > --remove-signature, or something from brew/ports/x) so that GNU make
> > should respect, but I don't know how many other exec("/bin/sh") calls
> > might be hiding around the place (I guess perl calls system()?) and
> > might require some kind of LD_PRELOAD hackery... not much fun.
> 
> Yeah.  I thought about invoking everything via a small wrapper
> that restores the correct setting of DYLD_LIBRARY_PATH.  We could
> perhaps make that work for the invocations of test postmasters
> and psqls from "make" and TAP scripts, but hacking up our code's
> sundry uses of system(3) like that seems like it'd be very messy,
> if feasible at all.

It does sound like using relative rpaths might be the thing we want - and like
they've been available since 10.5 or something.

Is there a reason we're using absolute rpaths on a bunch of platforms, rather
than relative ones, which'd allow relocation?

Greetings,

Andres Freund




Re: [PATCH] Added TRANSFORM FOR for COMMENT tab completion

2021-10-14 Thread Shinya Kato

On 2021-10-14 14:30, katouknl wrote:

It is very good, but it seems to me that there are some tab-completion
missing in COMMENT command.
For example,
- CONSTRAINT ... ON DOMAIN
- OPERATOR CLASS
- OPERATOR FAMILY
- POLICY ... ON
- [PROCEDURAL]
- RULE ... ON
- TRIGGER ... ON

I think these tab-comletion also can be improved and it's a good
timing for that.


Thank you for the comments!

I fixed where you pointed out.

Thank you for the update!
I tried "COMMENT ON OPERATOR ...", and an operator seemed to be 
complemented with double quotation marks.

However, it caused the COMMENT command to fail.
---
postgres=# COMMENT ON OPERATOR "+" (integer, integer) IS 'test_fail';
ERROR:  syntax error at or near "("
LINE 1: COMMENT ON OPERATOR "+" (integer, integer) IS 'test_fail';
postgres=# COMMENT ON OPERATOR + (integer, integer) IS 'test_success';
COMMENT
---

So, I think as with \do command, you do not need to complete the 
operators.

Do you think?


--
Regards,

--
Shinya Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




  1   2   >