Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Andreas Kretschmer
It is in EPAS 15, but for the whole cluster. Different keys for each database 
are not possible; how would that work, for instance, for the WAL stream?

On 18 May 2023 00:35:39 CEST, Tony Xu  wrote:
>Hi There,
>
>The FAQ (copied below) mentioned that native transparent data encryption
>might be included in 16. Is it fair to assume that it will support database
>level encryption, that is, we can use two encryption keys for two databases
>in the same server, respectively? How can one verify that?
>
>Thanks
>Tony
>
>
>
>
>https://www.postgresql.org/about/press/faq/
>
>*Q: What features will PostgreSQL 16 have?
>A: As always, we can't be certain what will go in and what won't; the
>project has strict quality standards that not all patches can make before
>deadline. All we can tell you is what's currently being worked on, which
>includes native transparent data encryption support, continued improvements
>to logical replication, parallelism, partitioning, and vacuuming, and many
>more features. By the time 16 is released, though, this list may have
>changed considerably.*


Is there a good way to handle sum types (or tagged unions) in PostgreSQL?

2023-05-18 Thread Victor Nordam Suadicani
Hi,

Is there any nice way to handle sum types (aka tagged unions) in a
PostgreSQL database? I've searched far and wide and have not reached any
satisfying answer.

As a (somewhat contrived) example, say I have the following enum in Rust:

enum TaggedUnion {
    Variant1(String),
    Variant2(i32),
    Variant3(f64),
}

How might I best save this data in a PostgreSQL database? There is, to my
knowledge, unfortunately no way to "natively" handle sum types like this.

One method would be to have 3 different tables, one for each variant. This
is not a great solution, as you can't (as far as I know) easily query for
all variants at once (for instance, to serialize into a Vec<TaggedUnion> on
the Rust side).
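
(For reference, a minimal sketch of this layout; the table names are
hypothetical, and the closest thing to "all variants at once" is a UNION ALL
that erases the payload types:)

-- Hypothetical per-variant tables for the enum above.
CREATE TABLE variant1 (id bigint PRIMARY KEY, payload text);
CREATE TABLE variant2 (id bigint PRIMARY KEY, payload integer);
CREATE TABLE variant3 (id bigint PRIMARY KEY, payload double precision);

-- Querying every variant forces a lowest-common-denominator representation.
SELECT id, 'Variant1' AS tag, payload::text FROM variant1
UNION ALL
SELECT id, 'Variant2', payload::text FROM variant2
UNION ALL
SELECT id, 'Variant3', payload::text FROM variant3;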

Another method would be to use PostgreSQL table inheritance. This has the
same problem as the above workaround but also has the issue that you could
put something into the supertable without having a corresponding entry in
any of the subtables (basically a value not inhabiting any of the variants,
which is nonsense).

A third method would be to save all fields of all variants into a single
table, with all fields being nullable. So you'd have a nullable text field,
nullable integer and nullable double precision field. You'd then need an
additional tag field to indicate which variant of the union is used and
you'd have to write check constraints for each variant to ensure that all
the fields in that variant are not null and all the fields not in that
variant *are* null. This *almost* works, but has two major problems:

1. It wastes space. In Rust, an enum is only as big as its largest variant.
Using this method, a table row would be as big as the sum of all the
variants.

2. Querying the data is very cumbersome, as there is no way to indicate
(beyond check constraints) that, given a specific tag, certain fields
must be filled while the other fields must not be. For instance, the
nullability of fields can be used to serialize into the Option<T> type in
Rust. There is no "nice" way to tell the host language that the nullability
of the variant fields hinges on the value of the extra tag field.

Both of these problems get bigger and bigger as you add more variants - it
doesn't scale well.
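
(For concreteness, a minimal sketch of this third, tagged-row method for the
enum above; the table and column names are illustrative:)

-- Single-table encoding: one tag column plus one nullable column per
-- variant, with CHECK constraints tying nullability to the tag.
CREATE TABLE tagged_union (
    id  bigint PRIMARY KEY,
    tag text NOT NULL CHECK (tag IN ('Variant1', 'Variant2', 'Variant3')),
    v1  text,
    v2  integer,
    v3  double precision,
    CHECK (tag <> 'Variant1' OR (v1 IS NOT NULL AND v2 IS NULL AND v3 IS NULL)),
    CHECK (tag <> 'Variant2' OR (v2 IS NOT NULL AND v1 IS NULL AND v3 IS NULL)),
    CHECK (tag <> 'Variant3' OR (v3 IS NOT NULL AND v1 IS NULL AND v2 IS NULL))
);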

Does anyone know of better methods? I realize the example enum is contrived,
but this kind of thing comes up *a lot* in all kinds of places. For
instance, different kinds of messages that all have various different
associated data, or different events that record different additional data.
Sum types are ubiquitous in languages like Rust and Haskell.

If there are no good methods of handling this, is there any way PostgreSQL
could be extended with capabilities for this? I have no idea how this would
be done in practice though. Perhaps SQL itself is just unsuited for data of
this kind? I don't really see why it should be though.

Thanks,
Victor Nordam Suadicani


JSONB operator unanticipated behaviour

2023-05-18 Thread Brian Mendoza
Hello,

I have encountered unanticipated behaviour with a JSONB operator, and
wanted to make sure I am not misunderstanding its intended use.

When using the @> operator, I get this result.

select '{"a": [1]}'::jsonb @> '{"a": []}'::jsonb;
 ?column?
--
 t
(1 row)

However, given the phrasing in the documentation, I would have expected
False.

"Does the left JSON value contain the right JSON path/value entries at the
top level?"

Particularly given the following:

select '[1]'::jsonb = '[]'::jsonb;
 ?column?
--
 f
(1 row)

So the keys are the same, the values (when compared directly) are not, but
@> returns True. Have I misunderstood the usage of the operator?

The above queries have been run on postgres 14, if that helps.

-- 
Brian Mendoza
br...@rotamap.net

Rotamap
www.rotamap.net
020 7631 1555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928


Re: Is there a good way to handle sum types (or tagged unions) in PostgreSQL?

2023-05-18 Thread Adrian Klaver

On 5/18/23 05:27, Victor Nordam Suadicani wrote:

Hi,

Is there any nice way to handle sum types (aka tagged unions) in a 
PostgreSQL database? I've searched far and wide and have not reached any 
satisfying answer.


As a (somewhat contrived) example, say I have the following enum in Rust:

enum TaggedUnion {
     Variant1(String),
     Variant2(i32),
     Variant3(f64),
}



If there are no good methods of handling this, is there any way 
PostgreSQL could be extended with capabilities for this? I have no idea 
how this would be done in practice though. Perhaps SQL itself is just 
unsuited for data of this kind? I don't really see why it should be though.


Composite type?:

https://www.postgresql.org/docs/current/rowtypes.html
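
(For instance, the composite type declaration from that page looks like:)

-- From the linked documentation: declaring a composite (row) type.
CREATE TYPE inventory_item AS (
    name        text,
    supplier_id integer,
    price       numeric
);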


Thanks,
Victor Nordam Suadicani


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Adrian Klaver

On 5/18/23 06:17, Brian Mendoza wrote:

Hello,

I have encountered unanticipated behaviour with a JSONB operator, and 
wanted to make sure I am not misunderstanding its intended use.


When using the @> operator, I get this result.

select '{"a": [1]}'::jsonb @> '{"a": []}'::jsonb;
  ?column?
--
  t
(1 row)

However, given the phrasing in the documentation, I would have expected 
False.


"Does the left JSON value contain the right JSON path/value entries at 
the top level?"


Particularly given the following:

select '[1]'::jsonb = '[]'::jsonb;
  ?column?
--
  f
(1 row)

So the keys are the same, the values (when compared directly) are not, 
but @> returns True. Have I misunderstood the usage of the operator?


The above queries have been run on postgres 14, if that helps.


Have you looked at the containment examples?:

https://www.postgresql.org/docs/current/datatype-json.html#JSON-CONTAINMENT

I'm thinking this:

-- A top-level key and an empty object is contained:
SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb;

and/or this

"The general principle is that the contained object must match the 
containing object as to structure and data contents, possibly after 
discarding some non-matching array elements or object key/value pairs 
from the containing object. "


applies.



--
Brian Mendoza
br...@rotamap.net 

Rotamap
www.rotamap.net 
020 7631 1555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Brian Mendoza
Ah, yes, that seems to be the explanation!

So it would seem that indeed it was my misunderstanding of the operator.

select '{"a": [1,2]}'::jsonb @> '{"a": [1]}'::jsonb;
 ?column?
--
 t
(1 row)

select '{"a": [1,2]}'::jsonb @> '{"a": [2,1,2]}'::jsonb;
 ?column?
--
 t
(1 row)

I was not aware of "possibly after discarding some non-matching array
elements or object key/value pairs from the containing object. But remember
that the order of array elements is not significant when doing a
containment match, and duplicate array elements are effectively considered
only once." and was expecting array equality to be the comparison. Good to
know!
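
(For exact array comparison, one option is to extract the values and compare
them directly, e.g.:)

-- -> returns the value as jsonb; = on jsonb is exact, element-wise equality.
select ('{"a": [1]}'::jsonb -> 'a') = ('{"a": []}'::jsonb -> 'a');        -- f
select ('{"a": [1,2]}'::jsonb -> 'a') = ('{"a": [2,1,2]}'::jsonb -> 'a'); -- f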

Many thanks

On Thu, 18 May 2023 at 15:41, Adrian Klaver 
wrote:

> On 5/18/23 06:17, Brian Mendoza wrote:
> > Hello,
> >
> > I have encountered unanticipated behaviour with a JSONB operator, and
> > wanted to make sure I am not misunderstanding its intended use.
> >
> > When using the @> operator, I get this result.
> >
> > select '{"a": [1]}'::jsonb @> '{"a": []}'::jsonb;
> >   ?column?
> > --
> >   t
> > (1 row)
> >
> > However, given the phrasing in the documentation, I would have expected
> > False.
> >
> > "Does the left JSON value contain the right JSON path/value entries at
> > the top level?"
> >
> > Particularly given the following:
> >
> > select '[1]'::jsonb = '[]'::jsonb;
> >   ?column?
> > --
> >   f
> > (1 row)
> >
> > So the keys are the same, the values (when compared directly) are not,
> > but @> returns True. Have I misunderstood the usage of the operator?
> >
> > The above queries have been run on postgres 14, if that helps.
>
> Have you looked at the containment examples?:
>
> https://www.postgresql.org/docs/current/datatype-json.html#JSON-CONTAINMENT
>
> I'm thinking this:
>
> -- A top-level key and an empty object is contained:
> SELECT '{"foo": {"bar": "baz"}}'::jsonb @> '{"foo": {}}'::jsonb;
>
> and/or this
>
> "The general principle is that the contained object must match the
> containing object as to structure and data contents, possibly after
> discarding some non-matching array elements or object key/value pairs
> from the containing object. "
>
> applies.
>
> >
> > --
> > Brian Mendoza
> > br...@rotamap.net 
> >
> > Rotamap
> > www.rotamap.net 
> > 020 7631 1555
> > 3 Tottenham Street London W1T 2AF
> > Registered in England No. 04551928
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>

-- 
Brian Mendoza
br...@rotamap.net

Rotamap
www.rotamap.net
020 7631 1555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928


Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Tom Lane
Brian Mendoza  writes:
> Ah, yes, that seem to be the explanation!
> So it would seem that indeed it was my misunderstanding of the operator.

You seem to be reading some fairly old version of the documentation.
The extended definition that Adrian mentions has been there for
awhile, but the JSON operator table didn't link to it before v13.
(I agree that the "top level" bit was just wrong, but it's gone.)

regards, tom lane




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Stephen Frost
Greetings,

* Tony Xu (tony...@rubrik.com) wrote:
> The FAQ (copied below) mentioned that native transparent data encryption
> might be included in 16. Is it fair to assume that it will support database
> level encryption, that is, we can use two encryption keys for two databases
> in the same server, respectively? How can one verify that?

The current work to include TDE in PG isn't contemplating a per-database
key option.  What's the use-case for that?  Why do you feel that you'd
need two independent keys?  Also, the general idea currently is that
we'll have one key provided by the user which will be a KEK and then
multiple DEKs (different ones for relation data vs. temporary data vs.
the WAL).
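
(For intuition only: the KEK/DEK split is standard envelope encryption. A
sketch with pgcrypto follows; it is purely illustrative and unrelated to how
the proposed TDE is implemented internally.)

-- Illustrative only: wrap a random data-encryption key (DEK) under a
-- user-supplied key-encryption key (KEK).
CREATE EXTENSION IF NOT EXISTS pgcrypto;
SELECT pgp_sym_encrypt(encode(gen_random_bytes(32), 'hex'),
                       'user-supplied-kek') AS wrapped_dek;
-- Rotating the KEK re-wraps only the DEK; the bulk data is not re-encrypted.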

If you're interested in TDE in PG, we could certainly use more folks
being involved and working to push it forward.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Adrian Klaver

On 5/18/23 08:46, Tom Lane wrote:

Brian Mendoza  writes:

Ah, yes, that seem to be the explanation!
So it would seem that indeed it was my misunderstanding of the operator.


You seem to be reading some fairly old version of the documentation.
The extended definition that Adrian mentions has been there for
awhile, but the JSON operator table didn't link to it before v13.
(I agree that the "top level" bit was just wrong, but it's gone.)


Can you elaborate on gone and/or wrong, as I still see it in the 15 and devel 
documentation?




regards, tom lane


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Tom Lane
Adrian Klaver  writes:
> On 5/18/23 08:46, Tom Lane wrote:
>> You seem to be reading some fairly old version of the documentation.
>> The extended definition that Adrian mentions has been there for
>> awhile, but the JSON operator table didn't link to it before v13.
>> (I agree that the "top level" bit was just wrong, but it's gone.)

> Can you elaborate on gone and/or wrong as I see it in the 15 and devel 
> documentation.

In v12 (and probably earlier, didn't look) Table 9.45 defines @> as
"Does the left JSON value contain the right JSON path/value entries
at the top level?" [1].

regards, tom lane

[1] https://www.postgresql.org/docs/12/functions-json.html




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Tony Xu
Thanks for the information,  Andreas, Stephen.

Our use-case is a multi-tenancy scenario - we are considering using
different databases to store different customers' data; however, for
cost-efficiency, we want to host them in the same server (to reduce the
CPU/mem idle time and to reduce the server management effort). Now there
is a compliance-related requirement that we let our customers control
the KEK for their databases so they can rotate their KEKs independently, so
we cannot use one KEK for the whole PG server. Conceptually, different
databases are independent of each other, so doesn't it also make sense to
allow them to have completely independent encryption facilities?

Thanks,
Tony



On Thu, May 18, 2023 at 8:54 AM Stephen Frost  wrote:

> Greetings,
>
> * Tony Xu (tony...@rubrik.com) wrote:
> > The FAQ (copied below) mentioned that native transparent data encryption
> > might be included in 16. Is it fair to assume that it will support
> database
> > level encryption, that is, we can use two encryption keys for two
> databases
> > in the same server, respectively? How can one verify that?
>
> The current work to include TDE in PG isn't contemplating a per-database
> key option.  What's the use-case for that?  Why do you feel that you'd
> need two independent keys?  Also, the general idea currently is that
> we'll have one key provided by the user which will be a KEK and then
> multiple DEKs (different ones for relation data vs. temporary data vs.
> the WAL).
>
> If you're interested in TDE in PG, we could certainly use more folks
> being involved and working to push it forward.
>
> Thanks,
>
> Stephen
>


Unrecognized Node Type Warning

2023-05-18 Thread Arora, Nick
We are using PostgreSQL 12.13. We are noticing that queries that attempt to 
retrieve an element of an array by specifying its position cause a warning to 
be emitted: "WARNING:  unrecognized node type: 110".

We would appreciate your help diagnosing the issue and identifying steps to 
resolve it.

Queries that reproduce the issue:

SELECT ('{0}'::int2[])[0];
WARNING:  unrecognized node type: 110
 int2
------

(1 row)


SELECT ('0'::int2vector)[0];
WARNING:  unrecognized node type: 110
 int2vector
------------
          0
(1 row)

SELECT (indkey::int2[])[0] FROM pg_index limit 1;
WARNING:  unrecognized node type: 110
 indkey
--------
      1
(1 row)

SELECT scores[1], scores[2], scores[3], scores[4] FROM 
(select('{10,12,14,16}'::int[]) AS scores) AS round;
WARNING:  unrecognized node type: 110
WARNING:  unrecognized node type: 110
WARNING:  unrecognized node type: 110
WARNING:  unrecognized node type: 110
 scores | scores | scores | scores
--------+--------+--------+--------
     10 |     12 |     14 |     16
(1 row)



Re: JSONB operator unanticipated behaviour

2023-05-18 Thread Adrian Klaver

On 5/18/23 09:36, Tom Lane wrote:

Adrian Klaver  writes:

On 5/18/23 08:46, Tom Lane wrote:

You seem to be reading some fairly old version of the documentation.
The extended definition that Adrian mentions has been there for
awhile, but the JSON operator table didn't link to it before v13.
(I agree that the "top level" bit was just wrong, but it's gone.)



Can you elaborate on gone and/or wrong as I see it in the 15 and devel
documentation.


In v12 (and probably earlier, didn't look) Table 9.45 defines @> as
"Does the left JSON value contain the right JSON path/value entries
at the top level?" [1].


Alright I get it now.



regards, tom lane

[1] https://www.postgresql.org/docs/12/functions-json.html


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Ron

On 5/18/23 10:54, Stephen Frost wrote:

Greetings,

* Tony Xu (tony...@rubrik.com) wrote:

The FAQ (copied below) mentioned that native transparent data encryption
might be included in 16. Is it fair to assume that it will support database
level encryption, that is, we can use two encryption keys for two databases
in the same server, respectively? How can one verify that?

The current work to include TDE in PG isn't contemplating a per-database
key option.  What's the use-case for that?  Why do you feel that you'd
need two independent keys?


I don't /feel/ that key-per-database is useful; I /know/ that 
key-per-database is useful, since the databases can be different projects 
for different companies.  Each wants its own encryption key so that no one 
else can get to their at-rest data.


(pg_dump files will automatically be encrypted, right?)

--
Born in Arizona, moved to Babylonia.

Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Rob Sargent

On 5/18/23 11:49, Ron wrote:

On 5/18/23 10:54, Stephen Frost wrote:

Greetings,

* Tony Xu (tony...@rubrik.com) wrote:

The FAQ (copied below) mentioned that native transparent data encryption
might be included in 16. Is it fair to assume that it will support database
level encryption, that is, we can use two encryption keys for two databases
in the same server, respectively? How can one verify that?

The current work to include TDE in PG isn't contemplating a per-database
key option.  What's the use-case for that?  Why do you feel that you'd
need two independent keys?


I don't /feel/ that key-per-database is useful; I /know/ that 
key-per-database is useful, since the databases can be different 
projects for different companies.  Each wants its own encryption key 
so that no one else can get to their at-rest data.


(pg_dump files will automatically be encrypted, right?)

--
Born in Arizona, moved to Babylonia.
Ron, this sounds like a revenue opportunity:  "Oh you want your own key, 
well then we'll have to spin up another server just for you so you're 
all separate and special-like.  Way more secure that way."


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Thorsten Glaser
On Thu, 18 May 2023, Tony Xu wrote:

>Our use-case is for a multi-tenancy scenario - we are considering using
>different databases to store different customer's data, however, for

Why not using multiple clusters then?

Better isolation of the customers, but still on one server.

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)




Re: Unrecognized Node Type Warning

2023-05-18 Thread Tom Lane
"Arora, Nick"  writes:
> We are using PostgreSQL 12.13. We are noticing that queries that attempt to 
> retrieve an element of an array by specifying its position cause a warning to 
> be emitted: "WARNING:  unrecognized node type: 110".

I don't see that here, so I'm guessing it's coming from some extension.
What extensions do you have loaded?

Node type 110 would be T_SubscriptingRef in v12, which is a type name
that didn't exist in earlier versions (it used to be called ArrayRef),
so it's not very hard to believe that some extension missed out
support for that type.  But the only core-PG suspect is
pg_stat_statements, and I can see that it does know that node type.

regards, tom lane




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Tom Lane
Thorsten Glaser  writes:
> On Thu, 18 May 2023, Tony Xu wrote:
>> Our use-case is for a multi-tenancy scenario - we are considering using
>> different databases to store different customer's data, however, for

> Why not using multiple clusters then?

Yeah.  The problem with key-per-database is what are you going to do
with the shared catalogs?  It's a lot simpler, and probably better
performing, to use one key per cluster.

regards, tom lane




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Ron

On 5/18/23 13:02, Thorsten Glaser wrote:

On Thu, 18 May 2023, Tony Xu wrote:


Our use-case is for a multi-tenancy scenario - we are considering using
different databases to store different customer's data, however, for

Why not using multiple clusters then?


Yet More Firewall Rules to get approved by the Security Team.  And then they 
balk at port 5433 because they've never heard of it.


And from a technical point of view, one Postgresql system can better manage 
the memory on a VM than two which don't know about each other.


--
Born in Arizona, moved to Babylonia.




Re: Unrecognized Node Type Warning

2023-05-18 Thread Arora, Nick
Hello Tom,

Thanks for the information.  Here are the extensions we are using:

uuid-ossp
pgcrypto
citext
btree_gin

The warnings did start appearing shortly after the installation of btree_gin, 
so it seems somewhat suspect.


From: Tom Lane 
Date: Thursday, May 18, 2023 at 11:30 AM
To: Arora, Nick 
Cc: pgsql-general@lists.postgresql.org 
Subject: Re: Unrecognized Node Type Warning
EXT - t...@sss.pgh.pa.us

"Arora, Nick"  writes:
> We are using PostgreSQL 12.13. We are noticing that queries that attempt to 
> retrieve an element of an array by specifying its position cause a warning to 
> be emitted: "WARNING:  unrecognized node type: 110".

I don't see that here, so I'm guessing it's coming from some extension.
What extensions do you have loaded?

Node type 110 would be T_SubscriptingRef in v12, which is a type name
that didn't exist in earlier versions (it used to be called ArrayRef),
so it's not very hard to believe that some extension missed out
support for that type.  But the only core-PG suspect is
pg_stat_statements, and I can see that it does know that node type.

regards, tom lane



Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Ron

On 5/18/23 12:54, Rob Sargent wrote:

On 5/18/23 11:49, Ron wrote:

On 5/18/23 10:54, Stephen Frost wrote:

Greetings,

* Tony Xu (tony...@rubrik.com) wrote:

The FAQ (copied below) mentioned that native transparent data encryption
might be included in 16. Is it fair to assume that it will support database
level encryption, that is, we can use two encryption keys for two databases
in the same server, respectively? How can one verify that?

The current work to include TDE in PG isn't contemplating a per-database
key option.  What's the use-case for that?  Why do you feel that you'd
need two independent keys?


I don't /feel/ that key-per-database is useful; I /know/ that 
key-per-database is useful, since the databases can be different projects 
for different companies.  Each wants its own encryption key so that no 
one else can get to their at-rest data.


(pg_dump files will automatically be encrypted, right?)

--
Born in Arizona, moved to Babylonia.
Ron, this sounds like a revenue opportunity:  "Oh you want your own key, 
well then we'll have to spin up another server just for you so you're all 
separate and special-like.  Way more secure that way."


We need to keep costs down, too.

Oracle (I think) does it at the DB level, and so does SQL Server. Upper 
Management hears us say "sorry, no can do" and wonders what bunch of 
amateurs are developing PostgreSQL.


--
Born in Arizona, moved to Babylonia.

Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Thorsten Glaser
On Thu, 18 May 2023, Ron wrote:

>> Why not using multiple clusters then?
>
> Yet More Firewall Rules to get approved by the Security Team.  And then they
> balk at port 5433 because they've never heard of it.

But mixing multiple customers on one cluster is much more of a risk.

> And from a technical point of view, one Postgresql system can better manage 
> the
> memory on a VM than two which don't know about each other.

Probably true. Is there something with which multiple clusters running
on the same server can communicate to do that better?

bye,
//mirabilos
-- 
15:41⎜ Somebody write a testsuite for helloworld :-)




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Adrian Klaver

On 5/18/23 11:56, Ron wrote:

On 5/18/23 12:54, Rob Sargent wrote:

On 5/18/23 11:49, Ron wrote:



We need to keep costs down, too.

Oracle (I think) does it at the DB level, and so does SQL Server. Upper 
Management hears us say "sorry, no can do" and wonders what bunch of 
amateurs are developing PostgreSQL.


Looks like you will be migrating to Oracle or SQL Server.

Good luck on keeping costs down.



--
Born in Arizona, moved to Babylonia.


--
Adrian Klaver
adrian.kla...@aklaver.com





Re: Unrecognized Node Type Warning

2023-05-18 Thread Arora, Nick
To provide more complete information:

Here is the name and version of each extension we have installed:
azure   (1.0)
btree_gin (1.3)
citext   (1.6)
pgcrypto  (1.3)
plpgsql  (1.0)
uuid-ossp (1.1)
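
(For reference, a list like this can be produced with:)

-- Installed extensions and their versions:
SELECT extname, extversion FROM pg_extension ORDER BY extname;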



Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Ron

On 5/18/23 14:07, Thorsten Glaser wrote:

On Thu, 18 May 2023, Ron wrote:


Why not using multiple clusters then?

Yet More Firewall Rules to get approved by the Security Team.  And then they
balk at port 5433 because they've never heard of it.

But mixing multiple customers on one cluster is much more of a risk.


Better have your roles and pg_hba.conf entries defined correctly!

Fortunately, pg_hba.conf is pretty easy.


And from a technical point of view, one Postgresql system can better manage the
memory on a VM than two which don't know about each other.

Probably true. Is there something with which multiple clusters running
on the same server can communicate to do that better?


A /meta/ postmaster!!!

--
Born in Arizona, moved to Babylonia.

PostgreSQL 13 - Logical Replication - ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected

2023-05-18 Thread FOUTE K . Jaurès
Hello everyone,

I have replication from PostgreSQL 12 to 13 in the production system,
using Ubuntu 18.04 LTS.
We got this error today.



2023-05-18 18:34:04.374 WAT [117322] ERROR:  could not receive data from
WAL stream: SSL SYSCALL error: EOF detected
2023-05-18 18:34:04.381 WAT [118393] LOG:  logical replication apply worker
for subscription "inov_transactionnal_table_bertoua_sub_00" has started
2023-05-18 18:34:04.423 WAT [1039] LOG:  background worker "logical
replication worker" (PID 117322) exited with exit code 1
2023-05-18 18:47:51.485 WAT [66836] postgres@inov_henrietfreres_v71_00 LOG:
could not receive data from client: Connection timed out

Any idea how to solve this?
-- 
Jaurès FOUTE


Re: Unrecognized Node Type Warning

2023-05-18 Thread Tom Lane
"Arora, Nick"  writes:
> Here is the name and version of each extension we have installed:
> azure   (1.0)
> btree_gin (1.3)
> citext   (1.6)
> pgcrypto  (1.3)
> plpgsql  (1.0)
> uuid-ossp (1.1)

I'm quite certain that none of the last five are causing this,
so you need to take it up with whoever provides "azure".

regards, tom lane




Re: Is there a good way to handle sum types (or tagged unions) in PostgreSQL?

2023-05-18 Thread Victor Nordam Suadicani
A composite type is a *product type*, not a sum type. PostgreSQL currently
has great support for product types, but basically no support for sum types.
From the perspective of algebraic data types, this feels like a "missing
link" in the type system. I'm not sure why SQL or the underlying relational
model has never addressed this deficiency. Would greatly appreciate any
insight anyone may have.

On Thu, 18 May 2023 at 16:35, Adrian Klaver 
wrote:

> On 5/18/23 05:27, Victor Nordam Suadicani wrote:
> > Hi,
> >
> > Is there any nice way to handle sum types (aka tagged unions) in a
> > PostgreSQL database? I've searched far and wide and have not reached any
> > satisfying answer.
> >
> > As a (somewhat contrived) example, say I have the following enum in Rust:
> >
> > enum TaggedUnion {
> >  Variant1(String),
> >  Variant2(i32),
> >  Variant3(f64),
> > }
> >
>
> > If there are no good methods of handling this, is there any way
> > PostgreSQL could be extended with capabilities for this? I have no idea
> > how this would be done in practice though. Perhaps SQL itself is just
> > unsuited for data of this kind? I don't really see why it should be
> though.
>
> Composite type?:
>
> https://www.postgresql.org/docs/current/rowtypes.html
> >
> > Thanks,
> > Victor Nordam Suadicani
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Bruce Momjian
On Thu, May 18, 2023 at 01:56:48PM -0500, Ron wrote:
> We need to keep costs down, too.
> 
> Oracle (I think) does it at the DB level, and so does SQL Server.  Upper
> Management hears us say "sorry, no can do" and wonders what bunch of amateurs
> are developing PostgreSQL.

I have found it is hard to protect against the judgment of the ignorant,
so I usually don't bother.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Only you can decide what is important to you.




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Ron

On 5/18/23 15:56, Bruce Momjian wrote:

On Thu, May 18, 2023 at 01:56:48PM -0500, Ron wrote:

We need to keep costs down, too.

Oracle (I think) does it at the DB level, and so does SQL Server.  Upper
Management hears us say "sorry, no can do" and wonders what bunch of amateurs
are developing PostgreSQL.

I have found it is hard to protect against the judgment of the ignorant,
so I usually don't bother.


It's perfectly valid to ask "why can't they do X when our previous vendor 
could?"


I, of course, know that it's because WAL records are global.  "They" then 
quite reasonably ask why the developers haven't fixed that yet.  The answer 
(I think) is that "global WAL" is deeply embedded in the product and would 
require a major rewrite; adding new features is a higher priority.


--
Born in Arizona, moved to Babylonia.




Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Stephen Frost
Greetings,

Please don't top-post on these lists.

* Tony Xu (tony...@rubrik.com) wrote:
> Our use-case is for a multi-tenancy scenario - we are considering using
> different databases to store different customer's data, however, for
> cost-efficiency, we want to host them in the same server (to reduce the
> CPU/mem idle time and to reduce the server management efforts). Now there
> is a compliance related feature that we need to let our customer control
> the KEK for their databases so they can rotate their KEKs independently, so
> we cannot use one KEK for the whole PG server. Conceptually, different
> databases are independent of each other, it also makes sense to allow them
> to have completely independent encryption facilities?

This really isn't currently in the plans and while it might be something
added later, as pointed out farther down on this thread, it wouldn't be
possible for the shared catalogs or the WAL to have separate keys for
those things which are relevant to a database, so it's not like each
tenant would actually have control over the key for "all" of their data
(consider that roles are stored in a shared PG catalog and then shared
among databases...).

To meet this compliance requirement, you'd certainly be much more able
to blanket claim that everything is independent by having a separate PG
instance for each client.  This would also allow rather useful things
like being able to do a file-based restore on a per-client basis in the
event something happens, rather than having to roll back an entire
cluster to some point in time just because one client did something
bad.  You'd also be able to scale the number of systems supporting a
given client independently.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Tony Xu
Thanks all for the discussions. New to PostgreSQL so don't have much
context here.

Regarding the multiple-clusters idea, how does that work? Assuming we can
store one customer's data in one cluster, is it possible to have a separate
KEK for each cluster?

> Why not using multiple clusters then?
>
> Better isolation of the customers, but still on one server.


On Thu, May 18, 2023 at 3:53 PM Stephen Frost  wrote:

> Greetings,
>
> Please don't top-post on these lists.
>
> * Tony Xu (tony...@rubrik.com) wrote:
> > Our use-case is for a multi-tenancy scenario - we are considering using
> > different databases to store different customer's data, however, for
> > cost-efficiency, we want to host them in the same server (to reduce the
> > CPU/mem idle time and to reduce the server management efforts). Now there
> > is a compliance related feature that we need to let our customer control
> > the KEK for their databases so they can rotate their KEKs independently,
> so
> > we cannot use one KEK for the whole PG server. Conceptually, different
> > databases are independent of each other, it also makes sense to allow
> them
> > to have completely independent encryption facilities?
>
> This really isn't currently in the plans and while it might be something
> added later, as pointed out farther down on this thread, it wouldn't be
> possible for the shared catalogs or the WAL to have separate keys for
> those things which are relevant to a database, so it's not like each
> tenant would actually have control over the key for "all" of their data
> (consider that roles are stored in a shared PG catalog and then shared
> among databases...).
>
> To meet this compliance requirement, you'd certainly be much more able
> to blanket claim that everything is independent by having a separate PG
> instance for each client.  This would also allow rather useful things
> like being able to do a file-based restore on a per-client basis in the
> event something happens, rather than having to roll back an entire
> cluster to some point in time just because one client did something
> bad..  You'd also be able to scale the number of systems supporting a
> given client independently.
>
> Thanks,
>
> Stephen
>


Re: Would PostgreSQL 16 native transparent data encryption support database level encryption?

2023-05-18 Thread Stephen Frost
Greetings,

Really, please don't top-post on these lists.

* Tony Xu (tony...@rubrik.com) wrote:
> Regarding the multiple clusters idea, how does that work? Assume we can
> store one customer's data in one cluster, is it possible to have separate
> KEK for different clusters?

In the proposed TDE work, yes, each cluster (which is an entire
PostgreSQL system) would be able to have its own KEK.

> Why not using multiple clusters then?

There's a bit of overhead from each cluster, and each would have its
own shared buffers pool of memory and such.

> Better isolation of the customers, but still on one server.

Depending on the OS, multi-cluster management on a given system is
easier or harder.  In my view, at least, Debian systems make having
multiple clusters on a given server a lot easier as they have
pg_createcluster, pg_lsclusters, etc, commands and management tools.

Another alternative would be to use container technology and Kubernetes
or OpenShift and a PG Operator to manage all the clusters across
whatever systems you're running on top of.

Of course, there are trade-offs to consider between all of these
different approaches.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: PostgreSQL 13 - Logical Replication - ERROR: could not receive data from WAL stream: SSL SYSCALL error: EOF detected

2023-05-18 Thread Kyotaro Horiguchi
At Thu, 18 May 2023 21:00:08 +0100, FOUTE K. Jaurès  
wrote in 
> Hello everyone,
> 
> I have a replication between PostgreSQL 12 to 13 in the production system.
> Using Ubuntu 18.04 LTS
> We have this error today.
> 
> 
> 
> *2023-05-18 18:34:04.374 WAT [117322] ERROR:  could not receive data from
> WAL stream: SSL SYSCALL error: EOF detected
> 2023-05-18 18:34:04.381 WAT [118393] LOG:  logical replication apply worker 
> for subscription "inov_transactionnal_table_bertoua_sub_00" has started
> 2023-05-18 18:34:04.423 WAT [1039] LOG:  background worker "logical 
> replication worker" (PID 117322) exited with exit code 1
> 2023-05-18 18:47:51.485 WAT [66836] postgres@inov_henrietfreres_v71_00 LOG:  
> could not receive data from client: Connection timed out.*
> 
> *Any idea how to solve this?*

According to the message, the SSL-encrypted replication connection got
disconnected unexpectedly. I suppose it is due to the death of the
upstream server or some reason outside of PostgreSQL. It seems like
the issue persisted for a few minutes after that. Other than the
server's death, I suspect network hardware problems or changes to the
firewall or networking setup of the OS. I think it would be a good
idea to check for those first.
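
(On the PostgreSQL 13 subscriber, the state of the apply worker can also be
inspected, e.g.:)

-- Shows the apply worker PID, last received LSN, and message timestamps:
SELECT subname, pid, received_lsn, last_msg_receipt_time
FROM pg_stat_subscription;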

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




a simple-minded question about updating

2023-05-18 Thread Martin Mueller
I work with Postgres and wonder whether for my purposes there is a good-enough 
reason to update one of these days.

I’m an editor working with some 60,000 Early Modern texts, many of them in need 
of some editorial attention. The texts are XML-encoded documents. Each word is 
wrapped in a <w> element with attributes for various linguistic metadata. 
Typically a type of error occurs several or many times, and at the margins they 
need individual attention. I use Python scripts to extract stuff from the main 
corpus—sometimes dozens, sometimes thousands or millions—turn them into 
keyword-in-context lists, and import them into Postgres. I basically use 
Postgres as a giant spreadsheet.  Its excellent string-handling routines make 
it relatively easy to perform search and sort operations that identify tokens 
in need of correction. Once the corrections are made in Postgres (typically as 
batch updates), I move them as a data frame into Python, and from Python I 
move them back into the texts.
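
(A sketch of the kind of query this workflow relies on; the kwic table and
its columns are hypothetical names, not part of the original message:)

-- Find keyword-in-context rows whose token matches a suspect spelling,
-- e.g. 'vv' that often needs correcting to 'w' in Early Modern print.
SELECT token, left_context, right_context
FROM kwic
WHERE token ~ 'vv'
ORDER BY token, left_context;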

I do this on a recent Mac with 64 GB of memory and a 6-core i7 processor.  I 
use Data Studio as an editing interface.

Unless a more recent version of Postgres has additional string-handling 
routines, or indexing routines that speed up working with tables with rows in 
the low millions, or other features that are likely to speed up operations, I 
don’t see any reason to update.

I could imagine a table that has up to 40 million rows.  That would be pretty 
sluggish on my current equipment, which handles up to 10 million rows quite 
comfortably.

Am I right in thinking that, given my tasks and equipment, it would be a waste 
of time to update? Or is there something I’m missing?

Martin Mueller
Professor emeritus of English and Classics
Northwestern University


Re: a simple-minded question about updating

2023-05-18 Thread Adrian Klaver

On 5/18/23 21:08, Martin Mueller wrote:
I work with Postgres and wonder whether for my purposes there is a 
good-enough reason to update one of these days.


Since you have not mentioned the Postgres version you are on now, there 
is really no definitive way to answer this.


Though as a rule, keeping up with minor updates to whatever major version 
you are running is a good idea. It should be noted that at some point, ~5 
years from a version's initial release, the minor updates will stop. That is 
when community support ends. Then the answer to any problem you have 
will start with: you should upgrade.




Martin Mueller

Professor emeritus of English and Classics

Northwestern University



--
Adrian Klaver
adrian.kla...@aklaver.com





Records, Types, and Arrays

2023-05-18 Thread Raymond Brinzer
Greetings, all.

It's been a down-the-rabbit-hole day for me.  It all started out with a
simple problem.  I have defined a composite type.  There are functions
which return arrays whose values would be suitable to the type I defined.
How do I turn arrays into composite typed values?

Conceptually, this is straightforward.  Any given array can be mapped to a
corresponding record with the same elements, so this expression would make
sense:

ARRAY[1,2,3]::RECORD

If the result happens to be a valid instance of my_type, you might say:

ARRAY[1,2,3]::RECORD::my_type

Or, ideally, just:

ARRAY[1,2,3]::my_type

It seems to be a rather long way from the idea to the implementation,
however.  A helpful soul from the IRC channel did manage to make this
happen in a single expression:

(format('(%s)', array_to_string(the_array, ','))::my_type).*

While I'm happy to have it, that's ugly even by SQL's syntactic yardstick.
So, I figured I'd see about hiding it behind a function and a custom cast.
These efforts have not been successful, for reasons I'll probably share in
a subsequent email, as the details would distract from the point of this
one.

Getting to that point... we have these three kinds of things:

* Arrays
* Composite Values / Records
* Typed Composite Values (instances of composite types)

(Note on the second:  while section 8.16.2 of the documentation talks about
constructing "composite values", pg_typeof() reports these to be of the
"record" pseudo-type.  To (hopefully) avoid confusion, I'm going to
exclusively say "record" here.)

Here's the thing about these:  in the abstract, they're mostly the same.  A
record is simply an ordered multiset.  If you ignore implementation,
syntax, and whatnot, you could say that arrays are the subset of records
where all the members are of the same type.  Objects of composite type can
be considered records with an additional feature:  each member has a name.

It seems to me, then, that:

1) Switching between these things should be dead easy; and
2) One should be able to treat them as similarly as their actual
differences allow.

On the first point (speaking of arrays and composite types generically),
there are six possible casts.  One of these already works, when members are
compatible:

record::composite_type

(Mostly, anyway; I did run into a kink with it, which I'll explain when I
discuss what I've tried.)

These casts would always be valid:

array::record
composite_type::record

These would be valid where the member sets are compatible:

array::composite_type
record::array
composite_type::array

It seems like having all six casts available would be very handy.  But
(here's point 2) to the extent that you don't have to bother switching
between them at all, so much the better.  For instance:

(ARRAY[5,6,7])[1]
(ROW(5,6,7))[1]
(ROW(5,6,7)::my_type)[1]

all make perfect sense.  It would be lovely to be able to treat these types
interchangeably where appropriate.  It seems to me (having failed to
imagine a counterexample) that any operation you could apply to an array
should be applicable to a record, and any operation you could apply to a
record should be applicable to an instance of a composite type.

While the second point is rather far-reaching and idealistic, the first
seems well-defined and reasonably easy.

If you've taken the time to read all this, thank you.  If you take the idea
seriously, or have practical suggestions, thank you even more.  If you
correct me on something important... well, I owe much of what I know to
people like you, so please accept my deepest gratitude.

--
Yours,

Ray Brinzer


Re: Records, Types, and Arrays

2023-05-18 Thread David G. Johnston
On Thu, May 18, 2023 at 10:06 PM Raymond Brinzer 
wrote:

> How do I turn arrays into composite typed values?
>

Using just SQL syntax and no string munging:

(array_val[1]::col1_type, array_val[2]::col2_type)::composite_type


> While the second point is rather far-reaching and idealistic, the first
> seems well-defined and reasonably easy.
>

> If you've taken the time to read all this, thank you.  If you take the
> idea seriously, or have practical suggestions, thank you even more.  If you
> correct me on something important... well, I owe much of what I know to
> people like you, so please accept my deepest gratitude.
>
None of what you are saying is likely to ever see the light of day.  If
you want to learn the SQL-way it might be easier to just forget about your
idealized equivalency between composite types and array containers.

ARRAY[...] is a constructor, its output is an array container.  You can
either type the elements within the constructor or leave them untyped and
put a syntactical-sugar cast on the result.

ARRAY['2023-01-01'::date,'2023-02-01'::date]
ARRAY['2023-01-01','2023-02-01']::date[]

While you've said that having all this stuff would "be quite handy", that
isn't obvious to me. It is even less obvious that any benefit would
outweigh the effort needed to make all of this actually work.

Even if I could write: composite_type[1] instead of composite_type.column1
I don't know why I'd want to give up the expressiveness of writing the
column name.

David J.


Re: Records, Types, and Arrays

2023-05-18 Thread Raymond Brinzer
On Fri, May 19, 2023 at 1:42 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Thu, May 18, 2023 at 10:06 PM Raymond Brinzer 
> wrote:
>
>> How do I turn arrays into composite typed values?
>>
>
> Using just SQL syntax and no string munging:
>
> (array_val[1]::col1_type, array_val[2]::col2_type)::composite_type
>

Assuming one wants to write a specific solution, rather than a general one,
sure.  And when you want to deal with an unnamed array returned from a
function?  Well, you can throw *that* in a CTE to give it a name, or
perform some other such contortion.  The aggregate load of having to phrase
such simple ideas in complicated ways really isn't good.
>
>
> None of what you are saying is likely to ever see the light of day.  If
> you want to learn the SQL-way might be easier to just forget about your
> idealized equivalency between composite types and array containers.
>

The problem with "easier" is that addressing what is directly in front of you
is always easier in the immediate sense than actually attacking the problem
itself.  It also dooms you to the (after)life of Sisyphus, always rolling
the same rock up the same hill.


> ARRAY[...] is a constructor, its output is an array container.  You can
> either type the elements within the constructor or leave them untyped and
> put a syntactical-sugar cast on the result.
>
> ARRAY['2023-01-01'::date,'2023-02-01'::date]
> ARRAY['2023-01-01','2023-02-01']::date[]
>
> While you've said that having all this stuff would "be quite handy" that
> isn't obvious to me.  It is even less obvious that any benefit would likely
> be small compared to the effort to make all of this actually work.
>

Well, making one small part of it work would be a boon to me.  Is a simple,
generic cast from an array to a record really rocket science?  I can't
imagine why that would be.


> Even if I could write: composite_type[1] instead of composite_type.column1
> I don't know why I'd want to give up the expressiveness of writing the
> column name.
>

Naturally, you wouldn't give up the ability to do that.  You'd merely gain
the ability to do it another way.

-- 
Ray Brinzer


Re: pg_stats.avg_width

2023-05-18 Thread Maciek Sakrejda
Thanks, that makes sense. It was going to be my third guess, but it
seemed pretty wide for a TOAST pointer. Reviewing what goes in there,
though, it's reasonable.

I assume that this means for unTOASTed but compressed data, this
counts the compressed size.

Would a doc patch clarifying this (and possibly linking to the
relevant TOAST docs [1]) be welcome? The current wording is pretty
vague. Something like

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 7c09ab3000..2814ac8007 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7466,7 +7466,9 @@
stawidth int4
   
   
-   The average stored width, in bytes, of nonnull entries
+   The average stored width, in bytes, of nonnull entries. For compressed
+   entries, counts the compressed size; for TOASTed data, the size of the
+   TOAST pointer (see TOAST).
   
  

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index bb1a418450..62184fe32b 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -3680,7 +3680,9 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
avg_width int4
   
   
-   Average width in bytes of column's entries
+   Average width in bytes of column's entries. For compressed entries,
+   counts the compressed size; for TOASTed data, the size of the TOAST
+   pointer (see TOAST).
   
  

(not sure if this should be an <xref> or a <link>).
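
(For reference, the value being documented can be inspected with a query
like the following; the table name is hypothetical:)

-- Per-column average stored width, as computed by ANALYZE:
SELECT tablename, attname, avg_width
FROM pg_stats
WHERE tablename = 'my_table';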

Thanks,
Maciek

[1]: https://www.postgresql.org/docs/current/storage-toast.html




Re: Records, Types, and Arrays

2023-05-18 Thread Raymond Brinzer
On a problem which came up while trying to implement a solution, perhaps
someone could explain this:

scratch=# create type test_type as (a int, b int);
CREATE TYPE
scratch=# create function get_row() returns record as $$ select row(2,3);
$$ language sql;
CREATE FUNCTION
scratch=# select get_row();
 get_row
---------
 (2,3)
(1 row)

scratch=# select pg_typeof( get_row() );
 pg_typeof
-----------
 record
(1 row)

scratch=# select pg_typeof( row(2,3) );
 pg_typeof
-----------
 record
(1 row)

scratch=# select row(2,3)::test_type;
  row
-------
 (2,3)
(1 row)

scratch=# select get_row()::test_type;
ERROR:  cannot cast type record to test_type
LINE 1: select get_row()::test_type;

If row(2,3) and get_row() are both of type record, and the records have the
same values, why can one be cast to test_type, and the other not?
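
(For what it's worth, declaring the concrete return type up front sidesteps
the record cast entirely; a sketch:)

-- Returns test_type directly, so no record-to-composite cast is needed.
create function get_typed_row() returns test_type as $$
    select row(2,3)::test_type;
$$ language sql;

select get_typed_row();             -- (2,3)
select pg_typeof(get_typed_row());  -- test_type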

On Fri, May 19, 2023 at 1:07 AM Raymond Brinzer 
wrote:

> Greetings, all.
>
> It's been a down-the-rabbit-hole day for me.  It all started out with a
> simple problem.  I have defined a composite type.  There are functions
> which return arrays whose values would be suitable to the type I defined.
> How do I turn arrays into composite typed values?
>
> Conceptually, this is straightforward.  Any given array can be mapped to a
> corresponding record with the same elements, so this expression would make
> sense:
>
> ARRAY[1,2,3]::RECORD
>
> If the result happens to be a valid instance of my_type, you might say:
>
> ARRAY[1,2,3]::RECORD::my_type
>
> Or, ideally, just:
>
> ARRAY[1,2,3]::my_type
>
> It seems to be a rather long way from the idea to the implementation,
> however.  A helpful soul from the IRC channel did manage to make this
> happen in a single expression:
>
> (format('(%s)', array_to_string(the_array, ','))::my_type).*
>
> While I'm happy to have it, that's ugly even by SQL's syntactic
> yardstick.  So, I figured I'd see about hiding it behind a function and a
> custom cast.  These efforts have not been successful, for reasons I'll
> probably share in a subsequent email, as the details would distract from
> the point of this one.
>
> Getting to that point... we have these three kinds of things:
>
> * Arrays
> * Composite Values / Records
> * Typed Composite Values (instances of composite types)
>
> (Note on the second:  while section 8.16.2 of the documentation talks
> about constructing "composite values", pg_typeof() reports these to be of
> the "record" pseudo-type.  To (hopefully) avoid confusion, I'm going to
> exclusively say "record" here.)
>
> Here's the thing about these:  in the abstract, they're mostly the same.
> A record is simply an ordered multiset.  If you ignore implementation,
> syntax, and whatnot, you could say that arrays are the subset of records
> where all the members are of the same type.  Objects of composite type can
> be considered records with an additional feature:  each member has a name.
>
> It seems to me, then, that:
>
> 1) Switching between these things should be dead easy; and
> 2) One should be able to treat them as similarly as their actual
> differences allow.
>
> On the first point (speaking of arrays and composite types generically),
> there are six possible casts.  One of these already works, when members are
> compatible:
>
> record::composite_type
>
> (Mostly, anyway; I did run into a kink with it, which I'll explain when I
> discuss what I've tried.)
>
> These casts would always be valid:
>
> array::record
> composite_type::record
>
> These would be valid where the member sets are compatible:
>
> array::composite_type
> record::array
> composite_type::array
>
> It seems like having all six casts available would be very handy.  But
> (here's point 2) to the extent that you don't have to bother switching
> between them at all, so much the better.  For instance:
>
> (ARRAY[5,6,7])[1]
> (ROW(5,6,7))[1]
> (ROW(5,6,7)::my_type)[1]
>
> all make perfect sense.  It would be lovely to be able to treat these
> types interchangeably where appropriate.  It seems to me (having failed to
> imagine a counterexample) that any operation you could apply to an array
> should be applicable to a record, and any operation you could apply to a
> record should be applicable to an instance of a composite type.
>
> While the second point is rather far-reaching and idealistic, the first
> seems well-defined and reasonably easy.
>
> If you've taken the time to read all this, thank you.  If you take the
> idea seriously, or have practical suggestions, thank you even more.  If you
> correct me on something important... well, I owe much of what I know to
> people like you, so please accept my deepest gratitude.
>
> --
> Yours,
>
> Ray Brinzer
>


-- 
Ray Brinzer


Re: Records, Types, and Arrays

2023-05-18 Thread David G. Johnston
On Thursday, May 18, 2023, Raymond Brinzer  wrote:

>
> scratch=# select row(2,3)::test_type;
>

Unknown typed value, immediately converted to a known concrete instance of
test_type. It is never actually resolved as record.

All of the others must concretely be resolved to record to escape their
query level, and if you then try to cast the concrete record to some other
concrete type a cast needs to exist.

David J.


Re: Records, Types, and Arrays

2023-05-18 Thread Raymond Brinzer
Sorry, I should have noted this as well:

"One should also realize that when a PL/pgSQL function is declared to
return type record, this is not quite the same concept as a record
variable, even though such a function might use a record variable to hold
its result. In both cases the actual row structure is unknown when the
function is written, but for a function returning record the actual
structure is determined when the calling query is parsed, whereas a record
variable can change its row structure on-the-fly."

I'm guessing that row() isn't really a function, then?  And even so,
assuming this is the important difference, how is the ability to change row
structure on the fly making the cast possible?  In what way would the query
calling get_row() be critical?

On Fri, May 19, 2023 at 2:48 AM Raymond Brinzer 
wrote:

> On a problem which came up while trying to implement a solution, perhaps
> someone could explain this:
>
> scratch=# create type test_type as (a int, b int);
> CREATE TYPE
> scratch=# create function get_row() returns record as $$ select row(2,3);
> $$ language sql;
> CREATE FUNCTION
> scratch=# select get_row();
>  get_row
> -
>  (2,3)
> (1 row)
>
> scratch=# select pg_typeof( get_row() );
>  pg_typeof
> ---
>  record
> (1 row)
>
> scratch=# select pg_typeof( row(2,3) );
>  pg_typeof
> ---
>  record
> (1 row)
>
> scratch=# select row(2,3)::test_type;
>   row
> ---
>  (2,3)
> (1 row)
>
> scratch=# select get_row()::test_type;
> ERROR:  cannot cast type record to test_type
> LINE 1: select get_row()::test_type;
>
> If row(2,3) and get_row() are both of type record, and the records have
> the same values, why can one be cast to test_type, and the other not?
>
> On Fri, May 19, 2023 at 1:07 AM Raymond Brinzer 
> wrote:
>
>> Greetings, all.
>>
>> It's been a down-the-rabbit-hole day for me.  It all started out with a
>> simple problem.  I have defined a composite type.  There are functions
>> which return arrays whose values would be suitable to the type I defined.
>> How do I turn arrays into composite typed values?
>>
>> Conceptually, this is straightforward.  Any given array can be mapped to
>> a corresponding record with the same elements, so this expression would
>> make sense:
>>
>> ARRAY[1,2,3]::RECORD
>>
>> If the result happens to be a valid instance of my_type, you might say:
>>
>> ARRAY[1,2,3]::RECORD::my_type
>>
>> Or, ideally, just:
>>
>> ARRAY[1,2,3]::my_type
>>
>> It seems to be a rather long way from the idea to the implementation,
>> however.  A helpful soul from the IRC channel did manage to make this
>> happen in a single expression:
>>
>> (format('(%s)', array_to_string(the_array, ','))::my_type).*
>>
>> While I'm happy to have it, that's ugly even by SQL's syntactic
>> yardstick.  So, I figured I'd see about hiding it behind a function and a
>> custom cast.  These efforts have not been successful, for reasons I'll
>> probably share in a subsequent email, as the details would distract from
>> the point of this one.
>>
>> Getting to that point... we have these three kinds of things:
>>
>> * Arrays
>> * Composite Values / Records
>> * Typed Composite Values (instances of composite types)
>>
>> (Note on the second:  while section 8.16.2 of the documentation talks
>> about constructing "composite values", pg_typeof() reports these to be of
>> the "record" pseudo-type.  To (hopefully) avoid confusion, I'm going to
>> exclusively say "record" here.)
>>
>> Here's the thing about these:  in the abstract, they're mostly the same.
>> A record is simply an ordered multiset.  If you ignore implementation,
>> syntax, and whatnot, you could say that arrays are the subset of records
>> where all the members are of the same type.  Objects of composite type can
>> be considered records with an additional feature:  each member has a name.
>>
>> It seems to me, then, that:
>>
>> 1) Switching between these things should be dead easy; and
>> 2) One should be able to treat them as similarly as their actual
>> differences allow.
>>
>> On the first point (speaking of arrays and composite types generically),
>> there are six possible casts.  One of these already works, when members are
>> compatible:
>>
>> record::composite_type
>>
>> (Mostly, anyway; I did run into a kink with it, which I'll explain when I
>> discuss what I've tried.)
>>
>> These casts would always be valid:
>>
>> array::record
>> composite_type::record
>>
>> These would be valid where the member sets are compatible:
>>
>> array::composite_type
>> record::array
>> composite_type::array
>>
>> It seems like having all six casts available would be very handy.  But
>> (here's point 2) to the extent that you don't have to bother switching
>> between them at all, so much the better.  For instance:
>>
>> (ARRAY[5,6,7])[1]
>> (ROW(5,6,7))[1]
>> (ROW(5,6,7)::my_type)[1]
>>
>> all make perfect sense.  It would be lovely to be able to treat these
>> types interchangeably where appropriate.  It seems to me (having fa