TDE implementation in Postgres running in a Docker container

2020-07-25 Thread Vasu Madhineni
Hi All,

How can I implement TDE in Postgres running in Docker containers?

Thanks in advance.

Regards,
Vasu Madhineni


Re: is JIT available

2020-07-25 Thread Scott Ribe
> On Jul 24, 2020, at 9:55 PM, Pavel Stehule  wrote:
> 
> SELECT * FROM pg_config;

That doesn't tell me whether or not it can actually be used.





Re: is JIT available

2020-07-25 Thread Pavel Stehule
On Sat, 25 Jul 2020 at 14:04, Scott Ribe wrote:

> > On Jul 24, 2020, at 9:55 PM, Pavel Stehule 
> wrote:
> >
> > SELECT * FROM pg_config;
>
> That doesn't tell me whether or not it can actually be used.
>

It shows if Postgres was compiled with JIT support.

When you run EXPLAIN ANALYZE SELECT ..., you can see info about the JIT
overhead. If you don't see a JIT section in the EXPLAIN output, then JIT was
not used.
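For illustration, a minimal sketch (the table name here is hypothetical, and
the timing numbers are only illustrative of the output shape):

  SET jit = on;
  EXPLAIN (ANALYZE) SELECT sum(val) FROM big_table;
  -- when JIT was used, the plan output ends with a section like:
  --   JIT:
  --     Functions: 4
  --     Options: Inlining false, Optimization false, Expressions true, Deforming true
  --     Timing: Generation 1.2 ms, Inlining 0.0 ms, Optimization 0.5 ms, Emission 7.8 ms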

Pavel


Re: is JIT available

2020-07-25 Thread Scott Ribe
> On Jul 25, 2020, at 6:21 AM, Pavel Stehule  wrote:
> 
> It shows if Postgres was compiled with JIT support.
> 
> When you run EXPLAIN ANALYZE SELECT ... then you can see info about JIT 
> overhead. If you don't see notices about JIT in EXPLAIN, then JIT was not 
> used.

The presence of "jit = on" in the config file does not indicate whether the 
running PG was actually compiled with JIT. And I'm not sure whether, beyond 
that, JIT compilation requires the presence of anything outside the PG install.



Re: is JIT available

2020-07-25 Thread Pavel Stehule
On Sat, 25 Jul 2020 at 14:33, Scott Ribe wrote:

> > On Jul 25, 2020, at 6:21 AM, Pavel Stehule 
> wrote:
> >
> > It shows if Postgres was compiled with JIT support.
> >
> > When you run EXPLAIN ANALYZE SELECT ... then you can see info about JIT
> overhead. If you don't see notices about JIT in EXPLAIN, then JIT was not
> used.
>
> The presence of "jit = on" in the config file does not indicate whether
> the running PG was actually compiled with JIT. And I'm not sure whether
> beyond that, compiling with JIT requires presence of anything outside the
> PG install.


 select * from pg_config where name = 'CONFIGURE' and setting like
'%--with-llvm%';

If you see one row, then your Postgres should have been configured and compiled
with JIT support.


Re: is JIT available

2020-07-25 Thread Christoph Moench-Tegeder
## Scott Ribe (scott_r...@elevated-dev.com):

> So JIT is enabled in your conf, how can you tell from within a client
> session whether it's actually available (PG compiled with it and
> compiler available)?

pg_jit_available()  boolean  is JIT compilation available in this session

https://www.postgresql.org/docs/12/functions-info.html
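A quick check from any client session (note the docs also say this returns
false when jit is simply disabled):

  SHOW jit;
  SELECT pg_jit_available();
  -- true only if the server binary has JIT support, the JIT provider
  -- library can be loaded, and jit is enabled in this session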

Regards,
Christoph

-- 
Spare Space




Re: bad JIT decision

2020-07-25 Thread Tom Lane
David Rowley  writes:
> On Sat, 25 Jul 2020 at 10:42, David Rowley  wrote:
>> I think plan cost overestimation is a common cause of unwanted jit too.
>> It would be good to see the EXPLAIN ANALYZE so we knew if that was the
>> case here.

> So Scott did send me the full EXPLAIN ANALYZE for this privately. He
> wishes to keep the full output private.

So ... what was the *top* line, ie total cost estimate?

regards, tom lane




Re: bad JIT decision

2020-07-25 Thread Tom Lane
Andres Freund  writes:
> On 2020-07-24 18:37:02 -0400, Tom Lane wrote:
>> Yeah.  I'm fairly convinced that the v12 defaults are far too low,
>> because we are constantly seeing complaints of this sort.

> I think the issue is more that we need to take into account that the
> overhead of JITing scales ~linearly with the number of JITed
> expressions. And that's not done right now.  I've had a patch somewhere
> that had a prototype implementation of changing the costing to be
> #expressions * some_cost, and I think that's a lot more accurate.

Another thing we could try with much less effort is scaling it by the
number of relations in the query.  There's already some code in the
plancache that tries to estimate planning effort that way, IIRC.
Such a scaling would be very legitimate for the cost of compiling
tuple-deconstruction code, and for other expressions it'd kind of
amount to an assumption that the expressions-per-table ratio is
roughly constant.  If you don't like that, maybe some simple
nonlinear growth rule would work.
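As a rough formula (symbols mine, purely illustrative), that amounts to
scaling the JIT cost estimate as

  jit_cost ~ c_deform * N_rel + c_expr * N_expr,  with N_expr ~ r * N_rel

where r is the assumed roughly-constant expressions-per-table ratio.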

regards, tom lane




Format generation_expression

2020-07-25 Thread Nils Bergmann
Hi,

I am working on a project with generated columns, trying to implement the
generated-column feature in the typeorm project. The thing is, I don't want to
drop and re-add a generated column every time the application starts, so I need
a way to check whether a generated column expression has changed.

With

  select generation_expression from information_schema."columns" c
  where c.generation_expression is not null;

I can retrieve the saved generation expression. However, it seems like Postgres
optimizes or changes it, so what comes back is different.

Example:

I add a column like this:

  alter table company add tsv tsvector generated always as
  (setweight(to_tsvector('german', coalesce(name, '')), 'A') ||
   setweight(jsonb_to_tsvector('german', keywords::jsonb, '["string"]'), 'A'))
  stored;

Now if I try to retrieve the expression with

  select c.generation_expression from information_schema."columns" c
  where c.column_name = 'tsv' and c.table_name = 'company';

it is different.

Input expression:

  setweight(to_tsvector('german', coalesce(name, '')), 'A') ||
  setweight(jsonb_to_tsvector('german', keywords::jsonb, '["string"]'), 'A')

What I got from Postgres:

  (setweight(to_tsvector('german'::regconfig, (COALESCE(name, ''::character
  varying))::text), 'A'::"char") ||
  setweight(jsonb_to_tsvector('german'::regconfig, keywords,
  '["string"]'::jsonb), 'A'::"char"))

Postgres formats the expression in some way. Is there any way to do this
manually? Like a hypothetical

  magic_function_to_format_sql('<input expression>')

that returns the normalized form shown above.

Then I could check whether the saved expression is equal to the formatted one.
I tried to find information in the documentation, but had no luck.
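For comparison, the normalized form that information_schema shows can also be
read straight from the catalogs; a minimal sketch, assuming PostgreSQL 12 and
the table/column from the example above:

  SELECT pg_get_expr(d.adbin, d.adrelid) AS generation_expression
  FROM pg_attrdef d
  JOIN pg_attribute a
    ON a.attrelid = d.adrelid AND a.attnum = d.adnum
  WHERE d.adrelid = 'company'::regclass
    AND a.attname = 'tsv'
    AND a.attgenerated = 's';  -- 's' marks a stored generated column

As far as I know there is no built-in function that normalizes an arbitrary
expression string without attaching it to an object, so one workaround would be
to create the column on a scratch table and compare normalized form against
normalized form.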

Best regards,

Nils Bergmann


Re: bad JIT decision

2020-07-25 Thread Tom Lane
David Rowley  writes:
> ... nested at the bottom level join, about 6 joins deep.  The lack of
> any row being found results in upper level joins not having to do
> anything, and the majority of the plan is (never executed).

On re-reading this, that last point struck me forcibly.  If most of
the plan never gets executed, could we avoid compiling it?  That is,
maybe JIT isn't JIT enough, and we should make compilation happen
at first use of an expression not during executor startup.

regards, tom lane




Re: TDE implementation in Postgres running in a Docker container

2020-07-25 Thread Ron

On 7/24/20 11:38 AM, Vasu Madhineni wrote:

Hi All,

How can I implement TDE in Postgres running in Docker containers?


Postgres does not have TDE.  You need to encrypt at the filesystem level.

--
Angular momentum makes the world go 'round.




plpython function - return each list value as a row?

2020-07-25 Thread karsten
Hi All,
 
I am trying to create my first plpython function (retrieving earthquake data
from an API). Overall I was able to get a single row returned, but I am
struggling with how to return the entire list I have as multiple rows, see
below. Currently I get the following error when running the GetEartquakeAll
function:

  select GetEartquakeAll('2020-01-01', '2020-03-01', -120, 40, 200, 1.7);
  ERROR: length of returned sequence did not match number of columns in row

How can I 'simply' return each list value as a row?
Thanks
Karsten Vennemann
 
CREATE OR REPLACE FUNCTION GetEartquakeAll(start date, stop date, lon float,
lat float, radius int, minmagnitude float) RETURNS equake_values AS $$
  import urllib2
  import json as json
  # query the USGS earthquake API for the given window and region
  data = urllib2.urlopen('https://earthquake.usgs.gov/fdsnws/event/1/query?format=geojson&starttime=%s&endtime=%s&latitude=%s&longitude=%s&maxradiuskm=%s&minmagnitude=%s&orderby=magnitude' % (start, stop, lat, lon, radius, minmagnitude))
  js_data = json.load(data)
  equake = js_data
  equakearray = []
  a = 0
  for i in equake['features']:
    equakeplace = i['properties']['place']  # title for earthquake location
    magnitude = i['properties']['mag']
    qlong = i['geometry']['coordinates'][0]
    qlat = i['geometry']['coordinates'][1]
    equakevalue = {"place": equakeplace, "magnitude": magnitude, "qlong": qlong, "qlat": qlat}
    equakearray.append(equakevalue)
    a = a + 1
  return equakearray
$$ LANGUAGE plpythonu;

-- custom composite type returned from the equake data API query
-- (note: this type must be created before the function above)
CREATE TYPE equake_values AS (
  place text,
  magnitude float,
  qlong float,
  qlat float
);









RE: plpython function - return each list value as a row?

2020-07-25 Thread karsten
Answering my own question: I got it to work with a tiny change, adding SETOF
to the return definition:
Cheers
Karsten

...
RETURNS SETOF equake_values AS $$
...
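With SETOF, PL/Python maps each dict in the returned list to one row of the
composite type, so the function can be called like this (usage sketch, with the
parameters from the original post):

  SELECT * FROM GetEartquakeAll('2020-01-01', '2020-03-01', -120, 40, 200, 1.7);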



Re: bad JIT decision

2020-07-25 Thread David Rowley
On Sun, 26 Jul 2020 at 02:17, Tom Lane  wrote:
>
> David Rowley  writes:
> > On Sat, 25 Jul 2020 at 10:42, David Rowley  wrote:
> >> I think plan cost overestimation is a common cause of unwanted jit too.
> >> It would be good to see the EXPLAIN ANALYZE so we knew if that was the
> >> case here.
>
> > So Scott did send me the full EXPLAIN ANALYZE for this privately. He
> > wishes to keep the full output private.
>
> So ... what was the *top* line, ie total cost estimate?

Hash Right Join  (cost=1200566.17..1461446.31 rows=1651 width=141)
(actual time=5881.944..5881.944 rows=0 loops=1)

So it is well above the standard JIT inline and optimize cost thresholds.
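For reference, the thresholds involved can be checked with the following
(the values shown in comments are the PostgreSQL 12 defaults):

  SHOW jit_above_cost;           -- 100000
  SHOW jit_inline_above_cost;    -- 500000
  SHOW jit_optimize_above_cost;  -- 500000

so a total cost estimate of ~1.46 million clears all three.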

David




Re: bad JIT decision

2020-07-25 Thread David Rowley
On Sun, 26 Jul 2020 at 02:23, Tom Lane  wrote:
>
> Andres Freund  writes:
> > On 2020-07-24 18:37:02 -0400, Tom Lane wrote:
> >> Yeah.  I'm fairly convinced that the v12 defaults are far too low,
> >> because we are constantly seeing complaints of this sort.
>
> > I think the issue is more that we need to take into account that the
> > overhead of JITing scales ~linearly with the number of JITed
> > expressions. And that's not done right now.  I've had a patch somewhere
> > that had a prototype implementation of changing the costing to be
> > #expressions * some_cost, and I think that's a lot more accurate.
>
> Another thing we could try with much less effort is scaling it by the
> number of relations in the query.  There's already some code in the
> plancache that tries to estimate planning effort that way, IIRC.
> Such a scaling would be very legitimate for the cost of compiling
> tuple-deconstruction code, and for other expressions it'd kind of
> amount to an assumption that the expressions-per-table ratio is
> roughly constant.  If you don't like that, maybe some simple
> nonlinear growth rule would work.

I had imagined something a bit less all or nothing.  I had thought
that the planner could pretty cheaply choose if jit should occur or
not on a per-Expr level.  For WHERE clause items we know "norm_selec"
and we know what baserestrictinfos come before this RestrictInfo, so
we could estimate the number of executions per item in the WHERE
clause. For Exprs in the targetlist we have the estimated rows from
the RelOptInfo. HAVING clause Exprs will be evaluated a similar number
of times.   The planner could do something along the lines of
assuming, say 1000 * cpu_operator_cost to compile an Expr then assume
that a compiled Expr will be some percentage faster than an evaluated
one and only jit when the Expr is likely to be evaluated enough times
for it to be an overall win.  Optimize and inline would just have
higher thresholds.
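As a rough break-even test (symbols mine, purely illustrative), that would be
something like:

  jit an Expr when  n_evals * (cost_interpreted - cost_jitted) > 1000 * cpu_operator_cost

with correspondingly higher thresholds for optimizing and inlining.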

David




Re: bad JIT decision

2020-07-25 Thread David Rowley
On Sun, 26 Jul 2020 at 02:54, Tom Lane  wrote:
>
> David Rowley  writes:
> > ... nested at the bottom level join, about 6 joins deep.  The lack of
> > any row being found results in upper level joins not having to do
> > anything, and the majority of the plan is (never executed).
>
> On re-reading this, that last point struck me forcibly.  If most of
> the plan never gets executed, could we avoid compiling it?  That is,
> maybe JIT isn't JIT enough, and we should make compilation happen
> at first use of an expression not during executor startup.

That's interesting.  But it would introduce an additional per-evaluation
cost of checking whether we're doing the first execution.

David




Re: [SOLUTION] Slow or Cannot Connect to PostgreSQL Instance Service on Windows 10

2020-07-25 Thread Surya Widyanto

On 24-07-2020 06:16, Adrian Klaver wrote:

On 7/22/20 4:49 PM, Surya Widyanto wrote:

On 23-07-2020 06:12, Rob Sargent wrote:



On Jul 22, 2020, at 4:58 PM, Surya Widyanto wrote:


On 23-07-2020 05:26, Adrian Klaver wrote:

On 7/22/20 10:24 AM, Surya Widyanto wrote:

Hi,

I have PostgreSQL v12.3 (x64) installed on my laptop running 
Windows 10 Home Single Language (64-bit).
This has happened two or three times since I fresh-installed PostgreSQL and 
Windows on 17-05-2020, 06:04:56 (GMT).


Meant to ask earlier, where did you install Postgres from?

I downloaded the installer executable from 
https://www.enterprisedb.com/downloads/postgres-postgresql-downloads, 
the Windows x86-64 version.


Then I installed it on my Windows with the binaries path 
"C:\Program Files\PostgreSQL\12\" and the data path 
"H:\PostgreSQL\Data\12.x\".
Both the C: and H: drives are partitions on my NVMe PCIe Gen 3 x4 
SSD drive mentioned earlier.


For additional information, I've also installed PostGIS v3.0 
r15475 (x64) for PostgreSQL v12.


By two or three times I mean: when this slow connection problem 
happens, it occurs for a few days in a row. Then it returns 
to normal, with a fast login/connection process.





Regards,
Surya


Is there by any chance a correlation between your physical location 
and the slowness episodes?

Any AV software in play?



Since the COVID-19 outbreak, I have been working from home. I've tried 
using different internet connections to test: one a wired connection 
from the internet provider at my house, the other a 4G cellular 
connection from my smartphone. Both internet connections give the 
same effect of slow and easily dropped connections to PostgreSQL.


I thought you said earlier you were only connecting to the database 
on your laptop at localhost?:
Yes, it is a local connection over TCP on port 5432, using the local 
loopback adapter (localhost or IP address 127.0.0.1).


"The Database is on the same machine/laptop, I try to connect using 
localhost and 127.0.0.1, both have the same slow connection problem."


So the internet connection should not be a concern here, or is there more 
to this story?
Yes, I've also tried disconnecting from the internet, and there is no 
change in the connection behaviour. It is still slow to connect to the 
database, and the connection is easily dropped.


For additional information, I've also had this problem when 
connecting from a PHP application, so in total there are 4 apps I have 
tried connecting to PostgreSQL with (psql, pgAdmin, Navicat, PHP 
application connection).
I have not installed any antivirus software; the only antivirus 
installed is Windows Defender, which came with Microsoft Windows. 
(Currently I am disabling the real-time scan in Windows Defender.)


After I set log_min_messages and log_min_error_statement to DEBUG5, 
another log line came up along with the one I mentioned earlier:
2020-07-22 23:31:29.065 +00 [postgres@postgres] - 127.0.0.1(10191) - 
[1724] (psql) LOG:  could not receive data from client: An existing 
connection was forcibly closed by the remote host.
2020-07-22 23:31:29.065 +00 [postgres@postgres] - 127.0.0.1(10191) - 
[1724] (psql) DEBUG:  unexpected EOF on client connection





After much trial and error, I was finally able to solve the problem with 
the connection to the local PostgreSQL server on my laptop. After I 
managed to fully disable the Windows Defender and Windows Firewall 
services, I am now able to connect to my PostgreSQL server quickly and 
without the dropped connections I had before.
Based on my research on the internet, the "DEBUG: unexpected EOF on 
client connection" error is caused by a suddenly dropped connection 
between the PostgreSQL client apps and the PostgreSQL server, and most 
cases of dropped connections are caused by some kind of firewall in 
the system.


Thanks to Adrian Klaver and Rob Sargent for giving me insight to solve 
the problem.


Regards,
Surya