SV: SV: Implementing pgaudit extension on Microsoft Windows

2019-02-13 Thread Niels Jespersen
Hi Tom (and others)

Thank you for the input. I think my next steps will be approaching the problem 
from different angles: 

Implement on Linux as Proof of Concept. 
Dive into the source for pgaudit and try to understand how it works, perhaps 
getting a small example extension working along the way. 
Perhaps contacting the authors of pgaudit to have them confirm that there 
exist working implementations of pgaudit on Windows (and if so, whether they 
are on PostgreSQL 11). 

I will report how this turns out. 

Regards Niels

-Original Message-
From: Tom Lane 
Sent: 12 February 2019 15:14
To: Niels Jespersen 
Cc: 'Pavel Stehule' ; 
pgsql-general@lists.postgresql.org
Subject: Re: SV: Implementing pgaudit extension on Microsoft Windows

Niels Jespersen  writes:
> Same result from
> pgaudit.log = 'all'
> in postgresql.conf and after both select pg_reload_conf(); and after service 
> restart.
> No entries in the log from audit.

Hm.  I don't know much about pgaudit, but just from scanning its documentation, 
it doesn't seem like there's anything else that has to be set.

I wonder whether pgaudit actually works on Windows?  It might have some weird 
dependency on children being spawned with fork not exec, for instance.  You 
probably ought to contact the authors and ask.

regards, tom lane
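
For reference, a minimal sanity check of a pgaudit setup, sketched from the
pgaudit documentation (it assumes superuser access and is not the poster's
actual configuration):

-- pgaudit must be preloaded; if this does not show 'pgaudit', add
-- shared_preload_libraries = 'pgaudit' to postgresql.conf and restart.
SHOW shared_preload_libraries;

-- The extension must also exist in the database being audited.
CREATE EXTENSION IF NOT EXISTS pgaudit;

-- pgaudit.log is a superuser setting; 'all' is the broadest class list.
SET pgaudit.log = 'all';

-- Any statement should now emit an "AUDIT:" line in the server log.
SELECT 1;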



Is the ring buffer not used for index blocks

2019-02-13 Thread Daniel Westermann
Hi,

We just wondered: when a huge table is read into the buffer cache, it goes 
through a ring buffer so as not to pollute the cache. The same is apparently 
not true for indexes, as many more index blocks end up cached.


-- Restarted the instance
pgbench=# explain (analyze,buffers) select count(*) from pgbench_accounts;

QUERY PLAN 
--------------------------------------------------------------------------------------------------------------------
 Finalize Aggregate  (cost=212771.98..212771.99 rows=1 width=8) (actual time=848.179..848.179 rows=1 loops=1)
   Buffers: shared hit=6 read=9045
   ->  Gather  (cost=212771.77..212771.98 rows=2 width=8) (actual time=847.981..848.172 rows=3 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         Buffers: shared hit=6 read=9045
         ->  Partial Aggregate  (cost=211771.77..211771.78 rows=1 width=8) (actual time=835.475..835.475 rows=1 loops=3)
               Buffers: shared hit=18 read=27325
               ->  Parallel Index Only Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..201355.10 rows=417 width=0
                     Heap Fetches: 0
                     Buffers: shared hit=18 read=27325
 Planning time: 0.166 ms
 Execution time: 853.673 ms
(13 rows)


pgbench=# SELECT c.relname, count(*) AS buffers
FROM pg_class c 
   INNER JOIN pg_buffercache b
  ON b.relfilenode=c.relfilenode 
   INNER JOIN pg_database d 
  ON (b.reldatabase=d.oid AND d.datname=current_database())
   GROUP BY c.relname
   ORDER BY 2 DESC LIMIT 50;
        relname        | buffers 
-----------------------+---------
 pgbench_accounts_pkey |   16160
 pgbench_accounts      |       6
 pg_index              |       4

pgbench=# show shared_buffers ;
 shared_buffers 
----------------
 128MB
(1 row)

pgbench=# select version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

Can someone please explain this? Why do we see so many blocks of the primary 
key? Shouldn't this be limited somehow in the same way it is currently done for 
tables?

Thanks in advance
Daniel
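
For reference, a sketch relating cached buffers to each relation's total size,
which makes the table/index asymmetry easier to see (it assumes the
pg_buffercache extension is installed):

SELECT c.relname,
       count(*) AS buffers,
       pg_relation_size(c.oid) / current_setting('block_size')::int AS total_blocks
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
JOIN pg_database d ON b.reldatabase = d.oid AND d.datname = current_database()
GROUP BY c.relname, c.oid
ORDER BY buffers DESC
LIMIT 10;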


Re: Subquery to select max(date) value

2019-02-13 Thread Jan Kohnert
On Tuesday, 12 February 2019 at 23:23:49 CET, Rich Shepard wrote:
> The query is to return the latest next_contact date for each person. Using
> the max() aggregate function and modeling the example of lo_temp on page 13
> of the rel. 10 manual I wrote this statement:

You don't even need a subselect to do so:

select 
P.person_id, P.lname, P.fname, P.direct_phone, 
O.org_name, max(A.next_contact) as next_contact
from People as P
join Organizations as O
on P.org_id = O.org_id
join Activities as A
on P.person_id = A.person_id
group by 
P.person_id, P.lname, P.fname, 
P.direct_phone, O.org_name;

-- 
Kind regards Jan






Re: Server goes to Recovery Mode when run a SQL

2019-02-13 Thread PegoraroF10
*Here is a SQL which will trigger that recovery mode. You can run it on any
database because we created it with FROM VALUES, so ...*
This is the one which grows and grows memory use until all memory and swap
space are gone, and then the problem occurs. The other SQL, which gives us the
same problem immediately, we couldn't replay without our entire database. Even
if we extract just that schema to a new DB it doesn't go into recovery mode.
We will think a little more about how to create something you can test.
with feriados as (
  SELECT dia, repete
  FROM (VALUES
('2014-10-11'::DATE, FALSE), ('2014-10-13', FALSE), ('2014-10-14', FALSE), ('2014-10-15', FALSE), ('2014-10-16', FALSE),
('2014-10-17', FALSE), ('2014-12-19', FALSE), ('2014-12-20', FALSE), ('2014-12-22', FALSE), ('2014-12-23', FALSE),
('2014-12-24', FALSE), ('2014-12-26', FALSE), ('2014-12-27', FALSE), ('2014-12-29', FALSE), ('2014-12-30', FALSE),
('2014-12-31', FALSE), ('2015-01-02', FALSE), ('2015-01-03', FALSE), ('2015-01-04', FALSE), ('2015-02-16', FALSE),
('2015-02-17', FALSE), ('2015-04-03', FALSE), ('2015-04-04', FALSE), ('2015-06-04', FALSE), ('2015-12-18', FALSE),
('2015-12-19', FALSE), ('2015-12-21', FALSE), ('2015-12-22', FALSE), ('2015-12-23', FALSE), ('2015-12-24', FALSE),
('2015-12-26', FALSE), ('2015-12-28', FALSE), ('2015-12-29', FALSE), ('2015-12-30', FALSE), ('2015-12-31', FALSE),
('2016-01-02', FALSE), ('2016-01-04', FALSE), ('2016-01-05', FALSE), ('2016-01-06', FALSE), ('2016-01-07', FALSE),
('2016-01-08', FALSE), ('2016-01-09', FALSE), ('2016-02-08', FALSE), ('2016-03-09', FALSE), ('2016-03-25', FALSE),
('2016-03-26', FALSE), ('2016-05-26', FALSE), ('2016-12-24', FALSE), ('2016-12-28', FALSE), ('2016-12-29', FALSE),
('2016-12-30', FALSE), ('2016-12-31', FALSE), ('2017-01-02', FALSE), ('2017-01-03', FALSE), ('2017-01-04', FALSE),
('2017-01-05', FALSE), ('2017-01-06', FALSE), ('2017-01-07', FALSE), ('2017-02-25', FALSE), ('2017-02-27', FALSE),
('2017-03-09', FALSE), ('2017-04-14', FALSE), ('2017-04-15', FALSE), ('2017-06-15', FALSE), ('2017-09-30', FALSE),
('2017-12-18', FALSE), ('2017-12-19', FALSE), ('2017-12-20', FALSE), ('2017-12-21', FALSE), ('2017-12-22', FALSE),
('2017-12-23', FALSE), ('2017-12-26', FALSE), ('2017-12-27', FALSE), ('2017-12-28', FALSE), ('2017-12-29', FALSE),
('2017-12-30', FALSE), ('2018-01-02', FALSE), ('2018-01-03', FALSE), ('2018-01-04', FALSE), ('2018-01-05', FALSE),
('2018-01-06', FALSE), ('2018-01-07', FALSE), ('2018-02-12', FALSE), ('2018-02-13', FALSE), ('2018-03-30', FALSE),
('2018-03-31', FALSE), ('2018-04-30', FALSE), ('2018-05-31', FALSE), ('2018-10-15', FALSE), ('2018-12-18', FALSE),
('2018-12-19', FALSE), ('2018-12-20', FALSE), ('2018-12-21', FALSE), ('2018-12-22', FALSE), ('2018-12-24', FALSE),
('2018-12-26', FALSE), ('2018-12-27', FALSE), ('2018-12-28', FALSE), ('2018-12-29', FALSE), ('2018-12-31', FALSE),
('2019-01-01', TRUE), ('2019-01-02', FALSE), ('2019-01-03', FALSE), ('2019-01-04', FALSE), ('2019-01-05', FALSE),
('2019-01-07', FALSE), ('2019-02-09', TRUE), ('2019-03-09', TRUE), ('2019-04-21', TRUE), ('2019-05-01', TRUE),
('2019-09-07', TRUE), ('2019-10-12', TRUE), ('2019-11-02', TRUE), ('2019-11-15', TRUE), ('2019-12-19', TRUE),
('2019-12-20', TRUE), ('2019-12-21', TRUE), ('2019-12-22', TRUE), ('2019-12-23', TRUE), ('2019-12-25', TRUE),
('2019-12-26', TRUE), ('2019-12-27', TRUE)
  ) x (dia, repete)
), materias as (
  SELECT * from (VALUES
(593, 11091, 'AAC - Ética e Segurança Digital', 9, 120, '2019/01/30'::DATE, '2019/01/30'::DATE, 60, 120, 0),
(593, 11085, 'AAC - Computação nas Nuvens', 12, 60, NULL, NULL, 60, 120, 60)
  ) x (turma_id, materia_id, materia, sequencia, previsto, dataini, datafim, tempoatividade, minutosaula, minutosrestantes)
), aulasporsemana as (
  select * from (values (593, 1)) x (turma_id, quantidade)
), assistidas (turma_id, sequencia, conteudo_id, conteudo, prevista, aularealizada, tempoatividade, aulasNoDia, dia,
   minutoaula, minutosassistidos, cargaconteudo, cargarestante) as (
  SELECT
materias.turma_id,
materias.sequencia,
materias.materia_id,
materias.materia,
coalesce(realizada.prevista, 1),
realizada.aularealizada,
materias.tempoatividade,
(realizada.minutosassistidos / materias.tempoatividade),
realizada.dia,
materias.minutosaula,
realizada.minutosassistidos,
materias.previsto,
coalesce(materias.previsto - (row_number() OVER AulaDaMateria * realizada.minutosassistidos), materias.previsto)
  FROM materias
  LEFT JOIN LATERAL (
SELECT TRUE, tsrange(col_aula.data, (col_aula.data + (col_aula.tempo || ' minute')::INTERVAL)) dia, 0 prevista,
   (extract(EPOCH FROM col_aula.tempo) / 60) minutosassistidos
FROM (VALUES (593, 11091, TIMESTAMP '2019-01-30 19:00:00', TIME '0

Re: Server goes to Recovery Mode when run a SQL

2019-02-13 Thread rob stone
Hello Marcos,

On Wed, 2019-02-13 at 04:41 -0700, PegoraroF10 wrote:
> Here is a SQL which will trigger that recovery mode. You can run it on
> any database because we created it with FROM VALUES, so ...
> This is the one which grows and grows memory use until all memory and
> swap space are gone, and then the problem occurs. The other SQL, which
> gives us the same problem immediately, we couldn't replay without our
> entire database. Even if we extract just that schema to a new DB it
> doesn't go into recovery mode. We will think a little more about how to
> create something you can test.
> 

Thanks for the revised query.

However, the create table/view scripts are needed as well.

Cheers,
Robert




Re: Server goes to Recovery Mode when run a SQL

2019-02-13 Thread PegoraroF10
It's SQL built from VALUES, why do you need tables? 



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-general-f1843780.html



Re: Out of memory: Kill process nnnn (postmaster) score nn or sacrifice child

2019-02-13 Thread Vikas Sharma
Thank you, Adrian, for the reply. I did check the postgres processes running
around the time the OOM killer was invoked; there were lots of high-CPU
postgres processes running long-running selects.
I am not sure how to interpret the memory terms appearing in the Linux dmesg
output or /var/log/messages, but I can see that the system ran out of memory
and the OOM killer targeted the postmaster.

Regards
Vikas Sharma

On Tue, 12 Feb 2019 at 16:39, Adrian Klaver 
wrote:

> On 2/12/19 8:20 AM, Vikas Sharma wrote:
> > Hello All,
> >
> > I have a 4-node PostgreSQL 9.6 cluster with streaming replication. We
> > encountered the Out of Memory error on the master today, which resulted in
> > all postgres processes being restarted, and the cluster recovered itself.
> > Please let me know the best way to diagnose this issue.
>
> For a start, look back further in the Postgres log than the below. What
> is shown below are the effects of the OOM killer. What you need to look
> for is the statement that caused Postgres memory use to increase to the
> point that the OOM killer was invoked.
>
> >
> >
> >
> > The error seen in the postgresql log:
> >
> > 2019-02-12 10:55:17 GMT LOG:  terminating any other active server
> processes
> > 2019-02-12 10:55:17 GMT WARNING:  terminating connection because of
> > crash of another server process
> > 2019-02-12 10:55:17 GMT DETAIL:  The postmaster has commanded this
> > server process to roll back the current transaction and exit, because
> > another server process exited abnormally and possibly corrupted shared
> > memory.
> > 2019-02-12 10:55:17 GMT HINT:  In a moment you should be able to
> > reconnect to the database and repeat your command.
> > 2019-02-12 10:55:17 GMT WARNING:  terminating connection because of
> > crash of another server process
> > 2019-02-12 10:55:17 GMT DETAIL:  The postmaster has commanded this
> > server process to roll back the current transaction and exit, because
> > another server process exited abnormally and possibly corrupted shared
> > memory.
> > 2019-02-12 10:55:17 GMT HINT:  In a moment you should be able to
> > reconnect to the database and repeat your command.
> > 2019-02-12 10:55:17 GMT WARNING:  terminating connection because of
> > crash of another server process
> > -
> >
> > Error from dmesg on linux:
> > ---
> > [4331093.885622] Out of memory: Kill process n (postmaster) score nn
> > or sacrifice child
> > [4331093.890225] Killed process n (postmaster) total-vm:18905944kB,
> > anon-rss:1747460kB, file-rss:4kB, shmem-rss:838220kB
> >
> > Thanks & Best Regards
> > Vikas Sharma
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Andrew Gierth wrote:


Rich> Will probably try DISTINCT ON, too, if that makes it simpler or
Rich> faster.

You want LATERAL.


Andrew,

That's new to me so I'll read about it.

Thanks,

Rich



Re: Server goes to Recovery Mode when run a SQL

2019-02-13 Thread rob stone



On Wed, 2019-02-13 at 05:32 -0700, PegoraroF10 wrote:
> is a sql FROM VALUES, why do you need tables ? 
> 
> 


I can see that, however:-

ERROR: relation "col_diasaula" does not exist
  Position: 7477
[SQL State: 42P01]

Is that a table, view or a missing CTE?






Re: Server goes to Recovery Mode when run a SQL

2019-02-13 Thread PegoraroF10
Sorry, it's a missing part of the CTE; that constant set of values should be
near the beginning of it.

with feriados as (
  SELECT dia, repete
  FROM (
VALUES ('2014-10-11'::DATE, FALSE), ('2014-10-13', FALSE),
('2014-10-14', FALSE), ('2014-10-15', FALSE), ('2014-10-16', FALSE),
   ('2014-10-17', FALSE), ('2014-12-19', FALSE), ('2014-12-20',
FALSE), ('2014-12-22', FALSE), ('2014-12-23', FALSE),
   ('2014-12-24', FALSE), ('2014-12-26', FALSE), ('2014-12-27',
FALSE), ('2014-12-29', FALSE), ('2014-12-30', FALSE),
   ('2014-12-31', FALSE), ('2015-01-02', FALSE), ('2015-01-03',
FALSE), ('2015-01-04', FALSE), ('2015-02-16', FALSE),
   ('2015-02-17', FALSE), ('2015-04-03', FALSE), ('2015-04-04',
FALSE), ('2015-06-04', FALSE), ('2015-12-18', FALSE),
   ('2015-12-19', FALSE), ('2015-12-21', FALSE), ('2015-12-22',
FALSE), ('2015-12-23', FALSE), ('2015-12-24', FALSE),
   ('2015-12-26', FALSE), ('2015-12-28', FALSE), ('2015-12-29',
FALSE), ('2015-12-30', FALSE), ('2015-12-31', FALSE),
   ('2016-01-02', FALSE), ('2016-01-04', FALSE), ('2016-01-05',
FALSE), ('2016-01-06', FALSE), ('2016-01-07', FALSE),
   ('2016-01-08', FALSE), ('2016-01-09', FALSE), ('2016-02-08',
FALSE), ('2016-03-09', FALSE), ('2016-03-25', FALSE),
   ('2016-03-26', FALSE), ('2016-05-26', FALSE), ('2016-12-24',
FALSE), ('2016-12-28', FALSE), ('2016-12-29', FALSE),
   ('2016-12-30', FALSE), ('2016-12-31', FALSE), ('2017-01-02',
FALSE), ('2017-01-03', FALSE), ('2017-01-04', FALSE),
   ('2017-01-05', FALSE), ('2017-01-06', FALSE), ('2017-01-07',
FALSE), ('2017-02-25', FALSE), ('2017-02-27', FALSE),
   ('2017-03-09', FALSE), ('2017-04-14', FALSE), ('2017-04-15',
FALSE), ('2017-06-15', FALSE), ('2017-09-30', FALSE),
   ('2017-12-18', FALSE), ('2017-12-19', FALSE), ('2017-12-20',
FALSE), ('2017-12-21', FALSE), ('2017-12-22', FALSE),
   ('2017-12-23', FALSE), ('2017-12-26', FALSE), ('2017-12-27',
FALSE), ('2017-12-28', FALSE), ('2017-12-29', FALSE),
   ('2017-12-30', FALSE), ('2018-01-02', FALSE), ('2018-01-03',
FALSE), ('2018-01-04', FALSE), ('2018-01-05', FALSE),
   ('2018-01-06', FALSE), ('2018-01-07', FALSE), ('2018-02-12',
FALSE), ('2018-02-13', FALSE), ('2018-03-30', FALSE),
   ('2018-03-31', FALSE), ('2018-04-30', FALSE), ('2018-05-31',
FALSE), ('2018-10-15', FALSE), ('2018-12-18', FALSE),
   ('2018-12-19', FALSE), ('2018-12-20', FALSE), ('2018-12-21',
FALSE), ('2018-12-22', FALSE), ('2018-12-24', FALSE),
   ('2018-12-26', FALSE), ('2018-12-27', FALSE), ('2018-12-28',
FALSE), ('2018-12-29', FALSE), ('2018-12-31', FALSE),
   ('2019-01-01', TRUE), ('2019-01-02', FALSE), ('2019-01-03',
FALSE), ('2019-01-04', FALSE), ('2019-01-05', FALSE),
   ('2019-01-07', FALSE), ('2019-02-09', TRUE), ('2019-03-09',
TRUE), ('2019-04-21', TRUE), ('2019-05-01', TRUE),
   ('2019-09-07', TRUE), ('2019-10-12', TRUE), ('2019-11-02', TRUE),
('2019-11-15', TRUE), ('2019-12-19', TRUE),
   ('2019-12-20', TRUE), ('2019-12-21', TRUE), ('2019-12-22', TRUE),
('2019-12-23', TRUE), ('2019-12-25', TRUE),
   ('2019-12-26', TRUE), ('2019-12-27', TRUE)
   ) x (dia, repete)
), materias as (
  SELECT * from (VALUES
(593, 11091, 'AAC - Ética e Segurança Digital', 9, 120,
'2019/01/30'::DATE, '2019/01/30'::DATE, 60, 120, 0),
(593, 11085, 'AAC - Computação nas Nuvens', 12, 60, NULL, NULL, 60, 120,
60)
  ) x (turma_id, materia_id, materia, sequencia, previsto, dataini, datafim,
tempoatividade, minutosaula, minutosrestantes)
), col_diasaula(turma_id, dia, tempoaula) as (
  select * from (values (593, 3, time '02:00')) as x (turma_id, dia, tempoaula)
), aulasporsemana as (
   select * from (values (593,1)) x (turma_id, quantidade)
), assistidas (turma_id, sequencia, conteudo_id, conteudo, prevista,
aularealizada, tempoatividade, aulasNoDia, dia, minutoaula,
minutosassistidos, cargaconteudo, cargarestante) as (
  SELECT
materias.turma_id,
materias.sequencia,
materias.materia_id,
materias.materia,
coalesce(realizada.prevista, 1),
realizada.aularealizada,
materias.tempoatividade,
(realizada.minutosassistidos / materias.tempoatividade),
realizada.dia,
materias.minutosaula,
realizada.minutosassistidos,
materias.previsto,
coalesce(materias.previsto - (row_number() OVER AulaDaMateria *
realizada.minutosassistidos), materias.previsto)
  FROM materias
  LEFT JOIN LATERAL (
SELECT TRUE, tsrange(col_aula.data, (col_aula.data + ( col_aula.tempo ||
' minute')::INTERVAL)) dia, 0 prevista, (extract(EPOCH FROM col_aula.tempo)
/ 60) minutosassistidos
FROM (VALUES
 (593, 11091, TIMESTAMP '2019-01-30 19:00:00', TIME '02:00'),
 (593, 11091, '2019-02-06 19:00:00', '01:00')
) col_aula (turma_id, materia_id, data, tempo)
WHERE col_aula.materia_id = materias.materia_id AND col_aula.turma_id =
materias.turma_id
ORDER BY col_aula.data, se

Re: SV: SV: Implementing pgaudit extension on Microsoft Windows

2019-02-13 Thread Joe Conway
On 2/13/19 3:32 AM, Niels Jespersen wrote:
> Dive into the source for pgaudit and try to understand how it works,
> perhaps getting a small example extension working along the way. 
> Perhaps contacting the authors of pgaudit to have them confirm that
> there exist working implementations of pgaudit on Windows (and if
> so, whether they are on PostgreSQL 11).

pgaudit definitely *should* work on Windows. If you can post your other
log-related conf settings I will fire up a Windows VM and try to
duplicate the issue.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
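
The log-related settings Joe asks for can be pulled in one query; a sketch
(run on the affected server; the pgaudit.* settings only appear once the
library is loaded):

SELECT name, setting, source
FROM pg_settings
WHERE name LIKE 'log%' OR name LIKE 'pgaudit%'
ORDER BY name;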



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Tue, 12 Feb 2019, Ken Tanzer wrote:


If that's getting you what you want, then great and more power to you.  It
looks like you'll only get people who have a next_contact in your target
window there.  You might also consider something like this...

select
   p.person_id,
   p.lname,
   p.fname,
   p.direct_phone,
   o.org_name,
   a.next_contact
from
   people as p
   LEFT JOIN organizations o USING (person_id)
   LEFT JOIN (
   SELECT
   DISTINCT ON (person_id)
   person_id,
   next_contact
   FROM activities a
   -- WHERE ???
   ORDER BY person_id,next_contact DESC
   ) a USING (person_id)
;


I modified this to restrict the time and ignore nulls by replacing the
question marks:

WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
  a.next_contact is not null

For a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?

Regards,

Rich





Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver

On 2/13/19 6:28 AM, Rich Shepard wrote:

On Tue, 12 Feb 2019, Ken Tanzer wrote:

If that's getting you what you want, then great and more power to you. It
looks like you'll only get people who have a next_contact in your target
window there. You might also consider something like this...

select
   p.person_id,
   p.lname,
   p.fname,
   p.direct_phone,
   o.org_name,
   a.next_contact
from
   people as p
   LEFT JOIN organizations o USING (person_id)
   LEFT JOIN (
   SELECT
   DISTINCT ON (person_id)
   person_id,
   next_contact
   FROM activities a
   -- WHERE ???
   ORDER BY person_id,next_contact DESC
   ) a USING (person_id)
;


I modified this to restrict the time and ignore nulls by replacing the
question marks:

WHERE a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
   a.next_contact is not null

For a reason I've not yet found, the last condition is not observed; i.e.,
those rows with null next_contact dates appear in the results. Position in
the sequence makes no difference. What might cause this?


The LEFT JOIN. There are rows in people for which there are no records 
coming from the sub-select on activities, so the row is 'padded' with 
NULL values for the missing data.




Regards,

Rich







--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Adrian Klaver wrote:


The LEFT JOIN. There are rows in people for which there are no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.


Adrian,

I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.

Thanks,

Rich



Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver

On 2/13/19 7:37 AM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Adrian Klaver wrote:


The LEFT JOIN. There are rows in people for which there are no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.


Adrian,

I assume it's the inner left join. I'll trace what's happening at each step


AFAIK there is no inner left join:
https://www.postgresql.org/docs/10/sql-select.html#SQL-FROM
"
join_type

One of

[ INNER ] JOIN

LEFT [ OUTER ] JOIN

RIGHT [ OUTER ] JOIN

FULL [ OUTER ] JOIN

CROSS JOIN
"


and learn where to specify no nulls.


You can't; it is the nature of the join:

"LEFT OUTER JOIN returns all rows in the qualified Cartesian product 
(i.e., all combined rows that pass its join condition), plus one copy of 
each row in the left-hand table for which there was no right-hand row 
that passed the join condition. This left-hand row is extended to the 
full width of the joined table by inserting null values for the 
right-hand columns. Note that only the JOIN clause's own condition is 
considered while deciding which rows have matches. Outer conditions are 
applied afterwards."



What you are seeing are records for which there is a person in the 
people table that either:


1) does not have a record under that person_id in the activities table

or

2) does not have a record that meets the date filtering you are doing on 
next_contact.




Thanks,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Postgresql RDS DB Latency Choosing Hash join Plan

2019-02-13 Thread github kran
On Tue, Feb 12, 2019 at 12:55 PM github kran  wrote:

>
>
> On Tue, Feb 12, 2019 at 12:33 PM Michael Lewis  wrote:
>
>> Did you update the stats by running ANALYZE on the tables involved, or
>> perhaps the entire database on the 'Non prod Aurora RDS instance'? Can you
>> share the two execution plans?
>>
>> *Michael Lewis  |  Software Engineer*
>> *Entrata*
>> *c: **619.370.8697 <619-370-8697>*
>>
>
> Here is the plan for both of the DB instances.
>

Michael - Did you get a chance to look at this issue? Also, we see a
sequential scan being done instead of an index scan.

>
>>
>> On Tue, Feb 12, 2019 at 11:27 AM github kran 
>> wrote:
>>
>>>
>>>
>>> On Tue, Feb 12, 2019 at 7:07 AM github kran 
>>> wrote:
>>>


 On Mon, Feb 11, 2019 at 6:00 PM github kran 
 wrote:

>
>
> On Mon, Feb 11, 2019 at 3:29 PM Michael Lewis 
> wrote:
>
>> Are the default statistics targets the same on both prod and AWS? Have you
>> analyzed all tables being used in this query to ensure stats are updated
>> properly?
>> If the optimizer is choosing a different plan, then the stats must be
>> different IMO.
>>
>>
>> *Michael Lewis |  Software Engineer*
>> *Entrata*
>>
>
>
> Thanks for your reply. I have verified a few of the tables and their
> default statistics targets and they seem to be the same, but is there
> anything in particular you want me to look at to differentiate the Prod
> and Non-prod databases? (Also, the DB instance size is the same, but there
> is a little more data in the Non-prod Aurora RDS instance compared to the
> Prod instance.)
>
> Query used.
>>select * from pg_stats where tablename = 'tableName'
>>
>>
>> On Mon, Feb 11, 2019 at 2:15 PM github kran 
>> wrote:
>>
>>> Hi Postgres Team,
>>> We are using the AWS Aurora RDS Postgres DB 9.6.8 engine. We recently
>>> deleted a few million rows from the database and ran into an issue in
>>> one of our dev accounts where the DB was not normal after this deletion.
>>> We re-indexed and vacuumed the entire database, but we couldn't bring it
>>> back to the same state as earlier. So as a next step we deleted the
>>> database and recreated it by copying the snapshot from a production
>>> instance, then vacuumed and re-indexed the database.
>>>
>>> After this the dev database seems to be in a better state than earlier,
>>> but we are seeing a few of our DB calls taking more than 1 minute when we
>>> are fetching data, and we observed this is because the query plan was
>>> executing a hash join as part of the query, whereas a similar query on
>>> the prod instance does not do any hash join and returns faster.
>>>
>>> Also, we did not want to experiment by modifying the DB settings, such as
>>> setting enable_hash_join to off or random_page_cost to 1, as we don't
>>> have these settings in the Prod instance.
>>>
>>> Note:
>>> The partition table sizes we have here are between 40 GB and 75 GB, and
>>> this is our normal size range; we have a new partition table every 7
>>> days.
>>>
>>> Appreciate your ideas on what we could be missing and what we can
>>> correct here to reduce the query latency.
>>>
>>> Thanks
>>> githubKran
>>>
>>


Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver

On 2/13/19 7:37 AM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Adrian Klaver wrote:


The LEFT JOIN. There are rows in people for which there are no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.


Adrian,

I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.


Should have been clearer in my previous post: you can get rid of the 
nulls by filtering out the entire row.




Thanks,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Adrian Klaver wrote:


AFAIK there is no inner left join:
https://www.postgresql.org/docs/10/sql-select.html#SQL-FROM


Sigh. There are two LEFT JOINS in the statement. I referred to the SECOND
one as INNER. I correct my message to refer to the second of the two left
joins.

Rich



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Adrian Klaver wrote:


Should have been clearer in my previous post, you can get rid of the nulls
by filtering out the entire row.


Adrian,

Thank you. I'm rebuilding the statement from the inside out (which helps me
learn more SQL in the process). For example,

select * from activities where next_contact is not null and
 next_contact <= 'today' and next_contact > '2018-12-31';

works as written. This suggests that it's the placement within the two joins
that needs correcting. Now I'll add the two joined tables one at a time and
learn how to structure the whole statement.

Best regards,

Rich




Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver
On Wed, Feb 13, 2019, at 9:06 AM, Rich Shepard wrote:
> On Wed, 13 Feb 2019, Adrian Klaver wrote:
> 
> > Should have been clearer in my previous post, you can get rid of the nulls
> > by filtering out the entire row.
> 
> Adrian,
> 
> Thank you. I'm rebuilding the statement from the inside out (which helps me
> learn more SQL in the process). For example,
> 
> select * from activities where next_contact is not null and
>   next_contact <= 'today' and next_contact > '2018-12-31';
> 
> works as written. This suggests that it's the placement within the two joins
> that needs correcting. Now I'll add the two joined tables one at a time and
> learn how to structure the whole statement.


It will work if you use it to filter after the joins are done.

> 
> Best regards,
> 
> Rich
> 
> 
>



Re: Postgresql RDS DB Latency Choosing Hash join Plan

2019-02-13 Thread Michael Lewis
I didn't see your email yesterday, sorry about that. Index scans instead of
sequential scans and a nested loop instead of a hash join mean that you have
bad row-count estimates on the "Non prod Aurora RDS instance" as far as I can
figure. Have you run commands like:

analyze asset_info_2019_2_part4;
analyze asset_info_2019_2_part2;

etc? If data are very similar, indexes all exist, and
default_statistics_target are the same, then you should be getting the same
plans.


*Michael Lewis  |  Software Engineer*
*Entrata*
*c: **619.370.8697 <619-370-8697>*
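
For reference, a sketch of how the statistics setup Michael describes could
be compared on the two instances (the table name is taken from his ANALYZE
example and is otherwise a placeholder):

SHOW default_statistics_target;

-- Per-column overrides; attstattarget = -1 means the default applies.
SELECT attname, attstattarget
FROM pg_attribute
WHERE attrelid = 'asset_info_2019_2_part4'::regclass
  AND attnum > 0 AND NOT attisdropped;

-- When the tables were last analyzed:
SELECT relname, last_analyze, last_autoanalyze
FROM pg_stat_user_tables
WHERE relname LIKE 'asset_info%';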


On Wed, Feb 13, 2019 at 8:49 AM github kran  wrote:

>
>
> On Tue, Feb 12, 2019 at 12:55 PM github kran  wrote:
>
>>
>>
>> On Tue, Feb 12, 2019 at 12:33 PM Michael Lewis 
>> wrote:
>>
>>> Did you update the stats by running ANALYZE on the tables involved, or
>>> perhaps the entire database on the 'Non prod Aurora RDS instance'? Can you
>>> share the two execution plans?
>>>
>>> *Michael Lewis  |  Software Engineer*
>>> *Entrata*
>>> *c: **619.370.8697 <619-370-8697>*
>>>
>>
>> Here is the plan for both of the DB instances.
>>
>
> Michael - Did you get a chance to look at this issue? Also, we see a
> sequential scan being done instead of an index scan.
>
>>
>>>
>>> On Tue, Feb 12, 2019 at 11:27 AM github kran 
>>> wrote:
>>>


 On Tue, Feb 12, 2019 at 7:07 AM github kran 
 wrote:

>
>
> On Mon, Feb 11, 2019 at 6:00 PM github kran 
> wrote:
>
>>
>>
>> On Mon, Feb 11, 2019 at 3:29 PM Michael Lewis 
>> wrote:
>>
>>> Are the default statistics targets the same on both prod and AWS? Have
>>> you analyzed all tables being used in this query to ensure stats are
>>> updated properly? If the optimizer is choosing a different plan, then
>>> the stats must be different IMO.
>>>
>>>
>>> *Michael Lewis |  Software Engineer*
>>> *Entrata*
>>>
>>
>>
>> Thanks for your reply. I have verified a few of the tables and their
>> default statistics targets and they seem to be the same, but is there
>> anything in particular you want me to look at to differentiate the Prod
>> and Non-prod databases? (Also, the DB instance size is the same, but
>> there is a little more data in the Non-prod Aurora RDS instance compared
>> to the Prod instance.)
>>
>> Query used.
>>>select * from pg_stats where tablename = 'tableName'
>>>
>>>
>>> On Mon, Feb 11, 2019 at 2:15 PM github kran 
>>> wrote:
>>>
 Hi Postgres Team,
 We are using the AWS Aurora RDS Postgres DB 9.6.8 engine. We recently
 deleted a few million rows from the database and ran into an issue in
 one of our dev accounts where the DB was not normal after this deletion.
 We re-indexed and vacuumed the entire database, but we couldn't bring it
 back to the same state as earlier. So as a next step we deleted the
 database and recreated it by copying the snapshot from a production
 instance, then vacuumed and re-indexed the database.

 After this the dev database seems to be in a better state than earlier,
 but we are seeing a few of our DB calls taking more than 1 minute when
 we are fetching data, and we observed this is because the query plan was
 executing a hash join as part of the query, whereas a similar query on
 the prod instance does not do any hash join and returns faster.

 Also, we did not want to experiment by modifying the DB settings, such
 as setting enable_hash_join to off or random_page_cost to 1, as we don't
 have these settings in the Prod instance.

 Note:
 The partition table sizes we have here are between 40 GB and 75 GB, and
 this is our normal size range; we have a new partition table every 7
 days.

 Appreciate your ideas on what we could be missing and what we can
 correct here to reduce the query latency.

 Thanks
 githubKran

>>>


Re: Out of memory: Kill process nnnn (postmaster) score nn or sacrifice child

2019-02-13 Thread Andreas Kretschmer
On 12 February 2019 17:20:09 CET, Vikas Sharma  wrote:
>Hello All,
>
>I have a 4-node PostgreSQL 9.6 cluster with streaming replication. We
>encountered the Out of Memory error on the master today, which resulted in
>all postgres processes being restarted, and the cluster recovered itself.
>Please let me know the best way to diagnose this issue.
>

A typical reason for OOM kills is too high a value for work_mem.


Regards, Andreas

-- 
2ndQuadrant - The PostgreSQL Support Company
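
A rough way to check Andreas's point: work_mem can be used several times per
query (once per sort or hash node) in each backend, so the worst case scales
with the connection count. A sketch:

SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('work_mem', 'maintenance_work_mem',
               'shared_buffers', 'max_connections');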



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Adrian Klaver wrote:


It will work if you use it to filter after the joins are done.


I'll work on finding the proper syntax to do this. Need to do more reading
and trial-and-error testing.

Regards,

Rich



Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver

On 2/13/19 7:37 AM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Adrian Klaver wrote:


The LEFT JOIN. There are rows in people for which there are no records coming
from the sub-select on activities, so the row is 'padded' with NULL values
for the missing data.


Adrian,

I assume it's the inner left join. I'll trace what's happening at each step
and learn where to specify no nulls.


create table people(person_id integer, desc_fld varchar);
create table activities(person_id integer, next_contact date);

insert into people values (1, 'contacted'), (2, 'never contacted'), (3, 
'out of range'), (4, 'contacted');


insert into activities values (1, '01/31/19'), (3, '11/01/18'), (4, 
'02/02/19');


No activities record for person_id = 2, so the activities data is set to NULL:

SELECT  * FROM people AS p LEFT JOIN activities AS a ON p.person_id = 
a.person_id;

 person_id |    desc_fld     | person_id | next_contact
-----------+-----------------+-----------+--------------
         1 | contacted       |         1 | 2019-01-31
         2 | never contacted |      NULL | NULL
         3 | out of range    |         3 | 2018-11-01
         4 | contacted       |         4 | 2019-02-02

Close to your last posted query. person_id 2 and 3 have NULL values for
activities data, as there is no record for 2, and 3 is out of the date range:

 select
   p.person_id,
   p.desc_fld,
   a.next_contact
from
   people as p
   LEFT JOIN (
   SELECT
   DISTINCT ON (person_id)
   person_id,
   next_contact
   FROM
activities a
   WHERE
   a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
   a.next_contact is not null
   ) a USING (person_id)
;

 person_id |    desc_fld     | next_contact
-----------+-----------------+--------------
         1 | contacted       | 2019-01-31
         2 | never contacted | NULL
         3 | out of range    | NULL
         4 | contacted       | 2019-02-02
(4 rows)

What I think you want:

select
   p.person_id,
   p.desc_fld,
   a.next_contact
from
   people as p
   LEFT JOIN (
   SELECT
   DISTINCT ON (person_id)
   person_id,
   next_contact
   FROM
activities a
   WHERE
   a.next_contact <= 'today' and a.next_contact > '2018-12-31' and
   a.next_contact is not null
   ) a USING (person_id)
WHERE next_contact is not null;

 person_id | desc_fld  | next_contact
-----------+-----------+--------------
         1 | contacted | 2019-01-31
         4 | contacted | 2019-02-02







Thanks,

Rich





--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Subquery to select max(date) value

2019-02-13 Thread Andrew Gierth
> "Adrian" == Adrian Klaver  writes:

 Adrian> Close to your last posted query. person_id 2 and 3 have NULL
 Adrian> values for activities data, as there is no record for 2, and 3 is
 Adrian> out of the date range:

 Adrian>  select
 Adrian>p.person_id,
 Adrian>p.desc_fld,
 Adrian>a.next_contact
 Adrian> from
 Adrian>people as p
 Adrian>LEFT JOIN (
 Adrian>SELECT
 Adrian>DISTINCT ON (person_id)
 [...]
 Adrian>) a USING (person_id)
 Adrian> ;

DISTINCT ON with no matching ORDER BY at the _same_ query level is
non-deterministic.

Also DISTINCT ON isn't efficient. Consider instead something along the
lines of:

select p.*,
   a.* -- for illustration
  from people p
   join lateral (select *
   from activities a1
  where a1.person_id = p.person_id
and a1.next_contact > '2018-12-31'
and a1.next_contact <= 'today'
  order by a1.next_contact desc
  limit 1) a
 on true;

(make sure to have an index on activities(person_id,next_contact))

-- 
Andrew (irc:RhodiumToad)



Postgrest over foreign data wrapper

2019-02-13 Thread adrien ruffie
My colleague and I have a tricky problem.
We have two databases, db1 and db2, linked by a foreign data wrapper connection:
one table "contractline" in db1 and one table "contract" in db2.

We use PostgREST in order to query db2 via curl.
But we want to add a link between the previous tables.

In db2 we have a foreign table ft_contractline
example:

    Column    |           Type           | Collation | Nullable | Default | FDW options | Storage  | Stats target | Description
--------------+--------------------------+-----------+----------+---------+-------------+----------+--------------+-------------
 id           | character varying(1024)  |           | not null |         |             | extended |              |
 ccode        | text                     |           |          |         |             | extended |              |
 status       | text                     |           |          |         |             | extended |              |

We want to add for example, the following constraint:
ALTER TABLE contract ADD CONSTRAINT contractline_ft_contract_fkey FOREIGN KEY 
(contractid) REFERENCES ft_contractline(ccode);

in order to use the following query (via CURL on db2):
SELECT c.name, c.id FROM contract c JOIN 
ft_contractline ft_c ON c.id = ft_c.ccode WHERE c.type = 'business'

but as we saw, it isn't possible to add a foreign key on the 'contract' table of
db2 referencing the 'ft_contractline' foreign table ...

Do you know a way to do it?

Thank all and best regards.

Adrien


Re: Postgrest over foreign data wrapper

2019-02-13 Thread Adrian Klaver

On 2/13/19 1:04 PM, adrien ruffie wrote:

My colleague and I have a tricky problem.
We have two databases, db1 and db2, linked by a foreign data wrapper
connection.
1 table "contractline" in db1 and "contract" in db2.

We use postgrest in order to request db2 via CURL.
But we want to add a link between previous tables.

In db2 we have a foreign table ft_contractline
example:

     Column    |           Type           | Collation | Nullable |
Default | FDW options | Storage  | Stats target | Description

--+--+---+--+-+-+--+--+-
  id           | character varying(1024)  |           | not null |  
       |             | extended |              |
  ccode      | text                     |           |          |
     |             | extended |              |
  status       | text                     |           |          |  
       |             | extended |              |


We want to add for example, the following constraint:
ALTER TABLE contract ADD CONSTRAINT contractline_ft_contract_fkey
FOREIGN KEY (contractid) REFERENCES ft_contractline(ccode);

in order to use the following query (via CURL on db2):
SELECT c.name , c.id  FROM contract c
JOIN ft_contractline ft_c ON c.id  = ft_c.ccode WHERE
c.type = 'business'


You don't need a FK to use the above query, just to enforce referential 
integrity between the tables. Do you want RI between the tables?





but as we saw, it isn't possible to add a foreign key on the 'contract' table
of db2 referencing the 'ft_contractline' foreign table ...

Do you know a way to do it?

Thank all and best regards.

Adrien




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Postgrest over foreign data wrapper

2019-02-13 Thread Michael Lewis
You don't need an fkey to write a select statement with a join. I think I
must be missing something. Are you wanting it to enforce integrity across
the dblink? Or are you adding an fkey with the assumption that you will get
an index?


*Michael Lewis  |  Software Engineer*
*Entrata*
*c: **619.370.8697 <619-370-8697>*


On Wed, Feb 13, 2019 at 2:04 PM adrien ruffie 
wrote:

>> My colleague and I have a tricky problem.
>> We have two databases, db1 and db2, linked by a foreign data wrapper
>> connection.
>> 1 table "contractline" in db1 and "contract" in db2.
>>
>> We use postgrest in order to request db2 via CURL.
>> But we want to add a link between previous tables.
>>
>> In db2 we have a foreign table ft_contractline
>> example:
>>
>> Column|   Type   | Collation | Nullable | Default
>> | FDW options | Storage  | Stats target | Description
>>
>> --+--+---+--+-+-+--+--+-
>>  id   | character varying(1024)  |   | not null |
>> | | extended |  |
>>  ccode  | text |   |  | |
>> | extended |  |
>>  status   | text |   |  |
>> | | extended |  |
>>
>> We want to add for example, the following constraint:
>> ALTER TABLE contract ADD CONSTRAINT contractline_ft_contract_fkey FOREIGN
>> KEY (contractid) REFERENCES ft_contractline(ccode);
>>
>> in order to use the following query (via CURL on db2):
>> SELECT c.name, c.id FROM contract c JOIN ft_contractline ft_c ON c.id =
>> ft_c.ccode WHERE c.type = 'business'
>>
>> but as we saw, it isn't possible to add a foreign key on the 'contract' table of
>> db2 referencing the 'ft_contractline' foreign table ...
>>
>> Do you know a way to do it?
>>
>> Thank all and best regards.
>>
>> Adrien
>>
>


Re: Postgrest over foreign data wrapper

2019-02-13 Thread Adrian Klaver

On 2/13/19 1:36 PM, Michael Lewis wrote:
You don't need an fkey to write a select statement with a join. I think 
I must be missing something. Are you wanting it to enforce integrity 


Me too, until I looked at this:

http://postgrest.org/en/v5.2/api.html#resource-embedding

"PostgREST can also detect relations going through join tables. Thus you 
can request the Actors for Films (which in this case finds the 
information through Roles). You can also reverse the direction of 
inclusion, asking for all Directors with each including the list of 
their Films:


GET /directors?select=films(title,year) HTTP/1.1

Important

Whenever foreign key relations change in the database schema you must 
refresh PostgREST’s schema cache to allow resource embedding to work 
properly. See the section Schema Reloading.

"

across the dblink? Or are you adding an fkey with the assumption that 
you will get an index?

*Michael Lewis  |  Software Engineer*
*Entrata*
*c: **619.370.8697*


On Wed, Feb 13, 2019 at 2:04 PM adrien ruffie > wrote:


My colleague and I have a tricky problem.
We have two databases, db1 and db2, linked by a foreign data wrapper
connection.
1 table "contractline" in db1 and "contract" in db2.

We use postgrest in order to request db2 via CURL.
But we want to add a link between previous tables.

In db2 we have a foreign table ft_contractline
example:

     Column    |           Type           | Collation | Nullable
| Default | FDW options | Storage  | Stats target | Description

--+--+---+--+-+-+--+--+-
  id           | character varying(1024)  |           | not null
|         |             | extended |              |
  ccode      | text                     |           |          |
         |             | extended |              |
  status       | text                     |           |
  |         |             | extended |              |


We want to add for example, the following constraint:
ALTER TABLE contract ADD CONSTRAINT
contractline_ft_contract_fkey FOREIGN KEY (contractid)
REFERENCES ft_contractline(ccode);

in order to use the following query (via CURL on db2):
SELECT c.name , c.id  FROM contract
c JOIN ft_contractline ft_c ON c.id  = ft_c.ccode
WHERE c.type = 'business'

but as we saw, it isn't possible to add a foreign key on the 'contract'
table of db2 referencing the 'ft_contractline' foreign table ...

Do you know a way to do it?

Thank all and best regards.

Adrien




--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Postgrest over foreign data wrapper

2019-02-13 Thread Michael Lewis
Ah. I didn't realize Postgrest was something, rather than just a typo. An
fkey to a foreign table is not supported.

Related-
https://dba.stackexchange.com/questions/138591/foreign-key-references-constraint-on-postgresql-foreign-data-wrapper


*Michael Lewis  |  Software Engineer*
*Entrata*
*c: **619.370.8697 <619-370-8697>*
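
For reference, since a real FOREIGN KEY to a foreign table is rejected, a
commonly suggested workaround (also discussed at the link above) is to
approximate the check with a constraint trigger. A sketch using the names
from Adrien's example; note this is weaker than true referential integrity,
since changes on the remote side go unchecked:

CREATE OR REPLACE FUNCTION check_contractline_exists() RETURNS trigger AS $$
BEGIN
  IF NOT EXISTS (SELECT 1 FROM ft_contractline WHERE ccode = NEW.contractid) THEN
    RAISE EXCEPTION 'contractid % not found in ft_contractline', NEW.contractid;
  END IF;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER contract_contractid_check
  AFTER INSERT OR UPDATE OF contractid ON contract
  FOR EACH ROW EXECUTE PROCEDURE check_contractline_exists();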


On Wed, Feb 13, 2019 at 2:43 PM Adrian Klaver 
wrote:

> On 2/13/19 1:36 PM, Michael Lewis wrote:
> > You don't need an fkey to write a select statement with a join. I think
> > I must be missing something. Are you wanting it to enforce integrity
>
> Me to until I looked at this:
>
> http://postgrest.org/en/v5.2/api.html#resource-embedding
>
> "PostgREST can also detect relations going through join tables. Thus you
> can request the Actors for Films (which in this case finds the
> information through Roles). You can also reverse the direction of
> inclusion, asking for all Directors with each including the list of
> their Films:
>
> GET /directors?select=films(title,year) HTTP/1.1
>
> Important
>
> Whenever foreign key relations change in the database schema you must
> refresh PostgREST’s schema cache to allow resource embedding to work
> properly. See the section Schema Reloading.
> "
>
> > across the dblink? Or are you adding an fkey with the assumption that
> > you will get an index?
> > *Michael Lewis  |  Software Engineer*
> > *Entrata*
> > *c: **619.370.8697*
> >
> >
> > On Wed, Feb 13, 2019 at 2:04 PM adrien ruffie  > > wrote:
> >
> > My colleague and I have a tricky problem.
> > We have two databases, db1 and db2, linked by a foreign data wrapper
> > connection.
> > 1 table "contractline" in db1 and "contract" in db2.
> >
> > We use postgrest in order to request db2 via CURL.
> > But we want to add a link between previous tables.
> >
> > In db2 we have a foreign table ft_contractline
> > example:
> >
> >  Column|   Type   | Collation | Nullable
> > | Default | FDW options | Storage  | Stats target | Description
> >
>  
> --+--+---+--+-+-+--+--+-
> >   id   | character varying(1024)  |   | not null
> > | | | extended |  |
> >   ccode  | text |   |  |
> >  | | extended |  |
> >   status   | text |   |
> >   | | | extended |  |
> >
> > We want to add for example, the following constraint:
> > ALTER TABLE contract ADD CONSTRAINT
> > contractline_ft_contract_fkey FOREIGN KEY (contractid)
> > REFERENCES ft_contractline(ccode);
> >
> > in order to use the following query (via CURL on db2):
> > SELECT c.name , c.id  FROM contract
> > c JOIN ft_contractline ft_c ON c.id  = ft_c.ccode
> > WHERE c.type = 'business'
> >
> > but as we saw, it isn't possible to add a foreign key on the 'contract'
> > table of db2 referencing the 'ft_contractline' foreign table ...
> >
> > Do you know a way to do it?
> >
> > Thank all and best regards.
> >
> > Adrien
> >
>
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>


Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Andrew Gierth wrote:


Adrian> Close to your last posted query. person_id 2 and 3 have NULL
Adrian> values for activities data, as there is no record for 2, and 3 is
Adrian> out of the date range:



DISTINCT ON with no matching ORDER BY at the _same_ query level is
non-deterministic.

Also DISTINCT ON isn't efficient. Consider instead something along the
lines of:


Andrew/Adrian,

I again read about DISTINCT and DISTINCT ON and fully understand them. I've
also again read about JOINs; I understand them in terms of sets and _think_
that in this query the people table is the LEFT (many) while the
organizations and activities tables are the RIGHT (one) in the many-to-one
relationships. That is, for each person_id there is only one org_id and only
one next_contact that meets the three constraints.

I'm now working on understanding how the syntax in the examples you two,
Ken, and others have provided expresses the many-to-one relationships of
organization and activities to people. I have the syntax that returns the
next_date meeting the WHERE constraints to each person_id and am now
focusing on adding the additional people and organization columns to the
results. It might not be until tomorrow or Friday, but I'll let you and the list
subscribers know when I have understood all your suggestions and get the
results I want from the query.

Thanks again,

Rich




Streaming replication - invalid resource manager ID

2019-02-13 Thread Maeldron T.
I’m a bit stuck with a few LOG messages. The one in the subject disturbs me
the most.

(FreeBSD 11.2, PostgreSQL 10.6)

When I restart the hot standby, sometimes I see

2019-02-13 21:17:31 CET LOG:  redo starts at 0/488A2748
2019-02-13 21:17:31 CET LOG:  invalid record length at 0/488A2780: wanted
24, got 0
2019-02-13 21:17:31 CET LOG:  database system is ready to accept read only
connections
2019-02-13 21:17:31 CET LOG:  started streaming WAL from primary at
0/4800 on timeline 1

I try to believe it is harmless. I found two emails in the archive saying
that this is harmless, but they were from many years ago. Is it really
harmless? (Without PANIC, etc.) The main reason I believe it's harmless is
that I often see it.

The one I see less often is: "invalid resource manager ID"

The last time I encountered it when I made the standby the new master, and
the original master the new standby. Steps:

1. Stop the master (service postgresql stop)

Log from the original master:
2019-02-13 20:19:12 CET  LOG:  received fast shutdown request
2019-02-13 20:19:12 CET  LOG:  aborting any active transactions
2019-02-13 20:19:12 CET  LOG:  worker process: logical replication launcher
(PID 17828) exited with exit code 1
2019-02-13 20:19:12 CET  LOG:  shutting down
2019-02-13 20:19:16 CET  LOG:  database system is shut down

Log from the new master (it was the standby):
2019-02-13 20:19:16 CET  LOG:  replication terminated by primary server
2019-02-13 20:19:16 CET  DETAIL:  End of WAL reached on timeline 3 at
8F42/645FBD00.
2019-02-13 20:19:16 CET  FATAL:  could not send end-of-streaming message to
primary: no COPY in progress
2019-02-13 20:19:16 CET  LOG:  invalid record length at 8F42/645FBD00:
wanted 24, got 0
2019-02-13 20:19:16 CET  FATAL:  could not connect to the primary server:
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.

2. pg_ctl -D the_whole_galaxy promote

Log from the new master (was slave):
2019-02-13 20:19:31 CET  LOG:  received promote request
2019-02-13 20:19:31 CET  FATAL:  terminating walreceiver process due to
administrator command
2019-02-13 20:19:31 CET  LOG:  redo done at 8F42/645FBC90
2019-02-13 20:19:31 CET  LOG:  last completed transaction was at log time
2019-02-13 20:19:12.354569+01
2019-02-13 20:19:31 CET  LOG:  selected new timeline ID: 4
2019-02-13 20:19:31 CET  LOG:  archive recovery complete
2019-02-13 20:19:31 CET  LOG:  database system is ready to accept
connections

3. Start the old master after putting there a recovery.conf to make it the
new standby

Log from the original master:
2019-02-13 20:23:22 CET  LOG:  database system was shut down at 2019-02-13
20:19:16 CET
2019-02-13 20:23:22 CET  LOG:  entering standby mode
2019-02-13 20:23:22 CET  LOG:  consistent recovery state reached at
8F42/645FBD00
2019-02-13 20:23:22 CET  LOG:  invalid record length at 8F42/645FBD00:
wanted 24, got 0
2019-02-13 20:23:22 CET  LOG:  database system is ready to accept read only
connections
2019-02-13 20:23:22 CET  LOG:  fetching timeline history file for timeline
4 from primary server
2019-02-13 20:23:22 CET  LOG:  started streaming WAL from primary at
8F42/6400 on timeline 3
2019-02-13 20:23:22 CET  LOG:  replication terminated by primary server
2019-02-13 20:23:22 CET  DETAIL:  End of WAL reached on timeline 3 at
8F42/645FBD00.
2019-02-13 20:23:22 CET  LOG:  new target timeline is 4
2019-02-13 20:23:22 CET  LOG:  restarted WAL streaming at 8F42/6400 on
timeline 4
2019-02-13 20:23:22 CET  LOG:  redo starts at 8F42/645FBD00

So far so good (except for the invalid record length, but it’s only LOG).

4. A bit later I shut down the original master server itself (which runs the
now-standby PG cluster) due to VM configuration. It takes a few minutes. I
start the server.

Log from the original master:
2019-02-13 20:33:39 CET  LOG:  received fast shutdown request
2019-02-13 20:33:39 CET  LOG:  aborting any active transactions
2019-02-13 20:33:39 CET  FATAL:  terminating walreceiver process due to
administrator command
2019-02-13 20:33:39 CET  LOG:  shutting down
2019-02-13 20:33:40 CET  LOG:  database system is shut down
2019-02-13 20:38:45 CET  LOG:  database system was shut down in recovery at
2019-02-13 20:33:39 CET
2019-02-13 20:38:45 CET  LOG:  entering standby mode
2019-02-13 20:38:45 CET  LOG:  redo starts at 8F42/E7DC51B8
2019-02-13 20:38:47 CET  WARNING:  could not open directory
"base/26083437": No such file or directory
2019-02-13 20:38:47 CET  CONTEXT:  WAL redo at 8F42/EDE3E720 for
Database/DROP: dir 26083437/1663
2019-02-13 20:38:47 CET  WARNING:  some useless files may be left behind in
old database directory "base/26083437"
2019-02-13 20:38:47 CET  CONTEXT:  WAL redo at 8F42/EDE3E720 for
Database/DROP: dir 26083437/1663

I don't like these because they are WARNINGs. I believe they might be there
because sometimes I check the replication with a createdb something (on
master) and psql -l (on standby);
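
A less invasive way to check that the standby is caught up than creating
throwaway databases, sketched for PostgreSQL 10:

-- On the primary:
SELECT application_name, state, sent_lsn, replay_lsn,
       pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes
FROM pg_stat_replication;

-- On the standby:
SELECT pg_is_in_recovery(), pg_last_wal_replay_lsn();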

Re: Subquery to select max(date) value

2019-02-13 Thread Adrian Klaver

On 2/13/19 2:24 PM, Rich Shepard wrote:

On Wed, 13 Feb 2019, Andrew Gierth wrote:


Adrian> Close to your last posted query. person_id 2 and 3 have NULL
Adrian> values for activities data, as there is no record for 2, and 3 is
Adrian> out of the date range:



DISTINCT ON with no matching ORDER BY at the _same_ query level is
non-deterministic.

Also DISTINCT ON isn't efficient. Consider instead something along the
lines of:


Andrew/Adrian,

I again read about DISTINCT and DISTINCT ON and fully understand them. I've
also again read about JOINs; I understand them in terms of sets and _think_
that in this query the people table is the LEFT (many) while the
organizations and activities tables are the RIGHT (one) in the many-to-one
relationships. That is, for each person_id there is only one org_id and only
one next_contact that meets the three constraints.


Given a sufficiently large date range that may not be true as you may 
have contacted a given person multiple times during that range and 
generated multiple activities records.




I'm now working on understanding how the syntax in the examples you two,
Ken, and others have provided expresses the many-to-one relationships of
organization and activities to people. I have the syntax that returns the
next_date meeting the WHERE constraints to each person_id and am now
focusing on adding the additional people and organization columns to the
results. Might not be until tomorrow or Friday but I'll let you and the 
list

subscribes know when I have understood all your suggestions and get the
results I want from the query.

Thanks again,

Rich






--
Adrian Klaver
adrian.kla...@aklaver.com



Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Adrian Klaver wrote:


Given a sufficiently large date range that may not be true as you may have
contacted a given person multiple times during that range and generated
multiple activities records.


Adrian,

This is true as it has occurred. I want only the most recent activity row
associated with that person_id. (NB: while I'm really comfortable with DDL
statements my DML experience is seriously lacking and that's what I need to
improve now.)

I've just read a couple of blog posts on the LATERAL join added in 9.3 and
understand it in theory. Properly applying it to my application is now my
focus (and I need to re-read Andrew's example very closely.)

Best regards,

Rich



Re: SV: SV: Implementing pgaudit extension on Microsoft Windows

2019-02-13 Thread Joe Conway
On 2/13/19 9:17 AM, Joe Conway wrote:
> On 2/13/19 3:32 AM, Niels Jespersen wrote:
>> Dive into the source for pgaudit and try to understand how it works,
>> perhaps getting a small example extension working along the way. 
>> Perhaps contacting the authors of pgaudit to have them confirm that
>> there exist working implementations of pgaudit on Windows (and if
>> so, whether they are on PostgreSQL 11).
> 
> pgaudit definitely *should* work on Windows. If you can post your other
> log related conf settings I will fire up a Windows VM and try to
> duplicate the issue.

I just compiled pgaudit on a Win10 VM and loaded it into a Postgres 11
Win64 build downloaded from the PGDG community download. Works perfectly
for me. Will definitely need to understand more details about your
installation. Probably not appropriate to do that on this list, so
please submit an issue on github.com/pgaudit/pgaudit

Thanks,

Joe


-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: Subquery to select max(date) value

2019-02-13 Thread Rich Shepard

On Wed, 13 Feb 2019, Brent Wood wrote:


I have not really followed this thread, but would not a query along the lines of

select * from activity
where person_id = n
  and timestamp = (select max(timestamp) from activity where person_id = n);

give the required answer, i.e., always return the latest result for the
specified person_id?


Brent,

I don't know. What does work is this statement:

SELECT
DISTINCT ON (person_id) person_id,
next_contact
FROM activities AS a
WHERE a.next_contact is not null and a.next_contact <= 'today' and
  a.next_contact > '2018-12-31'
ORDER BY person_id,next_contact;

which returns these results:

 person_id | next_contact 
-----------+--------------
         1 | 2019-01-14
         4 | 2019-01-14
        22 | 2019-01-14
        36 | 2019-01-03
        37 | 2019-01-14
        38 | 2019-01-21
        40 | 2019-02-11
        41 | 2019-02-11
        42 | 2019-02-11
        43 | 2019-02-11
        44 | 2019-02-11
        45 | 2019-02-11
        46 | 2019-02-11
(13 rows)

Now I'm learning how to join the people and organization table using LATERAL
join(s) so the results include names and phone numbers.

Thanks for the suggestion,

Rich
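
For what it's worth, a sketch of where this seems headed, combining Jan's join
columns with Andrew's LATERAL pattern (column names are taken from earlier
messages in the thread and may not match the real schema):

SELECT p.person_id, p.lname, p.fname, p.direct_phone,
       o.org_name, a.next_contact
FROM people p
JOIN organizations o ON p.org_id = o.org_id
JOIN LATERAL (
    SELECT a1.next_contact
    FROM activities a1
    WHERE a1.person_id = p.person_id
      AND a1.next_contact > '2018-12-31'
      AND a1.next_contact <= 'today'
    ORDER BY a1.next_contact DESC
    LIMIT 1
) a ON true;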