Re: [GENERAL] a JOIN to a VIEW seems slow

2017-10-05 Thread Frank Millman

On 5 Oct 2017, at 8:20 AM, Frank Millman wrote:

> If anyone wants to take this further, maybe this is a good place to start.

I should have re-stated the reason for my original post.

Exactly the same query, on exactly the same data, takes 1.8 seconds on SQL 
Server, 1.0 seconds on SQLite3, and 1607 seconds (almost 27 minutes) on PostgreSQL 
9.4.4.

Frank
 

[GENERAL] delete a file every time pg server starts/crashes

2017-10-05 Thread athinivas
Hi, 

I have a requirement to delete a file on the system whenever the pg server is
started or crashes. Any ideas?

Thanks,
Athi 







Re: [GENERAL] delete a file every time pg server starts/crashes

2017-10-05 Thread Melvin Davidson
On Thu, Oct 5, 2017 at 10:04 AM, athinivas  wrote:

> Hi,
>
> I have a requirement to delete a file on the system whenever the pg server is
> started or crashes. Any ideas?
>
> Thanks,
> Athi

It would be nice to know your operating system and PostgreSQL version (e.g. the output of SELECT version()).

That being said, have you looked at the documentation for
"shared_preload_libraries"?

https://www.postgresql.org/docs/9.6/static/runtime-config-client.html
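
The idea behind that route would be a tiny C module whose _PG_init() runs when 
the postmaster loads it at server start; a minimal sketch (the file path is 
hypothetical, and this only covers startup, not crashes):

    /* rm_flag.c -- build with PGXS and list the library in shared_preload_libraries */
    #include "postgres.h"
    #include "fmgr.h"

    #include <unistd.h>

    PG_MODULE_MAGIC;

    void _PG_init(void);

    void
    _PG_init(void)
    {
        /* runs when the postmaster loads the library at server start */
        unlink("/var/run/myapp/pg.flag");   /* hypothetical file to remove */
    }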

-- 
*Melvin Davidson*
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.


Re: [GENERAL] delete a file every time pg server starts/crashes

2017-10-05 Thread Ray Cote
>
> On Thu, Oct 5, 2017 at 10:04 AM, athinivas  wrote:
>
>> Hi,
>>
>> I have a requirement to delete a file on the system whenever the pg server is
>> started or crashes. Any ideas?
>>
>> Thanks,
>> Athi
>>
>>
If you’re running on Linux you can modify the init.d (or service) file and
add a line to delete the file.
Removing the file on a crash will likely require some sort of ongoing monitoring
service that notices the process is no longer running.
If you’re already using a service (such as monit) that brings the
application back up after a crash, then adding a line to the init.d file may
be sufficient.
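
As a rough sketch of that approach, assuming a systemd-managed PostgreSQL 
service (the drop-in name and flag-file path below are hypothetical), a drop-in 
override can remove the file around every start and stop; ExecStopPost also 
runs after the main process crashes, which covers the crash case:

    # /etc/systemd/system/postgresql.service.d/cleanup.conf  (hypothetical drop-in)
    [Service]
    # "-" prefix: don't fail the unit if the file is already gone
    ExecStartPre=-/bin/rm -f /var/run/myapp/pg.flag
    # runs whenever the service stops, including after a crash
    ExecStopPost=-/bin/rm -f /var/run/myapp/pg.flag

followed by systemctl daemon-reload to pick up the drop-in.
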
—Ray

--
Raymond Cote, President
voice: +1.603.924.6079 email: rgac...@appropriatesolutions.com skype:
ray.cote
Schedule a meeting: https://calendly.com/ray_cote/60min/


[GENERAL] Postgresql CDC tool recommendations ?

2017-10-05 Thread avi Singh
Guys,
 Any recommendations for a good CDC tool that can be used to push
PostgreSQL changes to Kafka in JSON format?

Thanks
Avi


Re: [GENERAL] Postgresql CDC tool recommendations ?

2017-10-05 Thread Steve Atkins

> On Oct 5, 2017, at 10:28 AM, avi Singh  wrote:
> 
> Guys,
>  Any recommendations for a good CDC tool that can be used to push 
> PostgreSQL changes to Kafka in JSON format?

Not sure whether JSON is a constraint, but I'd look at http://debezium.io and 
(maybe) the no-longer-supported https://github.com/confluentinc/bottledwater-pg
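
For illustration only, registering a Debezium PostgreSQL connector through 
Kafka Connect looks roughly like the following; the hostname, credentials, 
server name and logical decoding plugin are placeholders, and the exact 
property names should be checked against the Debezium documentation for your 
version:

    {
      "name": "pg-cdc-connector",
      "config": {
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "dbhost",
        "database.port": "5432",
        "database.user": "cdc_user",
        "database.password": "secret",
        "database.dbname": "mydb",
        "database.server.name": "myserver",
        "plugin.name": "wal2json"
      }
    }

With the default Kafka Connect JSON converter, the change events land in Kafka 
as JSON.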

Cheers,
  Steve



Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Vladimir Nicolici
Some further updates about the issue.

I did a bit of benchmarking on the disk system with iozone, and during the 
test the SSDs seemed to be able to easily sustain 200 MB/second of writes each; 
they fluctuated between 200 MB/s and 400 MB/s when doing 96 GB of random writes 
in a file. That would mean between 400 and 800 MB/s for the entire RAID volume, 
since it’s 1+0 and has 4 SSDs, 2 in each mirror.

I wasn’t able to benchmark reads properly because the machine has so much RAM 
that reads are served mostly from cache. But reads shouldn’t be an issue anyway. I’ll 
try to do more comprehensive tests tomorrow but, from what I’ve seen so far, 
SSD I/O limits shouldn’t be a concern.

I changed some configuration parameters during the night to the values I was 
considering yesterday:

- shared_buffers = 144GB #previously 96 GB
- bgwriter_lru_maxpages = 100  #previously 400
- checkpoint_timeout = 30min  #previously 5min
- checkpoint_completion_target = 0.83 #previously 0.85; 0.83 means 25 minutes 
of writes out of 30 minutes.
- max_wal_size = 96GB #previously 16GB
- wal_buffers = 16MB  #previously 32 MB

With the new settings the checkpoints now finish on time, more or less. One 
recent checkpoint looked like this:

2017-10-05 14:16:22.891 EDT [7828] LOG:  checkpoint starting: time
2017-10-05 14:42:35.429 EDT [7828] LOG:  checkpoint complete: wrote 4770679 
buffers (25.3%); 0 transaction log file(s) added, 0 removed, 2088 recycled; 
write=1501.567 s, sync=1.844 s, total=1572.538 s; sync files=750, longest=0.029 
s, average=0.002 s; distance=33579763 kB, estimate=35950738 kB

So it took 1502 seconds for writes, instead of the configured 1494; that’s 
close enough, just 6 seconds more. Sync was another 2 seconds, so 1504 seconds of 
write + sync. However, the total is 1573 seconds, 69 seconds more. Not sure what the 
checkpoint process does in those 69 seconds.
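
For reference, the 1494-second figure follows directly from the settings above:

    checkpoint_timeout * checkpoint_completion_target = 1800 s * 0.83 = 1494 s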

Looking further back at the logs with the checkpoints from today, the first 
checkpoint of the day that did some significant work didn’t have such a large 
difference between write time and total time; it had write + sync = 1494 seconds 
and a total of 1495, just 1 second of difference:

2017-10-05 09:16:22.851 EDT [7828] LOG:  checkpoint starting: time
2017-10-05 09:41:18.139 EDT [7828] LOG:  checkpoint complete: wrote 4445573 
buffers (23.6%); 0 transaction log file(s) added, 0 removed, 98 recycled; 
write=1493.322 s, sync=0.575 s, total=1495.287 s; sync files=734, longest=0.022 
s, average=0.000 s; distance=29166105 kB, estimate=29166105 kB

The difference is that it didn’t need to recycle as many WAL segments as the later 
one. Not sure what exactly the recycling does, if anything, but it seems to add some 
overhead to the process. The checkpoint with the 69-second delay recycled 2088 
WAL segments, the one with the 1-second delay recycled just 98.

If the overhead is indeed caused by recycling, I’m not sure if it’s 
proportional to the size of the recycled segments, or to the number of 
recycled segments. If it’s the number of segments, then maybe a WAL segment 
size larger than 16 MB would help with this issue, but unfortunately for the 
time being that can only be configured at compile time and forces you to 
recreate the cluster as well.
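
For reference, that build-time option looks roughly like this for a source 
build (after which the cluster has to be recreated with initdb):

    ./configure --with-wal-segsize=64   # WAL segment size in MB, must be a power of 2
    make && make install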

For tomorrow I plan to reduce bgwriter_lru_maxpages from 100 to 0 and rely just 
on checkpoints for datafile writes, and reduce checkpoint_completion_target 
from 0.83 to 0.66 to try to make the checkpoints finish the writes in 20 minutes 
instead of 25 minutes, out of 30 minutes.

Thanks,
Vlad


Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Igor Polishchuk
Vladimir,
Just curious, if your goal is to reduce checkpoint overhead, shouldn’t you 
decrease shared_buffers instead of increasing it? 
With bigger shared_buffers, you can accumulate more dirty buffers for the 
checkpoint to take care of. 
I remember that in early versions (around 8.4), when checkpoint_completion_target 
was not available, one suggested way of fighting heavy checkpoints was to set 
shared_buffers very low.
Also, why do you need to reduce your checkpoint write time to 20 minutes from 
25? What will you gain? If you have the same number of dirty buffers to 
flush, your IO intensity and overhead will increase during those 20 minutes.
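
One way to actually watch how many dirty buffers accumulate between checkpoints, 
assuming the pg_buffercache contrib extension is installed, is something like:

    CREATE EXTENSION IF NOT EXISTS pg_buffercache;
    -- number of currently dirty buffers held in shared_buffers
    SELECT count(*) AS dirty_buffers FROM pg_buffercache WHERE isdirty;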

Igor

  



Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Vladimir Nicolici
My primary goal is to determine the current level of utilization of the existing 
hardware + software combination, and how much more load it can handle.

It’s quite likely that the volume of work performed by that database will need to 
increase by 50% or more over the next months, and I don’t want to risk hitting 
a performance wall unexpectedly when that happens; I want to plan in advance 
for more hardware or other solutions in case the current hardware and software 
combination is inadequate.

My secondary goal is to determine optimal settings for best performance, and 
learn more about how postgres works.

From the point of view of the applications using the database, the system 
doesn’t currently experience any performance issues. Even when the checkpoints 
didn’t finish in the default configured 5 minutes, the database performed 
acceptably.
 
About the shared_buffers, decreasing them would indeed make the database keep 
fewer dirty buffers, but that would not reduce the overall number of writes to 
the disk. It would just shift more of the responsibility for the writes towards 
the bgwriter process instead of the checkpoint process, and I don’t see any 
advantage in doing that.

Actually, reducing the shared buffers may even increase the number of writes to 
the datafiles, because according to the documentation the bgwriter process can 
write the same buffer multiple times during a checkpoint interval, and the 
checkpoint would only write it once. That’s why I want to completely disable 
the bgwriter.
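
The split between checkpoint, bgwriter and backend writes can be checked in the 
pg_stat_bgwriter view, for example:

    SELECT buffers_checkpoint,  -- buffers written by the checkpointer
           buffers_clean,       -- buffers written by the background writer
           buffers_backend,     -- buffers written directly by backends
           maxwritten_clean     -- times the bgwriter stopped at lru_maxpages
    FROM pg_stat_bgwriter;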

About reducing the checkpoint write time from 25 minutes to 20 minutes, you are 
correct that the same number of buffers will need to be flushed and that it 
will increase the I/O intensity and put more stress on the disk system in that 
interval. That’s exactly what I want.

I want to see if the hardware is powerful enough to complete the checkpoint in 
just 20 minutes. This will prove there is some headroom and the database can 
handle more load. If that works fine, I may even try 15 minutes, to see how it 
behaves.

Since it’s a production system I have to be a bit careful; downtime during the 
day can be quite expensive, but at the same time hitting a performance 
wall unexpectedly in the future will be even more expensive.

After I finish the tests and reach a conclusion about the capabilities of this 
hardware + software combination, I will probably set checkpoint_completion_target 
to something like 0.90, so that it distributes the writes over 27 minutes.

Thanks,
Vlad

From: Igor Polishchuk
Sent: Friday, October 6, 2017 02:56
To: Vladimir Nicolici
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time


Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Andres Freund
Hi,

On 2017-10-05 22:58:31 +0300, Vladimir Nicolici wrote:
> I changed some configuration parameters during the night to the values I was 
> considering yesterday:
> 
> - shared_buffers = 144GB #previously 96 GB
> - bgwriter_lru_maxpages = 100  #previously 400
> - checkpoint_timeout = 30min  #previously 5min
> - checkpoint_completion_target = 0.83 #previously 0.85; 0.83 means 25 minutes 
> writes out of 30 minutes.
> - max_wal_size = 96GB #previously 16GB

Which version of postgres is this?

> - wal_buffers = 16MB  #previously 32 MB

That seems quite unlikely to help.

You might want to also try enabling wal_compression; sometimes the WAL
volume is a considerable problem.
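
For example, something along these lines (wal_compression can be changed 
without a restart):

    ALTER SYSTEM SET wal_compression = on;
    SELECT pg_reload_conf();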

I'd suggest reporting some "pidstat -dl 1" output, so we can see which
processes are doing how much IO.

Regards,

Andres





Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Vladimir Nicolici
Hello, it’s postgres 9.6. I know reducing wal_buffers from 32 MB to 16 MB is 
unlikely to help, but according to the documentation values larger than 16 MB 
are unlikely to help either, at least at the default WAL segment size, so I 
decided to go with 16 MB.

I will probably try the compression on Monday or Tuesday; I can only experiment 
with a single set of changes in a day, and I plan to test something else 
tomorrow.

Thanks for the suggestions, and sorry for the reply style, but my mail client 
is not best suited for replying inline to individual points.

From: Andres Freund
Sent: Friday, October 6, 2017 04:51
To: Vladimir Nicolici
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time




Re: [GENERAL] Strange checkpoint behavior - checkpoints take a long time

2017-10-05 Thread Andres Freund
Hi,

On 2017-10-06 05:53:39 +0300, Vladimir Nicolici wrote:
> Hello, it’s postgres 9.6.

Consider setting checkpoint_flush_after to 16MB or something large like that.
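
A sketch of applying such a setting; note that the 9.6 documentation gives a 
much lower maximum for checkpoint_flush_after (2MB), so check the allowed range 
for your version before picking a value:

    ALTER SYSTEM SET checkpoint_flush_after = '2MB';
    SELECT pg_reload_conf();   -- checkpoint_flush_after is reloadable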


> I will probably try the compression on Monday or Tuesday, I can only
> experiment with a single set of changes in a day, and I plan to test
> something else tomorrow.
> 
> Thanks for the suggestions, and sorry for the reply style, but my mail
> client is not best suited for replying inline to individual points.

You should consider getting a new mail client then...

- Andres

