Replication using PGLogical

2018-06-18 Thread Abhinav Singh
Hello All,

I am currently using PostgreSQL Community version 10.3, and on this instance
I am doing logical replication (using the *PGLOGICAL 2.2* plugin). It is
working perfectly fine for my PostgreSQL-to-PostgreSQL replication use case.

The use case here is that I want to utilize the plugin to send data to a
different application that consumes it via this replication methodology.
The reason for this is the replication_sets concept, which attaches tables to
the slot. If I understand correctly, the replication slot will decode
everything from the WAL, but it will not forward all the transactions it
decodes, only the changes for the tables in the replication sets of this
slot, which is what I am looking for.
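
For reference, what I have in mind is roughly the following (just a sketch;
the set name and table are placeholders, and I am assuming pglogical 2.x with
a provider node already created):

-- create a dedicated replication set and attach only the tables I want to expose
SELECT pglogical.create_replication_set(set_name := 'app_feed');
SELECT pglogical.replication_set_add_table(
    set_name := 'app_feed',
    relation := 'public.orders'::regclass,
    synchronize_data := false);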

I did look at the following code base:

https://gist.github.com/ringerc/f74a12e430866ccd9227

And the presentation:

https://www.slideshare.net/UmairShahid16/logical-replication-with-pglogical

But when I try to do *peek_changes* on my own slot, I do not see the actual
transactions as mentioned here; I just see BEGIN, COMMIT and the LSN position
changing, and not the actual row changes.
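
For completeness, this is the kind of call I have been testing (a sketch only;
I am assuming the slot was created with the pglogical_output plugin, so I use
the binary variant and pass the startup parameters that also appear in the log
below, though I am not certain the SQL interface is meant to be used this way):

SELECT *
FROM pg_logical_slot_peek_binary_changes(
       'demo_slot', NULL, NULL,
       'startup_params_format', '1',
       'min_proto_version', '1',
       'max_proto_version', '1',
       'expected_encoding', 'UTF8');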

Additionally, after setting log_min_duration_statement to 0, I see that the
following SQL commands are executed on the publisher end when I create the
subscriber:

--
2018-06-14 07:03:06.499 UTC [16696] DEBUG:  received replication command:
CREATE_REPLICATION_SLOT "pgl_postgres_publisher_subscriptionter" LOGICAL
pglogical_output
2018-06-14 07:03:06.509 UTC [16696] DEBUG:  snapshot of 0+0 running
transaction ids (lsn 3D1/FF49B928 oldest xid 121331 latest complete 121330
next xid 121331)
*
2018-06-14 07:11:23.902 UTC [18490] DEBUG:  received replication command:
IDENTIFY_SYSTEM
2018-06-14 07:11:23.903 UTC [18490] DEBUG:  received replication command:
START_REPLICATION SLOT "pgl_postgres_publisher_subscriptionter" LOGICAL
3D1/FF4A2F28 (expected_encoding 'UTF8', min_proto_version '1',
max_proto_version '1', startup_params_format '1',
"binary.want_internal_basetypes" '1', "binary.want_binary_basetypes" '1',
"binary.basetypes_major_version" '1000', "binary.sizeof_datum" '8',
"binary.sizeof_int" '4', "binary.sizeof_long" '8', "binary.bigendian" '0',
"binary.float4_byval" '1', "binary.float8_byval" '1',
"binary.integer_datetimes" '0', "hooks.setup_function"
'pglogical.pglogical_hooks_setup', "pglogical.forward_origins"
'"all"', *"pglogical.replication_set_names"
'sets'*, "relmeta_cache_size" '-1', pg_version '13', pglogical_version
'2.2.0', pglogical_version_num '20200', pglogical_apply_pid '16759')
--

The reason I am mentioning this is that, in the above command, I can see the
replication sets also attached to the slot when replication is initiated.
But when I try to execute the same command on my end, I see this error:

-
postgres=# START_REPLICATION SLOT "demo_slot" LOGICAL 3D1/FF5F51C0
(expected_encoding 'UTF8', min_proto_version _format '1',
"binary.want_internal_basetypes" '1', "binary.want_binary_basetypes" '1',
"binary.basetypes_major_v "binary.sizeof_int" '4', "binary.sizeof_long"
'8', "binary.bigendian" '0', "binary.float4_byval" '1', "binary.fs" '0',
"hooks.setup_function" 'pglogical.pglogical_hooks_setup',
"pglogical.forward_origins" '"all"', "pglogicacache_size" '-1', pg_version
'13', pglogical_version '2.2.0', pglogical_version_num '20200',
pglogical_apply
*unexpected PQresultStatus: 8*
-

Hence, if someone can please guide me here on the following, so I can achieve
my goal:
=> Using pglogical and replication sets for sending data to other
applications
=> Why is peek_changes not showing the transactions? The same thing can be
seen for *pgoutput* as well.

Please review and share your comments.

Thanks and Regards,
Abhinav Singh


pgbench on pglogical: bandwidth usage

2018-06-18 Thread Fabio Pardi
Hi,

I'm researching pglogical, performing tests to see the impact on the network 
traffic, in comparison with streaming replication.

I configured one provider and one subscriber, both running on a VM, both on 
Postgres 9.6.3 and the latest pglogical 2.2.0. 

For comparison, I also have one master and one hot standby, running on the same 
VMs and on the same Postgres version.

I performed 2 different kinds of tests: one using pgbench and another manually 
generating data. But there is something I do not understand when it comes to 
drawing conclusions. Therefore I would appreciate your help.


=Test 1=

I create a schema/table called:

my_replicated_schema.my_first_replicated_table

I ask the provider to add it to the replication set using:

SELECT pglogical.replication_set_add_all_tables('default', 
ARRAY['my_replicated_schema']);

Then I generate 50 million + 1 records in the table:

insert into my_replicated_schema.my_first_replicated_table values 
(generate_series(0,50000000));


At this stage I can check how much traffic was generated to pass the records to 
the other node.

Logical replication costs:

 RX bytes:10271325 (9.7 MiB)  TX bytes:2465344586 (2.2 GiB)



While streaming replication:

 RX bytes:23058328 (21.9 MiB)  TX bytes:7502525286 (6.9 GiB)




Conclusion:

Pglogical is cheaper, bandwidth-wise.

Nice.



= Test 2=

Same configuration used to test pglogical during test 1. Accidentally, I did not 
change the replication set, so it was still using only: 
'my_replicated_schema.my_first_replicated_table'

pgbench instead writes to the 'public' schema.


I then initialize the pgbench tables on the provider, using:

pgbench -i 

which results in:

 \dt public.
                List of relations
 Schema |       Name       | Type  |  Owner
--------+------------------+-------+----------
 public | pgbench_accounts | table | postgres
 public | pgbench_branches | table | postgres
 public | pgbench_history  | table | postgres
 public | pgbench_tellers  | table | postgres
(4 rows)


I then run pgbench on the provider:


pgbench -c 3 -P 10  -r -T 200 -R 1000


And to my big surprise, I can then see that an amount of traffic was generated:

  RX bytes:35500329 (33.8 MiB)  TX bytes:66376459 (63.3 MiB)

This is very strange to me. Running similar tests, where on the provider I 
manually push data into a table which is not in the replication set, no traffic 
was generated.

There must be an explanation for what is happening, and I think it is more 
related to pgbench than to pglogical, but I cannot find it. Do you have any 
pointers?
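
(For the record, to cross-check the network numbers from the database side I
have been using queries along these lines, with the 9.6 function names; just a
sketch:)

-- WAL bytes generated so far, measured from an arbitrary reference point
SELECT pg_xlog_location_diff(pg_current_xlog_location(), '0/0') AS wal_bytes;

-- how far each walsender (pglogical or streaming) has sent and replayed
SELECT application_name, sent_location, replay_location
FROM pg_stat_replication;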



= Test 3 =

The provider is replicating the public schema, and the subscriber is receiving it.

I then run pgbench the same way as above:

pgbench -c 3 -P 10  -r -T 200 -R 1000


But I get results that contradict test 1.

Pglogical replication uses:


 RX bytes:69783187 (66.5 MiB)  TX bytes:371664060 (354.4 MiB)


While streaming replication:


  RX bytes:86286353 (82.2 MiB)  TX bytes:281693815 (268.6 MiB)


Here, one would say that streaming replication is cheaper.


I also cannot explain why that is. Do you?



Side notes:

- All tests were run multiple times, producing identical results

- I also ran a test similar to 'Test 1' but updating records (instead of running 
'INSERT'), which as expected gave the same outcome as 'Test 1'

- At every run I destroy the 'pgbench' database and recreate it, in order to start 
as clean as possible.

- As a cross-check, I am also verifying that, traffic-wise, the numbers that 
appear on the provider mirror those on the subscriber

- Here I report the exact commands I ran in order to reproduce the test beds for 
pglogical:


Test 1 and 2:

Provider called 'dbfabio':

PGDATA=pgbench
createdb pgbench
pglog_db=pgbench
psql $pglog_db  -c "CREATE extension pglogical;"
psql $pglog_db  -c "CREATE schema my_replicated_schema"
psql $pglog_db  -c "CREATE table my_replicated_schema.my_first_replicated_table 
(i int primary key)"
psql $pglog_db  -c "SELECT pglogical.create_node(node_name := 
'provider.$pglog_db', dsn := 'host=dbfabio port=5432 dbname=$pglog_db');"
psql $pglog_db  -c "SELECT pglogical.replication_set_add_all_tables('default', 
ARRAY['my_replicated_schema']);"


Subscriber called 'dbfabio2':

PGDATA=pgbench
createdb pgbench
pglog_db=pgbench
psql $pglog_db -c "CREATE extension pglogical;"
psql $pglog_db -c "SELECT pglogical.create_node(node_name := 
'subscriber.$pglog_db', dsn := 'host=dbfabio2 port=5432 dbname=$pglog_db');"
psql $pglog_db -c "SELECT pglogical.create_subscription(subscription_name := 
'subscription_to_dbfabio_$pglog_db',  synchronize_structure := true, 
provider_dsn := 'host=dbfabio port=5432 dbname=$pglog_db');"
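
To check that the subscription is up, I then run on the subscriber (built-in
pglogical helper; the name matches the subscription created above, just a
sanity check):

psql $pglog_db -c "SELECT * FROM 
pglogical.show_subscription_status('subscription_to_dbfabio_$pglog_db');"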



Test 3:

Provider:

PGDATA=pgbench
createdb pgbench
pglog_db=pgbench
pgbench -i 

# Now a small hack, since pglogical only accepts tables that have a primary key. 
pgbench_history does not have one out of the box (maybe there is some room for 
an improvement to propose for the pgbench code? what do you think?)

psql $pglog_db -c "ALTER TABLE 

Re: Failed rpm package signature checks with reposync

2018-06-18 Thread Bruno Lavoie
Hi Devrim,

Sorry, me too just saw your email...

Issue still persists:

# reposync --repoid=pgdg10 --gpgcheck
--download_path=/var/www/html/centos7/repos/postgresql/
Repository 'base' is missing name in configuration, using id
Removing mysql_fdw_10-2.3.0-3.rhel7.x86_64.rpm due to failed signature
check.
Removing mysql_fdw_10-debuginfo-2.3.0-3.rhel7.x86_64.rpm due to failed
signature check.
Removing osm2pgrouting_10-2.3.3-1.rhel7.x86_64.rpm due to failed signature
check.
Removing osm2pgrouting_10-debuginfo-2.3.3-1.rhel7.x86_64.rpm due to failed
signature check.


For sake of completeness, here is my repo file:

[pgdg10]
name=PostgreSQL 10 - RHEL 7 - x86_64
baseurl=https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64
enabled=0
gpgcheck=1
gpgkey=https://download.postgresql.org/pub/repos/yum/RPM-GPG-KEY-PGDG-10


Thanks



On Sun, Jun 10, 2018 at 4:55 PM, Devrim Gündüz  wrote:

>
> Hi,
>
> Just saw this email. Does the problem still persist?
>
> Regards, Devrim
>
> On Wed, 2018-04-25 at 11:22 -0400, Bruno Lavoie wrote:
> > Hello,
> >
> > Don't know if I should post it to hackers list, by I'll try here first.
> >
> > For many reasons, we're currently deploying a mirror for postgresql rpm
> > packages. And when we run reposync (yum mirroring tool) with --gpgcheck
> > switch some packages are removed due to failed signature check.
> >
> > Command and output:
> > # reposync --repoid=pgdg10 --gpgcheck
> > --download_path=/var/www/html/centos7/repos/postgresql/
> >
> > Package mysql_fdw_10-2.3.0-3.rhel7.x86_64.rpm is not signed 0% [
> >  ]  0.0 B/s |0 B  --:--:-- ETA
> > (1/4): mysql_fdw_10-2.3.0-3.rhel7.x86_64.rpm
> > |  30 kB  00:00:00
> > (2/4): mysql_fdw_10-debuginfo-2.3.0-3.rhel7.x86_64.rpm
> > | 117 kB  00:00:00
> > (3/4): osm2pgrouting_10-2.3.3-1.rhel7.x86_64.rpm
> > | 134 kB  00:00:00
> > (4/4): osm2pgrouting_10-debuginfo-2.3.3-1.rhel7.x86_64.rpm
> > |  17 kB  00:00:00
> > Removing mysql_fdw_10-2.3.0-3.rhel7.x86_64.rpm due to failed signature
> > check.
> > Removing mysql_fdw_10-debuginfo-2.3.0-3.rhel7.x86_64.rpm due to failed
> > signature check.
> > Removing osm2pgrouting_10-2.3.3-1.rhel7.x86_64.rpm due to failed
> signature
> > check.
> > Removing osm2pgrouting_10-debuginfo-2.3.3-1.rhel7.x86_64.rpm due to
> failed
> > signature check.
> >
> > Here is the used .repo file:
> > [pgdg10]
> > name=PostgreSQL 10 - RHEL 7 - x86_64
> > baseurl=
> > https://download.postgresql.org/pub/repos/yum/10/redhat/rhel-7-x86_64
> > enabled=0
> > gpgcheck=1
> > gpgkey=https://download.postgresql.org/pub/repos/yum/RPM-GPG-KEY-PGDG-10
> >
> >
> > Any reasons for this?
> >
> > Thanks
> > Bruno Lavoie
>
> --
> Devrim Gündüz
> EnterpriseDB: https://www.enterprisedb.com
> PostgreSQL Consultant, Red Hat Certified Engineer
> Twitter: @DevrimGunduz , @DevrimGunduzTR


Re: Failed rpm package signature checks with reposync

2018-06-18 Thread Devrim Gündüz

Hi Bruno,

On Mon, 2018-06-18 at 08:31 -0400, Bruno Lavoie wrote:
> 
> Sorry, me too just saw your email...
> 
> Issue still persists:
> 
> # reposync --repoid=pgdg10 --gpgcheck
> --download_path=/var/www/html/centos7/repos/postgresql/
> Repository 'base' is missing name in configuration, using id
> Removing mysql_fdw_10-2.3.0-3.rhel7.x86_64.rpm due to failed signature
> check.
> Removing mysql_fdw_10-debuginfo-2.3.0-3.rhel7.x86_64.rpm due to failed
> signature check.
> Removing osm2pgrouting_10-2.3.3-1.rhel7.x86_64.rpm due to failed signature
> check.
> Removing osm2pgrouting_10-debuginfo-2.3.3-1.rhel7.x86_64.rpm due to failed
> signature check.

Ok, verified on my end as well. Fixed this, and pushed them to the repo. They
will be available 1 hour later. Can you please try again then?

Regards,
-- 
Devrim Gündüz
EnterpriseDB: https://www.enterprisedb.com
PostgreSQL Consultant, Red Hat Certified Engineer
Twitter: @DevrimGunduz , @DevrimGunduzTR



Re: Clarifying "timestamp with time zone"

2018-06-18 Thread Jeremy Finzel
On Fri, Jun 15, 2018 at 2:57 PM, Adrian Klaver 
wrote:

> On 06/15/2018 12:24 PM, Jeremy Finzel wrote:
>
>> Hello!
>>
>> We often prefer to use timestamptz or "timestamp with time zone" in our
>> environment because of its actually storing "objective time" with respect
>> to UTC.  But in my own work experience, I have scarcely encountered a case
>> where business users, and software engineers, do not actually think it
>> means the opposite.
>>
>> When I say "timestamp with time zone", people think the data is saved
>> *in* a specific time zone, whereas in reality, the opposite is true.  It is
>> really more like "timestamp UTC" or you even could say "timestamp at UTC".
>> When you query this of course, then it shows you the time offset based on
>> your client timezone setting.
>>
>> I do believe this is part of the SQL standard, but I have found that it
>> creates great confusion.  I think many devs choose timestamp instead of
>> timestamptz because they don't really understand that timestamptz gives you
>> UTC time storage built-in.
>>
>> That of course means that if you have multiple servers that run in a
>> different time zone, and you want to replicate that data to a centralized
>> location, you can easily figure out what objective time a record changed,
>> for instance, not knowing anything about what time zone the source system
>> is in.
>>
>> So it seems to me that "timestamp with time zone" is a misnomer in a big
>>
>
> It actually is. It is just one timezone though, UTC.
>
> way, and perhaps it's worth at least clarifying the docs about this, or
>>
>
> https://www.postgresql.org/docs/10/static/datatype-datetime.
> html#DATATYPE-TIMEZONES
>
> "For timestamp with time zone, the internally stored value is always in
> UTC (Universal Coordinated Time, traditionally known as Greenwich Mean
> Time, GMT). An input value that has an explicit time zone specified is
> converted to UTC using the appropriate offset for that time zone. If no
> time zone is stated in the input string, then it is assumed to be in the
> time zone indicated by the system's TimeZone parameter, and is converted to
> UTC using the offset for the timezone zone."
>
> How should the above be clarified?


Actually, that is a really good description.  But I would say the problem
is it does not have a prominent place on the page, and that many people
reading the docs will make enough assumptions about the data types before
they get down to this part of the page.  What is displayed as nitty-gritty
details down the page should be essential reading for any user of postgres
wanting to know how to decide between timestamp and timestamptz.
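
To make the behavior concrete, this is all I mean (a quick sketch; the displayed
values are what I would expect, since the stored instant does not change, only
the display does):

SET TIME ZONE 'UTC';
SELECT TIMESTAMPTZ '2018-06-18 10:00:00-05';   -- 2018-06-18 15:00:00+00

SET TIME ZONE 'America/Chicago';
SELECT TIMESTAMPTZ '2018-06-18 10:00:00-05';   -- 2018-06-18 10:00:00-05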

There are some descriptions that tend to mislead that perhaps could be
clarified.  For example, at the top of the page, timestamptz is described
as "both date and time, with time zone".  Given what we all seem to
acknowledge is a misleading description, I think we ought to either change
this to summarize the above very helpful description, perhaps "both date
and time, in UTC" or some such idea.

I like the idea of making that note that is now nested deep in the example
section very prominent at the top of the page, perhaps as one of these
special notes, given how critical timestamps are for nearly any relational
database use.

Thoughts?

Thanks,
Jeremy


Re: Clarifying "timestamp with time zone"

2018-06-18 Thread Adrian Klaver

On 06/18/2018 06:24 AM, Jeremy Finzel wrote:



On Fri, Jun 15, 2018 at 2:57 PM, Adrian Klaver 
mailto:adrian.kla...@aklaver.com>> wrote:


On 06/15/2018 12:24 PM, Jeremy Finzel wrote:

Hello!

We often prefer to use timestamptz or "timestamp with time zone"
in our environment because of its actually storing "objective
time" with respect to UTC.  But in my own work experience, I
have scarcely encountered a case where business users, and
software engineers, do not actually think it means the opposite.

When I say "timestamp with time zone", people think the data is
saved *in* a specific time zone, whereas in reality, the
opposite is true.  It is really more like "timestamp UTC" or you
even could say "timestamp at UTC".  When you query this of
course, then it shows you the time offset based on your client
timezone setting.

I do believe this is part of the SQL standard, but I have found
that it creates great confusion.  I think many devs choose
timestamp instead of timestamptz because they don't really
understand that timestamptz gives you UTC time storage built-in.

That of course means that if you have multiple servers that run
in a different time zone, and you want to replicate that data to
a centralized location, you can easily figure out what objective
time a record changed, for instance, not knowing anything about
what time zone the source system is in.

So it seems to me that "timestamp with time zone" is a misnomer
in a big


It actually is. It is just one timezone though, UTC.

way, and perhaps it's worth at least clarifying the docs about
this, or



https://www.postgresql.org/docs/10/static/datatype-datetime.html#DATATYPE-TIMEZONES



"For timestamp with time zone, the internally stored value is always
in UTC (Universal Coordinated Time, traditionally known as Greenwich
Mean Time, GMT). An input value that has an explicit time zone
specified is converted to UTC using the appropriate offset for that
time zone. If no time zone is stated in the input string, then it is
assumed to be in the time zone indicated by the system's TimeZone
parameter, and is converted to UTC using the offset for the timezone
zone."

How should the above be clarified?


Actually, that is a really good description.  But I would say the 
problem is it does not have a prominent place on the page, and that many 
people reading the docs will make enough assumptions about the data 
types before they get down to this part of the page.  What is displayed 
as nitty-gritty details down the page should be essential reading for 
any user of postgres wanting to know how to decide between timestamp and 
timestamptz.


There are some descriptions that tend to mislead that perhaps could be 
clarified.  For example, at the top of the page, timestamptz is 
described as "both date and time, with time zone".  Given what we all 
seem to acknowledge is a misleading description, I think we ought to 
either change this to summarize the above very helpful description, 
perhaps "both date and time, in UTC" or some such idea.


I like the idea of making that note that is now nested deep in the 
example section very prominent at the top of the page, perhaps as one of 
these special notes, given how critical timestamps are for nearly any 
relational database use.


Thoughts?


Time/dates/timestamps are complicated and there is no end of 'ifs, ands 
and buts'. My suggestion would be a note at the top of the page that 
says read through this section at least twice and then come back and do 
that again. Other gotchas in the section that I have seen in posts to 
this list:


"The SQL standard differentiates timestamp without time zone and 
timestamp with time zone literals by the presence of a “+” or “-” symbol 
and time zone offset after the time. Hence, according to the standard,


TIMESTAMP '2004-10-19 10:23:54'

is a timestamp without time zone, while

TIMESTAMP '2004-10-19 10:23:54+02'

is a timestamp with time zone. PostgreSQL never examines the content of 
a literal string before determining its type, and therefore will treat 
both of the above as timestamp without time zone.
To ensure that a literal is treated as timestamp with time zone, give it 
the correct explicit type:


TIMESTAMP WITH TIME ZONE '2004-10-19 10:23:54+02'

In a literal that has been determined to be timestamp without time zone, 
PostgreSQL will silently ignore any time zone indication. That is, the 
resulting value is derived from the date/time fields in the input value, 
and is not adjusted for time zone."



"One should be wary that the POSIX-style time zone feature can lead to 
silently accepting bogus input, since there is no check on the 
reaso

What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Sherrylyn Branchaw
Greetings,

We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.

We recently experienced two similar outages on two different prod
databases. The error messages from the logs were as follows:

LOG:  server process (PID 138529) was terminated by signal 6: Aborted
LOG:  terminating any other active server processes
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the
current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.

We are still investigating what may have triggered these errors, since
there were no recent changes to these databases. Unfortunately, core dumps
were not configured correctly, so we may have to wait for the next outage
before we can do a good root cause analysis.

My question, meanwhile, is around remedial actions to take when this
happens.

In one case, the logs recorded

LOG:  all server processes terminated; reinitializing
LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
trying to add line 7
LOG:  database system was not properly shut down; automatic recovery in
progress
LOG:  redo starts at 365/FDFA738
LOG:  invalid record length at 365/12420978: wanted 24, got 0
LOG:  redo done at 365/12420950
LOG:  last completed transaction was at log time 2018-06-05
10:59:27.049443-05
LOG:  checkpoint starting: end-of-recovery immediate
LOG:  checkpoint complete: wrote 5343 buffers (0.5%); 0 transaction log
file(s) added, 1 removed, 0 recycled; write=0.131 s, sync=0.009 s,
total=0.164 s; sync files=142, longest=0.005 s, average=0.000 s;
distance=39064 kB, estimate=39064 kB
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections

In that case, the database restarted immediately, with only 30 seconds of
downtime.

In the other case, the logs recorded

LOG:  all server processes terminated; reinitializing
LOG:  dynamic shared memory control segment is corrupt
LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
trying to add line 7

In that case, the database did not restart on its own. It was 5 am on
Sunday, so the on-call SRE just manually started the database up, and it
appears to have been running fine since.

My question is whether the corrupt shared memory control segment, and the
failure of Postgres to automatically restart, mean the database should not
be automatically started up, and if there's something we should be doing
before restarting.

Do we potentially have corrupt data or indices as a result of our last
outage? If so, what should we do to investigate?
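
(As a first pass, I was thinking of something crude like the sketch below, which
just reads every row of every ordinary table in the database to see whether
anything errors out; a full pg_dump to /dev/null would accomplish much the same
thing. This is only an assumption on my part, not an established procedure.)

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT oid::regclass AS t FROM pg_class WHERE relkind = 'r' LOOP
        -- forces a full read of each table; errors here would point at damage
        EXECUTE format('SELECT count(*) FROM %s', r.t);
    END LOOP;
END $$;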

Thanks,
Sherrylyn


Run Stored procedure - function from VBA

2018-06-18 Thread Łukasz Jarych
Hi Guys,

I have an example function:

CREATE OR REPLACE FUNCTION totalRecords ()
RETURNS integer AS $total$
declare
total integer;
BEGIN
   SELECT count(*) into total FROM COMPANY;
   RETURN total;
END;
$total$ LANGUAGE plpgsql;

and i want to run it from VBA using odbc connection.

What is the best way to use it?

something like this:

Dim dbCon as new ADODB.Connection
Dim rst as new ADODB.Recordset

Dbcon.connectionstring=”Your connection string goes here!”
Dbcon.open

Rst.open strsql

where strsql is "Select * from totalRecords", or is this not a good solution?

Best,
Jacek


Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Sherrylyn Branchaw  writes:
> We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> We recently experienced two similar outages on two different prod
> databases. The error messages from the logs were as follows:
> LOG:  server process (PID 138529) was terminated by signal 6: Aborted

Hm ... were these installations built with --enable-cassert?  If not,
an abort trap seems pretty odd.

> In one case, the logs recorded
> LOG:  all server processes terminated; reinitializing
> LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...

> In the other case, the logs recorded
> LOG:  all server processes terminated; reinitializing
> LOG:  dynamic shared memory control segment is corrupt
> LOG:  incomplete data in "postmaster.pid": found only 1 newlines while
> trying to add line 7
> ...

Those "incomplete data" messages are quite unexpected and disturbing.
I don't know of any mechanism within Postgres proper that would result
in corruption of the postmaster.pid file that way.  (I wondered briefly
if trying to start a conflicting postmaster would result in such a
situation, but experimentation here says not.)  I'm suspicious that
this may indicate a bug or unwarranted assumption in whatever scripts
you use to start/stop the postmaster.  Whether that is at all related
to your crash issue is hard to say, but it bears looking into.

> My question is whether the corrupt shared memory control segment, and the
> failure of Postgres to automatically restart, mean the database should not
> be automatically started up, and if there's something we should be doing
> before restarting.

No, that looks like fairly typical crash recovery to me: corrupt shared
memory contents are expected and recovered from after a crash.  However,
we don't expect postmaster.pid to get mucked with.

regards, tom lane



Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Andres Freund
On 2018-06-18 12:30:13 -0400, Tom Lane wrote:
> Sherrylyn Branchaw  writes:
> > We are using Postgres 9.6.8 (planning to upgrade to 9.6.9 soon) on RHEL 6.9.
> > We recently experienced two similar outages on two different prod
> > databases. The error messages from the logs were as follows:
> > LOG:  server process (PID 138529) was terminated by signal 6: Aborted
> 
> Hm ... were these installations built with --enable-cassert?  If not,
> an abort trap seems pretty odd.

PANIC does an abort, so that's not too surprising...

if (elevel >= PANIC)
{
    /*
     * Serious crash time. Postmaster will observe SIGABRT process exit
     * status and kill the other backends too.
     *
     * XXX: what if we are *in* the postmaster?  abort() won't kill our
     * children...
     */
    fflush(stdout);
    fflush(stderr);
    abort();
}

Greetings,

Andres Freund



Re: Query hitting empty tables taking 48 minutes

2018-06-18 Thread Robert Creager


> On Jun 7, 2018, at 4:18 PM, Robert  wrote:
> 
>> You can capture the execution plan of the bad statement by using 
>> auto_explain,
>> that would certainly shed more light on the problem.
> 

A different query started showing up as the problem, the auto_explain with 
analyze shows an oddity, the total query duration is 11k seconds, while the 
query itself is 3 seconds.  I captured a ZFS snapshot as soon as the problem 
was noticed.

Jun 16 23:15:30 blackpearl postgres[9860]: [79-1] 
db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1 LOG:  duration: 
10856644.336 ms  plan:   
Jun 16 23:15:30 blackpearl postgres[9860]: [79-2]   Query Text: DELETE FROM 
ds3.blob WHERE EXISTS (SELECT * FROM ds3.s3_object WHERE id = 
ds3.blob.object_id AND (bucket_id = $1)) 
Jun 16 23:15:30 blackpearl postgres[9860]: [79-3]   Delete on blob  
(cost=1308.79..1312.82 rows=1 width=12) (actual time=3465.919..3465.919 rows=0 
loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-4] ->  Nested Loop  
(cost=1308.79..1312.82 rows=1 width=12) (actual time=50.293..2435.271 
rows=30 loops=1)  
Jun 16 23:15:30 blackpearl postgres[9860]: [79-5]   ->  Bitmap Heap 
Scan on s3_object  (cost=634.39..636.41 rows=1 width=22) (actual 
time=50.269..153.885 rows=30 loops=1) 
Jun 16 23:15:30 blackpearl postgres[9860]: [79-6] Recheck 
Cond: (bucket_id = 'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)
 
Jun 16 23:15:30 blackpearl postgres[9860]: [79-7] Heap 
Blocks: exact=3704
Jun 16 23:15:30 blackpearl postgres[9860]: [79-8] ->  
Bitmap Index Scan on ds3_s3_object__bucket_id  (cost=0.00..634.39 rows=1 
width=0) (actual time=49.552..49.552 rows=30 loops=1)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-9]   
Index Cond: (bucket_id = 'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)  
   
Jun 16 23:15:30 blackpearl postgres[9860]: [79-10]  ->  Bitmap Heap 
Scan on blob  (cost=674.39..676.41 rows=1 width=22) (actual time=0.005..0.006 
rows=1 loops=30)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-11]Recheck 
Cond: (object_id = s3_object.id)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-12]Heap 
Blocks: exact=30
Jun 16 23:15:30 blackpearl postgres[9860]: [79-13]->  
Bitmap Index Scan on ds3_blob__object_id  (cost=0.00..674.39 rows=1 width=0) 
(actual time=0.004..0.004 rows=1 loops=30)
Jun 16 23:15:30 blackpearl postgres[9860]: [79-14]  
Index Cond: (object_id = s3_object.id)


Doing a ZFS rollback and executing the query shows what is happening, although 
not to the extent above.  If I read this correctly, it’s the constraint checks 
that are causing the query to take so long.  I don’t see any server 
configuration that might allow those checks to be parallelized.  Is that 
possible?

tapesystem=# PREPARE foo(uuid) AS DELETE FROM ds3.blob WHERE EXISTS (SELECT * 
FROM ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1)); EXPLAIN 
ANALYZE EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e');
PREPARE
   QUERY PLAN   

   

   
 Delete on blob  (cost=9555.07..21134.01 rows=23 width=12) (actual 
time=1516.140..1516.140 rows=0 loops=1)
   ->  Hash Join  (cost=9555.07..21134.01 rows=23 width=12) (actual 
time=237.816..621.306 rows=30 loops=1)
 Hash Cond: (s3_object.id = blob.object_id)
 ->  Seq Scan on s3_object  (cost=0.00..7454.04 rows=23 width=22) 
(actual time=0.027..148.503 rows=30 loops=1)
 
   Filter: (bucket_id = 
'bc6e6b10-80ad-4329-9fb9-1a66d8c1505e'::uuid)
   Rows Removed by Filter: 3
 ->  Hash  (cost=5805.03..5805.03 rows=33 width=22) (actual 
time=235.219..235.219 rows=33 loops=1)
   Buckets: 524288  Batches: 1  Memory Usage: 19917kB
   ->  Seq Scan on blob  (cost=0.00..5805.03 rows=33 width=22) 
(actual time=0.038..114.107 rows=33 loops=1)

 Trigger for constraint blob_tape_blob_id_fkey: time=5389.627 calls=30
 Trigger for constraint multi_part_upload_placeholder_blob_id_fkey: 
time=4566.305 calls=30
 Trigger for constraint mul

Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Andres Freund  writes:
> On 2018-06-18 12:30:13 -0400, Tom Lane wrote:
>> Sherrylyn Branchaw  writes:
>>> LOG:  server process (PID 138529) was terminated by signal 6: Aborted

>> Hm ... were these installations built with --enable-cassert?  If not,
>> an abort trap seems pretty odd.

> PANIC does an abort, so that's not too surprising...

Hm, I supposed that Sherrylyn would've noticed any PANIC entries in
the log.  The TRAP message from an assertion failure could've escaped
notice though, even assuming that her logging setup captured it.

regards, tom lane



Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Peter Geoghegan
On Mon, Jun 18, 2018 at 1:03 PM, Tom Lane  wrote:
> Hm, I supposed that Sherrylyn would've noticed any PANIC entries in
> the log.  The TRAP message from an assertion failure could've escaped
> notice though, even assuming that her logging setup captured it.

Unhandled C++ exceptions end up calling std::abort(). I've seen bugs
in modules like PL/V8 that were caused by this. The symptom was a
mysterious message in the logs about SIGABRT. Perhaps that's what
happened here?

What extensions are installed, if any?

-- 
Peter Geoghegan



High WriteLatency RDS Postgres 9.3.20

2018-06-18 Thread Juan Manuel Cuello
Hi,

I'm experiencing high WriteLatency levels in a Postgres server 9.3.20
hosted in Amazon RDS. So far it's been almost two months of investigation
and people at AWS technical support don't seem to find the cause. I think
it could be related to Postgres and the number of schema/tables in the
database, that's why I post this issue here.

I have around 4600 schemas, each contains 62 tables. The DB size is only
around 130 GB. the server has plenty of available RAM, CPU usage is less
than 10% and there are only around 16 connections, but WriteLatency is
unusually high.

As I don't have access to the server, I cannot see which processes are
writing to disk, but my guess is that each Postgres process is
writing to disk for some reason.

This issue doesn't seem related to workload. If I restart the server,
WriteLatency drops to normal levels and remains like that until, after some
time (a few hours or a day) and without any obvious reason, it spikes again
and stays at high levels from then on.

Is it possible that, for some reason, Postgres processes start writing to
disk at some point due to reaching any internal limit? Maybe related to
relcache/catcache/syscache? Any other thoughts?
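
In case it helps narrow things down, this is what I have been looking at from
SQL (standard views, available on 9.3; just a sketch):

-- are writes coming from checkpoints, the bgwriter, or the backends themselves?
SELECT checkpoints_timed, checkpoints_req,
       buffers_checkpoint, buffers_clean, buffers_backend
FROM pg_stat_bgwriter;

-- temp file activity per database can also drive write I/O
SELECT datname, temp_files, temp_bytes FROM pg_stat_database;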

Thanks

Juan


Re: Query hitting empty tables taking 48 minutes

2018-06-18 Thread Laurenz Albe
Robert Creager wrote:
> A different query started showing up as the problem, the auto_explain with 
> analyze shows an oddity,
> the total query duration is 11k seconds, while the query itself is 3 seconds. 
>  I captured a ZFS
> snapshot as soon as the problem was noticed.
> 
>db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1 LOG:  
> duration: 10856644.336 ms  plan:  
>  
>  Query Text: DELETE FROM ds3.blob WHERE EXISTS (SELECT * FROM 
> ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1)) 
> 
>  Delete on blob  (cost=1308.79..1312.82 rows=1 width=12) (actual 
> time=3465.919..3465.919 rows=0 loops=1)   
>  
> [...]
> 
> Doing a ZFS rollback and executing the query shows what is happening, 
> although not to the extent above.
> If I read this correctly, it’s the constraint checks that are causing the 
> query to take so long.
> I don’t see any server configuration that might allow those checks to be 
> parallelized.  Is that possible?
> 
> tapesystem=# PREPARE foo(uuid) AS DELETE FROM ds3.blob WHERE EXISTS (SELECT * 
> FROM ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1)); 
> EXPLAIN ANALYZE EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e');
> PREPARE

Are we missing an "EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e')" here?

>QUERY PLAN 
>   
>
> 
>
>  Delete on blob  (cost=9555.07..21134.01 rows=23 width=12) (actual 
> time=1516.140..1516.140 rows=0 loops=1)
> [...]
>  Trigger for constraint blob_tape_blob_id_fkey: time=5389.627 calls=30
>  Trigger for constraint multi_part_upload_placeholder_blob_id_fkey: 
> time=4566.305 calls=30
>  Trigger for constraint multi_part_upload_part_blob_id_fkey: time=3597.896 
> calls=30
>  Trigger for constraint blob_pool_blob_id_fkey: time=4631.851 calls=30
>  Trigger for constraint blob_target_blob_id_fkey: time=4688.731 calls=30
>  Trigger for constraint suspect_blob_tape_blob_id_fkey: time=4473.787 
> calls=30
>  Trigger for constraint suspect_blob_pool_blob_id_fkey: time=4475.286 
> calls=30
>  Trigger for constraint suspect_blob_target_blob_id_fkey: time=4353.237 
> calls=30
>  Trigger for constraint blob_s3_target_blob_id_fkey: time=4451.687 
> calls=30
>  Trigger for constraint blob_azure_target_blob_id_fkey: time=4448.613 
> calls=30
>  Trigger for constraint suspect_blob_azure_target_blob_id_fkey: time=4353.467 
> calls=30
>  Execution time: 85175.581 ms
> 
> I executed a vacuum analyze, then a vacuum full analyze, neither changed the 
> timing.
> Other than removing constraints, is there any way to address this?

I cannot explain the discrepancy between the runtimes of 85 seconds versus 
10857 seconds.

But other than that, it sure looks like the foreign keys are missing an index on
the source columns, leading to a sequential scan for each deletion and table.
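
Something along these lines should list single-column foreign keys whose
referencing column has no index with that column leading (a sketch against the
system catalogs, not tested against your schema):

SELECT c.conrelid::regclass AS referencing_table,
       c.conname            AS constraint_name,
       a.attname            AS fk_column
FROM   pg_constraint c
JOIN   pg_attribute a ON a.attrelid = c.conrelid AND a.attnum = c.conkey[1]
WHERE  c.contype = 'f'
  AND  array_length(c.conkey, 1) = 1           -- single-column FKs only
  AND  NOT EXISTS (
         SELECT 1 FROM pg_index i
         WHERE  i.indrelid = c.conrelid
           AND  i.indkey[0] = c.conkey[1]);    -- FK column leads an index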

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



Re: High WriteLatency RDS Postgres 9.3.20

2018-06-18 Thread Andres Freund
On 2018-06-18 18:43:06 -0300, Juan Manuel Cuello wrote:
> I'm experiencing high WriteLatency levels in a Postgres server 9.3.20
> hosted in Amazon RDS.

A lot of performance improvements have been made since 9.3, and it'll
soon-ish be out of support.

If you can reproduce the issue on postgres proper, rather than a
modified version in an environment that precludes getting detailed data,
we might be able to sensibly help you further.


> So far it's been almost two months of investigation
> and people at AWS technical support don't seem to find the cause. I think
> it could be related to Postgres and the number of schema/tables in the
> database, that's why I post this issue here.

There've been improvements made since 9.3. Upgrade.

Greetings,

Andres Freund



Re: Query hitting empty tables taking 48 minutes

2018-06-18 Thread Robert Creager


> On Jun 18, 2018, at 4:04 PM, Laurenz Albe  wrote:
> 
> Robert Creager wrote:
>> A different query started showing up as the problem, the auto_explain with 
>> analyze shows an oddity,
>> the total query duration is 11k seconds, while the query itself is 3 
>> seconds.  I captured a ZFS
>> snapshot as soon as the problem was noticed.
>> 
>>   db=tapesystem,user=Administrator,app=[unknown],client=127.0.0.1 LOG:  
>> duration: 10856644.336 ms  plan: 
>>   
>> Query Text: DELETE FROM ds3.blob WHERE EXISTS (SELECT * FROM 
>> ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1))
>>  
>> Delete on blob  (cost=1308.79..1312.82 rows=1 width=12) (actual 
>> time=3465.919..3465.919 rows=0 loops=1)  
>>   
>> [...]
>> 
>> Doing a ZFS rollback and executing the query shows what is happening, 
>> although not to the extent above.
>> If I read this correctly, it’s the constraint checks that are causing the 
>> query to take so long.
>> I don’t see any server configuration that might allow those checks to be 
>> parallelized.  Is that possible?
>> 
>> tapesystem=# PREPARE foo(uuid) AS DELETE FROM ds3.blob WHERE EXISTS (SELECT 
>> * FROM ds3.s3_object WHERE id = ds3.blob.object_id AND (bucket_id = $1)); 
>> EXPLAIN ANALYZE EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e');
>> PREPARE
> 
> Are we missing an "EXECUTE foo('bc6e6b10-80ad-4329-9fb9-1a66d8c1505e')" here?

Nope, just hidden on the same line, this is the plan for that EXECUTE

> 
>>   QUERY PLAN 
>>  
>> 
>> 
>>
>> Delete on blob  (cost=9555.07..21134.01 rows=23 width=12) (actual 
>> time=1516.140..1516.140 rows=0 loops=1)
>> [...]
>> Trigger for constraint blob_tape_blob_id_fkey: time=5389.627 calls=30
>> [...]
>> Execution time: 85175.581 ms
>> 
>> I executed a vacuum analyze, then a vacuum full analyze, neither changed the 
>> timing.
>> Other than removing constraints, is there any way to address this?
> 
> I cannot explain the discrepancy between the runtimes of 85 seconds versus 
> 10857 seconds.

It would be nice if the auto_explain analyze did include the other information 
like the psql analyze does.

> But other than that, it sure looks like the foreign keys are missing an index 
> on
> the source columns, leading to a sequential scan for each deletion and table.

You’d think, but they are present.  As an example, the first constraint 
blob_tape_blob_id_fkey is indexed.

create table blob
(
[…]
  iduuid   not null
constraint blob_pkey
primary key,
[…]
);

create table blob_tape
(
  blob_id uuidnot null
constraint blob_tape_blob_id_fkey
references ds3.blob
on update cascade on delete cascade,
  id  uuidnot null
constraint blob_tape_pkey
primary key,
[…]
);

create index tape_blob_tape__blob_id
  on blob_tape (blob_id);

> 
> Yours,
> Laurenz Albe
> -- 
> Cybertec | https://www.cybertec-postgresql.com



Re: Run Stored procedure - function from VBA

2018-06-18 Thread Rob Sargent


> On Jun 18, 2018, at 9:47 AM, Łukasz Jarych  wrote:
> 
> CREATE OR REPLACE FUNCTION totalRecords ()
> RETURNS integer AS $total$
> declare
>   total integer;
> BEGIN
>SELECT count(*) into total FROM COMPANY;
>RETURN total;
> END;
> $total$ LANGUAGE plpgsql;



Re: Run Stored procedure - function from VBA

2018-06-18 Thread Rob Sargent


> On Jun 18, 2018, at 9:47 AM, Łukasz Jarych  wrote:
> 
> Hi Guys,
> 
> i have example function :
> 
> CREATE OR REPLACE FUNCTION totalRecords ()
> RETURNS integer AS $total$
> declare
>   total integer;
> BEGIN
>SELECT count(*) into total FROM COMPANY;
>RETURN total;
> END;
> $total$ LANGUAGE plpgsql;
> 
> and i want to run it from VBA using odbc connection. 
> 
> What is the best way to use it ?
> 
> something like this: 
> 
> Dim dbCon as new ADODB.Connection
> Dim rst as new ADODB.Recordset
> 
> Dbcon.connectionstring=”Your connection string goes here!”
> Dbcon.open
> 
> Rst.open strsql
> where strsql is "Select * from totalRecords" or this is not a good solution?
> 
> Best,
> Jacek

You need the parentheses after the function name: “select * from 
totalrecords();"



Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Sherrylyn Branchaw
> Hm ... were these installations built with --enable-cassert?  If not,
> an abort trap seems pretty odd.

The packages are installed directly from the yum repos for RHEL. I'm not
aware that --enable-cassert is being used, and we're certainly not
installing from source.

> Those "incomplete data" messages are quite unexpected and disturbing.
> I don't know of any mechanism within Postgres proper that would result
> in corruption of the postmaster.pid file that way.  (I wondered briefly
> if trying to start a conflicting postmaster would result in such a
> situation, but experimentation here says not.)  I'm suspicious that
> this may indicate a bug or unwarranted assumption in whatever scripts
> you use to start/stop the postmaster.  Whether that is at all related
> to your crash issue is hard to say, but it bears looking into.

We're using the stock initd script from the yum repo, but I dug into this
issue, and it looks like we're passing the path to the postmaster.pid as
the $pidfile variable in our sysconfig file, meaning the initd script is
managing the postmaster.pid file, and specifically is overwriting it with a
single line containing just the pid. I'm not sure why it's set up like
this, and I'm thinking we should change it, but it seems harmless and
unrelated to the crash. In particular, manual initd actions such as stop,
start, restart, and status all work fine.

> No, that looks like fairly typical crash recovery to me: corrupt shared
> memory contents are expected and recovered from after a crash.

That's reassuring. But if it's safe for us to immediately start the server
back up, why did Postgres not automatically start the server up like it did
the first time? I was assuming it was due to the presence of the corrupt
memory segment, as that was the only difference in the logs, although I
could be wrong. Automatic restart would have saved us a great deal of
downtime; since in the first case we had total recovery within 30 seconds,
and in the second case, many minutes of downtime while someone got paged,
troubleshot the issue, and eventually decided to try starting the database
back up.

At any rate, if it's safe, we can write a script to detect this failure
mode and automatically restart, although it would be less error-prone if
Postgres restarted automatically.

> Hm, I supposed that Sherrylyn would've noticed any PANIC entries in
> the log.

No PANICs. The log lines I pasted were the only ones that looked relevant
in the Postgres logs. I can try to dig through the application logs, but I
was planning to wait until the next time this happens, since we should have
core dumps fixed and that might make things easier.

> What extensions are installed, if any?

In the first database, the one without the corrupt memory segment and that
restarted automatically: plpgsql and postgres_fdw.

In the second database, the one where the memory segment got corrupted and
that didn't restart automatically: dblink, hstore, pg_trgm, pgstattuple,
plpgsql, and tablefunc.

I forgot to mention that the queries that got killed were innocuous-looking
SELECTs that completed successfully for me in less than a second when I ran
them manually. In other words, the problem was not reproducible.

Sherrylyn


Re: Query hitting empty tables taking 48 minutes

2018-06-18 Thread Robert Creager


> On Jun 18, 2018, at 4:33 PM, Robert Creager  wrote:
> 
>> I cannot explain the discrepancy between the runtimes of 85 seconds versus 
>> 10857 seconds.
> 
> It would be nice if the auto_explain analyze did include the other 
> information like the psql analyze does.

Like this.  I’ll try again.

auto_explain.log_triggers (boolean)
auto_explain.log_triggers causes trigger execution statistics to be included 
when an execution plan is logged. This parameter has no effect unless 
auto_explain.log_analyze is enabled. This parameter is off by default. Only 
superusers can change this setting.
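
For a single session that should be something like the following (a sketch; for
always-on logging the module would normally be preloaded via
shared_preload_libraries instead):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = '1s';
SET auto_explain.log_analyze = on;
SET auto_explain.log_triggers = on;   -- superuser only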





Re: High WriteLatency RDS Postgres 9.3.20

2018-06-18 Thread Benjamin Scherrey
I would also add that AWS' I/O capabilities are quite poor and expensive. I
assume that you have tried purchasing additional provisioned IOPS on that setup
to see whether you got the expected speedup? If not, you should try that as a
diagnostic tool, even if you wouldn't want to pay for it on an ongoing basis.

We have a client that is I/O write bound and it has taken us significant
efforts to get it to perform well on AWS. We definitely run our own
instances rather than depend on RDS and have always been able to outperform
RDS instances which seem to really be focused to provide a PAAS capability
for developers who really don't want to have to understand how a db works.
Running our identical environment on bare metal is like night & day under
any circumstances when compared to AWS.

Client's requirement is AWS so we keep working on it and we like AWS for
many things but understand it will always underperform on I/O.

Post actual measurements with and without provisioned IOPS, or create your own
PG server instance, and then people might be able to give you additional
insights.

  - - Ben Scherrey

On Tue, Jun 19, 2018, 5:24 AM Andres Freund  wrote:

> On 2018-06-18 18:43:06 -0300, Juan Manuel Cuello wrote:
> > I'm experiencing high WriteLatency levels in a Postgres server 9.3.20
> > hosted in Amazon RDS.
>
> A lot of performance improvements have been made since 9.3, and it'll
> soon-ish be out of support.
>
> If you can reproduce the issue on postgres proper, rather than a
> modified version in an environment that precludes getting detailed data,
> we might be able to sensibly help you further.
>
>
> > So far it's been almost two months of investigation
> > and people at AWS technical support don't seem to find the cause. I think
> > it could be related to Postgres and the number of schema/tables in the
> > database, that's why I post this issue here.
>
> There've been improvements made since 9.3. Upgrade.
>
> Greetings,
>
> Andres Freund
>
>


Re: What to do when dynamic shared memory control segment is corrupt

2018-06-18 Thread Tom Lane
Sherrylyn Branchaw  writes:
>> Hm ... were these installations built with --enable-cassert?  If not,
>> an abort trap seems pretty odd.

> The packages are installed directly from the yum repos for RHEL. I'm not
> aware that --enable-cassert is being used, and we're certainly not
> installing from source.

OK, I'm pretty sure nobody builds production RPMs with --enable-cassert.
But your extensions (as listed below) don't include any C++ code, so
that still leaves us wondering where the abort trap came from.  A stack
trace would almost certainly help clear that up.

>> Those "incomplete data" messages are quite unexpected and disturbing.

> We're using the stock initd script from the yum repo, but I dug into this
> issue, and it looks like we're passing the path to the postmaster.pid as
> the $pidfile variable in our sysconfig file, meaning the initd script is
> managing the postmaster.pid file, and specifically is overwriting it with a
> single line containing just the pid. I'm not sure why it's set up like
> this, and I'm thinking we should change it, but it seems harmless and
> unrelated to the crash. In particular, manual initd actions such as stop,
> start, restart, and status all work fine.

This is bad; a normal postmaster.pid file contains half a dozen lines
besides the PID proper.  You might get away with this for now, but it'll
break pg_ctl as of v10 or so, and might confuse other external tools
sooner than that.  Still, it doesn't seem related to your crash problem.

>> No, that looks like fairly typical crash recovery to me: corrupt shared
>> memory contents are expected and recovered from after a crash.

> That's reassuring. But if it's safe for us to immediately start the server
> back up, why did Postgres not automatically start the server up like it did
> the first time?

Yeah, I'd like to know that too.  The complaint about corrupt shared
memory may be just an unrelated red herring, or it might be a separate
effect of whatever the primary failure was ... but I think it was likely
not the direct cause of the failure-to-restart.  But we've got no real
evidence as to what that direct cause was.

> At any rate, if it's safe, we can write a script to detect this failure
> mode and automatically restart, although it would be less error-prone if
> Postgres restarted automatically.

I realize that you're most focused on less-downtime, but from my
perspective it'd be good to worry about collecting evidence as to
what happened exactly.  Capturing core files is a good start --- and
don't forget the possibility that there's more than one.  A plausible
guess as to why the system didn't restart is that the postmaster crashed
too, so we'd need to see its core to figure out why.

Anyway, I would not be afraid to try restarting the postmaster manually
if it died.  Maybe don't do that repeatedly without human intervention;
but PG is pretty robust against crashes.  We developers crash it all the
time, and we don't lose data.

regards, tom lane



Re: Run Stored procedure - function from VBA

2018-06-18 Thread Łukasz Jarych
Thank you Rob,

The question is: is this the optimal way to run a stored procedure (function) from VBA?
Or not?

Best,
Jacek

2018-06-19 1:34 GMT+02:00 Rob Sargent :

>
>
> On Jun 18, 2018, at 9:47 AM, Łukasz Jarych  wrote:
>
> Hi Guys,
>
> i have example function :
>
> CREATE OR REPLACE FUNCTION totalRecords ()
> RETURNS integer AS $total$
> declare
> total integer;
> BEGIN
>SELECT count(*) into total FROM COMPANY;
>RETURN total;
> END;
> $total$ LANGUAGE plpgsql;
>
> and i want to run it from VBA using odbc connection.
>
> What is the best way to use it ?
>
> something like this:
>
> Dim dbCon as new ADODB.Connection
> Dim rst as new ADODB.Recordset
>
> Dbcon.connectionstring=”Your connection string goes here!”
> Dbcon.open
>
> Rst.open strsql
>
> where strsql is "Select * from totalRecords" or this is not a good
> solution?
>
> Best,
> Jacek
>
>
> You need the parentheses after the function name: “select * from
> totalrecords();"
>
>