logical replication snapshots

2018-07-23 Thread Dimitri Maziuk
Hi everyone,

I'm playing with logical replication in postgres-10 and I got it to hang
on startup with

> 2018-07-23 13:15:06.577 CDT [18624] FATAL:  the database system is starting up
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  shmem_exit(1): 0 
> before_shmem_exit callbacks to make
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  shmem_exit(1): 0 on_shmem_exit 
> callbacks to make
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  proc_exit(1): 1 callbacks to make
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  exit(1)
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  shmem_exit(-1): 0 
> before_shmem_exit callbacks to make
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  shmem_exit(-1): 0 on_shmem_exit 
> callbacks to make
> 2018-07-23 13:15:06.577 CDT [18624] DEBUG:  proc_exit(-1): 0 callbacks to make
... forever ...

After poking around:

# find /var/lib/pgsql/10/data/pg_logical/snapshots -printf . | wc -c
12727975

It looks like ZFS simply can't handle this many files: `ls` etc. hang
forever.

So my question is, does it have to be that many? And if not, is there a
knob to keep the number down?

TIA
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-24 Thread Dimitri Maziuk

On 7/23/2018 6:51 PM, Andres Freund wrote:


What precedes that "loop"?


systemctl start postgresql-10


Could you describe what exactly you did into that situation?


Created a database, few publications, as per TFM, and a few 
subscriptions on another host. Went on vacation for 3 weeks. The problem 
host is centos 7.current w/ postgres rpms from PGDG repo. Postgres is on 
port 5434 because there is a stock one on 5432.



ls probably works in general, it's just the sorting that makes it
hang. Should work with -f or such.


I don't think this means what you think this means.

There are *13 million* files in 
/var/lib/pgsql/10/data/pg_logical/snapshots and postgres hangs forever 
on start with

FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
...
I think the two are related. ICBW and all that. What do I do no not get 
it into that state again?


Thx



Re: logical replication snapshots

2018-07-24 Thread Dimitri Maziuk
On 07/24/2018 12:21 PM, Alvaro Herrera wrote:
> 
> Are you serious about us trying to diagnose a bug with this description?

What do you want to know, exactly?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-24 Thread Dimitri Maziuk
On 07/24/2018 01:43 PM, Andres Freund wrote:
> On 2018-07-24 12:22:24 -0500, Dimitri Maziuk wrote:
>> On 07/24/2018 12:21 PM, Alvaro Herrera wrote:
>>>
>>> Are you serious about us trying to diagnose a bug with this description?
>>
>> What do you want to know, exactly?
> 
> A recipe that we can follow and reproduce the issue.

The nutshell version is as I said: I pretty much followed the manual to
create a test publication/subscription setup and let it run for a few
weeks. I had to increase a few limits but otherwise everything's at the
default settings as shipped in the rpms.

I can send you postgres config files from both nodes and the entire
database dump -- it's all public domain. I can `zfs send` you the
snapshot of the entire /var/lib/pgsql/10 as is, too.

If you are asking for something other than those two extremes, please
let me know what it is and I'll be happy to try and provide it.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-25 Thread Dimitri Maziuk

On 7/25/2018 10:28 AM, Andres Freund wrote:


Are you really expecting us to be able to reproduce the problem based on
the above description? Our test suites do setup plain replications
setups, and the problem doesn't occur there.


I don't, by definition, have a reproducible case: it only happened once 
so far.


If nobody knows what limits the number of files created in 
$PGDATA/pg_logical/snapshots then we'll all have to wait until this 
happens again.


(To somebody else as I'm obviously not turning logical replication back 
on until I know it won't kill my server again.)


Dima



Re: logical replication snapshots

2018-07-25 Thread Dimitri Maziuk
On 07/25/2018 11:16 AM, Adrian Klaver wrote:

> Where you using pg_export_snapshot() by any chance?:

If it's not in chapter 31 of TFM then I have not touched it. The only
notable thing I changed from 31.9 quickstart is that replica identity is
full on all tables (the data's a mess).

> Where there any relevant error messages in the log before the database
> hung?

Do you seriously expect me to know what "relevant" is? ;) -- It's 7
days' worth of

> 2018-07-23 09:43:12.851 CDT [7975] STATEMENT:  COPY 
> macromolecules."Entity_common_name" TO STDOUT
> 2018-07-23 09:43:12.925 CDT [7982] LOG:  logical decoding found consistent 
> point at 19/E6942408
> 2018-07-23 09:43:12.925 CDT [7982] DETAIL:  There are no running transactions.
> 2018-07-23 09:43:12.935 CDT [7982] ERROR:  permission denied for schema 
> macromolecules
> 2018-07-23 09:43:12.935 CDT [7982] STATEMENT:  COPY macromolecules."Assembly" 
> TO STDOUT
> 2018-07-23 09:43:13.034 CDT [7987] LOG:  logical decoding found consistent 
> point at 19/E6942440
> 2018-07-23 09:43:13.034 CDT [7987] DETAIL:  There are no running transactions.
> 2018-07-23 09:43:13.044 CDT [7987] ERROR:  permission denied for schema 
> macromolecules
> 2018-07-23 09:43:13.044 CDT [7987] STATEMENT:  COPY 
> macromolecules."Spectral_transition" TO STDOUT
> 2018-07-23 09:43:13.135 CDT [7990] LOG:  logical decoding found consistent 
> point at 19/E6942478
> 2018-07-23 09:43:13.135 CDT [7990] DETAIL:  There are no running transactions.
> 2018-07-23 09:43:13.145 CDT [7990] ERROR:  permission denied for schema 
> macromolecules
> 2018-07-23 09:43:13.145 CDT [7990] STATEMENT:  COPY 
> macromolecules."Assembly_db_link" TO STDOUT

that ends with

> 2018-07-23 09:45:40.280 CDT [6032] LOG:  database system was interrupted; 
> last known up at 2018-07-06 17:04:28 CDT
> 2018-07-23 09:45:40.281 CDT [6033] FATAL:  the database system is starting up
> 2018-07-23 09:45:40.282 CDT [6034] FATAL:  the database system is starting up
... ad infinitum ...

Would "permission denied" be relevant?

(Not sure where that is coming from: I did initial sync as postgres then
altered it to replication user. Can't track it down now since the
database is dead.)

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-25 Thread Dimitri Maziuk
On 07/25/2018 02:31 PM, Adrian Klaver wrote:

> In your previous post you said:
> 
> "(To somebody else as I'm obviously not turning logical replication back
> on until I know it won't kill my server again.) "
> 
> So what are you working off now?

I have that database rebuilt with no publications defined, and I have
the original $PGDATA on a ZFS volume.

I have not looked at the subscriber host yet, I've more urgent fires to
douse first.
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-26 Thread Dimitri Maziuk
On 07/25/2018 07:57 PM, Andres Freund wrote:
> On 2018-07-25 12:31:01 -0700, Adrian Klaver wrote:

>>> Would "permission denied" be relevant?
>>
>> Logical decoding is something I am still learning. The "permission denied"
>> would to me be relevant only to the extent that it seems to be provoking:
> 
> Yes, it looks related. Looks like logical rep was trying to perform the
> intiial sync of those tables, and couldn't due to permission errors.

I'm not sure what happened, I remember the initial sync of that
particular schema failing on one table only, but looking at it now, all
tables are empty on the subscriber.

>> "LOG:  logical decoding found consistent point at 19/E6942440"
>> DETAIL:  There are no running transactions."
>>
>> Others with more experience in this area would need to fill whether that
>> might account for the 13 million files in the ~snapshot/ directory.
> 
> That indicates there's some error handling issues to be resolved. Petr?

It logs "COPY TO STDOUT", does that actually cache to disk (in ~snapshot/)?

Would it be safe to delete every file in ~snapshot/ older than some
X-number of minutes? (And what a good X might be?)

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-26 Thread Dimitri Maziuk
On 07/26/2018 02:54 PM, Adrian Klaver wrote:
> On 07/26/2018 10:54 AM, Dimitri Maziuk wrote:

>> I'm not sure what happened, I remember the initial sync of that
>> particular schema failing on one table only, but looking at it now, all
>> tables are empty on the subscriber.
> 
> To me that indicates all the syncs failed.

Yeah, well... the error message said one table failed and I went off to
find out why (a co-worker added a column behind everyone's back) and
never checked 'count(*)' on the other tables.

... deleting files in ~snapshots
> Again I don't know the answer to this. Are you trying this on a test
> setup or production one?

I could fire up another instance on a different port off the now-broken
$PGDATA easily enough and test. However if whatever uses those files
needs to start from the very first file and "replay" them in sequence,
that won't work.

The files are named like 19_E6942440.snap which presumably corresponds
to "LOG: logical decoding found consistent point at 19/E6942440" and
they seem to get progressively larger. That suggest that maybe just one
(the newest one) could be good enough...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-26 Thread Dimitri Maziuk
On 07/26/2018 04:39 PM, Adrian Klaver wrote:

> The thing that has me somewhat confused is:
> 
> ERROR: permission denied for schema macromolecules
> 
> I would thought the replication user could access that.

The more I look at the errors, the less sense it all makes.

Let me ask a different question: if I drop and re-create a published
table on the publisher without doing anything to the publication and
subscription, what happens?

Do I need to re-create the publication when I re-create the table?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-26 Thread Dimitri Maziuk
On 07/26/2018 05:34 PM, Adrian Klaver wrote:
> On 07/26/2018 03:01 PM, Dimitri Maziuk wrote:

>> Let me ask a different question: if I drop and re-create a published
>> table on the publisher without doing anything to the publication and
>> subscription, what happens?
> 
> Take a look at:
> 
> https://www.postgresql.org/docs/10/static/sql-alterpublication.html

Thanks but what I ask (and what I suspect happened here while I wasn't
looking) is

- create schema foo with tables bar and baz and some rows in 'em
- create publication foopub with foo.bar and foo.baz
- create subscription to foopub on the other node, initial sync and all
- drop schema foo on the publisher
- create schema foo with tables bar and baz and some rows in 'em

The publication foopub is at this point fubar I take it? And needs to be
re-created on the publisher and reconnected on the subscriber? Complete
with with inital resync?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-27 Thread Dimitri Maziuk
On 07/26/2018 07:11 PM, Adrian Klaver wrote:
> On 07/26/2018 04:48 PM, Dimitri Maziuk wrote:
...
>> The publication foopub is at this point fubar I take it? And needs to be
>> re-created on the publisher and reconnected on the subscriber? Complete
>> with with inital resync?
> 
> Not sure. Personally I would try:
> 
> 1) ALTER PUBLICATION DROP TABLE foo|bar;
> 
> 2) ALTER PUBLICATION ADD TABLE foo|bar;
> 
> 3) ALTER SUBSCRIPTION sub_name REFRESH PUBLICATION
> 
> If you get to 3) it will re-sync the data unless you tell it otherwise.
> 
> The above is probably dependent on the size of the publication. If you
> did a publication for ALL it would make more sense to do the above then
> if you did a publication for just foo or bar.

... but if I did the publication for ALL, I could just use streaming
replication and then drop table/add table would replicate automagically ...

It looks like we probably have to re-think a few of our workflows and
procedures, and until/unless that happens, logical replication won't do
what we want. Which means figuring out this 13-million-files problem
becomes a very low priority for me. Unfortunately: it'd be nice to track
it down and squash it...

Thanks everyone,
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: logical replication snapshots

2018-07-30 Thread Dimitri Maziuk
On 07/27/2018 05:20 PM, Adrian Klaver wrote:
> On 07/27/2018 03:04 PM, Dimitri Maziuk wrote:

>> ... but if I did the publication for ALL, I could just use streaming
>> replication ...
> 
> Well I was just showing the extremes from a single table publication to
> ALL tables. You can also do subsets of ALL. Remember that binary
> replication(streaming) involves the whole Postgres cluster, e.g. all the
> databases in the cluster no choice in the matter. Also it does not allow
> you to shape what is replicated.
Yes, and that's why I was looking into logical replication. Now that I
looked, it seems the limitations are too many for our use case now, and
that spinning up a separate "cluster" for each "replicated dataset" is
simply less work ATM.

Thanks again,
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Question on postgresql.conf

2018-07-31 Thread Dimitri Maziuk

On 7/31/2018 9:36 AM, Matt Zagrabelny wrote:


Or have puppet/chef/ansible/etc. write the correct config file based on 
your dynamic data.


Postgres ain't orrible. I wonder if one actually needs as much 
host-specific config tuning in the first place... I haven't touched 
oracle in forever, though, and mostly managed to repress the memories by 
now.


Dima



Re: Question on postgresql.conf

2018-07-31 Thread Dimitri Maziuk
On 07/31/2018 12:33 PM, George Neuner wrote:

> Is there some reason that postgresql.conf cannot be a link to your
> file?

It's six of one, half a dozen of the other. The big problem is when the
network share is unavailable at host boot and all of your databases are
belong to bitbucket. If you want to do it right, you need chef/puppet as
already suggested.

One useful thing linux (at least) has now is conf.d directories whose
contents are auto-included after the main config: this way you can keep
the vendor-provided /etc/blah.conf and have all your host-specific
settings in /etc/blah.d/myhost.conf, all living together happily ever
after. You still want the latter to be a local file that's guaranteed
available when blah service starts up. (Postgres devs: hint! hint!)

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Question on postgresql.conf

2018-07-31 Thread Dimitri Maziuk
On 07/31/2018 01:05 PM, Steve Atkins wrote:

> You'd do that with "include_dir 'conf.d'" or similar, I think, which 
> PostgreSQL
> has in all current versions.

So it does, huh. I guess it helps to read all the way to end of
postgresql.conf sometimes...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Question on postgresql.conf

2018-07-31 Thread Dimitri Maziuk
On 07/31/2018 01:16 PM, George Neuner wrote:
> 
> I understand the problem with network shares - but the OP mentioned
> only a "shared location", which could be just another directory on the
> same device.   Because the OP said also that there were other DBMS
> being used in the same shop, I considered that there might be a
> respectable SAN involved.

(Veering further OT) yeah and you have to remember to tell Lennartware
to start the automounter before the database and make sure the next
update doesn't "improve" on your configs... IME the shared location that
works best is the chef master.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Who and How is responsible for released installations packages and 3rd party packs? (e.g. on https://yum.postgresql.org/9.6/redhat/rhel-7.3-x86_64/)

2018-08-06 Thread Dimitri Maziuk
On 08/06/2018 03:49 PM, Alexandru Lazarev wrote:

> And the same question about postgresql-server install packages themselves
> (RPMs, debs, etc)

For this one, packages you get from "PGDG"@
https://www.postgresql.org/download/ are built by postgres. RedHat, for
example, also has postgres built by them and included in the
distribution -- in RHEL7 it's v.9.2 but RedHat is known to patch code
without changing version numbers.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Who and How is responsible for released installations packages and 3rd party packs? (e.g. on https://yum.postgresql.org/9.6/redhat/rhel-7.3-x86_64/)

2018-08-10 Thread Dimitri Maziuk
On 08/10/2018 02:49 PM, Christoph Berg wrote:

> plv8 will work for anyone willing to go through that. It's just not
> feasible to support it from a packager perspective.

Hm. Gotta wonder if running this:
https://github.com/clkao/docker-postgres-plv8/blob/master/10-2/Dockerfile
and copying plv8.so out of the resulting image is the easy way out.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: pg_basebackup failed to read a file

2018-08-14 Thread Dimitri Maziuk
On 08/14/2018 12:14 PM, Ron wrote:

> Why is checking a bunch of file permissions anywhere close to being as
> expensive as transferring 1.5TB over a WAN link?

Normally it shouldn't be but I recently had postgres create ~13M .snap
files and just opendir() took longer than anyone would care to wait...
so it can be just as expensive.

One could just as easily ask why create mode 600 files in places where
they don't belong.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Can Pg somehow recognize/honor linux groups to control user access ?

2018-08-22 Thread Dimitri Maziuk
On 08/22/2018 11:38 AM, Ravi Krishna wrote:
> 
> In fact DBAs don't even need to get involved when a new user needs DB access. 
>  Sysadmin had to just add that user in a group and we are done.

How is that different from giving your grants to a database role and
just telling the new user the name and password of that role to connect as?

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Can Pg somehow recognize/honor linux groups to control user access ?

2018-08-22 Thread Dimitri Maziuk
On 08/22/2018 12:54 PM, Ravi Krishna wrote:
>>
>> How is that different from giving your grants to a database role and
>> just telling the new user the name and password of that role to connect as?
> 
> Well here I have to do some work, with the groups approach, it is outsourced 
> to devops.  Secondly when you take into account AD, the user does not have to 
> remember his password for db login.  It is same as AD.

So it seems to me that the feature may be worth adding is to fetch the
password, *as well as "ldapsearchattribute"* from LDAP:
https://www.postgresql.org/docs/10/static/auth-methods.html#AUTH-LDAP

You should be able to get the role name from AD already, but the
password they still have to remember.

Although I still don't see this really working for anything more
complicated than one database and no user in more than one group.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Can Pg somehow recognize/honor linux groups to control user access ?

2018-08-22 Thread Dimitri Maziuk
On 08/22/2018 05:17 PM, Tim Cross wrote:
> 
> I hope your not suggesting that multiple users use the same login
> credentials to access the database?

A database role is effectively a "shared account" with "shared password"
so I'm not sure what your point is. It's just not a "computer login"
shared account. That said, there is nothing scary about shared computer
accounts either.

It's the users that are the scary part.
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: unorthodox use of PG for a customer

2018-08-24 Thread Dimitri Maziuk
On 08/24/2018 02:07 PM, David Gauthier wrote:
> 
> ...  He likes the idea of a
> separate DB per workarea.  He just doesn't gt it.

Well there are advantages to that.

> But for some reason, he doesn't like
> the client/server DB model which would work so nicely here.  I'm just
> trying to make sure I didn't miss some sort of solution, PG or not, that
> would work here.

What you should tell him is that he can't have it both ways. Either it's
multiple worker nodes concurrently writing to the same workstation -- in
which case he needs "a server" on the workstation to accept incoming
connections and all that overhead, with enough oomph to handle
concurrency he expects. Or it's a beefed-up central server where
everybody writes to, and every workstation can poll it and maintain its
own state databases from there (i.e. worker nodes don't talk back to
submit nodes).

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: unorthodox use of PG for a customer

2018-08-24 Thread Dimitri Maziuk
On 08/24/2018 02:35 PM, Andrew Kerber wrote:
> Unless I am missing something, it sounds like you might be able to do this
> with an nfs export shared to each workstation.

That's no different from polling the central database though, you're
just using nfs server and files instead of a db server and queries.

A compute cluster has to have a manager node that has all the state. The
way to do what he wants is to query that manager from each workstation
and keep the results in the local state database. Whether that is
actually feasible/doable in his particular case is another question.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: unorthodox use of PG for a customer

2018-08-27 Thread Dimitri Maziuk

On 8/27/2018 8:20 AM, John McKown wrote:
On Fri, Aug 24, 2018 at 1:19 PM David Gauthier > wrote:



I can't think of anything better.  Does anyone have any ideas?

Yes, but the civil authorities would prosecute you if they found the body.


And that's why we have raised floors and locked doors in the server rooms.

Dima




Re: PostgreSQL: Copy from File missing data error

2018-09-04 Thread Dimitri Maziuk
On 09/04/2018 01:13 PM, Holly Gibons wrote:

> Could someone suggest what I'm doing wrong?

There's probably a magic character that is not escaped properly
somewhere before the error location. Exactly how to find them all in a
2GB+ file is another question: e.g. if you know python you could play
with csv.Reader and unicode/bytes encode/decode...

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: increasing HA

2018-09-05 Thread Dimitri Maziuk
On 09/05/2018 12:04 PM, Thomas Poty wrote:
> I want to get :
> Automatic failover (promoting a slave)
> Automatic Routing traffic to master
> Fencing in case of node failure.

Why would you want to fence a node that's already failed? -- You want to
fence off the master during the failover so it does not accept any more
DB updates.

Anyway, here's how we run:
- we provide services to customers; customer-facing service access
points is where we want to reduce downtime.
- We buy decent quality hardware with redundant disks and power
circuits. 99.% of the downtime is maintenance reboots.
- We carefully partition or services so that most of them require
read-only access. Those run off slaves.
- The master is not even visible to the customers. Its downtime does not
affect service availability.

Obviously, this works because we can run read-only. If you need
read-write on a public-facing database, the above won't work for you.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct plan

2018-09-14 Thread Dimitri Maziuk
On 09/14/2018 09:42 AM, Dave Page wrote:

> There are some fuzzy edges I guess (e.g. Slack), but in my mind it's always
> been anyone who participates in any of the projects communications channels.

Then you Sir are an evil ter'rist member of isis because your spoken
words are carried by the same air in the same atmosphere as theirs.
Please stand by while the black helicopters are being dispatched to your
current location, you will be shot in the face and dropped in the ocean
shortly.

Have a nice day
-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct plan

2018-09-14 Thread Dimitri Maziuk
On 09/14/2018 12:14 PM, Peter Geoghegan wrote:

> No CoC can possibly provide for every conceivable situation. Somebody
> has to interpret the rules, and it has to be possible to impose
> sanctions when the CoC is violated -- otherwise, what's the point?
> There are several checks and balances in place, and I for one have
> confidence in the process as outlined. It's imperfect, but quite a lot
> better than either the status quo, or a platitude about inclusivity.

So let me get this straight: you want to have a "sanctioned" way to deny
people access to postgresql community support channel?  "Because
somebody who may or may not be the same person, allegedly said something
somewhere that some other tweet disagreed with on faceplant"?

Great plan if you do for-pay postgresql support for the living.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct plan

2018-09-14 Thread Dimitri Maziuk
On 09/14/2018 12:46 PM, Peter Geoghegan wrote:
> On Fri, Sep 14, 2018 at 10:31 AM, Dimitri Maziuk  
> wrote:
>> So let me get this straight: you want to have a "sanctioned" way to deny
>> people access to postgresql community support channel?
> 
> Yes.

A question to TPTBs, then: once The Great Plan is implemented, will I be
automagically unsubscribed from all postgres lists because I did not
explicitly agree to abide by The Rules And Regulations back when I
susbscribed?

Personally I would like that. Others might prefer an invitation to
unsubscribe or forever hold their peace, I could live with that too, but
I believe explicit opt-ins are preferable to opt-outs.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct plan

2018-09-14 Thread Dimitri Maziuk
On 09/14/2018 01:17 PM, Chris Travers wrote:

> And frankly I am probably being paranoid here though I find paranoia is a
> good thing when it comes to care of databases and computer systems.  But I
> do worry about the interactions between the PostgreSQL community and the
> larger world with things worded this way.

"The issue isn't whether you're paranoid, it's whether you're paranoid
enough"

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature


Re: Code of Conduct plan

2018-09-17 Thread Dimitri Maziuk
On 09/17/2018 10:39 AM, Chris Travers wrote:
> On Mon, Sep 17, 2018 at 5:28 PM Joshua D. Drake 
> wrote:
...
>> My feedback is that those two sentences provide an overarching authority
>> that .Org does not have the right to enforce 
...
> Fascinating that this would, on its face, not apply to a harassment
> campaign carried out over twitter, but it would apply to a few comments
> made over drinks at a bar.

There is a flip side: if you have written standards, you can be held
liable for not enforcing them. Potentially including enforcement of
twitbook AUP on the list subscribers who also have a slackogger account.

-- 
Dimitri Maziuk
Programmer/sysadmin
BioMagResBank, UW-Madison -- http://www.bmrb.wisc.edu



signature.asc
Description: OpenPGP digital signature