I'm assuming, based on the "SSL error", that you have ssl set to 'on'. What's
your ssl_renegotiation_limit? The default is 512MB, but setting it to 0 (which
disables renegotiation) has solved problems for a number of people on this
list, including me.
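
For reference, a minimal sketch of the change, assuming it is made in
postgresql.conf on the server that pg_basebackup connects to (adjust to
however you manage your configuration):

```conf
# postgresql.conf on the server pg_basebackup connects to
ssl = on
ssl_renegotiation_limit = 0
```

You can check the current value from psql with "SHOW ssl_renegotiation_limit;".
As far as I know a reload (for example "SELECT pg_reload_conf();") is enough
for new connections to pick up the setting, but I have not verified that on
your exact version.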

Sherrylyn

On Thu, Sep 24, 2015 at 3:57 PM, Francisco Reyes <li...@natserv.net> wrote:

> Have an existing setup of 9.3 servers. Replication has been rock solid,
> but recently the circuits between data centers were upgraded and
> pg_basebackup now seems to fail often when setting up streaming
> replication. What used to take 10+ hours now takes only 68 minutes, but it
> needed many retries. Many attempts fail within minutes while others go to
> 90% or higher and then drop. The reason we are doing a sync is because we
> have to swap data centers every so often for compliance. So I had to swap
> master and slave.
>
> Calling pg_basebackup like this:
> pg_basebackup -P -R -X s -h <HostName> -D <Folder> -U replicator
>
> The error we keep having is:
> Sep 23 13:36:32 <HostName> postgres[16804]: [11-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [12-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: SSL error: bad write retry
> Sep 23 13:36:32 <HostName> postgres[16804]: [13-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator FATAL: connection to client lost
> Sep 23 13:36:32 <HostName> postgres[16972]: [9-1] 2015-09-23 13:36:32 EDT
> <IP> [unknown] replicator LOG: could not receive data from client:
> Connection reset by peer
>
> I have been working with the network team and we have even been actively
> monitoring the line, and running ping, as the replication is set up. At the
> point the connection reset by peer error happens, we don't see any issue
> with the network and ping doesn't show an issue at that point in time.
>
> The issue also happened on another set of machines and likewise, had to
> retry many times before pg_basebackup would do the initial sync. Once the
> initial sync is set, replication is fine.
>
> I tried both "-X s" (stream) and "-X f" (fetch) and both fail often.
>
> Any ideas what may be going on?
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
