I think RR DNS is the only viable solution under these circumstances. If you can accept that failovers won't be seamless, though, I don't think there's anything wrong with it.

On 07/21/2015 11:54 AM, Laz C. Peterson wrote:
The consensus seems to say no to RR DNS … I am going to take that into serious 
consideration.

With this proxy setup you describe, what would happen if HAProxy or Dovecot 
Proxy were to fail?

I think there is no problem with many moving parts, as long as there is a 
backup plan in case something goes awry.  My goal is slightly different: I 
want HA across datacenters without using BGP or having control over the IP 
space (so, no anycast).  Just a simple way to get clients redirected to the 
other Dovecot server when I lose an entire datacenter network for whatever 
reason.

~ Laz Peterson
Paravis, LLC

On Jul 20, 2015, at 5:32 PM, Chad M Stewart <c...@balius.com> wrote:


Last I checked, round-robin DNS can be fraught with issues.

While doing something else I came up with this idea:  Clients --> Load Balancer(HAProxy) 
--> Dovecot Proxy(DP) --> Dovecot Director(DD) --> MS1 / MS2.
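Concretely, the first hop could be something like this minimal haproxy.cfg 
sketch (the dp1/dp2 addresses are placeholders; TCP mode keeps HAProxy out of 
the IMAP protocol itself):

    # haproxy.cfg (sketch)
    defaults
        mode tcp
        timeout connect 5s
        timeout client  1h
        timeout server  1h

    frontend imaps_in
        bind :993
        default_backend dovecot_proxies

    backend dovecot_proxies
        balance leastconn
        # placeholder addresses for the two DP instances
        server dp1 10.0.0.11:993 check
        server dp2 10.0.0.12:993 check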


When DP looks up, say, user100, it will find host=DD-POD1, a name that 
resolves to two IPs: those of the two DDs that sit in front of POD1. This DD 
pair is the only pair in its ring and is responsible only for POD1; another 
pair will handle POD2.  When a DD looks up the host value for that user, it 
will find the same name, but the IPs returned will be different: the IPs of 
the two mail stores instead.
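On the DP side, that lookup could be a plain sql passdb along the lines of 
the documented proxy examples (a sketch; the table and column names here are 
made up):

    # dovecot-sql.conf.ext on the Dovecot Proxy (sketch)
    driver = mysql
    connect = host=127.0.0.1 dbname=mail user=dovecot password=secret
    # 'nopassword' lets the backend do the real authentication;
    # 'host' is the DNS name that resolves to the POD's DD pair
    password_query = SELECT NULL AS password, 'Y' AS nopassword, \
      'Y' AS proxy, pod_directors AS host \
      FROM users WHERE username = '%u'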

I believe this will achieve what I'm after.  HAProxy will load balance the DP 
instances, DP will balance the DDs, and the DDs will do their job and ensure 
that, say, user300 has all of their connections sent to MS1.  When I need to do 
maintenance on MS1, I can use the DD pair for POD1 to gently move the 
connections to MS2, etc.   I could also make each POD a 2+1 cluster, so a 
silent but up-to-date, replicated store sits there waiting should it be needed, 
or even a 2+2 cluster.  After all, "two is one, and one is none".
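The gentle move would be the usual director drain; on one of the POD1 
directors, something like this (assuming MS1's IP is 10.0.1.1):

    doveadm director update 10.0.1.1 0   # vhost count 0: no new users to MS1
    doveadm director flush 10.0.1.1      # reassign its existing users
    doveadm director remove 10.0.1.1     # optionally drop it from the ring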

Not sure when I'll get time to implement/test this, but in theory it sounds 
reasonable. I admit it's a fair number of moving parts and areas for failure, 
but I think it may be the balance needed to achieve the service-level 
availability I'm after while still allowing for maintenance on the systems 
without clients noticing.

-Chad


On Jul 20, 2015, at 1:04 PM, Laz C. Peterson <l...@paravis.net> wrote:

I’m trying to do this too, but the goal would be simply automatic failover to 
the other datacenter.  Everything works if the server’s unique hostname is 
entered, but I want something like round-robin DNS, where mail clients 
automatically attempt to connect to the other IP if they cannot reach the 
first address.  Unfortunately, mail applications don’t really do this the way 
web browsers do …
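(For what it’s worth, the round-robin part itself is trivial: just two A 
records on one name, e.g. in a zone-file sketch with placeholder names and 
addresses:

    imap.example.com.  300  IN  A  192.0.2.10      ; datacenter 1
    imap.example.com.  300  IN  A  198.51.100.10   ; datacenter 2

The catch is exactly what I describe above: nothing forces a mail client to 
try the second address when the first one times out.)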

~ Laz Peterson
Paravis, LLC

On Jul 20, 2015, at 10:29 AM, Chad M Stewart <c...@balius.com> wrote:


I'm trying to determine which Dovecot components to use and how to order them 
in the network path from client to mail store.


Say I have 1,000 users, all stored in MySQL (or LDAP), and 4 mail stores 
configured into two 2-node pods.


MS1 and MS2 form pod1, are configured with replication (dsync), and host users 
0-500.  MS3 and MS4 form pod2, are configured with replication between them, 
and host users 501-1000.   Ideally the active connections in pod1 would be 
split 50/50 between MS1 and MS2.  When maintenance is performed, all active 
connections/users would obviously be moved to the other node in the pod and 
then rebalanced once maintenance is completed.
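The replication between pod members would be the standard dsync setup, roughly 
this sketch per node (hostnames and the password are placeholders; the 
aggregator listener permissions are omitted here):

    # on MS1, with MS2 pointing back at MS1 the same way (sketch)
    mail_plugins = $mail_plugins notify replication
    plugin {
      mail_replica = tcp:ms2.example.com
    }
    service replicator {
      process_min_avail = 1
    }
    # doveadm listener that the peer's dsync connects to
    doveadm_port = 12345
    doveadm_password = secret
    service doveadm {
      inet_listener {
        port = 12345
      }
    }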

I'm not sure whether I need to use both the proxy and the director, or just one 
or the other. If both, what is the proper path from a network perspective?  I 
like the functionality the director provides: being able to add/remove servers 
on the fly, adjust connections, etc. But from what I've read, the director 
needs to know about all mail servers, and the problem is that not all servers 
host all users.  User100 could be serviced by ms1 or ms2, but not by ms3 or ms4.

I'm trying to design a system that should provide as close to 99.999% service 
availability as possible.



Thank you,
Chad
