Hello! I am currently working with a three-node DRBD9 setup, where A and B are 
diskful nodes, and C is a diskless node. In a certain scenario, I have observed 
that both diskful nodes (A and B) can end up in the "Outdated" state. Below are 
the steps to reproduce this issue:


1. The primary node is initially on B.
2. The connection between B and A is severed (using iptables in my experiment), 
leaving A in the "Outdated" state. Some data is then written to B.
3. The connection between B and A is restored. A becomes "Inconsistent" and 
starts syncing the latest data from B.
4. During the sync process, B is demoted from primary, and A is promoted to 
primary.
5. While the sync process is ongoing, the connection between A and B is severed 
again, leaving B in the "outdated" state.
6. The connection is restored, and the synchronization process completes. 
However, both A and B are now in the "outdated" state and remain so even after 
a restart.
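
For anyone who wants to try this, the steps above can be sketched as a dry-run 
plan. This is only an illustration: the resource name "r0", the two IP 
addresses, and the exact iptables rules are placeholders (assumptions), not 
the precise ones from my experiment. The script only prints the commands and 
the node each one would run on; it does not execute anything.

```python
from typing import List, Tuple

# Hypothetical values, chosen only for illustration.
RESOURCE = "r0"
ADDR_A = "192.0.2.1"
ADDR_B = "192.0.2.2"


def block(peer_ip: str) -> str:
    """iptables rule that drops all incoming traffic from the peer."""
    return f"iptables -A INPUT -s {peer_ip} -j DROP"


def unblock(peer_ip: str) -> str:
    """Remove the rule added by block()."""
    return f"iptables -D INPUT -s {peer_ip} -j DROP"


def reproduction_plan(resource: str = RESOURCE,
                      addr_a: str = ADDR_A) -> List[Tuple[str, str]]:
    """Return (node, command) pairs following steps 1-6 of the report."""
    return [
        # Step 1: B is primary.
        ("B", f"drbdadm primary {resource}"),
        # Step 2: sever B <-> A, then write data on B.
        ("B", block(addr_a)),
        ("B", "# write some data to the DRBD device here"),
        # Step 3: restore the connection; A starts resyncing from B.
        ("B", unblock(addr_a)),
        # Step 4: while the resync is running, swap roles.
        ("B", f"drbdadm secondary {resource}"),
        ("A", f"drbdadm primary {resource}"),
        # Step 5: sever the connection again during the resync.
        ("B", block(addr_a)),
        # Step 6: restore the connection and inspect the disk states.
        ("B", unblock(addr_a)),
        ("A", f"drbdadm status {resource}"),
        ("B", f"drbdadm status {resource}"),
    ]


if __name__ == "__main__":
    for node, command in reproduction_plan():
        print(f"[{node}] {command}")
```

Steps 4 and 5 have to happen before the resync from step 3 finishes, so in 
practice enough data must be written in step 2 to keep the resync running for 
a while.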


I am using DRBD 9.2.8 and have reproduced this issue multiple times with the 
same result. After analyzing the behavior, I believe the root cause is that 
DRBD allows an "Inconsistent" node to be promoted to primary, provided it has a 
stable connection to an "UpToDate" node. However, this can lead to the 
following issue:


When an "Inconsistent" node (while syncing) becomes primary and then its 
connection to another node is severed, the other "UpToDate" node becomes 
"outdated." Once the connection is restored and synchronization completes, both 
nodes end up in the "outdated" state.


I have the following questions:


1. Is it possible to configure DRBD to disallow the promotion of an 
"Inconsistent" node to primary? This would help avoid this issue.
2. If both diskful nodes are in the "Outdated" state, is it guaranteed that 
their data is consistent? If the data is consistent, would it be safe to use 
the --force option to promote one of the nodes to primary to resolve the 
situation?
3. Can nodes in the "Inconsistent" or "Outdated" state participate in voting? 
Based on my understanding of distributed systems like etcd, unhealthy nodes are 
not allowed to vote or become leaders.
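
To make the condition behind question 2 concrete, here is a minimal sketch of 
how I check for "no diskful node is UpToDate". It assumes the JSON printed by 
"drbdsetup status --json" and the field names "disk-state", "peer_devices", 
and "peer-disk-state" as I see them on my version; these may differ on other 
versions, so please treat them as assumptions. The sample document below is 
hand-written for illustration and is not captured output.

```python
import json
from typing import Dict

# Hand-written sample, as it might look when queried on node A (assumption).
SAMPLE = """
[
  {
    "name": "r0",
    "role": "Secondary",
    "devices": [{"volume": 0, "disk-state": "Outdated"}],
    "connections": [
      {"name": "B", "peer-role": "Secondary",
       "peer_devices": [{"volume": 0, "peer-disk-state": "Outdated"}]},
      {"name": "C", "peer-role": "Secondary",
       "peer_devices": [{"volume": 0, "peer-disk-state": "Diskless"}]}
    ]
  }
]
"""


def disk_states(status_json: str, resource: str) -> Dict[str, str]:
    """Map node name to the disk state of volume 0, as seen locally."""
    for res in json.loads(status_json):
        if res.get("name") != resource:
            continue
        states = {"local": res["devices"][0]["disk-state"]}
        for conn in res.get("connections", []):
            states[conn["name"]] = conn["peer_devices"][0]["peer-disk-state"]
        return states
    raise KeyError(f"resource {resource!r} not found in status output")


def no_uptodate_copy(states: Dict[str, str]) -> bool:
    """True if at least one node has a disk and none of them is UpToDate."""
    diskful = [s for s in states.values() if s != "Diskless"]
    return bool(diskful) and all(s != "UpToDate" for s in diskful)


if __name__ == "__main__":
    states = disk_states(SAMPLE, "r0")
    print(states, "-> no UpToDate copy:", no_uptodate_copy(states))
```

This only detects the state; it says nothing about whether the two 
"Outdated" copies actually hold identical data, which is what I am asking.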


I would greatly appreciate your guidance on these issues. Thank you in advance 
for your time and support, and I look forward to your reply.


Best regards,
Rui
