Has anybody seen this before, or got any insight?

James

James Masson wrote:
> Hi list,
> 
> I'm using DRBD and NFS to provide HA for Virtual Machine images between
> pairs of storage servers.
> 
> Systems are RHEL 5.4 (kernel 2.6.18-164.el5) with drbd8.3 from CentOS Extras.
> 
> We've been having issues where disk I/O problems on the DRBD Secondary stop
> all I/O to the Primary too. DRBD doesn't seem to recognise these disk I/O
> problems, and the Secondary isn't disconnected automatically. Everything
> just hangs.
> 
> During this state:
> If I try a "drbdadm disconnect all" on the Primary, the command hangs.
> If I try this on the Secondary, the command eventually completes, and NFS
> I/O returns to normal operation on the Primary.
> 
> I've tried the following things to fix this:
> 
> 1) Putting in a custom local-io-error handler to hard reset the problem node.
> 
> This never triggers, just like the default "detach", which never triggers
> either.
> 
> 2) Changing the net connection parameters to:
> 
>       net {
>               ko-count 2;
>               timeout 20;
>       }
> 
> Again, this never triggers.
> 
> 
> 3) Changing the protocol used from C to B
> 
> Doesn't have any effect on the issue - I'd prefer to use C anyway.
> 
> 
> Any further ideas on how to track this issue down and fix it?
> 
> thanks
> 
> James Masson
> _______________________________________________
> drbd-user mailing list
> [email protected]
> http://lists.linbit.com/mailman/listinfo/drbd-user