Re: [Pacemaker] streamed writes fail with migration for NFS v3 over TCP

2009-05-20 Thread Tim Serong
Bob Haxo wrote: > Anyone have any ideas why NFSv3 over TCP reads should be successful > across 100s of migrations and failovers, but writes bomb? You might be suffering from a variant of this: http://marc.info/?l=linux-nfs&m=123175640421702&w=2 In particular, note the behaviour described for a

[Pacemaker] A few questions

2009-05-20 Thread Ryan Steele
Hey folks, Been toying with OpenAIS and Pacemaker for a day or two, and I have a few questions that I couldn't find verbiage on in the documentation, which I hope some wiser minds might be able to answer. 1. Is there a bug in crm_attribute, specifically with --attr-name (-n)? I can't seem

Re: [Pacemaker] streamed writes fail with migration for NFS v3 over TCP

2009-05-20 Thread Bob Haxo
Hi Lars, 1) wireshark ... really nice tool. wireshark and I are already well on our way to becoming close friends as I try to debug this situation. 2) this is a pure test environment with everything that I can do to make the setup simple. Therefore no firewall configured on these systems. (All

Re: [Pacemaker] streamed writes fail with migration for NFS v3 over TCP

2009-05-20 Thread Florian Haas
lmb, can you ring Volker and see how you can steal their tickle ACK thingy? :) Florian On 2009-05-20 00:15, Bob Haxo wrote: > Greetings, > > I find that streamed writes fail with migration for NFS v3 over TCP. > Not every time, but almost every time. > > Streamed writes continue nicely across

Re: [Pacemaker] streamed writes fail with migration for NFS v3 over TCP

2009-05-20 Thread Lars Ellenberg
On Tue, May 19, 2009 at 03:15:17PM -0700, Bob Haxo wrote: > Greetings, > > I find that streamed writes fail with migration for NFS v3 over TCP. > Not every time, but almost every time. > > Streamed writes continue nicely across many migrations for NFS v3 over > UDP. > > With TCP, writes continue

Re: [Pacemaker] trigger STONITH for testing purposes

2009-05-20 Thread Bob Haxo
Hi Andrew, > I'd say you removed no-quorum-policy=ignore Actually, the pair of no_quorum_policy and no-quorum-policy are set to "ignore", and expected-quorum-votes is set to "2": ... ... Removing the no-quorum-policy=ignore and no_quorum_po

Re: [Pacemaker] streamed writes fail with migration for NFS v3 over TCP

2009-05-20 Thread Bob Haxo
Hi Karl, I have not encountered stale file handles with NFSv3 migration with streamed write failures. And I'm pretty certain that at least some of the time I wait more than 90 sec for the migration to happen before declaring failure and migrating back to the original server. I would first like to

Re: [Pacemaker] Clone config question

2009-05-20 Thread Dejan Muhamedagic
Hi, On Wed, May 20, 2009 at 11:04:49AM +0200, Andrew Beekhof wrote: > On Wed, May 20, 2009 at 8:55 AM, Mark Schenk wrote: > > Hello Andrew, > > > > thanks for the offer, however I'm pretty sure that it's my lack of > > knowledge that's the problem here and not pacemaker :-) I'll experiment on >

Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-20 Thread Nikola Ciprich
On Wed, May 20, 2009 at 02:02:52PM +0200, Andrew Beekhof wrote: > Ah, well that was pretty obvious. > /me humbly apologizes for such a stupid error. Hi and thanks! no problem > (It wasn't caught by my own valgrind testing because this function is > specific to heartbeat based clusters) don't worr

Re: [Pacemaker] cib still leaks in pacemaker-1.0.3

2009-05-20 Thread Andrew Beekhof
Ah, well that was pretty obvious. /me humbly apologizes for such a stupid error. (It wasn't caught by my own valgrind testing because this function is specific to heartbeat based clusters) Try this: diff -r ea5d0b58c0be cib/callbacks.c --- a/cib/callbacks.c Wed May 20 11:56:39 2009 +0200 +++

Re: [Pacemaker] trigger STONITH for testing purposes

2009-05-20 Thread Andrew Beekhof
On Wed, May 20, 2009 at 1:31 AM, Bob Haxo wrote: > Greetings, > > I liked the idea of not starting the cluster at boot, and found that the > fenced node would reboot and then openais start brought the node onboard > without triggering a reboot of the already running node. > > Then magic happened. 

Re: [Pacemaker] Clone config question

2009-05-20 Thread Andrew Beekhof
On Wed, May 20, 2009 at 8:55 AM, Mark Schenk wrote: > Hello Andrew, > > thanks for the offer, however I'm pretty sure that it's my lack of > knowledge that's the problem here and not pacemaker :-) I'll experiment on > and repost here when I'm really stuck... honestly, it's ok... even if it's not