On Wed, 25 Jan 2012, Rick Macklem wrote:

Bruce Evans wrote:
On Tue, 24 Jan 2012, Rick Macklem wrote:

Bruce Evans wrote:
On Wed, 25 Jan 2012, Rick Macklem wrote:

Log:
 If a mount -u is done to either NFS client that switches it
 from TCP to UDP and the rsize/wsize/readdirsize is greater
 than NFS_MAXDGRAMDATA, it is possible for a thread doing an
 I/O RPC to get stuck repeatedly doing retries. This happens
 ...

Could it wait for the old i/o to complete (and not start any new
i/o)? This is little different from having to wait when changing
from rw to ro. The latter is not easy, and at least the old nfs
client seems to not even dream of it. ffs has always called a
...

As you said above "not easy ... uses complicated suspension of i/o".
I have not tried to code this, but I think it would be non-trivial.
The code would need to block new I/O before RPCs are issued and wait
for all in-progress I/Os to complete. At this time, the kernel RPC
handles the in-progress RPCs and NFS doesn't "know" what is
outstanding. Of course, code could be added to keep track of in-progress
I/O RPCs, but that would have to be written, as well.
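
The bookkeeping described above (block new I/O, then wait for in-progress I/O to complete) might look roughly like the following user-space sketch. It is only an illustration, not the FreeBSD kernel code: the class IoGate and all of its method names are hypothetical, and a real implementation would live in the NFS client and kernel RPC layers.

```python
import threading


class IoGate:
    """Hypothetical gate that counts in-flight I/O RPCs and can block new ones."""

    def __init__(self):
        self._cond = threading.Condition()
        self._in_flight = 0
        self._suspended = False

    def begin_rpc(self):
        """Called before an I/O RPC is issued; blocks while I/O is suspended."""
        with self._cond:
            while self._suspended:
                self._cond.wait()
            self._in_flight += 1

    def end_rpc(self):
        """Called when an I/O RPC completes (reply received or error returned)."""
        with self._cond:
            self._in_flight -= 1
            self._cond.notify_all()

    def suspend_and_drain(self, timeout=None):
        """Block new RPCs, then wait for the in-flight ones to finish.

        Returns True if everything drained, False if the timeout expired
        first. I/O stays suspended either way until resume() is called.
        """
        with self._cond:
            self._suspended = True
            return self._cond.wait_for(lambda: self._in_flight == 0, timeout)

    def resume(self):
        """Let blocked and new RPCs proceed again."""
        with self._cond:
            self._suspended = False
            self._cond.notify_all()
```

With timeout=None the drain waits indefinitely, which is exactly the situation that becomes a problem when the server is unreachable and the in-flight RPCs never complete.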

Hmm, this means that even when the i/o sizes are small, the mode switch
from tcp to udp may be unsafe since there may still be i/o's with higher
sizes outstanding. So to switch from tcp to udp, the user should first
reduce the sizes, then wait a while before switching to udp. And what
happens with retries after changing sizes up or down? Does it retry
with the old sizes?

Bruce
Good point. I think (assuming a TCP mount with large rsize):
# mount -u -o rsize=16384 /mnt
# mount -u -o udp /mnt
- could still result in a wedged thread trying to do a read that
 is too large for UDP.
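
A toy model of that sequence, as a sketch only. It assumes that a read already issued keeps the size it was built with when it is retried, which is the behaviour suspected in this thread rather than something confirmed here. The Mount class and its methods are hypothetical, and the value used for NFS_MAXDGRAMDATA is an assumption.

```python
NFS_MAXDGRAMDATA = 16384  # assumed UDP data limit in bytes; check the headers


class Mount:
    """Hypothetical model of an NFS mount and its outstanding read RPCs."""

    def __init__(self, rsize, proto):
        self.rsize = rsize
        self.proto = proto
        self.outstanding = []  # sizes of reads issued but not yet completed

    def start_read(self):
        # The request is built with whatever rsize is current right now.
        self.outstanding.append(self.rsize)

    def update(self, rsize=None, proto=None):
        # Models "mount -u": changes settings for future I/O only and
        # does not touch requests that are already outstanding.
        if rsize is not None:
            self.rsize = rsize
        if proto is not None:
            self.proto = proto

    def wedged_reads(self):
        # Outstanding reads that can never fit in a UDP datagram.
        if self.proto != "udp":
            return []
        return [n for n in self.outstanding if n > NFS_MAXDGRAMDATA]


m = Mount(rsize=65536, proto="tcp")
m.start_read()          # a large read is in flight over TCP
m.update(rsize=16384)   # mount -u -o rsize=16384 /mnt
m.update(proto="udp")   # mount -u -o udp /mnt
stuck = m.wedged_reads()
```

Under these assumptions the large read is still outstanding after both updates, so lowering rsize first narrows the window but does not close it.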

I'll revert r230516, since it doesn't really fix the problem; it just
reduces its likelihood.

That seems a regression.

I'll ask on freebsd-fs@ if anyone finds switching from TCP->UDP via a
"mount -u" is useful to them. If no one thinks it's necessary, the patch
could just disallow the switch, no matter what the old rsize/wsize/readdirsize
is.

I use it a lot for performance testing.  Of course it is unnecessary,
since at least for performance testing it is possible to do a full
unmount and re-mount, but mount -u is more convenient.

Otherwise, the fix is somewhat involved and difficult for a scenario
like this, where the NFS server is network partitioned or crashed:
- sysadmin notices NFS mount is "hung" and does
 # mount -u -o udp /path
 to try and fix it, but it doesn't help
- sysadmin tries "umount -f /path" to get rid of the "hung" mount.

Now I wonder what makes a full unmount (without -f) and re-mount work.

If "mount -u -o udp /path" is waiting for I/O ops to complete,
(which is what the somewhat involved patch would need to do) the
"umount -f /path" will get stuck waiting for the "mount -u"
which will be waiting for I/O RPCs to complete. This could

I often misremember -f for umount as meaning don't wait.  It actually
means to forcibly close files before proceeding.

be partially fixed by making sure that the "mount -u -o udp /path" is
interruptible (via <ctrl>C), but I still don't like the idea that
"umount -f /path" won't work if "mount -u -o udp /path" is sitting in
the kernel waiting for RPCs to complete, which would need to be done
to make a TCP->UDP switch work.
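
The wait chain described above can be imitated in user space as follows. This is a sketch under stated assumptions, not the kernel's locking: mount_lock stands in for whatever serializes operations on a mount, the interrupted event stands in for a signal such as ctrl-C, and every name here is hypothetical.

```python
import threading

mount_lock = threading.Lock()     # hypothetical per-mount serialization
holding = threading.Event()       # set once the update holds mount_lock
rpcs_done = threading.Event()     # would be set when in-flight RPCs finish
interrupted = threading.Event()   # stand-in for a signal such as ctrl-C


def mount_update_to_udp(poll=0.01):
    """Hold the mount lock while waiting for RPCs; give up if interrupted."""
    with mount_lock:
        holding.set()
        # Event.wait() returns False on timeout, so this polls for both
        # RPC completion and an interrupt request.
        while not rpcs_done.wait(poll):
            if interrupted.is_set():
                return "interrupted"
        return "switched"


def forced_unmount(timeout):
    """Needs the same lock, so it stalls behind a waiting mount update."""
    if not mount_lock.acquire(timeout=timeout):
        return "stuck"
    mount_lock.release()
    return "unmounted"
```

If the server never answers, rpcs_done is never set, and forced_unmount() keeps reporting "stuck" until something sets interrupted. That mirrors the concern that an interruptible "mount -u" is only a partial fix.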

Doesn't umount -f have to wait for i/o anyway?  When it closes files,
it must wait for all in-progress i/o for the files, and for all new
i/o's that result from closing.

Bruce
_______________________________________________
svn-src-all@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
