Yep, you're right, I misread that (shouldn't send email pre-coffee). Are the timeouts repeatable enough that you can get a packet capture in there before and while they're happening?
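
For anyone else reading along: the mount output quoted below carries both proto=tcp and mountproto=udp, which is easy to misread at a glance. Here is a minimal sketch in Python that splits the options so the two settings can be compared side by side. The helper name parse_mount_options is made up for illustration, and the option string is copied from the quoted mount output; nothing here is specific to TSM or Isilon.

```python
# Option string copied from the mount output quoted later in this thread.
MOUNT_OPTS = (
    "rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,"
    "proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,"
    "mountvers=3,mountport=300,mountproto=udp,local_lock=none,"
    "addr=192.168.19.12"
)


def parse_mount_options(opts):
    """Split a comma-separated NFS option string into a dict.

    Hypothetical helper for illustration only.
    Flags without a value (e.g. "hard") map to True.
    """
    parsed = {}
    for item in opts.split(","):
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


options = parse_mount_options(MOUNT_OPTS)
# proto: protocol for the NFS connection and data transport.
print("proto      =", options.get("proto"))
# mountproto: protocol used only for the mount/umount exchange.
print("mountproto =", options.get("mountproto"))
```

If proto comes back as tcp, the NFS data traffic itself is going over TCP; mountproto only covers the mount/umount exchange, per the Red Hat note quoted below.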

On Wed, Sep 05, 2018 at 07:09:09PM -0400, Zoltan Forray wrote:
> Skylar,
>
> I sent your comment about UDP vs TCP to my OS tech (beyond my ken) - got
> this feedback:
>
> I assume what they are talking about is this:
>
> hhisilonnfs23.rams.adp.vcu.edu:/ifs/NFS/TSM on /tsmnfs type nfs
> (rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,mountvers=3,mountport=300,
> *mountproto=udp*,local_lock=none,addr=192.168.19.12)
>
> Looks like this is the default setting (also on all the other servers to
> initiate a conversation with the NFS server). However, if you read the
> documentation on this option it goes into detail about how this option
> differs from proto (which is also defined):
>
> https://access.redhat.com/solutions/183583
>
> "mountproto differs from proto as it defines what protocol (TCP or UDP) the
> client will use to initiate the connection and conduct the mount and
> umount operations.
> This differs from the proto option which sets the protocol that the initial
> connection *and* the actual transportation will use."
>
> The proto option (set to TCP in the mount) appears to be determining how
> the actual connection and transport of data is conducted.
>
> When running a tcpdump on Earth I see NFS TCP traffic running over the 23
> VLAN (and the 22 VLAN on other TSM servers) and no UDP packets to speak of.
>
> On Wed, Sep 5, 2018 at 10:25 AM Skylar Thompson <skyl...@uw.edu> wrote:
>
> > It looks like you're using UDP as a transport - have you tried switching to
> > TCP? Especially with large NFS payload sizes, you're going to get lots of
> > fragmentation with UDP's 512-byte packet limit.
> >
> > On Wed, Sep 05, 2018 at 09:03:25AM -0400, Zoltan Forray wrote:
> > > A pair of 10G links bonded - CISCO switches.
> > >
> > > On Tue, Sep 4, 2018 at 7:54 PM Skylar Thompson <skyl...@uw.edu> wrote:
> > >
> > > > Quick question - what's the data link protocol (Ethernet, IB, etc.)
> > > > and link rate that you're using?
> > > >
> > > > On Tue, Sep 04, 2018 at 02:05:33PM -0400, Zoltan Forray wrote:
> > > > > We are still fighting issues with ISILON storage. Our current issue
> > > > > is with NFS timeouts for the storage a server is using. We see
> > > > > messages like these in the server /var/log
> > > > >
> > > > > Sep 4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:22:14 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:22:15 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu not responding, still trying
> > > > > Sep 4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
> > > > > Sep 4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
> > > > > Sep 4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
> > > > >
> > > > > OS folks say the NFS mount is set up as IBM recommends in various
> > > > > documents. So they asked us to implement the nfstimeout option from
> > > > > this document (
> > > > > https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.0/com.ibm.itsm.client.doc/r_opt_nfstimeout.html
> > > > > ).
> > > > > Yes I realize it is primarily for a client backup of an NFS mount,
> > > > > but the statement:
> > > > >
> > > > > Supported Clients This option is for all UNIX and Linux clients. *The
> > > > > server can also define this option*.
> > > > >
> > > > > throws us - kind of implying I can use this from the server
> > > > > perspective? But I can't find any documentation to support using it
> > > > > from the server.
> > > > >
> > > > > For you Linux gurus - this is what the mount says:
> > > > >
> > > > > hhisilonnfs23.rams.adp.vcu.edu:/ifs/NFS/TSM on /tsmnfs type nfs
> > > > > (rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,mountvers=3,mountport=300,mountproto=udp,local_lock=none,addr=192.168.19.12)
> > > > >
> > > > > Any thoughts? Suggestions? Are we simply expecting too much from
> > > > > NFS?
> > > > >
> > > > > My OS person also asks why ISP is so slow to write to NFS? When they
> > > > > did a test copy of a large file to the NFS mount, they were getting
> > > > > upwards of 8G/s vs 1.5-3G/s when TSM/ISP writes to it (via EMC
> > > > > monitoring tools).
> > > > >
> > > > > --
> > > > > *Zoltan Forray*
> > > > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > > > Xymon Monitor Administrator
> > > > > VMware Administrator
> > > > > Virginia Commonwealth University
> > > > > UCC/Office of Technology Services
> > > > > www.ucc.vcu.edu
> > > > > zfor...@vcu.edu - 804-828-4807
> > > > > Don't be a phishing victim - VCU and other reputable organizations
> > > > > will never use email to request that you reply with your password,
> > > > > social security number or confidential personal information. For
> > > > > more details visit http://phishing.vcu.edu/
> > > >
> > > > --
> > > > -- Skylar Thompson (skyl...@u.washington.edu)
> > > > -- Genome Sciences Department, System Administrator
> > > > -- Foege Building S046, (206)-685-7354
> > > > -- University of Washington School of Medicine

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine