Re: lightly loaded system eats swap space

2018-07-08 Thread tech-lists

On 25/06/2018 18:28, tech-lists wrote:

On 20/06/2018 06:08, Shane Ambler wrote:

This review is aiming to fix this -
https://reviews.freebsd.org/D7538

I have been running the patch on stable/11 and after eight days uptime I
still have zero swap in use. I can't recall a time in the last few years
when I had no swap usage past the first hour or two of uptime.


will this work on a recent 12-current?


Hi,

Just to let the thread know - upgrading to 12-current (in this case 
r336037 but I guess it's not specific, just a recent version) fixed the 
problem.


thanks,
--
J.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Unable to boot memstick on APU2

2018-07-08 Thread Stefan Bethke
I'm stumped by a weird error: loader loads the kernel, and the kernel probes 
the USB stick successfully, but da0 never shows up. I’ve tried with 
FreeBSD-11.1-RELEASE-amd64-memstick.img and 
FreeBSD-11.2-RELEASE-amd64-memstick.img.

While at the mount root prompt, unplugging and replugging the USB stick and 
entering . repeatedly will show the kernel messages, but "da0 at umass0" never 
shows up.

I’ve added a couple entries to /boot/loader.conf:
# cat /mnt/boot/loader.conf
vfs.mountroot.timeout="10"
beastie_disable="YES"
comconsole_speed="115200"
console="comconsole"
autoboot_delay="1"


Here’s the console output from 11.1:
Consoles: internal video/keyboard   ce+0x67
BIOS drive C: is disk0 t vpanic+0x177
BIOS drive D: is disk1 t panic+0x43
BIOS 638kB/3668660kB available memory +0x1d95
 4 0x80a93b68 at start_init+0x48
FreeBSD/x86 bootstrap loader, Revision 1.1 
(Fri Jul 21 02:03:08 UTC 2017 r...@releng2.nyi.freebsd.org) 
Loading /boot/defaults/loader.conf 
//boot/kernel/kernel text=0x14972f8 data=0x1384c0+0x4c15e8 
syms=[0x8+0x15e8b0+0x8+0x178422]ild 20170228
/080 MB ECC DRAM
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...   
Copyright (c) 1992-2017 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 11.1-RELEASE #0 r321309: Fri Jul 21 02:08:28 UTC 2017
r...@releng2.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
FreeBSD clang version 4.0.0 (tags/RELEASE_400/final 297347) (based on LLVM 
4.0.0)
VT(vga): resolution 640x480
CPU: AMD GX-412TC SOC(998.15-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x730f01  Family=0x16  Model=0x30  Stepping=1
  
Features=0x178bfbff
  
Features2=0x3ed8220b
  AMD Features=0x2e500800
  AMD 
Features2=0x1d4037ff
  Structured Extended Features=0x8
  XSAVE Features=0x1
  SVM: NP,NRIP,AFlush,DAssist,NAsids=8
  TSC: P-state invariant, performance statistics
real memory  = 4815060992 (4592 MB)
avail memory = 4087992320 (3898 MB)
Event timer "LAPIC" quality 100
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
ioapic1: Changing APIC ID to 5
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-55 on motherboard
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #3 Launched!
Timecounter "TSC" frequency 998148849 Hz quality 1000
random: entropy device external interface
kbd0 at kbdmux0
netmap: loaded module
module_register_init: MOD_LOAD (vesa, 0x80f5b220, 0) error 19
nexus0
vtvga0:  on motherboard
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
atrtc0:  port 0x70-0x71 irq 8 on acpi0
Event timer "RTC" frequency 32768 Hz quality 0
attimer0:  port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x818-0x81b on acpi0
hpet0:  iomem 0xfed0-0xfed003ff on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
acpi_button0:  on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
pcib1:  at device 2.2 on pci0
pcib1: failed to allocate initial I/O port window: 0x1000-0x1fff
pci1:  on pcib1
igb0:  mem 
0xfe60-0xfe61,0xfe62-0xfe623fff at device 0.0 on pci1
igb0: Using MSIX interrupts with 5 vectors
igb0: Ethernet address: 00:0d:b9:4b:e2:cc
igb0: Bound queue 0 to cpu 0
igb0: Bound queue 1 to cpu 1
igb0: Bound queue 2 to cpu 2
igb0: Bound queue 3 to cpu 3
igb0: netmap queues/slots: TX 4/1024, RX 4/1024
pcib2:  at device 2.3 on pci0
pci2:  on pcib2
igb1:  port 
0x2000-0x201f mem 0xfe70-0xfe71,0xfe72-0xfe723fff at device 0.0 on 
pci2
igb1: Using MSIX interrupts with 5 vectors
igb1: Ethernet address: 00:0d:b9:4b:e2:cd
igb1: Bound queue 0 to cpu 0
igb1: Bound queue 1 to cpu 1
igb1: Bound queue 2 to cpu 2
igb1: Bound queue 3 to cpu 3
igb1: netmap queues/slots: TX 4/1024, RX 4/1024
pcib3:  at device 2.4 on pci0
pci3:  on pcib3
igb2:  port 
0x3000-0x301f mem 0xfe80-0xfe81,0xfe82-0xfe823fff at device 0.0 on 
pci3
igb2: Using MSIX interrupts with 5 vectors
igb2: Ethernet address: 00:0d:b9:4b:e2:ce
igb2: Bound queue 0 to cpu 0
igb2: Bound queue 1 to cpu 1
igb2: Bound queue 2 to cpu 2
igb2: Bound queue 3 to cpu 3
igb2: netmap queues/slots: TX 4/1024, RX 4/1024
pci0:  at device 8.0 (no driver attached)
xhci0:  mem 0xfeb22000-0xfeb23fff at device 16.0 on 
pci0
xhci0: 32 bytes context size, 64-bit DMA
xhci0: Unable to map MSI-X table 
usbus0 on xhci0
usbus0: 5.0Gbps Super Speed USB v3.0
ahci0:  port 
0x4010-0x4017,0x4020-0x4023,0x4018-0x401f,0x4024-0x4027,0x4000-0x400f mem 
0xfeb25000-0xfeb253ff at device 17.0 on pci0
ahci0:

Re: Unable to boot memstick on APU2

2018-07-08 Thread Christoph Moench-Tegeder
## Stefan Bethke (s...@lassitu.de):

> I'm stumped by a weird error: loader loads the kernel, and the kernel
> probes the USB stick successfully, but da0 never shows up. I’ve tried
> with FreeBSD-11.1-RELEASE-amd64-memstick.img and
> FreeBSD-11.2-RELEASE-amd64-memstick.img.

I've had this happen with a rather old USB stick (back from the days
when 8GB was considered "huge", these days it's my boot stick...)
Letting the machine with the stick inserted sit a little on the
boot prompt did the trick for me. A newer stick "just worked fine",
so I wrote that off as "aging flash".

Regards,
Christoph

-- 
Spare Space.


Re: Unable to boot memstick on APU2

2018-07-08 Thread Ruben

Hi Stefan,


On 07/08/2018 06:50 PM, Christoph Moench-Tegeder wrote:

## Stefan Bethke (s...@lassitu.de):


I'm stumped by a weird error: loader loads the kernel, and the kernel
probes the USB stick successfully, but da0 never shows up. I’ve tried
with FreeBSD-11.1-RELEASE-amd64-memstick.img and
FreeBSD-11.2-RELEASE-amd64-memstick.img.




I remember having similar issues when I installed 11.0 on one of my APUs.
My problems were solved by using the "lowest" USB port (the one
physically closest to the underside of the device).


I thought it had something to do with the PC Engines firmware at the
time (and that was some time ago), but I'm not sure anymore and did
not find the time to follow up.


Perhaps worth giving it a try.


Regards,

Ruben


Problem reports for sta...@freebsd.org that need special attention

2018-07-08 Thread bugzilla-noreply
To view an individual PR, use:
  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=(Bug Id).

The following is a listing of current problems submitted by FreeBSD users,
which need special attention. These represent problem reports covering
all versions including experimental development code and obsolete releases.

Status      |    Bug Id | Description
------------+-----------+---------------------------------------------------
Open        |    227213 | FreeBSD 10.4 kernel deadlocks on sysctlmemlock

1 problems total for which you should take action.


RE: NFS 4.1 RECLAIM_COMPLETE FS failed error

2018-07-08 Thread Daniel Engel
Hi, 

I am setting up an environment with FreeBSD 11.1 sharing a ZFS datastore to 
vmware ESXI 6.7.  There were a number of errors with NFS 4.1 sharing that I 
didn't understand until I found the following thread.  



I traced the commits that Rick has made since that thread and merged them 
'head' into 'stable': 

'svnlite checkout http://svn.freebsd.org/base/release/11.1.0/'
'svnlite merge -c 332790 http://svn.freebsd.org/base/head'
'svnlite merge -c 333508 http://svn.freebsd.org/base/head'
'svnlite merge -c 333579 http://svn.freebsd.org/base/head'
'svnlite merge -c 333580 http://svn.freebsd.org/base/head'
'svnlite merge -c 333592 http://svn.freebsd.org/base/head'
'svnlite merge -c 333645 http://svn.freebsd.org/base/head'
'svnlite merge -c 333766 http://svn.freebsd.org/base/head'
'svnlite merge -c 334396 http://svn.freebsd.org/base/head'
'svnlite merge -c 334492 http://svn.freebsd.org/base/head'
'svnlite merge -c 327674 http://svn.freebsd.org/base/head'

That completely fixed the connection instability, but the NFS share was still 
mounting read-only with a RECLAIM_COMPLETE error.  So, I manually applied the 
first patch from the previous thread and everything started working:

--- fs/nfsserver/nfs_nfsdserv.c.savrecl 2018-02-10 20:34:31.166445000 -0500
+++ fs/nfsserver/nfs_nfsdserv.c 2018-02-10 20:36:07.94749 -0500
@@ -4226,10 +4226,9 @@ nfsrvd_reclaimcomplete(struct nfsrv_desc
                goto nfsmout;
        }
        NFSM_DISSECT(tl, uint32_t *, NFSX_UNSIGNED);
+       nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
        if (*tl == newnfs_true)
-               nd->nd_repstat = NFSERR_NOTSUPP;
-       else
-               nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
+               nd->nd_repstat = 0;

The question is: Did I miss something?  Is there an alternate change already in 
SVN that does the same thing better, or is there some corner case preventing 
this patch from being finalized that I just haven't run into yet?  

Thanks,
Daniel Engel


Security patch SA-18:03 removed from 11.2 - why?

2018-07-08 Thread Peter
Release/update 11.1-p8 introduced so-called "mitigation for speculative 
execution vulnerabilities".


In Release 11.2 these "mitigations" have been removed. What is the reason
for the removal, and specifically why is Security Advisory 18:03 still
mentioned in the release notes?


Behaviour with 11.1-p8:

# sysctl hw.ibrs_disable
hw.ibrs_disable: 0
# sysctl hw.ibrs_active
hw.ibrs_active: 1

Behaviour with 11.2 w/ same CPU + microcode:

# sysctl hw.ibrs_disable
hw.ibrs_disable: 0
# sysctl hw.ibrs_active
hw.ibrs_active: 0


Re: NFS 4.1 RECLAIM_COMPLETE FS failed error

2018-07-08 Thread Rick Macklem
Daniel Engel wrote:
>I am setting up an environment with FreeBSD 11.1 sharing a ZFS datastore to
>vmware ESXI 6.7.  There were a number of errors with NFS 4.1 sharing that I
>didn't understand until I found the following thread.
>
>
>
>I traced the commits that Rick has made since that thread and merged them
>'head' into 'stable':
>
>'svnlite checkout http://svn.freebsd.org/base/release/11.1.0/'
>'svnlite merge -c 332790 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333508 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333579 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333580 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333592 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333645 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333766 http://svn.freebsd.org/base/head'
>'svnlite merge -c 334396 http://svn.freebsd.org/base/head'
>'svnlite merge -c 334492 http://svn.freebsd.org/base/head'
>'svnlite merge -c 327674 http://svn.freebsd.org/base/head'
>
>That completely fixed the connection instability, but the NFS share was still
>mounting read-only with a RECLAIM_COMPLETE error.  So, I manually applied the
>first patch from the previous thread and everything started working:
>
>--- fs/nfsserver/nfs_nfsdserv.c.savrecl 2018-02-10 20:34:31.166445000 -0500
>+++ fs/nfsserver/nfs_nfsdserv.c 2018-02-10 20:36:07.94749 -0500
>@@ -4226,10 +4226,9 @@ nfsrvd_reclaimcomplete(struct nfsrv_desc
>                goto nfsmout;
>        }
>        NFSM_DISSECT(tl, uint32_t *, NFSX_UNSIGNED);
>+       nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
>        if (*tl == newnfs_true)
>-               nd->nd_repstat = NFSERR_NOTSUPP;
>-       else
>-               nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
>+               nd->nd_repstat = 0;
>
>The question is: Did I miss something?  Is there an alternate change already
>in SVN that does the same thing better, or is there some corner case
>preventing this patch from being finalized that I just haven't run into yet?
Andreas Nagy has been doing quite a bit of testing for me w.r.t. the ESXi 6.5
client, but several serious issues (which appear to me to be violations of the
RFC) have not yet been resolved.

This email summarizes them:
http://docs.FreeBSD.org/cgi/mid.cgi?YTOPR0101MB0953E687D013E2E97873061ADD720

He recently reported that 6.7 worked better, but he has not yet sent me any
packet traces, so I don't know which issues still exist for 6.7.
I have committed a few things that didn't break the RFC, such as adding
BindConnectiontoSession, but I haven't committed anything else yet,
due to concerns w.r.t. violating the RFC. (The above email thread discusses 
that.)

I do plan on doing something once I get packet traces from Andreas, but be
forewarned that VMware states "FreeBSD is not a supported server" and that
is certainly true. Andreas uses connection trunking. You might be ok with a
single TCP connection unless the server reboots.
(He runs a bunch of patches I gave him, some of which definitely violate
 the RFC.)

All I can suggest is that you keep an eye on freebsd-current@ for any email
about commits to handle the ESXi client better.

So, this is very much a work in progress, rick


Re: NFS 4.1 RECLAIM_COMPLETE FS failed error

2018-07-08 Thread Rick Macklem
Daniel Engel wrote:
[stuff snipped]
>I traced the commits that Rick has made since that thread and merged them
>'head' into 'stable':
>
>'svnlite checkout http://svn.freebsd.org/base/release/11.1.0/'
>'svnlite merge -c 332790 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333508 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333579 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333580 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333592 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333645 http://svn.freebsd.org/base/head'
>'svnlite merge -c 333766 http://svn.freebsd.org/base/head'
>'svnlite merge -c 334396 http://svn.freebsd.org/base/head'
>'svnlite merge -c 334492 http://svn.freebsd.org/base/head'
>'svnlite merge -c 327674 http://svn.freebsd.org/base/head'
Yes, you have all the commits to head related to the 4.1 server that might
affect the ESXi client, plus a bunch that should be harmless, but I don't
think affect the ESXi client mounts. (Most of these will get MFC'd to
stable/11, but I haven't gotten around to it yet.)

The ones that might be in 6.7 (they were in 6.5) that may bite you are:
- The client does an OpenDownGrade with all OPEN_SHARE_ACCESS and
   OPEN_SHARE_DENY bits set for something it calls a "drive lock".
  (Adding bits is supposed to be done via an Open/ClaimNull and not
   OpenDowngrade.) I'd really like to know if this still happens for 6.7?
- Something about "directory modified too often" when doing deletion of a bunch
  of files. (I have no idea what this one means, but apparently it was seen for
  other NFSv4.1 servers.)
- Some warnings about "wrong reason for not issuing a delegation". I have a fix
  for this one in PR#226650, but they are just warnings and don't seem to
  matter much.
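
The OpenDowngrade item in the list above can be illustrated with a small C
sketch. This is not the FreeBSD server code: the function name downgrade_ok
and the SHARE_* constants are made up for illustration, with bit values chosen
to mirror the NFSv4 share access/deny encoding (read = 1, write = 2). The only
point it makes is that a downgrade may drop bits the open already holds, and
may not add any.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative share bits; the names are hypothetical, not the kernel's. */
#define SHARE_ACCESS_READ   0x1u
#define SHARE_ACCESS_WRITE  0x2u
#define SHARE_DENY_READ     0x1u
#define SHARE_DENY_WRITE    0x2u

/*
 * Return true if the requested access/deny bits are a subset of the
 * bits currently held by the open, i.e. the request only removes bits.
 */
bool
downgrade_ok(uint32_t cur_access, uint32_t cur_deny,
    uint32_t req_access, uint32_t req_deny)
{
	/* Any bit set in the request but not held now would be an upgrade. */
	if ((req_access & ~cur_access) != 0)
		return (false);
	if ((req_deny & ~cur_deny) != 0)
		return (false);
	return (true);
}
```

Under these assumptions, an open holding only read access that asks for every
access and deny bit (the "drive lock" pattern) fails the check, while a request
that keeps read access alone passes.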

The rest of the really nasty stuff happens after a server reboot. The recovery
code seemed to be badly broken in the 6.5 client. (All sorts of fun stuff like
the client looping doing ExchangeID operations forever. VM crashes...)

>That completely fixed the connection instability, but the NFS share was still
>mounting read-only with a RECLAIM_COMPLETE error.  So, I manually applied the
>first patch from the previous thread and everything started working:
>
>--- fs/nfsserver/nfs_nfsdserv.c.savrecl 2018-02-10 20:34:31.166445000 -0500
>+++ fs/nfsserver/nfs_nfsdserv.c 2018-02-10 20:36:07.94749 -0500
>@@ -4226,10 +4226,9 @@ nfsrvd_reclaimcomplete(struct nfsrv_desc
>                goto nfsmout;
>        }
>        NFSM_DISSECT(tl, uint32_t *, NFSX_UNSIGNED);
>+       nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
>        if (*tl == newnfs_true)
>-               nd->nd_repstat = NFSERR_NOTSUPP;
>-       else
>-               nd->nd_repstat = nfsrv_checkreclaimcomplete(nd);
>+               nd->nd_repstat = 0;
I think this patch is ok to use, since no other extant client does a
ReclaimComplete with "one_fs == true". It does kinda violate the RFC.
The problem is that FreeBSD exports a hierarchy of file systems and telling the
server that one of them has been reclaimed is useless. (This hack just assumes
the client meant to say "one_fs == false".)
There was also a case (I think it was after a server reboot) where the client
would do one of these after doing a ReclaimComplete with "one_fs == false" and
that is definitely bogus (the server would reply NFS4ERR_ALREADY_COMPLETE
without the above hack) since the "one_fs == false" operation means all file
systems have been reclaimed.
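
To make the effect of the hack concrete, here is a self-contained C model of
the two behaviours. It is a sketch under assumptions, not the kernel code: the
status names, struct client and check_reclaim_complete() are stand-ins
invented for this example, and the stand-in only assumes that the real
nfsrv_checkreclaimcomplete() records completion and rejects a repeat.

```c
#include <stdbool.h>

/* Hypothetical status codes, used by this sketch only. */
enum status { ST_OK = 0, ST_NOTSUPP, ST_COMPLETE_ALREADY };

/* Minimal stand-in for per-client state. */
struct client {
	bool reclaim_done;	/* set once a ReclaimComplete was accepted */
};

/* Assumed stand-in for nfsrv_checkreclaimcomplete(): mark done, reject repeats. */
static enum status
check_reclaim_complete(struct client *clp)
{
	if (clp->reclaim_done)
		return (ST_COMPLETE_ALREADY);
	clp->reclaim_done = true;
	return (ST_OK);
}

/* Unpatched behaviour: one_fs == true is refused outright. */
enum status
reclaim_complete_orig(struct client *clp, bool one_fs)
{
	if (one_fs)
		return (ST_NOTSUPP);
	return (check_reclaim_complete(clp));
}

/* Patched behaviour: always record completion; one_fs == true never fails. */
enum status
reclaim_complete_patched(struct client *clp, bool one_fs)
{
	enum status st = check_reclaim_complete(clp);

	if (one_fs)
		st = ST_OK;
	return (st);
}
```

In this model a client that sends "one_fs == false" followed by
"one_fs == true" gets success both times from the patched path, whereas the
unpatched path refuses the second call and a repeated "one_fs == false" still
gets the "already complete" status.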

Anyhow, once I get some packet traces from Andreas for 6.7, I'll try and figure
out how to handle at least some of the outstanding issues.

Good luck with it, rick



Re: NFS 4.1 RECLAIM_COMPLETE FS failed error

2018-07-08 Thread Daniel Engel
Rick, 

Thanks for the comments.  I'm running a small "home lab" environment, so the 
ESXi client is the only one I'm concerned with right now.  I'll keep using the 
ReclaimComplete patch as is.  Definitely had problems with the NFS server 
rebooting before I applied the other commits, but that all seems to work fine 
now.  

If it helps, I'm not seeing any "OpenDownGrade" calls in a quick experiment
mounting and browsing a test share (attached).

Thanks again,
Daniel


On Sun, Jul 8, 2018, at 7:10 PM, Rick Macklem wrote:
> Daniel Engel wrote:
> [stuff snipped]


erebor-install-20180708-nfsd.pcap
Description: Binary data


RE: NFS 4.1 RECLAIM_COMPLETE FS failed error

2018-07-08 Thread NAGY Andreas
Hi! Sorry, I have not forgotten the traces, but I have had no time so far, and
as I am currently setting up several servers on the system I don't want to
break anything by performing tests. I will send them as soon as I have finished
my current work. That will be the end of this week at the earliest.

As I am currently setting up/cloning 80 VMs that are stored on the NFS
datastore, I can report that the setup performs well and seems to be stable.
The only thing that happened twice while working with ZFS snapshots/clones was
that the ESXi host lost the connection to the NFS datastore. I don't know
whether it was while creating or deleting a clone, but the only way to recover
was to restart nfsd or to switch over HAST/CARP, all without crashing any VM.

Br,
Andi



-----Original Message-----
From: owner-freebsd-sta...@freebsd.org
[mailto:owner-freebsd-sta...@freebsd.org] On Behalf Of Rick Macklem
Sent: Monday, 9 July 2018 04:11
To: Daniel Engel ; freebsd-stable@freebsd.org
Subject: Re: NFS 4.1 RECLAIM_COMPLETE FS failed error

Daniel Engel wrote:
[stuff snipped]
