Re: NFSv4 performance degradation with 12.0-CURRENT client

2016-11-24 Thread Konstantin Belousov
On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> directories over both NFSv3 and NFSv4.  I have a TrueOS client (based
> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> mounting the home directories over NFSv4.  At first, everything is
> fine and performance is good.  But if the client does a buildworld
> using sources on NFS and locally stored objects, performance slowly
> degrades.  The degradation is most noticeable with metadata-heavy
> operations.  For example, "ls -l" in a directory with 153 files takes
> less than 0.1 seconds right after booting.  But the longer the
> buildworld goes on, the slower it gets.  Eventually that same "ls -l"
> takes 19 seconds.  When the home directories are mounted over NFSv3
> instead, I see no degradation.
> 
> top shows negligible CPU consumption on the server, and very high
> consumption on the client when using NFSv4 (nearly 100%).  The
> NFS-using process is spending almost all of its time in system mode,
> and dtrace shows that almost all of its time is spent in
> ncl_getpages().
> 
> I have delegations disabled on the server.  On the client, the home
> directories are nullfs mounted to two additional locations, and the
> buildworld was actually using one of those nullfs mounts, not the NFS
> mount directly.
> 
> Any ideas?

Try stock FreeBSD first.

If reproducible on the stock HEAD, can you point to the lines of
ncl_getpages() where the time is spent?  Does reading of the problematic
files, as opposed to mmapping them, also cause the behaviour?  E.g. try dd.
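
Something like this (just a sketch; the path is a placeholder for one of the
problematic files) exercises plain read(2) instead of the mmap/getpages path:

dd if=/usr/home/someuser/somefile.c of=/dev/null bs=128k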

There are really no time-unbounded loops in ncl_getpages() itself.
I could understand the situation if e.g. the time were spent in getpbuf() or
ncl_readrpc(), but not in ncl_getpages() directly.
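
A kernel stack profile should show where it actually sits, e.g. (a sketch;
adjust the execname predicate to whichever process is spinning):

dtrace -x stackframes=100 -n 'profile-997 /execname == "ls"/ { @[stack()] = count(); }'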

Also, as an experiment, you could see if HEAD after r308980 demonstrates
any difference.



Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Vsevolod Stakhov
On 23/11/2016 16:27, Ed Schouten wrote:
> Hi Hans,
> 
> 2016-11-23 15:27 GMT+01:00 Hans Petter Selasky :
>> I've made a patch to hopefully optimise SAT solving in our pkg utility.
> 
> Nice! Do you by any chance have any numbers that show the performance
> improvements made by this change? Assuming that the SAT solver of
> pkg(1) uses an algorithm similar to DPLL[1], a change like this would
> affect performance linearly. My guess is therefore that the running
> time is reduced by approximately 5/12. Is this correct?

There won't be any improvement if you just remove duplicates from the SAT
formula. This situation is handled by picosat internally, and even for
naive DPLL there is no significant influence from duplicate CNF clauses:
once you've assigned all the vars in some clause, you automatically resolve
all of its duplicates.
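
For illustration, at the DIMACS level (a sketch with a made-up formula, assuming
the stock picosat binary):

printf 'p cnf 2 3\n1 2 0\n1 2 0\n-1 0\n' > dup.cnf    # (x1 v x2) twice, plus (-x1)
printf 'p cnf 2 2\n1 2 0\n-1 0\n' > nodup.cnf         # same formula, deduplicated
picosat dup.cnf
picosat nodup.cnf
# both report "s SATISFIABLE" with the same assignment; the duplicate clause is
# satisfied by exactly the same decisions as the original one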

Is there any real improvement in SAT solver speed with this patch? In
my experience, SAT solving is negligible in terms of CPU time compared
to the other tasks performed by pkg.

> By the way, why attach a zip file with a diff? GitHub's pull requests
> are awesome! :-)
> 
> [1] Davis-Putnam-Logemann-Loveland algorithm:
> https://en.wikipedia.org/wiki/DPLL_algorithm
> 


-- 
Vsevolod Stakhov


Re: NFSv4 performance degradation with 12.0-CURRENT client

2016-11-24 Thread Rick Macklem

On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> directories over both NFSv3 and NFSv4.  I have a TrueOS client (based
> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> mounting the home directories over NFSv4.  At first, everything is
> fine and performance is good.  But if the client does a buildworld
> using sources on NFS and locally stored objects, performance slowly
> degrades.  The degradation is most noticeable with metadata-heavy
> operations.  For example, "ls -l" in a directory with 153 files takes
> less than 0.1 seconds right after booting.  But the longer the
> buildworld goes on, the slower it gets.  Eventually that same "ls -l"
> takes 19 seconds.  When the home directories are mounted over NFSv3
> instead, I see no degradation.
>
> top shows negligible CPU consumption on the server, and very high
> consumption on the client when using NFSv4 (nearly 100%).  The
> NFS-using process is spending almost all of its time in system mode,
> and dtrace shows that almost all of its time is spent in
> ncl_getpages().
>
A couple of things you could do when it is slow (as well as what Kostik suggested):
- nfsstat -c -e on the client and nfsstat -e -s on the server, to see what RPCs
  are being done and how quickly. (nfsstat -s -e will also show you how big the
  DRC is, although a large DRC should show up as increased CPU consumption on
  the server.)
- capture packets with tcpdump -s 0 -w test.pcap host 
  - then you can email me test.pcap as an attachment. I can look at it in
    wireshark and see if there seem to be protocol and/or TCP issues. (You can
    look at it in wireshark yourself; look for NFS4ERR_xxx, TCP segment
    retransmits, ...)
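
For example, roughly (just a sketch; <server>, the user and the directory are
placeholders):

# on the client, while things are slow:
nfsstat -e -c > client-nfsstat.txt
tcpdump -s 0 -w test.pcap host <server> &
ls -l /usr/home/someuser/somedir    # reproduce the slow case
kill %1                             # stop the capture
# and on the server:
nfsstat -e -s > server-nfsstat.txt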

If you are using either "intr" or "soft" on the mounts, try without those mount
options. (The Bugs section of mount_nfs recommends against using them. If an RPC
fails due to these options, something called a seqid# can be "out of sync" between
client/server and that causes serious problems.)
--> These seqid#s are not used by NFSv4.1, so you could try that by adding
    "minorversion=1" to your mount options.

Good luck with it, rick


Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Hans Petter Selasky

On 11/24/16 13:13, Vsevolod Stakhov wrote:
> On 23/11/2016 16:27, Ed Schouten wrote:
>> Hi Hans,
>>
>> 2016-11-23 15:27 GMT+01:00 Hans Petter Selasky :
>>> I've made a patch to hopefully optimise SAT solving in our pkg utility.
>>
>> Nice! Do you by any chance have any numbers that show the performance
>> improvements made by this change? Assuming that the SAT solver of
>> pkg(1) uses an algorithm similar to DPLL[1], a change like this would
>> affect performance linearly. My guess is therefore that the running
>> time is reduced by approximately 5/12. Is this correct?
>
> There won't be any improvement if you just remove duplicates from SAT
> formula. This situation is handled by picosat internally and even for
> naive DPLL there is no significant influence of duplicate KNF clauses:
> once you've assumed all vars in some clause, you automatically resolve
> all duplicates.
>
> Is there any real improvement of SAT solver speed with this patch? From
> my experiences, SAT solving is negligible in terms of CPU time comparing
> to other tasks performed by pkg.


Hi,

I added some code to measure the time for SAT solving. During my test
run I'm seeing values in the range of 8-10 ms for both versions, so I
consider that negligible. However, the unpatched version wants to
reinstall 185 packages while the non-patched version wants to reinstall
1 package. That has a huge influence on the total time. I'm not familiar
enough with pkg to draw any conclusions from this.


# Test1:
echo "n" | /xxx/pkg/src/pkg-static upgrade --no-repo-update > b.txt

# Test2:
echo "n" | env PKG_NO_SORT=YES /xxx/pkg/src/pkg-static upgrade 
--no-repo-update > a.txt


Please find the material attached including a debug version patch you 
can play with.


--HPS
Checking for upgrades (748 candidates): .. done
Processing candidates (748 candidates): . done
Skipped 3702 identical rules
Reiterate
SAT solving took 0s and 7370 usecs
The following 52 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
xapian-core: 1.2.23,1 -> 1.2.24,1
webp: 0.5.0 -> 0.5.1_1
webkit2-gtk3: 2.8.5_6 -> 2.8.5_7
webkit-gtk2: 2.4.11_4 -> 2.4.11_5
vlc: 2.2.4_3,4 -> 2.2.4_4,4
trousers: 0.3.13_1 -> 0.3.14
tiff: 4.0.6_2 -> 4.0.7
thunderbird: 45.4.0_2 -> 45.5.0_1
sqlite3: 3.15.1 -> 3.15.1_1
spidermonkey170: 17.0.0_2 -> 17.0.0_3
soundtouch: 1.9.2 -> 1.9.2_1
raptor2: 2.0.15_4 -> 2.0.15_5
qt5-core: 5.6.2 -> 5.6.2_1
qt4-corelib: 4.8.7_5 -> 4.8.7_6
polkit: 0.113_1 -> 0.113_2
pciids: 20161029 -> 20161119
openimageio: 1.6.12_2 -> 1.6.12_3
openblas: 0.2.18_1,1 -> 0.2.18_2,1
openal-soft: 1.17.2 -> 1.17.2_1
libx264: 0.148.2708 -> 0.148.2708_1
libvpx: 1.6.0 -> 1.6.0_1
libvisio01: 0.1.5_3 -> 0.1.5_4
libreoffice: 5.2.3_2 -> 5.2.3_3
libpci: 3.5.1 -> 3.5.2
libmspub01: 0.1.2_4 -> 0.1.2_5
libfreehand: 0.1.1_3 -> 0.1.1_4
libe-book: 0.1.2_5 -> 0.1.2_6
libcdr01: 0.1.3_1 -> 0.1.3_2
lcms2: 2.7_2 -> 2.8
inkscape: 0.91_8 -> 0.91_9
icu: 57.1,1 -> 58.1,1
harfbuzz: 1.3.3 -> 1.3.3_1
gstreamer1-plugins: 1.8.0 -> 1.8.0_1
gstreamer-plugins: 0.10.36_6,3 -> 0.10.36_7,3
gstreamer: 0.10.36_4 -> 0.10.36_5
gnupg: 2.1.15 -> 2.1.16
glib: 2.46.2_3 -> 2.46.2_4
gcc: 4.8.5_2 -> 4.9.4
firefox: 50.0_2,1 -> 50.0_4,1
firebird25-client: 2.5.6_1 -> 2.5.6_2
ffmpeg: 2.8.8_5,1 -> 2.8.8_8,1
dejavu: 2.35 -> 2.37
chromium: 52.0.2743.116_2 -> 52.0.2743.116_4
boost-libs: 1.55.0_13 -> 1.55.0_14
blender: 2.77a -> 2.77a_1
belle-sip: 1.5.0 -> 1.5.0_1
bash: 4.4 -> 4.4.5
argyllcms: 1.7.0_1 -> 1.7.0_2
OpenEXR: 2.2.0_5 -> 2.2.0_6
ImageMagick: 6.9.5.10,1 -> 6.9.5.10_1,1
GraphicsMagick: 1.3.24,1 -> 1.3.24_1,1

Installed packages to be REINSTALLED:
baresip-0.4.19 (options changed)

Number of packages to be upgraded: 51
Number of packages to be reinstalled: 1

The process will require 19 MiB more space.
446 MiB to be downloaded.

Proceed with this action? [y/N]: Checking for upgrades (748 candidates): .. done
Processing candidates (748 candidates): . done
Reiterate
SAT solving took 0s and 5790 usecs
The following 236 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
xapian-core: 1.2.23,1 -> 1.2.24,1
webp: 0.5.0 -> 0.5.1_1
webkit2-gtk3: 2.8.5_6 -> 2.8.5_7
webkit-gtk2: 2.4.11_4 -> 2.4.11_5
vlc: 2.2.4_3,4 -> 2.2.4_4,4
trousers: 0.3.13_1 -> 0.3.14
tiff: 4.0.6_2 -> 4.0.7
thunderbird: 45.4.0_2 -> 45.5.0_1
sqlite3: 3.15.1 -> 3.15.1_1
spidermonkey170: 17.0.0_2 -> 17.0.0_3
soundtouch: 1.9.2 -> 1.9.2_1
raptor2: 2.0.15_4 -> 2.0.15_5
qt5-core: 5.6.2 -> 5.6.2_1
qt4-corelib: 4.8.7_5 -> 4.8.7_6
polkit: 0.113_1 

Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Vsevolod Stakhov
On 24/11/2016 13:05, Hans Petter Selasky wrote:
> On 11/24/16 13:13, Vsevolod Stakhov wrote:
>> On 23/11/2016 16:27, Ed Schouten wrote:
>>> Hi Hans,
>>>
>>> 2016-11-23 15:27 GMT+01:00 Hans Petter Selasky :
>>>> I've made a patch to hopefully optimise SAT solving in our pkg utility.
>>>
>>> Nice! Do you by any chance have any numbers that show the performance
>>> improvements made by this change? Assuming that the SAT solver of
>>> pkg(1) uses an algorithm similar to DPLL[1], a change like this would
>>> affect performance linearly. My guess is therefore that the running
>>> time is reduced by approximately 5/12. Is this correct?
>>
>> There won't be any improvement if you just remove duplicates from SAT
>> formula. This situation is handled by picosat internally and even for
>> naive DPLL there is no significant influence of duplicate KNF clauses:
>> once you've assumed all vars in some clause, you automatically resolve
>> all duplicates.
>>
>> Is there any real improvement of SAT solver speed with this patch? From
>> my experiences, SAT solving is negligible in terms of CPU time comparing
>> to other tasks performed by pkg.
> 
> Hi,
> 
> I added some code to measure the time for SAT solving. During my test
> run I'm seeing values in the range 8-10ms for both versions, so I
> consider that neglible. However, the unpatched version wants to
> reinstall 185 packages while the non-patched version wants to reinstall
> 1 package. That has a huge time influential. I'm not that familar with
> PKG that I can draw any conclusions from this.
> 
> # Test1:
> echo "n" | /xxx/pkg/src/pkg-static upgrade --no-repo-update > b.txt
> 
> # Test2:
> echo "n" | env PKG_NO_SORT=YES /xxx/pkg/src/pkg-static upgrade
> --no-repo-update > a.txt
> 

Then I don't understand how your patch could affect the solving
procedure. If pkg tries to reinstall something without a *reason*, that is a
good sign of a bug in pkg itself and/or in your database/repo, and not in the
SAT solver.

I'll try to look into your issue, but I'll likely need your local package
database for this test.
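
(Assuming the default PKG_DBDIR, that would be roughly:

tar -czf pkg-dbs.tgz /var/db/pkg/local.sqlite /var/db/pkg/repo-*.sqlite

plus your pkg.conf / repos configuration if it is non-standard.)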

-- 
Vsevolod Stakhov


Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Hans Petter Selasky

On 11/24/16 14:05, Hans Petter Selasky wrote:
> the non-patched version wants to reinstall 1 package.

Correction: the patched version wants to reinstall 1 package only.

--HPS


Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Hans Petter Selasky

On 11/24/16 14:11, Vsevolod Stakhov wrote:
> On 24/11/2016 13:05, Hans Petter Selasky wrote:
>> On 11/24/16 13:13, Vsevolod Stakhov wrote:
>>> On 23/11/2016 16:27, Ed Schouten wrote:
>>>> Hi Hans,
>>>>
>>>> 2016-11-23 15:27 GMT+01:00 Hans Petter Selasky :
>>>>> I've made a patch to hopefully optimise SAT solving in our pkg utility.
>>>>
>>>> Nice! Do you by any chance have any numbers that show the performance
>>>> improvements made by this change? Assuming that the SAT solver of
>>>> pkg(1) uses an algorithm similar to DPLL[1], a change like this would
>>>> affect performance linearly. My guess is therefore that the running
>>>> time is reduced by approximately 5/12. Is this correct?
>>>
>>> There won't be any improvement if you just remove duplicates from SAT
>>> formula. This situation is handled by picosat internally and even for
>>> naive DPLL there is no significant influence of duplicate KNF clauses:
>>> once you've assumed all vars in some clause, you automatically resolve
>>> all duplicates.
>>>
>>> Is there any real improvement of SAT solver speed with this patch? From
>>> my experiences, SAT solving is negligible in terms of CPU time comparing
>>> to other tasks performed by pkg.
>>
>> Hi,
>>
>> I added some code to measure the time for SAT solving. During my test
>> run I'm seeing values in the range 8-10ms for both versions, so I
>> consider that neglible. However, the unpatched version wants to
>> reinstall 185 packages while the non-patched version wants to reinstall
>> 1 package. That has a huge time influential. I'm not that familar with
>> PKG that I can draw any conclusions from this.
>>
>> # Test1:
>> echo "n" | /xxx/pkg/src/pkg-static upgrade --no-repo-update > b.txt
>>
>> # Test2:
>> echo "n" | env PKG_NO_SORT=YES /xxx/pkg/src/pkg-static upgrade
>> --no-repo-update > a.txt
>
> Then I don't understand how your patch should affect the solving
> procedure. If pkg tries to reinstall something without *reason* it is a
> good sign of bug in pkg itself and/or your database/repo and not in SAT
> solver.
>
> I'll try to review your issue but I'll likely need your local packages
> database for this test.



Hi,

Maybe it is a bug somewhere.

I noticed, for example, some rules repeating the same variable twice.

Send me the list of files you need off-list.

Thank you!

--HPS


Re: Optimising generated rules for SAT solving (5/12 are duplicates)

2016-11-24 Thread Ed Schouten
2016-11-24 13:13 GMT+01:00 Vsevolod Stakhov :
> On 23/11/2016 16:27, Ed Schouten wrote:
>> Hi Hans,
>>
>> 2016-11-23 15:27 GMT+01:00 Hans Petter Selasky :
>>> I've made a patch to hopefully optimise SAT solving in our pkg utility.
>>
>> Nice! Do you by any chance have any numbers that show the performance
>> improvements made by this change? Assuming that the SAT solver of
>> pkg(1) uses an algorithm similar to DPLL[1], a change like this would
>> affect performance linearly. My guess is therefore that the running
>> time is reduced by approximately 5/12. Is this correct?
>
> There won't be any improvement if you just remove duplicates from SAT
> formula. This situation is handled by picosat internally and even for
> naive DPLL there is no significant influence of duplicate KNF clauses:
> once you've assumed all vars in some clause, you automatically resolve
> all duplicates.

Exactly. This is why I said that it affects performance linearly.
Referring to Wikipedia's pseudo-code for the algorithm: the number of
calls to unit-propagate() and pure-literal-assign() drops linearly,
but the recursion stays the same.

-- 
Ed Schouten 
Nuxi, 's-Hertogenbosch, the Netherlands
KvK-nr.: 62051717


Re: NFSv4 performance degradation with 12.0-CURRENT client

2016-11-24 Thread Alan Somers
On Thu, Nov 24, 2016 at 5:53 AM, Rick Macklem  wrote:
>
> On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
>> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
>> directories over both NFSv3 and NFSv4.  I have a TrueOS client (based
>> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
>> mounting the home directories over NFSv4.  At first, everything is
>> fine and performance is good.  But if the client does a buildworld
>> using sources on NFS and locally stored objects, performance slowly
>> degrades.  The degradation is most noticeable with metadata-heavy
>> operations.  For example, "ls -l" in a directory with 153 files takes
>> less than 0.1 seconds right after booting.  But the longer the
>> buildworld goes on, the slower it gets.  Eventually that same "ls -l"
>> takes 19 seconds.  When the home directories are mounted over NFSv3
>> instead, I see no degradation.
>>
>> top shows negligible CPU consumption on the server, and very high
>> consumption on the client when using NFSv4 (nearly 100%).  The
>> NFS-using process is spending almost all of its time in system mode,
>> and dtrace shows that almost all of its time is spent in
>> ncl_getpages().
>>
> A couple of things you could do when it slow (as well as what Kostik 
> suggested):
> - nfsstat -c -e on client and nfsstat -e -s on server, to see what RPCs are 
> being done
>   and how quickly. (nfsstat -s -e will also show you how big the DRC is, 
> although a
>   large DRC should show up as increased CPU consumption on the server)
> - capture packets with tcpdump -s 0 -w test.pcap host 
>   - then you can email me test.pcap as an attachment. I can look at it in 
> wireshark
> and see if there seem to protocol and/or TCP issues. (You can look at in 
> wireshark
> yourself, the look for NFS4ERR_xxx, TCP segment retransmits...)
>
> If you are using either "intr" or "soft" on the mounts, try without those 
> mount options.
> (The Bugs section of mount_nfs recommends against using them. If an RPC fails 
> due to
>  these options, something called a seqid# can be "out of sync" between 
> client/server and
>  that causes serious problems.)
> --> These seqid#s are not used by NFSv4.1, so you could try that by adding
>   "minorversion=1" to your mount options.
>
> Good luck with it, rick

I've reproduced the issue on stock FreeBSD 12, and I've also learned
that nullfs is a required factor.  Doing the buildworld directly on
the NFS mount doesn't cause any slowdown, but doing a buildworld on
the nullfs copy of the NFS mount does.  The slowdown affects the base
NFS mount as well as the nullfs copy.  Here is the nfsstat output for
both the server and the client during "ls -al" on the client:

nfsstat -e -s -z

Server Info:
  Getattr   SetattrLookup  Readlink  Read WriteCreateRemove
  800 0   121 0 0 2 0 0
   Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlusAccess
0 0 0 0 0 0 0 8
MknodFsstatFsinfo  PathConfCommit   LookupP   SetClId SetClIdCf
0 0 0 0 1 3 0 0
 Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet GetFH  Lock
0 0 0 0 0 0   123 0
LockT LockU CloseVerify   NVerify PutFH  PutPubFH PutRootFH
0 0 0 0 0   674 0 0
Renew RestoreFHSaveFH   Secinfo RelLckOwn  V4Create
0 0 0 0 0 0
Server:
RetfailedFaults   Clients
0 0 0
OpenOwner Opens LockOwner LocksDelegs
0 0 0 0 0
Server Cache Stats:
   Inprog  Idem  Non-idemMisses CacheSize   TCPPeak
0 0 0   674 16738 16738

nfsstat -e -c -z
Client Info:
Rpc Counts:
  Getattr   SetattrLookup  Readlink  Read WriteCreateRemove
   60 0   119 0 0 0 0 0
   Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlusAccess
0 0 0 0 0 0 0 3
MknodFsstatFsinfo  PathConfCommit   SetClId SetClIdCf  Lock
0 0 0 0 0 0 0 0
LockT LockU  Open   OpenCfr
0 0 0 0
OpenOwner Opens LockOwner LocksDelegs  LocalOwn LocalOpen LocalLOwn
 5638141453 0 0 0 0 0 0
LocalLock
0
Rpc Info:
 TimedOut   Invalid X Replies   Retries  Requests
0 0 0 0   662
Cache Info:
Attr HitsMisses Lkup HitsMisses BioR HitsMisses BioW HitsMisses
 127558   83

Re: NFSv4 performance degradation with 12.0-CURRENT client

2016-11-24 Thread Konstantin Belousov
On Thu, Nov 24, 2016 at 11:42:41AM -0700, Alan Somers wrote:
> On Thu, Nov 24, 2016 at 5:53 AM, Rick Macklem  wrote:
> >
> > On Wed, Nov 23, 2016 at 10:17:25PM -0700, Alan Somers wrote:
> >> I have a FreeBSD 10.3-RELEASE-p12 server exporting its home
> >> directories over both NFSv3 and NFSv4.  I have a TrueOS client (based
> >> on 12.0-CURRENT on the drm-next-4.7 branch, built on 28-October)
> >> mounting the home directories over NFSv4.  At first, everything is
> >> fine and performance is good.  But if the client does a buildworld
> >> using sources on NFS and locally stored objects, performance slowly
> >> degrades.  The degradation is most noticeable with metadata-heavy
> >> operations.  For example, "ls -l" in a directory with 153 files takes
> >> less than 0.1 seconds right after booting.  But the longer the
> >> buildworld goes on, the slower it gets.  Eventually that same "ls -l"
> >> takes 19 seconds.  When the home directories are mounted over NFSv3
> >> instead, I see no degradation.
> >>
> >> top shows negligible CPU consumption on the server, and very high
> >> consumption on the client when using NFSv4 (nearly 100%).  The
> >> NFS-using process is spending almost all of its time in system mode,
> >> and dtrace shows that almost all of its time is spent in
> >> ncl_getpages().
> >>
> > A couple of things you could do when it slow (as well as what Kostik 
> > suggested):
> > - nfsstat -c -e on client and nfsstat -e -s on server, to see what RPCs are 
> > being done
> >   and how quickly. (nfsstat -s -e will also show you how big the DRC is, 
> > although a
> >   large DRC should show up as increased CPU consumption on the server)
> > - capture packets with tcpdump -s 0 -w test.pcap host 
> >   - then you can email me test.pcap as an attachment. I can look at it in 
> > wireshark
> > and see if there seem to protocol and/or TCP issues. (You can look at 
> > in wireshark
> > yourself, the look for NFS4ERR_xxx, TCP segment retransmits...)
> >
> > If you are using either "intr" or "soft" on the mounts, try without those 
> > mount options.
> > (The Bugs section of mount_nfs recommends against using them. If an RPC 
> > fails due to
> >  these options, something called a seqid# can be "out of sync" between 
> > client/server and
> >  that causes serious problems.)
> > --> These seqid#s are not used by NFSv4.1, so you could try that by adding
> >   "minorversion=1" to your mount options.
> >
> > Good luck with it, rick
> 
> I've reproduced the issue on stock FreeBSD 12, and I've also learned
> that nullfs is a required factor.  Doing the buildworld directly on
> the NFS mount doesn't cause any slowdown, but doing a buildworld on
> the nullfs copy of the NFS mount does.  The slowdown affects the base
> NFS mount as well as the nullfs copy.  Here is the nfsstat output for
> both server and client duing "ls -al" on the client:
> 
> nfsstat -e -s -z
> 
> Server Info:
>   Getattr   SetattrLookup  Readlink  Read WriteCreate
> Remove
>   800 0   121 0 0 2 0 > 0
>Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlus
> Access
> 0 0 0 0 0 0 0 
> 8
> MknodFsstatFsinfo  PathConfCommit   LookupP   SetClId 
> SetClIdCf
> 0 0 0 0 1 3 0 > 0
>  Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet GetFH  
> Lock
> 0 0 0 0 0 0   123 > 0
> LockT LockU CloseVerify   NVerify PutFH  PutPubFH 
> PutRootFH
> 0 0 0 0 0   674 0 > 0
> Renew RestoreFHSaveFH   Secinfo RelLckOwn  V4Create
> 0 0 0 0 0 0
> Server:
> RetfailedFaults   Clients
> 0 0 0
> OpenOwner Opens LockOwner LocksDelegs
> 0 0 0 0 0
> Server Cache Stats:
>Inprog  Idem  Non-idemMisses CacheSize   TCPPeak
> 0 0 0   674 16738 16738
> 
> nfsstat -e -c -z
> Client Info:
> Rpc Counts:
>   Getattr   SetattrLookup  Readlink  Read WriteCreate
> Remove
>60 0   119 0 0 0 0 > 0
>Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlus
> Access
> 0 0 0 0 0 0 0 
> 3
> MknodFsstatFsinfo  PathConfCommit   SetClId SetClIdCf  
> Lock
> 0 0 0 0 0 0 0 > 0
> LockT LockU  Open   OpenCfr
> 0 0 0 0
> OpenOwner Opens LockOwner LocksDelegs  LocalOwn LocalOpen 
> LocalLOwn
>  5638141453 0 0 

Re: NFSv4 performance degradation with 12.0-CURRENT client

2016-11-24 Thread Rick Macklem
asom...@gmail.com wrote:
[stuff snipped]
>I've reproduced the issue on stock FreeBSD 12, and I've also learned
>that nullfs is a required factor.  Doing the buildworld directly on
>the NFS mount doesn't cause any slowdown, but doing a buildworld on
>the nullfs copy of the NFS mount does.  The slowdown affects the base
>NFS mount as well as the nullfs copy.  Here is the nfsstat output for
>both server and client duing "ls -al" on the client:
>
>nfsstat -e -s -z
If you do this again, avoid using "-z" and I think you'll see the Opens (below
Server:) going up and up...
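For example, something like this on the server (just a sketch) would show them
growing over the course of the buildworld:
while true; do date; nfsstat -e -s | grep -A 1 OpenOwner; sleep 60; done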
>
>Server Info:
>  Getattr   SetattrLookup  Readlink  Read WriteCreateRemove
>  800 0   121 0 0 2 0 0
>   Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlusAccess
>0 0 0 0 0 0 0 8
>MknodFsstatFsinfo  PathConfCommit   LookupP   SetClId SetClIdCf
>   0 0 0 0 1 3 0 0
> Open  OpenAttr OpenDwnGr  OpenCfrm DelePurge   DeleRet GetFH  Lock
>0 0 0 0 0 0   123 0
>LockT LockU CloseVerify   NVerify PutFH  PutPubFH PutRootFH
>0 0 0 0 0   674 0 0
>Renew RestoreFHSaveFH   Secinfo RelLckOwn  V4Create
>0 0 0 0 0 0
>Server:
>RetfailedFaults   Clients
>0 0 0
>OpenOwner Opens LockOwner LocksDelegs
>0 0 0 0 0
Oops, I think this is an nfsstat bug. I don't normally use "-z", so I didn't
notice that it clears these counts. It probably should not, since they are
"how many of these are currently allocated".
I'll check this. (Not relevant to this issue, but it needs fixing. ;-)
>Server Cache Stats:
>   Inprog  Idem  Non-idemMisses CacheSize   TCPPeak
>0 0 0   674 16738 16738
>
>nfsstat -e -c -z
>Client Info:
>Rpc Counts:
> Getattr   SetattrLookup  Readlink  Read WriteCreateRemove
>   60 0   119 0 0 0 0 0
>   Rename  Link   Symlink Mkdir Rmdir   Readdir  RdirPlusAccess
>0 0 0 0 0 0 0 3
>MknodFsstatFsinfo  PathConfCommit   SetClId SetClIdCf  Lock
>0 0 0 0 0 0 0 0
>LockT LockU  Open   OpenCfr
>0 0 0 0
>OpenOwner Opens LockOwner LocksDelegs  LocalOwn LocalOpen LocalLOwn
> 5638141453 0 0 0 0 0 0
Ok, I think this shows us the problem. 141453 opens is a lot, and the client
would have to check these every time another open is done (there goes all that
CPU ;-).

Now, why has this occurred?
Well, the NFSv4 client can't close NFSv4 Opens on a vnode until that vnode's
v_usecount goes to 0. This is because mmap'd files might do I/O after the file
descriptor is closed.
Now, hopefully Kostik will know something about nullfs and can help with this.
My guess is that nullfs ends up acquiring a refcnt on the NFS vnode so the
v_usecount doesn't go to 0 and, therefore, the client never closes the NFSv4 Opens.
Kostik, do you know if this is the case and whether or not it can be changed?
>LocalLock
>0
>Rpc Info:
>TimedOut   Invalid X Replies   Retries  Requests
>0 0 0 0   662
>Cache Info:
>Attr HitsMisses Lkup HitsMisses BioR HitsMisses BioW HitsMisses
> 127558   837   121 0 0 0 0
>BioRLHitsMisses BioD HitsMisses DirE HitsMisses
>1 0 6 0 1 0
>
[more stuff snipped]
>What role could nullfs be playing?
As noted above, my hunch is that nullfs is acquiring a refcnt on the NFS client
vnode such that the v_usecount doesn't go to zero (at least for a long time),
and without a VOP_INACTIVE() on the NFSv4 vnode, the NFSv4 Opens don't get
closed and accumulate.
(If that isn't correct, nullfs is somehow interfering with the client closing
the NFSv4 Opens in some other way.)

rick
