Hello everyone,

Looking closer and comparing our servers, we in fact have two servers that
behave differently, but they also have a different workload (mail instead
of file storage). They have identical hardware, but were installed at a
later time than the servers which do have the issue.
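For reference, the figures below come from "zfs-stats -a". Assuming the
standard kstat.zfs.misc.arcstats sysctl names (an assumption on my part,
but they should be present on 10.1), the same counters can also be polled
directly, e.g. hourly from cron, to watch whether the header keeps growing:

#!/bin/sh
# Sketch only: snapshot the L2ARC counters (values are in bytes) so that
# successive runs show whether the header size keeps climbing.
# Assumes the kstat.zfs.misc.arcstats.* sysctls of FreeBSD 10.x ZFS.
date
sysctl kstat.zfs.misc.arcstats.l2_size \
       kstat.zfs.misc.arcstats.l2_asize \
       kstat.zfs.misc.arcstats.l2_hdr_size \
       kstat.zfs.misc.arcstats.l2_cksum_bad \
       kstat.zfs.misc.arcstats.l2_io_error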
They have a stable L2ARC cache size, unlike the one I described previously,
where the L2ARC size is bigger than the dataset in the pool.

Non-leaking server:

L2 ARC Summary: (HEALTHY)
        Passed Headroom:                63.05m
        Tried Lock Failures:            198.57m
        IO In Progress:                 53.57k
        Low Memory Aborts:              32
        Free on Write:                  21.40k
        Writes While Full:              16.50k
        R/W Clashes:                    3.50k
        Bad Checksums:                  0
        IO Errors:                      0
        SPA Mismatch:                   613.80m

L2 ARC Size: (Adaptive)                 443.42  GiB
        Header Size:    0.27%           1.22    GiB

L2 ARC Evicts:
        Lock Retries:                   1.33k
        Upon Reading:                   0

L2 ARC Breakdown:                       191.36m
        Hit Ratio:      28.27%          54.09m
        Miss Ratio:     71.73%          137.27m
        Feeds:                          1.68m

L2 ARC Buffer:
        Bytes Scanned:                  4.28    PiB
        Buffer Iterations:              1.68m
        List Iterations:                107.55m
        NULL List Iterations:           1.16m

L2 ARC Writes:
        Writes Sent:    100.00%         915.04k

No bad checksums or IO errors, and the reported L2ARC size of 443 GiB is
sensible compared to the device's actual size (373 GB). Yet aside from the
workload I cannot find any difference between the two "mail" fileservers
and the other 6+ "web" fileservers doing storage for websites. They are
identical in every regard apart from the workload and the time of
installation; the mail fileservers were installed much later.

I just wanted to add this, as it may be very relevant.
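In case it helps: assuming the memory really is held by the L2ARC headers,
one possible stopgap (an untested sketch, not something we have done here)
would be to drop and re-add the cache device, since removing a cache vdev
should release its header memory:

# zpool remove tank da23
# zpool add tank cache da23

The obvious downside is that the L2ARC then has to warm up again from
scratch.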
With kind regards,

Daniel

On 08/11/2015 04:42 PM, Daniel Genis wrote:
> Dear FreeBSD community,
>
> We're facing a somewhat odd issue, perhaps similar to what is discussed
> here: https://forums.freebsd.org/threads/l2arc-degraded.47540/
>
> The issue is that the L2ARC header seems to grow without limit, similar
> to a memory leak, pushing more and more memory out of the ARC over time.
>
> For example, the output of "zpool iostat -v 1":
>
>                  capacity     operations    bandwidth
> pool          alloc   free   read  write   read  write
> ------------  -----  -----  -----  -----  -----  -----
> syspool       1.15G   275G      0      0      0      0
>   mirror      1.15G   275G      0      0      0      0
>     gpt/zfs0      -      -      0      0      0      0
>     gpt/zfs1      -      -      0      0      0      0
> ------------  -----  -----  -----  -----  -----  -----
> tank          1.21T  1.51T    229  1.99K  3.67M  9.48M
>   mirror       124G   154G     67    125   787K   503K
>     da0           -      -     20     27   440K   503K
>     da1           -      -     45     28   379K   503K
> [...]
>   mirror       124G   154G     34    164   454K   612K
>     da18          -      -     26     12   417K   612K
>     da19          -      -      6     13  58.8K   612K
> logs              -      -      -      -      -      -
>   mirror       117M  74.4G      0    109      0  1.75M
>     da21          -      -      0    109      0  1.75M
>     da22          -      -      0    109      0  1.75M
> cache             -      -      -      -      -      -
>   da23        1.67T  16.0E    302      7  2.85M   223K
> ------------  -----  -----  -----  -----  -----  -----
>
> Here the cache shows 1.67T in use and 16.0E free.
> The cache is a 373GB Intel SSD.
>
> # diskinfo -v da23
> da23
>         512             # sectorsize
>         400088457216    # mediasize in bytes (373G)
>         781422768       # mediasize in sectors
>         4096            # stripesize
>         0               # stripeoffset
>         48641           # Cylinders according to firmware.
>         255             # Heads according to firmware.
>         63              # Sectors according to firmware.
>         BTTV4234089C400HGN      # Disk ident.
>         id1,enc@n500e004aaaaaaa3e/type@0/slot@18        # Physical path
>
> The L2ARC stats section from "zfs-stats -a":
>
> L2 ARC Summary: (DEGRADED)
>         Passed Headroom:                133.33m
>         Tried Lock Failures:            4.90b
>         IO In Progress:                 313.63k
>         Low Memory Aborts:              1.52k
>         Free on Write:                  589.79k
>         Writes While Full:              34.57k
>         R/W Clashes:                    46.95k
>         Bad Checksums:                  408.40m
>         IO Errors:                      151.99m
>         SPA Mismatch:                   632.00m
>
> L2 ARC Size: (Adaptive)                 1.89    TiB
>         Header Size:    0.88%           16.98   GiB
>
> L2 ARC Evicts:
>         Lock Retries:                   1.27k
>         Upon Reading:                   2
>
> L2 ARC Breakdown:                       2.10b
>         Hit Ratio:      32.89%          691.15m
>         Miss Ratio:     67.11%          1.41b
>         Feeds:                          3.70m
>
> L2 ARC Buffer:
>         Bytes Scanned:                  10.70   PiB
>         Buffer Iterations:              3.70m
>         List Iterations:                236.30m
>         NULL List Iterations:           24.86m
>
> L2 ARC Writes:
>         Writes Sent:    100.00%         3.38m
>
> Here we can see that the Header Size is currently almost 17 GiB.
> This header size grows continuously, without (apparent) limit.
> Also, ZFS appears to think it is holding 1.89 TiB inside the L2ARC,
> which seems very unlikely.
>
> # freebsd-version
> 10.1-RELEASE-p13
>
> # uname -a
> FreeBSD servername 10.1-RELEASE-p10 FreeBSD 10.1-RELEASE-p10 #0: Wed May
> 13 06:54:13 UTC 2015
> r...@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64
>
> # uptime
> 4:35PM up 42 days, 15:24, 1 user, load averages: 1.35, 0.96, 0.84
>
> Does anyone know how we can alleviate the issue?
> We originally thought it was caused by
> https://www.freebsd.org/security/advisories/FreeBSD-EN-15:07.zfs.asc
>
> We have updated our servers since, but the header size seems to keep
> growing. For reference, we have multiple BSD fileservers which are used
> mostly over NFS, all with identical configuration (but varying
> workload). They all still show these symptoms.
>
> Any tips/hints/pointers are appreciated!
>
> With kind regards,
>
> Daniel

-- 
With kind regards,

Daniel Genis
Technical Staff
Byte Internet

W    http://www.byte.nl/
E    dan...@byte.nl
T    020 521 6226
F    020 521 6227

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"