On Aug 11, 2012, at 5:33 PM, Chad Leigh - Pengar LLC wrote:

> Hi
> 
> I have a FreeBSD 9 system with ZFS root.  It is actually a VM under Xen on a 
> beefy piece of HW (4-core Sandy Bridge 3 GHz Xeon, 32GB total HW memory -- 
> the VM has 4 vCPUs and 6GB RAM), with mirrored gpart partitions.  I am 
> looking for data integrity more than performance, as long as performance is 
> reasonable (which it has more than been for the last 3 months).
> 
> The other "servers" on the same HW, the other VMs on the same, don't have 
> this problem but are set up the same way.  There are 4 other FreeBSD VMs, one 
> running email for a one man company and a few of his friends, as well as some 
> static web pages and stuff for him, one runs a few low use web apps for 
> various customers, and one runs about 30 websites with apache and nginx, 
> mostly just static sites.  None are heavily used.  There is also one VM with 
> linux running a couple low use FrontBase databases.   Not high use database 
> -- low use ones.
> 
> The troublesome VM has been running fine for over 3 months since I installed 
> it, and its level of use has been pretty much constant.  It runs 4 jails, 
> each dedicated to a different piece of email processing for a small number of 
> users: one is a secondary DNS, one runs clamav and spamassassin, one runs 
> exim for incoming and outgoing mail, and one runs dovecot for imap and pop.  
> There is no web server, database, or anything else running.
> 
> The total number of mail users on the system is roughly 50, and total mail 
> traffic is very low compared to "real" mail servers.
> 
> Earlier this week things started "freezing up".  Processes become 
> unresponsive; a freeze might last a few minutes or as long as half an hour.  
> It eventually resolves itself and things are good for another 10 minutes to 
> 3 hours until it happens again.  When it happens, lots of processes are 
> listed in "top" in states such as 
> 
> zfs
> zio->i
> zfs
> tx->tx
> db->db
> 
> These processes only show up in these states when there are problems.  What 
> are these states indicative of?
> 

Ok, after much reading of ZFS blog posts, forum posts, and mailing list 
threads, and after trying things out, I seem to have gotten things back to 
normal, reasonable performance.
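
First, on the question of those states: from what I read, the names in top's 
STATE column are kernel wait channels, i.e. whatever the thread is sleeping on 
inside the kernel -- "tx->tx" is a wait on a ZFS transaction group, "zio->i" 
is a wait on a ZFS I/O to complete, and "db->db" is a wait on a dbuf.  If you 
want to see exactly where a stuck process is blocked, procstat from the base 
system can dump its kernel stack (the PID below is just a placeholder):

  # print the kernel stack of a blocked process to see what it is waiting on
  procstat -kk 1234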

In case anyone has similar issues in a similar setup, here is what I did.  
Some of these changes may have had little or no effect, but this is everything 
that was changed.
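
For reference, all of these knobs live under the vfs.zfs sysctl tree (the 
exact names can vary between FreeBSD releases), so it is worth recording the 
current values before touching anything:

  # record the current values before tuning
  sysctl vfs.zfs.zfetch.block_cap vfs.zfs.write_limit_override \
      vfs.zfs.vdev.cache.size vfs.zfs.txg.timeout vfs.zfs.arc_max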

The biggest effect came from the following:

vfs.zfs.zfetch.block_cap   lowered from the default of 256 to 64

This was like night and day.  The idea to try it came from a post by user 
"madtrader" in this thread: 
http://forums.sagetv.com/forums/showthread.php?t=43830&page=2 .  He was 
recording multiple streams of HD video while trying to play back HD video from 
the same server/ZFS file system.
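
For anyone wanting to try the same thing, the usual place to set it is 
/boot/loader.conf; on some releases this is a boot-time tunable that only 
takes effect after a reboot, and if sysctl will let you write it you can also 
change it live (a sketch, not gospel):

  # /boot/loader.conf -- cap prefetch at 64 blocks per fetch (after reboot)
  vfs.zfs.zfetch.block_cap="64"

  # or live, if the running kernel exposes it as writable
  sysctl vfs.zfs.zfetch.block_cap=64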


Also, setting

vfs.zfs.write_limit_override   to something other than the default of "0" (disabled)

seems to have had a relatively significant effect.  Before I worked with the 
"block_cap" above, I was focusing on this and had tried everything from 64M to 
768M.  It is currently set to 576M, which is around the sweet spot on my 
system with my amount of RAM (6GB).  I tried 512M with good results, then 
768M, which was still good but not quite as good as far as I could tell from 
testing.  So I went with 576M on my last attempt, then added in the block_cap 
change, and things really are pretty much back to normal.
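
If your kernel exposes write_limit_override as writable, you can sweep values 
on the fly (the live sysctl wants plain bytes); to make the final value stick, 
put it in /boot/loader.conf, where a size suffix works.  Roughly:

  # try a value live, in bytes (576 * 1024 * 1024 = 603979776)
  sysctl vfs.zfs.write_limit_override=603979776

  # make it permanent in /boot/loader.conf
  vfs.zfs.write_limit_override="576M"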


I also turned on vdev caching:

vfs.zfs.vdev.cache.size   from 0 to 10M.   I don't know if it helped.
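
As far as I can tell this one is a boot-time tunable, so it goes in 
/boot/loader.conf and takes effect on the next reboot:

  # /boot/loader.conf -- give each vdev a small 10MB read-ahead cache
  vfs.zfs.vdev.cache.size="10M"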

I also lowered 

vfs.zfs.txg.timeout   from 5 to 3.   This seems to have had a slightly 
noticeable effect.
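
The timeout is the maximum number of seconds between transaction group 
commits.  Assuming your kernel lets you write it at runtime, you can change it 
live, or set it at boot:

  # sync transaction groups at least every 3 seconds
  sysctl vfs.zfs.txg.timeout=3

  # or persistently in /boot/loader.conf
  vfs.zfs.txg.timeout="3"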


I also adjusted

vfs.zfs.arc_max

The default of 0 (meaning the system picks the value itself) seemed to result 
in an actual value of around 75-80% of RAM, which seemed high.  I ended up 
setting it to 3072M, which seems to work well for me.  I don't know what the 
overall effect on the problem was, though.
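
On 9.x arc_max is a boot-time tunable, so the cap goes in /boot/loader.conf 
and needs a reboot; after that you can compare it against what the ARC is 
actually using:

  # /boot/loader.conf -- cap the ARC at 3GB (takes effect after a reboot)
  vfs.zfs.arc_max="3072M"

  # after booting, compare the cap against actual ARC usage
  sysctl vfs.zfs.arc_max kstat.zfs.misc.arcstats.size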


Thanks
Chad

