Hello,

I have several ~12TB storage servers running Solaris with ZFS.  Two of them have
recently developed performance issues where the majority of the time in an
spa_sync() is spent in the space_map_*() functions.  During this time,
"zpool iostat" shows 0 writes to disk, while the pool does hundreds or thousands
of small (~3KB) reads each second, presumably reading space map data from disk
to find places to put the new blocks.  The result is that it can take several
minutes for an spa_sync() to complete, even if I'm only writing a single 128KB
block.

Using DTrace, I can see that space_map_alloc() frequently returns -1 for 128KB
blocks.  From my understanding of the ZFS code, that means that one or more
metaslabs have no 128KB blocks available.  Because of that, it seems to be
spending a lot of time going through different space maps, which can't all be
cached in RAM at the same time, so performance suffers as it has to read them
from the disks.  The on-disk space map size seems to be about 500MB.
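
To illustrate the failure mode described above, here is a toy Python sketch of
an allocator that returns -1 when no single free segment is large enough, even
though plenty of space is free in total.  This is not the ZFS implementation:
toy_alloc(), the first-fit policy, and the 64KB hole layout are all assumptions
made up for illustration.

```python
# Toy model only: NOT the ZFS allocator. All names and sizes are hypothetical.
from typing import List, Tuple

Segment = Tuple[int, int]  # (offset, length) in bytes

KB = 1024


def toy_alloc(free_segments: List[Segment], size: int) -> int:
    """First-fit allocation over a list of free segments.

    Returns the offset of the allocated range, or -1 if no single free
    segment is at least `size` bytes long.
    """
    for i, (offset, length) in enumerate(free_segments):
        if length >= size:
            remaining = length - size
            if remaining:
                # Shrink the segment from the front.
                free_segments[i] = (offset + size, remaining)
            else:
                # Segment fully consumed.
                del free_segments[i]
            return offset
    return -1


# A fragmented region: 1000 free 64KB holes, each followed by 64KB in use.
fragmented = [(i * 128 * KB, 64 * KB) for i in range(1000)]
total_free = sum(length for _, length in fragmented)

result_128k = toy_alloc(fragmented, 128 * KB)  # no hole is big enough
result_32k = toy_alloc(fragmented, 32 * KB)    # fits in the first hole

print(total_free // KB, "KB free in total")
print("128KB request:", result_128k)
print("32KB request:", result_32k)
```

In this toy model the 128KB request fails despite roughly 62MB being free,
while a 32KB request succeeds immediately, which is why a smaller block size
looked like a plausible workaround.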

I assume the simple solution is to leave enough free space available so that 
the space map functions don't have to hunt around so much.  This problem starts 
happening when there's about 1TB free out of the 12TB.  It seems like such a 
shame to waste that much space, so if anyone has any suggestions, I'd be glad 
to hear them.
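
For reference, the threshold mentioned above works out as follows.  This is
plain arithmetic on the rounded figures in this message (about 1TB free out of
12TB); free_fraction() is a hypothetical helper, not a ZFS tunable.

```python
# Arithmetic only, using the rounded figures quoted in the message.
def free_fraction(free_tb: float, total_tb: float) -> float:
    """Fraction of the pool that is still free."""
    return free_tb / total_tb


frac = free_fraction(1.0, 12.0)
percent_free = round(frac * 100, 1)
percent_used = round((1 - frac) * 100, 1)

print(f"{percent_free}% free, {percent_used}% used")
```

So the slowdown begins with the pool a little over 90% full.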

1) Is there anything I can do to temporarily fix the servers that are having 
this problem? They are production servers, and I have customers complaining, so 
a temporary fix is needed.

2) Is there any sort of tuning I can do with future servers to prevent this 
from becoming a problem?  Perhaps a way to make sure all the space maps are 
always in RAM?

3) I set recordsize=32K and turned off compression, thinking that should fix
the performance problem for now.  However, using a DTrace script to watch calls
to space_map_alloc(), I see that it's still looking for 128KB blocks (!!!) for
reasons that are unclear to me, so the change hasn't helped.  Why is that?

Thanks,
Scott