Scott wrote:
> Hello,
> 
> I have several ~12TB storage servers using Solaris with ZFS.  Two of
> them have recently developed performance issues where the majority of
> time in an spa_sync() will be spent in the space_map_*() functions.
> During this time, "zpool iostat" will show 0 writes to disk, while it
> does hundreds or thousands of small (~3KB) reads each second,
> presumably reading space map data from disk to find places to put the
> new blocks.  The result is that it can take several minutes for an
> spa_sync() to complete, even if I'm only writing a single 128KB
> block.
> 
> Using DTrace, I can see that space_map_alloc() frequently returns -1
> for 128KB blocks.  From my understanding of the ZFS code, that means
> that one or more metaslabs has no 128KB blocks available.  Because of
> that, it seems to be spending a lot of time going through different
> space maps which aren't able to all be cached in RAM at the same
> time, thus causing bad performance as it has to read from the disks.
> The on-disk space map size seems to be about 500MB.

This does sound like ZFS is searching for larger, properly aligned
free-space segments and failing to find them.
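
To illustrate why an allocation can fail while plenty of space is still free, here is a toy first-fit model. It is only a sketch and not ZFS code: the name toy_alloc, the segment layout, and the 512-byte alignment are assumptions made up for the example.

```python
# Toy model only, not ZFS code.
# Free space is a list of (offset, length) segments, in bytes.
KB = 1024

def toy_alloc(free_segments, size, align):
    """Return the first aligned offset that can hold `size` bytes, or -1."""
    for start, length in free_segments:
        aligned = (start + align - 1) // align * align  # round up to alignment
        if aligned + size <= start + length:
            return aligned
    return -1

# 1000 scattered 64KB holes: plenty of free space in total,
# but no single contiguous 128KB run.
fragmented = [(i * 256 * KB, 64 * KB) for i in range(1000)]
total_free = sum(length for _, length in fragmented)

print(total_free // KB, "KB free in total")
print("128KB request:", toy_alloc(fragmented, 128 * KB, 512))
print("32KB request: ", toy_alloc(fragmented, 32 * KB, 512))
```

In this model the 128KB request returns -1 even though about 62MB is free, while the 32KB request succeeds at once. A real allocator would then have to move on to another space map, which is where the extra reads would come from.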

> I assume the simple solution is to leave enough free space available
> so that the space map functions don't have to hunt around so much.
> This problem starts happening when there's about 1TB free out of the
> 12TB.  It seems like such a shame to waste that much space, so if
> anyone has any suggestions, I'd be glad to hear them.

Although the fix for 6596237 "Stop looking and start ganging", as suggested
by Sanjeev, will provide some relief here, you are running your pool at
92% capacity, so it may be time to consider expanding it.
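
As a quick check of the 92% figure, assuming the roughly 1TB free out of 12TB quoted above (both numbers are approximate):

```python
# Approximate figures taken from the quoted message.
total_tb = 12.0
free_tb = 1.0

used_pct = (total_tb - free_tb) / total_tb * 100
print(f"{used_pct:.1f}% used")
```

That works out to a little under 92% used.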

> 1) Is there anything I can do to temporarily fix the servers that are
> having this problem? They are production servers, and I have
> customers complaining, so a temporary fix is needed.

Setting the ZFS recordsize to a value smaller than the default 128K may
help, but only temporarily.

> 2) Is there any sort of tuning I can do with future servers to
> prevent this from becoming a problem?  Perhaps a way to make sure all
> the space maps are always in RAM?

The fix for 6596237 will improve performance in such cases, so you should
make sure it is installed once it becomes available.
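
Going by its title, the idea behind that fix is to give up the search for one large contiguous block sooner and assemble the block from smaller pieces instead ("ganging"). The following is a toy sketch of that idea only, not the actual ZFS implementation: the names take and alloc_or_gang, the halving strategy, and the 512-byte minimum piece are all assumptions made for illustration, and real gang blocks are organized differently on disk.

```python
# Toy model only, not ZFS code.
# Free space is a list of [offset, length] holes, in bytes.
KB = 1024

def take(free, size):
    """First fit: carve `size` bytes from the first hole that fits, else None."""
    for hole in free:
        if hole[1] >= size:
            offset = hole[0]
            hole[0] += size
            hole[1] -= size
            return offset
    return None

def alloc_or_gang(free, size, min_size=512):
    """Try one contiguous allocation; on failure, split the request in two."""
    offset = take(free, size)
    if offset is not None:
        return [(offset, size)]
    half = size // 2
    if half < min_size:
        raise MemoryError("no space even for the smallest piece")
    return (alloc_or_gang(free, half, min_size)
            + alloc_or_gang(free, size - half, min_size))

# Four 64KB holes: no contiguous 128KB run exists.
free = [[i * 256 * KB, 64 * KB] for i in range(4)]
pieces = alloc_or_gang(free, 128 * KB)
print(pieces)
```

Here the 128KB request is satisfied by two 64KB pieces from different holes rather than by a long search for a single 128KB run. The cost is that the block is no longer contiguous, so this gives relief rather than a cure for a nearly full pool.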

The ability to defragment a pool could be useful as well.

> 3) I set recordsize=32K and turned off compression, thinking that
> should fix the performance problem for now.  However, using a DTrace
> script to watch calls to space_map_alloc(), I see that it's still
> looking for 128KB blocks (!!!) for reasons that are unclear to me,
> thus it hasn't helped the problem.

Changing recordsize affects the block sizes ZFS uses for data blocks. It may
still require bigger blocks for metadata.

DTrace may help you better understand what is causing ZFS to try to
allocate bigger blocks. For example, larger blocks may still be used for
the ZIL.

Wbr,
victor
