On Tue, May 17, 2011 at 6:49 AM, Jim Klimov <jimkli...@cos.ru> wrote:
> 2011-05-17 6:32, Donald Stahl wrote:
>>
>> I have two follow up questions:
>>
>> 1. We changed the metaslab size from 10M to 4k; that's a pretty
>> drastic change. Is there some median value that should be used instead
>> and/or is there a downside to using such a small metaslab size?
>>
>> 2. I'm still confused by the poor scrub performance and its impact on
>> the write performance. I'm not seeing a lot of IOs or processor load,
>> so I'm wondering what else I might be missing.
>
> I have a third question, following up to the first one above ;)
>
> 3) Does the "4k" size have any theoretical basis?
> Namely, is it a "reasonably large" run of eight or so
> metadata blocks of 512 bytes each, or is something else in
> play, such as a 4 KB IO?

The 4k size was based on analysis I did on some systems at Oracle.
The code left-shifts this value by another tunable (default 4, i.e. a
factor of 16) to determine the "fragmented" minimum size. So if you
bump this to 32k, the fragmented size is 512k, which tells ZFS to
switch to a different metaslab once the current one drops below this
threshold.
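
As a rough illustration of the arithmetic above, here is a minimal
Python sketch. The function and parameter names are hypothetical
(this message does not name the tunables), and this is not the ZFS
code itself, only the shift calculation as described:

```python
KIB = 1024

def fragmented_threshold(min_alloc_size: int, shift: int = 4) -> int:
    """Left-shift the minimum size by `shift` bits (multiply by 2**shift).

    Both names are placeholders for the two tunables described above;
    shift=4 is the default mentioned in the text.
    """
    return min_alloc_size << shift

if __name__ == "__main__":
    for size in (4 * KIB, 32 * KIB):
        print(f"{size // KIB}k -> {fragmented_threshold(size) // KIB}k")
```

With the default shift of 4, a 32k setting gives the 512k threshold
mentioned above, and the 4k setting gives 64k.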

>
> In particular, since my system uses 4 KB blocks (ashift=12),
> should I set the metaslab size to 32k (4K*8 blocks) for a
> similar benefit - yes/no?
>
> Am I also correct to assume that if I have a large streaming
> write and ZFS can see or predict that it will soon have to
> reference many blocks, it can allocate a metaslab larger
> than this specified minimum and thus keep fragmentation
> from becoming extremely high?

The metaslabs are predetermined at config time and their sizes are
fixed. A good way to think about them is as slices of your disk. If
you take your disk size and divide it into 200 equally sized
sections, the size of each section is your metaslab size.
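
To make the "slices of your disk" picture concrete, here is a small
Python sketch. It assumes a plain division by 200 as described above;
any rounding or alignment the real code applies is not modeled, and
the function names are made up for illustration:

```python
TARGET_METASLABS = 200  # the figure quoted above, used as a plain divisor

def approx_metaslab_size(disk_bytes: int, count: int = TARGET_METASLABS) -> int:
    """Approximate size of one slice when the disk is cut into `count` parts."""
    return disk_bytes // count

def metaslab_index(offset: int, metaslab_size: int) -> int:
    """Which fixed-size slice a given byte offset falls into."""
    return offset // metaslab_size

if __name__ == "__main__":
    disk = 2 * 1024**4  # a hypothetical 2 TiB disk
    size = approx_metaslab_size(disk)
    print(f"approx. metaslab size: {size / 1024**3:.2f} GiB")
    print("offset at 25 GiB is in slice", metaslab_index(25 * 1024**3, size))
```

Because the slices are fixed, an allocation never makes a metaslab
larger; it only picks which slice to allocate from.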

>
> Actually, am I understanding correctly that metaslabs are
> large contiguous ranges reserved for metadata blocks?
> If so, and if they are indeed treated specially anyway,
> is it possible to use 512-byte records for metadata even
> on VDEVs with a 4 KB block size configured by ashift=12?
> Perhaps not today, but as an RFE for ZFS development
> (I posted the idea here https://www.illumos.org/issues/954 )
>

No, metaslabs are used for all allocations and are not specific to
metadata. There's more work to do to deal efficiently with 4k block
sizes.

> Rationale: A great deal of space is wasted on my box just to
> reference data blocks while keeping 3.5 KB of trailing garbage ;)
>
> 4) In one internet post I've seen a suggestion to set this
> value as well:
> set zfs:metaslab_smo_bonus_pct = 0xc8
>
> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg40765.html

This is used to add more weight (i.e. preference) to specific
metaslabs. A metaslab receives this bonus if it has an offset which is
lower than a previously used metaslab. Sorry, this is somewhat
complicated and hard to explain without a whiteboard. :-)
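
In lieu of a whiteboard, a very rough Python sketch of the idea. This
is an assumption-laden illustration, not the ZFS implementation: the
function name, the "bonus area" boundary, and the way the percentage
is applied to the weight are all hypothetical. The only facts taken
from this thread are that the value is a percentage-style tunable
(0xc8 is 200 in decimal) and that lower-offset metaslabs get the
bonus:

```python
def weight_with_bonus(base_weight: int, offset: int,
                      bonus_area_end: int, bonus_pct: int = 0xC8) -> int:
    """Scale the weight of a metaslab that starts below `bonus_area_end`.

    Assumption: the bonus scales the weight to `bonus_pct` percent of its
    base value; the real formula may differ.
    """
    if offset < bonus_area_end:
        return base_weight * bonus_pct // 100
    return base_weight

if __name__ == "__main__":
    print(0xC8)                                   # decimal value of the setting
    print(weight_with_bonus(1000, offset=10, bonus_area_end=50))
    print(weight_with_bonus(1000, offset=90, bonus_area_end=50))
```

Under these assumptions, a higher percentage makes the allocator
prefer metaslabs at lower offsets more strongly.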

Thanks,
George

>
> Can anybody comment - what it is and whether it would
> be useful? The original post passed the knowledge as-is...
> Thanks



-- 
George Wilson



M: +1.770.853.8523
F: +1.650.494.1676
275 Middlefield Road, Suite 50
Menlo Park, CA 94025
http://www.delphix.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
