On Mon, Mar 01, 2010 at 09:22:38AM -0800, Richard Elling wrote:
> > Once again, I'm assuming that each DDT entry corresponds to a
> > record (slab), so to be exact, I would need to know the number of
> > slabs (which doesn't currently seem possible). I'd be satisfied
> > with a guesstimate based on what my expected average block size
> > is. But what I need to know is how big a DDT entry is for each
> > record. I'm trying to parse the code, and I don't have it in a
> > sufficiently intelligent IDE right now to find all the
> > cross-references.
> >
> > I've got as far as (in ddt.h)
> >
> > struct ddt_entry {
[..]
> > };
> >
> > Any idea what these structure sizes actually are?
>
> Around 270 bytes, or one 512 byte sector.
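Taking those two figures at face value, the per-entry arithmetic is
trivial. A throwaway snippet, assuming -- and it is only an assumption
on my part -- that the difference between the two numbers is just
rounding up to a whole sector:

    /*
     * Per-entry arithmetic only, using the sizes quoted above.  The
     * "pad up to a whole 512-byte sector" step is my guess at why
     * ~270 bytes is described as one sector, not something confirmed
     * from the code.
     */
    #include <stdio.h>
    #include <stdint.h>

    #define SECTOR  512ULL

    int
    main(void)
    {
        uint64_t in_core = 270;   /* quoted in-core size, approximate */
        uint64_t on_disk = (in_core + SECTOR - 1) / SECTOR * SECTOR;

        printf("in-core DDT entry : ~%llu bytes\n",
            (unsigned long long)in_core);
        printf("on-disk DDT entry :  %llu bytes (rounded up to sectors)\n",
            (unsigned long long)on_disk);
        return (0);
    }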
Is the assumption above correct - that the DDT stores one of these
records per "block", and as such the native recordsize of the DDT is
just 512 bytes? Or are the entries aggregated somehow? Is this the
difference between the in-memory and on-disk sizes, due to sector
alignment padding?

We got as far as showing that 512-byte records in L2ARC are expensive
in RAM overhead, but I still don't know for sure the recordsize of the
DDT as seen by the ARC.

I'm still hoping someone will describe how to use zdb to find and
inspect the DDT object on-disk. Last time we got stuck trying to
determine the units used for the numbers printed by zdb -D.

This whole sizing business is getting to be quite a FAQ, and there
hasn't really been a clear answer. Yes, there are many moving parts,
and applying generic sizing recommendations is hard -- but at least
being able to see more of the parts would help. If nothing else, it
would help move these kinds of discussions along to more specific
analysis. So:

 - what are the units/sizes in bytes reported by zdb -D
 - what is the in-memory size of a DDT entry, including overheads
 - what is the on-disk size of a DDT entry, including overheads
 - what is the recordsize of the DDT, as visible to L2ARC
 - what RAM overhead %age does L2ARC need for that recordsize

With those, one can start from zdb stats on an existing pool, or from
estimates about certain kinds of data, and add overheads and multiply
down the list to model the totals (a rough sketch of that arithmetic
is in the PS below). One can also probably see clearly the benefit of
extra L1ARC capacity vs L2ARC with those overheads, and the cost of
doing dedup at all.

--
Dan.
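PS: to make the "multiply down the list" step concrete, here is a
throwaway sketch with every unknown from the list above stubbed out as
a placeholder. None of the numbers below are claims about what ZFS
actually does; they are just the slots I would like to be able to fill
in with real answers.

    /*
     * Toy model of the questions above.  All constants are placeholder
     * guesses, to be replaced once the real per-entry sizes and L2ARC
     * overhead are known.  Units are bytes unless noted otherwise.
     */
    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* From zdb -D output, or estimated as data size / avg block size. */
        uint64_t unique_blocks  = 16ULL * 1024 * 1024;  /* e.g. 1 TB at 64 KB */

        /* The unknowns from the list above -- all guesses. */
        uint64_t entry_in_core  = 270;   /* in-memory DDT entry size (?) */
        uint64_t entry_on_disk  = 512;   /* on-disk DDT entry size (?) */
        uint64_t ddt_recordsize = 512;   /* DDT recordsize seen by L2ARC (?) */
        double   l2arc_ram_frac = 0.25;  /* RAM per L2ARC record, as a
                                            fraction of that recordsize (?) */

        uint64_t ddt_ram   = unique_blocks * entry_in_core;
        uint64_t ddt_disk  = unique_blocks * entry_on_disk;
        uint64_t ddt_l2arc = unique_blocks * ddt_recordsize;
        uint64_t l2_ram    = (uint64_t)(ddt_l2arc * l2arc_ram_frac);

        printf("DDT held in RAM (L1ARC)  : %8llu MB\n",
            (unsigned long long)(ddt_ram >> 20));
        printf("DDT on disk              : %8llu MB\n",
            (unsigned long long)(ddt_disk >> 20));
        printf("DDT held in L2ARC        : %8llu MB\n",
            (unsigned long long)(ddt_l2arc >> 20));
        printf("RAM to index that L2ARC  : %8llu MB\n",
            (unsigned long long)(l2_ram >> 20));
        return (0);
    }

Plug the real answers into those five placeholders and the same
arithmetic gives the L1ARC-vs-L2ARC trade-off, and the overall cost of
dedup, directly.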