On Mon, Mar 01, 2010 at 09:22:38AM -0800, Richard Elling wrote:
> >  Once again, I'm assuming that each DDT entry corresponds to a
> >  record (slab), so to be exact, I would need to know the number of
> >  slabs (which doesn't currently seem possible).  I'd be satisfied
> >  with a guesstimate based on what my expected average block size
> >  is.  But what I need to know is how big a DDT entry is for each
> >  record. I'm trying to parse the code, and I don't have it in a
> >  sufficiently intelligent IDE right now to find all the
> >  cross-references.  
> >  
> > I've got as far as (in ddt.h)
> > 
> > struct ddt_entry {
[..]
> > };
> > 
> > Any idea what these structure sizes actually are?
> 
> Around 270 bytes, or one 512 byte sector.

Is the assumption above correct: that the DDT stores one of these
entries per "block", and so the native recordsize of the DDT is just
512 bytes?  Or are entries aggregated somehow?  Is the gap between
~270 bytes and 512 just the difference between the in-memory and
on-disk sizes, due to sector-alignment padding?
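
For the sake of argument, here's the arithmetic I'm assuming.  The
270 is Richard's figure above; the padding up to a whole sector is
exactly the part I'm asking about, so treat it as a guess:

    /* round a ~270-byte in-core entry up to a 512-byte sector */
    #include <stdio.h>

    int
    main(void)
    {
            int core   = 270;       /* Richard's estimate, not measured */
            int sector = 512;
            int ondisk = ((core + sector - 1) / sector) * sector;

            printf("in-core ~%d bytes, on-disk (if sector-padded) %d bytes\n",
                core, ondisk);
            return (0);
    }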

We got as far as showing that 512-byte records in L2ARC are expensive
in RAM overhead, but I still don't know for sure the recordsize of the
DDT as seen by the ARC.  I'm still hoping someone will describe how to
use zdb to find and inspect the DDT object on disk.  Last time we got
stuck trying to determine the units for the numbers printed by zdb -D.
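
(For the record, the back-of-envelope behind "expensive": every
record cached in L2ARC keeps its ARC header in main memory.  I don't
have a confirmed figure for that header, so the 200 below is a
placeholder -- it's the last question on the list further down.)

    /* RAM cost of caching 512-byte records in L2ARC */
    #include <stdio.h>

    int
    main(void)
    {
            double hdr_bytes  = 200.0;  /* placeholder ARC header size */
            double recordsize = 512.0;  /* assumed DDT recordsize */

            printf("RAM overhead: %.0f%% of the L2ARC space used\n",
                100.0 * hdr_bytes / recordsize);
            return (0);
    }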

This whole sizing business is getting to be quite a FAQ, and there
hasn't really been a clear answer.  Yes, there are many moving parts,
and applying generic sizing recommendations is hard -- but at least
being able to see more of the parts would help.  If nothing else, it
would help move these kinds of discussions along to more specific
analysis. 

So:

 - what are the units/sizes in bytes reported by zdb -D?
 - what is the in-memory size of a DDT entry, including overheads?
 - what is the on-disk size of a DDT entry, including overheads?
 - what is the recordsize of the DDT, as visible to the L2ARC?
 - what RAM overhead percentage does the L2ARC need for that recordsize?

With those, one can start from zdb stats on an existing pool, or from
estimates about particular kinds of data, add the overheads, and
multiply down the list to model the totals.  One could then also see
clearly the benefit of extra L1ARC capacity vs. L2ARC given those
overheads, and the cost of doing dedup at all.
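
To make "multiply down the list" concrete, this is the shape of the
model I'd like to be able to fill in.  Every constant is a
placeholder: the 270 is Richard's estimate, the 512 and 200 are the
unknowns asked about above, and the pool size and blocksize are just
an example:

    #include <stdio.h>

    int
    main(void)
    {
            double data_bytes      = 1e12;          /* say, 1 TB in the pool */
            double avg_blocksize   = 64 * 1024;     /* guesstimated average */
            double ddt_core_bytes  = 270;           /* in-memory entry (per Richard) */
            double ddt_disk_bytes  = 512;           /* on-disk entry, if sector-padded */
            double l2arc_hdr_bytes = 200;           /* ARC header per L2ARC record (placeholder) */

            double nblocks  = data_bytes / avg_blocksize;
            double ddt_ram  = nblocks * ddt_core_bytes;
            double ddt_disk = nblocks * ddt_disk_bytes;
            /* if the whole DDT ends up in L2ARC as 512-byte records: */
            double l2_ram   = (ddt_disk / 512) * l2arc_hdr_bytes;

            printf("blocks             : %.0f\n", nblocks);
            printf("DDT in RAM         : %.0f MB\n", ddt_ram / (1 << 20));
            printf("DDT on disk        : %.0f MB\n", ddt_disk / (1 << 20));
            printf("ARC hdrs for L2ARC : %.0f MB\n", l2_ram / (1 << 20));
            return (0);
    }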

--
Dan.
