Thursday, August 30, 2012, 5:29:16 PM, you wrote:

> 1. Is there a one to many relation between a cached object and a Doc
> structure?

Yes, those are the chained Docs in the lower right. Each Doc represents a
fragment and, as we discussed previously, objects are stored as an ordered set
of fragments.

> 2. A Doc structure contains a table of fragments which are represented as
> uint64_t offsets.

The fragment offsets are the location in the object of the fragment data. 
frag_offset[i] is the address of the first byte past the end of fragment i. To 
simplify, presume fragments are at most 100 bytes long and we have an object 
that is 390 bytes long in four fragments (0,1,2,3). Then frag_offset[1] could 
be 201 which would mean the next byte past the end of fragment 1 is byte 201 of 
390 in the original object (or equivalently that the first byte of fragment 2 
is byte 201 of 390 in the object). This data is used to accelerate range 
requests so that if you had a request for bytes 220-300 of the object you could 
skip immediately to fragment 2 without reading fragment 1. In real life there 
are some very large (20M+) files out there and being able to skip reading the 
first 10M or so from disk is an enormous performance improvement. Note that the
earliest Doc for an alternate is always read because of the way the logic is
done now, although in theory that could be skipped because the fragment
data you need is in the First Doc, which doesn't contain any object data for
objects with multiple alternates or that are more than 1 fragment long.
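
To make the lookup concrete, here is a rough sketch of how a range request
could pick its starting fragment from the offset table. This is not the actual
cache code: the function name, the use of std::vector, and the example table
are assumptions for illustration only. The offsets use the same numbers as the
390 byte example above, so frag_offset[1] is 201.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not the real Traffic Server implementation.
// frag_offset[i] is the position of the first byte past the end of fragment i,
// so the fragment holding `pos` is the first entry strictly greater than `pos`.
// Returns frag_offset.size() when `pos` lies past the end of the object.
std::size_t
fragment_for_byte(const std::vector<std::uint64_t> &frag_offset, std::uint64_t pos)
{
  // The table must be ascending for the binary search to be valid.
  assert(std::is_sorted(frag_offset.begin(), frag_offset.end()));
  auto it = std::upper_bound(frag_offset.begin(), frag_offset.end(), pos);
  return static_cast<std::size_t>(it - frag_offset.begin());
}

// Assumed table for the four fragment example (fragments of at most 100 bytes).
const std::vector<std::uint64_t> example_offsets = {101, 201, 301, 391};
```

With that table, fragment_for_byte(example_offsets, 220) yields 2, so a
request for bytes 220-300 can start at fragment 2 and never touch the data in
fragments 0 and 1.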
