On Thu, Sep 08, 2011 at 01:08:33AM -0000, danie...@apache.org wrote:
> Author: danielsh
> Date: Thu Sep  8 01:08:33 2011
> New Revision: 1166489
> 
> URL: http://svn.apache.org/viewvc?rev=1166489&view=rev
> Log:
> On the fs-successor-ids branch, actually implement sharding.
> 
> Found by: stsp
> 
> * subversion/libsvn_fs_fs/fs_fs.c
>   (FSFS_SUCCESSORS_REVISIONS_PER_SHARD): New helper.
>   (path_successor_ids_shard, path_successor_ids,
>    path_successor_node_revs_shard, path_successor_node_revs):
>      Fix path calculations.
>   (update_successor_ids_file):
>      Fix checks for 'New shard' and 'New file in a shard'.
> 

Just to make sure we both have the same idea:

Each file in the successor store is responsible for a fixed
number of revisions (currently 1000).

max-files-per-dir tells us how many files can be in a single directory.
If more than max-files-per-dir files exist in a given directory
we open a new directory and store files there instead.

So I would expect sharding within the successors tree
to behave like this:

 filename:                  file stores successor data created in:
 db/successors/ids/0/0      r0..r999
 db/successors/ids/0/1      r1000..r1999
 ...                        ...
 db/successors/ids/0/999    r999000..r999999
 db/successors/ids/1/0      r1000000..r1000999
 ...                        ...

Data for the first million revs goes into the first shard,
data for the second million revs goes into the second shard, etc.

Is this what you've implemented?

I probably would have used FSFS_SUCCESSORS_FILES_PER_SHARD
instead of FSFS_SUCCESSORS_REVISIONS_PER_SHARD, and then
computed the filename based on that number. I don't like
thinking of it in terms of "revisions per shard" because
the numbers get so big :)
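
To spell out the arithmetic I have in mind, here is a rough sketch.
It is not the code on the branch. REVS_PER_FILE, FILES_PER_SHARD and
all function names are made up for illustration, and I am assuming
max-files-per-dir is 1000, so that one shard covers a million revs.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical constants, not taken from fs_fs.c. */
#define REVS_PER_FILE   1000L  /* revisions covered by one successor file */
#define FILES_PER_SHARD 1000L  /* assumed value of max-files-per-dir */

/* Index of the file that holds REV, counted across all shards. */
static long
file_index(long rev)
{
  assert(rev >= 0);
  return rev / REVS_PER_FILE;
}

/* Shard (directory) number for REV. */
static long
shard_for_rev(long rev)
{
  return file_index(rev) / FILES_PER_SHARD;
}

/* Name of the file for REV within its shard. */
static long
file_in_shard(long rev)
{
  return file_index(rev) % FILES_PER_SHARD;
}

/* Write the path of the successor-ids file for REV into BUF.
   Returns what snprintf() returns. */
static int
successor_ids_path(char *buf, size_t len, long rev)
{
  return snprintf(buf, len, "db/successors/ids/%ld/%ld",
                  shard_for_rev(rev), file_in_shard(rev));
}
```

With these numbers r1999 lands in db/successors/ids/0/1 and r1000000
lands in db/successors/ids/1/0. Only the two small constants appear
in the code; "revisions per shard" is just their product.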
