After the fiasco of confusing struct revision with struct commit, I've worked out something that makes more sense. I've actually ported fsck-cache, rev-tree, and my merge-base to it, so it should at least be comprehensive enough to cover what those three need.
The design is as follows:

There is a struct object for each object in the database, although each one is only created on demand. It contains the type and sha1 of the object, a flag for whether the object's contents have been read, more flags for general use, a list of the objects it references, and a flag for whether any object references it.

Each struct object is embedded in a type-specific struct, which contains further information. For example, struct commit has the date, the parents, and the tree.

Parsing is progressive: objects are created in an unread state (with no disk access), and functions can be called to parse each object once it is determined to be interesting. This should generally allow only the necessary portions of a large set of object references to be read.

Any comments on the design, or should I send my implementation?

-Daniel
*This .sig left intentionally blank*
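
For illustration only, the layout described above could be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the implementation referred to in this message: every field and function name here (lookup_object, lookup_commit, parse_commit, add_ref) is hypothetical, and struct raw_commit is an invented stand-in for the bytes that a real parser would read from the object database.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum object_type { OBJ_BLOB, OBJ_TREE, OBJ_COMMIT };

struct object;
struct object_list {
    struct object *item;
    struct object_list *next;
};

/* Generic part, one per database object, created on demand. */
struct object {
    enum object_type type;
    unsigned char sha1[20];
    unsigned parsed : 1;       /* contents have been read */
    unsigned used : 1;         /* some other object references this one */
    unsigned int flags;        /* general-purpose, caller-defined */
    struct object_list *refs;  /* objects this one references */
};

/* Type-specific struct; the generic part comes first so that a
 * struct commit * can be converted to a struct object * and back. */
struct commit {
    struct object object;
    unsigned long date;
    struct object_list *parents;
    struct object *tree;
};

/* Hypothetical stand-in for data read from disk (assumption). */
struct raw_commit {
    unsigned long date;
    unsigned char tree[20];
    int nr_parents;
    unsigned char parents[2][20];
};

static struct object_list *all_objects;

static struct object_list *list_prepend(struct object *item,
                                        struct object_list *next)
{
    struct object_list *n = malloc(sizeof(*n));
    if (!n)
        abort();
    n->item = item;
    n->next = next;
    return n;
}

/* Find or create an object in the unread state; no disk access. */
static struct object *lookup_object(const unsigned char *sha1,
                                    enum object_type type, size_t size)
{
    struct object_list *p;
    struct object *obj;

    assert(size >= sizeof(struct object));
    for (p = all_objects; p; p = p->next)
        if (!memcmp(p->item->sha1, sha1, 20))
            return p->item->type == type ? p->item : NULL;
    obj = calloc(1, size);
    if (!obj)
        abort();
    obj->type = type;
    memcpy(obj->sha1, sha1, 20);
    all_objects = list_prepend(obj, all_objects);
    return obj;
}

struct commit *lookup_commit(const unsigned char *sha1)
{
    return (struct commit *)lookup_object(sha1, OBJ_COMMIT,
                                          sizeof(struct commit));
}

/* Record that 'from' references 'to', and mark 'to' as referenced. */
static void add_ref(struct object *from, struct object *to)
{
    from->refs = list_prepend(to, from->refs);
    to->used = 1;
}

/* Returns 1 if parsed now, 0 if already parsed, -1 on a type clash.
 * Parents and the tree are only looked up, so they stay unread. */
int parse_commit(struct commit *c, const struct raw_commit *raw)
{
    struct object *tree;
    int i;

    if (c->object.parsed)
        return 0;
    tree = lookup_object(raw->tree, OBJ_TREE, sizeof(struct object));
    if (!tree)
        return -1;
    c->object.parsed = 1;
    c->date = raw->date;
    c->tree = tree;
    add_ref(&c->object, tree);
    for (i = 0; i < raw->nr_parents; i++) {
        struct commit *p = lookup_commit(raw->parents[i]);
        if (!p)
            return -1;
        c->parents = list_prepend(&p->object, c->parents);
        add_ref(&c->object, &p->object);
    }
    return 1;
}
```

In this sketch, looking up a commit twice by the same sha1 yields the same struct, and parsing one commit creates its parents and tree without parsing them. A caller walking history would then parse only those parents it decides are interesting, which is the progressive behavior the design aims for.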