Hi Ivan & interested parties, Excellent of you starting with the API work! For all who don't know yet: Ivan and I talked about this idea in Berlin and agreed that it sounds like a good idea. I had played with some prototypic implementation FSX and learned a bit about the its pitfalls.
Below there is a list of questions that I had when reviewing the first few commits. There are also sections on what I have in mind and partly in code for the new API. -- Stefan^2. Questions / Feedback to fs-node-api as of r1723819 -------------------------------------------------- [Not commenting on the implementation since that depends on points in the other sections.] * BRANCH-README: "The branch is NOT going to be merged to trunk; I'm going to re-implement everything in trunk ..." - Why implementing it a second time (assuming the new API turns out to be useful)? - Do you expect the code to be "too prototypic" to be used for production? If so, any specific indication for that? - I think having the feature developed on a branch makes sense because the switch-over to the new API will be very noisy. Otherwise, it could be done on /trunk directly - and be reverted again if necessary. * svn_fs.h - Maybe, move the new API functions to new section. They will not always be 1:1 replacements. * "svn_fs_node_t does not depends on svn_fs_root object created from" - I'm fine with that. Any specific reason behind this other than that node_t should be be self-contained within fs_t and result_pool? * "svn_fs_node_t always reference existing node in tree (?)" - Within a transaction, nodes of type svn_node_none make sense, e.g. after a deletion. Footnote: in-txn nodes do have a number of update / invalidation issues, which makes having separate vtables instances for in-txn and in-rev nodes a good idea. - I'm not too sure we should allow creating nodes for any random path but I guess we must at least have the option to do so b/c of symmetry. * svn_fs_open_node - Could we name this svn_fs_node or svn_fs_get_node? "open" suggests that there are resources involved which require an implicit and sometimes explicit "close". * svn_fs_file_length - I was thinking of naming that svn_fs_node_length. It could error out on non-files as it does today or return some informative value like the representation size. Caller logic needs to handle the "not a file" case anyway, so it could use the new info on its own discretion. * svn_fs_dir_entries2 - Again, this could simply be svn_fs_sub_nodes. - The result should be an array of node_t *, sorted by their names. This is what the storage layer delivers in FS* and it is what the repos layer would like to scan during reporting. Additional Goals ---------------- * Use 2-pool paradigm. * Where appropriate, make the API more orthogonal and consistent (e.g. svn_fs_node_make instead of svn_fs_node_make_file and svn_fs_node_make_dir). Observations in my Experiments ------------------------------ * Despite calling it "node API" it actually concerned with efficiently navigating directories. So, always represent the node using 2 structs: the actual node plus a (usually shared) struct containing all the context info - representing the directory. * The parameters for svn_fs_node_make, svn_fs_node_copy etc. should take pure paths to describe the target instead of parent-node + local-name. * Some function API function names clash with existing ones, i.e. need to be revved (svn_fs_node_history3, svn_fs_node_relation2 and everything prop). * Txn nodes and revision nodes should use 2 different instances of the vtable. Error handling for operations that either apply to txn nodes or revision nodes only can then be moved to lib_fs by checking for NULL pointers in the vtable. Proposal for an Implementation Strategy --------------------------------------- * node_compat.* - Provide a default / fallback implementation in lib_fs for backends that don't have native node API support. - Blueprint for backend code, i.e. ensure we can have a clean structure - Implement API (roughly) one function at a time. * Ensure semantic equivalence to old API and high test coverage from the start - Implement node_compat in terms of the old backend API - Implement the old FS API in terms of the new API, i.e. route all calls through node_compat.* - Have #define that controls whether the old API is run natively or gets diverted through node_compat.* - If the backend vtable entry is NULL for an old API function, divert it through node_compat.* * Switch API users over to the new API - This is what makes the branch worthwhile. - Deprecate the old API only after migrating most callers * Implement the node API in FSFS - Take node_compat and only replace the vtable calls with direct function calls. - Add direct addressing info (noderev IDs) to the structs. - Switch functions over to "native" code. - NULL the old API vtable entries to enforce the new backend logic. * Final review and merger into /trunk - Report processing as in "diff --summary" is a good performance indicator.