FS Node API

Stefan Fuhrmann Fri, 08 Jan 2016 16:48:48 -0800

Hi Ivan & interested parties,

Excellent of you starting with the API work!  For all
who don't know yet:  Ivan and I talked about this idea
in Berlin and agreed that it sounds like a good idea.
I had played with some prototypic implementation FSX
and learned a bit about the its pitfalls.


Below there is a list of questions that I had when
reviewing the first few commits.  There are also sections
on what I have in mind and partly in code for the new API.

-- Stefan^2.


Questions / Feedback to fs-node-api as of r1723819
--------------------------------------------------

[Not commenting on the implementation since that depends
 on points in the other sections.]

* BRANCH-README: "The branch is NOT going to be merged to
  trunk; I'm going to re-implement everything in trunk ..."

  - Why implementing it a second time (assuming the new
    API turns out to be useful)?
  - Do you expect the code to be "too prototypic" to be
    used for production?  If so, any specific indication
    for that?
  - I think having the feature developed on a branch makes
    sense because the switch-over to the new API will be
    very noisy.  Otherwise, it could be done on /trunk
    directly - and be reverted again if necessary.

* svn_fs.h

  - Maybe, move the new API functions to new section.
    They will not always be 1:1 replacements.

* "svn_fs_node_t does not depends on svn_fs_root object created from"

  - I'm fine with that.  Any specific reason behind this
    other than that node_t should be be self-contained
    within fs_t and result_pool?

* "svn_fs_node_t always reference existing node in tree (?)"

  - Within a transaction, nodes of type svn_node_none
    make sense, e.g. after a deletion.  Footnote: in-txn
    nodes do have a number of update / invalidation issues,
    which makes having separate vtables instances for
    in-txn and in-rev nodes a good idea.
  - I'm not too sure we should allow creating nodes for
    any random path but I guess we must at least have
    the option to do so b/c of symmetry.

* svn_fs_open_node

  - Could we name this svn_fs_node or svn_fs_get_node?
    "open" suggests that there are resources involved
    which require an implicit and sometimes explicit
    "close".

* svn_fs_file_length

  - I was thinking of naming that svn_fs_node_length.
    It could error out on non-files as it does today
    or return some informative value like the representation
    size.  Caller logic needs to handle the "not a file"
    case anyway, so it could use the new info on its own
    discretion.

* svn_fs_dir_entries2

  - Again, this could simply be svn_fs_sub_nodes.
  - The result should be an array of node_t *, sorted
    by their names.  This is what the storage layer
    delivers in FS* and it is what the repos layer
    would like to scan during reporting.


Additional Goals
----------------

* Use 2-pool paradigm.

* Where appropriate, make the API more orthogonal and
  consistent (e.g. svn_fs_node_make instead of
  svn_fs_node_make_file and svn_fs_node_make_dir).


Observations in my Experiments
------------------------------

* Despite calling it "node API" it actually concerned with
  efficiently navigating directories.  So, always represent
  the node using 2 structs: the actual node plus a (usually
  shared) struct containing all the context info - representing
  the directory.

* The parameters for svn_fs_node_make, svn_fs_node_copy etc.
  should take pure paths to describe the target instead of
  parent-node + local-name.

* Some function API function names clash with existing ones,
  i.e. need to be revved (svn_fs_node_history3,
  svn_fs_node_relation2 and everything prop).

* Txn nodes and revision nodes should use 2 different
  instances of the vtable.  Error handling for operations
  that either apply to txn nodes or revision nodes only
  can then be moved to lib_fs by checking for NULL pointers
  in the vtable.


Proposal for an Implementation Strategy
---------------------------------------

* node_compat.*
  - Provide a default / fallback implementation in lib_fs
    for backends that don't have native node API support.
  - Blueprint for backend code, i.e. ensure we can have
    a clean structure
  - Implement API (roughly) one function at a time.

* Ensure semantic equivalence to old API and high test
  coverage from the start
  - Implement node_compat in terms of the old backend API
  - Implement the old FS API in terms of the new API,
    i.e. route all calls through node_compat.*
  - Have #define that controls whether the old API is run
    natively or gets diverted through node_compat.*
  - If the backend vtable entry is NULL for an old API
    function, divert it through node_compat.*

* Switch API users over to the new API
  - This is what makes the branch worthwhile.
  - Deprecate the old API only after migrating most callers

* Implement the node API in FSFS
  - Take node_compat and only replace the vtable calls
    with direct function calls.
  - Add direct addressing info (noderev IDs) to the structs.
  - Switch functions over to "native" code.
  - NULL the old API vtable entries to enforce the new
    backend logic.

* Final review and merger into /trunk
  - Report processing as in "diff --summary" is a good
    performance indicator.

FS Node API

Reply via email to