Re: unionfs Documentation

olafBuddenhagen Thu, 16 Jul 2009 21:22:34 -0700

Hi,

On Sat, Jun 13, 2009 at 09:23:23AM +0300, Sergiu Ivanov wrote:

> I'm sending in my attempt to compile a unionfs documentation. It is
> formatted as a stand-alone Texinfo file for now, so that I am able to
> build and view .info files from it.

I don't understand -- why can't you just build it as part of the Hurd
manual?

> However, according to http://preview.tinyurl.com/lfy436, I will be
> able to add this document to the Hurd documentation using one or two
> @lowersections.

Considering all the headers that simply do not make sense when included
within the Hurd manual, I don't think this is really an option.

Also, even if it worked, it would be ugly IMHO. Just do it as I asked
you from the beginning: put it in the normal Hurd manual, where the
"shadowfs" placeholder used to be.

> Also note that I'm not sending a git patch, since it makes little
> sense to me: it would have been the same document but with some patch
> headers and +'s added to each line...

This doesn't really matter in this case; but in general, even if it's
only adding new files, a patch is better: much easier to apply; has all
the author information etc. right.

> \input texinfo
> 
> @c %**start of header
> @setfilename unionfs.info
> @settitle GNU/Hurd unionfs Documentation
> @c %**end of header
> 
> @copying
> Copyright @copyright{} 2009 Free Software Foundation, Inc.
> 
> @command{unionfs} is free software; you can redistribute it and/or
> modify it under the terms of the GNU General Public License as
> published by the Free Software Foundation, version 2.
> 
> @command{unionfs} is distributed in the hope that it will be useful,
> but WITHOUT ANY WARRANTY; without even the implied warranty of
> MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> General Public License for more details.

I don't think it's right to include the copyright notice of the
*program* in the manual?...

Of course, once it is a chapter in the Hurd manual, the question doesn't
arise anymore :-)

> 
> You should have received a copy of the GNU General Public License
> along with GNU make; see the file COPYING.

"make"?... :-)

> If not, write to the Free
> Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA
> 02110-1301, USA.
> @end copying
> 
> @titlepage
> @title GNU/Hurd unionfs Documentation
> 
> @page
> @vskip 0pt plus 1filll
> @insertcopying
> @end titlepage
> 
> @contents
> 
> @ifnottex
> @node Top
> @top GNU/Hurd unionfs Documentation
> 
> A short GNU/Hurd unionfs translator documentation.
> @end ifnottex
> 
> @menu
> * Introduction::
> * Command Line Interface::
> * Stowing::
> * Basic Internals::
> * Caveat::
> 
> * Index::
> @end menu
> 
> @node Introduction
> @chapter Introduction
> @cindex introduction, example
> 
> The @command{unionfs} translator is a GNU/Hurd implementation of the
> union mounting functionality,

Not sure "union mounting" is a good term for "normal" unionfs?...

> which basically consists in merging the
> contents of several file systems together and mounting the result on a
> single node.

The unioned directories do *not* need to be distinct file systems...

> An implementation of this functionality exists on
> platforms other than GNU/Hurd and is known under the name of
> @samp{UnionFS}.
> 
> One of the best known use cases for union mounting is met in LiveCDs,
> where it is often necessary to combine a read-only file system,
> residing on the CD, with a read/write RAM disk file system.
> 
> It may happen that the merged file systems will contain files with the
> same path. @command{unionfs} uses the priority approach to solve such
> conflicts. @xref{Resolving Conflicts}.
> 
> 
> To join @file{foo/}, @file{bar/} and @file{baz/} under @file{quux/} do
> the following:
> @example
> 
> @code{settrans -a quux/ unionfs foo/ bar/ baz/}
> 
> @end example
> 
> If one would also like to include the directory tree under
> @file{quux/} in the list of merged file systems, use @code{unionfs
> -u}.

Is it really useful and/or customary to provide specific examples of
command lines and options in the introduction?... Seems to me that it
would be better only to explain the general possibilities here.

> 
> File systems can be added at run-time using
> @command{fsysopts}. @xref{Run-time Options}.
> 
> @node Command Line Interface
> @chapter Command Line Interface
> @cindex man, manual, command line, interface
> 
> @section Synopsis
> @cindex synopsis, usage
> 
> @example
> @code{unionfs [ options ] --add @var{file systems} --remove @var{file
> systems}}
> @end example

I don't think this is a good synopsis: it looks as if --add and --remove
were mandatory...

I think it would be more correct to just use: "unionfs [options]
directories ..."

Or, if you want to be more verbose, perhaps something like: "unionfs
[global options] ( [directory-specific options] directories ) ..."

(I'm not sure about the exact syntax used in info manuals.)

> 
> @noindent
> Where @samp{options} may be any option (or options) enumerated in the
> section below. @samp{file systems} is a list of paths to directory
> trees (file systems) to be merged.
> 
> @noindent
> @strong{Note}: Since @command{unionfs} is a translator, the node to
> mount the merged file systems on should be specified as an argument to
> @command{settrans}.

I don't think this is really a useful note -- using translators is
explained elsewhere in the manual.

> 
> @section Options
> @cindex option, start-up, run-time
> 
> @subsection Start-up Options
> 
> @table @option
> @item -c @var{size}
> @itemx --cache-si...@var{size}
> Specify the maximal number of nodes in the node cache.
> 
> @item -u
> @itemx --underlying
> Add the underlying file system to the list of union mounted file
> systems.
> 
> @item -w
> @itemx --writable
> Specify the following file system as writable. This makes it possible
> to create new nodes in the specified file system.

Are these two really only possible on startup? Seems like a major
limitation, which should be fixed...

Also, I wonder whether it is really useful to have extra sections for
startup and runtime options? I tend to think it would be better to just
flag the startup-only ones in the descriptions... (Except for those
where it is obvious, like --help, where it's not necessary to mention it
explicitly at all.)

> @end table
> 
> @subsection Start-up and Run-time Options
> @anchor{Run-time Options}
> @table @option
> @item -a
> @itemx --add
> Add the following file system. This option is on by default, i.e. all
> file systems for which neither @option{--add} nor @option{--remove}
> was specified are added to the list of merged file systems.
> 
> @item -r
> @itemx --remove
> Remove the following file system from the list of merged file systems.

Only one? My understanding is that *all* following ones will be
added/removed...
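
To make clear what I mean, here is a rough sketch in Python (only for
brevity; this is not the actual unionfs option parser, and the function
name is made up). It assumes that --add and --remove switch a mode that
stays in effect for all following directories until the other option
shows up:

```python
def split_add_remove(args):
    """Return (added, removed) directories from a unionfs-style argument list.

    Assumption: -a/--add and -r/--remove are "sticky" and apply to every
    directory that follows, not just to the next one.
    """
    mode = "add"  # --add is described as the default
    added, removed = [], []
    for arg in args:
        if arg in ("-a", "--add"):
            mode = "add"
        elif arg in ("-r", "--remove"):
            mode = "remove"
        elif mode == "add":
            added.append(arg)
        else:
            removed.append(arg)
    return added, removed


print(split_add_remove(["foo/", "--remove", "bar/", "baz/", "--add", "quux/"]))
```

If that is indeed how it works, the descriptions should say "the
following file systems" rather than "the following file system".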

> 
> @item -p @var{value}
> @itemx --priori...@var{value}
> Set the priority for the following file system to @var{value}.
> 
> @item -s @var{stowdir}
> @itemx --st...@var{stowdir}
> Use the given directory as a stow. @xref{Stowing}.

"stow" is a verb :-)

> 
> @item -m @var{pattern}
> @itemx --match @var{pattern}
> Add only the nodes of the stow directory which match
> @var{pattern}. The pattern must be a valid shell wildcard pattern
> (suitable for @samp{<fnmatch.h>} functions).

Is this really only for --stow? The description from --help doesn't seem
to imply that...

Generally, I think some of the --help descriptions are actually
clearer... I guess you should just copy them, unless you have a
description that is actually more verbose than the --help one.

(If you think that some of them can just be described better than what
is presently in --help, without being more verbose, this should be
patched already in the --help output; not only in the manual...)

> @end table
> 
> @subsection Self-documenting Options
> @table @option
> @item -?
> @itemx --help
> Print the list of options with short descriptions.
> 
> @item --usage
> Print a short usage message.
> 
> @item -V
> @itemx --version
> Print program version.
> @end table
> 
> @node Stowing
> @chapter Stowing
> @cindex stow
> @command{unionfs} can watch for changes in a directory specified as
> stowing directory and automatically adjust the list of merged
> directory trees to the list of subdirectories of the stowing
> directory.
> 
> To use stow do the following:
> 
> @example
> @code{settrans -a foo/ unionfs --stow=/stow}
> @end example
> 
> Now, when a new directory appears under @file{/stow/},
> @command{unionfs} will automatically add the contents of this new
> directory to the list of merged file systems. Corresponding updates
> are done when a directory is removed from under @file{/stow/}.
> 
> To control which nodes from under @file{/stow/} make their way into
> the list of merged file systems, one can use the @option{--match}
> option. The pattern specified via this option must be a valid shell
> wildcard. The following is a typical example of using a
> pattern-controlled stow:
> 
> @example
> @code{settrans -a foo/ unionfs -m bar --stow=/stow}
> @end example

Probably better to use a real-world example, like "-m bin" -- this is
the kind of stuff stowfs is really meant for...

> 
> Note that this syntax actually means that all file system nodes
> matching @file{/stow/*/bar} will be included in the merged file
> system.

I don't understand that part.

> 
> One can specify multiple patterns, which will be combined by logical
> @samp{OR} operation. In other words, the following two commands are
> equivalent:
> 
> @example
> @code{settrans -a foo/ unionfs -m bin -m sbin --stow=/stow}
> @code{settrans -a foo/ unionfs -m [s]bin --stow=/stow}
> @end example
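
I'm not sure these two are really equivalent. In a shell wildcard,
"[s]" matches exactly one "s"; it does not make the "s" optional. A
quick check, using Python's fnmatch module as a stand-in for the
<fnmatch.h> functions (the bracket semantics are the same), and
assuming the pattern is applied to the plain directory name:

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """True if name matches at least one pattern (logical OR of patterns)."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


for name in ("bin", "sbin", "lib"):
    two_patterns = matches_any(name, ["bin", "sbin"])  # -m bin -m sbin
    one_pattern = matches_any(name, ["[s]bin"])        # -m [s]bin
    print(name, two_patterns, one_pattern)
```

With these assumptions, "bin" is matched by the first form but not by
the second one, so the example probably needs a different pattern.
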
> 
> @node Basic Internals
> @chapter Basic Internals
> @cindex internals, node, light node, conflict
> 
> In this chapter a short description of how @command{unionfs} works
> will be done. This description is intended for people who have at
> least a vague idea about GNU/Hurd translator programming.
> 
> Note that what follows is an overview of the basic functionality.

I don't think the introduction paragraphs are necessary. And drop the
"basic".

> 
> @command{unionfs} is a @samp{libnetfs}-based translator. At the base
> of the file system published by this translator lies the concept of a
> @samp{libnetfs} node. Like many other @samp{libnetfs}-based
> translators,

Probably better to actually list the other examples.

> @command{unionfs} does not maintain a node for every file
> that appears in the merged file system it publishes, because it would
> present the disadvantage of consuming more space and (what is more
> important) will put @command{unionfs} in the middle of every I/O
> request coming from the client to the underlying file system. Instead,
> nodes are maintained only for directories and when non-directory
> entries are looked up, @command{unionfs} gives off a port to the
> @emph{real} entry (i.e., not to a @samp{libnetfs} node).
> 
> Since @samp{libnetfs} does not impose the necessity of keeping the
> nodes in a truly hierarchical structure and since a @samp{libnetfs}
> node carries along with it some bits of technical information,
> @command{unionfs} introduces the concept of a @dfn{light node}. A
> light node is a custom structure (by no means connected to
> @samp{libnetfs}) which contains the information about a file system
> entry and some links to other light nodes. These links are meant to
> organize the light nodes in a hierarchical structure, corresponding to
> the virtual file system exposed by @command{unionfs} translator.

Is this really specific to unionfs? My impression is that this is
essentially a general description of the libnetfs "node->nn" concept,
with only a few specific bits mixed in...

> 
> The light nodes are created in a lazy fashion, i.e. when the first
> necessity arises. Immediately after start-up there is only one light
> node, corresponding to the root node of the merged file system. In
> further look-ups the tree of light nodes is populated.
> 
> When asked to look up a file, @command{unionfs} creates a light node
> for each directory in the path to the file. Correspondingly, a normal
> @samp{libnetfs} node is created for each light node. Each
> @samp{libnetfs} node is stored in a node cache, which is actually a
> list of references to nodes. Once a node stored in the node cache is
> accessed, it is pushed to the @acronym{MRU, Most Recently Used} end of
> the list. When, after a series of additions of new nodes, the node
> cache becomes full, the references to the nodes at the @acronym{LRU,
> Least Recently Used} end are dropped. When there are no references
> pointing to a node, it is destroyed. The same thing concerns light
> nodes: when the reference count stored in a light node gets to zero,
> the light node is destroyed.

But the light nodes are not managed in an LRU cache, right? So how can
references go away?...
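
For reference, this is how I read the description of the node cache, as
a small Python model (class and method names are made up; the real code
is C and surely looks different). It only covers the libnetfs nodes:

```python
from collections import OrderedDict


class NodeCache:
    """Toy model of the node cache: a bounded list of node references."""

    def __init__(self, size):
        self.size = size            # maximal number of cached nodes
        self.nodes = OrderedDict()  # LRU end first, MRU end last

    def access(self, name):
        """Reference a node; return names whose cache reference was dropped."""
        if name in self.nodes:
            self.nodes.move_to_end(name)  # push to the MRU end
        else:
            self.nodes[name] = True       # new nodes enter at the MRU end
        dropped = []
        while len(self.nodes) > self.size:
            oldest, _ = self.nodes.popitem(last=False)  # drop from LRU end
            dropped.append(oldest)
        return dropped


cache = NodeCache(2)
for name in ("a", "b", "a", "c"):
    print(name, cache.access(name))
```

Nothing in this model touches the light nodes, which is exactly my
question: what holds the references to them, and what drops them?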

> 
> @anchor{Resolving Conflicts} @command{unionfs} associates a list of
> ports to the underlying file system(s) with each @samp{libnetfs}
> node. When look-ups are requested under this node, @command{unionfs}
> iterates the list and attempts to locate the required file in one of
> the merged file systems. The client gets the port to the file (in case
> it is a regular file) found in the file system that comes first in the
> list. File systems are installed into the list according to priorities
> specified at their addition.
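
A small example might help here. Something like the following Python
sketch is what I understand from the paragraph (names are made up, and
I'm assuming that a larger priority value means "searched earlier",
which the text doesn't actually say):

```python
def lookup(name, branches):
    """Return the directory that provides name, or None if none does.

    branches is a list of (priority, directory, entries) tuples.
    Assumption: larger priority values are searched first; directories
    with equal priority keep the order in which they were added.
    """
    ordered = sorted(branches, key=lambda branch: branch[0], reverse=True)
    for _priority, directory, entries in ordered:
        if name in entries:
            return directory
    return None


branches = [
    (0, "foo/", {"a", "b"}),
    (1, "bar/", {"b"}),
]
print(lookup("a", branches), lookup("b", branches), lookup("c", branches))
```

Whichever direction the priorities actually go should be stated
explicitly in the description of --priority.
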
> 
> @node Caveat
> @chapter Caveat
> @cindex caveat, warning, problem, permission
> While using @command{unionfs}, you could experience some permission
> errors or difficult or impossible file or directory deletion. The
> following is a brief list of things that might happen.
> 
> @section Warnings
> @itemize @bullet
> @item
> If the translator is run by an unprivileged user, other users will
> fail to create files or directories, since the translator won't be
> able to change the ownership of the file.
> @end itemize
> 
> @section Problems
> @itemize @bullet
> @item
> If there is a name conflict in underlying file systems between
> directories and files -- say that @file{foo} is a directory in
> underlying file system @file{a} while it is a file in the underlying
> file system @file{b} -- then unionfs will be unable to delete this
> entry. This is a structural bug (there's no clean way to solve it),
> and should be fixed.

There is no way to solve it, but it should be fixed?...

> 
> @item
> If there's a name conflict in underlying file systems between
> directories (or between files), and the user has no permission to
> delete all the entries -- e.g. one hidden entry is read-only -- then
> he will get an @samp{EPERM} even if permissions seem ok. This is a
> structural bug (there's no clean way to solve it), and should be
> fixed.

I think both of these problems are actually a result of handling file
deletion fundamentally wrong; or perhaps even handling writable entries
wrong in general. It's probably never right to write to more than one
directory at a time.

This is a non-trivial problem. Other unionfs implementations probably
spent considerable time figuring out how to do it best. I entreated
Gianluca to check what other implementations do, instead of trying to
reinvent the wheel -- but he wouldn't listen :-(

> @end itemize
> 
> @node Index
> @unnumbered Index
> 
> @printindex cp
> 
> @bye
> 

In general, there are various places where I believe I could improve the
wording -- but instead of explaining each separately, I think it's
easier if I just post a followup patch, or a revised version of your
patch, once all issues with the actual content are sorted out...

-antrik-
