walk, find, locate and friends try to cope with exploring filesystem
metadata in breadth and depth effectively, efficiently and with
controlled time/space consumption.
Proposal: walkfs(4) (or finds, or indexfs, or sphinx, or ...)
walkfs serves a filesystem tree similar to that of the network devices.
A 'clone' file is used to create a new connection to the metadata
database. Each connection subdirectory contains the following files
(and maybe more): root, data, ctl, size, count, stat, metadata.
Writing a string "subdir" to the 'root' file (re)starts a walk through
the designated subdirectory. Each read from the 'data' file returns the
next path found. EOF indicates that the walk has terminated.
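
A minimal client-side sketch of the root/data protocol described above,
in Python. It assumes a connection directory has already been obtained
(for instance via the 'clone' file) and that paths read from 'data' are
newline-terminated; neither the delimiter nor the helper name walk_paths
is part of the proposal.

```python
import os

def walk_paths(conn_dir, subdir):
    """Yield paths from a (hypothetical) walkfs connection directory."""
    # Writing the subdirectory name to 'root' (re)starts the walk.
    with open(os.path.join(conn_dir, "root"), "w") as root:
        root.write(subdir)

    fd = os.open(os.path.join(conn_dir, "data"), os.O_RDONLY)
    try:
        pending = b""
        while True:
            chunk = os.read(fd, 8192)
            if not chunk:          # EOF: the walk has terminated
                break
            pending += chunk
            # Assumption: one path per line; keep any partial tail.
            *lines, pending = pending.split(b"\n")
            for line in lines:
                if line:
                    yield line.decode()
        if pending:
            yield pending.decode()
    finally:
        os.close(fd)
```

With such a helper, a find-like tool reduces to a loop over
walk_paths(conn_dir, "some/subdir") that prints each path.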
Query constraints and walking directives can be written to the 'ctl'
file as attribute=value pairs. Example:
"user=glenda traversal=depth depth=3 type=file mode=u=r".
The ctl message "mode=sync" indicates that the walker thread writing to
the data file and the reading process synchronize on each found path.
The message "mode=async" allows the walker thread to run ahead of the
reader. With the message "mode=stat" no found path is written to the
data file; only size and count are updated (see below).
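
The three modes can be modelled with a queue between a walker thread
and a reader. This is only an illustration of the intended semantics,
not a proposed implementation: the function names, the (path, size)
input list and the totals dictionary are made up for the example.

```python
import queue
import threading

def walker(entries, q, mode, totals):
    """Walker thread: entries is a list of (path, size) pairs."""
    for path, size in entries:
        totals["count"] += 1
        totals["size"] += size
        if mode == "stat":
            continue               # mode=stat: update totals only
        q.put(path)
        if mode == "sync":
            q.join()               # wait until the reader took this path
        # mode=async: fall through and run ahead of the reader
    q.put(None)                    # sentinel standing in for EOF

def run_walk(entries, mode):
    """Run a walk in the given mode; return (paths read, totals)."""
    q = queue.Queue()
    totals = {"count": 0, "size": 0}
    t = threading.Thread(target=walker, args=(entries, q, mode, totals))
    t.start()
    found = []
    while True:
        path = q.get()
        if path is not None:
            found.append(path)
        q.task_done()              # releases a walker blocked in q.join()
        if path is None:
            break
    t.join()
    return found, totals
```

In sync mode the walker never gets more than one path ahead; in async
mode the queue could be bounded by the cache size given with -c.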
The stat file indicates the current status of the traversal. If it is
"eof", the count file holds the number of found files and the size file
holds the total size in bytes of all found files. During traversal
(stat is "walking") the size and count files hold the running totals.
The metadata file holds the metadata corresponding to the current path
in 'data' in "attribute=value" format. If walkfs has been started with
a cache option, writing a path inside root to data sets the metadata
file to the values for that path.
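
The "attribute=value" format, used for the metadata file and for ctl
messages alike, is easy to parse as long as each pair is split on its
first '=' only, so that a value such as "u=r" survives. A small sketch,
assuming whitespace-separated pairs and no quoting (the proposal does
not specify either); parse_pairs is a made-up name.

```python
def parse_pairs(text):
    """Parse whitespace-separated attribute=value pairs into a dict."""
    pairs = {}
    for field in text.split():
        # Split on the first '=' only: "mode=u=r" -> ("mode", "u=r").
        key, sep, value = field.partition("=")
        if not sep or not key:
            raise ValueError("not an attribute=value pair: %r" % field)
        pairs[key] = value
    return pairs
```

Applied to the ctl example given earlier, this yields the keys user,
traversal, depth, type and mode, with mode mapped to "u=r".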
walkfs options:
  -c size     size of the memory buffer or cache to be used by walkfs
  -p path     filename of a persistent cache/metadata database. When
              walkfs is started with -p, information may be outdated;
              ctl messages are used to update the metadata database
  -s channel  a fileserver can write update messages to channel when
              changing metadata of the filesystem it serves. walkfs
              then updates its metadata database.
One usage scenario of walkfs is to implement find, du, walk, rdup and
the like.
Another usage scenario of walkfs, with the -s option, is to add file
indexing to a fileserver.
Regards,
Jorge-León