Re: anyone uses ReiserFS today

Nicolas George Thu, 15 Aug 2024 04:14:13 -0700

Michael Stone (12024-08-14):
> The short answer is that the reason it handles small files well is because
> Reiser wanted the filesystem to be used for direct storage of small objects,
> whereas most applications dealing with small objects combine them into a
> larger object which is what is stored in the filesystem. E.g., a database
> like sqlite stores records in a large file which the database software
> manages internally rather than storing each record as a separate file. If
> the database wanted to take advantage of this paradigm and store small
> records in individual files, it would exhibit ridiculously poor performance
> on every other filesystem and OS, and writing a database only for reiserfs
> seemed overly limiting. Remember reiserfs was always a research project, and
> never quite done; reiser4 pushed these concepts further (e.g., added various
> atomic
> transaction modes) but never got merged.


Except the original plan did not hold water, even at the time.

The obstacle to using the file system instead of a more advanced format
is not just the inefficiency of the storage.

First, there are system calls, and they are expensive. Reading a file
takes at least three system calls: open, read, close. And that is
assuming you already have enough memory and the file is small enough to
be read in a single call.

With one record per file, you need three system calls per record. With
multiple records per file, you can read thousands of records with the
same number of system calls. Or use mmap and have all the records
available without system calls — but with page faults.
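
To put numbers on that, here is a rough Python sketch, assuming a
POSIX-like system. The function names and the packed layout (a 4-byte
length prefix before each record) are invented for illustration, and the
code only counts the calls it issues itself; it is not a benchmark.

```python
import os
import struct
import tempfile


def read_one_per_file(directory, names):
    """Read each record from its own file; count the calls we issue."""
    calls = 0
    records = []
    for name in names:
        fd = os.open(os.path.join(directory, name), os.O_RDONLY)  # open
        records.append(os.read(fd, 4096))                         # read
        os.close(fd)                                              # close
        calls += 3
    return records, calls


def read_packed(path, max_size=1 << 20):
    """Read every record from one packed file with a single read.

    Assumes the whole file is smaller than max_size and comes back in
    one read, the same simplification as in the text.
    """
    fd = os.open(path, os.O_RDONLY)
    blob = os.read(fd, max_size)
    os.close(fd)
    records = []
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        records.append(blob[offset:offset + length])
        offset += length
    return records, 3


def demo(n=1000):
    """Store n small records both ways and compare the call counts."""
    payloads = [f"record {i}".encode() for i in range(n)]
    with tempfile.TemporaryDirectory() as tmp:
        names = []
        for i, payload in enumerate(payloads):
            name = f"{i}.rec"
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(payload)
            names.append(name)
        packed = os.path.join(tmp, "all.pack")
        with open(packed, "wb") as f:
            for payload in payloads:
                f.write(struct.pack("<I", len(payload)) + payload)
        per_file, calls_per_file = read_one_per_file(tmp, names)
        from_pack, calls_packed = read_packed(packed)
    same = per_file == payloads and from_pack == payloads
    return same, calls_per_file, calls_packed
```

With 1000 records, the first function issues 3000 calls and the second
issues 3, for the same data.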


Second, the file system offers only key → value lookup and
hierarchical enumeration: you can efficiently get at a file if you know
its name, or at a set of files if they are all in one directory.

But if you want, for example, the files in a certain interval of time,
no luck. You could organize your directories to make the kind of request
you make frequently efficient, like having a YYYY/MM/DD/HH hierarchy,
but it is made awkward by the very limited API of the file system, and
cannot even remotely compete with the indexing abilities of structured
formats with multiple records per file.
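
For comparison, this is roughly what the time-interval request looks
like with a structured format. It is only a sketch using Python's
standard sqlite3 module; the table, column and index names are made up
for the example, and timestamps are plain integers (seconds).

```python
import sqlite3


def build_db(timestamps):
    """Create an in-memory database with one record per timestamp."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE records ("
        "id INTEGER PRIMARY KEY, ts INTEGER NOT NULL, body TEXT)"
    )
    # The secondary index is what the file system has no equivalent for.
    db.execute("CREATE INDEX records_by_ts ON records (ts)")
    db.executemany(
        "INSERT INTO records (ts, body) VALUES (?, ?)",
        [(ts, f"record at {ts}") for ts in timestamps],
    )
    return db


INTERVAL_QUERY = (
    "SELECT ts FROM records WHERE ts >= ? AND ts < ? ORDER BY ts"
)


def in_interval(db, start, end):
    """Return the timestamps of all records with start <= ts < end."""
    return [ts for (ts,) in db.execute(INTERVAL_QUERY, (start, end))]


def uses_index(db, start, end):
    """Ask SQLite for its query plan and check that it names our index."""
    plan = db.execute(
        "EXPLAIN QUERY PLAN " + INTERVAL_QUERY, (start, end)
    ).fetchall()
    # The human-readable detail is the last column of each plan row.
    return any("records_by_ts" in row[-1] for row in plan)


db = build_db(range(0, 100000, 60))   # one record per minute
hits = in_interval(db, 3600, 7200)    # the second hour only
```

The interval can start and end anywhere, with no need to walk a
YYYY/MM/DD/HH tree and filter the edges by hand, and a second index on
another column costs one more line.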


Third (and last of what I can think of right now), libraries or servers
that handle structured data often provide infrastructure to ensure
non-trivial consistency in the data. For example, they can automatically
delete sub-records associated with a main record you just deleted. With
the file system, you would have to reinvent all that.
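
As an illustration of the sub-record case, here is a minimal sketch
with SQLite foreign keys, again through Python's sqlite3 module. The
schema (main_record, sub_record) is hypothetical, and note that SQLite
only enforces foreign keys once the pragma is turned on for the
connection.

```python
import sqlite3


def demo_cascade():
    """Delete a main record and return the sub-records that survive."""
    db = sqlite3.connect(":memory:")
    # Foreign key enforcement is off by default in SQLite.
    db.execute("PRAGMA foreign_keys = ON")
    db.execute(
        "CREATE TABLE main_record (id INTEGER PRIMARY KEY, title TEXT)"
    )
    db.execute(
        "CREATE TABLE sub_record ("
        " id INTEGER PRIMARY KEY,"
        " main_id INTEGER NOT NULL"
        "   REFERENCES main_record(id) ON DELETE CASCADE,"
        " note TEXT)"
    )
    db.executemany(
        "INSERT INTO main_record (id, title) VALUES (?, ?)",
        [(1, "kept"), (2, "doomed")],
    )
    db.executemany(
        "INSERT INTO sub_record (main_id, note) VALUES (?, ?)",
        [(1, "a"), (2, "b"), (2, "c")],
    )
    # One statement; the database removes the dependent rows itself.
    db.execute("DELETE FROM main_record WHERE id = 2")
    return db.execute(
        "SELECT main_id, note FROM sub_record ORDER BY id"
    ).fetchall()


remaining = demo_cascade()
```

Only the sub-record attached to main record 1 is left. With one file
per record, the same cleanup means finding and unlinking the dependent
files yourself, and coping with a crash halfway through.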


Do not get me wrong, I am not a fan at all of “if all you've got is SQL,
everything looks like a flat list, even a straightforward tree
structure”, but the “just use the file system” people do not even
realize the kind of services structured formats do render.

Regards,

-- 
  Nicolas George
