Re: Nepomuk in 4.13 and beyond

Andreas Hartmetz Thu, 19 Dec 2013 09:34:49 -0800

On Thursday 19 December 2013 16:16:32 Sebastian Kügler wrote:
> On Thursday, December 19, 2013 14:48:57 Vishesh Handa wrote:
> > On Wednesday 18 Dec 2013 12:09:02 François K. wrote:
> > > Hi Vishesh, hi guys,
> > > 
> > > I'm sorry to short-circuit the thread. I deleted Vishesh's original
> > > email by mistake...
> > > 
> > > Well, that sounds really exciting! Thanks again for your work.
> > > 
> > > Here are a few thoughts/questions I have since you've made the
> > > announcement. They might be a bit technical, I hope that's not a problem
> > > (should I start a new thread?).
> > > 
> > > * What are the plans to store tags? On OSX, tags are stored in file
> > > xattrs, which is -IMHO- very nice:
> > >   - Metadata live and die with the file;
> > >   - No "store" query when you move or copy a file;
> > >   - You don't rely on a "store" to tag files;
> > >   - You also don't end with a huge store full of unuseful things like
> > >     it used to happen with Nepomuk some time ago (no offense);
> > >   - You can easily backup the metadata (at least files metadata): you
> > >     just have to use a decent backup tool that handles xattrs;
> > >   - It's CLI-friendly;
> > >   - ...
> > 
> > +1
> > 
> > I'm leaning towards this as well.
> 
> To my knowledge, the list of filesystems with proper xattr support is rather
> short.

From Wikipedia: "In Linux, the ext2, ext3, ext4, JFS, ReiserFS, XFS, Btrfs and
OCFS2 1.6 filesystems support extended attributes".
That list covers every filesystem you are likely to use on a desktop system.
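
For illustration, storing tags in an extended attribute could look roughly
like the sketch below. It relies on Python's os.setxattr and os.getxattr,
which exist on Linux only. The attribute name "user.tags" and the
comma-separated encoding are assumptions made up for this example, not
anything that has been decided in this thread.

```python
import os

# Hypothetical attribute name. Attributes set by ordinary users must live
# in the "user." namespace on Linux.
TAG_ATTR = "user.tags"


def write_tags(path, tags):
    """Store tags as a comma-separated UTF-8 string in an extended attribute."""
    os.setxattr(path, TAG_ATTR, ",".join(sorted(tags)).encode("utf-8"))


def read_tags(path):
    """Return the set of tags stored on path, or an empty set if there are none."""
    try:
        raw = os.getxattr(path, TAG_ATTR)
    except OSError:
        # Attribute not set, or the filesystem does not support xattrs.
        return set()
    return {tag for tag in raw.decode("utf-8").split(",") if tag}
```

After write_tags("report.odt", {"work", "2013"}), a rename within the same
filesystem needs no extra bookkeeping, because the attribute is attached to
the file itself. Copies and moves across filesystems keep the tags only if
the tool doing the copy preserves xattrs.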


> This means, a fallback mechanism is needed, at which time one has to
> ask if the primary mechanism is really needed. Programs like "cp" don't
> seem to consider xattr by default, so the "CLI" friendly is limited to "if
> you remember to copy xattr" as well.
> 
> The advantages of xattr are quite limited due to this, it's definitely not a
> silver bullet.
> 
A big advantage that I see is performance, both I/O and CPU. A database
of attributes means:
- File changes (moves, renames etc.) need to be tracked.
- Every file change triggers a read from and (if the file has tags) a
  write to the database, which is effectively an I/O multiplier or, put
  another way, an I/O performance divisor.
  Note that while writes can be cached and deferred, reads of blocks
  that don't happen to be in RAM already cannot be deferred. If the data
  is needed ASAP, it must be read right away. So the database reads when
  moving files could easily generate >10x the disk traffic of the easily
  cacheable and mergeable original write I/O.
It may or may not be possible to avoid the I/O multiplication problem
for full-text search. At least there one doesn't absolutely have to
keep track of all changes since the data is in the file itself, so the
file can be asynchronously re-indexed in its new location. If the file
is excluded from indexing, it's even easier. This is not possible for
tags, where you need to check for every moved file whether it has tags.
It is hard to overstate the severity of this problem.
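
To make the multiplier concrete, here is a toy model of a tag store keyed
by file path. It counts logical lookups and updates only, not real disk
I/O, and the class and method names are hypothetical, not taken from
Nepomuk or any other existing code.

```python
class PathKeyedTagStore:
    """Toy stand-in for a tag database keyed by file path."""

    def __init__(self):
        self.tags = {}    # path -> set of tags
        self.reads = 0    # lookups caused by file moves
        self.writes = 0   # updates caused by file moves

    def tag(self, path, *tags):
        self.tags.setdefault(path, set()).update(tags)

    def on_move(self, old, new):
        # Every move costs a lookup, because the store cannot know in
        # advance whether the moved file has tags.
        self.reads += 1
        if old in self.tags:
            self.tags[new] = self.tags.pop(old)
            self.writes += 1


store = PathKeyedTagStore()
store.tag("/home/a/file3", "work")

# Move 100 files, only one of which is tagged.
for i in range(100):
    store.on_move(f"/home/a/file{i}", f"/home/b/file{i}")

print(store.reads, store.writes)  # 100 1
```

Only one of the hundred files carries a tag, yet all hundred moves hit the
store. With the tags stored in xattrs, none of these lookups would be
needed.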

Another general consideration is that, if some change in the kernel
would help us tremendously (e.g. in file change tracking APIs), we
should try to effect such a change. We in the KDE community tend to
treat the kernel as out of our influence, which does not need to be so.

