On Thursday 19 December 2013 16:16:32 Sebastian Kügler wrote:
> On Thursday, December 19, 2013 14:48:57 Vishesh Handa wrote:
> > On Wednesday 18 Dec 2013 12:09:02 François K. wrote:
> > > Hi Vishesh, hi guys,
> > >
> > > I'm sorry to short-circuit the thread. I deleted Vishesh's original
> > > email by mistake...
> > >
> > > Well, that sounds really exciting ! Thanks again for your work.
> > >
> > > Here are a few thoughts/questions I have since you've made the
> > > announcement. They might be a bit technical, I hope that's not a problem
> > > (should I start a new thread ?).
> > >
> > > * What are the plans to store tags ? On OSX, tags are stored in files
> > > xattrs which is -IMHO- very nice :
> > > - Metadata live and die with the file ;
> > > - No "store" query when you move or copy a file ;
> > > - You don't rely on a "store" to tag files ;
> > > - You also don't end with a huge store full of unuseful things like it
> > >   used to happen with Nepomuk some time ago (no offense) ;
> > > - You can easily backup the metadata (at least files metadata) : you
> > >   just have to use a decent backup tool that handles xattrs ;
> > > - It's CLI-friendly ;
> > > - ...
> >
> > +1
> >
> > I'm leaning towards this as well.
>
> To my knowledge, the list of filesystems with proper xattr support is rather
> short.

From Wikipedia: "In Linux, the ext2, ext3, ext4, JFS, ReiserFS, XFS, Btrfs
and OCFS2 1.6 filesystems support extended attributes". That list includes
all filesystems you are going to use on a desktop system.

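To make the xattr idea under discussion concrete, here is a minimal Python
sketch that stores tags in a user xattr and returns False instead of raising
when the attribute cannot be written, which is the point where any fallback
mechanism would have to take over. The attribute name "user.example.tags" and
the comma-separated encoding are my own assumptions for illustration, not
anything decided in this thread. os.setxattr and os.getxattr are only
available on Linux.

```python
import os

# Hypothetical attribute name. "user." is the xattr namespace that Linux
# lets unprivileged processes write to; the rest is made up for this sketch.
TAG_ATTR = "user.example.tags"


def set_tags(path, tags):
    """Store tags in an xattr. Return False if they could not be written."""
    if not hasattr(os, "setxattr"):  # os.setxattr exists on Linux only
        return False
    try:
        os.setxattr(path, TAG_ATTR, ",".join(tags).encode("utf-8"))
        return True
    except OSError:
        # Typically ENOTSUP: the filesystem or its mount options do not
        # allow user xattrs. A fallback store would be consulted here.
        return False


def get_tags(path):
    """Read tags back from the xattr. Return [] if there are none."""
    if not hasattr(os, "getxattr"):
        return []
    try:
        raw = os.getxattr(path, TAG_ATTR)
    except OSError:
        # Attribute absent, xattrs unsupported, or file missing: in this
        # sketch all of these simply read as "no tags".
        return []
    return [t for t in raw.decode("utf-8").split(",") if t]
```

Calling set_tags("notes.txt", ["work", "kde"]) and then get_tags("notes.txt")
should give the same list back on a filesystem that accepts user xattrs. The
comma-separated encoding means a tag cannot itself contain a comma; a real
design would need a proper encoding.
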
> This means, a fallback mechanism is needed, at which time one has to
> ask if the primary mechanism is really needed. Programs like "cp" don't
> seem to consider xattr by default, so the "CLI" friendly is limited to "if
> you remember to copy xattr" as well.
>
> The advantages of xattr are quite limited due to this, it's definitely not a
> silver bullet.

A big advantage that I see is performance, both I/O and CPU. A database of
attributes means:
- File changes (moves, renames etc.) need to be tracked.
- Every file change triggers a read from and (if the file has tags) a write
  to the database, which is an effective I/O multiplier or, put another way,
  an I/O performance divisor.

Note that while writes can be cached and deferred, reads of blocks that
aren't "accidentally" already in RAM cannot. If the data is needed ASAP, it
must be read from disk right away. So the database reads when moving files
could easily generate >10x the disk traffic of the easily cacheable and
mergeable original write I/O.

It may or may not be possible to avoid the I/O multiplication problem for
full-text search. At least there one doesn't absolutely have to keep track of
all changes since the data is in the file itself, so the file can be
asynchronously re-indexed in its new location. If the file is excluded from
indexing, it's even easier. This is not possible for tags, where you need to
check for every moved file whether it has tags. It is hard to overstate the
severity of this problem.

Another general consideration is that, if some change in the kernel would
help us tremendously (e.g. in file change tracking APIs), we should try to
effect such a change. We in the KDE community tend to treat the kernel as
out of our influence, which does not need to be so.
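
To illustrate the bookkeeping a path-keyed attribute database implies on
every move, here is a rough Python sketch using sqlite3. The schema and the
names open_store and move_with_store are hypothetical; this is not how
Nepomuk or any planned replacement works. The returned count is the number
of store statements issued, not a measurement of disk I/O.

```python
import os
import sqlite3


def open_store(db_path=":memory:"):
    """Open a hypothetical tag store that maps file paths to tags."""
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tags (path TEXT NOT NULL, tag TEXT NOT NULL)"
    )
    db.execute("CREATE INDEX IF NOT EXISTS tags_path ON tags (path)")
    return db


def move_with_store(db, src, dst):
    """Rename a file and keep the path-keyed store consistent.

    Returns the number of store statements this move required.
    """
    os.rename(src, dst)

    # Every move needs a lookup, tagged or not: we cannot know in advance
    # whether the file has tags.
    statements = 1
    has_tags = db.execute(
        "SELECT 1 FROM tags WHERE path = ? LIMIT 1", (src,)
    ).fetchone()

    if has_tags:
        # Tagged files additionally need a write so the tags follow the file.
        db.execute("UPDATE tags SET path = ? WHERE path = ?", (dst, src))
        db.commit()
        statements += 1
    return statements
```

With tags kept in xattrs, the same move within one filesystem is just
os.rename(src, dst): the attributes are attached to the inode and travel
with it, so there is no lookup and no update.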