On Wednesday 18 Dec 2013 12:09:02 François K. wrote:
> Hi Vishesh, hi guys,
>
> I'm sorry to short-circuit the thread. I deleted Vishesh's original email by
> mistake...
>
> Well, that sounds really exciting! Thanks again for your work.
>
> Here are a few thoughts/questions I have since you've made the announcement.
> They might be a bit technical, I hope that's not a problem (should I start
> a new thread?).
>
> * What are the plans to store tags? On OSX, tags are stored in files' xattrs,
>   which is -IMHO- very nice:
>   - Metadata live and die with the file;
>   - No "store" query when you move or copy a file;
>   - You don't rely on a "store" to tag files;
>   - You also don't end up with a huge store full of useless things, like it
>     used to happen with Nepomuk some time ago (no offense);
>   - You can easily back up the metadata (at least files' metadata): you just
>     have to use a decent backup tool that handles xattrs;
>   - It's CLI-friendly;
>   - ...

+1

I'm leaning towards this as well.

> * What are the plans to store indexes? Again, with OSX (sorry, I work a lot
>   with Macs -maybe too much-), the system builds an index per volume. This is
>   quite nice because when you connect a volume that has already been indexed,
>   the system gets the information and can immediately search the volume index.
>   Let's take an example: let's say you have some remote storage (NAS or
>   whatever) at home with your media. You mount this remote volume and let
>   the indexers do their stuff. Then you mount the volume from another device
>   and *tadaaa*, you're able to query the previously-built index. Wouldn't
>   that be awesome? If you disconnect the volume, the index for this volume
>   isn't available anymore and you don't get results for it. This also means
>   that if one index gets corrupted, you don't have to scan and index every
>   volume again. I think this would also solve Ignacio's issue.
>

This is exactly what I'm aiming for. We're currently using Xapian to store the
indexes. Its engine allows multiple databases to be queried easily.

> * You probably already know it, but an SQLite DB might have some problems when
>   stored on remote filesystems (see: http://www.sqlite.org/wal.html and
>   especially "All processes using a database must be on the same host
>   computer; WAL does not work over a network filesystem."). So if you plan to
>   store each index on its volume (as previously suggested), SQLite might not
>   be the (best) solution.
>

Nah. SQLite is used to map file URLs to unique identifiers. We need unique
identifiers for files since the URL can change on rename/move. This unique
identifier (an unsigned integer) is then used in Xapian to uniquely identify
the file.

> * Will there be several separate indexers (one for PDF files, one for video
>   files, ...) or just one that takes care of everything? I was thinking
>   about the ability to add indexers that could retrieve stuff from the
>   Internet.
>   For example, have an indexer that could retrieve movie
>   information from TheMovieDB.org.
>

There are separate indexers for each file format, as was the case with
Nepomuk. Please have a look at kfilemetadata [1].

For web extractors, I still haven't figured out how we would approach that.
Another Nepomuk developer, Jorg, has similar ideas. Maybe we should start a
thread about it and discuss it?

> * I hope there will be a nice query API? Dealing with Sparql was a
>   nightmare for me!
>

There is one right now. Perhaps you could take a look and give some feedback?

> * Will it come with a QML DataEngine?
>

Can't say. It will have QML bindings, but I'm not sure about a DataEngine.
Let's see.

--
Vishesh Handa

[1] https://projects.kde.org/projects/playground/base/kfilemetadata
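
PS: To make the URL-to-identifier mapping mentioned above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the actual code: the table name `files`, the schema and the helper names (`open_mapping`, `id_for_url`, `rename`) are all hypothetical. The point it shows is that a rename/move only updates the URL column, so the integer identifier (the one that would be used on the Xapian side) stays the same.

```python
import sqlite3


def open_mapping(path=":memory:"):
    """Open (or create) the URL -> id mapping. Schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " url TEXT NOT NULL UNIQUE)"
    )
    return conn


def id_for_url(conn, url):
    """Return the identifier for a URL, allocating a new one if needed."""
    row = conn.execute("SELECT id FROM files WHERE url = ?", (url,)).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute("INSERT INTO files (url) VALUES (?)", (url,))
    conn.commit()
    return cur.lastrowid


def rename(conn, old_url, new_url):
    """Record a rename/move: the URL changes, the identifier does not."""
    cur = conn.execute(
        "UPDATE files SET url = ? WHERE url = ?", (new_url, old_url)
    )
    conn.commit()
    return cur.rowcount == 1
```

Usage would look like `doc_id = id_for_url(conn, "file:///home/user/a.pdf")`, followed by `rename(conn, old, new)` when the file moves; looking up the new URL then returns the same `doc_id`, so nothing keyed on that identifier has to be rewritten.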