Hi Prakash,

Thanks for your reply!

On Thu, 2006-11-16 at 12:15 -0800, Prakash Sangappa wrote:
> As it has been previously discussed, watching for file events on an entire
> filesystem or a directory tree can be a scalability issue. So the File 
> events API may not be suitable for an application like the desktop search. 

I beg to differ, as we're already using inotify pretty effectively on
Linux for just this purpose.

It's important to maintain a sense of scope.  I fear that we as software
developers try to find a perfect general solution at the expense of
real, more limited use cases.

In the previous thread I recall someone saying that this won't scale on
a 10 TB filesystem, and that person is probably right.  However, I use
Beagle on my 120 GB home directory and there definitely isn't a torrent
of activity most of the time.

Beagle is a bit of a pathological case, though; a more realistic case
would be GNOME VFS, which monitors directories based on folders that are
open in the Nautilus file manager.

> For a desktop search system like Beagle or spotlight, it appears that a 
> better and an useful method would be for the filesystem to provide an 
> interface using which we could efficiently collect all the changes
> that have occurred since some given time.

This would be an incredibly useful feature -- the initial crawl of the
filesystem in Beagle is expensive and painful -- but it's only truly
useful when used in tandem with a file notification system.  Otherwise
we're still stuck in the world of polling, just with a nicer API.
Beagle would be forced to call a hypothetical files_changed_since_time()
function once a second in a loop to pick up changes.  Is that really more
efficient?
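
To make the polling concern concrete, here is a rough sketch in C.  It is
a toy simulation, not a real interface: files_changed_since_time() is only
the name I used above, and its signature, the in-memory change list, and
the count_empty_polls() helper are all invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One entry in a simulated change log (assumption: not a real API). */
struct change {
    long when;          /* simulated time of the change, in seconds */
    const char *path;   /* file that changed */
};

/* Hypothetical "what changed since T" query, answered from the fake log.
 * Counts changes in the half-open window (since, now]. */
static size_t files_changed_since_time(const struct change *changes, size_t n,
                                       long since, long now)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (changes[i].when > since && changes[i].when <= now)
            hits++;
    return hits;
}

/* Poll once per simulated second from start to end and count how many
 * polls found nothing to report. */
static long count_empty_polls(const struct change *changes, size_t n,
                              long start, long end)
{
    long empty = 0;
    assert(start <= end);
    for (long now = start + 1; now <= end; now++)
        if (files_changed_since_time(changes, n, now - 1, now) == 0)
            empty++;
    return empty;
}
```

With two changes in a simulated hour, the loop makes 3600 calls and 3598
of them come back empty.  A notification interface would have woken the
application twice.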

> The approach we are taking for the file events notification API, is to 
> address the needs where applications have to  repeatedly stat 
> files/directories for changes. Now using the file events API, 
> applications have to just wait for file events that are sent when a 
> file or directory status changes.

This is essentially what Beagle wants too, just for all the files under
your home directory. :)

> File events interface:
> 
> Event types:
>       * FILE_ACCESS          /* Monitored file/directory was accessed */
>       * FILE_MODIFIED        /* Monitored file/directory was modified */
>       * FILE_ATTRIB          /* Monitored file/directory's ATTRIB was changed */
>
> 
> Exception events:
>       * FILE_DELETE       /* Monitored file/directory was deleted */
>       * FILE_RENAME_TO    /* Monitored file/directory was renamed */
>       * FILE_RENAME_FROM  /* Monitored file/directory was renamed */
>       * UNMOUNTED         /* Monitored file system got unmounted */

How is file creation handled?  It seems like you would need an event
analogous to FILE_DELETE here to notice any newly added files.

> The application can only watch the following events. The exception events
> are reported as they occur. They don't have to be watched for.
> 
> FILE_ACCESS,
> FILE_MODIFIED,
> FILE_ATTRIB.
> 
> typedef struct file_obj {
>         timestruc_t     atime;          /* Access time got from stat(2) */
>         timestruc_t     mtime;          /* Modification time from stat(2) */
>         timestruc_t     ctime;          /* Change time from stat(2) */
>         char            *name;          /* Null terminated file name */
> } file_obj_t;

Does watching a directory imply that all the contained files are
watched?  That is, if I watch /home/joe, will touching /home/joe/foo
cause a FILE_MODIFIED event to be thrown for /home/joe/foo?  The two use
cases I can think of (Beagle and gnome-vfs) are more interested in
watching all the files in a directory rather than individual files.
(Although there are certainly use cases for individual files as well.)
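
For comparison, this is roughly how it looks with inotify on Linux.  A
watch on a directory reports events for the entries directly inside it,
with the entry name attached, but it does not descend into subdirectories.
The sketch below is only an illustration: the helper names (watch_dir,
read_events, demo_create_in_watched_dir), the struct, and the scratch
directory path are mine; the inotify calls themselves are the standard
ones.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/inotify.h>
#include <unistd.h>

struct seen {
    uint32_t mask;      /* IN_* bits for this event */
    char name[256];     /* entry name relative to the watched directory */
};

/* Watch one directory. inotify does not descend into subdirectories. */
static int watch_dir(const char *dir)
{
    int fd = inotify_init();
    if (fd < 0)
        return -1;
    if (inotify_add_watch(fd, dir,
                          IN_CREATE | IN_MODIFY | IN_ATTRIB | IN_DELETE) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Block until events arrive, then copy up to cap of them into out.
 * Returns the number copied, or -1 on error. Events beyond cap in the
 * same read are dropped, which is acceptable only for a sketch. */
static int read_events(int fd, struct seen *out, int cap)
{
    char buf[4096]
        __attribute__((aligned(__alignof__(struct inotify_event))));
    ssize_t len;
    int n = 0;

    assert(out != NULL && cap > 0);
    len = read(fd, buf, sizeof buf);
    if (len <= 0)
        return -1;
    for (char *p = buf; p < buf + len && n < cap; n++) {
        const struct inotify_event *ev = (const struct inotify_event *)p;
        out[n].mask = ev->mask;
        snprintf(out[n].name, sizeof out[n].name, "%s",
                 ev->len ? ev->name : "");
        p += sizeof(struct inotify_event) + ev->len;
    }
    return n;
}

/* Create a scratch directory, watch it, create "foo" inside it, and
 * report the first event the watch produced. */
static int demo_create_in_watched_dir(struct seen *first)
{
    char dir[] = "/tmp/fe-demo-XXXXXX";
    char path[64];
    int fd, file, n;

    if (mkdtemp(dir) == NULL)
        return -1;
    fd = watch_dir(dir);
    if (fd < 0) {
        rmdir(dir);
        return -1;
    }
    snprintf(path, sizeof path, "%s/foo", dir);
    file = open(path, O_CREAT | O_WRONLY, 0600);
    if (file >= 0)
        close(file);
    n = file >= 0 ? read_events(fd, first, 1) : -1;
    unlink(path);
    rmdir(dir);
    close(fd);
    return n;
}
```

Creating "foo" in the watched directory yields an IN_CREATE event whose
name field is "foo", without the application ever having registered the
file itself.  That is the behavior Beagle and gnome-vfs rely on.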

> To activate monitoring (watching) a file, it needs to be registered.
> Upon delivering an event, the file monitor is disabled. It needs to be
> re-registered again to reactivate the monitor and receive further events.

Why is this?  Usually apps that monitor a file want to do it on their
own terms, because they have the state they need to determine when to
watch a file or not.  This constraint seems like it will just be a
burden on programmers.
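
Here is a toy model of what I am worried about.  It is not the proposed
API: the struct and function names are made up, and it only assumes what
the description above says, namely that delivering an event disables the
monitor until the application registers again.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy one-shot monitor (assumption: models only the "disabled after
 * delivery" rule, nothing else about the real interface). */
struct oneshot_monitor {
    bool armed;       /* true while the monitor is registered */
    int delivered;    /* events the application actually received */
};

static void monitor_register(struct oneshot_monitor *m)
{
    m->armed = true;
}

/* A modification is delivered only if the monitor is armed, and
 * delivery disarms it. */
static void file_modified(struct oneshot_monitor *m)
{
    if (m->armed) {
        m->armed = false;
        m->delivered++;
    }
}

/* Apply a burst of writes and return how many events were delivered. */
static int deliveries_for_burst(int writes, bool reregister_each_time)
{
    struct oneshot_monitor m = { false, 0 };

    assert(writes >= 0);
    monitor_register(&m);
    for (int i = 0; i < writes; i++) {
        file_modified(&m);
        if (reregister_each_time)
            monitor_register(&m);
    }
    return m.delivered;
}
```

In this model a burst of three writes produces one event unless the
application re-registers after every delivery, so every consumer ends up
writing the same re-arm loop.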

Joe

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
