Darren Kenny wrote:
Hi Prakash,
I don't think it's the implementation that bridges between the kernel
and user space that's important to JDS, and probably to most other
people - it's the ultimate API that people will have to write to. From
this perspective I think that sysevents is not what we want: it may be
part of the implementation of the final API, but it shouldn't really be
the main API that developers have to write to. At the moment there are
two projects, both with the same ABI (AFAIK), that are prevalent in the
Linux developer community:
I understand - it is the API that matters. We should be implementing
support for some standard API for file event notification.
Unfortunately, there is neither a standard API nor an agreed set of
requirements. Hence this discussion is about understanding what the
requirements for such a feature are.
* libfam - http://oss.sgi.com/projects/fam/
* gamin - http://www.gnome.org/~veillard/gamin/index.html
They are compared (with some bias of course) at:
http://www.gnome.org/~veillard/gamin/differences.html
It's an API like these that we need, so I think any solution proposed
(albeit with sysevents underneath) should expose one of these APIs to
the user. If the same people produce this API as those that expose
things via sysevents, then things should work optimally. Whether it's
the SAME API is up for debate, but whatever we expose should at least
provide the same functionality.
When you say 'sysevents', what are you referring to?
Support for a commonly used API can be provided by means of a library.
The implementation will provide a native set of interfaces to support
the desired functionality/requirements, for example using the Event
ports interfaces. The library can then use these native interfaces and
expose a common API, as in the sketch below.
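To illustrate that layering, here is a minimal sketch of what such a
library shim might look like. It assumes the PORT_SOURCE_FILE source and
a file_event structure along the lines proposed later in this thread;
the watch_file() name is hypothetical and none of this is a shipped API.

    /*
     * Sketch only: expose a simple "watch this path" call on top of the
     * native Event ports interfaces.  PORT_SOURCE_FILE and struct
     * file_event are assumptions taken from the proposal below.
     */
    #include <port.h>
    #include <stdlib.h>
    #include <string.h>

    struct file_event {
            uintptr_t id;      /* id returned after registering */
            int       len;     /* length of the file name */
            char      fname[]; /* filename (flexible array member) */
    };

    int
    watch_file(int port, const char *path, void *cookie)
    {
            size_t len = strlen(path);
            struct file_event *fobj;

            fobj = malloc(sizeof (struct file_event) + len + 1);
            if (fobj == NULL)
                    return (-1);
            fobj->id = 0;
            fobj->len = (int)len;
            (void) strcpy(fobj->fname, path);

            /*
             * The cookie is returned with every event for this object.
             * The event mask (0 here) would carry the requested file
             * events once those names are finalized.
             */
            return (port_associate(port, PORT_SOURCE_FILE,
                (uintptr_t)fobj, 0, cookie));
    }

A FAM- or gamin-style API could then be implemented as a thin layer of
such calls, with its event loop built around port_get().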
One thing that people have mentioned before as well is the handling of
distributed filesystems - how do you propose we handle these? This is
especially important on Solaris given that the majority of our
customers use NFS for their home directories. From what I understand
of the File Event Mechanism (FEM) in the kernel, we are using this in
NFSv4 - can this provide us with the ability to see changes to files
that occur on an NFSv4 filesystem mounted on my desktop, for instance?
The proposed solution will not provide file events generated on a
distributed filesystem from a remote node, but it certainly can provide
file events generated locally on the distributed filesystem (e.g. on
the NFS client side). I don't think the FEM framework in the kernel,
which is used for NFSv4 delegation, can provide support for events from
remote nodes.
Clearly, this should be transparent to the API.
It appears that the distributed filesystem implementation would have to
provide the means to collect such events; I think the responsibility
falls squarely on the distributed filesystem implementation. I don't
know of any distributed filesystem implementation that can do that.
-Prakash.
Thanks,
Darren.
Prakash Sangappa wrote:
Glynn Foster wrote:
Yeah, currently Beagle only indexes a relatively small amount of
per-user data, generally in $HOME - however, as has been mentioned,
it's probably one of the first proper use cases of inotify.
I'd suggest that it's definitely worth looking at what inotify does
- given that there seems to have been a lot of churn with
dnotify/inotify/FAM/gamin etc., there are probably some
implementation lessons to be learned from our Linux neighbours [some
of http://kerneltrap.org/node/3847 might be interesting reading].
FILE_CLOSE_WRITE, which some GNOME developers have suggested is
useful, is missing from the list of events posted previously.
Yes, I have looked at some of the issues with 'dnotify' that were
addressed by 'inotify', and also at some of the issues with inotify
itself.
Here is another pointer which lists some issues with 'inotify'; these
may already have been addressed in later versions of inotify:
http://manic.desrt.ca/inotify
Proposed interfaces:
- Unlike the 'inotify' interfaces, which use ioctls to a device, our
  interfaces will be based on the existing Event ports framework. The
  events will be delivered to a specified Event port (which is an fd).
  An Event port can receive events from multiple sources; the event
  sources currently available in Solaris are poll, aio, timer, user,
  and message queue (recently integrated). More event sources can be
  added.
  For example, if the application needs to wait on a poll event and
  also receive file event notifications, it can receive both types of
  events on one Event port (see the sketch after this list).
- The application does not have to open the file/directory being
  watched. Therefore it will not prevent the file from being deleted or
  the filesystem from being unmounted; these were issues with
  'dnotify'.
- If the filesystem gets unmounted, the system will automatically
  de-register the event notification and send an event indicating that.
  It can also send a 'file deleted' event when a watched file gets
  deleted.
- A pointer to some user data can be passed in when registering for
  file event notification. The user data will be returned along with
  the event notification; this was something missing from the inotify
  interface.
  I think 'inotify' returns something called 'watch descriptors' when
  registering the watches. They had an issue with watch descriptors
  being reused, which could cause confusion in identifying the received
  events. This issue may already have been addressed in their new
  version.
  This problem should not exist with the Event ports interface, since a
  user data pointer will be returned with the event; the application
  can differentiate the events based on that.
- To de-register or to re-register, the object (file) needs to be
  passed in again, but the file name could disappear (get removed) from
  the directory. To accommodate this, we could change the passed-in
  object to a structure which includes the filename. Once the watch is
  registered, the system can return a unique id, and subsequent actions
  (de-register, re-register) would depend on this 'id'.
  Ex.
      struct file_event {
              uintptr_t id;       /* id returned after registering */
              int       len;      /* length of the file name */
              char      fname[0]; /* filename */
      } fobj;

  So now the interface will be

      port_associate(port, PORT_SOURCE_FILE, (uintptr_t)&fobj, ... );

  and this 'fobj' structure can be passed to de-register
  (port_dissociate) or re-register (port_associate again) the file
  events; see the sketch below.
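To make the proposed flow concrete, here is a minimal sketch of
registering a watch and draining events from the port. It assumes the
PORT_SOURCE_FILE source and the file_event structure above, plus a
hypothetical FILE_MODIFIED event bit (the event names are not final),
and also associates an fd with the same port to show how different
event sources can be mixed.

    #include <port.h>
    #include <poll.h>
    #include <stdio.h>
    #include <string.h>

    struct file_event {
            uintptr_t id;        /* id returned after registering */
            int       len;       /* length of the file name */
            char      fname[64]; /* filename (fixed size for the sketch) */
    };

    int
    main(void)
    {
            struct file_event fobj;
            port_event_t pe;
            int port = port_create();
            int fd = 0;          /* stdin, just to show a second source */

            (void) strlcpy(fobj.fname, "/etc/nsswitch.conf",
                sizeof (fobj.fname));
            fobj.len = (int)strlen(fobj.fname);
            fobj.id = 0;

            /* Watch the file; the user cookie comes back in portev_user. */
            (void) port_associate(port, PORT_SOURCE_FILE,
                (uintptr_t)&fobj, FILE_MODIFIED, "file-watch");

            /* The same port can also deliver poll-style fd events. */
            (void) port_associate(port, PORT_SOURCE_FD,
                (uintptr_t)fd, POLLIN, "fd-watch");

            for (;;) {
                    if (port_get(port, &pe, NULL) != 0)
                            break;
                    if (pe.portev_source == PORT_SOURCE_FILE)
                            (void) printf("file event, cookie %s\n",
                                (char *)pe.portev_user);
                    else
                            (void) printf("fd event, cookie %s\n",
                                (char *)pe.portev_user);
                    /*
                     * If the association is one-shot, re-registration
                     * (port_associate again) would happen here, as
                     * described above.
                     */
            }
            return (0);
    }

The same fobj could later be handed to port_dissociate() to stop
watching, as described above.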
FILE_CLOSE_WRITE - What is the purpose of this event? If it is
found to be useful, we could include it.
Rgds,
-Prakash.
Glynn
Does this mean we don't get told /what/ got created? Is an
application that
wants to know "what files are disappearing/appearing under
/foo/bar/?" going
to have to readdir() the whole directory every time it gets an
event?
Otherwise we get into the queued event problem; what happens if the
application is watching a directory w/ a million files, and
someone does
rm * in that directory? Do we generate a million events? Clearly
there
are limits to the number of events we can queue in the kernel, esp.
since the application isn't obligated to read them in a timely
fashion.
Forcing a (recursive!) readdir() every time can't scale either; it
just pushes the problem out to all the userspace apps. Perhaps a
compromise approach would work, so that at least the readdir() cost is
amortized; i.e. give names up to a particular limit.
Or how do you expect Beagle to be able to work nicely? Is this just
going to
remain something explicitly unsupportable?
I'd rather have a model like signals; multiple file writes are
combined into one event until that event is read by the
application; any subsequent writes generate another event.
Would work fine for modifications, yes.
I see this as very useful to avoid the { sleep(); stat(); } loops we
often see. It's not a mechanism to insert an application as a
synchronous interposer into the filesystem VOPs.
I wasn't trying to suggest it was. Synchronisation is neither
needed nor
wanted.
The nscd could use this to watch for modifications to configuration
files rather than stat'ing them before each cache lookup.
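As an illustration of that point, here is a minimal sketch of the
difference, again assuming the proposed PORT_SOURCE_FILE source, the
file_event structure from earlier in the thread, and a hypothetical
FILE_MODIFIED event bit; reload_config() is just a placeholder.

    #include <port.h>
    #include <sys/stat.h>
    #include <string.h>
    #include <unistd.h>

    extern void reload_config(const char *path);   /* placeholder */

    /* Today: sleep and stat, reloading when the mtime changes. */
    void
    poll_for_changes(const char *path)
    {
            struct stat old, cur;

            (void) stat(path, &old);
            for (;;) {
                    (void) sleep(5);
                    if (stat(path, &cur) == 0 &&
                        cur.st_mtime != old.st_mtime) {
                            reload_config(path);
                            old = cur;
                    }
            }
    }

    /* With file events: block until the kernel reports a change. */
    struct file_event {
            uintptr_t id;
            int       len;
            char      fname[64];
    };

    void
    wait_for_changes(const char *path)
    {
            struct file_event fobj;
            port_event_t pe;
            int port = port_create();

            (void) strlcpy(fobj.fname, path, sizeof (fobj.fname));
            fobj.len = (int)strlen(fobj.fname);
            fobj.id = 0;

            for (;;) {
                    /* Re-associate each pass in case the watch is one-shot. */
                    (void) port_associate(port, PORT_SOURCE_FILE,
                        (uintptr_t)&fobj, FILE_MODIFIED, NULL);
                    if (port_get(port, &pe, NULL) != 0)
                            break;
                    reload_config(path);
            }
    }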
I wasn't suggesting that a non-recursive approach doesn't solve a
whole class
of such situations; it does. In fact, I was merely trying to raise
awareness of
what applications like Beagle actually need in terms of
notifications. If it's
really too hard to do, that's a pity.
regards,
john
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org