On 2009.06.03 21:55:27 +0530, Senthil Kumaran wrote:
> 1) I need to constantly monitor a particular directory for new files.
> 2) Whenever a new file is dropped; I read that file and get
> information on where to collect data from that is a) another machine b)
> machine2-different method c) database.
> 3) I collect data from those machines and store it.
>
> The data is huge and I need the three processes a, b, c to be
> non-blocking, and I can just do a function call like do_a(), do_b(),
> do_c() to perform them.
>
> For 1) to constantly monitor a particular directory for new files, I
> am doing something like this:
>
> My Question: Can this be designed in way that looking for new files is
> also asynchronous activity?

If your OS has a way to let you register your interest in particular
directories and then notify you when new files appear there, then yes.

If you're using Linux then check out the Twisted inotify wrapper that's
in dialtone's sandbox.
http://twistedmatrix.com/trac/changeset/25717

If you're using something else then it probably has a similar API, but
it'll be more work because AFAIK nobody's already written the Twisted
wrapper for you.

Or maybe you can get away with just periodically calling os.listdir from
a subthread, using deferToThread. Not technically asynchronous but
probably good enough.

> Now, after reading the contents, I will have to do a non-blocking call
> to fetch data, either using fun_a, fun_b or fun_b. How should I
> associate this requirement to deferred/callback pattern?

Depends. If it's just a simple cheap Python function that doesn't block
then you can just do:

    from twisted.internet import reactor, task

    # deferLater returns a Deferred that fires with fun_a's result.
    # (reactor.callLater returns a DelayedCall, which has no addCallback.)
    deferred1 = task.deferLater(reactor, 0, fun_a)
    deferred1.addCallback(fun_a_callback)
    deferred1.addErrback(fun_a_errback)

If it's a simple function that blocks and can't be changed to not block
but doesn't use too much CPU then you can use deferToThread.

If it's a piggy enough function that you really want it in a separate
process so it can use another CPU core, then write a little Python
script that wraps it, and call it using the Twisted process APIs:
http://twistedmatrix.com/projects/core/documentation/howto/process.html

But just because you can do this in Twisted doesn't mean you necessarily
should. If you need an asynchronous main loop then Twisted has really
good APIs for dealing with asynchronous main loops. (If you're on Linux
and can use inotify then it qualifies.) But if you end up polling the
filesystem with os.listdir in one thread, and running your fun_x in
other threads, and you're not really doing anything asynchronous, then
IMO Twisted won't really add any value. In that case I'd just use
Python's threading and Queue modules.

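For the threads-only case in that last paragraph, a minimal sketch might
look like the following. It assumes Python 3, where the Queue module is
named queue. The names watch_directory, worker, and handle_file are
hypothetical, not part of any library. It also assumes that files already
in the directory at startup count as new, and it makes no attempt to check
that a file has been completely written before queueing it.

```python
import os
import queue
import threading


def watch_directory(path, work_queue, stop_event, interval=1.0):
    """Poll `path` with os.listdir and queue each newly seen file path."""
    seen = set()  # assumption: files present at startup are treated as new
    while not stop_event.is_set():
        current = set(os.listdir(path))
        for name in sorted(current - seen):
            work_queue.put(os.path.join(path, name))
        seen = current
        # Sleep between polls, but wake up early if asked to stop.
        stop_event.wait(interval)


def worker(work_queue, handle_file):
    """Process queued file paths until a None sentinel arrives."""
    while True:
        item = work_queue.get()
        if item is None:  # sentinel: shut this worker down
            break
        try:
            # handle_file is a hypothetical callable that would read the
            # file and dispatch to do_a(), do_b(), or do_c().
            handle_file(item)
        except Exception as exc:
            # Keep the worker alive if one file fails.
            print("failed on %s: %s" % (item, exc))
```

To use it, start watch_directory in one threading.Thread and worker in
one or more others, all sharing the same queue.Queue. To shut down, set
the stop event and put one None on the queue per worker.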
-- 
David Ripton drip...@ripton.net