(0) Create a toolkit to run multiple parallel, tightly communicating Clojure apps on Google App Engine, simulating a single, long-running, multithreaded JVM instance. To the user, it should not appear to be limited by the constraints of GAE's Java implementation (e.g. single threading would be hidden, and shared refs would work despite being distributed). At the same time, the toolkit would minimize the resources consumed in GAE by persisting threads/continuations that are waiting for data, and by heuristically determining the lowest-cost trade-off between long-term storage, memory, and runtime consumption.
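To make the "shared refs work despite being distributed" part a bit more concrete, here is a minimal sketch, not a working GAE backend. It assumes the external store (Datastore, Memcache, or similar) can offer an atomic compare-and-set. SharedStore, MemStore, fetch, cas! and swap-shared! are all hypothetical names invented for illustration, and MemStore is only an in-memory stand-in so the sketch runs on a single JVM.

```clojure
;; Hypothetical sketch: none of these names come from GAE or an existing library.

(defprotocol SharedStore
  "Minimal interface an external store would need to expose."
  (fetch [store k] "Return the current value stored under k, or nil.")
  (cas! [store k expected new-val]
    "Set k to new-val only if it still holds expected; return true on success.
     May fail spuriously, so callers are expected to retry."))

;; In-memory stand-in for the external store (assumption: a real backend
;; would provide an equivalent atomic compare-and-set).
(defrecord MemStore [state]
  SharedStore
  (fetch [_ k] (get @state k))
  (cas! [_ k expected new-val]
    (let [old @state]
      (and (= (get old k) expected)
           (compare-and-set! state old (assoc old k new-val))))))

(defn swap-shared!
  "Like swap!, but against a SharedStore: read, apply f, and retry
   if another writer changed the value in the meantime."
  [store k f & args]
  (loop []
    (let [old     (fetch store k)
          new-val (apply f old args)]
      (if (cas! store k old new-val)
        new-val
        (recur)))))

;; Usage
(def store (->MemStore (atom {})))
(swap-shared! store :hits (fnil inc 0)) ; => 1
(swap-shared! store :hits (fnil inc 0)) ; => 2
(fetch store :hits)                     ; => 2
```

The retry loop mirrors what swap! already does for an Atom inside one JVM. The open question for the toolkit is whether the backing store's compare-and-set is cheap enough to make this practical across many JVMs.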
Note: http://elhumidor.blogspot.com/2009/04/clojure-on-google-appengine.html

> THE BIG CAVEAT
>
> Two unusual aspects of the Google AppEngine environment create pretty major
> constraints on your ability to write idiomatic Clojure.
>
> First, an AppEngine application runs in a security context that doesn't
> permit spawning threads, so you won't be able to use Agents, the
> clojure.parallel library, or Futures.
>
> Second, one of the most exciting features of AppEngine is that your
> application will be deployed on Google's huge infrastructure, dynamically
> changing its footprint depending on demand. That means you'll potentially be
> running on many JVMs at once. Unfortunately this is a strange fit for
> Clojure's concurrency features, which are most useful when you have precise
> control over what lives on what JVM (and simplest when everything runs on
> one JVM). Since shared references (Vars, Refs, and Atoms) are shared only
> within a single JVM, they are not suitable for many of their typical uses
> when running on AppEngine. You should still use Clojure's atomic references
> (and their associated means of modification) for any state that it makes
> sense to keep global per-JVM, since there may be multiple threads serving
> requests in one JVM. But remember JVMs will come and go during the lifetime
> of your application, so anything truly global should go in the Datastore or
> Memcache.

(1) A Clojure implementation of Yahoo's PNUTS, using STM and all the cool facilities Clojure provides: http://research.yahoo.com/files/pnuts.pdf (interesting to have a writeup of a real-world implementation alongside comparisons to Google Bigtable and Amazon Dynamo)

> We describe PNUTS, a massively parallel and geographically distributed
> database system for Yahoo!'s web applications.
> The foremost requirements of web applications are scalability, consistently
> good response time for geographically dispersed users, and high
> availability.
> At the same time, web applications can frequently tolerate
> relaxed consistency guarantees.
> For example, if a user changes an avatar ... little harm is done if the new
> avatar is not initially visible to one friend .... It is often acceptable to
> read (slightly) stale data, but occasionally stronger guarantees are
> required by applications.
> PNUTS provides a consistency model that is between the two extremes of
> general serializability and eventual consistency ... We provide per-record
> timeline consistency: all replicas of a given record apply all updates to
> the record in the same order .... The application [can] indicate cases where
> it can do with some relaxed consistency for higher performance .... [such as
> reading] a possibly stale version of the record.

Some interesting commentary from http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html

..................

When reading the paper, a couple things about PNUTS struck me as surprising:

First, the system is layered on top of the guarantees of a reliable pub-sub message broker which acts "both as our replacement for a redo log and our replication mechanism." I have to wonder if the choice to not build these pieces of the database themselves could lead to missed opportunities for improving performance and efficiency.

Second, as figures 3 and 4 show, the average latency of requests to their database seems quite high, roughly 100 ms. This is high enough that web applications probably would incur too much total latency if they made a few requests serially (e.g. ask for some data, then, depending on what the data looks like, ask for some other data). That seems like a problem.

Please see also my August 2006 post, "Google Bigtable paper" <http://glinden.blogspot.com/2006/08/google-bigtable-paper.html>, which discusses the distributed database behind many products at Google.
Please see also my earlier post, "Highly available distributed hash store at Amazon" <http://glinden.blogspot.com/2007/10/highly-available-distributed-hash.html>, on the distributed database behind some features at Amazon.com.

Please see also my earlier posts, "Cassandra data store at Facebook" <http://glinden.blogspot.com/2008/08/cassandra-data-store-at-facebook.html> and "HBase: A Google Bigtable clone" <http://glinden.blogspot.com/2007/07/hbase-google-bigtable-clone.html>.

Update: One of the developers of PNUTS commented <http://glinden.blogspot.com/2009/02/details-on-yahoos-distributed-database.html?showComment=1233884340000#c1254841206330803677> on this post, pointing out that PNUTS performance is much better in practice (1-10ms/request) when caching layers are in place and making a few comparisons to Bigtable.

..................

-- Niels http://nielsmayer.com