To get started on this, one can download the Hortonworks Sandbox: http://hortonworks.com/products/hortonworks-sandbox/
All of the Hadoop stuff is in there, including YARN, ZooKeeper, etc.

I'll start with a YARN app that runs one Pharo node on the cluster and move on from there.
Once that is done, more nodes.
Then REST callbacks.
At some point, Pharo deployed in a Docker container.

Here is how to write a YARN application:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html

Phil

On Thu, Apr 30, 2015 at 9:32 AM, Marcus Denker <marcus.den...@inria.fr> wrote:

> Definitely interesting!
>
> On 29 Apr 2015, at 14:57, p...@highoctane.be wrote:
>
> I am involved in some Hadoop deployments and there is a very interesting
> possibility for Pharo in that ecosystem.
>
> Namely, there is a YARN thing in there, which is a scheduler for
> distributing computation across a cluster of nodes.
>
> It is possible to deploy all kinds of technologies on the nodes (e.g.
> Python, R, Java), and Pharo images and VMs (in headless mode) could be
> deployed as well.
>
> The deployed node can communicate back to what is called an
> ApplicationManager via REST callbacks (easy game in Pharo). There is also
> a C API (now, this is FFI or a plugin -
> http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html)
>
> There is also a Hadoop component named ZooKeeper that focuses on acting
> as a distributed configuration repository.
>
> One can talk to it with REST too (
> https://github.com/apache/zookeeper/tree/trunk/src/contrib/rest)
>
> Given that we can also make some Java calls (using the JNI module
> with 32-bit Java), we can integrate well enough on YARN, I'd say.
>
> There is also another very nice project: SLIDER (on YARN).
> This is about deploying stuff in an elastic way (see
> http://slider.incubator.apache.org/)
>
> The next logical thing is to have Docker containers (containing a Pharo
> stack) deployed dynamically on the cluster using Slider (like this:
> http://www.slideshare.net/hortonworks/docker-on-slider-45493303)
>
> The first step here would be to have a basic YARN-Pharo application and a
> PoC for talking to ZooKeeper.
>
> This would open interesting doors for Pharo given its strengths.
> Even more so once we get a 64-bit VM.
>
> What is cool with Pharo is that an image can be very small and
> self-contained, vs. Java applications (which have tons of JAR files
> attached).
>
> Access to the data on HDFS can happen through NFSv3, so we can
> go that route.
> There is also a REST API for it (
> https://hadoop.apache.org/docs/r1.0.4/webhdfs.html)
>
> Tell me what you think!
>
> Phil
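
As a rough illustration of the WebHDFS route mentioned in the quoted message, here is a minimal Python sketch that only builds the request URLs (it does not contact a cluster). The URL layout (/webhdfs/v1/<path>?op=...) and the OPEN and LISTSTATUS operations come from the WebHDFS documentation linked above. The host name, the paths, the user name and the helper name webhdfs_url are made up for the example, and port 50070 is assumed to be the NameNode HTTP port of the install at hand.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, path, op, port=50070, user=None):
    """Build a WebHDFS REST URL for an absolute HDFS path.

    Hypothetical helper for illustration only; host, port and user
    are assumptions about the cluster, not values from the thread.
    """
    if not path.startswith("/"):
        raise ValueError("HDFS path must be absolute")
    params = {"op": op}
    if user is not None:
        # WebHDFS accepts the caller's name as the user.name query parameter
        params["user.name"] = user
    # Percent-encode the path, keeping "/" separators intact
    return "http://{}:{}/webhdfs/v1{}?{}".format(
        host, port, quote(path), urlencode(params)
    )


if __name__ == "__main__":
    # Read a file (OPEN) and list a directory (LISTSTATUS)
    print(webhdfs_url("sandbox.example", "/user/pharo/data.csv", "OPEN"))
    print(webhdfs_url("sandbox.example", "/user/pharo", "LISTSTATUS", user="pharo"))
```

A headless Pharo image could issue a GET on the same kind of URL with its own HTTP client; the Python here is only meant to show the shape of the request.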