Phil
I asked the head of engineering whether one of the people working
on databases might be interested
in a one-month project. I have no idea yet whether they will like it
or have the time.
Stef
On 29/4/15 14:57, p...@highoctane.be wrote:
I am involved in some Hadoop deployments and there is a very
interesting possibility for Pharo in that ecosystem.
Namely, there is YARN, a scheduler for distributing computation
across a cluster of nodes.
It is possible to deploy all kinds of technologies on the nodes (e.g.
Python, R, Java) and Pharo images and VM (in headless mode) could be
deployed as well.
The deployed node can communicate back to what is called an
ApplicationMaster via REST callbacks (easy game in Pharo). There is
also a C API (from Pharo, that means FFI or a plugin -
http://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html).
There is also a Hadoop component named ZooKeeper that acts as a
distributed configuration repository.
One can talk to it with REST too
(https://github.com/apache/zookeeper/tree/trunk/src/contrib/rest)
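To make the ZooKeeper REST idea concrete, here is a minimal sketch in
Python (the same handful of HTTP calls would be easy to port to Pharo).
It assumes the contrib REST gateway is running with what I believe are
its defaults, port 9998 and the /znodes/v1 prefix, and that it answers
with JSON. The names znode_url and read_znode and the znode path are
made up for illustration.

```python
import json
import urllib.request
from urllib.parse import quote

# Assumption: the contrib REST gateway listens on localhost:9998 and
# exposes znodes under /znodes/v1. Adjust to your own deployment.
ZK_REST_BASE = "http://localhost:9998/znodes/v1"


def znode_url(path, base=ZK_REST_BASE):
    """Build the REST URL for a znode path such as '/pharo/config'."""
    clean = path.strip("/")
    if not clean:
        return base
    return base + "/" + quote(clean)


def read_znode(path, base=ZK_REST_BASE, timeout=5):
    """GET a znode and return the decoded JSON body (needs a live gateway)."""
    request = urllib.request.Request(
        znode_url(path, base), headers={"Accept": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a gateway running, read_znode("/pharo/config") would fetch the
(hypothetical) znode where an image could look up its settings.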
Since we can also make some Java calls (using the JNI module with
32-bit Java), I'd say we can integrate well enough with YARN.
Another very nice project is Slider (on YARN), which is about
deploying applications in an elastic way (see
http://slider.incubator.apache.org/).
The next logical thing is to have Docker containers (containing a
Pharo stack) deployed dynamically on the cluster using Slider (like
this: http://www.slideshare.net/hortonworks/docker-on-slider-45493303).
First step here would be to have a basic YARN-Pharo application and a
PoC for talking to ZooKeeper.
This would open interesting doors for Pharo given its strengths,
even more so once we get a 64-bit VM.
What is cool with Pharo is that an image can be very small and
self-contained, unlike a Java application (which comes with tons of
JAR files attached).
Data on HDFS can be accessed through NFSv3, so we can go that
route.
There is also a REST API to it
(https://hadoop.apache.org/docs/r1.0.4/webhdfs.html)
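As a rough illustration of the WebHDFS route, the sketch below builds
request URLs of the form http://<host>:<port>/webhdfs/v1/<path>?op=...
described in the linked documentation. It is written in Python for
brevity. The host name, the file path, the user name and the helper
names (webhdfs_url, read_file) are assumptions for the example, and
50070 is only the usual NameNode HTTP port, which may differ on your
cluster.

```python
import urllib.request
from urllib.parse import quote, urlencode


def webhdfs_url(host, hdfs_path, op, port=50070, user=None):
    """Build a WebHDFS URL for an operation such as OPEN or LISTSTATUS."""
    params = {"op": op}
    if user is not None:
        # user.name is the query parameter WebHDFS uses when security is off.
        params["user.name"] = user
    path = quote(hdfs_path.lstrip("/"))
    return "http://%s:%d/webhdfs/v1/%s?%s" % (host, port, path, urlencode(params))


def read_file(host, hdfs_path, port=50070, user=None, timeout=10):
    """Fetch a file's bytes via op=OPEN (needs a reachable cluster).

    The NameNode answers with a redirect to a DataNode; urllib follows it.
    """
    url = webhdfs_url(host, hdfs_path, "OPEN", port=port, user=user)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

For example, webhdfs_url("namenode.example", "/data", "LISTSTATUS")
gives the URL for listing a directory. A Pharo image would only need
an HTTP client to do the same.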
Tell me what you think!
Phil