Another way is to configure S3 as Tachyon's under storage system, and then
run Spark on Tachyon.
More info: http://tachyon-project.org/Setup-UFS.html
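For reference, the under storage setup is just a couple of environment settings. A minimal sketch, assuming a Tachyon 0.6-era conf/tachyon-env.sh layout; the bucket name and credentials are placeholders:

```shell
# conf/tachyon-env.sh -- point Tachyon's under storage system at S3.
# "my-bucket" is a placeholder; s3n:// is the scheme Tachyon's bundled
# Hadoop client understands.
export TACHYON_UNDERFS_ADDRESS=s3n://my-bucket/tachyon

# Pass S3 credentials through the standard Hadoop property names.
export TACHYON_JAVA_OPTS="
  -Dfs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY
  -Dfs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY
"
```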
Best,
Haoyuan
On Wed, May 13, 2015 at 10:52 AM, Stephen Carman
wrote:
Thank you for the suggestions; the problem is that we need to initialize the
VFS S3 driver, so what you suggested, Akhil, wouldn't fix the problem.
Basically, a job is submitted to the cluster and it tries to pull down the data
from S3, but fails because the S3 URI hasn't been initialized.
Might I ask why VFS? I'm new to VFS and not sure whether or not it predates
the Hadoop file system interfaces (HCFS).
After all, Spark natively supports any HCFS by leveraging the Hadoop
FileSystem API, class loaders, and so on.
So simply putting those resources on your classpath should be sufficient.
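As a sketch of that classpath approach (the jar paths, master URL, bucket, and script name are placeholders, and the exact jars depend on your Hadoop build):

```shell
# Ship the S3 filesystem implementation with the job so every Mesos
# slave can resolve s3n:// URIs through the Hadoop FileSystem API.
spark-submit \
  --master mesos://mesos-master:5050 \
  --jars /opt/jars/hadoop-aws.jar,/opt/jars/aws-java-sdk.jar \
  --conf spark.hadoop.fs.s3n.awsAccessKeyId=YOUR_ACCESS_KEY \
  --conf spark.hadoop.fs.s3n.awsSecretAccessKey=YOUR_SECRET_KEY \
  my_job.py

# Inside my_job.py, the read is just the ordinary Spark call:
#   sc.textFile("s3n://my-bucket/input/")
```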
Did you happen to have a look at this? https://github.com/abashev/vfs-s3
Thanks
Best Regards
On Tue, May 12, 2015 at 11:33 PM, Stephen Carman
wrote:
We have a small Mesos cluster, and these slaves need to have a VFS set up on them
so that the slaves can pull down the data they need from S3 when Spark runs.
There doesn't seem to be any obvious way online on how to do this or how to
easily accomplish this. Does anyone have some best practices or suggestions?