Hi Ben,
You can replace HDFS with a number of storage systems, since Spark is
compatible with other storage backends such as S3. That would let you scale
your compute nodes solely to add compute power rather than disk space. You can
deploy Alluxio on your compute nodes to offset the performance penalty of
reading from remote storage.
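For concreteness, a minimal sketch of what that looks like from the application
side, assuming Spark 2.x with the hadoop-aws and Alluxio client jars on the
classpath (the bucket name, Alluxio master host, and paths below are made up):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("s3-instead-of-hdfs")
  .getOrCreate()

// The DataFrame API is identical regardless of the backing filesystem;
// only the URI scheme in the path changes.
val fromS3      = spark.read.parquet("s3a://my-bucket/events/")
val fromAlluxio = spark.read.parquet("alluxio://alluxio-master:19998/events/")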
IIUC, Spark doesn't bind strongly to HDFS; it uses Hadoop's common FileSystem
layer, which supports different FS implementations, and HDFS is just one
option. You could also use S3 as the backend FS; from Spark's point of view
the choice of filesystem is transparent.
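To illustrate, here is a rough sketch using the Hadoop FileSystem API directly;
the implementation is picked from the URI scheme, so hdfs://, s3a:// and
file:// all go through the same interface (the bucket name and credential env
vars below are placeholders):

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// S3A credentials; the fs.s3a.* keys come from the hadoop-aws module.
val conf = new Configuration()
conf.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
conf.set("fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))

// The scheme in the URI selects the FileSystem implementation.
val fs = FileSystem.get(new URI("s3a://my-bucket/"), conf)
fs.listStatus(new Path("s3a://my-bucket/data/")).foreach(s => println(s.getPath))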
On Sun, Feb 12, 2017 at 5:32 PM, ayan guha wrote:
How about adding more NFS storage?
On Sun, 12 Feb 2017 at 8:14 pm, Sean Owen wrote:
> Data has to live somewhere -- how do you not add storage but store more
> data? Alluxio is not persistent storage, and S3 isn't on your premises.
>
> On Sun, Feb 12, 2017 at 4:29 AM Benjamin Kim wrote:
>
> Has anyone got some advice on how to remove the reliance on HDFS for storing
> persistent data?
Data has to live somewhere -- how do you not add storage but store more
data? Alluxio is not persistent storage, and S3 isn't on your premises.
On Sun, Feb 12, 2017 at 4:29 AM Benjamin Kim wrote:
> Has anyone got some advice on how to remove the reliance on HDFS for
> storing persistent data? We have an on-premise Spark cluster.
You have to carefully consider whether your strategy makes sense given your
users' workloads; as it stands, I am not sure your reasoning holds.
However, you can, for example, install OpenStack Swift as an object store and
use it as your storage layer. HDFS in this case can be used as a temporary
store and/or cache.
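As a rough sketch of that Swift route (assuming the hadoop-openstack connector
is on the classpath; the provider name "keystone", container "archive",
endpoint, and paths below are placeholders):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("swift-backend").getOrCreate()
val hconf = spark.sparkContext.hadoopConfiguration

// Auth settings for the swift:// connector (fs.swift.service.<provider>.*).
hconf.set("fs.swift.service.keystone.auth.url", "https://keystone.example.com:5000/v2.0/tokens")
hconf.set("fs.swift.service.keystone.username", sys.env("OS_USERNAME"))
hconf.set("fs.swift.service.keystone.password", sys.env("OS_PASSWORD"))
hconf.set("fs.swift.service.keystone.tenant", sys.env("OS_TENANT_NAME"))

// Cold data lives in Swift; HDFS keeps only hot or intermediate data.
val archived = spark.read.parquet("swift://archive.keystone/events/2016/")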
Has anyone got some advice on how to remove the reliance on HDFS for storing
persistent data? We have an on-premise Spark cluster. It seems like a waste of
resources to keep adding nodes only because of a lack of storage space. I would
rather add more powerful nodes to address the lack of processing power.