Thanks Steve,
For the NN it all depends on how fast you want start-up to be. 1GB of NameNode
memory accommodates around 42TB, so if you are talking about 100GB of NN
memory then SSD may make sense to speed up the start-up. RAID 10 is the
best one can get, assuming all internal disks.
In general it is
On 11 Mar 2016, at 16:25, Mich Talebzadeh
<mich.talebza...@gmail.com> wrote:
Thank you for the info Steve.
I have always believed (IMO) that there is an optimal position where one can
plot the projected NN memory (assuming 1GB --> 40TB of data) against the number
of nodes. For example, heuristically, how many nodes would be sufficient for
1PB of storage with nodes each having 512GB of memory?
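As a back-of-the-envelope sketch of that heuristic in Python (the per-node
disk capacity below is purely an assumed figure for illustration, not
something from this thread):

# NN heap projection using the 1GB --> 40TB rule of thumb
data_tb = 1024                     # 1PB of logical data, in TB
nn_heap_gb = data_tb / 40          # ~26GB of NameNode heap
print(f"Projected NN heap: {nn_heap_gb:.0f} GB")

# Node count depends entirely on per-node disk, so this part is hypothetical
disk_per_node_tb = 48              # assumption: e.g. 12 x 4TB disks per node
replication = 3
nodes_needed = data_tb * replication / disk_per_node_tb
print(f"Data nodes needed (ignoring overhead): {nodes_needed:.0f}")   # ~64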
Hi Steve,
My argument has always been that if one is going to use Solid State Disks
(SSD), it makes sense to use them for the NN disks, to speed up start-up from
the fsimage etc. Obviously the NN lives in memory. Would you also recommend
RAID10 (mirroring & striping) for NN disks?
Thanks
Dr Mich Talebzadeh
LinkedIn
On 10 Mar 2016, at 22:15, Ashok Kumar
<ashok34...@yahoo.com.invalid> wrote:
Hi,
We intend to use 5 servers which will be utilized for building a Big Data Hadoop
data warehouse system (not using any proprietary distribution like Hortonworks or
Cloudera or others).
I'd argue that life is i
Hi,
Bear in mind that you typically need 1GB of NameNode memory per 1 million
blocks. So with a 128MB block size, you can store 128 * 1E6 / (3 * 1024) =
41,666GB of data for every 1GB of NameNode memory. The 3 comes from the fact
that each block is replicated three times. In other words, just under 42TB of
data.
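For what it's worth, here is that arithmetic as a small Python sketch; the
figures are just the rule-of-thumb numbers quoted above, not measurements:

blocks_per_gb_heap = 1_000_000   # rule of thumb: ~1M blocks per 1GB of NN heap
block_size_mb = 128
replication = 3

# Logical (post-replication) data addressable per 1GB of NameNode heap
data_gb = blocks_per_gb_heap * block_size_mb / (replication * 1024)
print(f"{data_gb:,.0f} GB per 1GB of NN heap")   # ~41,667 GB, just under 42TB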
Ashok,
The cluster nodes have enough memory but relatively few CPU cores: 512GB / 16
cores = 32GB per core. Either there should be more cores available to use the
available memory efficiently, or don't configure a high executor memory, which
will cause a lot of GC.
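A rough Python sketch of that arithmetic (the 4-cores / 32GB executor split
below is only an assumed example, not a recommendation from this thread):

node_memory_gb = 512
node_cores = 16
print(f"{node_memory_gb / node_cores:.0f} GB per core")        # 32 GB/core

# Illustrative sizing: modest ~32GB heaps to limit GC pauses
cores_per_executor = 4                                         # assumed
executor_heap_gb = 32                                          # assumed
executors_per_node = node_cores // cores_per_executor          # 4
memory_used_gb = executors_per_node * executor_heap_gb         # 128
print(f"{memory_used_gb} GB of {node_memory_gb} GB used; "
      f"{node_memory_gb - memory_used_gb} GB would sit idle")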
Thanks,
Prabhu