posting my question again :)
Thanks for the pointer. Looking at the description from the site below, it
looks like the block size in Spark is not fixed: it is determined by the
block interval, and in fact, within the same batch you could have blocks of
different sizes. Did I get that right?
-
Hi Mohit,
please make sure you use the "Reply to all" button and include the mailing
list, otherwise only I will get your message ;)
Regarding your question:
Yes, that's also my understanding. You can partition streaming RDDs only by
time intervals, not by size. So depending on your incoming rate, the blocks
(and therefore the partitions of each batch RDD) will end up with different
sizes.
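
To make that concrete, here is a minimal sketch in Scala (the socket source,
host/port, and interval values are assumptions for illustration, not from
this thread). Each receiver cuts incoming data into blocks every
spark.streaming.blockInterval, and each block becomes one partition of the
batch RDD, so a batch has roughly batchInterval / blockInterval partitions
whose sizes depend on the arrival rate:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object BlockIntervalSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("BlockIntervalSketch")
      // Receivers cut incoming data into blocks at this interval
      // (default 200 ms); each block becomes one partition of the batch RDD.
      .set("spark.streaming.blockInterval", "200ms")

    // 2 s batches with 200 ms blocks -> roughly 2000 / 200 = 10 partitions
    // per batch; each partition holds whatever arrived during its 200 ms
    // slice, so partition sizes vary with the incoming rate.
    val ssc = new StreamingContext(conf, Seconds(2))

    val lines = ssc.socketTextStream("localhost", 9999)
    lines.foreachRDD(rdd => println(s"partitions in this batch: ${rdd.getNumPartitions}"))

    ssc.start()
    ssc.awaitTermination()
  }
}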
Hi Mohit,
it also depends on what the source for your streaming application is.
If you use Kafka, you can easily partition topics and have multiple
receivers on different machines.
If you have something like an HTTP or socket stream, you probably can't do
that; the Spark RDDs generated by your receiver will then all be created on
the machine where that single receiver runs.
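
For the Kafka case, a sketch of the multi-receiver pattern might look like
the following (the ZooKeeper quorum, group id, topic name, and receiver
count below are assumptions for illustration). Each createStream call starts
its own receiver on some executor, and the per-receiver streams are unioned
back into one DStream:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object MultiReceiverSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MultiReceiverSketch")
    val ssc = new StreamingContext(conf, Seconds(2))

    val zkQuorum = "zk1:2181"           // assumption: your ZooKeeper quorum
    val groupId  = "my-consumer-group"  // assumption: your consumer group
    val topics   = Map("events" -> 2)   // assumed topic -> threads per receiver

    // One receiver per createStream call; each receiver occupies a core on
    // some executor, so the receiving work is spread across the cluster.
    val numReceivers = 4
    val streams = (1 to numReceivers).map { _ =>
      KafkaUtils.createStream(ssc, zkQuorum, groupId, topics)
    }

    // Union the per-receiver streams back into a single DStream.
    val unified = ssc.union(streams)
    unified.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}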
1. If you are consuming data from Kafka or any other receiver-based source,
then you can start 1-2 receivers per worker (assuming you'll have a minimum
of 4 cores per worker).
2. If you have a single receiver, or a fileStream, then what you can do to
distribute the data across machines is a repartition of each batch right
after receiving it (see the sketch below).
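
A sketch of that repartition approach, assuming a single socket receiver
(the host, port, and partition count of 8 are illustrative assumptions):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object RepartitionSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("RepartitionSketch")
    val ssc = new StreamingContext(conf, Seconds(2))

    // A single socket receiver: all blocks land on one executor.
    val lines = ssc.socketTextStream("localhost", 9999)

    // repartition() shuffles each batch RDD across the cluster, so the
    // downstream processing is spread over all machines despite the single
    // receiver. The count of 8 is assumed; tune it to your cluster.
    val distributed = lines.repartition(8)

    distributed.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}

The shuffle that repartition introduces has a cost, but it lets every
machine share the per-batch processing even though ingestion happens at a
single point.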