Hello Rachana,
The easiest way would be to start by creating a 'parent' JavaRDD and then run
different filters (based on the different input arguments) to create the
respective 'child' JavaRDDs dynamically.
Notice that the creation of these child RDDs is handled by the
application driver.
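For illustration, here is a rough sketch with made-up data (the sample lines, keywords, and class name only stand in for your real input and arguments):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class DynamicChildRdds {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("child-rdds").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // The 'parent' RDD, created once by the driver.
        JavaRDD<String> parent = sc.parallelize(
                Arrays.asList("error foo", "warn bar", "error baz"));

        // Stand-ins for your input arguments.
        String[] keywords = {"error", "warn"};

        // The driver derives one 'child' RDD per argument by applying a filter.
        List<JavaRDD<String>> children = new ArrayList<>();
        for (String keyword : keywords) {
            children.add(parent.filter(line -> line.contains(keyword)));
        }

        // The children are lazy; an action such as count() triggers computation.
        children.forEach(child -> System.out.println(child.count()));

        sc.stop();
    }
}

Since each child is defined only through transformations, caching the parent (parent.cache()) can help if many children are evaluated.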
Hope this helps!
> referring to both fine-grained and coarse-grained?
>
> A desirable number of executors per node could be interesting, but it can't
> be guaranteed (or we could try to, and abort the job when that fails).
>
> How would you imagine this new option actually working?
>
>
> Tim
>
Hi Tim,
An option like spark.mesos.executor.max to cap the number of executors per
node/application would be very useful. However, having an option like
spark.mesos.executor.num
to specify the desired number of executors per node would provide even
better control.
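Just to illustrate the idea, here is a sketch of how an application could set such options if they existed. Note that spark.mesos.executor.max and spark.mesos.executor.num are only the names proposed in this thread, not existing Spark settings; spark.cores.max, on the other hand, is an existing setting that caps the total cores of an application:

import org.apache.spark.SparkConf;

public class MesosExecutorConfSketch {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("mesos-executor-cap")
                // Existing setting: cap the total cores used by this application.
                .set("spark.cores.max", "16")
                // Hypothetical settings proposed in this thread:
                .set("spark.mesos.executor.max", "2")   // cap on executors per node
                .set("spark.mesos.executor.num", "1");  // desired executors per node
        System.out.println(conf.toDebugString());
    }
}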
Thanks,
Ajay
On Wed, Aug 12
Hi Sujit,
From experimenting with Spark (and from the documentation), my understanding
is as follows (a short code sketch follows the list):
1. Each application consists of one or more Jobs
2. Each Job has one or more Stages
3. Each Stage creates one or more Tasks (normally, one Task per
Partition)
4. Master
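To make 1-3 concrete, here is a small sketch (the data and partition count are made up): the action at the end submits one Job, the shuffle introduced by reduceByKey splits that Job into two Stages, and each Stage runs one Task per partition.

import java.util.Arrays;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class JobsStagesTasks {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("jobs-stages-tasks").setMaster("local[*]");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // 4 partitions => 4 tasks in each stage.
        JavaRDD<String> words = sc.parallelize(
                Arrays.asList("a", "b", "a", "c", "b", "a"), 4);

        // Narrow transformation: stays within the same stage.
        JavaPairRDD<String, Integer> pairs = words.mapToPair(w -> new Tuple2<>(w, 1));

        // Wide transformation: the shuffle marks a stage boundary.
        JavaPairRDD<String, Integer> counts = pairs.reduceByKey((a, b) -> a + b);

        // Action: submits one job (shown as Job 0 with two stages in the web UI).
        System.out.println(counts.collectAsMap());

        sc.stop();
    }
}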
> The other option is to define a function to find an open port and
> use that.
>
>
> Thanks
>
> Joji John
>
>
> --
> *From:* Ajay Singal
> *Sent:* Friday, July 24, 2015 6:59 AM
> *To:* Joji John
> *Cc:* user@spark.apache.org
> *Subj
Hi Chintan,
This is more of an Oracle VirtualBox virtualization issue than a Spark issue.
VT-x is hardware-assisted virtualization, and it is required by Oracle
VirtualBox for all 64-bit guests. The error message indicates that
either your processor does not support VT-x (but your VM is configured
to use it), or VT-x has been disabled in your machine's BIOS/UEFI settings.
Hi Joji,
I guess there is no hard limit on the number of Spark applications running in
parallel. However, you need to ensure that the applications do not try to use
the same (e.g., default) port numbers.
In your specific case, for example, if you try using the default SparkUI port
"4040" for more than one application at the same time, only the first
application can bind to it; Spark will fall back to the next free port
(4041, 4042, ...), but it is cleaner to assign the ports explicitly, as
sketched below.
Greetings,
We have an analytics workflow system in production. This system is built in
Java and utilizes other services (including Apache Solr). It works fine
with a moderate level of data/processing load. However, when the load goes
beyond a certain limit (e.g., more than 10 million messages/documents