Make sure you are setting the number of executors (--num-executors) correctly.
M
> On Jul 17, 2015, at 9:16 PM, Charles Menguy wrote:
>
> I am trying to use PySpark on EMR to analyze some data stored as
> SequenceFiles on S3, but running into performance issues due to data
> locality. Here is a very simple sample that
Try setting spark.driver.host to the actual IP or hostname of the box
submitting the work. More info is in the networking section at this link:
http://spark.apache.org/docs/latest/configuration.html
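For what it's worth, here is a minimal sketch of picking that address in Python before building the config. The helper name guess_driver_host and the probe address are assumptions of this sketch, not anything Spark provides; only the spark.driver.host key comes from the configuration docs. The UDP connect sends no packets, it just asks the OS which local interface it would route through.

```python
import socket


def guess_driver_host(probe_addr=("10.255.255.255", 1)):
    """Best-effort guess at the IPv4 address other machines can reach us on.

    Hypothetical helper: probe_addr is an arbitrary address used only to
    make the OS choose an outbound interface; nothing is actually sent.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.connect(probe_addr)
            return s.getsockname()[0]
    except OSError:
        pass  # no usable route; fall through to hostname resolution
    try:
        return socket.gethostbyname(socket.gethostname())
    except OSError:
        return "127.0.0.1"  # last resort, only reachable locally


# Settings to pass along when creating the SparkContext.
driver_conf = {"spark.driver.host": guess_driver_host()}
print(driver_conf)
```

On a multi-homed box or behind NAT the guess can be wrong, so it is still worth confirming the value the executors actually see.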
Also check the Spark config for your application for these driver settings in
the application web UI a
Jorn: Vertica
Cody: I posited the limit just as an example of how JdbcRDD could be used least
invasively. Let's say we used a partition on a time field -- we would still
need to have N executions of those queries. The queries we have are very
intense and concurrency is an issue even if the
> I assume it's not viable to throw the query results into another table in
> your database and then query that using the normal approach?
>
> --eric
>
>> On 3/1/15 4:28 AM, michal.klo...@gmail.com wrote:
>> Jorn: Vertica
>>
>> Cody: I posited th
A SparkContext can submit jobs remotely.
The spark-submit options in general can be populated into a SparkConf and
passed in when you create a SparkContext.
We personally have not had too much success with yarn-client remote submission,
but standalone cluster mode was easy to get going.
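A rough sketch of what that can look like from Python, assuming a standalone master. The flag-to-key table covers only a few common spark-submit options, and the helper names, master host, and app name are placeholders made up for this example. build_context is defined but not called, since it needs pyspark installed and a reachable cluster.

```python
# A few common spark-submit flags and the SparkConf keys they correspond to.
SUBMIT_FLAG_TO_CONF_KEY = {
    "--master": "spark.master",
    "--name": "spark.app.name",
    "--driver-memory": "spark.driver.memory",
    "--executor-memory": "spark.executor.memory",
    "--executor-cores": "spark.executor.cores",
    "--num-executors": "spark.executor.instances",
}


def submit_flags_to_conf(flags):
    """Translate {flag: value} into {conf_key: value}; unknown flags raise KeyError."""
    conf = {}
    for flag, value in flags.items():
        if flag not in SUBMIT_FLAG_TO_CONF_KEY:
            raise KeyError("no known SparkConf key for %s" % flag)
        conf[SUBMIT_FLAG_TO_CONF_KEY[flag]] = str(value)
    return conf


def build_context(settings):
    """Create a SparkContext from the settings (requires pyspark and a live master)."""
    from pyspark import SparkConf, SparkContext

    conf = SparkConf().setAll(list(settings.items()))
    return SparkContext(conf=conf)


settings = submit_flags_to_conf({
    "--master": "spark://master-host:7077",  # placeholder host
    "--name": "remote-submit-example",       # placeholder app name
    "--executor-memory": "4g",
})
print(settings)
# sc = build_context(settings)  # run this against a reachable cluster
```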
M
Not sure if there's a Spark-native way, but we've been using Consul for this.
M
> On Apr 26, 2015, at 5:17 AM, James King wrote:
>
> Thanks for the response.
>
> But no this does not answer the question.
>
> The question was: Is there a way (via some API call) to query the number and
> type
According to the docs it should go like this:
spark://host1:port1,host2:port2
https://spark.apache.org/docs/latest/spark-standalone.html#standby-masters-with-zookeeper
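If you build that string in code, a small sketch like this keeps the format straight. The helper name is made up for the example, and host1/host2 with port 7077 are placeholders for your own masters.

```python
def standalone_master_url(masters):
    """Join (host, port) pairs into one spark:// URL that lists every master."""
    if not masters:
        raise ValueError("need at least one master")
    return "spark://" + ",".join("%s:%d" % (host, port) for host, port in masters)


url = standalone_master_url([("host1", 7077), ("host2", 7077)])
print(url)  # spark://host1:7077,host2:7077
```

The resulting value is what goes into --master or spark.master.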
Thanks
M
> On Apr 28, 2015, at 8:13 AM, James King wrote:
>
> I have multiple masters running and I'm trying to submit an ap
I've been querying ZooKeeper directly via the ZooKeeper client tools; it has
the IP of the current master leader in the master_status data. We are also
running Exhibitor for ZooKeeper, which has a nice UI for exploring if you want
to look it up manually.
Thanks,
Michal
> On May 12, 2015, at 1:28