When using Spark on Mesos and deploying a job in cluster mode through the
dispatcher, there appears to be no memory overhead configuration for the
launched driver process ("--driver-memory" sets the JVM Xmx, which is also
the Mesos memory quota). This makes it almost a guarantee that a long
running driver will eventually be killed by Mesos, since the JVM needs
off-heap memory beyond Xmx and the quota leaves no headroom for it.
It seems like this is a real issue, so I've opened a JIRA ticket:
https://issues.apache.org/jira/browse/SPARK-17928
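For comparison, the executor side on Mesos does expose an overhead setting
(spark.mesos.executor.memoryOverhead); a minimal sketch of that setting,
assuming, as described above, that no driver-side analogue exists:

    import org.apache.spark.sql.SparkSession

    // spark.mesos.executor.memoryOverhead gives executors headroom above
    // the JVM heap; --driver-memory alone fixes both the driver's Xmx and
    // its Mesos quota, with no equivalent overhead knob.
    val spark = SparkSession.builder()
      .appName("mesos-overhead-example") // hypothetical app name
      .config("spark.mesos.executor.memoryOverhead", "512") // MiB of headroom
      .getOrCreate()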
When writing a DataFrame, a _SUCCESS file is created to mark that the entire
DataFrame has been written. However, the existence of this _SUCCESS file does
not seem to be validated by default on reads, which in some cases would allow
a partially written DataFrame to be read back. Is this behavior configurable?
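As far as I can tell, the write side of the marker is controlled by
mapreduce.fileoutputcommitter.marksuccessfuljobs, but I don't know of a
read-side switch, so in the meantime the check can be done by hand. A minimal
sketch using the Hadoop FileSystem API, assuming a SparkSession named spark
and a hypothetical Parquet path:

    import org.apache.hadoop.fs.Path

    // Hypothetical output directory written by a previous job.
    val dir = new Path("hdfs:///data/my_table")
    val fs  = dir.getFileSystem(spark.sparkContext.hadoopConfiguration)

    // Only read the data back if the commit marker is present.
    if (fs.exists(new Path(dir, "_SUCCESS"))) {
      spark.read.parquet(dir.toString).show()
    } else {
      sys.error(s"$dir looks partially written: no _SUCCESS marker")
    }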
I'm trying to figure out how to read multiple tables from a single external
source directly in Spark SQL. Say I do the following in Spark SQL:
CREATE OR REPLACE TEMPORARY VIEW t1 USING jdbc OPTIONS (dbtable 't1', ...)
CREATE OR REPLACE TEMPORARY VIEW t2 USING jdbc OPTIONS (dbtable 't2', ...)
SELECT * FROM t1 JOIN t2 ...
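For what it's worth, the same setup can be sketched through the DataFrame
reader; the url, credentials, and join key below are placeholders, not
values from the original question:

    // Register each JDBC table as a temp view, then query across them.
    val url = "jdbc:postgresql://host:5432/db" // hypothetical connection

    def registerJdbcView(table: String): Unit =
      spark.read.format("jdbc")
        .option("url", url)
        .option("dbtable", table)
        .load()
        .createOrReplaceTempView(table)

    registerJdbcView("t1")
    registerJdbcView("t2")
    spark.sql("SELECT * FROM t1 JOIN t2 ON t1.id = t2.id").show()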