I have only been using Spark through the SQL front-end (CLI or JDBC). I don't
think I have access to saveAsParquetFile from there, do I?
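For reference, saveAsParquetFile is a method on SchemaRDD in Spark 1.1, so it is reachable from spark-shell (or a compiled job) rather than from the spark-sql CLI or the JDBC server. A minimal sketch, assuming the Hive metastore tables are visible through a HiveContext; the table name csv_table and the output path are placeholders, not anything from this thread:

  // Sketch (Spark 1.1, spark-shell): saveAsParquetFile lives on SchemaRDD,
  // i.e. on the programmatic API rather than the SQL front-end.
  // "csv_table" and the output path below are hypothetical placeholders.
  import org.apache.spark.sql.hive.HiveContext

  val hiveContext = new HiveContext(sc)                    // sc is the shell's SparkContext
  val rows = hiveContext.sql("SELECT * FROM csv_table")    // SchemaRDD backed by the Hive table
  rows.saveAsParquetFile("hdfs:///tmp/csv_table_parquet")  // writes Parquet part files to HDFS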
I am trying to load data from CSV format into Parquet using Spark SQL.
It consistently runs out of memory.
The environment is:
* standalone cluster using HDFS and the Hive metastore from HDP 2.0
* Spark 1.1.0
* Parquet jar files (v1.5) explicitly added when starting spark-sql.
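The actual statements aren't shown in this excerpt, but a CSV-to-Parquet load in this kind of setup usually takes roughly the shape below. This is only a hypothetical sketch (table names, columns, and locations are made up), written against HiveContext from spark-shell for concreteness; the same HiveQL can be run from the spark-sql CLI. The Parquet DDL assumes the parquet-hive SerDe classes provided by the separately added Parquet 1.5 jars, since the Hive 0.12 that ships with HDP 2.0 has no STORED AS PARQUET shorthand.

  // Hypothetical sketch of a CSV -> Parquet load with Spark SQL 1.1 and a Hive
  // metastore; names, columns, and paths are placeholders, not taken from the thread.
  import org.apache.spark.sql.hive.HiveContext

  val hiveContext = new HiveContext(sc)

  // External table over the raw CSV files already sitting in HDFS.
  hiveContext.sql("""
    CREATE EXTERNAL TABLE IF NOT EXISTS events_csv (id INT, ts STRING, payload STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    STORED AS TEXTFILE
    LOCATION 'hdfs:///data/events_csv'""")

  // Parquet-backed table declared through the parquet-hive SerDe classes
  // (assumed to come from the explicitly added Parquet 1.5 jars).
  hiveContext.sql("""
    CREATE TABLE IF NOT EXISTS events_parquet (id INT, ts STRING, payload STRING)
    ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
    STORED AS
      INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
      OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'""")

  // The large insert that, per the report above, consistently runs out of memory.
  hiveContext.sql(
    "INSERT OVERWRITE TABLE events_parquet SELECT id, ts, payload FROM events_csv")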