I have only been using Spark through the SQL front-end (CLI or JDBC). I don't
think I have access to saveAsParquetFile from there, do I?
I would hope that this kind of workflow works.
I'm curious whether you have tried using saveAsParquetFile instead of inserting
directly into a Hive table (you could still register the result as an external
table afterwards). Right now, inserting into Hive tables goes through their
SerDe path, which is likely more memory-intensive than writing Parquet natively.
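For reference, a rough, untested sketch of that approach from the spark-shell
(assuming Spark 1.1.0, that sc is the SparkContext, and placeholder paths,
schema, and table names):

  import org.apache.spark.sql.hive.HiveContext

  // hypothetical record schema for the CSV rows
  case class Event(id: Int, name: String)

  val hiveContext = new HiveContext(sc)
  import hiveContext.createSchemaRDD  // implicit RDD[Product] -> SchemaRDD

  // parse the CSV into a SchemaRDD and write Parquet directly,
  // bypassing the Hive insert path
  val events = sc.textFile("hdfs:///user/me/events.csv")
    .map(_.split(","))
    .map(f => Event(f(0).toInt, f(1)))
  events.saveAsParquetFile("hdfs:///user/me/events_parquet")

  // then register the directory as an external table so the SQL CLI /
  // JDBC server can query it; the exact SerDe and input/output format
  // class names depend on the parquet-hive jars on your classpath
  hiveContext.sql("""
    CREATE EXTERNAL TABLE events_parquet (id INT, name STRING)
    ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
    STORED AS
      INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
      OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'
    LOCATION 'hdfs:///user/me/events_parquet'
  """)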
I am trying to load data from CSV format into Parquet using Spark SQL
(the statements involved are sketched below the environment list).
It consistently runs out of memory.
The environment is:
* standalone cluster using HDFS and the Hive metastore from HDP 2.0
* Spark 1.1.0
* Parquet jar files (v1.5) explicitly added when starting spark-sql
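The statements I run in spark-sql look roughly like the following sketch
(table names, columns, and paths are placeholders):

  -- external table over the raw CSV files
  CREATE EXTERNAL TABLE events_csv (id INT, name STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  LOCATION 'hdfs:///user/me/events_csv';

  -- Parquet-backed target table, using the SerDe and format classes
  -- from the parquet-hive jars added at startup
  CREATE TABLE events_parquet (id INT, name STRING)
  ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
  STORED AS
    INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
    OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat';

  -- this is the insert that runs out of memory
  INSERT OVERWRITE TABLE events_parquet SELECT * FROM events_csv;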