> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Control-number-of-parquet-generated-from-JavaSchemaRDD-tp19717p19789.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
-
To unsubscribe, e-mail: user-unsu
Ohh... how could I miss that? :( Thanks!
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Control-number-of-parquet-generated-from-JavaSchemaRDD-tp19717p19788.html
Sent from the Apache Spark User List mailing list archive at Nabble.com
> true); //tried with false also. Tried
> repartition(1) too.
>
> claimSchemaRdd.saveAsParquetFile(parquetPath);
> }
with false also. Tried
repartition(1) too.
claimSchemaRdd.saveAsParquetFile(parquetPath);
}
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Control-number-of-parquet-generated-from-JavaSchemaRDD-tp19717p19776.html
filter(new NullFilter());
JavaSchemaRDD claimSchemaRdd = sqlCtx.applySchema(claimRdd,
Claim.class);
claimSchemaRdd.coalesce(1)
claimSchemaRdd.saveAsParquetFile(parquetPath);
}
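
In the snippet above, the value returned by coalesce(1) is not assigned to anything, and the original claimSchemaRdd is what gets saved. Spark RDDs are immutable, so a transformation such as coalesce returns a new RDD and leaves the one it was called on unchanged. The following is a minimal plain-Java sketch of that pitfall, not Spark code: PartitionedData, its coalesce and save methods, and the "out" path are hypothetical stand-ins, and the sketch assumes the writer emits one output file per partition.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an immutable, partitioned dataset (NOT a Spark class). */
final class PartitionedData {
    private final int numPartitions;

    PartitionedData(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    /** Returns a NEW dataset with at most targetPartitions partitions; 'this' is unchanged. */
    PartitionedData coalesce(int targetPartitions) {
        return new PartitionedData(Math.min(targetPartitions, numPartitions));
    }

    /** Assumption for this model: the writer emits one file per partition. */
    List<String> save(String path) {
        List<String> files = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            files.add(String.format("%s/part-%05d", path, i));
        }
        return files;
    }
}

class CoalesceDemo {
    public static void main(String[] args) {
        PartitionedData claims = new PartitionedData(1000);

        // Pitfall: the coalesced dataset is discarded, so 'claims' keeps 1000 partitions.
        claims.coalesce(1);
        System.out.println(claims.save("out").size()); // prints 1000

        // Fix: keep the returned dataset and save that one instead.
        PartitionedData single = claims.coalesce(1);
        System.out.println(single.save("out").size()); // prints 1
    }
}
```

Applied to the code in this thread, the same idea would mean calling saveAsParquetFile(parquetPath) on the RDD returned by coalesce rather than on claimSchemaRdd itself.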
> sc.hadoopConfiguration().setInt("parquet.block.size", MB_128);
>
> No luck.
> Is there a way to control the size/number of parquet files generated?
>
> Thanks
> Tridib
-
From: tridib [mailto:tridib.sama...@live.com]
Sent: Tuesday, November 25, 2014 9:54 AM
To: u...@spark.incubator.apache.org
Subject: Control number of parquet generated from JavaSchemaRDD
Hello,
I am reading around 1000 input files from disk in an RDD and generating
parquet. It always produces
user-list.1001560.n3.nabble.com/Control-number-of-parquet-generated-from-JavaSchemaRDD-tp19717.html