Hi,
I'm reading in a CSV file, and I would like to write it back as a permanent
table, but with partitioning by year, etc.
Currently I do this:
from pyspark.sql import HiveContext
sqlContext = HiveContext(sc)
df = sqlContext.read.format('com.databricks.spark.csv') \
    .options(header='true', inferschema='true') \
    .load('/path/to/file.csv')  # placeholder path
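The write step I'm after would presumably be something like the sketch below
(assuming Spark 1.4+, where DataFrameWriter gained partitionBy; the table name
is a placeholder):

# Sketch: write df back as a managed, partitioned Parquet table.
# Assumes df has a 'year' column; 'my_table' is a placeholder name.
df.write \
    .format('parquet') \
    .partitionBy('year') \
    .saveAsTable('my_table')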
> From: denny.g@gmail.com
> Date: Wed, 1 Apr 2015 04:35:08 +0000
> Subject: Creating Partitioned Parquet Tables via SparkSQL
> To: user@spark.apache.org
>
> Creating Parquet tables via .saveAsTable is great, but I was wondering if
> there was an equivalent way to create partitioned Parquet tables.
>
> Thanks!
This is tracked by these JIRAs:
https://issues.apache.org/jira/browse/SPARK-5947
https://issues.apache.org/jira/browse/SPARK-5948
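For what it's worth, once those land the data sources API should support
partitioned writes directly; a minimal sketch (Spark 1.4+, output path is a
placeholder):

# Minimal sketch of a partitioned Parquet write via the data sources API.
# Assumes Spark 1.4+ DataFrameWriter; '/path/to/output' is a placeholder.
df.write.partitionBy('year').parquet('/path/to/output')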