Partitioned Parquet tables

2016-04-01 Thread Imran Akbar
Hi, I'm reading in a CSV file, and I would like to write it back as a permanent table, but with partitioning by year, etc. Currently I do this:

    from pyspark.sql import HiveContext
    sqlContext = HiveContext(sc)
    df = sqlContext.read.format('com.databricks.spark.csv').options(header='true', infersche
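A minimal sketch of one way to do this, assuming the Spark 1.4+ DataFrameWriter API and a `year` column in the CSV; the input path and table name are placeholders, and `sc` is the SparkContext already provided by the shell:

```python
from pyspark.sql import HiveContext

sqlContext = HiveContext(sc)  # sc: existing SparkContext from the PySpark shell

# Read the CSV with the spark-csv data source, inferring column types
df = (sqlContext.read
      .format('com.databricks.spark.csv')
      .options(header='true', inferschema='true')
      .load('/path/to/data.csv'))  # placeholder path

# Write as a persistent Hive table, partitioned by the 'year' column;
# each distinct year becomes a year=<value>/ subdirectory of Parquet files.
(df.write
   .format('parquet')
   .partitionBy('year')
   .saveAsTable('my_table'))  # placeholder table name
```

This requires a live Spark/Hive environment, so it is a sketch of the call shape rather than a standalone script.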

Re: Creating Partitioned Parquet Tables via SparkSQL

2015-04-01 Thread Denny Lee
Date: Wed, 1 Apr 2015 04:35:08 +0000
> Subject: Creating Partitioned Parquet Tables via SparkSQL
> To: user@spark.apache.org
>
> Creating Parquet tables via .saveAsTable is great but was wondering if
> there was an equivalent way to create partitioned parquet tables.
>
> Thanks!

RE: Creating Partitioned Parquet Tables via SparkSQL

2015-04-01 Thread Felix Cheung
This is tracked by these JIRAs:
https://issues.apache.org/jira/browse/SPARK-5947
https://issues.apache.org/jira/browse/SPARK-5948

From: denny.g@gmail.com
Date: Wed, 1 Apr 2015 04:35:08 +0000
Subject: Creating Partitioned Parquet Tables via SparkSQL
To: user@spark.apache.org

Creating

Creating Partitioned Parquet Tables via SparkSQL

2015-03-31 Thread Denny Lee
Creating Parquet tables via .saveAsTable is great, but I was wondering if there is an equivalent way to create partitioned Parquet tables. Thanks!
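For context on what a partitioned write produces: Spark's `partitionBy` lays data out in Hive-style `column=value` directories. A small plain-Python sketch of that naming convention (illustrative only, not Spark code; the base path and columns are made up):

```python
import posixpath

def partition_path(base, partitions):
    """Build a Hive-style partition directory path, e.g.
    base/year=2015/month=4, from an ordered list of
    (column, value) pairs."""
    parts = ['%s=%s' % (col, val) for col, val in partitions]
    return posixpath.join(base, *parts)

# A row with year=2015, month=4 would land under:
print(partition_path('/warehouse/my_table', [('year', 2015), ('month', 4)]))
# /warehouse/my_table/year=2015/month=4
```

Queries that filter on the partition columns can then skip whole directories, which is the main point of partitioning the table this way.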