Re: Dataset doesn't have partitioner after a repartition on one of the columns

2016-09-28 Thread Igor Berman

Michael, can you please explain why bucketBy is supported when writing to Parquet with saveAsTable() but not with parquet()? Is this the only difference between the table API and the DataFrame/Dataset API, or are there others? org.apache.spark.sql.AnalysisException: 'save' does not support bucketing right now; at o

Re: Dataset doesn't have partitioner after a repartition on one of the columns

2016-09-28 Thread Michael Armbrust

Hi Darin, In SQL we have finer-grained information about partitioning, so we don't use the RDD Partitioner. Here's a notebook that walks