Hi Ayan,

It depends on the version of Spark you are using.
Have you tried updating the stats in Hive?

    ANALYZE TABLE ${DATABASE}.${TABLE} PARTITION (${PARTITION_NAME}) COMPUTE STATISTICS FOR COLUMNS;

and then do

    SHOW CREATE TABLE ${TABLE};

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

On 7 October 2016 at 02:37, ayan guha <guha.a...@gmail.com> wrote:

> Hi
>
> Faced one issue:
>
> - Writing a Hive partitioned table using
>
>       (df.withColumn("partition_date", to_date(df["INTERVAL_DATE"]))
>          .write.partitionBy("partition_date")
>          .saveAsTable("sometable", mode="overwrite"))
>
> - Data got written to HDFS fine. I can see the folders with partition
>   names such as
>
>       /app/somedb/hive/somedb.db/sometable/partition_date=2016-09-28
>       /app/somedb/hive/somedb.db/sometable/partition_date=2016-09-29
>
>   and so on.
> - Also, the _common_metadata and _metadata files are written properly.
> - I can read the data from Spark fine using
>   read.parquet("/app/somedb/hive/somedb.db/sometable");
>   printSchema shows all the columns.
> - However, I cannot read the table from Hive:
>
>   Problem 1: Hive does not think the table is partitioned.
>   Problem 2: Hive sees only one column, array<string>, from the deserializer.
>   Problem 3: MSCK REPAIR TABLE failed, saying the partitions are not in the metastore.
>
> Question: Is this a known issue with Spark writing to Hive partitioned
> tables?
>
> --
> Best Regards,
> Ayan Guha
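
If SHOW CREATE TABLE comes back with a single col array<string> column, the likely cause is that saveAsTable created a Spark SQL datasource table: the real schema and partitioning are stored in Spark-specific table properties that Hive cannot interpret, which would explain all three problems above. The usual workaround is to create the table with Hive DDL first and then load it with insertInto, so the metastore records genuine Hive partitions. A minimal sketch, assuming Spark 2.0 with Hive support enabled; the database, table, and partition column names are taken from the mail above, and interval_date/some_value are hypothetical stand-ins for the real columns:

    from datetime import datetime
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import to_date

    spark = (SparkSession.builder
             .appName("hive-partitioned-write")
             .enableHiveSupport()  # register tables in the Hive metastore
             .getOrCreate())

    # Allow dynamic-partition inserts.
    spark.sql("SET hive.exec.dynamic.partition=true")
    spark.sql("SET hive.exec.dynamic.partition.mode=nonstrict")

    # Hive DDL, so Hive sees the real schema and the partition column.
    spark.sql("""
        CREATE TABLE IF NOT EXISTS somedb.sometable (
            interval_date TIMESTAMP,
            some_value    STRING
        )
        PARTITIONED BY (partition_date DATE)
        STORED AS PARQUET
    """)

    # Hypothetical source data standing in for the original DataFrame.
    df = spark.createDataFrame(
        [(datetime(2016, 9, 28, 10, 0), "a"),
         (datetime(2016, 9, 29, 11, 0), "b")],
        ["INTERVAL_DATE", "some_value"])

    # insertInto matches columns by position, partition column last,
    # so select them in the table's column order before writing.
    (df.withColumn("partition_date", to_date(df["INTERVAL_DATE"]))
       .select("INTERVAL_DATE", "some_value", "partition_date")
       .write
       .insertInto("somedb.sometable", overwrite=True))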
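
Once the load goes through the metastore this way, the partitions are registered properly, so Hive should see them and MSCK REPAIR TABLE should no longer be needed. A quick check from the same session (or run the equivalent statements in the Hive CLI):

    # Both should now show the real schema and the daily partitions.
    spark.sql("SHOW PARTITIONS somedb.sometable").show(truncate=False)
    spark.sql("DESCRIBE FORMATTED somedb.sometable").show(50, truncate=False)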