What Storage Format?
> On 30 Oct 2015, at 12:05, Rex Xiong <bycha...@gmail.com> wrote:
>
> Hi folks,
>
> I have a Hive external table with partitions.
> Every day, an app generates a new partition day=yyyy-MM-dd stored as
> Parquet and runs the Hive add-partition command.
> In some cases, we add an additional column to new partitions and update
> the Hive table schema; a query across new and old partitions then fails
> with this exception:
>
> org.apache.hive.service.cli.HiveSQLException:
> org.apache.spark.sql.AnalysisException: cannot resolve 'newcolumn' given
> input columns ....
>
> We have tried the schema merging feature, but it is too slow because there
> are hundreds of partitions.
> Is it possible to bypass this schema check and return a default value (such
> as null) for missing columns?
>
> Thank you
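
To pin down the behaviour being asked for, here is a minimal plain-Python model of it. It is not Spark or Hive code and calls no API of either. It only shows the requested semantics: every row is projected onto the current table schema, and a column that an older partition lacks comes back as null (None). The function names, the "id" column and the partition dates are made up for illustration; only 'newcolumn' and the day=yyyy-MM-dd layout come from the message above.

```python
from typing import Any, Dict, Iterator, List

Row = Dict[str, Any]


def project_row(row: Row, table_columns: List[str]) -> Row:
    """Project one stored row onto the table schema.

    Columns missing from the stored row (e.g. rows written before the
    column was added) are filled with None instead of raising an error.
    """
    return {col: row.get(col) for col in table_columns}


def scan_partitions(
    partitions: Dict[str, List[Row]], table_columns: List[str]
) -> Iterator[Row]:
    """Yield rows from all partitions, oldest day first, in the table schema.

    `partitions` maps a day string (yyyy-MM-dd) to the rows stored in it.
    The partition value is attached to each row as the "day" column.
    """
    for day in sorted(partitions):
        for row in partitions[day]:
            out = project_row(row, table_columns)
            out["day"] = day
            yield out


if __name__ == "__main__":
    # Hypothetical data: the older partition was written before
    # 'newcolumn' existed, the newer one includes it.
    partitions = {
        "2015-10-29": [{"id": 1}],
        "2015-10-30": [{"id": 2, "newcolumn": "x"}],
    }
    for r in scan_partitions(partitions, ["id", "newcolumn"]):
        print(r)
```

The point of the model is that no per-partition schema has to be read or merged up front: the table schema alone decides the output columns, which is the cost the schema-merging approach was incurring across hundreds of partitions.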