What Storage Format?


> On 30 Oct 2015, at 12:05, Rex Xiong <bycha...@gmail.com> wrote:
> 
> Hi folks,
> 
> I have a Hive external table with partitions.
> Every day, an app generates a new partition, day=yyyy-MM-dd, stored as 
> Parquet, and runs the Hive add-partition command.
> In some cases we add an additional column to new partitions and update the 
> Hive table schema; a query across new and old partitions then fails with 
> this exception:
> 
> org.apache.hive.service.cli.HiveSQLException: 
> org.apache.spark.sql.AnalysisException: cannot resolve 'newcolumn' given 
> input columns ....
> 
> We have tried the schema merging feature, but it is too slow because there 
> are hundreds of partitions.
> Is it possible to bypass this schema check and return a default value (such 
> as null) for missing columns?
> 
> Thank you
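
To make the requested behaviour concrete, here is a minimal plain-Python sketch of "return null for missing columns". It is not Spark or Hive code and does not bypass any schema check; the helper name align_to_schema and the sample rows are hypothetical, made up for illustration only.

```python
def align_to_schema(row, schema, default=None):
    """Return a dict with exactly the columns in `schema`.

    Columns missing from `row` (e.g. rows from an old partition)
    are filled with `default`; columns not in `schema` are dropped.
    """
    return {col: row.get(col, default) for col in schema}


# Hypothetical sample data: the old partition predates 'newcolumn'.
old_partition = [{"id": 1, "day": "2015-10-29"}]
new_partition = [{"id": 2, "day": "2015-10-30", "newcolumn": "x"}]

# The current table schema, including the column added later.
table_schema = ["id", "day", "newcolumn"]

# Reading across both partitions yields uniform rows.
rows = [align_to_schema(r, table_schema) for r in old_partition + new_partition]
```

With this, the row from the old partition carries newcolumn set to None, while the row from the new partition keeps its stored value.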
