Hi Antoine,

Yes, I think you can replace the files backing a table (external or
managed), and as long as they conform to the same schema it should work.
Remember to regenerate the statistics afterwards.
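A sketch of the statistics refresh in HiveQL; "events" is a placeholder table name, not from your setup:

```sql
-- Recompute table-level statistics after swapping the backing files.
ANALYZE TABLE events COMPUTE STATISTICS;

-- Optionally recompute column-level statistics as well,
-- so the cost-based optimizer sees the new data.
ANALYZE TABLE events COMPUTE STATISTICS FOR COLUMNS;
```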

For partitioned tables, you should run "MSCK REPAIR TABLE" after creating
the new partition directories and adding the files, or Hive will not see
the new partitions.
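For example (again with "events" as a placeholder table name):

```sql
-- Register any partition directories that were added on the filesystem
-- but are not yet in the metastore.
MSCK REPAIR TABLE events;

-- Verify which partitions Hive now knows about.
SHOW PARTITIONS events;
```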

Cheers,

Pau.

On Wed, Jan 30, 2019, 10:15 Antoine DUBOIS <antoine.dub...@cc.in2p3.fr>
wrote:

> Hello everyone,
> I'd like some advice on a use case.
> I have heterogeneous information in different CSV files, representing more
> than 100 GB of text, that I want to consolidate every day.
> My goal is to be able to serve this information from Hive via SQLAlchemy.
> The information will be erased and regenerated every day.
>
> I want to use Spark to generate a consolidated ORC file that would be
> integrated in Hive.
>
> My question is:
> If I define an external table in Hive pointing at an ORC file, is it
> possible to simply replace the ORC file with a newer version every day,
> without having to drop the table from Hive and recreate it?
> The ORC file will always have the same schema; partitioning would occur
> on the same column, but the values in this column might evolve.
>
> Is this a possible scenario using Hive, or did I get it completely wrong?
>
> Thank you very much in advance for any advice or help given.
>
> Antoine
>
