Hello everyone, I'd like some advice on a use case. I have heterogeneous information spread across CSV files, more than 100 GB of text in total, that I want to consolidate every day. My goal is to serve that data through Hive, queried via SQLAlchemy. The data will be erased and regenerated every day.
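Concretely, the daily Spark job would look roughly like this (the paths, the CSV options, and the `region` partition column are placeholders, not my real layout):

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("daily-csv-consolidation")
        .getOrCreate()
    )

    # Read the day's CSV drop; header/inferSchema are guesses at the
    # input format, and the glob is a placeholder path.
    df = (
        spark.read
        .option("header", "true")
        .option("inferSchema", "true")
        .csv("/data/incoming/*.csv")
    )

    # Rewrite yesterday's output wholesale. The directory stays the same,
    # so a Hive external table pointing at it keeps seeing fresh data.
    (
        df.write
        .mode("overwrite")
        .partitionBy("region")
        .orc("/warehouse/consolidated")
    )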
The idea is to use Spark to generate a consolidated ORC dataset that is then integrated into Hive. My question is: if I define an external table in Hive pointing at that ORC data, is it possible to simply replace the files with a newer version every day, without having to drop the table in Hive and recreate it? The ORC data will always have the same schema, and partitioning would occur on the same column, but the values in that column might evolve. Is this a workable scenario with Hive, or did I get it completely wrong? Thank you very much in advance for any advice or help. Antoine
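P.S. For concreteness, here is roughly the Hive side I have in mind; the table, column, and path names just mirror the sketch above, and I'm assuming a Spark session built with .enableHiveSupport() (any Hive client would do for the DDL):

    # One-time DDL: the table is EXTERNAL, so overwriting the files
    # underneath never touches the table definition. Note the partition
    # column lives in PARTITIONED BY, not in the column list.
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS consolidated (
            id BIGINT,
            payload STRING
        )
        PARTITIONED BY (region STRING)
        STORED AS ORC
        LOCATION '/warehouse/consolidated'
    """)

    # Daily, after the rewrite: re-sync the metastore with the partition
    # directories that now exist on disk, so new values of the partition
    # column become visible without recreating the table.
    spark.sql("MSCK REPAIR TABLE consolidated")

My understanding is that a plain MSCK REPAIR TABLE only adds partitions; if a partition value disappears after an overwrite, the stale metastore entry would need MSCK REPAIR TABLE consolidated SYNC PARTITIONS (Hive 3 / Spark 3.2+) or an explicit ALTER TABLE ... DROP PARTITION, but please correct me if I have that wrong.

On the reading side, I'm assuming PyHive as the SQLAlchemy dialect and HiveServer2 on its default port:

    from sqlalchemy import create_engine, text

    # Placeholder host/port/database; requires `pip install pyhive`.
    engine = create_engine("hive://localhost:10000/default")

    with engine.connect() as conn:
        rows = conn.execute(
            text("SELECT region, COUNT(*) AS n FROM consolidated GROUP BY region")
        )
        for row in rows:
            print(row)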