Hi Shivam,

There were a lot of changes around ACID in the Hive 3.0 release. I assume below that your question is about the Hive 3.x releases.

Hive ACID v2 implements UPDATE as a delete of the old row followed by an insert of a new one, for performance reasons. See Eugene's nice presentation for the details:
https://www.slideshare.net/Hadoop_Summit/transactional-operations-in-apache-hive-present-and-future-102803358
https://www.youtube.com/watch?v=GyzU9wG0cFQ&t=834s

So if your UPDATE command changes every row in the partition, then yes, essentially the whole partition is rewritten.

Just a side note: currently UPDATE only works on full ACID tables, and with the current implementation full ACID tables have to be stored in the ORC file format, so it would not work on a table stored as Parquet.

I hope this helps,
Peter

> On Nov 20, 2019, at 08:34, Shivam Sharma <28shivamsha...@gmail.com> wrote:
>
> Hi All,
>
> If we update a column in Hive with data stored in Parquet format, does Hive
> rewrite the whole partition, or does it upsert only a subset of the files in that
> partition?
>
> Thanks
>
> --
> Shivam Sharma
> Indian Institute Of Information Technology, Design and Manufacturing Jabalpur
> Email:- 28shivamsha...@gmail.com
> LinkedIn:- https://www.linkedin.com/in/28shivamsharma
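
P.S. To make the "UPDATE = delete old row + insert new row" idea from the reply concrete, here is a minimal Python sketch. It is a toy in-memory model, not Hive code: the names `RowId`, `Partition`, `insert`, `update` and `scan` are made up for illustration, and the row identifier is simplified to a write id plus a row number. The only point it tries to show is that every updated row costs one delete event and one newly written copy, so an UPDATE that matches every row ends up writing the whole partition's data again.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RowId:
    """Simplified row identifier (hypothetical, for illustration only)."""
    write_id: int   # which write produced this row version
    row_no: int     # position of the row within that write


@dataclass
class Partition:
    """Toy model of one partition: insert events plus delete events."""
    inserts: dict = field(default_factory=dict)  # RowId -> row data
    deletes: set = field(default_factory=set)    # RowIds marked as deleted
    next_write_id: int = 1

    def _new_write_id(self) -> int:
        write_id = self.next_write_id
        self.next_write_id += 1
        return write_id

    def insert(self, rows) -> None:
        write_id = self._new_write_id()
        for i, row in enumerate(rows):
            self.inserts[RowId(write_id, i)] = dict(row)

    def update(self, predicate, changes) -> int:
        """Model UPDATE as: delete each matching row, insert a changed copy."""
        write_id = self._new_write_id()
        # Collect matches first so the dict is not mutated while iterating.
        matched = [
            (rid, row) for rid, row in self.inserts.items()
            if rid not in self.deletes and predicate(row)
        ]
        for i, (rid, row) in enumerate(matched):
            self.deletes.add(rid)                                 # delete event
            self.inserts[RowId(write_id, i)] = {**row, **changes}  # new version
        return len(matched)

    def scan(self):
        """Readers see only row versions without a matching delete event."""
        return [row for rid, row in self.inserts.items()
                if rid not in self.deletes]


part = Partition()
part.insert([
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Delhi"},
    {"id": 3, "city": "Pune"},
])

# An UPDATE that matches some rows only writes events for those rows.
touched_some = part.update(lambda r: r["city"] == "Pune", {"city": "Mumbai"})

# An UPDATE that matches every row writes a new copy of every live row.
touched_all = part.update(lambda r: True, {"country": "IN"})
```

In this model the second update touches all three live rows, which is the toy equivalent of "essentially the whole partition is rewritten".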