Thanks, Shawn. That was my suspicion, but I'm glad to hear it from someone else.
Do you have any links or documentation I can give to a manager to help them
feel more confident in this?

Also, I misspoke: I'm running Hive 1.2.1, not 1.3.

________________________________
From: Shawn Weeks <swe...@weeksconsulting.us>
Sent: Wednesday, November 6, 2019 5:35:17 PM
To: user@hive.apache.org <user@hive.apache.org>
Subject: Re: INSERT OVERWRITE Failure Saftey


I’m not sure about Hive 1.3 specifically, but in other versions the data is
written to a temp location, and at the end of the query the previous data is
deleted and the new data is renamed/moved into place. Something to watch out
for: if the query returns no rows, then the old data isn’t removed.
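
To make the write-then-swap behavior concrete, here is a minimal Python sketch on a local filesystem. It is a toy model of the pattern described in this message, not Hive's actual implementation; the names insert_overwrite, table_dir, produce_rows, and the output file name are all made up for illustration.

```python
import os
import shutil
import tempfile


def insert_overwrite(table_dir, produce_rows):
    """Toy model of a staging-then-swap overwrite (not Hive's real code).

    table_dir and produce_rows are hypothetical names for this sketch.
    Returns True if the table contents were replaced, False otherwise.
    """
    parent = os.path.dirname(os.path.abspath(table_dir))
    # Stage output next to the table so the final rename stays on one filesystem.
    staging = tempfile.mkdtemp(prefix=".staging_", dir=parent)
    try:
        count = 0
        with open(os.path.join(staging, "000000_0"), "w") as out:
            for row in produce_rows():  # a failing "query" raises mid-write
                out.write(f"{row}\n")
                count += 1
        if count == 0:
            # Caveat from this thread: an empty result leaves old data in place.
            return False
        # Only after the query succeeded: delete old data, move new data in.
        if os.path.exists(table_dir):
            shutil.rmtree(table_dir)
        os.rename(staging, table_dir)
        return True
    finally:
        # On failure or empty result the staging directory is discarded.
        if os.path.exists(staging):
            shutil.rmtree(staging)
```

In this model, an exception raised by produce_rows propagates before the old directory is touched, so the previous data survives a failed run. The delete and the rename are still two separate steps, so a crash between them would not be covered by this sketch.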



Thanks

Shawn



From: David M <mcginni...@outlook.com>
Reply-To: "user@hive.apache.org" <user@hive.apache.org>
Date: Wednesday, November 6, 2019 at 3:27 PM
To: "user@hive.apache.org" <user@hive.apache.org>
Subject: INSERT OVERWRITE Failure Saftey



All,



I have a Hive 1.3 cluster running in production, and there was a question about
INSERT OVERWRITE queries on tables. If I perform an INSERT OVERWRITE query on a
table, and the query fails halfway through, will the old data still exist in
the table? I’m not completely clear on the exact process that INSERT OVERWRITE
follows, but I believe it puts the data into the staging folder, and then does
a remove and move, which should be safe. However, it could also just wipe the
folder before the query starts, which would cause issues if the query itself
failed. Can someone give me a definitive answer on this? Pointers to the source
code or documentation that explains this would be even better.



Thanks!



David McGinnis

