Re: Data Deleted on Hive External Table

2015-08-25 Thread Peyman Mohajerian
Data was generated in some other cluster; they moved it to S3 and then copied it to my cluster into the warehouse path. I then created a schema over it. You are correct that this would not be the right process, and we had no plans to do this in production; it was a POC. Nevertheless, in my view 'exte

Re: Data Deleted on Hive External Table

2015-08-25 Thread Jeetendra G
If you put 'external' in the table definition and point INPATH to the original data (where data is landing from the other source), then how does data end up in /user/hive/warehouse? /user/hive/warehouse should only be populated with data when the table is 'internal'. On Tue, Aug 25, 2015 at 7:33 PM, Pe

Re: Data Deleted on Hive External Table

2015-08-25 Thread Peyman Mohajerian
Hi Jeetendra, What I was originally saying is that if you drop the table, it will delete the data despite the fact that you put 'external' in the definition. I think this behavior is due to the fact that the data is in /user/hive/warehouse and therefore Hive assumes ownership and ignores the 'externa

Re: Data Deleted on Hive External Table

2015-08-24 Thread Jeetendra G
Hi Peyman, I created a new Hive external table with a partition column named 'yr' instead of 'year', pointing to the same base directory. If this is the case, how come /user/hive/warehouse has the data? It should not, right? On Tue, Aug 25, 2015 at 4:41 AM, Peyman Mohajerian wrote: > Hi Guys, > >

Data Deleted on Hive External Table

2015-08-24 Thread Peyman Mohajerian
Hi Guys, I managed to delete some data in HDFS by dropping a partitioned external Hive table. One explanation is that the data resided in the 'warehouse' directory of Hive and that had something to do with it? An alternative explanation may be that my 'drop table' statement didn't delete the data but my fol