Re: Managed vs external tables in hive

2012-05-14 Thread Mark Grover
Cc: user@hive.apache.org Sent: Sunday, May 13, 2012 8:54:35 PM Subject: Re: Managed vs external tables in hive Thanks Mark and Edward. This is good info to keep in mind. So is it fair to say that external tables offer flexibility, in that you can have multiple schemas on the same

Re: Managed vs external tables in hive

2012-05-13 Thread Ranjith
Thanks Mark and Edward. This is good info to keep in mind. So is it fair to say that external tables offer flexibility, in that you can have multiple schemas on the same data asset without data duplication? Is there anything else that an external table may offer versus a Hive-managed table or v

Re: Managed vs external tables in hive

2012-05-13 Thread Edward Capriolo
I believe I walked through the entire process. You can ALTER TABLE a table and change it from external to managed. So someone could always change the table to MANAGED, do the index thing, and then change it back. Just be aware of the table's current status before it is dropped. Edward On Sun, May 1
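The snippet above does not show the exact statement used. A minimal sketch of the switch, assuming the usual `EXTERNAL` table property and a hypothetical existing table named `logs_external`:

```sql
-- Check the table's current status first: the "Table Type" row reads
-- MANAGED_TABLE or EXTERNAL_TABLE.
DESCRIBE FORMATTED logs_external;

-- Switch the (hypothetical) table from external to managed.
ALTER TABLE logs_external SET TBLPROPERTIES ('EXTERNAL'='FALSE');

-- ... do the work that needs a managed table here ...

-- Switch it back to external so a later DROP TABLE leaves the files alone.
ALTER TABLE logs_external SET TBLPROPERTIES ('EXTERNAL'='TRUE');
```

Some Hive versions are strict about the case of the property value, so uppercase `TRUE`/`FALSE` is the safer spelling.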

Re: Managed vs external tables in hive

2012-05-13 Thread Mark Grover
To: user@hive.apache.org Cc: user@hive.apache.org Sent: Sunday, May 13, 2012 4:07:48 PM Subject: Re: Managed vs external tables in hive Edward, Did you confirm this through the explain plan or through the execution of the DDL alone? And have you tried buckets with external tables? Thanks, Ranjit

Re: Managed vs external tables in hive

2012-05-13 Thread Ranjith
Edward, Did you confirm this through the explain plan or through the execution of the DDL alone? And have you tried buckets with external tables? Thanks, Ranjith On May 13, 2012, at 2:33 PM, Edward Capriolo wrote: > The original design docs say you cannot build indexes on external tables but

Re: Managed vs external tables in hive

2012-05-13 Thread Ranjith
Good info Edward. Thanks. Thanks, Ranjith On May 13, 2012, at 2:33 PM, Edward Capriolo wrote: > The original design docs say you cannot build indexes on external tables but > I tried it in 0.8.x and confirmed you can. > > On Sunday, May 13, 2012, Ranjith wrote: > > Indexes can be built on t

Re: Managed vs external tables in hive

2012-05-13 Thread Edward Capriolo
The original design docs say you cannot build indexes on external tables, but I tried it in 0.8.x and confirmed you can. On Sunday, May 13, 2012, Ranjith wrote: > Indexes can be built on tables managed by Hive. For external tables I do not believe that to be true. Please feel free to correct if I am w
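A rough sketch of the kind of check described above, assuming a Hive release that still ships indexing (the feature arrived in 0.7 and was removed in 3.0). The table name, column, index name, and path are all hypothetical:

```sql
-- Hypothetical external table over files that already exist in HDFS.
CREATE EXTERNAL TABLE logs_external (line STRING)
LOCATION '/data/logs';

-- Define a compact index on the external table; DEFERRED REBUILD means
-- the index stays empty until it is rebuilt explicitly.
CREATE INDEX logs_line_idx
ON TABLE logs_external (line)
AS 'COMPACT'
WITH DEFERRED REBUILD;

-- Populate the index.
ALTER INDEX logs_line_idx ON logs_external REBUILD;
```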

Re: Managed vs external tables in hive

2012-05-13 Thread Ranjith
Starting in 0.7, Hive introduced indexing: https://issues.apache.org/jira/browse/HIVE-417. So indexes are available in Hive. Thanks, Ranjith On May 12, 2012, at 11:13 PM, Raja Thiruvathuru wrote: > No indexing in hive. > > > On Sunday, May 13, 2012, Ranjith wrote: > Indexes can be built on ta

Re: Managed vs external tables in hive

2012-05-12 Thread Raja Thiruvathuru
No indexing in Hive. On Sunday, May 13, 2012, Ranjith wrote: > Indexes can be built on tables managed by Hive. For external tables I do > not believe that to be true. Please feel free to correct if I am wrong. > > Thanks, > Ranjith > > On May 12, 2012, at 9:24 PM, Nanda Vijaydev > 'nanda.vijay...@g

Re: Managed vs external tables in hive

2012-05-12 Thread Ranjith
Indexes can be built on tables managed by Hive. For external tables I do not believe that to be true. Please feel free to correct if I am wrong. Thanks, Ranjith On May 12, 2012, at 9:24 PM, Nanda Vijaydev wrote: > In Hive, the raw data is in HDFS and there is a metadata layer that defines > the st

Re: Managed vs external tables in hive

2012-05-12 Thread Nanda Vijaydev
In Hive, the raw data is in HDFS and there is a metadata layer that defines the structure of the raw data. A table is usually a reference to metadata, probably in a MySQL server, and it contains a reference to the location of the data in HDFS, the type of delimiter or SerDe to use, and so on. 1. With hive

Re: Managed vs external tables in hive

2012-05-10 Thread David Kulp
It's simpler than this. All files look the same -- and are often very simple delimited text -- whether managed or external. The only difference is that the files associated with a managed table are dropped when the table is dropped and files that are loaded into a managed table are moved into

Re: Managed vs external tables in hive

2012-05-10 Thread Edward Capriolo
The only actual difference is: If you drop a managed table, the LOCATION it refers to will be deleted. If you drop an external table, the LOCATION it refers to will not be deleted. Confusion happens because when Hive creates a managed table it defaults to: fs.default.name+/user/hive/warehouse/+t
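The drop behaviour described above can be sketched in HiveQL. The table names, the single column, and the `/data/logs` path are hypothetical, and this is an illustration under those assumptions, not a tested script:

```sql
-- Managed table: no LOCATION given, so Hive picks its default location
-- and owns whatever ends up there.
CREATE TABLE logs_managed (line STRING);

-- External table: Hive records only the metadata; the files at LOCATION
-- stay under the control of whoever wrote them.
CREATE EXTERNAL TABLE logs_external (line STRING)
LOCATION '/data/logs';

-- Removes the metadata AND deletes the files under the table's LOCATION.
DROP TABLE logs_managed;

-- Removes only the metadata; the files under /data/logs are left in place.
DROP TABLE logs_external;
```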