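As far as I know:

1. An external table does not move or copy data from HDFS into your warehouse. With a regular (internal) table, Hive keeps the data under the warehouse directory and deletes it when you drop the table.
2. STORED AS TEXTFILE says the files are plain text. LOCATION points at the directory in HDFS that holds the data and links that data to the table, so it is a table over existing files, not a view. When you drop an external table, only the table definition is removed; the data is not deleted.
3. The tables' information (the metadata) is stored in your metastore, e.g. Derby or a locally installed MySQL. For a regular table, loaded data ends up in the directory you set in hive.metastore.warehouse.dir in hive-site.xml: LOAD DATA INPATH moves files that are already in HDFS, and LOAD DATA LOCAL INPATH copies them from the local file system. This path can be in HDFS or on the local file system.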
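
To make the difference concrete, here is a rough sketch in HiveQL. The table names page_view_managed and page_view_ext and the path /tmp/page_view_sample are made up for illustration, and the columns are a shortened version of the ones in your example, so treat it as a sketch and not as something tested on your cluster.

```sql
-- Regular (internal) table: Hive owns the data.
-- Table name and source path below are hypothetical.
CREATE TABLE page_view_managed (viewTime INT, userid BIGINT, page_url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Moves the files from the HDFS source path into the table's directory
-- under hive.metastore.warehouse.dir.
LOAD DATA INPATH '/tmp/page_view_sample' INTO TABLE page_view_managed;

-- External table: Hive only records where the files already live.
CREATE EXTERNAL TABLE page_view_ext (viewTime INT, userid BIGINT, page_url STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/data/staging/page_view';

-- Shows the table metadata kept in the metastore, including its location.
DESCRIBE EXTENDED page_view_ext;

-- Removes the table definition and the data under the warehouse directory.
DROP TABLE page_view_managed;

-- Removes only the table definition; the files under LOCATION stay in HDFS.
DROP TABLE page_view_ext;
```

After the last statement, the files under /user/data/staging/page_view should still be there, and you can create another external table over the same directory.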

That is my opinion. Hope this helps.

2011/6/1 abh not <abh....@gmail.com>

> Hi All,
>
> I am new to Hive and have been reading
> http://wiki.apache.org/hadoop/Hive/Tutorial to get better understanding of
> Hive
>
> I am sorry for really basic questions, but I have some confusion, here are
> couple of questions:
>
>
>    1. what is difference between internal and external table in Hive?
>    2. when we create a table e.g.
>
>    CREATE EXTERNAL TABLE page_view_stg(viewTime INT, userid BIGINT,
>                    page_url STRING, referrer_url STRING,
>                    ip STRING COMMENT 'IP Address of the User',
>                    country STRING COMMENT 'country of origination')
>    COMMENT 'This is the staging page view table'
>    ROW FORMAT DELIMITED FIELDS TERMINATED BY '44' LINES TERMINATED BY '12'
>    *STORED AS TEXTFILE*
>    *    LOCATION '/user/data/staging/page_view';*
>    what does last two lines means? does that mean we are creating a view in
>    Hive on top of data stored in HDFS '/user/data/staging/page_view' as
>    textfile?
>
>    3. In general when we create table and load data into it, where does
>    that data gets stored? in HDFS? or under separate directory (known as
>    warehouse) in HDFS?
>
> Thanks for any response.
>
> Sonia
>

--
dujinhang