On Wed, Feb 2, 2011 at 8:20 AM, Amlan Mandal <am...@fourint.com> wrote:
>
> On 02-Feb-2011, at 2:17 AM, Thiruvel Thirumoolan <thiru...@yahoo-inc.com>
> wrote:
>
> Local tables are like hive tables in all other senses except that they are
> on the local disk rather than HDFS. The only other difference I know of is
> that when you call "drop table" on a local table, only the metadata on the
> table gets deleted. For tables on HDFS, the table data gets deleted with the
> metadata.
>
> Ajo,
> Guess there is a confusion here. No concept of Local tables in Hive AFAIK.
> The behavior you mention is for EXTERNAL tables.
>
> Can you please let me know the configuration name to configure that?
>
> And the data for external tables can be on local file system or HDFS,
> depending on configuration. The other tables are addressed as MANAGED tables
> for which Hive creates a directory under warehouse dir.
>
> -Ajo.
>
> On Tue, Feb 1, 2011 at 8:41 AM, Amlan Mandal <am...@fourint.com> wrote:
>>
>> Thanks Ajo.
>> Please confirm if my understanding is correct.
>> That means when I do "LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO
>> TABLE tablename" the data is in the local file system. If I need to run HIVE
>> queries (which in turn would be converted to Map Reduce jobs) I need to pull
>> the data into some other table for which data is in HDFS by means of
>>
>> INSERT OVERWRITE TABLE tablename_new SELECT * FROM tablename ... (kind
>> of)
>>
>> So those LOCAL tables are kind of temporary.
>
> See - http://wiki.apache.org/hadoop/Hive/LanguageManual/DML That should
> clarify load local.
>>
>> Amlan
>>
>> On Tue, Feb 1, 2011 at 6:51 PM, Ajo Fod <ajo....@gmail.com> wrote:
>> >
>> > Look up for local :
>> > http://wiki.apache.org/hadoop/Hive/GettingStarted
>> >
>> > -Ajo.
>> >
>> > On Tue, Feb 1, 2011 at 3:15 AM, Amlan Mandal <am...@fourint.com> wrote:
>> >>
>> >> LOAD DATA *LOCAL* INPATH 'filepath' [OVERWRITE] INTO TABLE tablename
>> >>
>> >> When I use LOCAL keyword does hive create a hdfs file for it?
>> >>
>
> Yes. Hive creates a file for it on HDFS.
> As Ping Zhu mentioned, do a 'describe formatted <tablename>' or 'describe
> extended <tablename>' after loading data. Check that location on HDFS.
> You can also check the logs (they are usually at /tmp/<username>/hive.log).
> You can see the local file getting copied to the HDFS scratch directory and
> then being moved to a directory under warehouse. If you find anything
> strange, can you please post it here?

Oh boy.

Hive has a directory in HDFS called the warehouse directory; the default is
/user/hive/warehouse

When you run 'create table atable', the table is managed in the directory
/user/hive/warehouse/atable

If you load afile.txt into that table, it goes here:
/user/hive/warehouse/atable/afile.txt

The differences (to name a few):

1) EXTERNAL tables are NOT inside /user/hive/warehouse. They can be anywhere,
because external tables allow you to specify LOCATION /user/edward/bla for
the table (and for partitions that may be inside the table).

2) If you DROP an EXTERNAL table, no data is deleted from HDFS.

Your confusion might stem from the fact that tables are either normal
('create table X') or external ('create external table X'), but Hive has no
'internal' keyword.
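
To make the managed/external distinction concrete, here is a minimal HiveQL sketch. The table names atable and etable, the single STRING column, and the local path /tmp/afile.txt are assumptions made up for illustration; only /user/edward/bla comes from the message above.

```sql
-- Managed table: Hive keeps the data under the warehouse directory.
CREATE TABLE atable (line STRING);

-- LOCAL only says the source file is on the client's local file system;
-- the file is still copied into the table's directory on HDFS.
-- (/tmp/afile.txt is a hypothetical path.)
LOAD DATA LOCAL INPATH '/tmp/afile.txt' INTO TABLE atable;

-- Reports the table's Location and its Table Type
-- (MANAGED_TABLE or EXTERNAL_TABLE).
DESCRIBE FORMATTED atable;

-- External table: the data lives wherever LOCATION points.
CREATE EXTERNAL TABLE etable (line STRING)
LOCATION '/user/edward/bla';

DROP TABLE atable;  -- removes the metadata and the table's data
DROP TABLE etable;  -- removes the metadata only; files at LOCATION stay
```

Running DESCRIBE FORMATTED before the drops should be enough to confirm which kind of table you have and where its files actually sit.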