Re: S3/EMR Hive: Load contents of a single file

Keith Wiley Wed, 27 Mar 2013 10:03:07 -0700

Okay, I also saw your previous response which analyzed queries into two tables 
built around two files in the same directory.  I guess I was simply wrong in my 
understanding that a Hive table is fundamentally associated with a directory 
instead of a file.  Turns out, it can be either one.  A directory table uses 
all files in the directory while a file table uses one specific file and 
properly avoids sibling files.  My bad.
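
For anyone finding this in the archive, here is a minimal sketch of the two
cases as I now understand them.  The table name gsrc1 and the S3 paths come
from Tony's message; the second table name (gsrc_all) and the single-column
schema are made up for illustration, since the real schema never appeared in
the thread.  This reflects what was observed on EMR Hive against S3; I have
not checked whether a file-level location behaves the same way elsewhere.

```sql
-- File table: location names one specific file, so sibling files
-- in the same directory are not read.
-- (Column layout is hypothetical; the thread does not show the schema.)
CREATE EXTERNAL TABLE gsrc1 (line STRING)
LOCATION 's3://mybucket/path/to/data/src1.txt';

-- Directory table: location names the directory, so every file
-- under it is read.  (gsrc_all is a hypothetical name.)
CREATE EXTERNAL TABLE gsrc_all (line STRING)
LOCATION 's3://mybucket/path/to/data/';

-- Check which location each table actually recorded.
DESC EXTENDED gsrc1;
DESC EXTENDED gsrc_all;
```

The "location" field in each description should show which of the two kinds
of table you ended up with.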


Thanks for the careful analysis and clarification.  TIL!

Cheers!

On Mar 27, 2013, at 02:58 , Tony Burton wrote:

> A bit more info - do an extended description of the table:
>  
> $ desc extended gsrc1;
>  
> And the “location” field is “location:s3://mybucket/path/to/data/src1.txt”
>  
> Do the same on a table created with a location pointing at the directory and 
> the same info gives (not surprisingly) “location:s3://mybucket/path/to/data/”
> 

________________________________________________________________________________
Keith Wiley     [email protected]     keithwiley.com    music.keithwiley.com

"I used to be with it, but then they changed what it was.  Now, what I'm with
isn't it, and what's it seems weird and scary to me."
                                           --  Abe (Grandpa) Simpson
________________________________________________________________________________
