Re: Single Map task for Hive queries

Loren Siebert Mon, 15 Aug 2011 11:00:53 -0700

You should not have to do anything special to Hive to make it use all of your 
TaskTrackers (TTs). The actual MR job should be governed by your mapred-site.xml file.


When you run sample MR jobs (like the Pi example) and look at the job tracker, 
are you seeing all of your TTs getting used?

On Aug 15, 2011, at 10:47 AM, Jon Bender wrote:

> It's actually just an uncompressed UTF-8 text file.
> 
> This was essentially the create table clause:
> CREATE EXTERNAL TABLE foo 
> ROW FORMAT DELIMITED 
> STORED AS TEXTFILE
> LOCATION '/data/foo'
> 
> Using Hive 0.7.
> 
> On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert <lo...@siebert.org> wrote:
> Is your external file compressed with gzip or bzip2? Gzip files aren’t 
> splittable (and bzip2 files are only split by Hadoop versions that support 
> it), so they get assigned to one mapper.
> 
> On Aug 15, 2011, at 10:23 AM, Jon Bender wrote:
> 
> > Hello,
> >
> > I have external tables in Hive stored in a single flat text file.  When I 
> > execute queries against it, all of my jobs are run as a single map task, 
> > even on very large tables.
> >
> > What steps do I need to take to ensure that these queries are split up and 
> > pushed out to multiple TTs?  Do I need to store the Hive tables in a 
> > different internal file format?  Make some configuration changes?
> >
> > Thanks!
> > Jon
> 
>
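
For reference, the split arithmetic behind the "splittable" question in this thread can be sketched in Python. This is a rough model of how the classic FileInputFormat sizes its splits, not Hive's actual code path: the function name, the parameter names and the 64 MB block size are assumptions for illustration, and Hive may be configured with a combining input format that groups splits differently.

```python
MIB = 1024 * 1024
SPLIT_SLOP = 1.1  # the last split may run up to 10% over the split size


def estimate_map_tasks(file_size, block_size=64 * MIB, min_split_size=1,
                       requested_maps=1, splittable=True):
    """Estimate how many input splits (one map task each) a single file yields.

    All sizes are in bytes. requested_maps stands in for the job's map-count
    hint; it only matters when it pushes the split size below the block size.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")

    # A non-splittable file (for example, gzip) is always read by one mapper.
    if not splittable:
        return 1

    # Split size: no larger than a block, no smaller than the minimum.
    goal_size = file_size // max(requested_maps, 1)
    split_size = max(min_split_size, min(goal_size, block_size))

    splits = 0
    remaining = file_size
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining != 0:
        splits += 1  # the tail becomes the final split
    return splits


if __name__ == "__main__":
    print(estimate_map_tasks(1024 * MIB))                    # plain text
    print(estimate_map_tasks(1024 * MIB, splittable=False))  # gzip-style
```

Under these assumptions a 1 GB uncompressed text file yields 16 map tasks, while the same data in a non-splittable format yields one. A large uncompressed text file that still runs as a single map task therefore points at the job configuration rather than at the file format.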
