You could try to set the number of reducers, e.g.:

set mapred.reduce.tasks=4;

Set this before doing the select.

-Ajo

On Mon, Jan 24, 2011 at 1:13 PM, Jonathan Coveney <jcove...@gmail.com> wrote:
> I have a 10 node server or so, and have been mainly using pig on it, but
> would like to try out Hive.
> I am running this query, which doesn't take too long in Pig, but is taking
> quite a long time in Hive.
>
> hive -e "select count(1) as ct from my_table where v1='02' and v2 = 11112222;" > thecount
> One thing is that this job only uses 1 reducer, but it is taking most of its
> time in its reduce step. I tried manually setting more reducers, but I think
> that for a job without groups, it forces 1 reducer?
> Either way, would love to know why this is dragging? It's worth noting that
> my_table is not saved in the Hive format, but rather as a flat file. I
> realize that this can influence performance, but shouldn't it at least
> perform on par with pig?
> Thanks for your help
> Jon
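
For reference, a minimal sketch of the suggestion applied to the query from the original message. It assumes both statements run in the same Hive session, since a `set` issued in a separate session would not carry over to the select. The value 4 is only the example figure from the reply, not a tuned number. As the original message suspects, a count without a GROUP BY may still end up on a single reducer, so this may not change the reduce step.

```sql
-- Ask for 4 reduce tasks for the statements that follow in this session.
-- (4 is just the example value suggested in the reply.)
set mapred.reduce.tasks=4;

-- The query from the original message, unchanged.
select count(1) as ct from my_table where v1='02' and v2 = 11112222;
```

With `hive -e`, both statements can likely go in the same quoted string, separated by semicolons, so that the setting and the select share one session.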