It's impossible to answer such a vague, open-ended question without specifics. What's the query, for example? How is the data organized (e.g., is it partitioned)? What are the cluster characteristics?
On Mon, Oct 29, 2012 at 10:20 AM, shashwat shriparv <dwivedishash...@gmail.com> wrote:

> I am trying to run Hive queries on a huge amount of data (almost half a
> petabyte), and these queries run MapReduce internally. It takes a very
> long time to generate the data set (for the MapReduce jobs to complete).
> What optimization mechanisms for Hive and Hadoop can I use to make these
> queries faster? One more important question: is the amount of disk space
> available for MapReduce, or in the /tmp directory, important for faster
> MapReduce?
>
> --
> ∞
> Shashwat Shriparv

--
*Dean Wampler, Ph.D.*
thinkbiganalytics.com
+1-312-339-1330