Block Sampling Impact

2012-06-15 Thread Ladda, Anand
Hi, I was trying block sampling on a 6 million row (~400MB) table, and I can see that if I sample about 1 percent of the data I get about 3x faster response on the queries (I can also see a difference in the data returned). The input format though is 'org.apache.hadoop.mapred.TextInputFormat

Re: Block Sampling

2012-06-15 Thread Carl Steinbach
y, June 15, 2012 3:20 PM > To: user@hive.apache.org > Subject: Re: Block Sampling > > Hi Anand, > > This feature was implemented in HIVE-2121 and appeared in Hive 0.8.0. > > Ref: https://issues.apache.org/jira/br

RE: Block Sampling

2012-06-15 Thread Ladda, Anand
Thanks Carl. Could you give me edit rights to the wiki (ala...@microstrategy.com) to update the sampling page with this info? From: Carl Steinbach [mailto:c...@cloudera.com] Sent: Friday, June 15, 2012 3:20 PM To: user@hive.apache.org Subject: Re: Block Sa

Re: Block Sampling

2012-06-15 Thread Carl Steinbach
Hi Anand, This feature was implemented in HIVE-2121 and appeared in Hive 0.8.0. Ref: https://issues.apache.org/jira/browse/HIVE-2121 Thanks. Carl On Fri, Jun 15, 2012 at 11:59 AM, Ladda, Anand wrote: > Has the block sampling feature been added to one of the latest (Hive 0.8 > or Hi

Block Sampling

2012-06-15 Thread Ladda, Anand
Has the block sampling feature been added to one of the latest (Hive 0.8 or Hive 0.9) releases? The wiki has the blurb below on block sampling: Block Sampling: It is a feature that is still on trunk and is not yet in any release version. block_sample: TABLESAMPLE (n PERCENT) This will allow Hive to
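
For reference, the TABLESAMPLE (n PERCENT) clause quoted from the wiki might be used roughly as in the sketch below. The table name page_views and the alias s are hypothetical, chosen only for illustration, and the query assumes a Hive release that includes HIVE-2121 (0.8.0 or later, per the reply in this thread).

```sql
-- Minimal sketch: read roughly 1 percent of the table's input data
-- instead of scanning all of it.
-- page_views and s are hypothetical names, not taken from the thread.
SELECT COUNT(*)
FROM page_views TABLESAMPLE (1 PERCENT) s;
```

Because the sample is taken by data size rather than by row count, the number of rows returned may not be exactly 1 percent of the table.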