Re: Hive sample test

2013-03-08 Thread Ramki Palle
If none of the 100 rows that the sub-query returns satisfies the where clause, there would be no rows in the overall result. Do we still consider the Hive query verified in this case?

Regards,
Ramki.

On Wed, Mar 6, 2013 at 1:14 AM, Dean Wampler <dean.wamp...@thinkbiganalytics.com>
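
The concern can be reproduced on a small scale. The following is only a sketch under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for Hive, and the table contents, the category column and the 'rare' filter value are hypothetical, chosen so that the first 100 rows never match the filter. The inner ORDER BY id is added only to make the sketch deterministic; a LIMIT without ordering gives no guarantee about which rows come back.

```python
import sqlite3

# In-memory SQLite database standing in for Hive (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE really_big_table (id INTEGER PRIMARY KEY, category TEXT)"
)

# Hypothetical data: rows 1-500 are 'common', rows 501-1000 are 'rare'.
rows = [(i, "common" if i <= 500 else "rare") for i in range(1, 1001)]
conn.executemany("INSERT INTO really_big_table VALUES (?, ?)", rows)

# The original query shape: filter and aggregate over the whole table.
FULL_QUERY = """
    SELECT category, COUNT(*)
    FROM really_big_table
    WHERE category = 'rare'
    GROUP BY category
"""

# The quick-check shape: same filter, but applied to a 100-row sub-query.
LIMITED_QUERY = """
    SELECT category, COUNT(*)
    FROM (
        SELECT * FROM really_big_table ORDER BY id LIMIT 100
    ) t
    WHERE category = 'rare'
    GROUP BY category
"""

full_result = conn.execute(FULL_QUERY).fetchall()
limited_result = conn.execute(LIMITED_QUERY).fetchall()

print("full table:   ", full_result)
print("limited check:", limited_result)
```

Here the limited query parses and runs, yet returns no rows because the filter is applied after the 100-row cut. An empty result from the quick check therefore confirms that the query is well-formed, but says little about whether the where clause and grouping behave as intended on real data.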

Re: Hive sample test

2013-03-05 Thread Dean Wampler
Nice, yeah, that would do it.

On Tue, Mar 5, 2013 at 1:26 PM, Mark Grover wrote:
> I typically change my query to query from a limited version of the whole
> table.
>
> Change
>
> select really_expensive_select_clause
> from
> really_big_table
> where
> something=something
> group by something=some

Re: Hive sample test

2013-03-05 Thread Mark Grover
I typically change my query to query from a limited version of the whole table.

Change

select really_expensive_select_clause
from
really_big_table
where
something=something
group by something=something

to

select really_expensive_select_clause
from
(
select * from really_big_table limit 100
)t
w

Re: Hive sample test

2013-03-05 Thread Dean Wampler
Unfortunately, it will still go through the whole thing, then just limit the output. However, there's a flag that I think only works in more recent Hive releases:

set hive.limit.optimize.enable=true

This is supposed to apply limiting earlier in the data stream, so it will give different results t

RE: Hive sample test

2013-03-05 Thread Connell, Chuck
Using the Hive sampling feature would also help. This is exactly what that feature is designed for.

Chuck

From: Kyle B [mailto:kbi...@gmail.com]
Sent: Tuesday, March 05, 2013 1:45 PM
To: user@hive.apache.org
Subject: Hive sample test

Hello,

I was wondering if there is a way to quick-verify a

Re: Hive sample test

2013-03-05 Thread Joey D'Antoni
Just add a limit 1 to the end of your query.

On Mar 5, 2013, at 1:45 PM, Kyle B wrote:
> Hello,
>
> I was wondering if there is a way to quick-verify a Hive query before it is
> run against a big dataset? The tables I am querying against have millions of
> records, and I'd like to verify m