Using the Hive sampling feature would also help. This is exactly what that feature is designed for.
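As a rough sketch of what that could look like (the table `page_views` and column `user_id` are hypothetical names for illustration, not taken from the question), a query can be pointed at a sample by adding a TABLESAMPLE clause after the table name:

```sql
-- Hypothetical table and column names, for illustration only.

-- Bucket sampling: keep roughly 1 row in 1000, chosen at random.
SELECT user_id, COUNT(*) AS views
FROM page_views TABLESAMPLE(BUCKET 1 OUT OF 1000 ON rand()) s
GROUP BY user_id;

-- Block sampling (only in Hive releases that support it):
-- read roughly 0.1% of the input data by size.
SELECT user_id, COUNT(*) AS views
FROM page_views TABLESAMPLE(0.1 PERCENT) s
GROUP BY user_id;
```

Two caveats, as far as I know: sampling on rand() still scans the whole table unless the table is bucketed on the sampled column, and in either case the query still runs as a MapReduce job. Sampling shrinks the data the job processes; it does not remove the job startup overhead.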
Chuck

From: Kyle B [mailto:kbi...@gmail.com]
Sent: Tuesday, March 05, 2013 1:45 PM
To: user@hive.apache.org
Subject: Hive sample test

Hello,

I was wondering if there is a way to quick-verify a Hive query before it is run against a big dataset? The tables I am querying against have millions of records, and I'd like to verify my Hive query before I run it against all records.

Is there a way to test the query against a small subset of the data, without going into full MapReduce? As silly as this sounds, is there a way to MapReduce without the overhead of MapReduce? That way I can check my query is doing what I want before I run it against all records.

Thanks,

-Kyle