On Fri, Feb 18, 2011 at 6:58 AM, Radek Maciaszek
<radek.macias...@gmail.com> wrote:
> Hello,
> I was wondering if anyone managed to unit test Hive scripts and share
> his/her experience? My first thought was to prepare sample data, run hive
> scripts in order to generate output and then compare the generated output
> with the expected output. Sounds fairly simple but it may be a bit
> complicated if the data is read from S3 and stored in S3.
> I was also wondering if anyone managed to run the tests on EMR? I found this
> simple framework which may help with testing EMR:
> http://entxtech.blogspot.com/2010/10/how-to-unit-test-apache-hive-scripts.html
> However I am tempted to run tests on a real EMR rather than doing it
> locally.
> I am planning to integrate those tests with Jenkins (formerly Hudson).
> Many thanks,
> Radek

The process you described of diffing output is exactly how Hive's
current unit testing works. Its upside is that it is good for
catching regressions; the downside is that it is not really
programmatic. Look for the .q files in the Hive source and their
corresponding .q.out files under results.
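
For illustration only, here is a rough sketch of that diff-based approach in Python. It is not Hive's actual test harness. The function names and the script/expected-file layout are hypothetical, and it assumes a `hive` CLI on the PATH that accepts `-S` (silent) and `-f <file>`. The runner is passed in as a parameter so the comparison can be tried without a cluster.

```python
import difflib
import subprocess
from pathlib import Path


def run_hive_script(script_path):
    """Run a script through the `hive` CLI and return its stdout.

    Assumption: `hive` is installed locally and on the PATH.
    """
    result = subprocess.run(
        ["hive", "-S", "-f", str(script_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the script itself fails
    )
    return result.stdout


def diff_against_expected(actual, expected):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


def check_script(script_path, expected_path, runner=run_hive_script):
    """Run one script and diff its output against a stored expected file."""
    expected = Path(expected_path).read_text()
    actual = runner(script_path)
    return diff_against_expected(actual, expected)
```

A Jenkins job could call `check_script` for each script and fail the build when the returned list is non-empty. Output ordering may differ between runs unless the query sorts its results, so the expected files are most stable for queries with an explicit ORDER BY.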

Edward
