You can write a UDF. If it throws an exception from the UDF that will end
your hive job.

On Thu, May 26, 2011 at 5:46 PM, Igor Tatarinov <i...@decide.com> wrote:

> Here is one example. I want to make sure I don't have negative prices in my
> data. I would like to write something like:
>
> assert_empty(select * from Prices where price <= 0);
>
> as part of my Hive script. I expect my job to fail if there are negative
> prices in the data.
>
> I understand there is high cost to doing this but some of our tables are
> not that big and any kind of data checks would help.
>
>
> On Thu, May 26, 2011 at 2:18 PM, Alex Kozlov <ale...@cloudera.com> wrote:
>
>> 1) Would `select count(1) from (query)` do the same thing?  I am a bit
>> confused what is the semantic of assert: is it just no rows or some kind of
>> syntax error check?
>> 2) Hive is not an OLTP and is not optimized for single row inserts (or
>> updates for this matter).  In a trivial implementation one would just do
>> copy-on-write, i.e. overwrite the whole data file, or add a small file
>> containing one or a few rows.  I do not think indices have been implemented
>> yet either.
>> 3) You can probably do what you want with HBase (if you explain what you
>> want to do in more detail)
>>
>>
>> On Thu, May 26, 2011 at 2:00 PM, Igor Tatarinov <i...@decide.com> wrote:
>>
>>> I would like to implement some kind of assert functionality in Hive QL.
>>>
>>> Here is how I do it in MySQL. I can assert that a given query returns no
>>> (bad) rows by creating a table with one row containing '1' and a unique
>>> index. Then, I try to insert into that table select 1 from (query). If the
>>> query returns something, I have an assert failure. I've found such asserts
>>> very helpful in catching data bugs early.
>>>
>>> Can I do something similar in Hive? I suppose I could implement a UDF
>>> that fails if it gets executed. This way I can ensure there is no records
>>> matching a given query.
>>> That doesn't sound too bad but perhaps there is a cleaner solution.
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>
>

Reply via email to