Re: Run queries from external files as subqueries

Jan Dolinár Thu, 20 Jun 2013 13:56:09 -0700

A quick and dirty way to do this would be to use some kind of
preprocessor. To avoid writing one, you could use an existing one, e.g. the
one from GCC, with just a little help from sed:


    gcc -E -x c query.hql -o- | sed '/#/d' > preprocessed.hql
    hive -f preprocessed.hql

Here query.hql could contain, for example, something like:

    SELECT * FROM (
        #include "subquery.hql"
    ) t
    WHERE id = 1;

The includes can be nested and repeated as often as necessary. As a bonus,
you could also use #define for repeated parts of code and/or #ifdef to
build different queries based on parameters passed to gcc ;-)

Best regards,
Jan Dolinar


On Thu, Jun 20, 2013 at 10:09 PM, Bertrand Dechoux <decho...@gmail.com> wrote:

> I am afraid that there is no automatic way of doing so. But the answer
> would be the same whether the question is about Hive or any relational
> database.
> (I would be glad to see counterexamples.)
>
> You might want to look at Oozie in order to manage the workflow. But the
> creation of the workflow is indeed manual.
> http://oozie.apache.org/
>
> Regards
>
> Bertrand
>
>
>
>
> On Thu, Jun 20, 2013 at 9:59 PM, Sha Liu <lius...@hotmail.com> wrote:
>
>> Hi,
>>
>> While working on some complex queries with multiple levels of subqueries,
>> I'm wondering if it is possible in Hive to refactor these subqueries into
>> different files and instruct the enclosing query to execute these files.
>> This way these subqueries can potentially be reused by other queries or
>> just run by themselves.
>>
>> Thanks,
>> Sha Liu
>>
>
>
>
> --
> Bertrand Dechoux
>
