A quick and dirty way to do this would be to use some kind of
preprocessor. To avoid writing one yourself, you could use an existing
one, e.g. the one from GCC, with just a little help from sed:

    gcc -E -x c query.hql -o- | sed '/^#/d' > preprocessed.hql
    hive -f preprocessed.hql

Here query.hql can contain, for example, something like:

    SELECT * FROM (
        #include "subquery.hql"
    ) t
    WHERE id = 1;

The includes can be nested and repeated as much as necessary. As a bonus,
you could also use #define for repeated parts of code and/or #ifdef to
build different queries based on parameters passed to gcc ;-)
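
As a rough sketch of the #define and #ifdef idea (the file contents, the
table name "users" and the macro names SOURCE_TABLE and ONLY_FIRST are made
up for illustration, they are not from any real schema), something along
these lines should work. It anchors the sed pattern to the start of the
line so that only the preprocessor's linemarker lines are dropped:

```shell
#!/bin/sh
# Illustrative sketch: all file, table and macro names here are hypothetical.

# Subquery kept in its own file so that other queries can include it too.
cat > subquery.hql <<'EOF'
SELECT id, name FROM SOURCE_TABLE
EOF

# Enclosing query: defines a macro, includes the subquery, and has an
# optional WHERE clause that is only emitted when ONLY_FIRST is defined.
cat > query.hql <<'EOF'
#define SOURCE_TABLE users
SELECT * FROM (
#include "subquery.hql"
) t
#ifdef ONLY_FIRST
WHERE id = 1
#endif
;
EOF

# -DONLY_FIRST defines the macro from the command line; leave it out to
# build the query without the WHERE clause. sed removes the linemarker
# lines (those starting with '#') that gcc -E adds to its output.
gcc -E -x c -DONLY_FIRST query.hql -o- | sed '/^#/d' > preprocessed.hql

cat preprocessed.hql
```

The result would then be run with hive -f preprocessed.hql as above. Keep in
mind that the file is tokenized as C, so any word in the query that happens
to match a defined macro gets replaced as well.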

Best regards,
Jan Dolinar


On Thu, Jun 20, 2013 at 10:09 PM, Bertrand Dechoux <decho...@gmail.com> wrote:

> I am afraid that there is no automatic way of doing so. But that would be
> the same answer whether the question is about hive or any relational
> database.
> (I would be glad to have counter examples.)
>
> You might want to look at Oozie in order to manage the workflow. But the
> creation of the workflow is indeed manual.
> http://oozie.apache.org/
>
> Regards
>
> Bertrand
>
>
>
>
> On Thu, Jun 20, 2013 at 9:59 PM, Sha Liu <lius...@hotmail.com> wrote:
>
>> Hi,
>>
>> While working on some complex queries with multiple levels of subqueries,
>> I'm wondering whether it is possible in Hive to refactor these subqueries
>> into different files and instruct the enclosing query to execute these
>> files. This way these subqueries could potentially be reused by other
>> queries or just run by themselves.
>>
>> Thanks,
>> Sha Liu
>>
>
>
>
> --
> Bertrand Dechoux
>
