I notice that HPL/SQL is not mentioned on the page I referenced; however, I
expect it is another approach that you could use to modularise:

https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156
http://www.hplsql.org/doc

On 15 December 2016 at 17:17, Elliot West <tea...@gmail.com> wrote:

> Some options are covered here, although there is no definitive guidance as
> far as I know:
>
> https://cwiki.apache.org/confluence/display/Hive/Unit+Testing+Hive+SQL#UnitTestingHiveSQL-Modularisation
>
> On 15 December 2016 at 17:08, Saumitra Shahapure <
> saumitra.offic...@gmail.com> wrote:
>
>> Hello,
>>
>> We are currently running and maintaining a large, complex Hive SELECT
>> query. It is basically a single SELECT query that JOINs the outputs of
>> about ten other SELECT queries.
>>
>> The simplest way to refactor it that we can think of is to break the
>> query down into multiple views and then join the views. Similarly, we
>> could create intermediate tables.
>>
>> However, creating multiple DDLs in order to maintain a single DML is not
>> very smooth. We would end up polluting the metadata database by creating
>> views / intermediate tables that are used in just this ETL.
>>
>> What other efficient ways are there to maintain complex SQL queries
>> written in Hive? Are there better ways to break a Hive query into
>> multiple modules?
>>
>> -- Saumitra S. Shahapure
>>
>
>
