I notice that HPL/SQL is not mentioned on the page I referenced; however, I expect it is another approach that you could use to modularise:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156
http://www.hplsql.org/doc

On 15 December 2016 at 17:17, Elliot West <tea...@gmail.com> wrote:

> Some options are covered here, although there is no definitive guidance as
> far as I know:
>
> https://cwiki.apache.org/confluence/display/Hive/Unit+Testing+Hive+SQL#UnitTestingHiveSQL-Modularisation
>
> On 15 December 2016 at 17:08, Saumitra Shahapure <saumitra.offic...@gmail.com> wrote:
>
>> Hello,
>>
>> We are running and maintaining quite big and complex Hive SELECT query
>> right now. It's basically a single SELECT query which performs JOIN of
>> about ten other SELECT query outputs.
>>
>> A simplest way to refactor that we can think of is to break this query
>> down into multiple views and then join the views. There is similar
>> possibility to create intermediate tables.
>>
>> However creating multiple DDLs in order to maintain a single DML is not
>> very smooth. We would end up polluting metadata database by creating views
>> / intermediate tables which are used in just this ETL.
>>
>> What are the other efficient ways to maintain complex SQL queries written
>> in Hive? Are there better ways to break Hive query into multiple modules?
>>
>> --
>> Saumitra S. Shahapure