Re: Maintaining big and complex Hive queries

2016-12-21 Thread Edward Capriolo
I have been contemplating attaching metadata for the query lineage to each table, so that I can know where the data came from and have a one-click regenerate button. On Wed, Dec 21, 2016 at 3:02 PM, Stephen Sprague wrote: > my 2 cents. :) > > as soon as you say "complex query" i would submit you
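
One way to picture the lineage idea in this message is the minimal sketch below. It is an illustration only, not anything posted in the thread: the `LINEAGE` registry, the `regenerate` helper and all table names are hypothetical, and Python's built-in `sqlite3` stands in for Hive so the example runs locally. Each derived table records its upstream tables and the SELECT that builds it, and regenerating a table rebuilds its recorded ancestors first.

```python
import sqlite3

# Hypothetical lineage registry: table -> (upstream tables, SELECT that builds it).
# Tables that do not appear as keys (raw_orders here) are treated as base data.
LINEAGE = {
    "clean_orders": (
        ("raw_orders",),
        "SELECT user_id, amount FROM raw_orders WHERE amount > 0",
    ),
    "user_totals": (
        ("clean_orders",),
        "SELECT user_id, SUM(amount) AS total FROM clean_orders GROUP BY user_id",
    ),
}


def regenerate(conn, table, lineage, rebuilt=None):
    """Rebuild `table` after rebuilding every recorded upstream table, depth first."""
    rebuilt = [] if rebuilt is None else rebuilt
    if table not in lineage or table in rebuilt:
        return rebuilt  # base table, or already rebuilt during this run
    upstream, select_sql = lineage[table]
    for parent in upstream:
        regenerate(conn, parent, lineage, rebuilt)
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} AS {select_sql}")
    rebuilt.append(table)
    return rebuilt


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (user_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", [(1, 10.0), (1, 5.0)])

# The "regenerate button": one call rebuilds the whole recorded chain.
order = regenerate(conn, "user_totals", LINEAGE)
first_total = conn.execute("SELECT total FROM user_totals").fetchone()[0]

# New raw data arrives; regenerating refreshes every derived table.
conn.execute("INSERT INTO raw_orders VALUES (1, 2.5)")
regenerate(conn, "user_totals", LINEAGE)
second_total = conn.execute("SELECT total FROM user_totals").fetchone()[0]
```

In a real Hive deployment the registry would have to live somewhere durable alongside the tables; where exactly is left open here, as it is in the message.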

Re: Maintaining big and complex Hive queries

2016-12-21 Thread Stephen Sprague
my 2 cents. :) as soon as you say "complex query" i would submit you've lost the upper hand and you're behind the eight-ball right off the bat. And you know this too, otherwise you wouldn't have posted here. ha! i use cascading CTAS statements so that i can examine the intermediate tables. Anothe
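
The cascading CTAS approach mentioned in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the poster's actual setup: the stage names, the sample data and the `run_stages` helper are hypothetical, and Python's built-in `sqlite3` stands in for Hive so the example runs locally. Each stage is materialized with `CREATE TABLE ... AS SELECT` and reads from the table the previous stage produced, so every intermediate result can be queried on its own.

```python
import sqlite3

# Hypothetical stages: (table name, SELECT that builds it).
# Each later stage reads from a table created by an earlier one.
STAGES = [
    ("stage1_clean",
     "SELECT user_id, amount FROM raw_orders WHERE amount > 0"),
    ("stage2_totals",
     "SELECT user_id, SUM(amount) AS total FROM stage1_clean GROUP BY user_id"),
    ("stage3_big_spenders",
     "SELECT user_id, total FROM stage2_totals WHERE total >= 100"),
]


def run_stages(conn, stages):
    """Materialize each stage as its own table and return the row count per stage."""
    counts = {}
    for name, select_sql in stages:
        conn.execute(f"DROP TABLE IF EXISTS {name}")
        conn.execute(f"CREATE TABLE {name} AS {select_sql}")
        counts[name] = conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_orders (user_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?)",
    [(1, 60.0), (1, 50.0), (2, 30.0), (3, -5.0)],
)

counts = run_stages(conn, STAGES)
print(counts)

# Any intermediate table can be examined directly when a result looks wrong.
totals = conn.execute(
    "SELECT user_id, total FROM stage2_totals ORDER BY user_id"
).fetchall()
print(totals)
```

The trade-off is extra storage and write time for the intermediate tables in exchange for being able to inspect each step separately instead of debugging one large nested query.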

Re: Maintaining big and complex Hive queries

2016-12-21 Thread Saumitra Shahapure
Hi Elliot, Thanks for letting me know. HPL-SQL sounded particularly interesting. But in the documentation I could not see any way to pass output generated by one Hive query to the next one. The tool looks good as a homogeneous PL-SQL platform for multiple big-data systems (http://www.hplsql.org/ab

Re: Maintaining big and complex Hive queries

2016-12-15 Thread Elliot West
I notice that HPL/SQL is not mentioned on the page I referenced; however, I expect it is another approach that you could use to modularise: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=59690156 http://www.hplsql.org/doc On 15 December 2016 at 17:17, Elliot West wrote: > Som

Re: Maintaining big and complex Hive queries

2016-12-15 Thread Elliot West
Some options are covered here, although there is no definitive guidance as far as I know: https://cwiki.apache.org/confluence/display/Hive/Unit+Testing+Hive+SQL#UnitTestingHiveSQL-Modularisation On 15 December 2016 at 17:08, Saumitra Shahapure < saumitra.offic...@gmail.com> wrote: > Hello, > > W