Hi guys,

I'm looking for a way to generate a common ID for all jobs generated from the same query. I'm aware of 2 possible options (described below), which are somewhat problematic.
Are you aware of a way to achieve this in current or future versions?

Thanks
Dudu

1. Setting the job name:

    set mapred.job.name=demo 1;
    select count(*) from (select 1) t;

    ID:   application_1469828525963_122782 <http://lvshdc2en0007.lvs.paypal.com:8088/cluster/app/application_1469828525963_122782>
    User: dmarkovitz
    Name: demo 1

The downside:
* I'm losing the stage information

2. Adding a comment before the query:

    -- demo 2
    select count(*) from (select 1) t

    ID:   application_1469828525963_122812 <http://lvshdc2en0007.lvs.paypal.com:8088/cluster/app/application_1469828525963_122812>
    User: dmarkovitz
    Name: -- demo 2 select count(*) from (select 1) t(Stage-1)

The downsides:
* The current behavior for determining the job name is not guaranteed
* It requires adding extra text to all queries
* The job name contains undesired text (the prefix of the query)
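
To make the grouping I'm after concrete, here is a rough Python sketch of what I would do on the consuming side with option 2. It assumes the job name keeps the shape seen above, i.e. a query prefix followed by a "(Stage-N)" suffix, which is exactly the part that is not guaranteed. The function names (split_job_name, group_by_query) are made up for illustration and are not part of Hive or YARN.

```python
import re
from collections import defaultdict

# Assumed shape of the default job name, taken from the example above:
# "<query prefix>(Stage-N)". Hive does not guarantee this format.
_STAGE_SUFFIX = re.compile(r"^(?P<prefix>.*)\((?P<stage>Stage-\d+)\)$", re.DOTALL)


def split_job_name(name):
    """Split a job name into (prefix, stage); stage is None if no suffix is found."""
    match = _STAGE_SUFFIX.match(name)
    if match is None:
        # E.g. a name set explicitly via mapred.job.name carries no stage suffix.
        return name, None
    return match.group("prefix"), match.group("stage")


def group_by_query(job_names):
    """Group job names by their shared prefix, collecting the stage of each job."""
    groups = defaultdict(list)
    for name in job_names:
        prefix, stage = split_job_name(name)
        groups[prefix].append(stage)
    return dict(groups)
```

With option 1 every job gets the same name, so split_job_name returns the name with no stage, which is the stage information being lost. With option 2 the stage survives, but the grouping key is the comment plus the query prefix rather than a clean ID.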