Thanks, Bill. Can you please point me to the Ambrose code that uses PPNL? I will open a JIRA for adding hooks into explain.
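For the JIRA, here is a rough sketch of the kind of listener hook Bill describes in the quoted thread. None of this is existing Pig API: ExplainListener, RecordingListener and ExplainHookSketch are names made up for illustration, the plans are plain strings standing in for Pig's real plan objects, and the listener is registered directly rather than through a configured param. It only shows the shape of the idea: explain builds the three plans, then hands each one to any registered listener before printing.

```java
import java.util.ArrayList;
import java.util.List;

// HYPOTHETICAL listener; not part of Pig.
// Plans are plain Strings here, standing in for Pig's real plan objects.
interface ExplainListener {
    void logicalPlanGenerated(String alias, String plan);
    void physicalPlanGenerated(String alias, String plan);
    void executionPlanGenerated(String alias, String plan);
}

// Example listener that only records which callbacks fired, in order.
final class RecordingListener implements ExplainListener {
    final List<String> events = new ArrayList<>();

    @Override
    public void logicalPlanGenerated(String alias, String plan) {
        events.add("logical:" + alias);
    }

    @Override
    public void physicalPlanGenerated(String alias, String plan) {
        events.add("physical:" + alias);
    }

    @Override
    public void executionPlanGenerated(String alias, String plan) {
        events.add("execution:" + alias);
    }
}

// Stand-in for the plan-building part of PigServer.explain(..); not Pig code.
final class ExplainHookSketch {
    private final List<ExplainListener> listeners = new ArrayList<>();

    void addListener(ExplainListener listener) {
        listeners.add(listener);
    }

    void explain(String alias) {
        // Placeholders for the three plans generated during explain.
        String logical = "<logical plan for " + alias + ">";
        String physical = "<physical plan for " + alias + ">";
        String execution = "<execution plan for " + alias + ">";

        // Hand each plan to every listener before anything is printed.
        for (ExplainListener listener : listeners) {
            listener.logicalPlanGenerated(alias, logical);
            listener.physicalPlanGenerated(alias, physical);
            listener.executionPlanGenerated(alias, execution);
        }
        // The real method would go on to print each plan here.
    }

    public static void main(String[] args) {
        RecordingListener recorder = new RecordingListener();
        ExplainHookSketch server = new ExplainHookSketch();
        server.addListener(recorder);
        server.explain("relation");
        System.out.println(recorder.events);
    }
}
```

A real version would presumably pass the plan objects themselves so a listener can walk operators and I/O paths without parsing the printed explain output.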
Sent from my iPhone

On Jan 22, 2013, at 9:03 PM, Bill Graham <[email protected]> wrote:

> Yeah, getting at the info here is tricky. For Ambrose we're getting info
> about submitted jobs, so we can just hook into the lifecycle of
> PigProgressNotificationListener. The PPNL notifiers are pretty coupled to
> PigStatsUtil and ScriptState, which aren't invoked during explain.
>
> The bulk of the action for explain all happens in the PigServer.explain(..)
> method. That's where the logical plan, physical plan and execution plan are
> generated before explain gets called on each to print the output. We could
> look to add some sort of listener interface and hook here perhaps that gets
> each of these passed during explain via a configured param.
>
> On Tue, Jan 22, 2013 at 3:05 PM, Jonathan Coveney <[email protected]> wrote:
>
>> I think that this is all available, it's just not the easiest thing to get
>> at. If you look at the explain plan, it has a lot of this info, and you can
>> definitely get at that info. I'm not sure if it has the reducers or if
>> that's post MR setup, but you should be able to.
>>
>> That said, I do not think it would hurt to have hooks in to more clearly
>> do something with this info. Bill had to do stuff like this for Ambrose, so
>> maybe he can weigh in on what that could look like.
>>
>> 2013/1/22 Prashant Kommireddi <[email protected]>
>>
>>> Jon/others - any pointers on this? I would like to patch in hooks if this
>>> is not possible at the moment.
>>>
>>> -Prashant
>>>
>>> On Mon, Jan 21, 2013 at 5:47 PM, Prashant Kommireddi <[email protected]> wrote:
>>>
>>>> At the moment, basically info on I/O paths, operators used (group by,
>>>> foreach ..), job level info such as number of reducers etc.
>>>>
>>>> On Mon, Jan 21, 2013 at 5:30 PM, Jonathan Coveney <[email protected]> wrote:
>>>>
>>>>> What level of information would you like? IE if you do "explain relation,"
>>>>> which of the three do you want to hook into?
>>>>>
>>>>> 2013/1/21 Prashant Kommireddi <[email protected]>
>>>>>
>>>>>> Been coding with the APIs and wondering if there is anything that allows
>>>>>> you to only retrieve the operators, I/O paths etc without actually issuing
>>>>>> an execute or a store? Basically, being able to get information
>>>>>> post-parsing of the script but pre-execution.
>>>>>>
>>>>>> Thanks,
>>>>>> Prashant
>
> --
> *Note that I'm no longer using my Yahoo! email address. Please email me at
> [email protected] going forward.*
