No one? I thought this would be the main point of reference for everyone working on real-time analytics.
How do you decide whether your script is fast enough to be used in a real-time environment? Any help is very much appreciated.

On Thu, May 23, 2013 at 11:37 AM, Adamantios Corais <[email protected]> wrote:

> Hello,
>
> I am looking for an efficient way to measure the performance of my Pig
> script in terms of time. I don't care about the overall execution time but
> rather about the computation time of a single instance.
>
> For example, let's assume that I have two separate instances and I want
> to see how long it takes to join them. How could I time this particular
> sub-process? Ideally, how could I time ALL the "individual" joins -
> assuming that I had a bunch of instances on one side and a single instance
> on the other side - and then average all these execution times?
>
> I tried to write my own UDF but unfortunately the values I got were either
> relatively big (approximately 2 minutes!!!) or even negative - inconsistent
> values!!!
>
> So, what's the best way to compute the execution time of a sub-process in Pig?
>
> Any help is very much appreciated.
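
To make the measurement asked about above concrete, here is a minimal sketch in plain Python (not Pig Latin, and not the UDF mentioned in the quoted message). It times each "individual" join of many instances against a single instance and averages the results. The names `join_pair` and `average_join_time` and the sample data are hypothetical, invented only for illustration. The sketch assumes all timestamps are taken inside one process with a monotonic clock, where a difference cannot be negative. Timestamps taken on different machines or in different processes are not directly comparable, which may be one source of inconsistent values.

```python
import time


def join_pair(instance, single, key):
    """Hypothetical stand-in for one 'individual' join of two records on `key`."""
    if instance[key] == single[key]:
        merged = dict(single)
        merged.update(instance)
        return merged
    return None


def average_join_time(instances, single, key):
    """Time every individual join separately; return (mean seconds, all durations)."""
    durations = []
    for instance in instances:
        # perf_counter is monotonic within one process, so end - start is never negative.
        start = time.perf_counter()
        join_pair(instance, single, key)
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations), durations


# Made-up sample data: a bunch of instances on one side, a single instance on the other.
instances = [{"id": i % 3, "value": i} for i in range(10)]
single = {"id": 1, "label": "reference"}

mean_s, durations = average_join_time(instances, single, "id")
print(f"mean per-join time: {mean_s * 1e6:.2f} microseconds over {len(durations)} joins")
```

This only measures the join logic itself. In a real Pig job, scheduling, serialization, and data movement between tasks would also contribute to the time observed, and this sketch does not capture them.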
