Hello, I am looking for an efficient way to measure the performance of my Pig script in terms of time. I don't care about the overall execution time, but rather about the computation time for a single instance.
For example, let's assume that I have two separate instances and I want to see how long it takes to join them. How could I time this particular sub-process? Ideally, how could I time all of the individual joins (assuming I had a bunch of instances on one side and a single instance on the other) and then average those execution times? I tried writing my own UDF, but the values I got were inconsistent: either relatively large (approximately 2 minutes) or even negative. So, what's the best way to compute the execution time of a sub-process in Pig? Any help is very much appreciated.
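
To make the goal concrete, here is a minimal sketch in plain Python (outside Pig, not using any Pig API) of the measurement I have in mind: time each individual join and then average the results. The names `join_one` and `average_join_time` are hypothetical stand-ins, and the join itself is just a simple in-memory key match chosen for illustration. The sketch reads a monotonic clock (`time.perf_counter`), so within a single process the differences cannot come out negative.

```python
import time


def join_one(left, right):
    """Hypothetical stand-in for a single join: pair up records that share a key."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def average_join_time(instances, single):
    """Time each individual join and return (mean_seconds, per_join_seconds)."""
    if not instances:
        raise ValueError("need at least one instance to time")
    timings = []
    for instance in instances:
        start = time.perf_counter()  # monotonic, high-resolution clock
        join_one(instance, single)
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings), timings


if __name__ == "__main__":
    many = [[(1, "a"), (2, "b")], [(2, "c")], [(3, "d")]]
    one = [(2, "x")]
    mean, per_join = average_join_time(many, one)
    print("mean seconds per join:", mean)
    print("individual timings:", per_join)
```

What I am after is the equivalent of `average_join_time` for joins that Pig executes, where the work is spread over map and reduce tasks rather than running in one process.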
