After an offline discussion with Hong and others on this subject, it seems to make sense. +1
On 10/12/09 3:55 PM, "Hong Tang" <ht...@yahoo-inc.com> wrote:

HADOOP-6218 exposed the internal "Location" object as a global Record Sequence Number (RecNum). The feature is useful in a number of ways:
(1) it supports progress reporting for upper layers (object file, zebra);
(2) it lets a secondary index use RecNum as a cursor;
(3) it supports aligned splits across multiple parallel TFiles.

Given that TFile is still in the early stages of adoption, I suggest that we port the patch back to Hadoop 0.20/0.21 now.

-Hong