Actually, I'll probably just end up computing positions to use, rather than pasting in a schema, but the general point is that I'd love to do it some other way, because little hacks like these make my data pipeline feel fragile.
I'm willing to write some Java if anyone could point me in the right direction.

-Mason

On Tue, Jan 15, 2013 at 2:23 PM, Mason <[email protected]> wrote:
> I have TSVs with a lot of columns, and I would like to address them by
> name, as specified in the header line (first row), within Pig.
>
> The best I can come up with a.t.m is to write a script that strips the
> header line from the file and converts it to the form (col1:string,
> col2:string, ...), then plug that schema string into the AS portion of
> my LOAD statement. Then I'll project columns I want and manually
> typecast them.
>
> Is there a better, simple way?
>
> -Mason
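
For reference, the header-to-schema script described in the quoted message could look roughly like the sketch below. It is only an illustration under a few assumptions: the function name pig_schema_from_header is made up here, every column is given the type chararray (Pig's name for a string type, rather than "string"), and header names are rewritten so that they start with a letter and contain only letters, digits, and underscores. It does not remove the header row from the data file and does not handle duplicate column names.

```python
import re


def pig_schema_from_header(header_line, default_type="chararray"):
    """Build a schema string for the AS clause of a Pig LOAD statement
    from the tab-separated header line of a TSV file.

    Hypothetical helper for illustration; not part of Pig itself.
    """
    names = header_line.rstrip("\r\n").split("\t")
    fields = []
    for name in names:
        # Replace anything that is not a letter, digit, or underscore.
        alias = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
        # Assumption: aliases should start with a letter, so prefix otherwise.
        if not re.match(r"[A-Za-z]", alias):
            alias = "c_" + alias
        fields.append(f"{alias}:{default_type}")
    return "(" + ", ".join(fields) + ")"


if __name__ == "__main__":
    # Example header with a space and a leading digit in the column names.
    print(pig_schema_from_header("user id\tsigned_up\t2012 total\n"))
```

The returned string is what would be pasted into (or substituted into) the AS portion of the LOAD statement; the columns would still need to be projected and cast by hand afterwards, as in the original plan.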
