I do that kind of streaming on HDFS files using Hadoop Streaming, outside of 
Pig. I assume you could do it from inside Pig too, but I haven’t tested that.

 

William F Dowling

Sr Technical Specialist, Software Engineering

Thomson Reuters

0 +1 215 823 3853

 

From: Moore, Michael A. [mailto:[email protected]] 
Sent: Tuesday, June 07, 2011 3:14 PM
To: [email protected]
Subject: Re: Loading Files with Comment Lines

 

Possibly.  Can I do that if the file is already in HDFS?

______________________________________

Michael Moore :: [email protected] <mailto:[email protected]> 

The Johns Hopkins University Applied Physics Laboratory

0B7B17EE1AE2A80B pgp

BC31 A861 9726 8211 F79F 7E21 0B7B 17EE 1AE2 A80B pgp fingerprint

 

 

On Jun 7, 2011, at 3:12 PM, <[email protected]> wrote:





Can you stream it through

 grep -v '^#'



?
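
As a rough illustration of that suggestion, the sketch below runs the same filter locally on a few rows modeled on the sample in the original question. The helper name strip_comments and the inline sample are assumptions made for this example only; nothing here touches HDFS or Pig.

```shell
#!/bin/sh
# strip_comments: hypothetical helper name, not part of Pig or Hadoop.
# It drops every line whose first character is '#'.
strip_comments() {
    grep -v '^#' "$@"
}

# Sample rows modeled on the data in the original question (assumed layout).
sample='# Data Source: Project A
# SenderId      RecipientId
1          2
3          5
#2        1
3          6'

# Only the rows that do not start with '#' are printed.
printf '%s\n' "$sample" | strip_comments
```

Note that the pattern is anchored at the start of the line, so a line with whitespace before the pound sign would pass through unfiltered. Also, grep -v exits with status 1 when every line is filtered out, which may matter if the command runs inside a job that treats a non-zero exit as a failure.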



William F Dowling

Sr Technical Specialist, Software Engineering

Thomson Reuters

0 +1 215 823 3853



From: Moore, Michael A. [mailto:[email protected]] 
Sent: Tuesday, June 07, 2011 3:04 PM
To: [email protected]
Subject: Loading Files with Comment Lines



Hello all-



I've got a quick question and Google isn't proving to be much help.



I've got a big file that has a few lines in it prefixed with a pound sign (#) 
to indicate they are to be ignored.  I would like to LOAD this file using 
PigStorage.  Is there a way to do this, or is it handled automatically?



The data might look something like this:



# Data Source: Project A
# Contact MMoore with Questions
# SenderId      RecipientId
1          2
3          5
6          7
#2        1
3          6
11        7



Thanks!

-Michael









 
