This is a common concern. There is an MR jira open for exactly this:
https://issues.apache.org/jira/browse/MAPREDUCE-2076
One way I use to find which inputs went to a map task is as below:
a) Get the input split locations from the task log;
b) Go to the location and from the data node logs grep for the a
Hi, Aniket,
Does myLoader implement LoadMetadata? If it does, what schema does it
return? I suspect that your schema for the bag does not set the
twoLevelAccess flag (though we are working to drop it in 0.9).
Daniel
Aniket Mokashi wrote:
Hi,
I have a custom loader that creates and returns a tuple of i
Hi,
I have a custom loader that creates and returns a tuple of id, bags. I
want to open these bags and get their contents.
For example-
data = load 'loc' using myLoader() as (id, bag1, bag2);
bag1Content = foreach data generate FLATTEN(bag1);
This works.
But when I do bag1Content = foreach data g
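For reference, here is the working pattern above spelled out as a minimal, self-contained sketch. The AS clause and the inner field names are assumptions for illustration — in the original mail the schema comes from myLoader() itself:

```pig
-- assumed schema for illustration; the real schema is reported by myLoader()
data = LOAD 'loc' USING myLoader()
       AS (id: chararray,
           bag1: bag{t1: (f1: chararray)},
           bag2: bag{t2: (f2: chararray)});

-- "opening" a bag: FLATTEN emits one output row per element of the bag
bag1Content = FOREACH data GENERATE id, FLATTEN(bag1);
```

Note the two-level form of the bag declarations (bag -> tuple -> fields), which is what the twoLevelAccess discussion in the reply refers to.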
Hello all, I've been scratching my head over a problem with a pig script,
and I'm hoping another set of eyeballs will help. I'm using pig 0.8, in
local mode.
Here's my simplified use case:
I have a log file with events on pages, and the id of the event can be a
user's login or a user's numeri
Ok thanks
On Wed, Feb 16, 2011 at 3:35 PM, Ramesh, Amit wrote:
>
> No, joins are only possible on fields common to all the aliases in the
> join.
>
>
> On 2/16/11 2:56 PM, "sonia gehlot" wrote:
>
> > Thank you very much Amit.
> >
> > One more question in the same way if I want to join multiple
No, joins are only possible on fields common to all the aliases in the join.
On 2/16/11 2:56 PM, "sonia gehlot" wrote:
> Thank you very much Amit.
>
> One more question in the same way if I want to join multiple tables
>
>> select blah, blah
>> From
>> page_events pe
> Left Join referrer r
Thank you very much Amit.
One more question along the same lines: what if I want to join multiple tables?
select blah, blah
From
page_events pe
Left Join referrer ref
on ref.id = pe.id
Left Join page_events pe_pre
on pe.day = pe_pre.day
And pe.session_id = pe_pre.session_id
And pe.pag
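In Pig the multi-table version becomes a chain of two-way joins, since an outer join cannot mix different keys across aliases in a single statement. A sketch under assumptions — the load paths and schemas below are invented for illustration, and the truncated last predicate of the query is left out:

```pig
-- assumed schemas and paths; adjust to the real data
pe     = LOAD 'page_events' AS (id: chararray, day: chararray,
                                session_id: chararray, page_seq_num: int);
ref    = LOAD 'referrer'    AS (id: chararray, source: chararray);
pe_pre = LOAD 'page_events' AS (id: chararray, day: chararray,
                                session_id: chararray, page_seq_num: int);

-- Left Join referrer ref on ref.id = pe.id
pe_ref = JOIN pe BY id LEFT OUTER, ref BY id;

-- Left Join page_events pe_pre on pe.day = pe_pre.day
--                             and pe.session_id = pe_pre.session_id
result = JOIN pe_ref BY (pe::day, pe::session_id) LEFT OUTER,
         pe_pre BY (day, session_id);
```

After the first join, fields from pe are referenced with the disambiguated names pe::day, pe::session_id, and so on.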
You can just do:
join_pe_pre = JOIN page_events BY (day, session_id, page_seq_num) LEFT OUTER,
              page_events_pre BY (day, session_id, page_seq_num + 1);
Amit
On 2/16/11 2:09 PM, "sonia gehlot" wrote:
> Hi All,
>
> I am new to Hadoop and I started exploring Pig since last month. I have few
> q
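Amit's one-liner above, expanded into a self-contained sketch. The join keys are his; the load paths and exact schemas are assumptions:

```pig
-- both relations come from the same data; schemas assumed for illustration
page_events     = LOAD 'page_events' AS (day: chararray, session_id: chararray,
                                         page_seq_num: int);
page_events_pre = LOAD 'page_events' AS (day: chararray, session_id: chararray,
                                         page_seq_num: int);

-- LEFT OUTER join each event to the preceding event in the same session:
-- a row with page_seq_num = n matches a pre row with page_seq_num = n - 1
join_pe_pre = JOIN page_events BY (day, session_id, page_seq_num) LEFT OUTER,
              page_events_pre BY (day, session_id, page_seq_num + 1);
```

Pig allows expressions such as page_seq_num + 1 as join keys, which is what makes this self-join-with-offset a one-liner.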
Hi All,
I am new to Hadoop and started exploring Pig last month. I have a few
questions. I have to replicate some SQL queries that use left joins in Pig, for
example:
select blah, blah
From
page_events pe
Left Join page_events pe_pre
on pe.day = pe_pre.day
And pe.session_id = pe_pre.session_id
An
Thanks for the info, guys! Will look into using a recent snapshot.
Thanks!
Amit
On 2/16/11 11:46 AM, "Daniel Dai" wrote:
> Yes, it is fixed by PIG-998. Doing a describe on trunk will get:
>
> data: {f0: chararray,b1::t1: (f1: chararray,f2: int),b3: {(f3: chararray)}}
>
> Daniel
>
> Alan Ga
Yes, it is fixed by PIG-998. Doing a describe on trunk will get:
data: {f0: chararray,b1::t1: (f1: chararray,f2: int),b3: {(f3: chararray)}}
Daniel
Alan Gates wrote:
The issue here is that describe is incorrectly removing the second
level of tuple, even though dump is doing the right thing.
"no codec" means you didn't get LZO set up correctly on the cluster.
There are instructions on the wiki of the hadoop-lzo googlecode project.
D
On Wed, Feb 16, 2011 at 11:05 AM, Kris Coward wrote:
>
> After a bunch of fiddling around (including some pretty heavy use of the
> secretDebugCmd--than
After a bunch of fiddling around (including some pretty heavy use of the
secretDebugCmd--thanks), I finally got the LzoTokenizedStorage working,
but now I'm having problems with the LzoTokenizedLoader.
I'm still using pig 0.8.0-CDH3B4-SNAPSHOT, and for storage have only
had luck with
This may be better asked on one of the other hadoop lists, but as the job in
question is done with Pig I thought I would start here. I have a nightly job
that runs against around 1000 gzip log files. Around once a week one of the
map tasks will fail reporting some form of gzip error/corruption