sure :-)
On Tue, Jan 25, 2011 at 5:54 PM, Dmitriy Ryaboy wrote:
> I do it pre-pig.
> I think this has to be handled at the RecordReader level if you wanted to do
> it in the framework.
>
> Hey, want to contribute to the error handling design discussion? :) We
> haven't thought about LoadFuncs y
I do it pre-pig.
I think this has to be handled at the RecordReader level if you wanted to do
it in the framework.
Hey, want to contribute to the error handling design discussion? :) We
haven't thought about LoadFuncs yet..
http://wiki.apache.org/pig/PigErrorHandlingInScripts
On Tue, Jan 25, 201
You're right. There are two issues here. First, the Jython script needs to
locate the modules in its search path (e.g. python.path). If you have the right
environment variable set, the Jython script should be able to find and import the module.
Second, Pig currently doesn't automatically ship the module file
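A minimal sketch of the first issue, in plain Python rather than inside a Pig UDF: an import only succeeds once the module's directory is on the interpreter's search path (sys.path, which Jython populates from python.path). The module name helper_mod and the temporary directory are hypothetical, created just for this demonstration. The sketch does not address the second issue, shipping the module file to the cluster.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical helper module, written to a temporary directory for the demo.
module_dir = Path(tempfile.mkdtemp())
(module_dir / "helper_mod.py").write_text("def double(x):\n    return 2 * x\n")


def try_import(name):
    """Return the imported module, or None if it is not on the search path."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


before = try_import("helper_mod")   # directory is not on sys.path yet
sys.path.append(str(module_dir))    # roughly what a python.path entry does
importlib.invalidate_caches()       # make sure the new file is noticed
after = try_import("helper_mod")    # now the import can be resolved
```

Here `before` is None because the lookup fails, while `after` is the loaded module once its directory has been added to the search path.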
Do you catch the error when you load with pig, or is that a pre-pig step?
If I wanted to catch the error in a pig load, is it possible? Where would
that code go?
-Kim
On Tue, Jan 25, 2011 at 4:44 PM, Dmitriy Ryaboy wrote:
> Yeah so the unexpected EOF is the most common one we get (lzo requires
Yeah so the unexpected EOF is the most common one we get (lzo requires a
footer, and sometimes filehandles are closed before a footer is written, if
the network hiccups or something).
Right now what we do is scan before moving to the DW, and if not successful,
extract what's extractable, catch the
This is the error I'm getting:
java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:99)
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:87)
Last time I checked, I don't think you could do a join on groups. But that was
about a year ago.
On Tue, Jan 25, 2011 at 12:49 PM, Neil Kodner wrote:
> I've created a relation by grouping on a composite key. I then join a
> similar relation using the grouped key as the join key.
>
> outgoing = FOREA
How badly compressed are they? Problems in the codec, or in the data that
comes out of the codec?
We've had some lzo corruption problems, and so far have simply been dealing
with that by doing correctness tests in our log mover pipeline before moving
into the "data warehouse" area.
Skipping bad f
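A rough sketch of the kind of pre-Pig correctness test described above, under stated assumptions: it targets gzip rather than lzo (the Python standard library has no lzo codec), and the function names is_readable_gzip and partition_files are made up for illustration, not part of Pig or Hadoop. Each file is decompressed to the end, and any file that fails partway is set aside so only the readable ones are handed to the load step.

```python
import gzip
import zlib
from pathlib import Path


def is_readable_gzip(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream at `path` decompresses cleanly."""
    try:
        with gzip.open(path, "rb") as stream:
            # Read to the end in chunks; truncated or corrupt files raise here.
            while stream.read(chunk_size):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError: bad header or CRC;
        # zlib.error: corrupt compressed data.
        return False


def partition_files(directory):
    """Split the *.gz files in `directory` into (good, bad) lists of paths."""
    good, bad = [], []
    for path in sorted(Path(directory).glob("*.gz")):
        (good if is_readable_gzip(path) else bad).append(path)
    return good, bad
```

Reading every file twice (once to validate, once in the job) costs extra I/O, but it keeps the failure handling outside the framework, which is the trade-off of checking before Pig runs.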
Hi,
I'm processing gzip-compressed files in a directory, but some files are
corrupted and can't be decompressed. Is there a way to skip the bad files
with a custom load func?
-Kim
Begin forwarded message:
From: Isabel Drost
Date: January 25, 2011 12:53:28 PM PST
To: "u...@mahout.apache.org"
Cc: "gene...@lucene.apache.org" , "gene...@hadoop.apache.org
" , "u...@hbase.apache.org" >, "solr-u...@lucene.apache.org" , "java-u...@lucene.apache.org
" , "u...@nutch.apache.or
Hi Daniel,
I did put jython.jar in classpath. By comparing other python udfs with
this one, I find those udfs which work do not import anything. Could
that be the cause? Do I need to do anything extra to import a module in my
udf?
Thanks!
Shawn
On Mon, Jan 24, 2011 at 5:28 PM, Daniel Dai wrote:
> P
package squeal.fun;
import java.util.Iterator;
import java.util.List;
import java.util.ArrayList;
import java.util.Map;
import java.util.HashMap;
import java.util.Set;
import java.util.HashSet;
import java.io.IOException;
import org.apache.pig.PigException;
import org.apache.pig.backend.executionen
From what I see, the data type of the DataBag is not correctly recognized.
I guess that the -1 comes from DataType.findType(), which is returning ERROR.
I also assume (but I am not sure) that the concrete type of getValue()
should be AccumulativeBag,
but for some reason it is something different. Ma
Thanks, perfect.
2011/1/25 Alan Gates
> See http://wiki.apache.org/pig/PigTools, which lists editing highlight
> scripts for Eclipse, emacs, TextMate, and vim.
>
> Alan.
>
>
> On Jan 25, 2011, at 10:34 AM, Jonathan Coveney wrote:
>
> Howdy,
>>
>> I think I saw in one post that some people use T
See http://wiki.apache.org/pig/PigTools, which lists editing highlight
scripts for Eclipse, emacs, TextMate, and vim.
Alan.
On Jan 25, 2011, at 10:34 AM, Jonathan Coveney wrote:
Howdy,
I think I saw in one post that some people use TextMate, but what do those
among you who use Windows dev
Not all of these are exactly what you were looking for, but there are a
few highlighter plugins for the likes of Eclipse, TextMate, Emacs, and
Vim. Hope it helps...
http://wiki.apache.org/pig/PigTools
--
Shane
On 01/25/2011 01:34 PM, Jonathan Coveney wrote:
Howdy,
I think I saw in one post
I've been able to isolate the problem, but have no idea what is causing it.
The input is in this form (this is correct):
{({(a),(b),(c)}),({(a),(b),(c)}),({(a),(b),(c)})}
and the output is in this form:
{(b,c,3),(c,a,3),(b,a,3)}
which is also correct. By placing prints and whatnot, I can see t
Hi All,
I am having ClassNotFound problems w.r.t. a custom Load function.
- I am using Pig-0.7.0 with Hadoop-0.20.2
- Input to the job is a sequence file with custom key/value data
- I am including the load UDF source below. Note that the UDF does not care
about what is inside the sequence file
Hello again,
I just found something interesting in the logs:
INFO org.apache.pig.backend.hadoop.hbase.HBaseTableInputFormat:
setScan with ranges: 5192296858534827628530496329220096 -
5192343374370748142029900260897474 ( 46515835920513499403931677378)
But in my case, it should rather be from 1020576