Re: Recommended way to run pig script 'file' in java application.

2010-10-06 Thread Jeff Zhang
pig scripts to java application. the scripts > have been run in command line. > I'm wondering how to do this? I just want to run pig file(*.pig) in java > code. > > Any advice would be appreciated. > > Thanks, > > - Youngwoo > -- Best Regards Jeff Zhang

Re: Passing parameters to Pig Script using Java

2010-10-07 Thread Jeff Zhang
is implemented as a preprocessor > on the script. > > Olga > > -Original Message- > From: rakesh kothari [mailto:rkothari_...@hotmail.com] > Sent: Thursday, October 07, 2010 11:47 AM > To: pig-u...@hadoop.apache.org > Subject: Passing parameters to Pig Script using Java > > > Hi, > > I have a pig script that needs certain parameters (passed using "-p" in pig > shell) to execute. Is there a way to pass these parameters if I want to > execute this script using "PigServer" after registering the script using > PigServer.registerScript() ? > > Thanks, > -Rakesh > > > -- Best Regards Jeff Zhang
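
Since the reply above describes parameter substitution as a preprocessor on the script, one way to picture it is plain text substitution done before the script text reaches PigServer. The following is only a minimal sketch under that assumption, not Pig's own preprocessor: the class name PigParamSketch, the method substitute, and the sample path /data/logs are all hypothetical, and only $name-style parameters are handled (positional references such as $0 are left untouched).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: illustrates "-p name=value" style substitution as a
// text preprocessing step. This is NOT the preprocessor that ships with Pig.
class PigParamSketch {
    // $name where name starts with a letter or underscore, so that
    // positional references like $0 are not treated as parameters.
    private static final Pattern PARAM =
            Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    static String substitute(String script, Map<String, String> params) {
        Matcher m = PARAM.matcher(script);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = params.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException(
                        "Undefined parameter: " + m.group(1));
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("input", "/data/logs"); // hypothetical value
        String script = "raw = LOAD '$input' AS (line:chararray);";
        // prints: raw = LOAD '/data/logs' AS (line:chararray);
        System.out.println(substitute(script, params));
    }
}
```

The substituted text could then be registered with PigServer in whatever way the application already uses; failing fast on an undefined parameter is a design choice of this sketch.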

Re: reading PigStorage or BinStorage from mapreduce?

2010-10-07 Thread Jeff Zhang
the output using Hadoop FileSystem API, and then using org.apache.pig.data.DataReaderWriter to read the output line by line. On Fri, Oct 8, 2010 at 3:03 AM, Corbin Hoenes wrote: > anyone ever read a pig output file with bags/tuples into a java map reduce > program? -- Best Regards

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
of RAM. >>>> - Intel Quad with 4GB of RAM. >>>> >>>> Well I was aware that hadoop has overhead and that it won't be done in >>>> half >>>> an hour (time in local divided by number of nodes). But I was surprised >>>> to >>>> see this morning it took 7 hours to complete!!! >>>> >>>> My configuration was made according to this link: >>>> >>>> >>>> http://www.michael-noll.com/wiki/Running_Hadoop_On_Ubuntu_Linux_%28Multi-Node_Cluster%29 >>>> >>>> My question is simple: Is it normal? >>>> >>>> Cheers >>>> >>>> >>>> Vincent >>>> >>>> >> > > -- Best Regards Jeff Zhang

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
hen mapred.job.tracker is "local". > > > > Not clear for me what is the reduce capacity of my cluster :) > > On 10/08/2010 01:00 PM, Jeff Zhang wrote: >> >> I guess maybe your reduce number is 1 which cause the reduce phase very >> slowly. >> >

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
BTW, you can look at the job tracker web UI to see which part of the job takes most of the time. On Fri, Oct 8, 2010 at 5:11 PM, Jeff Zhang wrote: > No I mean whether your mapreduce job's reduce task number is 1. > > And could you share your pig script, then others can rea

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
r_04 > <http://prog7.lan:50030/taskdetails.jsp?jobid=job_201010081314_0010&tipid=task_201010081314_0010_r_04> >      0.00% > > >        8-Oct-2010 14:18:11 > > > > Error: GC overhead limit exceeded > > >        0 > <http://prog7.lan:50030/ta

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
Vincent, Just want to remind you that you need to restart your cluster after the reconfiguration. On Fri, Oct 8, 2010 at 7:04 PM, Jeff Zhang wrote: > Try to increase the heap size of each task by setting > mapred.child.java.opts in mapred-site.xml. The default value is > -Xmx200m

Re: pig local mode is faster than a 2 nodes cluster? Is it normal?

2010-10-08 Thread Jeff Zhang
Xmx1536m but I'm afraid that my nodes will start to > swap memory... > > Should I continue in this direction? Or it's already to much and I should > search the problem somewhere else? > > Thanks > > -Vincent > > > On 10/08/2010 03:04 PM, Jeff Zhan

Re: Support for HBase 0.89

2010-10-13 Thread Jeff Zhang
181] > 2010-10-13 14:58:44,191 [Thread-4-SendThread] INFO >  org.apache.zookeeper.ClientCnxn - Server connection successful > > and stays there. Has anyone tried running it against hbase 0.89 or is 0.20.6 > the only last supported version? > > -GS > -- Best Regards Jeff Zhang

Re: Support for HBase 0.89

2010-10-13 Thread Jeff Zhang
BTW, I guess you did not do the second thing; it seems it always tries to connect to localhost's ZooKeeper. On Thu, Oct 14, 2010 at 8:20 AM, Jeff Zhang wrote: > Hi George > > Here are three things that should be taken care of when you use HBaseStorage() > > > 1. Register

Re: Support for HBase 0.89

2010-10-13 Thread Jeff Zhang
> 2010-10-13 14:58:44,182 [Thread-4-SendThread] INFO >>> >  org.apache.zookeeper.ClientCnxn - Attempting connection to server >>> > localhost/127.0.0.1:2181 >>> > 2010-10-13 14:58:44,188 [Thread-4-SendThread] INFO >>> >  org.apache.zookeeper.ClientCnxn - Priming connection to >>> > java.nio.channels.SocketChannel[connected >>> > local=/127.0.0.1:54359remote=localhost/ >>> > 127.0.0.1:2181] >>> > 2010-10-13 14:58:44,191 [Thread-4-SendThread] INFO >>> >  org.apache.zookeeper.ClientCnxn - Server connection successful >>> > >>> > and stays there. Has anyone tried running it against hbase 0.89 or is >>> > 0.20.6 >>> > the only last supported version? >>> > >>> > -GS >>> > >>> >> >> > -- Best Regards Jeff Zhang

Re: Strange error when using custom LoadFunc

2010-10-14 Thread Jeff Zhang
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621) >> > >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305) >> > >     at >> > > >> org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177) >> > > >> > >> > The script is pretty simple right now: >> > >> > rows = LOAD 'cassandra://localhost:9160/...' USING CassandraIndexReader() >> > as >> > > (col1, col2, col3); >> > > dump rows; >> > > grouped = GROUP rows BY col1; >> > > dump grouped; >> > > >> > >> > The first dump works fine, while the second just dies with the above >> error. >> > Strangely when I store it on disc and then load it with PigStorage() >> again >> > it just works as expected. >> > >> > Am I doing something wrong with my Custom Loader? >> > >> > Regards, >> > Chris >> > >> > -- Best Regards Jeff Zhang

Re: Strange error when using custom LoadFunc

2010-10-14 Thread Jeff Zhang
casts from byte array are not supported for this > loader. > >     * construction > >     * @throws IOException if there is an exception during LoadCaster > >     */ > >    public LoadCaster getLoadCaster() throws IOException { > >        return new Utf8StorageConverter

Re: BUG: PIG HDF, HADOOP MAPREDUCE java.io.IOException: ..... does not exist

2010-10-25 Thread Jeff Zhang
apred.JobClient.submitJob(JobClient.java:742) >   at org.apache.hadoop.mapred.jobcontrol.Job.submit(Job.java:370) >   at > org.apache.hadoop.mapred.jobcontrol.JobControl.startReadyJobs(JobControl.java:247) >   at org.apache.hadoop.mapred.jobcontrol.JobControl.run(JobControl.java:279) >   at java.lang.Thread.run(Thread.java:619) > > > > Please, can someone help me?? > > Ruth > > -- Best Regards Jeff Zhang

Re: PigServer not connecting to HDFS?

2010-10-27 Thread Jeff Zhang
utFormat.java:55) >    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:241) >   at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.getSplits(PigInputFormat.java:258) >       ... 7 more > 3092 [main] ERROR org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed to produce result in: "file:///output" > 3092 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed! -- Best Regards Jeff Zhang

Re: HBaseStorage in pig 0.8

2010-11-19 Thread Jeff Zhang
ckend.hadoop.hbase.HBaseStorage('content:field1 > anchor:field1a anchor:field2a') as (content_field1, anchor_field1a, > anchor_field2a); > > dump raw; > > --- > what else am I missing? -- Best Regards Jeff Zhang

Re: HBaseStorage in pig 0.8

2010-11-19 Thread Jeff Zhang
where my hbase/conf/hbase-site.xml file is?  Not > sure how would this get passed to the HBaseStorage class? > > On Nov 19, 2010, at 5:09 PM, Jeff Zhang wrote: > >> Does the mapreduce job start ? Could you check the logs on hadoop side ? >>

Re: comments appreciated for pig AvroStorage

2010-11-30 Thread Jeff Zhang
ppreciate any kinds of comments. > > doc: > http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data > > jira: > https://issues.apache.org/jira/browse/PIG-1748 > > Many thanks, > Lin > -- Best Regards Jeff Zhang

Re: matches with regular expression in pig

2010-12-02 Thread Jeff Zhang
te the regular expression? > > I tried the following: > > A = FILTER B BY (name matches 'abc\|.*'); > > but it does not work. I cannot use 'abc|.*' because it will match anything. > > Any ideas are appreciated. > > Thanks, > > Zhen > -- Best Regards Jeff Zhang
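
Pig's matches operator follows Java regular-expression semantics and must match the whole string, so the behaviour in the question can be reproduced in plain Java. The sketch below shows why 'abc|.*' matches anything (the pipe is alternation) and three ways to make the pipe literal. The class and method names are hypothetical. As an assumption not confirmed in this thread: inside a Pig script the backslash itself usually has to be doubled in the string literal ('abc\\|.*'), which may be why the single-backslash form did not work; the character-class form 'abc[|].*' avoids the backslash entirely.

```java
import java.util.regex.Pattern;

// Hypothetical demo class: String.matches, like Pig's "matches",
// requires the pattern to match the entire input.
class PipeRegexSketch {
    // Unescaped pipe: "abc" OR ".*", so every string matches.
    static boolean unescaped(String s) {
        return s.matches("abc|.*");
    }

    // Escaped pipe: the regex is abc\|.* (backslash doubled for the Java literal).
    static boolean escaped(String s) {
        return s.matches("abc\\|.*");
    }

    // Character class: no backslash needed at all.
    static boolean bracketed(String s) {
        return s.matches("abc[|].*");
    }

    // Pattern.quote turns the prefix into a literal.
    static boolean quoted(String s) {
        return s.matches(Pattern.quote("abc|") + ".*");
    }

    public static void main(String[] args) {
        System.out.println(unescaped("xyz"));      // true
        System.out.println(escaped("xyz"));        // false
        System.out.println(escaped("abc|def"));    // true
        System.out.println(bracketed("abc|def"));  // true
        System.out.println(quoted("abc|def"));     // true
    }
}
```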

Re: Storing different relations into one file

2010-12-06 Thread Jeff Zhang
deas? Is this even possible with piglatin? > > > -- Best Regards Jeff Zhang

Re: Is there anything in pig that supports external client to stream out a content of alias? a bit like Hive Thrift server...

2010-12-07 Thread Jeff Zhang
ic alias dump, from all >>> the other stuff being logged, to be able to trigger further process. >>> >>> STREAM THROUGH seems to be one way to trigger a process, it's >>> just that it seems not suitable for the kind of process we are looking at, >>> because the gets run in hadoop cluster. >>> >>> any thought? >>> >>> J >> > > -- Best Regards Jeff Zhang

Re: calling pig from a web app

2011-01-10 Thread Jeff Zhang
http://homepages.dcc.ufmg.br/~charles/ > UFMG - ICEx - Dcc > Cel.: 55 31 87741485 > Tel.: 55 31 34741485 > Lab.: 55 31 34095840 > -- Best Regards Jeff Zhang

Re: Howl

2011-06-18 Thread Jeff Zhang
.apache.org/pig/owl > > Thanks > Alex > -- Best Regards Jeff Zhang

Re: Pig duplicate records

2011-09-21 Thread Jeff Zhang
>grunt> dump raw; > >.. > >Input(s): > >Successfully read 4 records (825 bytes) from: > >"hdfs://localhost:9000/user/aholmes/test.v1.avro" > > > >Output(s): > >Successfully stored 4 records (46 bytes) in: > >"hdfs://localhost:9000/tmp/temp2039109003/tmp1924774585" > > > >Counters: > >Total records written : 4 > >Total bytes written : 46 > >.. > >(r1,1) > >(r2,2) > >(r1,1) > >(r2,2) > > > >I'm sure I'm doing something wrong (again)! > > > >Many thanks, > >Alex > > > -- Best Regards Jeff Zhang

Re: use pig in eclipse

2014-12-31 Thread Jeff Zhang
$0;"); >pigServer.store("tmp_table_limit", "/user/hadoop/shi.txt"); > I always get error: > 14/12/30 17:28:33 WARN hadoop20.PigJobControl: falling back to default > JobControl (not using hadoop 0.20 ?) > java.lang.NoSuchFieldException: runnerState > at java.lang.Class.getDeclaredField(Class.java:1948) > at > org.apache.pig.backend.hadoop20.PigJobControl.(PigJobControl.java:51) > > > > > > > > help!! -- Best Regards Jeff Zhang

Re: Welcome our new Pig PMC chair Rohini Palaniswamy

2015-03-18 Thread Jeff Zhang
> > >> >> Hi all, > >> >> > >> >> Now it's official that Rohini Palaniswamy is our new Pig PMC chair. > >> Please > >> >> join me in congratulating Rohini for her new role. Congrats! > >> >> > >> >> Thanks! > >> >> Cheolsoo > >> >> > >> > >> > > -- Best Regards Jeff Zhang

No tez engine in pig-0.14 jar ?

2015-04-02 Thread Jeff Zhang
I am trying to use pig-0.14 as a dependency in my pom file, but it looks like there's no Tez engine in either pig-0.14.jar or pig-0.14-h2.jar. Is it missing, or is that intentional? Thanks -- Best Regards Jeff Zhang

Why pig depends on job history server ?

2016-09-27 Thread Jeff Zhang
I hit the issue described at the following link; restarting the job history server seems to fix it. But I am confused about why Pig would depend on the job history server. Does anyone know? Thanks -- Best Regards Jeff Zhang

Re: Why pig depends on job history server ?

2016-09-27 Thread Jeff Zhang
The issue I mentioned http://stackoverflow.com/questions/29784532/pig-keeps-trying-to-connect-to-job-history-server-and-fails On Tue, Sep 27, 2016 at 8:38 PM, Jeff Zhang wrote: > > > I hit the issue as the following link, seems restarting job history server > can fix the issue. B

Is there any cancel job api for PigRunner ?

2016-09-27 Thread Jeff Zhang
Can I cancel a job if I launch it using PigRunner? Thanks -- Best Regards Jeff Zhang

Use pig in zeppelin

2017-02-05 Thread Jeff Zhang
Hi Folks, Zeppelin just released 0.7 today, which is the first release to support Pig. Users can do everything in Zeppelin that they do in the grunt shell. Besides, you can take advantage of Zeppelin's visualization feature to visualize the Pig output. Here's one article I wrote for the details. I would

Re: [ANNOUNCE] Welcome new Pig Committer - Adam Szita

2017-05-22 Thread Jeff Zhang
Congratulations, Adam! On Tue, May 23, 2017 at 10:22 AM, Ke, Xianda wrote: > Congrats, Adam! > > Regards, > Xianda > > -Original Message- > From: Zhang, Liyun [mailto:liyun.zh...@intel.com] > Sent: Tuesday, May 23, 2017 9:11 AM > To: d...@pig.apache.org > Cc: user@pig.apache.org > Subject: RE: [ANNOUNCE

Re: Pig-Eclipse update - version 1.1.0

2015-03-30 Thread Jianfeng (Jeff) Zhang
Nice job Eyal, it's very helpful for the community Best Regard, Jeff Zhang On 3/31/15, 5:19 AM, "Eyal Allweil" wrote: >Hi all, > >I'm glad to announce a new version of Pig-Eclipse. We've been using it >for a few weeks where I work (PayPal) and I thin

Re: Setup debug mode in eclipse for Java UDF and pig script

2015-07-09 Thread Jianfeng (Jeff) Zhang
You can use pig's java API to debug your script in eclipse. Here's one simple example. public static void main(String[] args) throws IOException { PigServer pig = new PigServer(ExecType.LOCAL); pig.registerScript("myscript.pig"); } Best Regard, Jeff Zhang

Re: Pig FS commands

2015-07-13 Thread Jianfeng (Jeff) Zhang
What's your purpose for that ? Best Regard, Jeff Zhang On 7/11/15, 9:29 AM, "Andy Srine" wrote: >Folks, > >Is there an easy way to store the output of the FS commands to a variable >in a pig script? Either native pig or even a solution using embedded >Python >will help. > >Thanks, >Andy

Re: Java UDF Error: ERROR 1066: Unable to open iterator for alias

2015-07-13 Thread Jianfeng (Jeff) Zhang
> Application application_1436453941326_0020 failed 2 times due to AM > Container for appattempt_1436453941326_0020_02 exited with exitCode: >1 Could you check the yarn app log ? Best Regard, Jeff Zhang On 7/10/15, 5:38 PM, "Divya Gehlot" wrote: >Hi >

Re: Setup debug mode in eclipse for Java UDF and pig script

2015-07-13 Thread Jianfeng (Jeff) Zhang
Is com.pig.udf.PigUDF on your classpath ? Best Regard, Jeff Zhang From: Divya Gehlot <divya.htco...@gmail.com> Date: Saturday, July 11, 2015 at 12:16 AM To: "user@pig.apache.org" <user@pig.apache.org>, Jianfeng Zhang m

Re: Query | Join Internals

2015-07-14 Thread Jianfeng (Jeff) Zhang
This document should be helpful for you https://wiki.apache.org/pig/PigSkewedJoinSpec Best Regard, Jeff Zhang On 7/14/15, 4:56 AM, "Gagan Juneja" wrote: >Hi Team, > >We are using Pig intensively in our various projects. We are doing >optimizations for that we w

Re: NoClassDefFoundError: org/joda/time/ReadableInstant & Server IPC version 9 cannot communicate with client version 4

2015-07-14 Thread Jianfeng (Jeff) Zhang
It should be a classpath issue. Did you set PIG_HOME ? Maybe PIG_HOME still points to pig version 0.13 Best Regard, Jeff Zhang On 7/14/15, 12:00 PM, "Antoine Lafleur" wrote: >Evening, > > Sorry to bother, in case of anyone do have the same issue, it seem

Re: NoClassDefFoundError: org/joda/time/ReadableInstant & Server IPC version 9 cannot communicate with client version 4

2015-07-15 Thread Jianfeng (Jeff) Zhang
BTW, You can use the following command to get the classpath pig -printCmdDebug Best Regard, Jeff Zhang On 7/14/15, 10:17 PM, "Jianfeng (Jeff) Zhang" wrote: >It should be classpath issue. Did you set the PIG_HOME ? Maybe you still >point PIG_HOME to pig version 0.13 >

Re: NoClassDefFoundError: org/joda/time/ReadableInstant & Server IPC version 9 cannot communicate with client version 4

2015-07-15 Thread Jianfeng (Jeff) Zhang
Do you find the joda jar under /share/hadoop/tools/lib/joda-time-2.5.jar ? Best Regard, Jeff Zhang On 7/15/15, 1:42 PM, "Antoine Lafleur" wrote: >Hi, > >The result of the command is : > >Find hadoop at /usr/local/hadoop/bin/hadoop >dry run: >HADOOP_CLASSPA

Re: Error when writing totuple

2015-07-23 Thread Jianfeng (Jeff) Zhang
More context & logs would be helpful Best Regard, Jeff Zhang On 7/23/15, 12:31 AM, "sajid mohammed" wrote: >i am trying to create totuple but getting some parse errors

Re: PigServer class and script execution

2016-01-11 Thread Jianfeng (Jeff) Zhang
Of course, you can. AvroStorage is a built-in UDF, you just need to put pig jar on the classpath Best Regard, Jeff Zhang On 1/11/16, 4:22 AM, "John Smith" wrote: >Hi, > >I have a java code that generates pig script. I am wondering if there is >option to execut

Re: PigServer class and script execution

2016-01-11 Thread Jianfeng (Jeff) Zhang
Do you use PigServer.registerScript(fileName) ? Then what errors do you see if you use AvroStorage ? Best Regard, Jeff Zhang On 1/11/16, 9:36 AM, "John Smith" wrote: >hi, > >i think you are answering something different. I need execute whole pig >script using P

Re: [ANNOUNCE] Welcome new Pig Committer - Liyun Zhang

2016-12-19 Thread Jianfeng (Jeff) Zhang
Congratulations Liyun! Best Regard, Jeff Zhang On 12/20/16, 11:29 AM, "Pallavi Rao" wrote: >Congratulations Liyun!

Re: Assigning resources to individual MR jobs of a Pig script

2017-03-17 Thread Jianfeng (Jeff) Zhang
I would suggest using Tez, which launches just one YARN app for one Pig script. http://pig.apache.org/docs/r0.16.0/perf.html#enable-tez Best Regard, Jeff Zhang On 3/17/17, 3:24 AM, "Mohammad Tariq" wrote: >Hi group, > >In any real world pig script we end up wi

Re: Assigning resources to individual MR jobs of a Pig script

2017-03-18 Thread Jianfeng (Jeff) Zhang
In that case, you can use the PARALLEL keyword to control the number of reducer tasks of each MR job. The number of mapper tasks is determined by the InputFormat. See http://pig.apache.org/docs/r0.16.0/basic.html Best Regard, Jeff Zhang On 3/18/17, 6:03 PM, "Mohammad Tariq" wrote:

Re: [ANNOUNCE] Apache Pig 0.17.0 released

2017-06-28 Thread Jianfeng (Jeff) Zhang
Awesome. I have also integrated the Spark engine into the Zeppelin notebook. https://github.com/zjffdu/zeppelin/tree/ZEPPELIN-2615 Best Regard, Jeff Zhang On 6/23/17, 5:09 AM, "Rohini Palaniswamy" wrote: >Thanks Adam for being the Release Manager and getting this important