Re: Re: Re: Re: Re: how to distributed run a bash shell in spark

2015-05-27 Thread luohui20001
> rdd2.values.count() * x) is invalid because the values transformation and count action cannot be performed inside of the rdd1.map transformation. For more information, see SPARK-5063. Any advice on this? Thanks. -------------------- Thanks & Best regards! San.Luo - Original Message
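The SPARK-5063 error quoted above means an RDD action cannot run inside another RDD's closure. A minimal sketch of the usual fix, assuming two RDDs shaped like the thread's `rdd1` and `rdd2` (all names and values here are illustrative, and this needs a Spark dependency to run): execute the inner action on the driver first, then close over the resulting plain value.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object Spark5063Fix {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("spark-5063-sketch"))
    val rdd1 = sc.parallelize(Seq(1L, 2L, 3L))
    val rdd2 = sc.parallelize(Seq(("k1", 10L), ("k2", 20L)))

    // Invalid (raises the SPARK-5063 error): an action nested
    // inside another RDD's map closure.
    // rdd1.map(x => rdd2.values.count() * x)

    // Fix: run the action once on the driver, capture the plain Long.
    val n: Long = rdd2.values.count()          // n == 2
    val scaled = rdd1.map(x => n * x)
    println(scaled.collect().mkString(","))    // 2,4,6
    sc.stop()
  }
}
```

The closure then captures an ordinary `Long` instead of an RDD reference, which serializes cleanly to the executors.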

Re: Re: Re: Re: Re: how to distributed run a bash shell in spark

2015-05-26 Thread luohui20001
compare.sh to compare chr1.txt and samplechr1.txt and get a result1.txt, then loop from 1 to 21 so that those 42 files are compared and I get 21 files like result1.txt, result2.txt ... result21.txt. Sorry for not adding some comments to my code. ------------------------ Thanks & Best r

Re: Re: Re: Re: how to distributed run a bash shell in spark

2015-05-25 Thread Akhil Das
(runmodifyshell) > val pipeModify = runmodifyshellRDD.pipe("sh /opt/data/shellcompare/modify.sh") > > pipeModify.collect() > > > > // running on driver manager > val shellcompare = List("run", "sort.sh") > val shellcomp

Re: Re: Re: Re: how to distributed run a bash shell in spark

2015-05-25 Thread luohui20001
mplechr1.txt, get a result1.txt. Then loop from 1 to 21 so that those 42 files are compared and I get 21 files like result1.txt, result2.txt ... result21.txt. Sorry for not adding some comments to my code. -------------------- Thanks & Best regards! San.Luo - Original Message -

Re: Re: Re: how to distributed run a bash shell in spark

2015-05-25 Thread Akhil Das
RDD(shellcompare) > val result = List("aggregate","result") > val resultRDD = sc.makeRDD(result) > for(j <- 1 to 21){ > shellcompareRDD.pipe("sh /opt/sh/bin/sort.sh /opt/data/shellcompare/chr" + j + ".txt /opt/data/shellcompare

Re: Re: Re: how to distributed run a bash shell in spark

2015-05-25 Thread luohui20001
sh/bin/sort.sh /opt/data/shellcompare/chr" + j + ".txt /opt/data/shellcompare/samplechr" + j + ".txt /opt/data/shellcompare/result" + j + ".txt 600").collect() if (j > 1) resultRDD.pipe("cat result" + j + ".txt >> result1.txt").coll
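One caveat on the aggregation step above: `resultRDD.pipe("cat result" + j + ".txt >> result1.txt")` launches the command once per partition, on whichever executor happens to hold that partition, so it only behaves as intended by accident in local mode. If the result files live on the driver's filesystem, a plain process call is a simpler sketch (paths copied from the thread but unverified; this swaps the RDD pipe for `scala.sys.process`):

```scala
import java.io.File
import scala.sys.process._

// Driver-side aggregation: append result2.txt .. result21.txt
// onto result1.txt, mirroring the thread's loop.
for (j <- 2 to 21) {
  (Seq("cat", s"/opt/data/shellcompare/result$j.txt")
    #>> new File("/opt/data/shellcompare/result1.txt")).!
}
```

`#>>` opens the target file in append mode, matching the `>>` redirection in the original command.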

Re: Re: how to distributed run a bash shell in spark

2015-05-24 Thread madhu phatak
Hi, You can use the pipe operator if you are running a shell script/Perl script on some data. More information on my blog. Regards, Madhukara Phatak http://datamantra.io/ On Mon, May 25, 2015 at 8:02 AM, wrote: > Thanks Akhil, > > your c
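For reference, `RDD.pipe` starts one external process per partition, writes each element of the partition to the command's stdin (one per line), and turns every line the command prints into an element of the resulting RDD. A minimal, self-contained sketch in local mode (command and data are illustrative, and a Spark dependency is assumed):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object PipeSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("pipe-sketch"))
    // Each element goes to the command's stdin; each stdout line
    // becomes an element of the piped RDD.
    val piped = sc.parallelize(Seq("chr1", "chr2", "chr3"), 2)
                  .pipe(Seq("tr", "a-z", "A-Z"))
    println(piped.collect().sorted.mkString(","))  // CHR1,CHR2,CHR3
    sc.stop()
  }
}
```

Passing the command as a `Seq[String]` avoids the naive whitespace tokenization that `pipe(String)` applies, which matters once paths or arguments contain spaces.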