Re: Spark 1.2

2016-10-25 Thread ayan guha
Thank you both. On Tue, Oct 25, 2016 at 11:30 PM, Sean Owen wrote: > archive.apache.org will always have all the releases: > http://archive.apache.org/dist/spark/ > > On Tue, Oct 25, 2016 at 1:17 PM ayan guha wrote: > >> Just in case, anyone knows how I can download Spark 1.2? It is not >> showing up in the Spark download page drop down

Re: Spark 1.2

2016-10-25 Thread Luciano Resende
All previous releases are available on the Release Archives http://archive.apache.org/dist/spark/ On Tue, Oct 25, 2016 at 2:17 PM, ayan guha wrote: > Just in case, anyone knows how I can download Spark 1.2? It is not showing > up in Spark download page drop down > > -- > Best Regards, > Ayan Gu

Re: Spark 1.2

2016-10-25 Thread Sean Owen
archive.apache.org will always have all the releases: http://archive.apache.org/dist/spark/ On Tue, Oct 25, 2016 at 1:17 PM ayan guha wrote: > Just in case, anyone knows how I can download Spark 1.2? It is not showing > up in Spark download page drop down > > > -- > Best Regards, > Ayan Guha >

Re: Spark 1.2. loses often all executors

2015-03-23 Thread Ted Yu
In this thread: http://search-hadoop.com/m/JW1q5DM69G I only saw two replies. Maybe some people forgot to use 'Reply to All' ? Cheers On Mon, Mar 23, 2015 at 8:19 AM, mrm wrote: > Hi, > > I have received three replies to my question on my personal e-mail, why > don't they also show up on the m

Re: Spark 1.2. loses often all executors

2015-03-23 Thread mrm
Hi, I have received three replies to my question on my personal e-mail; why don't they also show up on the mailing list? I would like to reply to the 3 users through a thread. Thanks, Maria -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-2-loses-o

Re: Spark 1.2. loses often all executors

2015-03-20 Thread Davies Liu
Maybe this is related to a bug in 1.2 [1]; it's fixed in 1.2.2 (not yet released). Could you check out the 1.2 branch and verify that? [1] https://issues.apache.org/jira/browse/SPARK-5788 On Fri, Mar 20, 2015 at 3:21 AM, mrm wrote: > Hi, > > I recently changed from Spark 1.1 to Spark 1.2, and I noticed

Re: Spark 1.2. loses often all executors

2015-03-20 Thread Akhil Das
Isn't that a feature? Rather than keep running a buggy pipeline, it just kills all the executors. You can always handle exceptions with a proper try/catch in your code, though. Thanks Best Regards On Fri, Mar 20, 2015 at 3:51 PM, mrm wrote: > Hi, > > I recently changed from Spark 1.1 to Spark 1.2, and I noti

Re: Spark 1.2 – How to change Default (Random) port ….

2015-03-15 Thread Shailesh Birari
Hi SM, Apologies for the delayed response. No, the issue is with Spark 1.2.0; there is a bug in that release. Spark has recently put out the 1.3.0 release, so it might be fixed there. I am not planning to test it soon, maybe after some time; you can try it. Regards, Shailesh

Re: Spark 1.2.x Yarn Auxiliary Shuffle Service

2015-02-09 Thread Arush Kharbanda
Is this what you are looking for? 1. Build Spark with the YARN profile. Skip this step if you are using a pre-packaged distribution. 2. Locate the spark-<version>-yarn-shuffle.jar. This should be under $SPARK_HOME/network/yarn/target/s
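
For context, the remaining setup per the Spark 1.2 docs adds that jar to the NodeManager classpath and registers the shuffle service in yarn-site.xml. A minimal sketch of the yarn-site.xml entries (keep whatever aux services your cluster already lists alongside spark_shuffle):

    <!-- yarn-site.xml: register Spark's external shuffle service as a NodeManager aux service -->
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle,spark_shuffle</value>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
      <value>org.apache.spark.network.yarn.YarnShuffleService</value>
    </property>

Applications then opt in by setting spark.shuffle.service.enabled=true.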

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Nicholas Chammas
Ey-chih, That makes more sense. This is a known issue that will be fixed as part of SPARK-5242 <https://issues.apache.org/jira/browse/SPARK-5242>. Charles,

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Nicholas Chammas
(I've stood up 4 integration clusters and 2 production clusters on EC2 since with no problems.) On Wed Jan 28 2015 at 12:05:43 PM Nicholas Chammas <nicholas.ch

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Peter Zybrick
known issue that will be fixed as part of SPARK-5242 <https://issues.apache.org/jira/browse/SPARK-5242>. Charles, Thanks for the info. In your case, when does spark-ec2 hang? Only when th

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Charles Feduke
path to the identity file doesn't exist? Or also when you specify the path as a relative path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow wrote: We found the prob

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Nicholas Chammas
ve path or with ~? Nick On Wed Jan 28 2015 at 9:29:34 AM ey-chih chow wrote: We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Charles Feduke
d to specify this in the AWS console. ------ From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@

Re: spark 1.2 ec2 launch script hang

2015-01-28 Thread Nicholas Chammas
the AWS console. -- From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To: charles.fed...@gmail.com; pzybr...@gmail.com; eyc...@hotmail.com CC: user@spark.apache.org For thos

RE: spark 1.2 ec2 launch script hang

2015-01-28 Thread ey-chih chow
We found the problem and already fixed it. Basically, spark-ec2 requires ec2 instances to have external ip addresses. You need to specify this in the AWS console. From: nicholas.cham...@gmail.com Date: Tue, 27 Jan 2015 17:19:21 + Subject: Re: spark 1.2 ec2 launch script hang To

Re: spark 1.2 ec2 launch script hang

2015-01-27 Thread Nicholas Chammas
For those who found that absolute vs. relative path for the pem file mattered, what OS and shell are you using? What version of Spark are you using? ~/ vs. absolute path shouldn’t matter. Your shell will expand the ~/ to the absolute path before sending it to spark-ec2. (i.e. tilde expansion.) Ab

Re: spark 1.2 ec2 launch script hang

2015-01-27 Thread Charles Feduke
Absolute path means no ~, and also verify that the path to the file is correct. For some reason the Python code does not validate that the file exists and will hang (this is the same reason why ~ hangs). On Mon, Jan 26, 2015 at 10:08 PM Pete Zybrick wrote: > Try using an absolute path to the
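
A hedged illustration of the failure mode (key pair and file names hypothetical; the -k/-i/-s flags are the ones the 1.2 spark-ec2 script takes):

    # may hang if ~ reaches the script unexpanded, or if the file simply does not exist
    ./spark-ec2 -k my-keypair -i ~/keys/my-keypair.pem -s 2 launch my-cluster

    # safer: verify the file, then pass the absolute path
    ls -l /home/me/keys/my-keypair.pem
    ./spark-ec2 -k my-keypair -i /home/me/keys/my-keypair.pem -s 2 launch my-cluster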

Re: spark 1.2 ec2 launch script hang

2015-01-26 Thread Pete Zybrick
Try using an absolute path to the pem file > On Jan 26, 2015, at 8:57 PM, ey-chih chow wrote: > > Hi, > > I used the spark-ec2 script of spark 1.2 to launch a cluster. I have > modified the script according to > > https://github.com/grzegorz-dubicki/spark/commit/5dd8458d2ab9753aae939b3bb33

Re: Spark 1.2 – How to change Default (Random) port ….

2015-01-26 Thread Shailesh Birari
Thanks. But after setting "spark.shuffle.blockTransferService" to "nio", the application fails with Akka client disassociation. 15/01/27 13:38:11 ERROR TaskSchedulerImpl: Lost executor 3 on wynchcs218.wyn.cnw.co.nz: remote Akka client disassociated 15/01/27 13:38:11 INFO TaskSetManager: Re-queueing tas

Re: spark 1.2 - Writing parque fails for timestamp with "Unsupported datatype TimestampType"

2015-01-26 Thread Manoj Samel
Awesome! That would be great!! On Mon, Jan 26, 2015 at 3:18 PM, Michael Armbrust wrote: > I'm aiming for 1.3. > > On Mon, Jan 26, 2015 at 3:05 PM, Manoj Samel > wrote: > >> Thanks Michael. I am sure there have been many requests for this support. >> >> Any release targeted for this? >> >> Tha

Re: spark 1.2 - Writing parque fails for timestamp with "Unsupported datatype TimestampType"

2015-01-26 Thread Michael Armbrust
I'm aiming for 1.3. On Mon, Jan 26, 2015 at 3:05 PM, Manoj Samel wrote: > Thanks Michael. I am sure there have been many requests for this support. > > Any release targeted for this? > > Thanks, > > On Sat, Jan 24, 2015 at 11:47 AM, Michael Armbrust > wrote: > >> Those annotations actually don'

Re: spark 1.2 - Writing parque fails for timestamp with "Unsupported datatype TimestampType"

2015-01-26 Thread Manoj Samel
Thanks Michael. I am sure there have been many requests for this support. Any release targeted for this? Thanks, On Sat, Jan 24, 2015 at 11:47 AM, Michael Armbrust wrote: > Those annotations actually don't work because the timestamp is SQL has > optional nano-second precision. > > However, the

Re: Spark 1.2 – How to change Default (Random) port ….

2015-01-25 Thread Aaron Davidson
This was a regression caused by Netty Block Transfer Service. The fix for this just barely missed the 1.2 release, and you can see the associated JIRA here: https://issues.apache.org/jira/browse/SPARK-4837 Current master has the fix, and the Spark 1.2.1 release will have it included. If you don't
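
Until 1.2.1, the usual mitigation was to pin the otherwise-random ports so a firewall can whitelist them. A sketch of the relevant properties (port numbers are arbitrary examples):

    // Scala sketch: pin Spark's normally-random listening ports
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("fixed-ports")
      .set("spark.driver.port", "7001")          // driver <-> executor RPC
      .set("spark.fileserver.port", "7002")      // driver's HTTP file server
      .set("spark.broadcast.port", "7003")       // HTTP broadcast server
      .set("spark.replClassServer.port", "7004") // only used by spark-shell
      .set("spark.blockManager.port", "7005")    // the one SPARK-4837 fixes for Netty
      .set("spark.executor.port", "7006")        // executor RPC
    val sc = new SparkContext(conf)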

Re: Spark 1.2 – How to change Default (Random) port ….

2015-01-25 Thread Shailesh Birari
Can anyone please let me know? I don't want to open all ports on the network, so I am interested in the property by which I can configure this new port. Shailesh -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-2-How-to-change-Default-Random-port-tp21306p2

Re: spark 1.2 - Writing parque fails for timestamp with "Unsupported datatype TimestampType"

2015-01-24 Thread Michael Armbrust
Those annotations actually don't work because timestamp in SQL has optional nano-second precision. However, there is a PR to add support using Parquet's INT96 type: https://github.com/apache/spark/pull/3820 On Fri, Jan 23, 2015 at 12:08 PM, Manoj Samel wrote: > Looking further at the trace a
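
Until that lands, one workaround is to cast the timestamp to a type the 1.2 Parquet writer does support before saving. A sketch against the Spark 1.2 API (table and column names hypothetical):

    // Spark 1.2 Scala sketch: store timestamps as strings for now
    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    events.registerTempTable("events") // events: a SchemaRDD with a TimestampType column `ts`
    sqlContext.sql("SELECT id, CAST(ts AS STRING) AS ts_str FROM events")
      .saveAsParquetFile("/data/events.parquet")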

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-24 Thread Fengyun RAO
Hi, Davies The log shows that LogParser initializes and loads data once per executor, thus I think the singleton still works. I changed the code to sc.textFile(inputPath).flatMap(line => LogParser.parseLine(line)).foreach(_ => {}) to avoid shuffle IO, but it's slower. I thought it may be caused by

Re: spark 1.2 - Writing parque fails for timestamp with "Unsupported datatype TimestampType"

2015-01-23 Thread Manoj Samel
Looking further at the trace and ParquetTypes.scala, it seems there is no support for Timestamp and Date in fromPrimitiveDataType(ctype: DataType): Option[ParquetTypeInfo]. Since Parquet supports these types with some decoration over Int ( https://github.com/Parquet/parquet-format/blob/master/Logica

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Davies Liu
On Tue, Jan 20, 2015 at 11:13 PM, Fengyun RAO wrote: > the LogParser instance is not serializable, and thus cannot be a broadcast, You could create an empty LogParser object (it's serializable), then load the data lazily in the executor. Could you add some logging to LogParser to check the behavior b
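
A sketch of that idea (field, file, and method names hypothetical): the object shipped to executors is an empty shell, and the heavy state is built lazily on first use:

    // Scala sketch: serialize an empty shell, load the heavy state lazily on the executor
    class LogParser extends Serializable {
      // @transient keeps the data out of serialization; lazy defers loading to first use
      @transient lazy val dict: Map[String, String] =
        scala.io.Source.fromFile("/opt/data/dict.tsv").getLines()
          .map { l => val Array(k, v) = l.split('\t'); k -> v }
          .toMap

      def parseLine(line: String): Seq[String] =
        line.split(' ').toSeq.flatMap(dict.get)
    }

    // broadcast it so each executor deserializes one copy and loads `dict` once
    val parser = sc.broadcast(new LogParser())
    sc.textFile(inputPath).flatMap(line => parser.value.parseLine(line))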

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Kay Ousterhout
Is it possible to re-run your job with spark.eventLog.enabled set to true, and send the resulting logs to the list? Those have more per-task information that can help diagnose this. -Kay On Wed, Jan 21, 2015 at 1:57 AM, Fengyun RAO wrote: > btw: Shuffle Write (11 GB) means 11 GB per executor; for eac
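
For reference, a minimal way to switch that on (the log directory is a hypothetical example; it must already exist and be reachable from the driver):

    // Scala sketch: enable event logging so per-task history survives the job
    val conf = new SparkConf()
      .set("spark.eventLog.enabled", "true")
      .set("spark.eventLog.dir", "hdfs:///tmp/spark-events")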

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
btw: Shuffle Write (11 GB) means 11 GB per executor; for each task, it's ~40 MB 2015-01-21 17:53 GMT+08:00 Fengyun RAO : > I don't know how to debug a distributed application, any tools or suggestions? > > but from the spark web UI, > > the GC time (~0.1 s) and Shuffle Write (11 GB) are similar for spark 1.1

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
I don't know how to debug a distributed application, any tools or suggestions? but from the spark web UI, the GC time (~0.1 s) and Shuffle Write (11 GB) are similar for spark 1.1 and 1.2. There are no Shuffle Read and Spill. The only difference is Duration (Min / 25th percentile / Median / 75th percentile / Max)

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
of my previous post. Please check your shuffle I/O differences between the two in the spark web UI because it can possibly be related to my case. Thanks Kevin --- Original Message --- Sender: Fengyun RAO

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread JaeBoo Jung
wo in spark web UI because it can possibly be related to my case. Thanks Kevin --- Original Message --- Sender: Fengyun RAO Date: 2015-01-21 17:41 (GMT+09:00) Title: Re: spark 1.2 three times slower than spark 1.1, why? maybe you mean a different spark-submit script? we

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
maybe you mean a different spark-submit script? we also use the same spark-submit script, thus the same memory, cores, etc. configuration. 2015-01-21 15:45 GMT+08:00 Sean Owen : > I don't know of any reason to think the singleton pattern doesn't work or > works differently. I wonder if, for examp

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
Thanks, Paul, I don't understand how subclassing FlatMapFunction helps; could you show some sample code? We need one instance per executor, not per partition, thus mapPartitions() doesn't help. 2015-01-21 16:07 GMT+08:00 Paul Wais : > To force one instance per executor, you could explicitly subclas

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Fengyun RAO
thanks, Sean. I don't quite understand "you have *more* partitions across *more* workers". It's within the same cluster, with the same data, thus I think the same partitions and the same workers. We switched from spark 1.1 to 1.2, then it's 3x slower. (We upgraded from CDH 5.2.1 to CDH 5.3, hence spa

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-21 Thread Paul Wais
To force one instance per executor, you could explicitly subclass FlatMapFunction and have it lazy-create your parser in the subclass constructor. You might also want to try RDD#mapPartitions() (instead of RDD#flatMap()) if you want one instance per partition. This approach worked well for me when
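
A sketch of the per-partition variant (parser API hypothetical, matching the LogParser discussed in this thread):

    // Scala sketch: one parser instance per partition instead of one per record
    val records = sc.textFile(inputPath).mapPartitions { lines =>
      val parser = new LogParser()    // constructed once per partition, on the executor
      lines.flatMap(parser.parseLine) // iterator in, iterator out, so it stays lazy
    }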

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-20 Thread Sean Owen
I don't know of any reason to think the singleton pattern doesn't work or works differently. I wonder if, for example, task scheduling is different in 1.2 and you have more partitions across more workers and so are loading more copies more slowly into your singletons. On Jan 21, 2015 7:13 AM, "Feng

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-20 Thread Fengyun RAO
the LogParser instance is not serializable, and thus cannot be a broadcast; what's worse, it contains an LRU cache, which is essential to the performance and which we would like to share among all the tasks on the same node. If that is the case, what's the recommended way to share a variable among all t
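
The idiomatic per-JVM share in Scala is a top-level object, which each executor JVM initializes exactly once, so all tasks in that executor see the same cache. A sketch (cache details hypothetical; concurrent tasks need access synchronized):

    // Scala sketch: an `object` is a JVM-wide singleton -- one per executor
    object LogParser {
      // crude bounded map standing in for a real LRU cache
      private val cache = new java.util.LinkedHashMap[String, String](10000, 0.75f, true)

      def parseLine(line: String): Seq[String] = cache.synchronized {
        // ... look up / populate the cache, then parse ...
        Seq(line)
      }
    }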

Re: spark 1.2 three times slower than spark 1.1, why?

2015-01-20 Thread Davies Liu
Maybe some change related to closure serialization means LogParser is not a singleton any more, so it is initialized for every task. Could you change it to a Broadcast? On Tue, Jan 20, 2015 at 10:39 PM, Fengyun RAO wrote: > Currently we are migrating from spark 1.1 to spark 1.2, but found the

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Shailesh Birari
Thanks Aaron. Adding the Guava jar resolves the issue. Shailesh On Wed, Jan 21, 2015 at 3:26 PM, Aaron Davidson wrote: > Spark's network-common package depends on guava as a "provided" dependency > in order to avoid conflicting with other libraries (e.g., Hadoop) that > depend on specific versions

RE: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Bob Tiernay
18:26:32 -0800 Subject: Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro To: sbirar...@gmail.com CC: fnoth...@berkeley.edu; so...@cloudera.com; user@spark.apache.org Spark's network-common package depends on guava as a "provided" dependenc

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Aaron Davidson
Spark's network-common package depends on guava as a "provided" dependency in order to avoid conflicting with other libraries (e.g., Hadoop) that depend on specific versions. com/google/common/base/Preconditions has been present in Guava since version 2, so this is likely a "dependency not found" r
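
The usual fix on the application side is to declare Guava directly (version below is only an example; match whatever your code actually compiles against):

    // build.sbt sketch: depend on Guava explicitly, since Spark 1.2 shades its own copy
    libraryDependencies += "com.google.guava" % "guava" % "14.0.1"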

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Shailesh Birari
Hi Frank, It's a normal Eclipse project where I added Scala and Spark libraries as user libraries. Though I am not attaching any Hadoop libraries, in my application code I have the following line: System.setProperty("hadoop.home.dir", "C:\\SB\\HadoopWin") This Hadoop home dir contains "winutils.ex

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Frank Austin Nothaft
Shailesh, To add: are you packaging Hadoop in your app? Hadoop will pull in Guava. Not sure if you are using Maven (or what) to build, but if you can pull up your build's dependency tree, you will likely find com.google.guava being brought in by one of your dependencies. Regards, Frank Austin
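
With Maven that check is a one-liner:

    mvn dependency:tree -Dincludes=com.google.guava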

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Shailesh Birari
Hello, I double checked the libraries. I am linking only with Spark 1.2. Along with Spark 1.2 jars I have Scala 2.10 jars and JRE 7 jars linked and nothing else. Thanks, Shailesh On Wed, Jan 21, 2015 at 12:58 PM, Sean Owen wrote: > Guava is shaded in Spark 1.2+. It looks like you are mixing

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Ted Yu
Please also see this thread: http://search-hadoop.com/m/JW1q5De7pU1 On Tue, Jan 20, 2015 at 3:58 PM, Sean Owen wrote: > Guava is shaded in Spark 1.2+. It looks like you are mixing versions > of Spark then, with some that still refer to unshaded Guava. Make sure > you are not packaging Spark with

Re: Spark 1.2 - com/google/common/base/Preconditions java.lang.NoClassDefFoundErro

2015-01-20 Thread Sean Owen
Guava is shaded in Spark 1.2+. It looks like you are mixing versions of Spark then, with some that still refer to unshaded Guava. Make sure you are not packaging Spark with your app and that you don't have other versions lying around. On Tue, Jan 20, 2015 at 11:55 PM, Shailesh Birari wrote: > Hel
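
In sbt terms, the usual arrangement looks like this (version as an example); it keeps Spark out of the assembly jar so the cluster's shaded copy is the only one on the classpath:

    // build.sbt sketch: compile against Spark, but don't package it with the app
    libraryDependencies += "org.apache.spark" %% "spark-core" % "1.2.0" % "provided"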

Re: spark 1.2 compatibility

2015-01-17 Thread bhavyateja
Yes it works with 2.2 but we are trying to use spark 1.2 on HDP 2.1 On Sat, Jan 17, 2015, 11:18 AM Chitturi Padma [via Apache Spark User List] < ml-node+s1001560n21208...@n3.nabble.com> wrote: > It worked for me. spark 1.2.0 with hadoop 2.2.0 > > On Sat, Jan 17, 2015 at 9:39 PM, bhavyateja [via A

Re: spark 1.2 compatibility

2015-01-17 Thread Chitturi Padma
It worked for me. spark 1.2.0 with hadoop 2.2.0 On Sat, Jan 17, 2015 at 9:39 PM, bhavyateja [via Apache Spark User List] < ml-node+s1001560n21207...@n3.nabble.com> wrote: > Hi all, > > Thanks for your contribution. We have checked and confirmed that HDP 2.1 > YARN don't work with Spark 1.2 > > On

Re: spark 1.2 compatibility

2015-01-17 Thread bhavyateja
Hi all, Thanks for your contribution. We have checked and confirmed that HDP 2.1 YARN doesn't work with Spark 1.2 On Sat, Jan 17, 2015 at 9:11 AM, bhavya teja potineni < bhavyateja.potin...@gmail.com> wrote: > Hi > > Did you try using spark 1.2 on hdp 2.1 YARN > > Can you please go thru the thread

Re: spark 1.2 compatibility

2015-01-17 Thread bhavyateja
Hi Did you try using spark 1.2 on hdp 2.1 YARN? Can you please go thru the thread http://apache-spark-user-list.1001560.n3.nabble.com/Troubleshooting-Spark-tt21189.html and check where I am going wrong, as my word count program is erroring out when using spark 1.2 on YARN but it's getting execut

Re: spark 1.2 compatibility

2015-01-17 Thread Chitturi Padma
Yes. I built spark 1.2 with Apache Hadoop 2.2. No compatibility issues. On Sat, Jan 17, 2015 at 4:47 AM, bhavyateja [via Apache Spark User List] < ml-node+s1001560n21197...@n3.nabble.com> wrote: > Is spark 1.2 compatible with HDP 2.1 > > -- > If you reply to this em

Re: spark 1.2 compatibility

2015-01-16 Thread Matei Zaharia
oblem. However officially HDP 2.1 + Spark 1.2 is not a supported scenario. -Original Message- From: Judy Nash Sent: Friday, January 16, 2015 5:35 PM To: 'bhavyateja'; user@spark.apache.org Subject: RE: spark 1.2 compatibility Y

RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
apache.org Subject: RE: spark 1.2 compatibility Yes. It's compatible with HDP 2.1 -Original Message- From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] Sent: Friday, January 16, 2015 3:17 PM To: user@spark.apache.org Subject: spark 1.2 compatibility Is spark 1.2 compatib

RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
Yes. It's compatible with HDP 2.1 -Original Message- From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] Sent: Friday, January 16, 2015 3:17 PM To: user@spark.apache.org Subject: spark 1.2 compatibility Is spark 1.2 compatible with HDP 2.1 -- View this message in context: htt

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-09 Thread Shixiong Zhu
The official distribution has the same issue. I opened a ticket: https://issues.apache.org/jira/browse/SPARK-5172 Best Regards, Shixiong Zhu 2015-01-08 15:51 GMT+08:00 Shixiong Zhu : > I have not used CDH5.3.0. But it looks like > spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar contains some > ha

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-07 Thread Shixiong Zhu
I have not used CDH5.3.0. But it looks like spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar contains some hadoop1 jars (coming from a wrong hbase version). I don't know the recommended way to build the "spark-examples" jar because the official Spark docs do not mention how to build the "spark-examples" jar

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-07 Thread Antony Mayi
thanks, I found the issue: I was including /usr/lib/spark/lib/spark-examples-1.2.0-cdh5.3.0-hadoop2.5.0-cdh5.3.0.jar in the classpath - this was breaking it. Now using a custom jar with just the python convertors, and all works like a charm. thanks, antony. On Wednesday, 7 January 2015, 23:57,

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-07 Thread Sean Owen
Yes, the distribution is certainly fine and built for Hadoop 2. It sounds like you are inadvertently including Spark code compiled for Hadoop 1 when you run your app. The general idea is to use the cluster's copy at runtime. Those with more pyspark experience might be able to give more useful direc

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-07 Thread Antony Mayi
this is the official cloudera-compiled stack cdh 5.3.0 - nothing has been done by me, and I presume they are pretty good at building it, so I still suspect the classpath now gets resolved in a different way? thx, Antony. On Wednesday, 7 January 2015, 18:55, Sean Owen wrote: Problems lik

Re: spark 1.2 defaults to MR1 class when calling newAPIHadoopRDD

2015-01-07 Thread Sean Owen
Problems like this are always due to having code compiled for Hadoop 1.x run against Hadoop 2.x, or vice versa. Here, you compiled for 1.x but at runtime Hadoop 2.x is used. A common cause is actually bundling Spark / Hadoop classes with your app, when the app should just use the Spark / Hadoop pr

Re: spark 1.2: value toJSON is not a member of org.apache.spark.sql.SchemaRDD

2015-01-05 Thread Michael Armbrust
I think you are missing something: $ javap -cp ~/Downloads/spark-sql_2.10-1.2.0.jar org.apache.spark.sql.SchemaRDD|grep toJSON public org.apache.spark.rdd.RDD toJSON(); On Mon, Jan 5, 2015 at 3:11 AM, bchazalet wrote: > Hi everyone, > > I have just switched to spark 1.2.0 from 1.1.1, updating

Re: Spark 1.2 Release Date

2014-12-18 Thread Al M
Awesome. Thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-2-Release-Date-tp20765p20767.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Spark 1.2 Release Date

2014-12-18 Thread nitin
Soon enough :) http://apache-spark-developers-list.1001551.n3.nabble.com/RESULT-VOTE-Release-Apache-Spark-1-2-0-RC2-td9815.html -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-1-2-Release-Date-tp20765p20766.html Sent from the Apache Spark User List ma

Re: Spark 1.2 Release Date

2014-12-18 Thread Silvio Fiorito
It's on Maven Central already http://search.maven.org/#browse%7C717101892 On 12/18/14, 2:09 PM, "Al M" wrote: > Is there a planned release date for Spark 1.2? I saw on the Spark Wiki that we are already in the latter p

Re: Spark 1.2 + Avro file does not work in HDP2.2

2014-12-16 Thread Zhan Zhang
Hi Manas, There is a small patch needed for HDP2.2. You can refer to this PR https://github.com/apache/spark/pull/3409 There are some other issues compiling against hadoop2.6. But we will fully support it very soon. You can ping me, if you want. Thanks. Zhan Zhang On Dec 12, 2014, at 11:38 AM

Re: Spark 1.2 + Avro does not work in HDP2.2

2014-12-16 Thread Sean Owen
Given that the error is java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected ...this usually means there is a Hadoop version problem. But in particular it's https://issues.apache.org/jira/browse/SPARK-3039 which affects as

Re: Spark 1.2 + Avro does not work in HDP2.2

2014-12-16 Thread manasdebashiskar
Hi All, I saw some help online about forcing avro-mapred to hadoop2 using classifiers. Now my configuration is thus: val avro = "org.apache.avro" % "avro-mapred" % V.avro classifier "hadoop2" However I still get java.lang.IncompatibleClassChangeError. I think I am not building sp
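
For reference, the classifier-based sbt setting in full (avro version hypothetical). If the IncompatibleClassChangeError persists, check the runtime classpath for a hadoop1-flavoured avro-mapred pulled in elsewhere, e.g. inside a bundled spark-examples jar as in the "defaults to MR1 class" thread above:

    // build.sbt sketch: force the hadoop2 build of avro-mapred
    libraryDependencies += "org.apache.avro" % "avro-mapred" % "1.7.7" classifier "hadoop2"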