Spark development with IntelliJ

2015-01-08 Thread Jakub Dubovsky
Hi devs, I'd like to ask if anybody has experience with using IntelliJ 14 to step into Spark code. Whatever I try I get a compilation error: Error:scalac: bad option: -P:/home/jakub/.m2/repository/org/scalamacros/paradise_2.10.4/2.0.1/paradise_2.10.4-2.0.1.jar   Project is set up by Patrick's

Re: Spark development with IntelliJ

2015-01-08 Thread Petar Zecevic
This helped me: http://stackoverflow.com/questions/26995023/errorscalac-bad-option-p-intellij-idea On 8.1.2015. 11:00, Jakub Dubovsky wrote: Hi devs, I'd like to ask if anybody has experience with using intellij 14 to step into spark code. Whatever I try I get compilation error: Error:sc

Re: Spark development with IntelliJ

2015-01-08 Thread Sean Owen
Yeah, I hit this too. IntelliJ picks this up from the build but then it can't run its own scalac with this plugin added. Go to Preferences > Build, Execution, Deployment > Scala Compiler and clear the "Additional compiler options" field. It will work then although the option will come back when th

Re: Spark development with IntelliJ

2015-01-08 Thread Jakub Dubovsky
Thanks, that helped. I vote for the wiki as well. More fine-grained documentation should be on the wiki and linked, Jakub -- Original message -- From: Sean Owen To: Jakub Dubovsky Date: 8. 1. 2015 11:29:22 Subject: Re: Spark development with IntelliJ "Yeah, I hit this too. IntelliJ pic

Results of tests

2015-01-08 Thread Tony Reix
Hi, I'm checking that Spark works fine on a new environment (PPC64 hardware). I've found some issues, with versions 1.1.0, 1.1.1, and 1.2.0, even when running on Ubuntu on x86_64 with Oracle JVM. I'd like to know where I can find the results of the tests of Spark, for each version and for the dif

Re: Results of tests

2015-01-08 Thread Ted Yu
Please take a look at https://amplab.cs.berkeley.edu/jenkins/view/Spark/ On Thu, Jan 8, 2015 at 5:40 AM, Tony Reix wrote: > Hi, > I'm checking that Spark works fine on a new environment (PPC64 hardware). > I've found some issues, with versions 1.1.0, 1.1.1, and 1.2.0, even when > running on Ubun

K-Means And Class Tags

2015-01-08 Thread Devl Devel
Hi All, I'm trying a simple K-Means example as per the website: val parsedData = data.map(s => Vectors.dense(s.split(',').map(_.toDouble))) but I'm trying to write a Java based validation method first so that missing values are omitted or replaced with 0. public RDD prepareKMeans(JavaRDD data)
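A minimal sketch of the validation step described above, assuming the goal is to replace missing or unparsable fields with 0.0 rather than dropping the row (the helper name `parseLine` is hypothetical):

```scala
// Hypothetical helper: parse one CSV line for k-means input,
// replacing missing or non-numeric fields with 0.0 instead of
// dropping the row. split(",", -1) keeps trailing empty fields.
def parseLine(line: String): Array[Double] =
  line.split(",", -1).map { field =>
    try field.trim.toDouble
    catch { case _: NumberFormatException => 0.0 }
  }

// On the Spark side this would be applied before clustering, e.g.:
//   val parsedData = data.map(line => Vectors.dense(parseLine(line)))
```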

RE: Results of tests

2015-01-08 Thread Tony Reix
Thanks! I've been able to see that there are 3745 tests for version 1.2.0 with profile Hadoop 2.4. However, on my side, the maximum number of tests I've seen is 3485... About 300 tests are missing on my side. Which Maven option has been used for producing the report file used for building the page:

Re: K-Means And Class Tags

2015-01-08 Thread Yana Kadiyska
How about data.map(s => s.split(",")).filter(_.length > 1).map(good_entry => Vectors.dense(good_entry(0).toDouble, good_entry(1).toDouble)) (full disclosure, I didn't actually run this). But after the first map you should have an RDD[Array[String]], then you'd discard everything

Re: Registering custom metrics

2015-01-08 Thread Enno Shioji
FYI I found this approach by Ooyala. /** Instrumentation for Spark based on accumulators. * * Usage: * val instrumentation = new SparkInstrumentation("example.metrics") * val numReqs = sc.accumulator(0L) * instrumentation.source.registerDailyAccumulator(numReqs, "numReqs") * instrument

Re: Spark on teradata?

2015-01-08 Thread xhudik
I don't think this makes sense. The Teradata database is a standard (though parallel) RDBMS, while Spark is used for non-relational workloads. What could make sense is to deploy Spark on Teradata Aster. Aster is a database cluster that can call external programs via its STREAM operator. That said Spark/Scala app can

Re: Results of tests

2015-01-08 Thread Ted Yu
Here it is: [centos] $ /home/jenkins/tools/hudson.tasks.Maven_MavenInstallation/Maven_3.0.5/bin/mvn -DHADOOP_PROFILE=hadoop-2.4 -Dlabel=centos -DskipTests -Phadoop-2.4 -Pyarn -Phive clean package You can find the above in https://amplab.cs.berkeley.edu/jenkins/view/Spark/job/Spark-1.2-Maven-wit

Re: Registering custom metrics

2015-01-08 Thread Gerard Maas
Very interesting approach. Thanks for sharing it! On Thu, Jan 8, 2015 at 5:30 PM, Enno Shioji wrote: > FYI I found this approach by Ooyala. > > /** Instrumentation for Spark based on accumulators. > * > * Usage: > * val instrumentation = new SparkInstrumentation("example.metrics") > * va

Re: Maintainer for Mesos

2015-01-08 Thread RJ Nowling
Hi Andrew, Patrick Wendell and Andrew Or have committed previous patches related to Mesos. Maybe they would be good committers to look at it? RJ On Mon, Jan 5, 2015 at 6:40 PM, Andrew Ash wrote: > Hi Spark devs, > > I'm interested in having a committer look at a PR [1] for Mesos, but > there's

Re: Spark on teradata?

2015-01-08 Thread Reynold Xin
It depends on your use case. If the use case is to extract a small amount of data out of Teradata, then you can use the JdbcRDD and, soon, a JDBC input source based on the new Spark SQL external data source API. On Wed, Jan 7, 2015 at 7:14 AM, gen tang wrote: > Hi, > > I have a stupid question: >
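A hedged sketch of the JdbcRDD approach mentioned above, pulling a bounded slice of a Teradata table. The JDBC URL, table, and column names are hypothetical; the two `?` placeholders are required by JdbcRDD so it can split the query into partitions by key range:

```scala
import java.sql.DriverManager
import org.apache.spark.SparkContext
import org.apache.spark.rdd.JdbcRDD

// JdbcRDD substitutes the partition bounds into the two '?'
// placeholders, so each partition reads a disjoint id range.
val query = "SELECT id, name FROM my_table WHERE id >= ? AND id <= ?"

def loadSlice(sc: SparkContext): JdbcRDD[(Int, String)] =
  new JdbcRDD(
    sc,
    () => DriverManager.getConnection("jdbc:teradata://host/DATABASE=mydb"),
    query,
    1, 1000, 4, // lower bound, upper bound, number of partitions
    rs => (rs.getInt(1), rs.getString(2)))
```

Each worker opens its own connection through the supplied function, so the connection itself never needs to be serialized.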

Re: K-Means And Class Tags

2015-01-08 Thread devl.development
Thanks for the suggestion. Can anyone offer any advice on the ClassCastException going from Java to Scala? Why does going from JavaRDD.rdd() to a collect() result in this exception? -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/K-Means-And-Cla

Re: K-Means And Class Tags

2015-01-08 Thread Devl Devel
Thanks for the suggestion. Can anyone offer any advice on the ClassCastException going from Java to Scala? Why does JavaRDD.rdd() and then a collect() result in this exception? On Thu, Jan 8, 2015 at 4:13 PM, Yana Kadiyska wrote: > How about > > data.map(s=>s.split(",")).filter(_.length>1).map(

Re: K-Means And Class Tags

2015-01-08 Thread Joseph Bradley
I believe you're running into an erasure issue which we found in DecisionTree too. Check out: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/tree/RandomForest.scala#L134 That retags RDDs which were created from Java to prevent the exception you're running
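A hedged sketch of the user-side workaround, under the erasure explanation above: an RDD built on the Java side carries an `Object` ClassTag, so `collect()` materializes an `Array[Object]` and the cast to a typed array fails. Re-creating the RDD under an explicit ClassTag (essentially what the `retag` call in RandomForest does internally) avoids the ClassCastException. `javaRdd` here is a hypothetical JavaRDD[Vector] produced in Java code:

```scala
import scala.reflect.ClassTag
import org.apache.spark.api.java.JavaRDD
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

// Map through the identity function while supplying the Vector
// ClassTag explicitly, so the resulting RDD collects into an
// Array[Vector] rather than an Array[Object].
def withVectorTag(javaRdd: JavaRDD[Vector]): RDD[Vector] =
  javaRdd.rdd.map(identity)(ClassTag(classOf[Vector]))
```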

Re: Spark development with IntelliJ

2015-01-08 Thread Bill Bejeck
I was having the same issue and that helped. But now I get the following compilation error when trying to run a test from within Intellij (v 14) /Users/bbejeck/dev/github_clones/bbejeck-spark/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala Error:(308, 109) polymorphic

PR #3872

2015-01-08 Thread Bill Bejeck
Could one of the admins take a look at PR 3872 (JIRA 3299), submitted on 1/1?

Re: Spark development with IntelliJ

2015-01-08 Thread Sean Owen
I remember seeing this too, but it seemed to be transient. Try compiling again. In my case I recall that IJ was still reimporting some modules when I tried to build. I don't see this error in general. On Thu, Jan 8, 2015 at 10:38 PM, Bill Bejeck wrote: > I was having the same issue and that helpe

Re: Spark development with IntelliJ

2015-01-08 Thread Nicholas Chammas
Side question: Should this section in the wiki link to Useful Developer Tools? On Thu Jan 08 2015 at 6:19:55 PM Sean Owe

Re: Spark development with IntelliJ

2015-01-08 Thread Bill Bejeck
That worked, thx On Thu, Jan 8, 2015 at 6:17 PM, Sean Owen wrote: > I remember seeing this too, but it seemed to be transient. Try > compiling again. In my case I recall that IJ was still reimporting > some modules when I tried to build. I don't see this error in general. > > On Thu, Jan 8, 2015

missing document of several messages in actor-based receiver?

2015-01-08 Thread Nan Zhu
Hi, TD and other streaming developers, When I look at the implementation of actor-based receiver (ActorReceiver.scala), I found that there are several messages which are not mentioned in the document case props: Props => val worker = context.actorOf(props) logInfo("Started receiver worker at:
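For context, a hedged sketch of the user-facing side of the actor-based receiver in Spark 1.x (class and message handling here are hypothetical). The undocumented `case props: Props` message quoted above is handled by Spark's internal ActorReceiver supervisor, which spawns a worker actor from the Props it receives; user code normally only implements an actor like this one:

```scala
import akka.actor.{Actor, Props}
import org.apache.spark.streaming.receiver.ActorHelper

// A minimal custom receiver actor: each incoming String record is
// handed to Spark Streaming via ActorHelper's store().
class WordReceiver extends Actor with ActorHelper {
  def receive = {
    case s: String => store(s)
  }
}

// Registered with the streaming context, e.g.:
//   val lines = ssc.actorStream[String](Props[WordReceiver], "WordReceiver")
```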

[ANNOUNCE] Apache Science and Healthcare Track @ApacheCon NA 2015

2015-01-08 Thread Lewis John Mcgibbney
Hi Folks, Apologies for cross posting :( As some of you may already know, @ApacheCon NA 2015 is happening in Austin, TX April 13th-16th. This email is specifically written to attract all folks interested in Science and Healthcare... this is an official call to arms! I am aware that there are man

Re: Spark development with IntelliJ

2015-01-08 Thread Patrick Wendell
Nick - yes. Do you mind moving it? I should have put it in the "Contributing to Spark" page. On Thu, Jan 8, 2015 at 3:22 PM, Nicholas Chammas wrote: > Side question: Should this section > > in >

Re: Spark development with IntelliJ

2015-01-08 Thread Patrick Wendell
Actually I went ahead and did it. On Thu, Jan 8, 2015 at 10:25 PM, Patrick Wendell wrote: > Nick - yes. Do you mind moving it? I should have put it in the > "Contributing to Spark" page. > > On Thu, Jan 8, 2015 at 3:22 PM, Nicholas Chammas > wrote: >> Side question: Should this section >>