FYI: Prof John Canny is giving a talk on "Machine Learning at the limit" in SF Big Analytics Meetup

2015-02-10 Thread Chester Chen
In case you are in San Francisco: we are hosting a meetup talk by Prof. John Canny. http://www.meetup.com/SF-Big-Analytics/events/220427049/ Chester

Re: Spark impersonation

2015-02-07 Thread Chester Chen
Sorry for the many typos; I was typing from my cell phone. I hope you can still get the idea. On Sat, Feb 7, 2015 at 1:55 PM, Chester @work wrote: > > I just implemented this in our application. The impersonation is done > before the job is submitted. In spark yarn (we are using yarn cluster mo

Re: Possible bug in ClientBase.scala?

2014-07-13 Thread Chester Chen
Ron, which distribution and version of Hadoop are you using? I just looked at CDH5 (hadoop-mapreduce-client-core-2.3.0-cdh5.0.0), and MRJobConfig does have the field: java.lang.String DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH; Chester On Sun, Jul 13, 2014 at 6:49 PM, Ron Gonzalez wr
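One way to confirm whether the Hadoop build on a given classpath exposes that field is to probe for it with reflection. The following is only a minimal sketch and not from the thread: `FieldCheck` and `hasPublicField` are hypothetical helper names, and the fully qualified name `org.apache.hadoop.mapreduce.MRJobConfig` is assumed from Hadoop's MapReduce client. The helper returns false instead of failing when the class is not on the classpath.

```java
final class FieldCheck {
    /**
     * Returns true if the named class can be loaded and exposes a public
     * field with the given name; returns false if either is missing.
     */
    static boolean hasPublicField(String className, String fieldName) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getField(fieldName); // getField only finds public fields
            return true;
        } catch (ClassNotFoundException | NoSuchFieldException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed class and field names, taken from the message above.
        String cls = "org.apache.hadoop.mapreduce.MRJobConfig";
        String field = "DEFAULT_MAPREDUCE_APPLICATION_CLASSPATH";
        System.out.println(cls + "." + field + " present: " + hasPublicField(cls, field));
    }
}
```

Run it with the same classpath the Spark client uses; a result of false suggests an older or different Hadoop artifact is being picked up.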

Re: Issues in opening UI when running Spark Streaming in YARN

2014-07-07 Thread Chester Chen
so, > this is the correct behavior. However, I believe the redirect error has > little to do with Spark itself, but more to do with how you set up the > cluster. I have actually run into this myself, but I haven't found a > workaround. Let me know if you find anything. > > >

Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Chester Chen
I don't have experience deploying to EC2. Can you use add.jar conf to add the missing jar at runtime? I haven't tried this myself. Just a guess. On Mon, Jul 7, 2014 at 12:16 PM, Chester Chen wrote: > with "provided" scope, you need to provide the "provided"

Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Chester Chen
With "provided" scope, you need to supply the "provided" jars at runtime yourself; in this case, I guess, the Hadoop jar files. On Mon, Jul 7, 2014 at 12:13 PM, Robert James wrote: > Thanks - that did solve my error, but instead got a different one: > java.lang.NoClassDefFoundError: > org/apac

Re: Issues in opening UI when running Spark Streaming in YARN

2014-07-07 Thread Chester Chen
As Andrew explained, the port is random rather than 4040, because the Spark driver is started in the Application Master and the port is randomly selected. But I have a similar UI issue. I am running YARN cluster mode against my local CDH5 cluster. The log states "14/07/07 11:59:29 INFO ui.SparkUI: St

Re: spark-assembly libraries conflict with needed libraries

2014-07-07 Thread Chester Chen
Have you tried changing the Spark SBT scripts? You can change the dependency scope to "provided". This is similar to compile scope, except that the JDK or container needs to provide the dependency at runtime. This assumes that Spark will work with the new version of the common libraries. Of course, this is not a

Re: Shark Vs Spark SQL

2014-07-02 Thread Chester Chen
Yes, they announced at Spark Summit 2014 that Shark is no longer under development and will be replaced by Spark SQL. Chester On Wed, Jul 2, 2014 at 3:53 PM, Subacini B wrote: > Hi, > > > http://mail-archives.apache.org/mod_mbox/spark-user/201403.mbox/%3cb75376b8-7a57-4161-b604-f919886cf...

Re: Integrate spark-shell into officially supported web ui/api plug-in? What do you think?

2014-06-27 Thread Chester Chen
I am more interested in using Scala.js as the front-end and just using Spark as the back-end. I think that would be more than simply a spark-shell in the browser. Similar to IPython, it would allow users to interact with other plotting libraries in the browser, such as D3. Chester On Fri, Jun 27,

Re: Is spark context in local mode thread-safe?

2014-06-09 Thread Chester Chen
Akka to spawn more threads and in that case it would probably be okay. See http://doc.akka.io/docs/akka/snapshot/java/dispatchers.html for some details on Akka thread usage and how to configure it. Matei On Jun 9, 2014, at 4:54 PM, Chester Chen wrote: Matei, >If we use different Akka act

Re: Is spark context in local mode thread-safe?

2014-06-09 Thread Chester Chen
Matei, if we use different Akka actors (not different threads) to process different users' requests, is the SparkContext still safe to use for different users? Yes, it would be nice to disable the UI via configuration, especially when we develop locally. We use the sbt-web plugin to debug Tomcat code

[ANN]: Scala By the Bay Developer Conference, CFP now open

2014-05-29 Thread Chester Chen
Hi Sparkers, Scala By The Bay 2014 is a new conference for developers who use the Scala language or are interested in functional programming practices. View on www.scalabythebay.org. Scala By The Bay is renamed from last year's successful "Silicon Valley

Re: "sbt/sbt run" command returns a JVM problem

2014-05-01 Thread Chester Chen
that takes. But your error is that you're already asking for too much memory for your machine. So maybe you are setting the value successfully, but it's not valid. How big? On Thu, May 1, 2014 at 2:57 PM, Chester Chen wrote: > You might want to check the memory settings in sbt itself, wh

Re: "sbt/sbt run" command returns a JVM problem

2014-05-01 Thread Chester Chen
You might want to check the memory settings in sbt itself, which is a shell script that runs a java command. I don't have a computer at hand, but if you vim or cat sbt/sbt, you should see the memory settings; change them to fit your needs. You might also be able to override the setting by changing .sbto
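To see which heap limit the JVM actually received after changing a memory setting, a small program can print it. This is a sketch and not from the thread: `HeapCheck` and `toMegabytes` are hypothetical names. `Runtime.maxMemory()` is the standard JDK call; it reports the maximum memory the JVM will attempt to use, which can differ somewhat from the configured -Xmx value.

```java
final class HeapCheck {
    /** Converts a byte count to whole mebibytes, rounding down. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // Maximum amount of memory this JVM will attempt to use.
        long maxBytes = Runtime.getRuntime().maxMemory();
        System.out.println("JVM max heap: " + toMegabytes(maxBytes) + " MB");
    }
}
```

If the printed value does not change after editing the setting, the edit is probably not reaching the java command that sbt launches.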

[ANN]: Scala By The Bay Conference (aka Silicon Valley Scala Symposium)

2014-04-30 Thread Chester Chen
Hi, this is not related to Spark, but I thought you might be interested: the second SF Scala conference is coming this August. The SF Scala conference was called "Silicon Valley Scala Symposium" last year. From now on, it will be known as "Scala By The Bay". http://www.scalabythebay

Re: is it okay to reuse objects across RDD's?

2014-04-28 Thread Chester Chen
Tom, are you suggesting two RDDs, one with the loss and another with the rest of the information, using zip to tie them together, but updating only the loss RDD (as a copy)? Chester Sent from my iPhone On Apr 28, 2014, at 9:45 AM, Tom Vacek wrote: > I'm not sure what I said came through. RDD zip is not hacky at al

Re: K-means with large K

2014-04-28 Thread Chester Chen
David, just curious: what kind of use cases demand such a large K? Chester Sent from my iPhone On Apr 28, 2014, at 9:19 AM, "Buttler, David" wrote: > Hi, > I am trying to run the K-means code in mllib, and it works very nicely with > small K (less than 1000). However, when I

Re: Is Branch 1.0 build broken ?

2014-04-11 Thread Chester Chen
environment. Do you need proxy settings? Any other errors in the log about why you can't access it? On Apr 11, 2014 12:32 AM, "Chester Chen" wrote: I just updated and got the following:  > > > > >[error] (external-mqtt/*:update) sbt.ResolveException: unresolved

Is Branch 1.0 build broken ?

2014-04-10 Thread Chester Chen
I just updated and got the following:  [error] (external-mqtt/*:update) sbt.ResolveException: unresolved dependency: org.eclipse.paho#mqtt-client;0.4.0: not found [error] Total time: 7 s, completed Apr 10, 2014 4:27:09 PM Chesters-MacBook-Pro:spark chester$ git branch * branch-1.0   master Look

Re: spark config params conventions

2014-03-14 Thread Chester Chen
Based on the Typesafe Config maintainer's response, with the latest version of Typesafe Config, double quotes are no longer needed for keys like spark.speculation, so you don't need code to strip the quotes. Chester Alpine Data Labs Sent from my iPhone On Mar 12, 2014, at 2:50 PM, Aaron Davidson wrote: