Re: hive on spark query error

2015-09-25 Thread Jimmy Xiang
> Error: Master must start with yarn, spark, mesos, or local. What's your setting for spark.master? On Fri, Sep 25, 2015 at 9:56 AM, Garry Chen wrote: > Hi All, > > I am following > https://cwiki.apache.org/confluence/display/Hive/Hive+on+Spark%3A+Getting+Started? > to set up Hive
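For Hive on Spark the value is normally set on the Hive side (hive-site.xml or `set spark.master=...`); the sketch below just shows, in Scala, the master URL forms Spark itself accepts. Host names and ports are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hedged sketch: these are the master URL shapes the error message is checking for.
val conf = new SparkConf()
  .setAppName("master-url-example")
  .setMaster("local[2]")            // or "yarn-cluster", "yarn-client",
                                    //    "spark://master-host:7077",
                                    //    "mesos://mesos-host:5050"
val sc = new SparkContext(conf)
```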

Re: Is there any Spark implementation for Item-based Collaborative Filtering?

2014-11-30 Thread Jimmy
The latest version of MLlib has it built in, no? J Sent from my iPhone > On Nov 30, 2014, at 9:36 AM, shahab wrote: > > Hi, > > I just wonder if there is any implementation for Item-based Collaborative > Filtering in Spark? > > best, > /Shahab
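For reference, the recommender that ships with MLlib is ALS matrix factorization rather than classic item-based CF. A minimal sketch against the 1.x API; the input path, delimiter, and hyperparameters are assumptions.

```scala
import org.apache.spark.mllib.recommendation.{ALS, Rating}

// Hedged sketch (MLlib 1.x): collaborative filtering via ALS.
val ratings = sc.textFile("ratings.csv").map { line =>
  val Array(user, item, rating) = line.split(',')
  Rating(user.toInt, item.toInt, rating.toDouble)
}
val model = ALS.train(ratings, 10 /* rank */, 10 /* iterations */, 0.01 /* lambda */)
val score = model.predict(42, 17)   // predicted rating of item 17 for user 42
```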

Re: how to convert System.currentTimeMillis to calendar time

2014-11-13 Thread Jimmy McErlain
You could also use the Joda-Time library, which has a ton of other great options in it. J
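A minimal sketch of both routes for turning an epoch-millis timestamp into calendar time: the plain JDK, and the Joda-Time library suggested above.

```scala
import java.util.Calendar
import org.joda.time.DateTime

val millis = System.currentTimeMillis()

// Plain JDK
val cal = Calendar.getInstance()
cal.setTimeInMillis(millis)
println(cal.getTime)                          // a java.util.Date

// Joda-Time
val dt = new DateTime(millis)
println(dt.toString("yyyy-MM-dd HH:mm:ss"))
```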

Re: which is the recommended workflow engine for Apache Spark jobs?

2014-11-10 Thread Jimmy McErlain
I have used Oozie for all our workflows with Spark apps, but you will have to use a Java action as the workflow element. I am interested in anyone's experience with Luigi and/or any other tools. On Mon, Nov 10, 2014 at 10:34 AM, Adamantios Corais < adamantios.cor...@gmail.com> wrote: > I have som

Re: Unable to use HiveContext in spark-shell

2014-11-06 Thread Jimmy McErlain
Can you be more specific? What version of Spark, Hive, Hadoop, etc.? What are you trying to do? What are the issues you are seeing? J
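For context, a minimal sketch of using HiveContext from spark-shell in the Spark 1.1 era; it assumes a build with Hive support, hive-site.xml on the classpath, and `sc` being the shell's SparkContext.

```scala
// Hedged sketch (Spark 1.1 era): HiveContext inside spark-shell.
val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
hiveContext.sql("SHOW TABLES").collect().foreach(println)
```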

Re: Spark v Redshift

2014-11-04 Thread Jimmy McErlain
and over again to fit models, so it's pulled into memory once and then basically analyzed through the algos... other DB systems are reading and writing to disk repeatedly and are thus slower, such as Mahout (though it's getting ported over to Spark as well to compete with MLlib)... J
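A hedged sketch of that point: cache the training data once, then let an iterative MLlib algorithm re-scan it from memory instead of re-reading disk on every pass. The path and algorithm choice are illustrative only.

```scala
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD
import org.apache.spark.mllib.util.MLUtils

// The training set is read from disk once, cached, and then re-scanned
// from memory on each iteration of the optimizer.
val training = MLUtils.loadLibSVMFile(sc, "training.libsvm").cache()  // path is an assumption
val model = LogisticRegressionWithSGD.train(training, 100 /* iterations */)
```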

Re: issue on applying SVM to 5 million examples.

2014-10-30 Thread Jimmy
>> > // val prediction = model.predict(point.features) >> > // (point.label, prediction) >> > // } >> > // val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble / >> > testParsedData.count >> > // println("Training E
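The quoted lines look like the tail of a train/evaluate loop. A hedged, self-contained reconstruction against the MLlib 1.x API; the input path, CSV parsing, and the 80/20 split are assumptions, not the poster's exact code.

```scala
import org.apache.spark.mllib.classification.SVMWithSGD
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Parse label,feature1,feature2,... lines into LabeledPoints (layout assumed).
val parsed = sc.textFile("examples.csv").map { line =>
  val parts = line.split(',').map(_.toDouble)
  LabeledPoint(parts.head, Vectors.dense(parts.tail))
}
val Array(trainParsedData, testParsedData) = parsed.randomSplit(Array(0.8, 0.2))

val model = SVMWithSGD.train(trainParsedData, 100 /* iterations */)
val labelAndPreds = testParsedData.map { point =>
  val prediction = model.predict(point.features)
  (point.label, prediction)
}
val trainErr = labelAndPreds.filter(r => r._1 != r._2).count.toDouble / testParsedData.count
println("Training Error = " + trainErr)
```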

Re: issue on applying SVM to 5 million examples.

2014-10-30 Thread Jimmy
Watch the app manager; it should tell you what's running and taking a while... My guess is it's a "distinct" function on the data. J Sent from my iPhone > On Oct 30, 2014, at 8:22 AM, peng xia wrote: > > Hi, > > > > Previously we have applied the SVM algorithm in MLlib to 5 million records (600 > mb

Re: Spark + Tableau

2014-10-30 Thread Jimmy
What ODBC driver are you using? We recently got the Hortonworks ODBC drivers working on a Windows box but were having issues on a Mac. Sent from my iPhone > On Oct 30, 2014, at 4:23 AM, Bojan Kostic wrote: > > I'm testing the beta driver from Databricks for Tableau. > And unfortunately I enco

Re: Exception while reading SendingConnection to ConnectionManagerId

2014-10-16 Thread Jimmy Li
Does anyone know anything re: this error? Thank you! On Wed, Oct 15, 2014 at 3:38 PM, Jimmy Li wrote: > Hi there, I'm running Spark on EC2, and am running into an error there > that I don't get locally. Here's the error: > > 11335 [handle-r

Re: TaskNotSerializableException when running through Spark shell

2014-10-16 Thread Jimmy McErlain
is working fine... it leads me to believe that it is a bug within the REPL for 1.1. Can anyone else confirm this?
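For anyone hitting the same exception, a hedged sketch of the usual trigger (generic example code, not this poster's case, which looks REPL-specific): a closure that captures a non-serializable object.

```scala
// Multiplier is deliberately not Serializable.
class Multiplier(val factor: Int)

val m = new Multiplier(3)
// The closure captures m, so Spark tries (and fails) to serialize it.
sc.parallelize(1 to 10).map(_ * m.factor).collect()
```

One known wrinkle is that the shell wraps each line in an outer object, and that wrapper can be pulled into a closure along with everything else defined on the line, so code that runs fine as a compiled app can still behave differently in spark-shell.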

Exception while reading SendingConnection to ConnectionManagerId

2014-10-15 Thread Jimmy Li
agerId([IP HERE]) java.nio.channels.ClosedChannelException Does anyone know what might be causing this? Spark is running on my EC2 instances. Thanks, Jimmy

Re: Spark can't find jars

2014-10-14 Thread Jimmy McErlain
pushing them out to the cluster and pointing them to corresponding dependent jars. Sorry I cannot be of more help! J
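A hedged sketch of the "pointing them to corresponding dependent jars" part done programmatically (spark-submit's --jars flag is the other common route); the jar paths below are placeholders, not real artifacts.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// List dependency jars on the SparkConf so they are shipped to executors.
val conf = new SparkConf()
  .setAppName("my-app")
  .setJars(Seq(
    "hdfs:///apps/libs/dependency-a.jar",
    "file:///opt/myapp/lib/dependency-b.jar"))
val sc = new SparkContext(conf)
```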

Re: Spark can't find jars

2014-10-13 Thread Jimmy McErlain
BTW, this has always worked for me until we upgraded the cluster to Spark 1.1.1... J

Re: Spark can't find jars

2014-10-13 Thread Jimmy McErlain
That didn't seem to work... the jar files are in the target/scala-2.10 folder when I package, then I move the jar to the cluster and launch the app... still the same error... Thoughts? J

Re: Spark can't find jars

2014-10-13 Thread Jimmy
Having the exact same error with the exact same jar. Do you work for Altiscale? :) J Sent from my iPhone > On Oct 13, 2014, at 5:33 PM, Andy Srine wrote: > > Hi Guys, > > Spark rookie here. I am getting a file not found exception on the --jars. > This is in yarn cluster mode and I am

Re: Print Decision Tree Models

2014-10-01 Thread Jimmy
Yeah I'm using 1.0.0 and thanks for taking the time to check! Sent from my iPhone > On Oct 1, 2014, at 8:48 PM, Xiangrui Meng wrote: > > Which Spark version are you using? It works in 1.1.0 but not in 1.0.0. > -Xiangrui > >> On Wed, Oct 1, 2014 at 2:13 PM, Jimmy M

Print Decision Tree Models

2014-10-01 Thread Jimmy McErlain
to print but where it resides in memory. Thanks, J
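A hedged sketch of training and printing a small tree against the Spark 1.1 MLlib API; the data path, number of classes, and tree parameters are assumptions. Per Xiangrui's reply above, printing the model works on 1.1.0 but not 1.0.0.

```scala
import org.apache.spark.mllib.tree.DecisionTree
import org.apache.spark.mllib.util.MLUtils

val data = MLUtils.loadLibSVMFile(sc, "sample_libsvm_data.txt")   // path is an assumption
val model = DecisionTree.trainClassifier(
  data, 2 /* numClasses */, Map[Int, Int]() /* categoricalFeaturesInfo */,
  "gini", 5 /* maxDepth */, 32 /* maxBins */)
println(model)   // per the reply above, prints usefully on 1.1.0 but not 1.0.0
```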

Re: Window comparison matching using the sliding window functionality: feasibility

2014-09-30 Thread Jimmy McErlain
Not sure if this is what you are after, but it's based on a moving average within Spark... I was building an ARIMA model on top of Spark and this helped me out a lot: http://stackoverflow.com/questions/23402303/apache-spark-moving-average
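The linked answer's approach, as a hedged sketch: MLlib's sliding() (a developer API for RDDs of ordered data) produces overlapping windows that can be averaged.

```scala
import org.apache.spark.mllib.rdd.RDDFunctions._

// sliding(k) yields overlapping windows of k consecutive elements.
val series = sc.parallelize(1 to 10).map(_.toDouble)
val movingAvg = series.sliding(3).map(w => w.sum / w.size)
movingAvg.collect().foreach(println)
```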