To eliminate any skepticism around whether CPU is a good performance metric
for this workload, I did a couple of comparison runs of an example job to
demonstrate a more universal change in performance metrics (stage/job time)
between coarse- and fine-grained mode on Mesos.
The workload is identical he
You should never use the training data to measure your prediction accuracy.
Always use a fresh dataset (test data) for this purpose.
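In Spark this is typically done with `randomSplit` on an RDD or DataFrame; here is a plain-Python sketch of the holdout idea (the toy data, the split ratio, and the stand-in "model" are all illustrative, not from the thread):

```python
import random

def train_test_split(data, test_fraction=0.3, seed=42):
    """Shuffle and split a dataset into train and test portions."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def accuracy(model, labeled_points):
    """Fraction of points whose predicted label matches the true label."""
    correct = sum(1 for features, label in labeled_points
                  if model(features) == label)
    return correct / len(labeled_points)

# Toy data: (feature, label) pairs where label = 1 iff feature > 0.
data = [(x, 1 if x > 0 else 0) for x in range(-50, 50)]
train, test = train_test_split(data)
model = lambda x: 1 if x > 0 else 0  # stand-in for a trained classifier
print(accuracy(model, test))  # evaluated on held-out data only
```

The point is that `accuracy` is only ever computed on `test`, which the (hypothetical) training step never saw.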
On Sun, Nov 29, 2015 at 8:36 AM, Jeff Zhang wrote:
> I think this should represent the label of LabeledPoint (0 means negative, 1
> means positive)
> http://spark.ap
Hi,
My limited understanding of Spark tells me that a task is the smallest
possible unit of work, and Spark itself won't give you much below that. I
wouldn't expect so, since "account" is a business entity, not Spark's
own.
What about using mapPartitions* to get at the details of partitions and
do whatever you want?
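The mapPartitions idea can be sketched in plain Python (partitions modeled as lists, and the per-partition summary is an invented example, not something from the thread):

```python
def map_partitions(partitions, func):
    """Apply func to each whole partition (an iterator), in the spirit
    of Spark's RDD.mapPartitions."""
    return [list(func(iter(part))) for part in partitions]

def summarize(records):
    """Per-partition work: count the records and emit one summary
    per partition instead of one output per record."""
    records = list(records)
    yield {"count": len(records), "total": sum(records)}

partitions = [[1, 2, 3], [4, 5], [6]]
print(map_partitions(partitions, summarize))
# [[{'count': 3, 'total': 6}], [{'count': 2, 'total': 9}], [{'count': 1, 'total': 6}]]
```

Because `func` sees the whole partition at once, you can open a connection or build state once per partition rather than once per record.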
In these scenarios it's fairly standard to report the metrics either
directly or through accumulators (
http://spark.apache.org/docs/latest/programming-guide.html#accumulators-a-nameaccumlinka)
to a time series database such as Graphite (http://graphite.wikidot.com/)
or OpenTSDB (http://opentsdb.ne
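The accumulator pattern can be sketched in plain Python (the `Accumulator` class and the bad-record metric are illustrative stand-ins, not Spark's actual API):

```python
class Accumulator:
    """Minimal counter in the spirit of Spark accumulators:
    tasks only call add(); only the driver reads value."""
    def __init__(self, initial=0):
        self.value = initial

    def add(self, n):
        self.value += n

def process(record, bad_records):
    """Parse a record, counting failures as a side metric."""
    try:
        return int(record)
    except ValueError:
        bad_records.add(1)
        return None

bad_records = Accumulator()
results = [process(r, bad_records) for r in ["1", "2", "oops", "4"]]
print(bad_records.value)  # the metric you would ship to Graphite/OpenTSDB
```

After the job, the driver reads `bad_records.value` and reports it to the time series database.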
This looks interesting, thanks Ruslan. But compaction with Hive is as
simple as an INSERT OVERWRITE statement, since Hive
supports CombineFileInputFormat. Is it possible to do the same with Spark?
On Thu, Nov 26, 2015 at 9:47 AM, Ruslan Dautkhanov
wrote:
> An interesting compaction approach of smal
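In Spark the usual approach is to reduce the partition count (e.g. with `coalesce`/`repartition`) before writing, which merges many small outputs into fewer large ones. A plain-Python sketch of that greedy merge idea (the record counts and target size are invented for illustration):

```python
def compact(small_files, target_size):
    """Greedily merge small files (modeled as lists of records) into
    fewer, larger files of at least target_size records each."""
    merged, current = [], []
    for f in small_files:
        current.extend(f)
        if len(current) >= target_size:
            merged.append(current)
            current = []
    if current:
        merged.append(current)  # last, possibly undersized, file
    return merged

small = [[1], [2, 3], [4], [5, 6, 7], [8]]
print(compact(small, target_size=4))  # [[1, 2, 3, 4], [5, 6, 7, 8]]
```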
The workaround is to have your code in the same package, or write some
utility wrapper in the same package so you can use them in your code.
Mostly we implemented those BLAS routines for our own needs, and we don't
have a general use case in mind. As a result, if we open them up prematurely,
it will add our api mai
Hi Adam,
Thanks for the graphs and the tests; definitely interested in digging a
bit deeper to find out what could be the cause of this.
Do you have the spark driver logs for both runs?
Tim
On Mon, Nov 30, 2015 at 9:06 AM, Adam McElwee wrote:
> To eliminate any skepticism around whether cpu is a
The JDBC drivers are currently being pulled in as test-scope dependencies
of the `sql/core` module:
https://github.com/apache/spark/blob/f2fbfa444f6e8d27953ec2d1c0b3abd603c963f9/sql/core/pom.xml#L91
In SBT, these wind up on the Docker JDBC tests' classpath as a transitive
dependency of the `spark-
Or you could also use reflection like in this Spark Package:
https://github.com/brkyvz/lazy-linalg/blob/master/src/main/scala/com/brkyvz/spark/linalg/BLASUtils.scala
Best,
Burak
On Mon, Nov 30, 2015 at 12:48 PM, DB Tsai wrote:
> The workaround is have your code in the same package, or write som
I used reflection initially, but I found it's very slow, especially in
a tight loop. Maybe caching the reflection lookup could help, which I never tried.
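The caching idea is simply to resolve the member once, outside the loop, and reuse the handle. A plain-Python sketch using `getattr` in place of JVM reflection (the `BLASLike` class and `axpy` signature are invented stand-ins for the package-private BLAS helpers):

```python
class BLASLike:
    """Stand-in for a package-private BLAS helper reached via reflection."""
    def axpy(self, a, x, y):
        return [a * xi + yi for xi, yi in zip(x, y)]

obj = BLASLike()

# Uncached: resolve the method by name on every iteration
# (the reflective lookup sits inside the tight loop).
uncached = [getattr(obj, "axpy")(2, [i], [1])[0] for i in range(5)]

# Cached: resolve once, then call the bound method directly in the loop.
axpy = getattr(obj, "axpy")
cached = [axpy(2, [i], [1])[0] for i in range(5)]

print(uncached == cached)  # same results, one lookup instead of five
```

On the JVM the analogous move is caching the `Method`/`MethodHandle` in a field instead of calling `getMethod` per invocation.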
Sincerely,
DB Tsai
--
Web: https://www.dbtsai.com
PGP Key ID: 0xAF08DF8D
On Mon, Nov 30, 2015 at 2
model.predict should return a 0/1 predicted label. The example code is
misleading when it calls the prediction a "score."
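The distinction between a raw score and a hard 0/1 label can be sketched in plain Python for a logistic-regression-style model (the weights, features, and 0.5 threshold are illustrative, not from the thread):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, features, threshold=0.5):
    """Return a hard 0/1 label; the margin/probability is the 'score'
    that a prediction column might confusingly be called."""
    margin = sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(margin)
    return 1 if probability > threshold else 0

w = [2.0, -1.0]
print(predict(w, [3.0, 1.0]))  # margin 5.0  -> probability ~0.99 -> label 1
print(predict(w, [0.0, 4.0]))  # margin -4.0 -> probability ~0.02 -> label 0
```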
On Mon, Nov 30, 2015 at 9:13 AM, Fazlan Nazeem wrote:
> You should never use the training data to measure your prediction
> accuracy. Always use a fresh dataset (test data)
It should work with 1.5+.
On Thu, Nov 26, 2015 at 12:53 PM, Ndjido Ardo Bar wrote:
>
> Hi folks,
>
> Does anyone know whether the Grid Search capability is enabled since issue
> SPARK-9011 of version 1.4.0? I'm getting the "rawPredictionCol
> column doesn't exist" when trying to perform a g
Hi Joseph,
Yes, Random Forest supports Grid Search on Spark 1.5+. But I'm getting a
"rawPredictionCol field does not exist" exception on Spark 1.5.2 for the
Gradient Boosting Trees classifier.
Ardo
On Tue, 1 Dec 2015 at 01:34, Joseph Bradley wrote:
> It should work with 1.5+.
>
> On Thu, Nov 26, 2
As most of you probably know, FOSDEM 2016 (the biggest,
100% free open source developer conference) is right
around the corner:
https://fosdem.org/2016/
We hope to have an ASF booth and we would love to see as
many ASF projects as possible present at various tracks
(AKA Developer rooms):
htt
Hi Ndjido,
This is because GBTClassifier doesn't yet have a rawPredictionCol like the
RandomForestClassifier has.
Cf:
http://spark.apache.org/docs/latest/ml-ensembles.html#output-columns-predictions-1
On 1 Dec 2015 3:57 a.m., "Ndjido Ardo BAR" wrote:
> Hi Joseph,
>
> Yes Random Forest support G
Just want to follow up.
On Nov 25, 2015 9:19 PM, "Alexander Pivovarov" wrote:
> Hi Everyone
>
> I noticed that the spark-ec2 script is outdated.
> How do I add 1.5.2 support to ec2/spark_ec2.py?
> What else (besides updating the Spark version in the script) should be done
> to add 1.5.2 support?
>
> We al
Hi Benjamin,
Thanks, the documentation you sent is clear.
Is there any other way to perform a Grid Search with GBT?
Ndjido
On Tue, 1 Dec 2015 at 08:32, Benjamin Fradet
wrote:
> Hi Ndjido,
>
> This is because GBTClassifier doesn't yet have a rawPredictionCol like
> the RandomForestClassifier h