Spark meetup on Oct 15 in NYC

2014-09-28 Thread Reynold Xin
Hi Spark users and developers, Some of the most active Spark developers (including Matei Zaharia, Michael Armbrust, Joseph Bradley, TD, Paco Nathan, and me) will be in NYC for Strata NYC. We are working with the Spark NYC meetup group and Bloomberg to host a meetup event. This might be the event w

Re: view not supported in spark thrift server?

2014-09-28 Thread Du Li
Thanks, Michael, for your quick response. Views are critical for my project, which is migrating from Shark to Spark SQL. I have implemented and tested everything else. It would be perfect if views could be implemented soon. Du From: Michael Armbrust <mich...@databricks.com> Date: Sunday, Se

Re: view not supported in spark thrift server?

2014-09-28 Thread Michael Armbrust
Views are not supported yet. It's not currently on the near-term roadmap, but that can change if there is sufficient demand or someone in the community is interested in implementing them. I do not think it would be very hard. Michael On Sun, Sep 28, 2014 at 11:59 AM, Du Li wrote: > > Can anyb

view not supported in spark thrift server?

2014-09-28 Thread Du Li
Can anybody confirm whether or not views are currently supported in Spark? I found “create view translate” in the blacklist of HiveCompatibilitySuite.scala, and the following scenario threw a NullPointerException on beeline/thriftserver (1.1.0). Is there any plan to support it soon? > create table src

Re: SparkSQL: map type MatchError when inserting into Hive table

2014-09-28 Thread Du Li
It turned out to be a bug in my code. In the select clause, the list of fields was misaligned with the schema of the target table. As a consequence, the map data couldn’t be cast to some other type in the schema. Thanks anyway. On 9/26/14, 8:08 PM, "Cheng Lian" wrote: >Would you mind to provide the DDL
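
The misalignment described above can be illustrated with a small positional check. This is only a sketch in plain Python, not Spark or Hive code: the helper `misaligned_columns`, the column names, and the type strings are all hypothetical, and it assumes the insert matches the select list to the target table by position rather than by name.

```python
def misaligned_columns(select_fields, target_schema):
    """Return (position, selected_type, expected_type) for every position
    where the selected field's type differs from the target column's type."""
    return [
        (i, got, want)
        for i, ((_, got), (_, want)) in enumerate(zip(select_fields, target_schema))
        if got != want
    ]

# Hypothetical target table schema and a select list with two fields swapped.
target = [("id", "int"), ("tags", "map<string,string>"), ("name", "string")]
select = [("id", "int"), ("name", "string"), ("tags", "map<string,string>")]

for position, got, want in misaligned_columns(select, target):
    print(f"column {position}: selected {got}, target expects {want}")
```

With the swapped select list, the map column lands where a string is expected (and vice versa), which is the kind of situation that would surface as a cast or match failure at insert time.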

Re: Workflow Scheduler for Spark

2014-09-28 Thread Egor Pahomov
I created a Jira ticket and a design doc on this matter. 2014-09-17 22:28 GMT+04:00 Reynold Xin : > There might've been some misunderstanding. I was referring

Re: How to use multi thread in RDD map function ?

2014-09-28 Thread Yi Tian
For yarn-client mode: SPARK_EXECUTOR_CORES * SPARK_EXECUTOR_INSTANCES = 2 (or 3) * TotalCoresOnYourCluster. For standalone mode: SPARK_WORKER_INSTANCES * SPARK_WORKER_CORES = 2 (or 3) * TotalCoresOnYourCluster. Best Regards, Yi Tian tianyi.asiai...@gmail.com On Sep 28, 2014, at 17:59, myasu
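
As a worked example of the sizing rule above, the arithmetic can be sketched in plain Python. The function name `executor_instances` and the 4-node, 8-core cluster are hypothetical, chosen only for illustration; the factor of 2 (or 3) is the one suggested in the message.

```python
def executor_instances(total_cores: int, executor_cores: int, factor: int = 2) -> int:
    """Number of instances so that executor_cores * instances is at least
    factor * total_cores (the oversubscription rule from the message above)."""
    target_slots = factor * total_cores
    # Ceiling division: round up so the target is reached, not undershot.
    return -(-target_slots // executor_cores)

# Hypothetical cluster: 4 nodes with 8 cores each.
total_cores = 4 * 8

# yarn-client mode: value one might give SPARK_EXECUTOR_INSTANCES
# when SPARK_EXECUTOR_CORES is 4 and the factor is 2.
print(executor_instances(total_cores, executor_cores=4, factor=2))
```

The same arithmetic applies to standalone mode with SPARK_WORKER_CORES and SPARK_WORKER_INSTANCES in place of the executor settings.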

How to use multi thread in RDD map function ?

2014-09-28 Thread myasuka
Hi, everyone. I have come across a problem with increasing concurrency. In a program, after the shuffle write, each node should fetch 16 pairs of matrices to do matrix multiplication, such as: import breeze.linalg.{DenseMatrix => BDM} pairs.map(t => { val b1 = t._2._1.asInstanceOf[BDM[Dou

[MLlib] LogisticRegressionWithSGD and LogisticRegressionWithLBFGS converge with different weights.

2014-09-28 Thread Yanbo Liang
Hi, we have used LogisticRegression with two different optimization methods, SGD and LBFGS, in MLlib. With the same dataset and the same training and test split, we get different weight vectors. For example, we use spark-1.1.0/data/mllib/sample_binary_classification_data.txt as our training and test

Re: A Spark Compilation Question

2014-09-28 Thread Yi Tian
I think you should modify the module settings in IDEA instead of pom.xml. Best Regards, Yi Tian tianyi.asiai...@gmail.com On Sep 26, 2014, at 18:09, Yanbo Liang wrote: > Hi Hansu, > > I have encountered the same problem. Maven compiled avro file and generated > corresponding Java file in n