RE: Add Char support in SQL dataTypes

2015-03-19 Thread Cheng, Hao
Can you use Varchar or String instead? Currently, Spark SQL converts varchar into the string type internally (without a max-length limitation). However, the "char" type is not supported yet. -Original Message- From: A.M.Chan [mailto:kaka_1...@163.com] Sent: Friday, March 20, 2015 9:56
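A minimal sketch of this workaround (assuming a Spark 1.3-style SQLContext; the class and field names are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Illustrative workaround: model the single-character column as String,
    // since Spark SQL maps varchar to an unbounded string type internally.
    case class Record(charField: String, intField: Int)

    val sc = new SparkContext(new SparkConf().setAppName("char-workaround").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = sc.parallelize(Seq(Record("a", 1), Record("b", 2))).toDF()
    df.printSchema() // charField is inferred as StringType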

Add Char support in SQL dataTypes

2015-03-19 Thread A.M.Chan
case class PrimitiveData(
  charField: Char, // Can't get the char schema info
  intField: Int,
  longField: Long,
  doubleField: Double,
  floatField: Float,
  shortField: Short,
  byteField: Byte,
  booleanField: Boolean)

I can't get the schema from case class PrimitiveData. An e

Re: Exception using the new createDirectStream util method

2015-03-19 Thread Cody Koeninger
Yeah, I wouldn't be shocked if Kafka's metadata apis didn't return results for topics that don't have any messages. (sorry about the triple negative, but I think you get my meaning). Try putting a message in the topic and seeing what happens. On Thu, Mar 19, 2015 at 4:38 PM, Alberto Rodriguez w
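A minimal sketch of seeding the topic for such a test (assuming the Kafka 0.8 Scala producer API and a broker at localhost:9092; the topic name is illustrative):

    import java.util.Properties
    import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

    val props = new Properties()
    props.put("metadata.broker.list", "localhost:9092") // broker address is an assumption
    props.put("serializer.class", "kafka.serializer.StringEncoder")

    val producer = new Producer[String, String](new ProducerConfig(props))
    producer.send(new KeyedMessage[String, String]("test-topic", "hello")) // put one message in the topic
    producer.close()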

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-19 Thread Debasish Das
Yeah, it will be better if we consolidate the development on one of them... either Breeze or mllib.BLAS... On Thu, Mar 19, 2015 at 2:25 PM, Ulanov, Alexander wrote: > Thanks for the quick response. > > I can use linalg.BLAS.gemm, and this means that I have to use MLlib > Matrix. The latter does not

Re: Exception using the new createDirectStream util method

2015-03-19 Thread Alberto Rodriguez
Thank you for replying, Ted. I have been debugging, and the getLeaderOffsets method is not appending errors because findLeaders, which is called in the first line of getLeaderOffsets, is not returning any leaders. Cody, the topics do not have any messages yet. Could this be an issue? If you g

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-19 Thread Ulanov, Alexander
Thanks for the quick response. I can use linalg.BLAS.gemm, and this means that I have to use MLlib Matrix. The latter does not support some useful functionality needed for optimization, for example, creating a Matrix given the matrix size, an array, and an offset into that array. This means that I will need t
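A hedged sketch of the difference being described (assuming Breeze's DenseMatrix constructor that accepts a backing array plus offset, which the public mllib.linalg.DenseMatrix constructor does not):

    import breeze.linalg.{DenseMatrix => BDM}
    import org.apache.spark.mllib.linalg.DenseMatrix

    val data = Array(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)

    // Breeze: a 2x2 matrix backed by `data` starting at offset 2, no copy
    val breezeMat = new BDM[Double](2, 2, data, 2)

    // MLlib: the public constructor takes (numRows, numCols, values) only,
    // so the sub-array has to be copied out first
    val mllibMat = new DenseMatrix(2, 2, data.slice(2, 6))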

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-19 Thread Debasish Das
I think for Breeze we are focused on dot and dgemv right now (along with several other matrix-vector style operations)... For dgemm it is tricky since you need to add dgemm for both DenseMatrix and CSCMatrix... and for CSCMatrix you need to get something like SuiteSparse, which is under LGPL... so
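For reference, a minimal sketch of dense-dense multiplication in Breeze as it stands (the sparse CSCMatrix case discussed above is the part that remains tricky):

    import breeze.linalg.DenseMatrix

    val a = DenseMatrix((1.0, 2.0), (3.0, 4.0))
    val b = DenseMatrix((5.0, 6.0), (7.0, 8.0))

    // Dense * Dense goes through a gemm-style BLAS kernel (netlib-java when available)
    val c: DenseMatrix[Double] = a * b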

Re: Exception using the new createDirectStream util method

2015-03-19 Thread Cody Koeninger
What is the value of your topics variable, and does it correspond to topics that already exist on the cluster and have messages in them? On Thu, Mar 19, 2015 at 3:10 PM, Ted Yu wrote: > Looking at KafkaCluster#getLeaderOffsets(): > > respMap.get(tp).foreach { por: PartitionOffsetsRespo

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-19 Thread Ulanov, Alexander
Thank you! When do you expect to have gemm in Breeze, and when will that version of Breeze ship with MLlib? Also, could someone please elaborate on linalg.BLAS and Matrix? Are they going to be developed further, and should all developers use them in the long term? Best regards, Alexander 18.03.2015, at 23:

Re: Exception using the new createDirectStream util method

2015-03-19 Thread Ted Yu
Looking at KafkaCluster#getLeaderOffsets():

respMap.get(tp).foreach { por: PartitionOffsetsResponse =>
  if (por.error == ErrorMapping.NoError) {
    ...
  } else {
    errs.append(ErrorMapping.exceptionFor(por.error))
  }

There should be some error ot

Exception using the new createDirectStream util method

2015-03-19 Thread Alberto Rodriguez
Hi all, I am trying to make the new Kafka and Spark Streaming integration work (the direct approach, "no receivers"). I have created a unit test where I configure and start both ZooKeeper and Kafka. When I try to create the InputDS
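For context, a minimal sketch of the direct-approach setup being tested (assuming Spark 1.3's KafkaUtils and a broker at localhost:9092; topic name and batch interval are illustrative):

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    val conf = new SparkConf().setAppName("direct-stream-test").setMaster("local[2]")
    val ssc = new StreamingContext(conf, Seconds(1))

    val kafkaParams = Map("metadata.broker.list" -> "localhost:9092")
    val topics = Set("test-topic") // see the replies above: the topic should exist and hold messages

    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream.map(_._2).print()
    ssc.start()
    ssc.awaitTermination()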

Spark SQL ExternalSorter not stopped

2015-03-19 Thread Michael Allman
I've examined the experimental support for ExternalSorter in Spark SQL, and it does not appear that the external sorter is ever stopped (ExternalSorter.stop). According to the API documentation, this suggests a resource leak. Before I file a bug report in Jira, can someone familiar with the code

Spark scheduling, data locality

2015-03-19 Thread Zoltán Zvara
I'm trying to understand the task scheduling mechanism of Spark, and I'm curious about where locality preferences get evaluated. I'm trying to determine whether locality preferences are fetchable before the task gets serialized. A hint would be most appreciated! Have a nice day! Zvara Zoltán m

Re: SparkSQL 1.3.0 JDBC data source issues

2015-03-19 Thread Pei-Lun Lee
JIRA and PR for the first issue: https://issues.apache.org/jira/browse/SPARK-6408 https://github.com/apache/spark/pull/5087 On Thu, Mar 19, 2015 at 12:20 PM, Pei-Lun Lee wrote: > Hi, > > I am trying the jdbc data source in Spark SQL 1.3.0 and found some issues. > > First, the syntax "where str_col='valu
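For reference, a hedged sketch of the 1.3.0 JDBC data source usage being exercised (the connection URL and table name are placeholders):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    val sc = new SparkContext(new SparkConf().setAppName("jdbc-source-test").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)

    // Spark SQL 1.3 generic load() with the "jdbc" source
    val df = sqlContext.load("jdbc", Map(
      "url" -> "jdbc:postgresql://localhost/test", // placeholder URL
      "dbtable" -> "my_table"))                    // placeholder table

    // a string-literal predicate like the one in the report
    df.filter("str_col = 'value'").collect()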