Re: How to retrieve multiple columns values (in one row) to variables in Spark Scala method

2019-04-05 Thread ayan guha
Try like this:

    val primitiveDS = spark.sql("select 1.2 avg, 2.3 stddev").collect().apply(0)
    val arr = Array(primitiveDS.getDecimal(0), primitiveDS.getDecimal(1))

    primitiveDS: org.apache.spark.sql.Row = [1.2,2.3]
    arr: Array[java.math.BigDecimal] = Array(1.2, 2.3)

How to retrieve multiple columns values (in one row) to variables in Spark Scala method

2019-04-05 Thread Mich Talebzadeh
Hi, Pretty basic question. I use Spark on HBase to retrieve the average and standard deviation of the last 14 prices for a security (ticker) from an HBase table. However, the call is expensive in Spark Streaming, where these values are used to indicate buy and sell and subsequently the high value pric
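A minimal sketch of how both values could be pulled from a single collected Row, along the lines of the reply above (the "prices" view and column names are illustrative assumptions):

```scala
// Requires an active SparkSession, e.g. inside spark-shell.
// "prices" is a hypothetical temp view holding the last 14 prices for one ticker.
val row = spark.sql(
  "SELECT avg(price) AS avg_price, stddev(price) AS stddev_price FROM prices"
).collect()(0)

// Pull both columns out of the one Row into local variables.
val avgPrice    = row.getDouble(0)
val stddevPrice = row.getDouble(1)
```

If the columns come back as DecimalType rather than DoubleType (as with the decimal literals in the reply above), row.getDecimal(0) is the accessor to use instead.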

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Jason Nerothin
*I guess I was focusing on this:* #2 I want to do the above in an event-driven way, *without using batches* (I tried micro-batches, but I realised that's not what I want), i.e., *for each arriving event, as soon as an event message comes on my stream, not by accumulating the events*. If you want t

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Basavaraj
I have checked broadcast of accumulated values, but not Arbitrary Stateful Streaming. But I am not sure how that helps here. On Fri, 5 Apr 2019, 10:13 pm Jason Nerothin, wrote: > Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators? > > On Fri, Apr 5, 2019 at 10:55 AM Basava

Re: combineByKey

2019-04-05 Thread Jason Nerothin
Take a look at this Stack Overflow answer on how aggregateByKey works: https://stackoverflow.com/questions/24804619/how-does-spark-aggregate-function-aggregatebykey-work On Fri, Apr 5, 2019 at 12:25 PM Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi, > > Thank you for the details. It is a typo error while composing the mail. >
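For reference, the aggregateByKey pattern the linked answer walks through looks roughly like this (a sketch; the sample data and types are illustrative):

```scala
// Per-key (sum, count) in one pass, then a derived average.
// sc is an existing SparkContext.
val pairs = sc.parallelize(Seq(("a", 1.0), ("a", 3.0), ("b", 2.0)))

val sumCount = pairs.aggregateByKey((0.0, 0L))(
  // seqOp: fold one value into the per-partition accumulator
  (acc, v) => (acc._1 + v, acc._2 + 1),
  // combOp: merge accumulators coming from different partitions
  (x, y) => (x._1 + y._1, x._2 + y._2)
)

val avgByKey = sumCount.mapValues { case (sum, n) => sum / n }
```

The zero value, seqOp, and combOp correspond directly to combineByKey's createCombiner, mergeValue, and mergeCombiners, which is why the two APIs are largely interchangeable for this kind of aggregation.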

Re: combineByKey

2019-04-05 Thread Madabhattula Rajesh Kumar
Hi, Thank you for the details. It was a typo while composing the mail. Below is the actual flow. Any idea why combineByKey is not working? aggregateByKey is working. //Defining createCombiner, mergeValue and mergeCombiner functions def createCombiner = (Id: String, value: String) => (
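Without the full code it is hard to be certain, but a frequent cause of exactly this symptom is the createCombiner signature: combineByKey's first function receives only the value (V => C), never the key, so a two-argument function like the one above will not type-check against it. A sketch of the expected shapes (the sample pairs are illustrative):

```scala
// Pairs shaped like (key, (id, value)), as in the earlier mapping step.
val pairs = sc.parallelize(Seq(
  ("0-d1", ("d1", "t1")),
  ("0-d1", ("d1", "t2"))
))

val combined = pairs.combineByKey(
  // createCombiner: V => C — start a combiner from a key's first value
  (v: (String, String)) => List(v),
  // mergeValue: (C, V) => C — fold another value into an existing combiner
  (c: List[(String, String)], v: (String, String)) => v :: c,
  // mergeCombiners: (C, C) => C — merge combiners across partitions
  (c1: List[(String, String)], c2: List[(String, String)]) => c1 ::: c2
)
```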

Re: Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Jason Nerothin
Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators? On Fri, Apr 5, 2019 at 10:55 AM Basavaraj wrote: > Hi > > Have two questions > > #1 > I am trying to process events in realtime, outcome of the processing has > to find a node in the GraphX and update that node as well (
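For the per-event requirement, mapGroupsWithState invokes a user function per key as records arrive, carrying state between invocations; a minimal sketch (the Event schema, the anomaly rule, and the surrounding streaming query are all illustrative assumptions):

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.streaming.{GroupState, GroupStateTimeout}

case class Event(nodeId: String, value: Double)
case class NodeState(lastValue: Double, anomalous: Boolean)

// events would be a streaming Dataset[Event], read from e.g. Kafka.
def trackNodes(events: Dataset[Event])(implicit spark: SparkSession) = {
  import spark.implicits._
  events
    .groupByKey(_.nodeId)
    .mapGroupsWithState(GroupStateTimeout.NoTimeout) {
      (nodeId: String, evs: Iterator[Event], state: GroupState[NodeState]) =>
        val prev = state.getOption.getOrElse(NodeState(0.0, anomalous = false))
        // Fold the newly arrived events into the node's running state.
        val next = evs.foldLeft(prev) { (s, e) =>
          NodeState(e.value, anomalous = math.abs(e.value - s.lastValue) > 10.0)
        }
        state.update(next)
        (nodeId, next)
    }
}
```

Propagating a change to related nodes would still need a separate step (e.g. a join against an edge list, or GraphX's Pregel API on a periodically rebuilt graph), since per-key state is isolated by design.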

Re: combineByKey

2019-04-05 Thread Jason Nerothin
I broke some of your code down into the following lines:

    import spark.implicits._
    val a: RDD[Messages] = sc.parallelize(messages)
    val b: Dataset[Messages] = a.toDF.as[Messages]
    val c: Dataset[(String, (String, String))] =
      b.map { x => (x.timeStamp + "-" + x.Id, (x.Id, x.value)) }

You

Checking if cascading graph computation is possible in Spark

2019-04-05 Thread Basavaraj
Hi Have two questions #1 I am trying to process events in real time; the outcome of the processing has to find a node in the GraphX graph and update that node as well (in case of any anomaly or state change). If a node is updated, I have to update the related nodes as well; want to know if GraphX can help in t

Is there any spark API function to handle a group of companies at once in this scenario?

2019-04-05 Thread Shyam P
Hi, In my scenario I have a few companies for which I need to calculate some stats (like avg) that are stored in Cassandra. For the next set of records I need to fetch the previously calculated stats and, over them, calculate accumulated results (i.e., present set of data + previously stored stats) and stor
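One common shape for this read-merge-write cycle (a sketch, not a single built-in Spark API; the keyspace/table names and the spark-cassandra-connector options are assumptions):

```scala
import org.apache.spark.sql.functions._

// Aggregate the current set of records per company.
val batch = currentDF.groupBy("company")
  .agg(sum("value").as("sum"), count("*").as("cnt"))

// Read back the previously accumulated stats from Cassandra
// (assumes the spark-cassandra-connector is on the classpath).
val stored = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "stats_ks", "table" -> "company_stats"))
  .load()

// Merge: keeping accumulated (sum, count) lets the running avg be recomputed.
val merged = batch.as("b")
  .join(stored.as("s"), Seq("company"), "full_outer")
  .select(
    col("company"),
    (coalesce(col("b.sum"), lit(0.0)) + coalesce(col("s.sum"), lit(0.0))).as("sum"),
    (coalesce(col("b.cnt"), lit(0L)) + coalesce(col("s.cnt"), lit(0L))).as("cnt"))
  .withColumn("avg", col("sum") / col("cnt"))

merged.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "stats_ks", "table" -> "company_stats"))
  .mode("append")
  .save()
```

The key design choice is to store monotonic accumulators (sum, count) rather than the derived avg, since averages cannot be combined across batches.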

combineByKey

2019-04-05 Thread Madabhattula Rajesh Kumar
Hi, Any issue in the below code?

    case class Messages(timeStamp: Int, Id: String, value: String)
    val messages = Array(
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t1"),
      Messages(0, "d1", "t2"),
      Messages(0, "d1",