Try like this:
val primitiveDS = spark.sql("select 1.2 avg, 2.3 stddev").collect().apply(0)
val arr = Array(primitiveDS.getDecimal(0), primitiveDS.getDecimal(1))

primitiveDS: org.apache.spark.sql.Row = [1.2,2.3]
arr: Array[java.math.BigDecimal] = Array(1.2, 2.3)
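If the downstream math does not need BigDecimal precision, the same values can also be pulled out by column name and converted to Double; a small variation on the snippet above (the column names avg/stddev come from the aliases in that query):

val row = spark.sql("select 1.2 avg, 2.3 stddev").collect().apply(0)
val avg = row.getAs[java.math.BigDecimal]("avg").doubleValue()
val stddev = row.getAs[java.math.BigDecimal]("stddev").doubleValue()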
Hi,
Pretty basic question.
I use Spark on HBase to retrieve the average and standard deviation of the
last 14 prices for a security (ticker) from an HBase table.
However, the call is expensive in Spark Streaming, where these values are
used to indicate buy and sell signals and subsequently the high value pric
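If the recent prices are already loaded into a DataFrame, one way to avoid a separate lookup per event is to compute the rolling average and standard deviation with a window function. A minimal sketch, assuming a hypothetical DataFrame prices with columns ticker, ts and price (the HBase read itself is left out):

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{avg, stddev}

// 14-row window per ticker: the current row plus the 13 preceding ones.
val last14 = Window.partitionBy("ticker").orderBy("ts").rowsBetween(-13, Window.currentRow)

val withStats = prices
  .withColumn("avg14", avg("price").over(last14))
  .withColumn("stddev14", stddev("price").over(last14))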
*I guess I was focusing on this:*
#2
I want to do the above in an event-driven way, *without using batches*
(I tried micro-batches, but I realised that's not what I want), i.e. *for
each arriving event, as soon as an event message reaches my stream, not by
accumulating events*
If you want t
I have checked broadcasting of accumulated values, but not arbitrary
stateful streaming.
But I am not sure how that helps here
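For what it's worth, the way arbitrary stateful streaming would help is that the running values live in managed state per key, so nothing has to be re-read from HBase on every event. A minimal sketch with mapGroupsWithState, assuming a hypothetical streaming Dataset events with ticker and price columns (the case classes and column names are assumptions, not your schema):

import org.apache.spark.sql.streaming.GroupState

case class PriceEvent(ticker: String, price: Double)
case class TickerState(count: Long, sum: Double)
case class TickerStats(ticker: String, count: Long, avg: Double)

import spark.implicits._

val stats = events.as[PriceEvent]
  .groupByKey(_.ticker)
  .mapGroupsWithState[TickerState, TickerStats] {
    (ticker: String, batch: Iterator[PriceEvent], state: GroupState[TickerState]) =>
      // Merge the new events into whatever state has been accumulated so far.
      val old = state.getOption.getOrElse(TickerState(0L, 0.0))
      val prices = batch.map(_.price).toSeq
      val updated = TickerState(old.count + prices.size, old.sum + prices.sum)
      state.update(updated)
      TickerStats(ticker, updated.count, updated.sum / updated.count)
  }

It is still evaluated per trigger rather than strictly per event, but the state survives across triggers, so the accumulation does not have to happen in your own code.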
On Fri, 5 Apr 2019, 10:13 pm Jason Nerothin wrote:
> Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators?
>
> On Fri, Apr 5, 2019 at 10:55 AM Basava
Take a look at this SOF:
https://stackoverflow.com/questions/24804619/how-does-spark-aggregate-function-aggregatebykey-work
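In case it helps to see the shape directly, a minimal aggregateByKey sketch along the lines of that SOF answer (the sample pair RDD is made up):

// zero value, then a seqOp that folds a value into a partition-local
// accumulator, then a combOp that merges accumulators across partitions
val pairs = sc.parallelize(Seq(("d1", "t1"), ("d1", "t2"), ("d2", "t1")))

val grouped = pairs.aggregateByKey(Set.empty[String])(
  (acc, v) => acc + v,   // seqOp: within a partition
  (a, b) => a ++ b       // combOp: across partitions
)
// grouped: RDD[(String, Set[String])], e.g. ("d1", Set("t1", "t2"))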
On Fri, Apr 5, 2019 at 12:25 PM Madabhattula Rajesh Kumar <
mrajaf...@gmail.com> wrote:
> Hi,
>
> Thank you for the details. It was a typo while composing the mail.
>
Hi,
Thank you for the details. It was a typo while composing the mail.
Below is the actual flow.
Any idea why combineByKey is not working? aggregateByKey is working. (A
sketch of the shape combineByKey expects follows the snippet below.)
// Defining the createCombiner, mergeValue and mergeCombiners functions
def createCombiner = (Id: String, value: String) => (
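For comparison, the shape combineByKey expects: unlike aggregateByKey there is no zero value, and createCombiner receives only the value, never the key, which is a common reason a combineByKey modelled on aggregateByKey does not compile. A minimal self-contained sketch (the sample data is made up, not your actual flow):

import org.apache.spark.rdd.RDD

val pairs: RDD[(String, (String, String))] =
  sc.parallelize(Seq("0-d1" -> ("d1", "t1"), "0-d1" -> ("d1", "t2")))

val createCombiner = (v: (String, String)) => List(v)                                  // V => C
val mergeValue     = (acc: List[(String, String)], v: (String, String)) => v :: acc    // (C, V) => C
val mergeCombiners = (a: List[(String, String)], b: List[(String, String)]) => a ::: b // (C, C) => C

val combined = pairs.combineByKey(createCombiner, mergeValue, mergeCombiners)
// combined: RDD[(String, List[(String, String)])]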
Have you looked at Arbitrary Stateful Streaming and Broadcast Accumulators?
On Fri, Apr 5, 2019 at 10:55 AM Basavaraj wrote:
> Hi
>
> Have two questions
>
> #1
> I am trying to process events in realtime, outcome of the processing has
> to find a node in the GraphX and update that node as well (
I broke some of your code down into the following lines:
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Dataset
import spark.implicits._

val a: RDD[Messages] = sc.parallelize(messages)
val b: Dataset[Messages] = a.toDF.as[Messages]
val c: Dataset[(String, (String, String))] =
  b.map { x => (x.timeStamp + "-" + x.Id, (x.Id, x.value)) }
You
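One possible continuation of that breakdown, assuming the intent is to collect the values per composite key (an assumption on my part): on a Dataset, groupByKey plus mapGroups plays the role that combineByKey/aggregateByKey play on a pair RDD.

val d: Dataset[(String, Seq[(String, String)])] =
  c.groupByKey(_._1)
   .mapGroups { (key, rows) => (key, rows.map(_._2).toSeq) }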
Hi
Have two questions
#1
I am trying to process events in real time; the outcome of the processing has to
find a node in the GraphX graph and update that node as well (in case of any anomaly
or state change). If a node is updated, I have to update the related nodes as
well, and I want to know if GraphX can help
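GraphX can express the "update a node and its related nodes" part, with the caveat that GraphX graphs are immutable RDDs, so each update produces a new graph rather than mutating one in place. A minimal sketch, where the Boolean vertex attribute, the anomalous vertex id and the one-hop propagation are all assumptions for illustration:

import org.apache.spark.graphx.{Graph, VertexId}

def flagAnomaly(graph: Graph[Boolean, Int], anomalous: VertexId): Graph[Boolean, Int] = {
  // 1. Mark the node the event processing has identified.
  val marked = graph.mapVertices((id, attr) => attr || id == anomalous)

  // 2. Send the flag one hop along outgoing edges to the related nodes.
  val neighbourFlags = marked.aggregateMessages[Boolean](
    ctx => if (ctx.srcAttr) ctx.sendToDst(true),
    (a, b) => a || b)

  // 3. Fold the messages back in as updated vertex attributes.
  marked.joinVertices(neighbourFlags)((_, attr, msg) => attr || msg)
}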
Hi,
In my scenario I have a few companies, for which I need to calculate a few
stats like avg that need to be stored in Cassandra. For the next set of records I
need to get the previously calculated stats, and over them I need to calculate
accumulated results (i.e. the present set of data + the previously stored stats) and
store
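One way to keep the stats mergeable is to persist count and sum (rather than the average itself) and fold each new set of records into the previously stored totals. A rough sketch with the Spark Cassandra Connector's DataFrame API; the keyspace stats, the table company_stats(company, count, sum, avg) and the incoming DataFrame newRecords(company, value) are all assumptions:

import org.apache.spark.sql.functions._

val previous = spark.read
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "stats", "table" -> "company_stats"))
  .load()

val current = newRecords
  .groupBy("company")
  .agg(count("value").as("new_count"), sum("value").as("new_sum"))

// Merge the previous totals with the current batch and recompute the average.
val merged = current.join(previous, Seq("company"), "left_outer")
  .select(
    col("company"),
    (coalesce(col("count"), lit(0L)) + col("new_count")).as("count"),
    (coalesce(col("sum"), lit(0.0)) + col("new_sum")).as("sum"))
  .withColumn("avg", col("sum") / col("count"))

merged.write
  .format("org.apache.spark.sql.cassandra")
  .options(Map("keyspace" -> "stats", "table" -> "company_stats"))
  .mode("append")
  .save()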
Hi,
Is there any issue in the code below?
case class Messages(timeStamp: Int, Id: String, value: String)
val messages = Array(
Messages(0, "d1", "t1"),
Messages(0, "d1", "t1"),
Messages(0, "d1", "t1"),
Messages(0, "d1", "t1"),
Messages(0, "d1", "t2"),
Messages(0, "d1",