I have registered a UDF with sqlContext. When I try to read another
Parquet file using sqlContext inside the same UDF, it throws a
NullPointerException.
Any help on how to access sqlContext inside a UDF?
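For reference, a minimal sketch of the pattern being described (not the original poster's code; the lookup path, column names and local master are made up). sqlContext is a driver-side object, so a UDF that captures it and then runs on executors is the usual way this kind of NullPointerException shows up:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.udf

object UdfSqlContextSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("udf-sqlcontext-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    val df = Seq(("a", 1), ("b", 2)).toDF("key", "n")

    // The UDF body captures sqlContext and tries to read a second Parquet
    // file per row. sqlContext lives only on the driver, so when the closure
    // is deserialized on an executor that reference is not usable.
    val lookup = udf { (key: String) =>
      sqlContext.read.parquet("/tmp/lookup.parquet") // hypothetical path
        .filter($"id" === key)                       // hypothetical column
        .count()
    }

    // The action triggers the UDF on the executors, which is typically where
    // the NullPointerException surfaces.
    df.withColumn("matches", lookup($"key")).show()
  }
}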
Regards,
Sk
As you may be aware, the granularity that Spark Streaming offers is
micro-batching, and in practice the batch interval is limited to about 0.5
seconds. So if you have continuous ingestion of data, Spark Streaming may
not be granular enough for CEP. You may want to consider other products.
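To make the micro-batch point concrete, here is a minimal sketch (assuming a local master and a socket source, both just for illustration) where the batch interval is the granularity in question; events can only be reacted to once the micro-batch containing them is processed:

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Milliseconds, StreamingContext}

object MicroBatchGranularitySketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("micro-batch-sketch").setMaster("local[2]")

    // The batch interval is the unit of granularity: here 500 ms, roughly the
    // practical lower bound mentioned above. Events arriving inside one
    // interval are only seen when that micro-batch is processed.
    val ssc = new StreamingContext(conf, Milliseconds(500))

    // Hypothetical source: lines from a local socket (e.g. `nc -lk 9999`).
    val lines = ssc.socketTextStream("localhost", 9999)
    lines.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}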
Worth looking at this old thread of mine "Spark su
Hello all,
Has anyone used Spark Streaming for CEP (Complex Event Processing)? Are
there any CEP libraries that work well with Spark? I have a use case for
CEP and am trying to see if Spark Streaming is a good fit.
Currently we have a data pipeline using Kafka, Spark Streaming and
Cassandra for data ingestion
I performed a series of TeraGen jobs via spark-submit (each job generated
an equal-sized dataset into a different S3 bucket).
I noticed that some jobs were fast and some were slow.
Slow jobs always had many log prints like
DEBUG TaskSchedulerImpl: parentName: , name: TaskSet_1.0, runningTasks: 1
( o
Hi Debu,
First, instead of using ‘+’, you can use ‘concat’ to concatenate string
columns. And you should wrap “0” in “lit()” to make it a column.
Second, 1440 became null because you didn’t tell Spark what to do when the
‘when’ clause doesn’t match, so it simply set the value to null. To fix this,
you can add an ‘otherwise’ clause with the default value you want for the
non-matching rows.
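A minimal sketch of both points, using made-up column names (‘code’ and ‘mins’) since the original DataFrame isn’t shown here:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.apache.spark.sql.functions.{concat, lit, when}

object ConcatWhenSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("concat-when-sketch").setMaster("local[*]"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Hypothetical data, just to illustrate the two points above.
    val df = Seq(("7", 1440), ("45", 30)).toDF("code", "mins")

    // 1) concat() + lit() instead of '+' for string columns.
    val padded = df.withColumn("padded", concat(lit("0"), $"code"))

    // 2) when() without otherwise() yields null for rows that miss the
    //    condition; otherwise() supplies the fallback value instead.
    val flagged = padded.withColumn(
      "bucket",
      when($"mins" < 60, lit("short")).otherwise(lit("long")))

    flagged.show()
  }
}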