monotonically_increasing_id() will give you similar functionality (the ids are unique and increasing, but not consecutive like MySQL's).
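A minimal sketch, assuming the DataFrame is called df:

import org.apache.spark.sql.functions.monotonically_increasing_id

// Ids are unique and increasing per row, but not consecutive like MySQL's AUTO_INCREMENT.
val dfWithId = df.withColumn("id", monotonically_increasing_id())

If you need strictly consecutive numbers, row_number() over a window is an option, at the cost of pulling everything into one partition when there is no partition key.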
On Mon, 7 Feb, 2022, 6:57 am , wrote:
> For a DataFrame object, how do I add an auto-increment column, like
> MySQL's behavior?
>
> Thank you.
>
Hi All,
I want to write a Spark Streaming job from Kafka to Elasticsearch. Here I
want to detect the schema dynamically while reading it from Kafka.
Can you help me do that?
I know this can be done in Spark batch processing via the line below.
val schema =
spark.read.json(dfKafkaPayload.se
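For what it's worth, a rough sketch of that approach for streaming (the broker, topic, and dfKafkaPayload names are placeholders): infer the schema once from a batch sample, then apply it to the stream with from_json:

import org.apache.spark.sql.functions.{col, from_json}
import spark.implicits._

// Read a batch sample of the topic and let Spark infer the JSON schema once.
val sample = spark.read
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "topic")
  .load()
  .selectExpr("CAST(value AS STRING)")
  .as[String]

val schema = spark.read.json(sample).schema

// Apply the inferred schema to the streaming DataFrame.
val parsed = dfKafkaPayload
  .selectExpr("CAST(value AS STRING) AS json")
  .select(from_json(col("json"), schema).as("data"))
  .select("data.*")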
Hi Jainshasha,
I need to read each row from the DataFrame and make some changes to it before
inserting it into ES.
Thanks
Siva
On Mon, Oct 5, 2020 at 8:06 PM jainshasha wrote:
> Hi Siva
>
> To emit data into ES from a Spark Structured Streaming job you need to use
> the Elasticsearch jar which has sup
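In case it helps, a rough sketch of that setup with the elasticsearch-spark (elasticsearch-hadoop) connector; the node address, index name, and the per-row transformation are placeholders, not your code:

import org.apache.spark.sql.functions.{col, upper}

// Apply the per-row changes before writing (a hypothetical transformation for illustration).
val transformed = inputStream.withColumn("name", upper(col("name")))

// The "es" sink comes from the elasticsearch-spark jar and supports Structured Streaming.
val query = transformed.writeStream
  .format("es")
  .option("checkpointLocation", "/tmp/es-checkpoint")
  .option("es.nodes", "localhost")
  .option("es.port", "9200")
  .start("my-index/_doc")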
Hi Team,
I have a Spark streaming job which reads from Kafka and writes into
Elasticsearch via HTTP requests.
I want to validate each record from Kafka, change the payload as per the
business need, and write it into Elasticsearch.
I have used ES HTTP requests to push the data into Elasticsearch. Can s
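A rough sketch of one way to do the per-record validation and HTTP push with a ForeachWriter (the endpoint, index name, and the isValid/toJson helpers are placeholders, not your code):

import org.apache.spark.sql.{ForeachWriter, Row}
import java.net.{HttpURLConnection, URL}

val httpEsSink = new ForeachWriter[Row] {
  override def open(partitionId: Long, epochId: Long): Boolean = true

  override def process(row: Row): Unit = {
    if (isValid(row)) {                       // hypothetical business validation
      val payload = toJson(row)               // hypothetical payload transformation
      val conn = new URL("http://es-host:9200/my-index/_doc")
        .openConnection().asInstanceOf[HttpURLConnection]
      conn.setRequestMethod("POST")
      conn.setRequestProperty("Content-Type", "application/json")
      conn.setDoOutput(true)
      conn.getOutputStream.write(payload.getBytes("UTF-8"))
      conn.getResponseCode                    // force the request to complete
      conn.disconnect()
    }
  }

  override def close(errorOrNull: Throwable): Unit = ()
}

val query = kafkaStream.writeStream.foreach(httpEsSink).start()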
Hi all,
I am using Spark Structured Streaming (version 2.3.2). I need to read from a
Kafka cluster and write into Kerberized Kafka.
Here I want to use Kafka for offset checkpointing after the record is
written into Kerberized Kafka.
Questions:
1. Can we use Kafka for checkpointing to manage offset
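As far as I know, Structured Streaming does not use Kafka itself for checkpointing; it tracks the source offsets in its own checkpoint directory (an HDFS-compatible path). A minimal sketch with placeholder names:

// Offsets read from the source topic are stored in the checkpoint directory,
// so the query can resume from where it left off after a restart.
val query = outputDf.writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "secure-broker:9093")
  .option("kafka.security.protocol", "SASL_SSL")   // Kerberized cluster settings go here
  .option("topic", "output-topic")
  .option("checkpointLocation", "/checkpoints/kafka-to-kafka")
  .start()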
Yes, I am also facing the same issue. Did you figure it out?
On Tue, 9 Jul 2019, 7:25 pm Kamalanathan Venkatesan, <
kamalanatha...@in.ey.com> wrote:
> Hello,
>
>
>
> I have the below Spark Structured Streaming code and I was expecting the
> results to be printed on the console every 10 seconds. But, I
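For reference, a minimal sketch of a console sink that fires every 10 seconds (names assumed); without an explicit trigger the query runs micro-batches as fast as possible rather than on a fixed interval:

import org.apache.spark.sql.streaming.Trigger

val query = resultDf.writeStream
  .format("console")
  .outputMode("append")
  .trigger(Trigger.ProcessingTime("10 seconds"))
  .start()

query.awaitTermination()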
Hi Team,
I need help with the windowing & watermark concepts. This code is not working as
expected.
package com.jiomoney.streaming
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.ProcessingTime
object SlingStreaming {
def main(arg
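A bare-bones sketch of a windowed aggregation with a watermark (events, eventTime, and key are placeholder names), in case it helps narrow down the problem:

import org.apache.spark.sql.functions.{col, window}

val counts = events
  .withWatermark("eventTime", "10 minutes")     // tolerate up to 10 minutes of late data
  .groupBy(window(col("eventTime"), "5 minutes"), col("key"))
  .count()

// In append mode a window is only emitted after the watermark moves past its end,
// which is a common reason "nothing shows up" during testing.
val query = counts.writeStream
  .outputMode("append")
  .format("console")
  .start()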
lect statement. If I'm not mistaken, it is known
> as a bit costly since each call would produce a new Dataset. Defining
> schema and using "from_json" will eliminate all the call of withColumn"s"
> and extra calls of "get_json_object".
>
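A minimal sketch of that from_json approach (the schema fields here are just examples):

import org.apache.spark.sql.functions.{col, from_json}
import org.apache.spark.sql.types._

val schema = new StructType()
  .add("id", LongType)
  .add("name", StringType)
  .add("amount", DoubleType)

// One parse per row instead of repeated get_json_object / withColumn calls.
val parsed = df
  .select(from_json(col("value").cast("string"), schema).as("data"))
  .select("data.*")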
Hello All,
I am using Spark version 2.3 and I am trying to write a Spark Streaming join.
It is a basic join, but it is taking a long time to join the stream data. I am
not sure whether we need to set any configuration on Spark.
Code:
*
import org.apache.spark.sql.SparkSession
import or
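For what it's worth, a rough sketch (with made-up stream and column names) of the pattern that usually keeps stream-stream join state, and therefore join time, under control in 2.3: watermarks on both sides plus a time-range condition:

import org.apache.spark.sql.functions.expr

val orders   = ordersStream.withWatermark("orderTime", "10 minutes")
val payments = paymentsStream.withWatermark("paymentTime", "10 minutes")

// Without the time bound, Spark has to keep all past rows in state, which grows without limit.
val joined = orders.join(
  payments,
  expr("""
    orderId = paymentOrderId AND
    paymentTime BETWEEN orderTime AND orderTime + interval 1 hour
  """))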
You can try this; it should work:

import org.apache.spark.sql.SaveMode

merchantdf.write
  .format("org.apache.spark.sql.cassandra")
  .mode(SaveMode.Overwrite)
  .option("confirm.truncate", true)
  .options(Map("table" -> "tablename", "keyspace" -> "keyspace"))
  .save()
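Note that confirm.truncate is needed here because SaveMode.Overwrite makes the connector truncate the target table before writing; as far as I know the write fails without it.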
On Wed 27 Jun, 2018,