In Structured Streaming, the QueryProgressEvent does not seem to include the final count of records emitted to the destination; I see only the number of input rows. I was trying to use count (as an additional action after persisting the dataset), but I hit the exception below when calling persist or count:
ionMs" : {
"addBatch" : 2263426,
"getBatch" : 12,
"getOffset" : 273,
"queryPlanning" : 13,
"triggerExecution" : 2264288,
"walCommit" : 552
},
regards
aravias
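One way to get per-batch row counts without calling an action on the streaming Dataset (count on a streaming Dataset raises an AnalysisException, since queries with streaming sources must be executed with writeStream.start()) is to attach a StreamingQueryListener and read the progress metrics. A minimal sketch; note that in Spark 2.1/2.2 the progress only exposes input-side counts such as numInputRows, not an emitted-record count:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener._

val spark = SparkSession.builder.appName("progress-listener").getOrCreate()

// Log the row metrics reported with each micro-batch's progress.
// The progress event only exposes input-side counts (numInputRows);
// there is no per-sink emitted-record count in these versions.
spark.streams.addListener(new StreamingQueryListener {
  override def onQueryStarted(event: QueryStartedEvent): Unit = ()
  override def onQueryTerminated(event: QueryTerminatedEvent): Unit = ()
  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    val p = event.progress
    println(s"batch=${p.batchId} numInputRows=${p.numInputRows}")
  }
})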
Hi,
we have a Structured Streaming app consuming data from Kafka and writing to S3.
I keep getting the timeout exception below whenever the job runs with more than one core per executor. If anyone can share any information related to this, it would be great.
17/08/08 21
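If the timeout comes from the Kafka consumer poll, the spark-sql-kafka-0-10 source accepts a kafkaConsumer.pollTimeoutMs option that can be raised; a minimal sketch with placeholder broker and topic names is below. A commonly cited workaround for contention on the cached Kafka consumers is to run with --executor-cores 1.

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("kafka-to-s3").getOrCreate()

// Raise the Kafka consumer poll timeout on the source; the broker and
// topic names here are placeholders.
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "input-topic")
  .option("kafkaConsumer.pollTimeoutMs", "120000") // raised from the default
  .load()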
Hi,
Is there a way to get the *consumerGroupId* assigned to a Structured
Streaming application when it's consuming from Kafka?
regards
Aravind
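For background: the Kafka source generates a unique consumer group per query, named spark-kafka-source-<uuid>, so that concurrent queries do not interfere; the generated id shows up in the source's logs rather than through a public API. Later releases (Spark 3.0+) add options to influence it. A minimal sketch with placeholder broker and topic names:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("group-id-example").getOrCreate()

// By default the source creates a unique consumer group per query,
// named spark-kafka-source-<uuid>, and logs it on startup.
// Spark 3.0+ adds options to influence the id (placeholders below):
val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "events")
  .option("groupIdPrefix", "my-app")     // Spark 3.0+: prefix for the generated id
  // .option("kafka.group.id", "my-app") // Spark 3.0+: fixed id (use with care)
  .load()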
Hi,
we are using Structured Streaming for stream processing, and to do some data enrichment for each message I have to look up data from Cassandra; that data in Cassandra gets updated periodically (roughly once a day).
I want to look at the option of loading it as a Dataset and then registering it
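One way to do that (a minimal sketch, assuming the DataStax spark-cassandra-connector and placeholder keyspace/table names) is to load the lookup table as a cached Dataset, register it as a temp view, and refresh it on the daily schedule by unpersisting and reloading:

import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder.appName("enrichment").getOrCreate()

// Load the lookup data from Cassandra via the spark-cassandra-connector;
// the keyspace and table names are placeholders.
def loadLookup(): DataFrame =
  spark.read
    .format("org.apache.spark.sql.cassandra")
    .options(Map("keyspace" -> "ref_data", "table" -> "lookup"))
    .load()
    .cache()

var lookup = loadLookup()
lookup.createOrReplaceTempView("lookup")

// On the daily refresh (e.g. from a scheduler thread), drop the cached
// copy and reload so subsequent micro-batches see the updated rows.
def refreshLookup(): Unit = {
  lookup.unpersist()
  lookup = loadLookup()
  lookup.createOrReplaceTempView("lookup")
}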
The bug is related to long checkpoint entries being truncated when dealing with topics that have a large number of partitions, 120 in my case.
This is a bug in Spark 2.1.0; it appears to be fixed in Spark 2.1.1, based on a run with that version.
Hi,
I have one read stream to consume data from a Kafka topic, and based on an attribute value in each of the incoming messages, I have to write the data to one of two different locations in S3 (if value1, write to location1; otherwise to location2).
At a high level, below is what I have for doing
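A common approach (a minimal sketch; the broker, topic, routing column, attribute value, and S3 paths below are all placeholders) is to filter the same input stream twice and start one sink per destination, each with its own checkpoint location:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.get_json_object

val spark = SparkSession.builder.appName("split-sink").getOrCreate()
import spark.implicits._

// Placeholder broker/topic names; messages are assumed to be JSON with
// a routing attribute at $.attr.
val input = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092")
  .option("subscribe", "events")
  .load()
  .selectExpr("CAST(value AS STRING) AS json")
  .withColumn("attr", get_json_object($"json", "$.attr"))

// One query per destination; each needs its own checkpoint location.
val toLocation1 = input.filter($"attr" === "value1").writeStream
  .format("parquet")
  .option("path", "s3a://bucket/location1")
  .option("checkpointLocation", "s3a://bucket/checkpoints/loc1")
  .start()

val toLocation2 = input.filter($"attr" =!= "value1").writeStream
  .format("parquet")
  .option("path", "s3a://bucket/location2")
  .option("checkpointLocation", "s3a://bucket/checkpoints/loc2")
  .start()

spark.streams.awaitAnyTermination()

Note that each query consumes from Kafka independently here; on Spark 2.4+, foreachBatch lets you read once and write each micro-batch to both paths from a single query.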