xiong(Ryan) Zhu"
Cc: Rick Moritz , user
Subject: Re: [StructuredStreaming] multiple queries of the socket source: only
one query works.
Hi Shixiong,
Thanks for the explanation.
In my view, this is different from the intuitive understanding of the
Structured Streaming model [1], where incoming data is appended to an
'unbounded table' and queries are run on that. I had expected that all
queries would run on that 'unbounded table vi
Spark creates one connection for each query. The behavior you observed is
due to how "nc -lk" works. If you use `netstat` to check the TCP
connections, you will see there are two connections when starting two
queries. However, "nc" forwards the input to only one connection.
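
To make the point above concrete, here is a minimal sketch in plain Python (no Spark involved) of two clients connecting to a single listener that writes its input to only one of them. The function name `simulate_single_forwarding` and the rule "serve only the first accepted connection" are assumptions made for illustration; this is a stand-in for the behavior described above, not the actual implementation of `nc`.

```python
import socket


def simulate_single_forwarding(lines):
    """Two clients connect to one listener; the listener writes to only one.

    Returns one list of received lines per client.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(2)
    port = server.getsockname()[1]

    # Stand-ins for the two streaming queries: each opens its own connection.
    clients = [socket.create_connection(("127.0.0.1", port)) for _ in range(2)]
    accepted = [server.accept()[0] for _ in range(2)]

    # Assumption for this sketch: only the first accepted connection is served.
    payload = "".join(line + "\n" for line in lines).encode()
    accepted[0].sendall(payload)

    # Closing both server-side sockets lets each client read until EOF.
    for conn in accepted:
        conn.close()

    received = []
    for client in clients:
        data = b""
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            data += chunk
        received.append(data.decode().splitlines())
        client.close()

    server.close()
    return received


if __name__ == "__main__":
    print(simulate_single_forwarding(["hello", "world"]))
```

Both connections are established, which matches what `netstat` would show, but only one of them ever sees the data.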
On Fri, Aug 11, 2017 a
Hi Gerard, hi List,
I think what this would entail is for Source.commit to change its
functionality. You would need to track all streams' offsets there.
Especially in the socket source, you already have a cache (haven't looked
at Kafka's implementation too closely yet), so that shouldn't be the is
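
As a rough illustration of the idea of tracking every stream's offset on commit, here is a small sketch of a cache shared by several readers that evicts a record only once all registered readers have committed past it. The class `SharedBuffer` and its methods are hypothetical names invented for this example; this is not Spark's `Source` interface and not how the socket source is implemented.

```python
class SharedBuffer:
    """Hypothetical cache shared by several readers (one per query)."""

    def __init__(self):
        self._items = []      # buffered records that have not been evicted
        self._base = 0        # absolute offset of self._items[0]
        self._committed = {}  # reader id -> first offset the reader still needs

    def register(self, reader_id):
        # A new reader starts at the oldest offset still in the buffer.
        self._committed[reader_id] = self._base

    def append(self, item):
        self._items.append(item)

    def read(self, reader_id):
        """Return every buffered record the reader has not committed yet."""
        start = self._committed[reader_id]
        return self._items[start - self._base:]

    def commit(self, reader_id, end):
        """Mark everything before offset `end` as processed by `reader_id`."""
        latest = self._base + len(self._items)
        end = min(end, latest)
        self._committed[reader_id] = max(self._committed[reader_id], end)

        # Evict only the prefix that every registered reader has committed.
        low = min(self._committed.values())
        del self._items[:low - self._base]
        self._base = low

    def buffered(self):
        return len(self._items)


if __name__ == "__main__":
    buf = SharedBuffer()
    buf.register("q1")
    buf.register("q2")
    for word in ["a", "b", "c"]:
        buf.append(word)
    buf.commit("q1", 3)   # q2 has not committed, so nothing is evicted yet
    print(buf.buffered(), buf.read("q2"))
```

The design choice here is that a commit from one reader only advances that reader's own offset; the cache shrinks to the minimum committed offset across all readers, so a slower query never loses data that a faster one has already finished with.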
Hi,
I've been investigating this SO question:
https://stackoverflow.com/questions/45618489/executing-separate-streaming-queries-in-spark-structured-streaming
TL;DR: when using the Socket source, trying to create multiple queries does
not work properly: only the first query in the start order