Hi Anastasiia,

Thanks for the email. I think you can tweak the Spark config
spark.connect.session.manager.defaultSessionTimeout, which is defined here:
https://github.com/apache/spark/blob/343471dac4b96b43a09763d759b6c30760fb626e/sql/connect/server/src/main/scala/org/apache/spark/sql/connect/config/Connect.scala#L86-L93

Once the session timeout is increased, Spark won't expire the
session, and the query won't be killed with it.
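For reference, here is a minimal sketch of how that config could be passed when
starting the Spark Connect server on a standalone cluster. The 12h value and the
package coordinates are illustrative; please check that the config is available
in your Spark version:

```shell
# Start the Spark Connect server with a longer session timeout.
# 12h is an example value: pick something longer than your trigger interval.
./sbin/start-connect-server.sh \
  --packages org.apache.spark:spark-connect_2.12:3.5.1 \
  --conf spark.connect.session.manager.defaultSessionTimeout=12h
```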

Thanks,
Wei Liu

Anastasiia Sokhova <anastasiia.sokh...@honic.eu.invalid> 于2024年9月23日周一
09:41写道:

> Dear Spark Team,
>
>
>
> I am working with a standalone cluster, and I am using Spark Connect to
> submit my applications.
>
> My current version is 3.5.1.
>
>
>
> I am trying to run Structured Streaming Queries with relatively long
> trigger intervals (2 hours, 1 day).
>
> The first issue I encountered was “Streaming query has been idle and
> waiting for new data more than 10000ms”. I solved it by increasing the
> value in the internal config property
>  ‘spark.sql.streaming.noDataProgressEventInterval’.
>
> Now my query is not considered idle anymore but Connect expires the
> session after ~1 hour, and the query is killed with it.
>
>
>
> I believe, I have studied everything I could find online, but I could not
> find the answers.
>
> I would really appreciate it if you provided some 😊
>
>
>
> Is it not intended for Spark Connect to support “detached” Streaming
> Queries?
>
> Would you consider detaching StreamingQueries from the sessions that start
> them, as they are meant to run continuously?
>
> Would you consider extending control options in Spark Connect UI (start,
> stop, reset checkpoints)?
>
> It would help users like me, who want to use Spark’s Structured
> Streaming and Connect without running additional applications just to keep
> the session alive.
>
>
>
> I will be happy to answer any question from your side or provide more
> details.
>
>
>
> Best regards,
>
> Anastasiia
>