Hi, I glanced over the design doc.
You are providing certain configuration parameters plus some settings based on static values, for example:

    spark.dynamicAllocation.schedulerBacklogTimeout: 54s

I cannot see any use of <processing time>, which ought to be at least half of the batch interval to have the correct margins (confidence level). If you are going to have additional indicators, why not look at scheduling delay as well? Moreover, most of the needed statistics are also available to set accurate values. My inclination is that this is a great effort, but we ought to utilise the historical statistics collected under the checkpointing directory to get more accurate statistics.

I will review the design document in due course.

HTH

Mich Talebzadeh,
Solutions Architect/Engineering Lead
London
United Kingdom

view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Tue, 8 Aug 2023 at 01:30, Pavan Kotikalapudi <pkotikalap...@twilio.com.invalid> wrote:

> Hi Spark Dev,
>
> I have extended traditional DRA to work for the structured streaming
> use-case.
>
> Here is an initial implementation draft PR
> https://github.com/apache/spark/pull/42352 and design doc:
> https://docs.google.com/document/d/1_YmfCsQQb9XhRdKh0ijbc-j8JKGtGBxYsk_30NVSTWo/edit?usp=sharing
>
> Please review and let me know what you think.
>
> Thank you,
>
> Pavan
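[Editor's note] The suggestion above — deriving the backlog timeout from observed processing time rather than a static value — could be sketched roughly as below. This is a hypothetical illustration, not code from the PR or the design doc: the sample records only mimic the `durationMs` shape of Spark's `StreamingQueryProgress` (in a real job they would come from `query.recentProgress` or from the history kept under the checkpoint directory), and `suggest_backlog_timeout` and the 0.5 safety factor are invented names for the "at least half of the batch interval" margin mentioned in the email.

```python
# Hedged sketch: pick a schedulerBacklogTimeout from observed micro-batch
# processing times, floored at half the batch interval (the margin the
# email suggests). Sample data below is invented for illustration; it
# mirrors the durationMs field of StreamingQueryProgress.
recent_progress = [
    {"durationMs": {"triggerExecution": 4200}},
    {"durationMs": {"triggerExecution": 3900}},
    {"durationMs": {"triggerExecution": 5100}},
]

def suggest_backlog_timeout(progress, batch_interval_ms, safety=0.5):
    """Average observed processing time, but never below
    safety * batch_interval_ms (default: half the batch interval)."""
    processing = [p["durationMs"]["triggerExecution"] for p in progress]
    avg_ms = sum(processing) / len(processing)
    floor_ms = batch_interval_ms * safety
    return max(avg_ms, floor_ms)

timeout_ms = suggest_backlog_timeout(recent_progress, batch_interval_ms=10_000)
# avg is 4400 ms, floor is 5000 ms -> prints ...schedulerBacklogTimeout=5s
print(f"spark.dynamicAllocation.schedulerBacklogTimeout={timeout_ms / 1000:.0f}s")
```

The same loop could fold in scheduling delay as a second indicator, as suggested above, by reading additional fields from the progress records.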