[DISCUSS] Incremental statistics collection

2023-08-26 Thread RAKSON RAKESH
Hi all, I would like to propose the incremental collection of statistics in Spark. SPARK-44817 has been raised for this. Currently, Spark invalidates the stats after data-changing commands, which makes the CBO non-functional. To update these

Two new tickets for Spark on K8s

2023-08-26 Thread Mich Talebzadeh
Hi, @holden Karau recently created two Jiras that cover two items of interest, namely: 1. Improve Spark Driver Launch Time (SPARK-44950) 2. Improve Spark Dynamic Allocation (SPARK-44951)

Re: [DISCUSS] Incremental statistics collection

2023-08-26 Thread Mich Talebzadeh
Hi, Impressive, yet in the realm of classic DBMSs, it could be seen as a case of old wine in a new bottle. The objective, I assume, is to employ dynamic sampling to enhance the optimizer's capacity to create effective execution plans without the burden of complete I/O and in less time. For instan