Kimahriman commented on PR #50612: URL: https://github.com/apache/spark/pull/50612#issuecomment-2813756089
> @Kimahriman - if you are removing/adding executors per batch, then locality probably is not very useful.

Yeah, this includes not even reporting to the coordinator as being active, since that's only used for locality.

> But I'm curious about the perf diff you see with large state (especially as the large state grows) - I guess it might not matter a whole lot - because even today - you are doing a fresh pull for each batch ?

Yeah, generally for us there's no performance drop, since many of our executors end up getting deallocated between batches anyway, so we have to redownload the state each batch regardless. The long pole in the tent for us is generally the time it takes to create and upload a checkpoint. This is partially due to issues where a checkpoint is generally created every batch for RocksDB even with the changelog enabled, because of the hard-coded 10k row check as well as not initializing the latest snapshot version on a fresh load (both of which appear to be fixed for the 4.0 release).

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org