Re: [K8S] ExecutorPodsWatchSnapshotSource with no spark-exec-inactive label in 3.1?

2021-03-08 Thread attilapiros
Hi, I do not think this could cause any problem. Every time polling is triggered, the pod polling snapshots would contain the pods that were inactivated earlier, so here it makes sense to handle them like deleted pods and skip them. On the other hand, the pods watcher is only informed about

Re: minikube and kubernetes cluster versions for integration testing

2021-03-04 Thread attilapiros
Thanks Shane! I can do the documentation task, and the Minikube version check can be incorporated into my PR. When my PR is finalized (probably next week) I will create a Jira for you; you can then set up the test systems and even test my PR before it is merged. Is this possible / fine for you

Re: SPARK-34600. Support user-defined types in Pandas UDF

2021-03-03 Thread attilapiros
Hi! First of all thanks for your contribution! PySpark is not an area I am familiar with but I can answer your question regarding Jira. The issue will be assigned to you when your change is in: > The JIRA will be Assigned to the primary contributor to the change as a > way of giving credit. If

Re: Is there any implicit RDD cache operation for query optimizations?

2021-02-15 Thread attilapiros
Hi, there is a good reason why the decision about caching is left to the user. Spark does not know about the future of the DataFrames and RDDs. Think about how your program is running (you are still running a program), so there is an exact point where the execution is and when Spark reaches an actio
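To illustrate the point about caching being the user's call, here is a minimal toy sketch in plain Python. It is not Spark's API: `LazyDataset`, `cache()`, `count()`, and `total()` are hypothetical names invented for this illustration. The idea is only that a lazily evaluated dataset cannot know whether a later action will reuse it, so without an explicit opt-in every action recomputes it from its lineage.

```python
class LazyDataset:
    """Toy stand-in for a lazily evaluated dataset (hypothetical, not Spark's API)."""

    def __init__(self, compute):
        self._compute = compute    # function that rebuilds the data from its lineage
        self._cached = None
        self._use_cache = False
        self.compute_calls = 0     # how many times the lineage was re-executed

    def cache(self):
        # Only the user knows the dataset will be reused, so the user opts in.
        self._use_cache = True
        return self

    def _materialize(self):
        if self._use_cache and self._cached is not None:
            return self._cached
        self.compute_calls += 1
        data = self._compute()
        if self._use_cache:
            self._cached = data
        return data

    # Two "actions": each one needs the materialized data.
    def count(self):
        return len(self._materialize())

    def total(self):
        return sum(self._materialize())


def squares():
    return [x * x for x in range(5)]


uncached = LazyDataset(squares)
uncached.count()
uncached.total()

cached = LazyDataset(squares).cache()
cached.count()
cached.total()

print(uncached.compute_calls, cached.compute_calls)
```

In this toy model the uncached dataset is recomputed once per action, while the cached one is computed only for the first action. When the first action runs, nothing in the dataset itself says whether a second action will follow, which is why the opt-in sits with the caller.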

Re: Using bundler for Jekyll?

2021-02-12 Thread attilapiros
Managed to improve the site build a bit more: with a Gemfile we can pin Jekyll to an exact version. For this we just have to call Jekyll via `bundle exec jekyll`. The PR [1] is open. [1] https://github.com/apache/spark-website/pull/303

Re: Using bundler for Jekyll?

2021-02-12 Thread attilapiros
Sure I will do that, too. -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/ - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: Using bundler for Jekyll?

2021-02-12 Thread attilapiros
I ran into the same problem today and tried to find the version where the diff is minimal, so I wrote a script: ``` #!/bin/zsh versions=('3.7.3' '3.7.2' '3.7.0' '3.6.3' '3.6.2' '3.6.1' '3.6.0' '3.5.2' '3.5.1' '3.5.0' '3.4.5' '3.4.4' '3.4.3' '3.4.2' '3.4.1' '3.4.0') for i in $versions; do gem u

Re: [DISCUSS] Spark cannot identify the problem executor

2021-02-11 Thread attilapiros
Hi, there is an existing way to handle this situation. Those tasks will become zombie tasks [1] and they should not be counted as task failures [2]. Even the shuffle blocks should be unregistered for the lost executor, although the lost executor might be already cached as a mapoutput in th

Re: [K8S] KUBERNETES_EXECUTOR_REQUEST_CORES

2021-02-10 Thread attilapiros
Hi, this is just an unnecessary extra use of the /sparkConf/ member val directly (those two lines were added by two different PRs). Actually both use the same /sparkConf/ to return the config value, as /KubernetesExecutorConf/ extends /KubernetesConf/ [1], which uses the passed /sparkConf
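As a rough illustration of why the two access paths are equivalent, here is a simplified Python model of the relationship described above. It is an assumption-laden sketch, not the actual Scala source: the class bodies, the `get` method, the `spark_conf` attribute name, and the value `"500m"` are made up for the example. Only the class names and the inheritance relationship come from the message.

```python
class KubernetesConf:
    """Simplified model: wraps the conf object it was given (hypothetical body)."""

    def __init__(self, spark_conf):
        self.spark_conf = spark_conf   # the passed-in conf, also reachable directly

    def get(self, key, default=None):
        # The wrapper answers lookups from the very same conf object.
        return self.spark_conf.get(key, default)


class KubernetesExecutorConf(KubernetesConf):
    """Inherits both the member and the lookup from KubernetesConf."""


KEY = "spark.kubernetes.executor.request.cores"
spark_conf = {KEY: "500m"}             # made-up value for illustration

conf = KubernetesExecutorConf(spark_conf)
via_wrapper = conf.get(KEY)            # lookup through the conf wrapper
via_member = conf.spark_conf.get(KEY)  # lookup through the member directly

print(via_wrapper == via_member)
```

Under these assumptions both lookups read from the same underlying object, so mixing the two styles is redundant rather than harmful.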