Hello All,
I am executing Spark jobs, but the Executors tab is missing information; I
can't see any data/info coming up. Please let me know what I am missing.
Hi Luca,
Thanks for your reply, which is very helpful for me :)
I am trying other metrics sinks together with cAdvisor to see how they behave. If
that works well, I will share the setup with the community.
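For context, the kind of sink wiring I am experimenting with looks roughly like the
sketch below (this assumes the built-in PrometheusServlet sink as one example; the
app name and settings are just placeholders from my tests, not a recommendation):
```
import org.apache.spark.sql.SparkSession

// Rough sketch: expose Spark's driver/executor metrics through the built-in
// PrometheusServlet sink so they can be scraped alongside cAdvisor's
// container metrics. All values here are placeholders from my experiments.
val spark = SparkSession.builder()
  .appName("metrics-sink-test")
  .config("spark.metrics.conf.*.sink.prometheusServlet.class",
          "org.apache.spark.metrics.sink.PrometheusServlet")
  .config("spark.metrics.conf.*.sink.prometheusServlet.path",
          "/metrics/prometheus")
  // Also expose per-executor metrics under /metrics/executors/prometheus.
  .config("spark.ui.prometheus.enabled", "true")
  .getOrCreate()
```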
On Fri, Feb 10, 2023 at 4:26 PM Luca Canali wrote:
> Hi Qian,
>
> Indeed the metrics available with the Pro
Hello All,
I'm trying to run a simple application on GKE (Kubernetes), and it is
failing:
Note: I have Spark (the Bitnami Spark chart) installed on GKE using helm
install.
Here is what I have done:
1. Created a Docker image using the Dockerfile below.
Dockerfile:
```
FROM python:3.7-slim
RUN apt-get update && \
Hi,
We are facing an issue when we convert an RDD to a Dataset, followed by a repartition
+ write. We are using spot instances on k8s, which means they can die at any
moment. When they die during this phase, we very often see data duplication
in the output.
Pseudo job code:
val rdd = data.map(…)
val
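Since the snippet above got cut off, here is a minimal sketch of the overall shape of
the job (all names, paths, and the partition count are placeholders; the real job is
more involved):
```
import org.apache.spark.sql.SparkSession

// Minimal sketch of the job shape described above; every name is a placeholder.
case class Record(key: String, value: Long)

val spark = SparkSession.builder().appName("rdd-to-dataset-write").getOrCreate()
import spark.implicits._

// 1. Transform the input as an RDD.
val rdd = spark.sparkContext
  .textFile("s3a://bucket/input/")
  .map(line => Record(line.split(",")(0), line.length.toLong))

// 2. Convert RDD -> Dataset, then repartition + write.
val ds = spark.createDataset(rdd)
ds.repartition(200)
  .write
  .mode("overwrite")
  .parquet("s3a://bucket/output/")
```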
Alright, this is the working Java version of it:
List<Column> listCols = new ArrayList<>();
Arrays.asList(dataset.columns()).forEach(column -> {
    listCols.add(org.apache.spark.sql.functions.collect_set(column)); });
Column[] arrCols = listCols.toArray(new Column[listCols.size()]);
dataset = dataset.s
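For anyone following the thread in Scala, the same pattern looks roughly like the
sketch below (the final select is an assumption about how the cut-off line above
continues; `df` is a placeholder for the Dataset):
```
import org.apache.spark.sql.{Column, DataFrame}
import org.apache.spark.sql.functions.collect_set

// Sketch: build collect_set aggregations for every column and select them,
// mirroring the Java snippet above. The trailing select is an assumption
// about the truncated last line.
def collectSetAllColumns(df: DataFrame): DataFrame = {
  val aggCols: Array[Column] = df.columns.map(c => collect_set(c))
  df.select(aggCols: _*)
}
```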