Hello, I recently encountered a problem that confuses me when using Spark 3.0.
I used the TPCx-BB dataset (200 GB) and executed Query #5 from it. The SQL
reads about 65.7 GB of table data.
Query #5 is as
follows (https://github.com/NVIDIA/spark-rapids/blob/branch-0.3/integration_tests/src/main/s
Do you mean sparkSession.streams.awaitAnyTermination()? May I see your
code? Or you can look at the following:
my demo code:
val hourDevice = beginTimeDevice
  .groupBy($"subsId", $"eventBeginHour", $"serviceType")
  .agg("duration" -> "sum")
  .withColumnRenamed("sum(duration)", "durationForHo
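The aggregation in the Scala snippet above (group by the three key columns, sum the durations, then rename the generated "sum(duration)" column) can be sketched in plain Python to show the intended result shape. The column names come from the snippet; the sample data and the renamed column name are made up for illustration.

```python
# Plain-Python sketch of: groupBy(subsId, eventBeginHour, serviceType)
#                         .agg("duration" -> "sum")
#                         .withColumnRenamed(...)
from collections import defaultdict

# Hypothetical input rows; field names follow the Scala demo.
rows = [
    {"subsId": "A", "eventBeginHour": 10, "serviceType": "voice", "duration": 30},
    {"subsId": "A", "eventBeginHour": 10, "serviceType": "voice", "duration": 45},
    {"subsId": "B", "eventBeginHour": 11, "serviceType": "data",  "duration": 20},
]

# Group by the three key columns and sum "duration".
totals = defaultdict(int)
for r in rows:
    key = (r["subsId"], r["eventBeginHour"], r["serviceType"])
    totals[key] += r["duration"]

# Equivalent of renaming the auto-generated "sum(duration)" column.
result = [
    {"subsId": k[0], "eventBeginHour": k[1], "serviceType": k[2],
     "durationSum": v}
    for k, v in totals.items()
]
print(result)
```

In Spark itself the rename step can be avoided by using a typed aggregate with an alias, e.g. `.agg(sum($"duration").as("durationSum"))`, which produces the final column name directly.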
Hi, I am using a 16-node Spark cluster with the config below:
1. Executor memory: 8 GB
2. 5 cores per executor
3. Driver memory: 12 GB
We have a streaming job. Usually we do not see a problem, but sometimes we
get an exception on executor-1: a heap memory issue. I do not understand
this, since the data size is the same and this job rece
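For reference, the cluster settings listed above would typically be passed to spark-submit roughly as follows. This is only a sketch: the master URL and application jar are placeholders, and the flags shown are the standard spark-submit options matching the three settings.

```shell
# Sketch of a spark-submit matching the configuration described above.
# "yarn" and "your-streaming-app.jar" are placeholders.
spark-submit \
  --master yarn \
  --executor-memory 8g \
  --executor-cores 5 \
  --driver-memory 12g \
  your-streaming-app.jar
```

Note that with 8 GB per executor, only a fraction is available for execution and storage (governed by spark.memory.fraction, 0.6 by default), so occasional heap pressure on one executor often points to data skew in a particular partition rather than an overall data-size change.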