Hi team,
I'm the kube-batch/Volcano founder, and I'm excited to hear that the Spark
community also has such requirements :)
Volcano provides several features for batch workloads, e.g. fair-share,
queues, reservation, preemption/reclaim, and so on.
It has been used in several production environments with Spark ...
Thanks, Johnny, for sharing your experience. Have you tried the S3A
committers? It looks like they were introduced in recent Hadoop releases to
solve the problems with the other committers.
https://hadoop.apache.org/docs/r3.1.1/hadoop-aws/tools/hadoop-aws/committers.html
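For reference, a minimal sketch of how they can be wired up from the Spark
side, assuming the spark-hadoop-cloud and hadoop-aws modules are on the
classpath (the bucket name is made up):

import org.apache.spark.sql.SparkSession

// Sketch only: enable the S3A "directory" staging committer for Parquet writes.
// Adjust the committer name ("directory", "partitioned" or "magic") to taste.
val spark = SparkSession.builder()
  .appName("s3a-committer-example")
  .config("spark.hadoop.fs.s3a.committer.name", "directory")
  .config("spark.sql.sources.commitProtocolClass",
    "org.apache.spark.internal.io.cloud.PathOutputCommitProtocol")
  .config("spark.sql.parquet.output.committer.class",
    "org.apache.spark.internal.io.cloud.BindingParquetOutputCommitter")
  .getOrCreate()

// Writes then go through the S3A committer instead of the classic
// rename-based FileOutputCommitter protocol.
spark.range(10).write.parquet("s3a://my-example-bucket/committer-test/")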
- ND
On 6/22/21 6:41 PM, Johnny wrote:
Looks like repartitioning was my friend; the work seems to be distributed
across the cluster now.
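Roughly what that looks like, for the archive, with placeholder names for my
real RDD and per-partition function:

// Sketch only: crawlRdd and crawlPartition are illustrative names;
// defaultParallelism is just one sensible starting partition count.
val spread = crawlRdd.repartition(sc.defaultParallelism)
val results = spread.mapPartitions(crawlPartition)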
All good. Thanks!
On Wed, Jun 23, 2021 at 2:18 PM Tom Barber wrote:
> Okay, so I tried another idea, which was to use a really simple class to drive
> a mapPartitions... because the logic in my head seems to suggest that I want
> to map my partitions...
Hi, I only know about column comments, which you can add to each column and
into which you could put these key/value pairs (see the sketch below).
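A minimal sketch of what I mean, using Spark's per-field metadata; the column
name and the language keys are just illustrative, taken from your example:

import org.apache.spark.sql.types._

// Sketch only: attach translations as per-field metadata on a Spark schema.
val custnrMeta = new MetadataBuilder()
  .putString("comment", "customer number")
  .putString("label_en", "customer number")
  .putString("label_de", "Kundennummer")
  .build()

val schema = StructType(Seq(
  StructField("custnr", StringType, nullable = true, metadata = custnrMeta)
))

// A DataFrame written with this schema keeps the metadata in the Parquet
// footer as part of Spark's serialized schema, and it comes back on read:
//   spark.read.parquet(path).schema("custnr").metadata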
Thanks.
On Wed, Jun 23, 2021 at 11:31 AM Bode, Meikel, NMA-CFD <
meikel.b...@bertelsmann.de> wrote:
> Hi folks,
>
>
>
> Maybe not the right audience, but maybe you have come across such a requirement.
>
Hi folks,
Maybe not the right audience, but maybe you have come across such a requirement.
Is it possible to define a Parquet schema that contains technical column names
and, for each column name, a list of translations into different languages?
To give an example:
Technical: "custnr" would translate to ...
Please allow me to diverge and express a different point of view on
this roadmap.
From a technical point of view, I believe that spending time, effort, and
talent on batch scheduling on Kubernetes could be rewarding. However, if I
may say so, I doubt whether such an approach and the so-called democratization ...
Okay, so I tried another idea, which was to use a really simple class to drive
a mapPartitions... because the logic in my head seems to suggest that I want
to map my partitions...
@SerialVersionUID(100L)
class RunCrawl extends Serializable {
  def mapCrawl(x: Iterator[(String, Iterable[Resource])], job: Sp
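For the archive, a self-contained sketch of the shape I mean; Resource,
CrawlConfig and the body of mapCrawl here are placeholders rather than the
real crawl logic:

import org.apache.spark.rdd.RDD

// Hypothetical stand-ins for the types in the trimmed snippet above.
case class Resource(url: String)
case class CrawlConfig(depth: Int)

@SerialVersionUID(100L)
class RunCrawl extends Serializable {
  // Runs once per partition; everything captured here must be serializable
  // because the closure is shipped to the executors.
  def mapCrawl(x: Iterator[(String, Iterable[Resource])],
               cfg: CrawlConfig): Iterator[(String, Int)] =
    x.map { case (site, resources) => (site, resources.size) }

  def run(rdd: RDD[(String, Iterable[Resource])], cfg: CrawlConfig): RDD[(String, Int)] =
    rdd.mapPartitions(iter => mapCrawl(iter, cfg))
}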
(I should point out that I'm diagnosing this by looking at the active tasks,
https://pasteboard.co/K7VryDJ.png; if I'm reading it incorrectly, let me
know.)
On Wed, Jun 23, 2021 at 11:38 AM Tom Barber wrote:
> Uff hello fine people.
>
> So the cause of the above issue was, unsurprisingly, human error.
Uff hello fine people.
So the cause of the above issue was, unsurprisingly, human error. I found a
local[*] Spark master config which gazumped my own one, so mystery solved.
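Aside for anyone who hits the same thing: a master set on the
SparkConf/SparkSession builder in code takes precedence over the --master
flag passed to spark-submit, so the safest thing is not to set it in code at
all. A minimal sketch, with illustrative names:

import org.apache.spark.sql.SparkSession

// Sketch only: no .master(...) here; let spark-submit decide where to run.
val spark = SparkSession.builder()
  .appName("crawler")
  .getOrCreate()

// Then submit with, e.g.:
//   spark-submit --master spark://<driver-host>:7077 --class MyCrawler my-crawler.jar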
But I have another question, which is still the crux of this problem:
Here's a bit of trimmed code that I'm currently ...