Re: What causes a task to change parallelism?

Caizhi Weng Mon, 09 May 2022 19:54:15 -0700

Hi!

I can't see the image (if there is one) in the email, but from the
description, the behavior is related to the arrows labeled GLOBAL.


A global shuffle collects all records from its upstream task and aggregates
them in a single downstream subtask, which is why that task is not
parallelized. Several SQL patterns lead to this type of shuffle, for
example aggregate functions without grouping.
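
To make the difference concrete, here is a toy Python model of how records
are routed under a global shuffle versus a keyed one. It is only a sketch,
not Flink code: the function names, the records and the simple modulo on
user_id are all made up for illustration (a real keyed shuffle hashes the
key rather than taking a plain modulo).

```python
from collections import Counter

PARALLELISM = 5  # same as the 5 Kafka partitions mentioned in the question


def global_channel(record, num_channels):
    """Global shuffle: every record goes to the same downstream subtask."""
    return 0


def hash_channel(record, num_channels):
    """Keyed shuffle (simplified): records are spread across subtasks by key."""
    return record["user_id"] % num_channels


def records_per_subtask(records, select_channel, num_channels=PARALLELISM):
    """Count how many records each downstream subtask would receive."""
    return Counter(select_channel(r, num_channels) for r in records)


# Hypothetical input rows
records = [{"user_id": i, "amount": 10 * i} for i in range(20)]

# Like an aggregate without grouping: one subtask receives everything
print(records_per_subtask(records, global_channel))

# Like an aggregate with GROUP BY user_id: the work is spread out
print(records_per_subtask(records, hash_channel))
```

With the global selector only one subtask ever receives data, so running
more than one copy of that task would not help.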

> How can we go about pinpointing which part of our query belongs to that
> specific task?
>

This might not be straightforward, but there are clues. Each task is
assigned a task name, and the task name reflects which operators, functions
or columns are used in that task. For example, projections and filters are
called Calc. Projected columns and filter conditions also appear in
the task name.
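
With 104 tasks, one way to use these clues is to copy the task names out of
the web UI and search them for a column or function that appears in only
one part of the query. The sketch below assumes you already have the names
as a list of strings. The helper find_tasks and the sample task names are
hypothetical and only loosely follow the shape described above.

```python
def find_tasks(task_names, *keywords):
    """Return the task names that contain every keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [
        name for name in task_names
        if all(k in name.lower() for k in lowered)
    ]


# Hypothetical task names, for illustration only
task_names = [
    "Source: orders",
    "Calc(select=[order_id, amount], where=[(amount > 100)])",
    "GroupAggregate(select=[COUNT(*) AS total])",
]

# Which task projects or filters on the "amount" column?
print(find_tasks(task_names, "calc", "amount"))

# Which task computes a COUNT?
print(find_tasks(task_names, "count"))
```

Searching for a column alias or literal that occurs in only one place in
the SQL usually narrows the candidates down quickly.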

Jason Politis <jpoli...@carrera.io> wrote on Tue, May 10, 2022 at 04:45:

> Good evening all,
>
> We are running a job in Flink SQL.  We've confirmed that all Kafka topics
> we are sourcing from have 5 partitions.  All source tasks in the larger
> DAG, of which we're showing only a small portion below, have a
> parallelism of 5.  But for some reason, this one little guy here (we
> cannot figure out which part of the query he belongs to) decides not to
> parallelize.  The task following him, though, IS parallelized again.
>
> What is the most common cause for this?
>
> Does it have anything to do with the arrows pointing to him and their
> labels saying GLOBAL?
>
> How can we go about pinpointing which part of our query belongs to that
> specific task?  We have 104 tasks, so quickly pinpointing the exact part of
> the query would help us out a lot.
>
> Thank you
>
>
> Jason Politis
> Solutions Architect, Carrera Group
> carrera.io
> | jpoli...@carrera.io <kpatter...@carrera.io>
> <http://us.linkedin.com/in/jasonpolitis>
>
