INITIALIZING is the very first state a job can be in. It is the state
of a job that has been accepted by the JobManager but whose processing
has not started yet.
In other words, INITIALIZING = the job has been submitted; CREATED =
the data structures and components required for scheduling have been
created.
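For illustration, a hedged sketch of observing these states with the
JobClient API (assuming Flink 1.12+, where INITIALIZING exists; the
job topology and name are placeholders):

    import org.apache.flink.api.common.JobStatus;
    import org.apache.flink.core.execution.JobClient;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class JobStatusDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
            env.fromElements(1, 2, 3).print(); // placeholder topology

            // executeAsync() returns without waiting for the job to finish.
            JobClient client = env.executeAsync("status-demo");
            // Right after submission this is typically INITIALIZING, then
            // CREATED once the scheduling structures exist, then RUNNING.
            JobStatus status = client.getJobStatus().get();
            System.out.println("Job status: " + status);
        }
    }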
Hello,
I have looked into this issue:
https://issues.apache.org/jira/browse/FLINK-16866, which supposedly
adds the "INITIALIZING" state.
I tried to find the documentation here:
- https://ci.apache.org/projects/flink/flink-docs-release-1.12/internals/job_scheduling.html#jobmanager-data-structures
- htt
Hello Vijay,
I have the same use case, where I am reading from Kafka and want to
report a count for each event every 5 mins. In Prometheus, I want to
set an alert if for any event we do not receive anything, i.e., the
count is zero.
So can you please help me with how you implemented this f
If you do the aggregation in Prometheus, I would think that you do not
need to reset the counter; but it's been a while since I've used it.
Flink will not automatically reset counters. If a reset is necessary,
then you will have to manually reset the counter every 5 seconds.
The name under which it
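For what it's worth, a hedged sketch of such a manual reset (Flink's
Counter interface has no reset(), so this subtracts the current count
on a timer; it is not atomic with concurrent inc() calls, but is
usually close enough):

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.apache.flink.metrics.Counter;

    // `counter` is assumed to be registered via getMetricGroup().counter(...).
    static void scheduleReset(Counter counter) {
        ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();
        // Every 5 seconds, subtract whatever has been counted so far.
        scheduler.scheduleAtFixedRate(
            () -> counter.dec(counter.getCount()), 5, 5, TimeUnit.SECONDS);
    }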
Hi David,
Thanks for your reply.
To summarize:
Use a Counter:
counter = getRuntimeContext()
    .getMetricGroup()
    // specify my value for each custom event_name here; I might not
    // know all custom event_names in advance
    .addGroup("MyMetricsKey", "MyMetricsValue")
    .counter("myCounter");
This MyMetric
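A hedged sketch of that pattern, creating one counter per event_name
lazily (the String input standing in for an event, and the group key
"event_name", are illustrative assumptions):

    import java.util.HashMap;
    import java.util.Map;
    import org.apache.flink.api.common.functions.RichMapFunction;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.metrics.Counter;

    public class EventNameCounting extends RichMapFunction<String, String> {
        // One counter per event_name, created on first sight.
        private transient Map<String, Counter> counters;

        @Override
        public void open(Configuration parameters) {
            counters = new HashMap<>();
        }

        @Override
        public String map(String eventName) {
            counters.computeIfAbsent(eventName, name ->
                getRuntimeContext()
                    .getMetricGroup()
                    .addGroup("event_name", name) // user variable, e.g. a Prometheus label
                    .counter("eventCount"))
                .inc();
            return eventName;
        }
    }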
I'd recommend doing the aggregation over 5 seconds in
Graphite/Prometheus etc., and exposing a counter in Flink for each
attribute/event_name.
User variables are a good choice for encoding the attribute/event_name
values.
As for your remaining questions:
Flink does not support aggregating ope
Hi All,
I am looking at custom user metrics to count incoming records by an
incoming attribute, event_name, and aggregate them over 5 secs.
I looked at
https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#reporter.
I am trying to figure out which one to use, Counter or Meter.
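For comparison, a hedged sketch of registering each kind (inside a
rich function's open(); a Meter backed by MeterView reports a
per-second rate averaged over a configurable time span):

    import org.apache.flink.metrics.Counter;
    import org.apache.flink.metrics.Meter;
    import org.apache.flink.metrics.MeterView;

    // Counter: a monotonically increasing count; call inc() per record.
    Counter counter = getRuntimeContext()
        .getMetricGroup()
        .counter("eventsSeen");

    // Meter: a rate; call markEvent() per record, averaged over 60s here.
    Meter meter = getRuntimeContext()
        .getMetricGroup()
        .meter("eventsPerSecond", new MeterView(60));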
Hi David,
Thanks for your reply.
I am already using the PrometheusReporter. I am trying to figure out
how to dig into the application data, count records grouped by an
attribute called event_name, and report the result to Grafana via
Prometheus.
I see the following at a high level
Setting up a Flink metrics dashboard in Grafana requires setting up and
configuring one of Flink's metrics reporters [1] that is supported by
Grafana as a data source. That means your options for a metrics reporter
are Graphite, InfluxDB, Prometheus, or the Prometheus push reporter.
If you want re
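For reference, a hedged example of wiring up the PrometheusReporter in
flink-conf.yaml (the port is an assumption; pick any free one):

    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    metrics.reporter.prom.port: 9249

Prometheus then scrapes each Flink process on that port, and Grafana
reads from Prometheus as a data source.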
Hi All,
Is there a way to send hints to the job graph builder, like
specifically disabling or enabling chaining?
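(For the record, the DataStream API does expose such hints; a hedged
sketch, with the operators as placeholders:)

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();
    env.disableOperatorChaining(); // globally disable chaining for the job

    // Or hint per operator:
    env.fromElements("a", "b", "c")
        .map(String::toUpperCase).startNewChain()    // begin a new chain here
        .filter(s -> !s.isEmpty()).disableChaining() // keep this operator unchained
        .print();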
Great, thanks!
On Tue, Oct 10, 2017 at 7:52 AM Aljoscha Krettek wrote:
> Hi,
>
> The execution graph looks like this because Flink optimises your graph to
> fit all operations within a single Task. This operation is called chaining.
> The operation can be applied when there is no shuffle between operations
> and when the parallelism is the same (roughly speaking).
Hi,
The execution graph looks like this because Flink optimises your graph to fit
all operations within a single Task. This operation is called chaining. The
operation can be applied when there is no shuffle between operations and when
the parallelism is the same (roughly speaking).
If you
Hi, my execution graph looks like the following, with everything
stuffed into one tile.
[image: image.png]
How can I get something like this?
On Wed, Jun 29, 2016 at 9:19 PM, Bajaj, Abhinav wrote:
> Is there a plan to add the Job id or name to the logs?
This is now part of the YARN client output and should be part of the
1.1 release.
Regarding your other question: in standalone mode, you have to
manually make sure to not submit mult
Tuesday, June 21, 2016 at 8:23 AM
To: user@flink.apache.org, Till Rohrmann
Cc: Aljoscha Krettek
Subject: Re: Documentation for translation of Job graph to Execution graph
Hi,
The link has been newly added, yes.
Regarding Q1, since there is n
> In general, there are two things I am trying to understand and get
> comfortable with -
>
> 1. How a Job graph is translated to Execution graph. The logs and
> monitoring APIs are for the Execution graph. So, I need to map them to
> the Job graph. I am trying to bridge this gap.
> 2. The job manager &
Hi,
Thanks for sharing this link. I have not seen it before. Maybe it was
newly added in the 1.0 docs. I will go through it.
In general, there are two things I am trying to understand and get
comfortable with -
1. How a Job graph is translated to Execution graph. The logs and
monitoring APIs
Bajaj, Abhinav wrote:
> Hi,
>
> When troubleshooting a flink job, it is tricky to map the Job graph
> (application code) to the logs & monitoring REST APIs.
>
> So, I am trying to find documentation on how a Job graph is translated to
> Execution graph.
> I found this -
> https:
Hi,
When troubleshooting a Flink job, it is tricky to map the Job graph
(application code) to the logs & monitoring REST APIs.
So, I am trying to find documentation on how a Job graph is translated to
Execution graph.
I found this -
https://ci.apache.org/projects/flink/flink-docs-release
Thanks, altering via pause/update/resume is OK, at least for now. Will
try it in practice.
Just in case: the question was inspired by Apache NiFi. If you haven't
seen it, watch https://www.youtube.com/watch?v=sQCgtCoZyFQ at 29:10.
I would say such a thing is a must-have feature in "production", where
stopp
Hey, currently this is not possible. You can use savepoints
(https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/savepoints.html)
to stop the job and then resume with the altered job version. There
are plans to allow dynamic rescaling of the execution graph, but I
think they
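As an illustration, that flow with the CLI looks roughly like this
(job ID, paths, and jar name are placeholders; the exact flags vary a
bit across Flink versions):

    # Trigger a savepoint for the running job:
    bin/flink savepoint <jobId> hdfs:///flink/savepoints

    # Cancel it, then resume the altered program from the savepoint:
    bin/flink cancel <jobId>
    bin/flink run -s hdfs:///flink/savepoints/savepoint-xyz modified-job.jar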
thus not affecting the logic of the 'current' flow and
already generated state) and internally writes the payload to a log or
sends it to Kafka for further processing.
I guess, in the same way, it should be possible to remove
'transformationless' functions from the execution graph at run-time
without stopping
Yes, the web client always shows parallelism 1. That is a bug, but it
does not affect the execution of your program.
If you specify the default parallelism in your Flink config, you don't
have to set it in your program or via the command-line argument (-p).
However, if you leave it at its default a
Hi everybody, and thanks for the answer.
So, if I understood correctly: apart from some operations, most of them
are executed at the default parallelism value (which is what I
expected), but the viewer will always show 1 if something different is
not set via setParallelism.
Is that right?
I don't ha
As an addition, some operators can only be run with a parallelism of 1,
for example data sources based on collections and (un-grouped) all
reduces. In some cases, the parallelism of the following operators will
be set to 1 as well to avoid a network shuffle.
If you do:
env.fromCollection(myCollection)
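To make the effect concrete, a hedged DataSet-API sketch (the
collection contents and the parallelism value are arbitrary):

    import java.util.Arrays;
    import org.apache.flink.api.java.ExecutionEnvironment;

    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    env.fromCollection(Arrays.asList("a", "b", "c")) // collection source: parallelism 1
        .map(s -> s.toUpperCase()).setParallelism(4) // explicitly raise it downstream
        .print();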
The web client currently does not support configuring the parallelism.
There is an issue for it, so it will be fixed soon.
---
What you can do right now:
1) Either configure the following key in flink-conf.yaml:
parallelism.default: PARALLELISM
2) Or set it via the environment:
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(PARALLELISM);
Hi Michele,
If you don't set the parallelism, the default parallelism is used. For the
visualization in the web client, a parallelism of one is used. When you run
your example from your IDE, the default parallelism is set to the number of
(virtual) cores of your CPU.
Moreover, Flink will currentl
Hi, I was trying to run my program in the Flink web environment (the
local one). When I run it, I get the graph of the planned execution,
but in each node there is "parallelism = 1", whereas I think it runs
with par = 8 (8 cores; I always get 8 outputs).
What does that mean?
Is that wrong, or is it