where the data for all jobs is
written to.
Best,
Aljoscha
> On 5. Feb 2018, at 20:27, xiatao123 wrote:
>
> The external checkpoints are in the format of checkpoint_metadata-0057,
> and I have no idea which job this checkpoint metadata belongs to if I have
> multiple jobs running at the same time.
The external checkpoints are in the format of checkpoint_metadata-0057,
and I have no idea which job this checkpoint metadata belongs to if I have
multiple jobs running at the same time.
If a job fails unexpectedly, I need to know which checkpoint belongs to the
failed job. Is there an API
r=s3://aljoscha/data-generator-external-checkpoints \
../flink-state-machine-kafka011-1.0-SNAPSHOT.jar --parallelism 2 --topic
events6 --bootstrap.servers some-other-ip:9092 --numKeys 1000 --sleep 1
--checkpointDir s3://aljoscha/data-generator-11-checkpoints
--externalizedCheckpoints true
This is
Hi Aviad,
sorry for the late reply.
You can configure the checkpoint directory (which is also used for
externalized checkpoints) when you create the state backend:
env.setStateBackend(new RocksDBStateBackend("hdfs:///checkpoints-data/"));
This configures the checkpoint directory to be hdfs:///checkpoints-data/.
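For reference, here is a minimal sketch of what a per-job setup along these lines might look like, assuming the streaming Java API of that Flink version. The per-job path (hdfs:///checkpoints-data/job-a/), the job name, and the checkpoint interval are placeholders, not something prescribed by Flink.

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PerJobCheckpointDirExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60 seconds.
        env.enableCheckpointing(60_000);

        // Keep externalized checkpoints when the job is cancelled or fails,
        // so a monitoring script can resume from them later.
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // Hypothetical per-job directory: using a distinct path per job makes it
        // obvious which checkpoint_metadata files belong to which job.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///checkpoints-data/job-a/"));

        // ... build the actual pipeline here ...
        env.fromElements("a", "b", "c").print();

        env.execute("job-a");
    }
}

If every job gets its own directory like this, the mapping from checkpoint files to jobs can be read off the path alone.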
Hi,
thanks for the answer.
I can use the first option (the REST API).
For some reason it is undocumented in the Flink documentation
(https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/rest_api.html).
Regarding the second option, configuring each job with an externalized
checkpoint directory
Hi Aviad,
I had a similar situation and my solution was to use the Flink
monitoring REST API (/jobs/{jobid}/checkpoints) to get the mapping
between job and checkpoint file.
Wrap this in a script and run it periodically (in my case, every 30 seconds).
You can also configure each job with an externalized checkpoint directory.
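To make that concrete, below is a rough sketch of such a script in Java, using only the JDK. The JobManager address (localhost:8081) and the JSON field name "external_path" are assumptions from memory; please check them against the REST API documentation for your Flink version.

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: query /jobs/<jobid>/checkpoints and print the latest externalized
// checkpoint path reported for that job.
public class CheckpointToJobMapper {

    private static final String REST_BASE = "http://localhost:8081"; // assumed JobManager REST address

    public static void main(String[] args) throws Exception {
        String jobId = args[0]; // job id, e.g. taken from the web UI or the jobs listing
        String json = httpGet(REST_BASE + "/jobs/" + jobId + "/checkpoints");

        // Crude extraction of the first "external_path" value, to avoid a JSON library dependency.
        Matcher m = Pattern.compile("\"external_path\"\\s*:\\s*\"([^\"]+)\"").matcher(json);
        if (m.find()) {
            System.out.println(jobId + " -> " + m.group(1));
        } else {
            System.out.println(jobId + " has no externalized checkpoint yet");
        }
    }

    private static String httpGet(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        try (InputStream in = conn.getInputStream();
             Scanner s = new Scanner(in, StandardCharsets.UTF_8.name()).useDelimiter("\\A")) {
            return s.hasNext() ? s.next() : "";
        }
    }
}

A cron job or a simple loop around this, every 30 seconds or so, keeps a current job-to-checkpoint mapping without touching the jobs themselves.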
Hi,
I have several jobs which are configured for externalized checkpointing
(enableExternalizedCheckpoints).
How can I correlate between checkpoints and jobs?
For example, I want to write a script which monitors whether a job is up or
not, and if the job is down, resumes it from the externalized
checkpoint.
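One possible shape for the resume step, as a sketch only: look up the job's latest externalized checkpoint (for example with the REST call shown further up in this thread), and if the job is no longer running, resubmit it with "flink run -s <checkpoint path>", which is the documented way to resume from a savepoint or externalized checkpoint. The assumption that the flink CLI is on the PATH and the argument handling are placeholders.

import java.util.Arrays;

// Sketch of the resume step: resubmit a job from its externalized checkpoint.
public class ResumeFromExternalizedCheckpoint {

    public static void main(String[] args) throws Exception {
        String checkpointPath = args[0]; // path to the checkpoint_metadata file of the externalized checkpoint
        String jobJar = args[1];         // path to the job jar

        // "flink run -s <path>" resumes a job from a savepoint or externalized
        // checkpoint; -d submits it in detached mode.
        ProcessBuilder pb = new ProcessBuilder(
                Arrays.asList("flink", "run", "-d", "-s", checkpointPath, jobJar));
        pb.inheritIO();
        int exit = pb.start().waitFor();
        System.out.println("flink run exited with code " + exit);
    }
}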
le, can you attach a
>> profiler/sampling to your job manager and figure out the hotspot methods
>> where most time is spent? This would be very helpful as a starting point
>> where the problem is potentially caused.
>>
>> Best,
>> Stefan
>>
>>>
is spent? This would be very helpful as a starting point
> where the problem is potentially caused.
>
> Best,
> Stefan
>
>> On 29.06.2017 at 18:02, Jared Stehler wrote:
>>
>> We’re seeing
On Mon, Jul 3, 2017 at 12:02 PM, Stefan Richter wrote:
> Another thing that could be really helpful, if possible, can you attach a
> profiler/sampling to your job manager and figure out the hotspot methods
> where most time is spent? This would be very helpful as a starting point
> where the problem is potentially caused.
/sampling to your job manager and figure out the hotspot methods where
most time is spent? This would be very helpful as a starting point where the
problem is potentially caused.
Best,
Stefan
> On 29.06.2017 at 18:02, Jared Stehler wrote:
>
> We’re seeing our external checkpoints directory grow in an unbounded fashion…
We’re seeing our external checkpoints directory grow in an unbounded fashion…
after upgrading to Flink 1.3. We are using Flink-Mesos.
In 1.2 (HA standalone mode), we saw (correctly) that only the latest external
checkpoint was being retained (i.e., respecting state.checkpoints.num-retained).