Hi all
We have a pipeline (running on YARN, Flink v1.7.1) which consumes a union of
Kafka and HDFS sources. We noticed that the throughput is 10 times higher if
only one of these sources is consumed. While trying to identify the problem,
I implemented a no-op source which was unioned with one of the sources.
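For illustration, a minimal sketch of such a no-op source unioned with the
Kafka stream, assuming the Flink 1.7 SourceFunction API (topic name, broker
address and class names are placeholders):

import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

// Source that emits nothing and just stays alive until cancelled.
class NoOpSource extends SourceFunction[String] {
  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit =
    while (running) Thread.sleep(100)

  override def cancel(): Unit = running = false
}

object UnionThroughputTest {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val props = new Properties()
    props.setProperty("bootstrap.servers", "broker:9092") // placeholder
    val kafkaStream = env.addSource(
      new FlinkKafkaConsumer[String]("events", new SimpleStringSchema(), props))

    // Unioning the live stream with an idle source is enough to
    // reproduce the throughput drop described above.
    kafkaStream.union(env.addSource(new NoOpSource)).print()

    env.execute("union-throughput-test")
  }
}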
Hi Zhijiang
Thanks for the clarification; we were thinking about the very same solution,
so we'll go in that direction.
Best
Peter
zhijiang wrote (on Mon, Apr 15, 2019, at 4:28):
> Hi Peter,
>
> The lifecycle of these metrics is coupled with the lifecycle of the task,
> so the metrics would be
Hi all
We're exposing metrics from our Flink (v1.7.1) pipeline to Prometheus,
e.g. the total number of processed records. This works fine until any of the
tasks is restarted within this YARN application; then the counter is reset
and starts incrementing from 0.
How can we retain the counter's value across such restarts?
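A minimal sketch of one possible approach: checkpoint the count in operator
state and expose it through a gauge, so a restored task continues from the
saved value instead of 0. Class and metric names are made up, and the value
only survives when the task restarts from a checkpoint or savepoint:

import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.metrics.Gauge
import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction

// Counts processed records; the count is checkpointed so that a restored
// task resumes from the saved value rather than 0.
class CountingMap[T] extends RichMapFunction[T, T] with CheckpointedFunction {
  @transient private var checkpointedCount: ListState[java.lang.Long] = _
  private var count: Long = 0L

  override def open(parameters: Configuration): Unit =
    getRuntimeContext.getMetricGroup.gauge[Long, Gauge[Long]](
      "processedRecordsTotal", new Gauge[Long] { override def getValue: Long = count })

  override def map(value: T): T = { count += 1; value }

  override def snapshotState(ctx: FunctionSnapshotContext): Unit = {
    checkpointedCount.clear()
    checkpointedCount.add(count)
  }

  override def initializeState(ctx: FunctionInitializationContext): Unit = {
    checkpointedCount = ctx.getOperatorStateStore.getListState(
      new ListStateDescriptor[java.lang.Long]("count", classOf[java.lang.Long]))
    // initializeState runs before open(), so the gauge sees the restored value.
    if (ctx.isRestored) {
      val it = checkpointedCount.get().iterator()
      while (it.hasNext) count += it.next()
    }
  }
}

// usage: stream.map(new CountingMap[String])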
Hi all
Our intention is to create a savepoint from the current prod pipeline
(running on Flink 1.7.1) and bring up another one behind the scenes using
this savepoint, to avoid reprocessing all the data and creating the local
state again.
It looks like it's technically possible, but we're unsure about t
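For reference, taking the savepoint and starting the second job from it
would use the standard Flink CLI along these lines (job ID, paths and jar
name are placeholders):

flink savepoint <jobId> hdfs:///flink/savepoints
flink run -s hdfs:///flink/savepoints/savepoint-<jobId>-<suffix> -d our-pipeline.jar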
state..
Thanks,
Peter
2018-06-11 11:31 GMT+02:00 Stefan Richter :
> Hi,
>
> > On 08.06.2018 at 01:16, Peter Zende wrote:
> >
> > Hi all,
> >
> > We have a streaming pipeline (Flink 1.4.2) for which we implemented
> > stoppable sources to be able to gracefully exit from the job with YARN
Hi all,
We have a streaming pipeline (Flink 1.4.2) for which we implemented
stoppable sources to be able to gracefully exit from the job with YARN
state "finished/succeeded".
This works fine; however, after creating a savepoint, stopping the job
(stop event) and restarting it, we noticed that the
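For context, a minimal sketch of the kind of stoppable source involved,
assuming Flink 1.4's StoppableFunction interface (the reading logic is a
placeholder). "flink stop <jobId>" calls stop(), letting run() return so
the job finishes with state FINISHED, whereas "cancel" kills it abruptly:

import org.apache.flink.api.common.functions.StoppableFunction
import org.apache.flink.streaming.api.functions.source.SourceFunction

class StoppableFileSource extends SourceFunction[String] with StoppableFunction {
  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit =
    while (running) {
      // read and emit records here; leave the loop once running is false
      ctx.collect("record") // placeholder emission
      Thread.sleep(1000)
    }

  // Graceful stop: finish current work, then return from run().
  override def stop(): Unit = running = false

  // Hard cancel: in this simple sketch it just flips the same flag.
  override def cancel(): Unit = running = false
}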
Hi all,
Is it possible to define two DataStream sources - one which reads from
Kafka, the other from HDFS - and apply mapWithState with a
CoFlatMapFunction? The idea would be to read historical data from HDFS
along with the live stream from Kafka and, based on some business logic,
write the output to
Hi,
We use RocksDB with the FsStateBackend (HDFS) to store the state used by
the mapWithState operator. Is it possible to initialize / populate this
state during the streaming application's startup?
Our intention is to reprocess the historical data from HDFS in a batch job
and save the latest state of the
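For completeness, the backend setup in question looks roughly like this
(the checkpoint path is a placeholder); the string constructor makes
RocksDB checkpoint to a filesystem location, matching the RocksDB-on-HDFS
setup described above:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// RocksDB holds the working state locally; checkpoints/savepoints go to HDFS.
env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"))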
Hi,
We have a Flink streaming pipeline (1.4.2) which reads from Kafka, uses
mapWithState with RocksDB, and writes the updated states to Cassandra.
We would also like to reprocess the ingested records from HDFS. For this
we are considering computing the latest state of the records over the
whole dataset in