I think this is primarily a shortcoming in my ability to grep through Scala
efficiently, but are there any resources on how to programmatically spin up
& administer Flink jobs on YARN? The CLI naturally works, but I'd like to
build out a service handling the nuances of job management rather than
Thanks for the information.
I want to collect some performance metrics like throughput, latency, etc.;
that's why I am looking for this data.
How can I collect performance data in Flink?
Regards
Prateek
On Mon, May 16, 2016 at 2:32 PM, Fabian Hueske-2 [via Apache
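For orientation, the bookkeeping behind "throughput" and "latency" can be sketched without Flink at all. The class below is a hypothetical, Flink-independent illustration (the name MetricsProbe and its methods are my own, not a Flink API); in a real job these values would be exposed through Flink's metrics or accumulator facilities instead.

```java
// Hypothetical, Flink-free sketch of throughput/latency bookkeeping.
class MetricsProbe {
    private long count = 0;
    private long totalLatencyNanos = 0;

    // Record one processed element and its observed latency.
    void record(long latencyNanos) {
        count++;
        totalLatencyNanos += latencyNanos;
    }

    long count() {
        return count;
    }

    // Average latency in nanoseconds; 0 if nothing was recorded.
    long avgLatencyNanos() {
        return count == 0 ? 0 : totalLatencyNanos / count;
    }

    // Throughput in records/second over an observed wall-clock window.
    double throughputPerSecond(long elapsedNanos) {
        return elapsedNanos == 0 ? 0 : count * 1_000_000_000.0 / elapsedNanos;
    }
}
```

In a real pipeline `record(...)` would be called from inside an operator, with latency derived from an ingestion timestamp carried on each element.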
Hi Prateek,
the missing numbers are an artifact of how the stats are collected.
At the moment, Flink only collects these metrics for data which is sent over
connections *between* Flink operators.
Since sources and sinks connect to external systems (and not to Flink
operators), the dashboard does not sho
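One commonly suggested workaround (my assumption, not stated in this thread) is to insert a pass-through operator between source and sink so that records cross an internal operator connection, where Flink does measure. The toy class below has no Flink dependency and only illustrates the idea of an identity map that counts what flows through it:

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy illustration: an identity "operator" that forwards records unchanged
// while counting them, mimicking a no-op map inserted between source and sink.
class IdentityCounter {
    final AtomicLong recordsSeen = new AtomicLong();

    // No-op map: returns the record as-is, counting it on the way through.
    <T> T map(T record) {
        recordsSeen.incrementAndGet();
        return record;
    }
}
```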
Hi there,
I have been testing checkpointing on RocksDB backed by S3. Checkpointing
seems successful except for the snapshot states of a timeWindow operator on
a keyedStream. Here is the env setting I used:
env.setStateBackend(new RocksDBStateBackend(new URI("s3://backupdir/")))
The checkpoints always fail co
Thanks Fabian. Actually I don't see a .valid-length suffix file in the output
HDFS folder.
Can you please tell me how I would debug this issue, or do you suggest
anything else to solve this duplicates problem?
Thank you.
From: Fabian Hueske <fhue...@gmail.com>
Reply-To: "user@flink.apach
Hi
In my Flink Kafka streaming application I am fetching data from one topic,
processing it, and sending it to an output topic. My application is working
fine, but the Flink dashboard shows Source [Bytes/records received] and Sink
[Bytes/records sent] as zero.
Hello all,
How could I broadcast a variable in DataStream, or perform a similar
operation, so that I could read the value as in the DataSet API:
IterativeDataSet loop = centroids.iterate(numIterations);
DataSet newCentroids = points.map(new SelectNearestCenter())
    .withBroadcastSet(loop, "centroids
Hello Aljoscha,
For DataSet:
IterativeDataSet loop = centroids.iterate(numIterations);
DataSet newCentroids = points.map(new SelectNearestCenter())
    .withBroadcastSet(loop, "centroids")
    .map(new CountAppender()).groupBy(0).reduce(new CentroidAccumulator())
    .map(new CentroidAverager());
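For context, the broadcast set hands every parallel instance of SelectNearestCenter a read-only copy of the current centroids. Below is a plain-Java sketch of the nearest-centroid lookup such a map function performs; the class and method names are hypothetical and this is not the actual Flink KMeans example code:

```java
import java.util.List;

// Plain-Java sketch of the nearest-centroid assignment that a map function
// like SelectNearestCenter performs against a broadcast "centroids" set.
class NearestCenter {
    // Returns the index of the centroid closest (squared Euclidean
    // distance) to the given point.
    static int select(double[] point, List<double[]> centroids) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < centroids.size(); i++) {
            double[] c = centroids.get(i);
            double d = 0;
            for (int j = 0; j < point.length; j++) {
                double diff = point[j] - c[j];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }
}
```

The point of the broadcast is that this lookup sees the *same* centroid list on every parallel instance, refreshed at each iteration of the loop.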
There you have your explanation. A loop actually has to be a loop for it to
work in Flink.
On Sat, 14 May 2016 at 16:35 subash basnet wrote:
> Hello,
>
> I had to use
> private static IterativeStream loop;
> i.e. loop as a global variable, because I cannot broadcast it like in the
> DataSet API in D
Thanks Ufuk.
Thanks for explaining. The reasons behind the savepoint being restored
successfully kind of make sense, but it seems like the window type (count
vs time) should be taken into account when restoring savepoints. I don't
actually see anyone doing this, but I would expect Flink to complai
Hi,
We can see in [2] many interesting (and expected!) improvements (promises) like
extended SQL support, a unified API (DataFrames, DataSets), an improved engine
(Tungsten relates to ideas from modern compilers and MPP databases - similar to
Flink [3]), structured streaming, etc. It seems we somehow
Hi,
I was looking into the details of the Flink snapshotting algorithm, also
mentioned here:
http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
https://blog.acolyer.org/2015/08/19/asynchronous-distributed-snapshots-for-distributed-dataflows/
http://m
Hi Flavio, Till,
do you think this could possibly be related to the serialization problem
caused by 'the management' of the Kryo serializer buffer when spilling to disk?
We are definitely going beyond what is managed in memory with this task.
saluti,
Stefano
2016-05-16 9:44 GMT+02:00 Flavio Pompermaie
That exception showed up just once, but the following happens randomly (if I
re-run the job after stopping and restarting the cluster, it usually doesn't
show up):
Caused by: java.io.IOException: Serializer consumed more bytes than the
record had. This indicates broken serialization. If you are using
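For what that message means: the error is raised when a deserializer reads past the end of the bytes its matching serializer wrote for one record. A Flink-free toy reproduction of the mismatch (writing a 4-byte record, then reading 8 bytes from it); the class name is hypothetical:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Toy reproduction of "Serializer consumed more bytes than the record had":
// the reader's idea of the record layout disagrees with the writer's.
class SerializerMismatch {
    // Writer side: the "record" is a single int (4 bytes).
    static byte[] serialize(int value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeInt(value);
        return bos.toByteArray();
    }

    // Broken reader side: consumes a long (8 bytes) from a 4-byte record,
    // so it runs off the end and throws EOFException.
    static long brokenDeserialize(byte[] record) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(record));
        return in.readLong();
    }
}
```

In Flink the same disagreement typically points at a buggy or non-deterministic custom/Kryo serializer rather than at the framework itself.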