Hi,
Is there a chance of data loss if there is a failure between checkpoint
completion and when "notifyCheckpointComplete" is invoked?
The pending files are moved to final state in the "notifyCheckpointComplete"
method. So if there is a failure in this method or just before the method is
invo
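To make the concern concrete, here is a minimal pure-JDK sketch (not Flink's actual sink code; class and method names are hypothetical) of the pending-to-final promotion that happens in notifyCheckpointComplete. The key point is that the move must be idempotent: if the job fails before or during the rename, the ".pending" file survives and the promotion is simply retried on restore, so no data is lost.

```java
import java.io.IOException;
import java.nio.file.*;

// Sketch of the pending -> final file promotion done in
// notifyCheckpointComplete. A failure before the rename leaves the
// ".pending" file in place; retrying the (idempotent) move on restore
// finishes the job without losing data.
public class PendingFilePromotion {
    static Path promote(Path pending) throws IOException {
        String name = pending.getFileName().toString();
        Path fin = pending.resolveSibling(name.replace(".pending", ""));
        // Idempotent: if a previous attempt already moved the file, do nothing.
        if (Files.exists(pending)) {
            Files.move(pending, fin, StandardCopyOption.ATOMIC_MOVE);
        }
        return fin;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("ckpt");
        Path pending = Files.createFile(dir.resolve("part-0.pending"));
        Path fin = promote(pending);   // first attempt
        promote(pending);              // retry after a simulated failure: no-op
        System.out.println(Files.exists(fin) && !Files.exists(pending));
    }
}
```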
Hi,
We are trying to scale some of our scientific applications written in
Flink. A few questions on tuning Flink performance.
1. What parameters are available to control parallelism within a node?
2. Does Flink support shared memory-based messaging within a node (without
doing TCP calls)?
3. Is t
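For question 1, the knobs usually adjusted are the number of task slots per TaskManager and the default parallelism in flink-conf.yaml. A sketch (the values below are illustrative, not recommendations):

```
# flink-conf.yaml
taskmanager.numberOfTaskSlots: 16   # parallel pipelines per TaskManager/node
parallelism.default: 32             # default job-wide parallelism
```

Parallelism can also be set per job or per operator in the program itself, which overrides these defaults.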
When RocksDB holds a very large state, is there a concern over the time it
takes to checkpoint the RocksDB data to HDFS? Is asynchronous
checkpointing a recommended practice here?
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state_backends.html
"The RocksDBStateBackend h
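For reference, a minimal configuration sketch for using RocksDB with an HDFS checkpoint directory (key names as in the 1.x docs of that era; whether the snapshot phase runs asynchronously depends on the Flink version):

```
# flink-conf.yaml (illustrative)
state.backend: rocksdb
state.backend.fs.checkpointdir: hdfs:///flink/checkpoints
```

If the `rocksdb` shortcut is not recognized in your version, the backend can also be set programmatically, e.g. `env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"))`.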
I am reading the Stream Checkpointing doc below, but somehow couldn't find
that "switch" in any other Apache Flink docs. Have any of you tried this
switch?
https://ci.apache.org/projects/flink/flink-docs-master/internals/stream_checkpointing.html
"Flink has a switch to skip the stream alignment du
Just wanted to follow up and ask whether there are any updates regarding
this query.
Thanks
Samiksha Sharma
Engineer
HERE Predictive Analytics
HERE Seattle
701 Pike Street, #2000, Seattle, WA 98101, USA
47° 36' 41" N. 122° 19' 57" W
HERE Maps
On 6/24/16, 12:14 PM, "Sharma, Samiksha" wrote:
>H
I'm running a Flink cluster as a YARN application, started by:
./bin/yarn-session.sh -n 4 -jm 2048 -tm 4096 -d
There are 2 worker nodes, so each is allocated 2 task managers. There is a
stateful Flink job running on the cluster with a parallelism of 2.
If I now want to increase the number of wor
Yes, that was my fault. I'm used to auto reply-all on my desktop machine,
but my phone just did a simple reply.
Sorry for the confusion,
Fabian
2016-06-29 19:24 GMT+02:00 Ovidiu-Cristian MARCU <
ovidiu-cristian.ma...@inria.fr>:
> Thank you, Aljoscha!
> I received a similar update from Fabian, o
Hi Robert,
Thanks for the helpful reply.
I have a couple of follow-up questions on your reply: "In general, we recommend
running one JobManager per job."
I understand how this can be achieved while running on YARN, i.e. by submitting
single Flink jobs.
Is there some other way of setting Flink to conf
Hi all,
Is there any information out there on how to avoid breaking saved
states/savepoints when making changes to a Flink job and redeploying it?
I want to know how to avoid exceptions like this:
java.lang.RuntimeException: Failed to deserialize state handle and
setup initial operator state.
Thank you, Aljoscha!
I received a similar update from Fabian; only now I see the user list was not
in CC.
Fabian: "The optimizer hasn't been touched (except for bugfixes and new
operators) for quite some time.
These limitations are still present and I don't expect them to be removed
anytime soon."
Hi,
you can explicitly specify that you want processing-time windows like this:
stream.keyBy(...).window(TumblingProcessingTimeWindows.of(Time.seconds(1))).sum(...)
Also note that the timestamp you append in
"writeAsCsv("records-per-second-" + System.currentTimeMillis())" will only
take the times
Hi,
could you load the properties file when starting the application and add it
to the user functions so that it would be serialized along with them? This
way, you wouldn't have to ship the file to each node.
Cheers,
Aljoscha
On Wed, 29 Jun 2016 at 12:09 Janardhan Reddy
wrote:
> We are running
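A pure-JDK sketch of the mechanism Aljoscha describes (no Flink dependency; `MyFunction` is a stand-in for a real Flink user function, whose interface differs): fields of a serializable function object are shipped to the workers via Java serialization, so a Properties object loaded on the client at submission time travels along with the function.

```java
import java.io.*;
import java.util.Properties;

// Sketch of how a Properties object loaded on the client travels with a
// serializable user function, the way Flink ships closures to TaskManagers.
public class PropsInClosure {
    // Stand-in for e.g. a Flink MapFunction; the real interface differs.
    static class MyFunction implements Serializable {
        private final Properties props;   // serialized along with the function
        MyFunction(Properties props) { this.props = props; }
        String apply(String key) { return props.getProperty(key); }
    }

    // Simulate shipping the function to a worker: serialize + deserialize.
    static MyFunction roundTrip(MyFunction f) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new ObjectOutputStream(bos).writeObject(f);
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (MyFunction) in.readObject();
    }

    public static void main(String[] args) throws Exception {
        Properties p = new Properties();  // loaded from a file on the client
        p.setProperty("job.name", "demo");
        MyFunction shipped = roundTrip(new MyFunction(p));
        System.out.println(shipped.apply("job.name"));  // prints "demo"
    }
}
```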
Hi,
the result of splitting by key is that processing can easily be distributed
among the workers because the windows for individual keys can be processed
independently. This should improve cluster utilization.
Cheers,
Aljoscha
On Tue, 28 Jun 2016 at 17:26 Vishnu Viswanath
wrote:
> Hi,
>
> Than
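A simplified sketch of why keying distributes the work: every record with the same key is routed to the same worker, so that worker can keep the key's window state locally and process it independently. (Flink actually routes keys via hashed key groups; plain hashCode modulo parallelism below is a simplification for illustration.)

```java
import java.util.*;

// Simplified key-based work distribution: records with the same key always
// land on the same worker, so per-key windows can be processed independently.
public class KeyPartitioning {
    static int workerFor(String key, int parallelism) {
        return (key.hashCode() & Integer.MAX_VALUE) % parallelism;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        Map<Integer, List<String>> perWorker = new TreeMap<>();
        for (String key : List.of("user-1", "user-2", "user-3", "user-1")) {
            perWorker.computeIfAbsent(workerFor(key, parallelism),
                                      w -> new ArrayList<>()).add(key);
        }
        System.out.println(perWorker);  // the same key is always on the same worker
    }
}
```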
Hi,
I think this document is still up-to-date since not much was done in these
parts of the code for the 1.0 release and after that.
Maybe Timo can give some insights into what optimizations are done in the
Table API/SQL that will be be released in an updated version in 1.1.
Cheers,
Aljoscha
+Ti
Other than increasing the ask.timeout, we've seen such failures being
caused by long GC pauses over bigger heaps. In such a case, you could
fiddle with a) enabling object reuse, or b) enabling off-heap memory (i.e.
taskmanager.memory.off-heap: true) to mitigate GC-induced issues a bit.
Hope it h
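For completeness, a sketch of where the two options live: off-heap memory is set in flink-conf.yaml, while object reuse is enabled in code via `env.getConfig().enableObjectReuse()`.

```
# flink-conf.yaml
taskmanager.memory.off-heap: true   # managed memory lives outside the JVM heap
```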
OK, looks like you can easily give more memory to the network stack,
e.g. for 2 GB set
taskmanager.network.numberOfBuffers = 65536
taskmanager.network.bufferSizeInBytes = 32768
For the other exception, your logs confirm that there is something
else going on. Try increasing the akka ask timeout:
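A quick arithmetic check of the memory implied by these settings (total network-buffer memory is simply numberOfBuffers times bufferSizeInBytes):

```java
// Network-buffer memory implied by the Flink settings:
// taskmanager.network.numberOfBuffers * taskmanager.network.bufferSizeInBytes
public class NetworkBufferMath {
    static long networkBufferBytes(long numberOfBuffers, long bufferSizeInBytes) {
        return numberOfBuffers * bufferSizeInBytes;
    }

    public static void main(String[] args) {
        long bytes = networkBufferBytes(65536, 32768);     // suggested settings
        System.out.println(bytes / (1024 * 1024));         // prints 2048 (MiB), i.e. 2 GiB
    }
}
```

The same formula applied to the settings reported below (32768 buffers of 16384 bytes) gives 512 MiB, matching the figure quoted later in the thread.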
Hi Ufuk,
so the memory available is 48294 megabytes per node, but I reserve
28 GB (28672 MB) via the Flink conf file.
taskmanager.heap.mb = 28672
taskmanager.memory.fraction = 0.7
taskmanager.network.numberOfBuffers = 32768
taskmanager.network.bufferSizeInBytes = 16384
Anyway, what follows is what I found in the log fi
Hey Andrea! Sorry for the bad user experience.
Regarding the network buffers: you should be able to run it after
increasing the number of network buffers, just account for it when
specifying the heap size etc. You currently allocate 32768 * 16384
bytes = 512 MB for them. If you have a very long pi
Hi everyone,
I am running some Flink experiments with the Peel benchmark
http://peel-framework.org/ and I am struggling with exceptions: the
environment is a 25-node cluster, 16 cores per node. The dataset is
~80GiB and is located on HDFS 2.7.1. Flink version is 1.0.3.
At the beginning I tried with
We are running multiple Flink jobs inside a YARN session. For each Flink
job we have a separate property file. We are copying the property files to
each node in the cluster before submitting the job.
Is there a better way to read the properties file? Can we read it from
HDFS or S3? Do we need to
Hi,
the problem was solved after I figured out there was an instance of a Flink
TaskManager running on a node of the cluster.
Thank you,
Andrea
2016-06-28 12:17 GMT+02:00 ANDREA SPINA <74...@studenti.unimore.it>:
> Hi Max,
> thank you for the fast reply and sorry: I use flink-1.0.3.
> Yes I tested