Checkpointing wasn't enabled in the streaming job, but the offsets should
still have been committed to ZooKeeper periodically. However, we don't see
the offsets being written to ZooKeeper.
On Tue, Aug 2, 2016 at 7:41 PM, Till Rohrmann wrote:
> Hi Janardhan,
>
> Flink should commit the current offsets to ZooKeeper whenever a checkpoint
> has been completed. […]
Topology snip:
datastream = some_stream
    .keyBy(keySelector)
    .timeWindow(Time.seconds(60))
    .reduce(new some_KeyReduce());
If I have a KeySelector that's pretty 'loose' (i.e. lots of matches), the
'some_KeyReduce' function gets hit frequently and some set of values is
printed out via 'datastream.print()'. […]
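Fleshed out, the snippet above corresponds to something like the following
(only a sketch; the Tuple2 input, key choice, and max-style reduce are
stand-ins for the poster's actual types):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

// stand-in for some_stream: (key, value) pairs
DataStream<Tuple2<String, Long>> someStream = env.fromElements(
        Tuple2.of("a", 1L), Tuple2.of("a", 5L), Tuple2.of("b", 2L));

DataStream<Tuple2<String, Long>> datastream = someStream
        .keyBy(new KeySelector<Tuple2<String, Long>, String>() {
            @Override
            public String getKey(Tuple2<String, Long> t) {
                return t.f0; // a "loose" selector maps many events onto few keys
            }
        })
        .timeWindow(Time.seconds(60))
        .reduce(new ReduceFunction<Tuple2<String, Long>>() {
            @Override
            public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
                                               Tuple2<String, Long> b) {
                return a.f1 >= b.f1 ? a : b; // stand-in for some_KeyReduce
            }
        });

datastream.print();
env.execute("keyed window reduce");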
I’m currently trying to programmatically create a Flink cluster on a given YARN
cluster. At the moment I’m using the FlinkYarnClientBase class to do this, with
some limitations (Flink version 1.0.3).
I want to pass in my own YARN configuration so that I can deploy Flink on
different YARN clusters. […]
Hi,
I think it would probably be a good idea to make these tunable from the
command line. Otherwise we might run into the problem of accidentally
restoring a job that should instead fail, as it does now.
Gyula
Stephan Ewen wrote (on Tue, Aug 2, 2016 at 17:17):
> +1 to ignore unmatched state.
>
Hi Max,
Is there a way to limit the JVM memory usage (something like the -Xmx flag)
for the task manager so that it won't go over the YARN limit but will just
run GC until there is memory to use? Trying to allocate "enough" memory for
this stream task is not ideal because I could have indefinitely […]
+1 to ignore unmatched state.
Also +1 to allow programs that resume partially (add some new state that
starts empty)
Both are quite important for program evolution.
On Tue, Aug 2, 2016 at 2:58 PM, Ufuk Celebi wrote:
> No, unfortunately this is the same for 1.1. The idea was to be explicit
> about what works and what not. […]
I can tell you that we are reading Avro data from Kafka on Flink without
problems. It seems like you have a mistake somewhere in your system. If I were
you I would try your serialization & deserialization code in a simple program
within the same JVM, then gradually add the other components in order […]
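For example, a minimal same-JVM round trip along those lines could look like
this (the schema and field names are made up for illustration):

import java.io.ByteArrayOutputStream;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

Schema schema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Reading\",\"fields\":[" +
        "{\"name\":\"id\",\"type\":\"string\"},{\"name\":\"value\",\"type\":\"double\"}]}");

GenericRecord rec = new GenericData.Record(schema);
rec.put("id", "sensor-1");
rec.put("value", 42.0);

// serialize exactly as the producer would
ByteArrayOutputStream out = new ByteArrayOutputStream();
BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
new GenericDatumWriter<GenericRecord>(schema).write(rec, encoder);
encoder.flush();
byte[] bytes = out.toByteArray();

// deserialize exactly as the Flink consumer would
BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
GenericRecord roundTripped =
        new GenericDatumReader<GenericRecord>(schema).read(null, decoder);
System.out.println(roundTripped); // should print the original record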
Thank you - it is very clear now.
Sameer
On Tue, Aug 2, 2016 at 10:29 AM, Till Rohrmann wrote:
> The CEP operator maintains for each pattern a window length. This means
> that every starting event will set its own timeout value.
>
> So if T=51 arrives in the 11th minute, then it depends whether the second
> T=31 arrived sometime between the 1st and 11th minute. […]
Hi Stephan,
I went through one of the old mail threads:
http://mail-archives.apache.org/mod_mbox/flink-user/201510.mbox/%3CCANC1h_vq-TVjTNhXyYLoVso7GRGkdGWioM5Ppg%3DGoQPjvigqYg%40mail.gmail.com%3E
Here it is mentioned that when reading from Kafka you are expected to define a
DeserializationSchema. […]
The CEP operator maintains for each pattern a window length. This means
that every starting event will set its own timeout value.
So if T=51 arrives in the 11th minute, then it depends whether the second
T=31 arrived sometime between the 1st and 11th minute. If that's the case,
then you should also […]
Hi Claudia,
1) At the moment the offset information will be written to the ZooKeeper
quorum used by Kafka as well as to the savepoint. Reading the savepoint is
not so easy to do since you would need to know the internal representation
of the savepoint. But you could try to read the Kafka offsets from ZooKeeper. […]
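For the 0.8 consumer, offsets live in ZooKeeper under
/consumers/<group>/offsets/<topic>/<partition>, so a direct read is possible
with a plain ZooKeeper client. A sketch (connect string, group, topic, and
partition are placeholders):

import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.ZooKeeper;

ZooKeeper zk = new ZooKeeper("zk-host:2181", 30000, null);
// Kafka 0.8 high-level consumer layout in ZooKeeper
byte[] data = zk.getData("/consumers/my-group/offsets/my-topic/0", false, null);
long offset = Long.parseLong(new String(data, StandardCharsets.UTF_8));
System.out.println("offset of partition 0: " + offset);
zk.close();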
Hi!
I think this is a known limitation for Flink 1.0 and it is fixed in Flink
1.1
Here is the JIRA ticket: https://issues.apache.org/jira/browse/FLINK-3691
Here is the mail thread:
http://mail-archives.apache.org/mod_mbox/flink-user/201604.mbox/%3CCAOFSxKtJXfxRKm2=bplu+xvpwqrwd3c8ynuk3iwk9aqvgrc
Thanks Till,
In that case if I have a pattern -
First = T > 30
Followed By = T > 50
Within 10 minutes
If I get the following sequence of events within 10 minutes
T=31, T=51, T=31, T=51
I assume the alert will fire twice now.
But what happens if the last T=51 arrives in the 11th minute? If the
pattern […]
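For reference, the pattern described above would look roughly like this in
the CEP Java API of the 1.0/1.1 era (only a sketch; TempEvent and its
getTemperature() accessor are hypothetical, and select() receives a
Map<String, T> in these versions):

import java.util.Map;
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;

Pattern<TempEvent, ?> pattern = Pattern.<TempEvent>begin("first")
        .where(evt -> evt.getTemperature() > 30)  // First = T > 30
        .followedBy("second")
        .where(evt -> evt.getTemperature() > 50)  // Followed By = T > 50
        .within(Time.minutes(10));                // Within 10 minutes

// given: DataStream<TempEvent> input
PatternStream<TempEvent> matches = CEP.pattern(input, pattern);
DataStream<String> alerts = matches.select(
        new PatternSelectFunction<TempEvent, String>() {
            @Override
            public String select(Map<String, TempEvent> match) {
                return "alert: " + match.get("first") + " -> " + match.get("second");
            }
        });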
Hi Janardhan,
Flink should commit the current offsets to ZooKeeper whenever a checkpoint
has been completed. If you have disabled checkpointing, the offsets will
instead be committed to ZooKeeper periodically; the default interval is 60s.
Could it be that there wasn't yet a completed checkpoint? […]
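A sketch of the two modes, assuming the 0.8 connector (the addresses, group
id, and topic are placeholders; the auto.commit.* keys are the standard
Kafka 0.8 high-level-consumer properties the connector falls back to):

import java.util.Properties;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// mode 1: with checkpointing, offsets go to ZooKeeper on each completed checkpoint
env.enableCheckpointing(60000);

Properties props = new Properties();
props.setProperty("zookeeper.connect", "zk-host:2181");   // placeholder
props.setProperty("bootstrap.servers", "kafka-host:9092"); // placeholder
props.setProperty("group.id", "my-group");                 // placeholder
// mode 2: without checkpointing, the 0.8 consumer commits periodically
props.setProperty("auto.commit.enable", "true");
props.setProperty("auto.commit.interval.ms", "60000"); // the 60s default mentioned above

env.addSource(new FlinkKafkaConsumer08<>("my-topic", new SimpleStringSchema(), props))
   .print();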
Hi,
I am using Flink 1.0.3 and FlinkKafkaConsumer08 to read Avro data from Kafka. I
have the Avro schema file with me which was used to write the data to Kafka.
Here
https://ci.apache.org/projects/flink/flink-docs-release-0.8/example_connectors.html
you have mentioned that using the GenericData […]
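One common way to wire such a schema in is a custom DeserializationSchema
that parses the writer schema once and decodes each message into a
GenericRecord. A sketch, not the documented approach from that page:

import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;
import org.apache.flink.streaming.util.serialization.DeserializationSchema;

public class AvroGenericRecordSchema implements DeserializationSchema<GenericRecord> {
    private final String schemaString; // Schema itself is not serializable
    private transient Schema schema;
    private transient GenericDatumReader<GenericRecord> reader;
    private transient BinaryDecoder decoder;

    public AvroGenericRecordSchema(Schema schema) {
        this.schemaString = schema.toString();
    }

    @Override
    public GenericRecord deserialize(byte[] message) {
        try {
            if (reader == null) {
                schema = new Schema.Parser().parse(schemaString);
                reader = new GenericDatumReader<>(schema);
            }
            decoder = DecoderFactory.get().binaryDecoder(message, decoder);
            return reader.read(null, decoder);
        } catch (IOException e) {
            throw new RuntimeException("Failed to decode Avro record", e);
        }
    }

    @Override
    public boolean isEndOfStream(GenericRecord nextElement) {
        return false;
    }

    @Override
    public TypeInformation<GenericRecord> getProducedType() {
        return TypeExtractor.getForClass(GenericRecord.class);
    }
}

It could then be handed to the consumer, e.g.
new FlinkKafkaConsumer08<>("my-topic", new AvroGenericRecordSchema(schema), props)
(topic and props being placeholders).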
Hi Sameer,
the within clause of CEP uses neither tumbling nor sliding windows. It is
more like a session window which is started whenever an element which
matches the starting condition arrives. As long as new events which fulfill
the pattern definition arrive within the length of the window, they […]
No, unfortunately this is the same for 1.1. The idea was to be explicit
about what works and what not. I see that this is actually a pain for this
use case (which is very nice and reasonable ;)). I think we can either
always ignore state that does not match the new job or, if that is too
aggressive, […]
Hello again,
Having had another go at this today, I clearly see that I cannot pass a
certain type into the fold/window function and expect to be able to return a
datastream of another type from the window function. I have tried a
different approach and am now receiving a run-time exception, caused by […]
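For what it's worth, returning a different type from a window is typically
done with apply() and a WindowFunction rather than fold/reduce, since a
WindowFunction may emit any output type. A sketch with hypothetical
Event/Summary types:

import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

// given: DataStream<Event> events, where Event has a String field "id"
DataStream<Summary> summaries = events
        .keyBy("id") // POJO field expression; key arrives as a Tuple
        .timeWindow(Time.seconds(60))
        .apply(new WindowFunction<Event, Summary, Tuple, TimeWindow>() {
            @Override
            public void apply(Tuple key, TimeWindow window,
                              Iterable<Event> input, Collector<Summary> out) {
                int count = 0;
                for (Event ignored : input) {
                    count++;
                }
                // Summary is a different type than Event
                out.collect(new Summary(key.toString(), window.getEnd(), count));
            }
        });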
Your job creates a lot of String objects which need to be garbage
collected. It could be that the JVM is not fast enough and Yarn kills
the JVM for consuming too much memory.
You can try two things:
1) Give the task manager more memory
2) Increase the Yarn heap cutoff ratio (e.g. yarn.heap-cutoff-ratio) […]
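In flink-conf.yaml terms that might look like the following (the values are
placeholders; the keys are the Flink 1.0-era options):

# flink-conf.yaml (excerpt)
taskmanager.heap.mb: 4096      # TaskManager JVM heap (-Xmx), must fit the YARN container
yarn.heap-cutoff-ratio: 0.3    # safety margin subtracted from the container size (default 0.25)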
Hi,
When I run the following command I am getting that no topic is available
for that consumer group. I am using
flink-connector-kafka-0.8_${scala.version} (2.11).
./bin/kafka-consumer-groups.sh --zookeeper <> --group <> --describe
No topic available for consumer group provided
Does the Kafka […]
My guess would be that you have a thread leak in the user code.
More memory will not solve the problem, only push it a bit further away.
On Mon, Aug 1, 2016 at 9:15 PM, Paulo Cezar wrote:
> Hi folks,
>
>
> I'm trying to run a DataSet program but after around 200k records are
> processed a "java
Hi All,
I am trying to read Avro data from Kafka using Flink 1.0.3 but I am getting an
error. I have posted this issue on Stack Overflow:
http://stackoverflow.com/questions/38715286/how-to-decode-kafka-messages-using-avro-and-flink
. Is there any mistake we can look into, or is there a better way […]