Hi Vishal,
I think you're right! And thanks for looking into this so deeply.
With your last mail you're basically saying that the checkpoint could not be
restored because your HDFS was temporarily down. If Flink had not deleted that
checkpoint, it might have been possible to restore it at a later time.
Fabian,
Turns out I was wrong. My flow was in fact running as two separate jobs
because I tried to use a local variable calculated by
...distinct().count() in a downstream flow. The second flow did indeed set the
parallelism correctly! Thank you for the help. :)
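For anyone who hits the same thing: in the DataSet API, count() is an eager action that submits its own job, which is why the flow splits in two. A minimal sketch, with illustrative paths:

    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.ExecutionEnvironment;

    public class TwoJobs {
        public static void main(String[] args) throws Exception {
            ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
            DataSet<String> data = env.readTextFile("hdfs:///input"); // illustrative path

            // count() triggers an immediate, separate execution (job #1)
            long n = data.distinct().count();

            // the downstream flow that uses the local variable runs as job #2
            data.writeAsText("hdfs:///output-" + n);                  // illustrative path
            env.execute("downstream flow");
        }
    }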
On Wed, Oct 4, 2017 at 8:01 AM, Fabian Hueske wrote:
Hi Kostas,
I noticed that you commented on FLINK-7549 and FLINK-7606. I have been
monitoring both of these JIRAs.
I have always been using event time as the time characteristic, as you had
suggested, but I continue to see patterns not getting detected. Could you help
shed more light on this? I had shared some code
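For reference, a minimal sketch of the event-time setup in question; note that with event time, patterns also silently never fire if watermarks don't advance, so a timestamp/watermark assigner on the input stream is needed as well:

    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    // Without stream.assignTimestampsAndWatermarks(...), event-time CEP
    // buffers elements indefinitely and patterns appear "not detected".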
I think this is the offending piece. There is a catch-all Exception handler
which, IMHO, should distinguish a recoverable exception from an unrecoverable one.
try {
    completedCheckpoint = retrieveCompletedCheckpoint(checkpointStateHandle);
    if (completedCheckpoint != null) {
        completedCheckpoints.add(completedCheckpoint);
    }
} catch (Exception e) {
    // catch-all: a transient HDFS outage is treated the same as corrupt state
}
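A hedged sketch of the distinction being asked for; the helper and its use are hypothetical, not Flink code:

    import java.io.IOException;

    // Hypothetical classifier: an IOException while reading the checkpoint
    // from HDFS usually means the storage layer is down (recoverable), not
    // that the checkpoint itself is corrupt (unrecoverable).
    static boolean isRecoverable(Exception e) {
        return e instanceof IOException;
    }

A recoverable failure would then abort the recovery attempt and keep the ZooKeeper handle for a later retry, instead of discarding the checkpoint.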
So this is the issue, and it tells us what is wrong. ZK had some state
(backed by HDFS) that referred to a checkpoint (the exact same checkpoint
that had completed successfully before the NN outage hit us). When the
JM tried to recreate the state, it failed to retrieve the CHK handle
because the NN was down.
Also note that the ZooKeeper recovery (which sadly lives on the same HDFS
cluster) showed the same behavior. It has the pointers to the checkpoint
(I think that is what it does: keeps metadata of where the checkpoint
lives, etc.). It too decided to keep the recovery file from the
failed state.
-rw-
Thanks a lot, Carst!
I hadn't realized that
Best regards
> Original message
>From: Carst Tankink ctank...@bol.com
>Subject: Re: kafka consumer parallelism
>To: "r. r."
>Sent on: 05.10.2017 09:04
> Hi,
>
> The latter (the map will be spread out if you rebalance
Another thing I noticed was this:
drwxr-xr-x - root hadoop 0 2017-10-04 13:54
/flink-checkpoints/prod/c4af8dfa864e2f9a51764de9f0725b39/chk-44286
drwxr-xr-x - root hadoop 0 2017-10-05 09:15
/flink-checkpoints/prod/c4af8dfa864e2f9a51764de9f0725b39/chk-45428
Generally what
Hello Fabian,
First of all, congratulations on this fabulous
framework. I have worked with GDF, and though GDF has some natural pluses,
Flink's state management is far more advanced. With Kafka as a source it
negates issues GDF has ( GDF integration with pub/sub is organic and th
That depends on the state backend [1] that you are using.
If you use the RocksDBStateBackend, state is written to RocksDB which
persists to disk.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/ops/state_backends.html
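For completeness, a minimal sketch of switching to the RocksDBStateBackend (the checkpoint URI is an assumption):

    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    // keeps working state in local RocksDB instances (spilling to disk) and
    // checkpoints it to the configured filesystem URI
    env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));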
2017-10-05 14:41 GMT+02:00 Rahul Raj :
> Thanks for the response.
Thanks for the response. So, I am guessing windows in Flink will store the
records in memory before processing them. Correct?
Rahul Raj
On Oct 5, 2017 17:50, "Fabian Hueske" wrote:
> Hi,
>
> I'd suggest taking a look at the window operators [1]. For example, a
> tumbling window of 1 minute can be used to compute metrics every minute.
Hi Yunus,
thanks for reporting this problem.
I opened the JIRA issue FLINK-7764 [1] for this.
Thanks,
Fabian
[1] https://issues.apache.org/jira/browse/FLINK-7764
2017-10-05 13:38 GMT+02:00 Yunus Olgun :
> Hi,
>
> I am using Flink 1.3.2. When I try to use KafkaProducer with timestamps it
> fails to set name, uid or parallelism. It uses default values.
Hi,
I'd suggest taking a look at the window operators [1]. For example, a
tumbling window of 1 minute can be used to compute metrics every minute.
Flink's window operators are very extensible, and you can implement quite
custom logic if the predefined windows don't match your use case. In any
case,
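A minimal sketch of that suggestion (the stream and its field positions are illustrative):

    import org.apache.flink.streaming.api.windowing.time.Time;

    // clicks: DataStream<Tuple2<String, Long>> of (pageId, clickCount) -- assumed
    clicks
        .keyBy(0)                          // group by page
        .timeWindow(Time.minutes(1))       // tumbling 1-minute window
        .sum(1)                            // clicks per page per minute
        .print();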
Hi,
I am using Flink 1.3.2. When I try to use KafkaProducer with timestamps it
fails to set name, uid or parallelism. It uses default values.
———
FlinkKafkaProducer010.FlinkKafkaProducer010Configuration producer =
    FlinkKafkaProducer010
        .writeToKafkaWithTimestamps(stream, topicName, schema, producerProps);
// name, uid and parallelism set on this configuration are ignored;
// the defaults are used instead
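One hedged workaround, if the Kafka 0.10 timestamp feature can be given up: attach the producer as a plain sink, whose DataStreamSink does accept name, uid and parallelism (props is assumed to hold the producer configuration):

    stream
        .addSink(new FlinkKafkaProducer010<>(topicName, schema, props))
        .name("kafka-sink")        // honored on the DataStreamSink
        .uid("kafka-sink")
        .setParallelism(2);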
Hi,
I have to calculate some complicated metrics, like click-through rate,
click value rate and conversions, on real-time data using Flink. But I am
not sure what functionality of Flink I should use to program this, because
it involves collecting some records in memory for a certain time, maybe 1
minute
Hi Vishal,
window operators are always stateful because the operator needs to remember
previously received events (WindowFunction) or intermediate results
(ReduceFunction).
Given the program you described, a checkpoint should include the Kafka
consumer offset and the state of the window operator.
Gordon
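For completeness, that snapshot only happens if checkpointing is enabled; a one-line sketch (env being the StreamExecutionEnvironment, and the interval an assumption):

    // take a checkpoint (Kafka offsets + window state) every 60 seconds
    env.enableCheckpointing(60000);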
Thanks for the detailed response. I have verified your assumption and that
is, unfortunately, the case.
I also looked into creating a custom Kryo serializer but I got stuck on
serializing arrays of complex types. It seems like this isn't trivial at all
with Kryo.
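For what it's worth, a hedged sketch of the registration route (the type and serializer names are hypothetical):

    // register a custom Kryo serializer for the complex element type; Kryo
    // then handles MyComplexType[] via its generic array serialization
    env.getConfig().registerTypeWithKryoSerializer(
        MyComplexType.class, MyComplexTypeSerializer.class);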
As an alternative, I've b