Hi,
the state will be kept indefinitely but we are planning to introduce a
setting that would allow setting a time-to-live on state. I think this is
exactly what you would need. As an alternative, maybe you could implement
your program using windows? In this way you would also bound how long state
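To illustrate what a time-to-live on keyed state could mean, here is a minimal plain-Python sketch. It is not Flink code and not the planned setting: the class `TtlState`, its methods, and the explicit `now` timestamps are all hypothetical names chosen for illustration.

```python
class TtlState:
    """Per-key state that is dropped once it has not been updated for ttl_seconds."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._entries = {}  # key -> (value, time of last update)

    def update(self, key, value, now):
        # Writing a value refreshes the key's lifetime.
        self._entries[key] = (value, now)

    def get(self, key, now):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, last_update = entry
        if now - last_update >= self.ttl:
            # The entry is stale: remove it and behave as if it never existed.
            del self._entries[key]
            return None
        return value

    def expire(self, now):
        """Remove all stale entries and return how many were dropped."""
        stale = [k for k, (_, t) in self._entries.items() if now - t >= self.ttl]
        for k in stale:
            del self._entries[k]
        return len(stale)
```

With such a scheme, a userId that occurs only once would occupy state for at most the configured lifetime rather than indefinitely.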
Hi,
If I specify the userId as the key variable as you suggested, will the
state variables be kept for every observed value of the key? I have a
situation where I have a lot of userIds and many of them occur just once,
so I don't want to keep the state for them forever. I need the possibility
to
Hi,
newly added nodes would sit idle, yes. Only when we finish the rescaling
work mentioned in the link will we be able to dynamically adapt.
The internal implementation of this will in fact hash keys to a larger
number of partitions than the number of individual partitions and use these
"key grou
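The idea described above, hashing keys into more groups than there are parallel instances, can be sketched roughly as follows in plain Python. This is an assumed scheme for illustration only, not Flink's actual implementation: the function names, the use of `zlib.crc32` as the hash, and the contiguous-range assignment are all assumptions.

```python
import zlib


def key_group_for(key, num_key_groups):
    """Map a key to a fixed group; this never changes when the cluster is rescaled."""
    # zlib.crc32 is a stand-in for whatever hash function the system uses.
    return zlib.crc32(str(key).encode("utf-8")) % num_key_groups


def instance_for(key_group, num_key_groups, parallelism):
    """Assign contiguous ranges of key groups to parallel instances."""
    return key_group * parallelism // num_key_groups
```

Because a key's group is independent of the parallelism, rescaling only has to move whole groups between instances instead of re-hashing every key.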
Hi,
So is there any possibility to utilize an extra node that joins the cluster
or will it remain idle?
What if I use a custom key function that matches the key variable to a
number of keys bigger than the initial number of nodes (following the idea
from your link)?
What about running Flink on YARN
Hi,
first question: are you manually keying by "userId % numberOfPartitions"?
Flink internally does roughly "key.hash() % numPartitions" so it is enough
to specify the userId as your key.
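As a rough illustration of the "key.hash() % numPartitions" idea, here is a small plain-Python sketch. It is not Flink code: `partition_for` is a hypothetical helper, and `zlib.crc32` is only a stand-in for the key's hash function.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a key to a partition index using a stable hash."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings.
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


user_ids = ["user-1", "user-2", "user-3", "user-1"]
# Every occurrence of the same userId lands on the same partition.
assignments = [partition_for(u, 4) for u in user_ids]
```

The point is that the system already spreads keys over partitions, so applying a modulo to the key by hand adds nothing.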
Now, for your questions:
1. What Flink guarantees is that the state for a key k is always available
when an el
Hi,
I have the following situation.
- a keyed stream with a key defined as: userId % numberOfPartitions
- a custom flatMap transformation where I use a ValueState variable to keep
the state of some calculations for each userId
- my questions are:
1. Does Flink guarantee that the users with a given
Hi,
right now, this does not work but we are also actively working on that.
This is the design doc for part one of the necessary changes:
https://docs.google.com/document/d/1G1OS1z3xEBOrYD4wSu-LuBCyPUWyFd9l3T9WyssQ63w/edit?usp=sharing
Cheers,
Aljoscha
On Wed, 25 May 2016 at 13:32 Malgorzata Kud
Hi,
Thanks for your reply.
Is Flink able to detect that an additional server joined and rebalance the
processing? How is it done if I have a keyed stream and some custom
ValueState variables?
Cheers,
Gosia
2016-05-25 11:32 GMT+02:00 Aljoscha Krettek :
> Hi Gosia,
> right now, Flink is not doing
Hi Gosia,
right now, Flink is not doing incremental checkpoints. Every checkpoint is
fully valid in isolation. Incremental checkpointing came up several times
on ML discussions and we are planning to work on it once someone finds some
free time.
Cheers,
Aljoscha
On Wed, 25 May 2016 at 09:29 Rubén C
Hi Gosia
You can have a look at the PROTEUS project we are doing [1]. We are
implementing incremental versions of analytics operations. For example you can
see in [2] the implementation of the incremental AVG. Maybe the code can give
you some ideas :-)
[1] https://github.com/proteus-h2020/pr
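For readers unfamiliar with the idea of an incremental average, a minimal plain-Python sketch follows. It is not taken from the PROTEUS code; the class name `IncrementalAverage` and its interface are hypothetical and only show the general update rule.

```python
class IncrementalAverage:
    """Running mean that is updated one element at a time, without storing the elements."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def add(self, x):
        # Shift the current mean towards x by 1/count of the difference.
        self.count += 1
        self.mean += (x - self.mean) / self.count
        return self.mean
```

Only the count and the current mean need to be kept as state per key, which fits the keyed-state setting discussed in this thread.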