Hi Niels,
I was more talking from a theoretical point of view.
Flink does not have a hook to inject a custom hash function (yet). I'm not
familiar with the details of the implementation to make an assessment
whether this would be possible or how much work it would be. However,
several users have a
Hi,
> However, if you would like to keyBy the original key attribute, Flink
would need to have access to the hash function that was used to assign
events to partitions.
So if my producing application and my consuming application use the same
source attributes AND the same hashing function to dete
Hi Niels,
I think the biggest problem for keyed sources is that Flink must be able to
co-locate key-partitioned state with the pre-partitioned data.
This might work, if the key is the partition ID, i.e, not the original key
attribue that was hashed to assign events to partitions.
Flink could need
Hi Niels,
If it’s only for simple data filtering that does not depend on the key, a
simple “flatMap” or “filter" directly after the source can be chained to the
source instances.
What that does is that the filter processing will be done within the same
thread as the one fetching data from a Kaf
Hi,
Ok. I think I get it.
WHAT IF:
Assume we create a addKeyedSource(...) which will allow us to add a source
that makes some guarantees about the data.
And assume this source returns simply the Kafka partition id as the result
of this 'hash' function.
Then if I have 10 kafka partitions I would r
Hi Niels,
Thank you for bringing this up. I recall there was some previous discussion
related to this before: [1].
I don’t think this is possible at the moment, mainly because of how the API is
designed.
On the other hand, a KeyedStream in Flink is basically just a DataStream with a
hash part
Hi,
In my scenario I have click stream data that I persist in Kafka.
I use the sessionId as the key to instruct Kafka to put everything with the
same sessionId into the same Kafka partition. That way I already have all
events of a visitor in a single kafka partition in a fixed order.
When I read