Hi,

Have you tried the key selector function[1]?
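
One caveat, offered as a rough sketch rather than a definitive recipe: as far as I understand it, keyBy does not send key i to subtask i. It hashes key.hashCode() with a murmur hash into a key group, then maps the key group to a subtask, so with only four keys some subtasks can easily end up empty. The plain-Java class below imitates that computation (the real logic lives in Flink's KeyGroupRangeAssignment; the hash constants here are reproduced from memory and should be treated as an assumption) and searches for one integer key per subtask. KeyRemap, subtaskFor and findKeysForSubtasks are hypothetical names made up for this illustration, and 128 is assumed to be the max parallelism.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Flink. It only imitates how keyBy is
// believed to pick a subtask, so the exact numbers are an assumption.
final class KeyRemap {

    // Imitation of Flink's murmur hash for int values (reproduced from memory).
    static int murmurHash(int code) {
        code *= 0xcc9e2d51;
        code = Integer.rotateLeft(code, 15);
        code *= 0x1b873593;
        code = Integer.rotateLeft(code, 13);
        code = code * 5 + 0xe6546b64;
        code ^= 4;
        // Final bit mixing.
        code ^= code >>> 16;
        code *= 0x85ebca6b;
        code ^= code >>> 13;
        code *= 0xc2b2ae35;
        code ^= code >>> 16;
        // Fold the result into a non-negative int.
        if (code >= 0) {
            return code;
        }
        return code != Integer.MIN_VALUE ? -code : 0;
    }

    // key -> key group -> subtask index, following the scheme described above.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = murmurHash(key.hashCode()) % maxParallelism;
        return keyGroup * parallelism / maxParallelism;
    }

    // For every subtask index, find the smallest non-negative int key that
    // lands on it. The result maps "desired subtask" -> "key to emit".
    static int[] findKeysForSubtasks(int parallelism, int maxParallelism) {
        int[] keys = new int[parallelism];
        Arrays.fill(keys, -1);
        int found = 0;
        for (int candidate = 0; candidate < 1_000_000 && found < parallelism; candidate++) {
            int subtask = subtaskFor(candidate, maxParallelism, parallelism);
            if (keys[subtask] == -1) {
                keys[subtask] = candidate;
                found++;
            }
        }
        if (found < parallelism) {
            throw new IllegalStateException("no key found for some subtask");
        }
        return keys;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        int maxParallelism = 128; // assumed; set it explicitly in the job
        // Where the naive keys 0..3 would land under this scheme.
        for (int key = 0; key < parallelism; key++) {
            System.out.println("key " + key + " -> subtask "
                    + subtaskFor(key, maxParallelism, parallelism));
        }
        // Replacement keys, one per subtask.
        System.out.println(Arrays.toString(findKeysForSubtasks(parallelism, maxParallelism)));
    }
}
```

If this matches your Flink version, the key selector passed to keyBy could return keys[p] instead of p, where p is the partition number your own technique produces. In a real job it is probably safer to call KeyGroupRangeAssignment.assignKeyToParallelOperator(key, maxParallelism, parallelism) inside the search loop instead of the imitation above, and to fix the max parallelism with setMaxParallelism so the mapping does not change.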

[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/api_concepts.html#define-keys-using-key-selector-functions

Best,
Congxian


Heidi Hazem Mohamed <h.ha...@nu.edu.eg> wrote on Sunday, October 27, 2019 at 11:04 PM:

> Hi,
>
> What I want: I have my own partitioning technique that generates keys for
> DataStream tuples. The number of distinct keys equals the number of nodes
> in the cluster: for example, if I set the parallelism to 4, the generated
> keys will be 0, 1, 2, and 3. All records with the same key should then go
> to the same node, so that I can do further keyed processing using keyed
> state.
>
> What happened: I implemented my logic with keyBy so that I could use keyed
> state, but it suffers from severe skew: some nodes received no records,
> while others received more than one key. I also tried custom partitioning,
> which did the physical partitioning the way I want, but I cannot use keyed
> state with it unless I use keyBy.
>
> What I expect (questions): Is there a way to control the skew, or to force
> the keys to be spread evenly over the available nodes? Is there a way to
> override the partitioning that keyBy uses? Or is there a way to use keyed
> state with custom partitioning?
>
> *Best Regards*
>
> *Heidy Hazem*
>
> *Heidy Hazem–* *Teaching assistant, School of Information Technology and
> Computer Science (Formerly, CIT, Communication Engineering, and Information
> Technology School)*
> *T*: +201000 63-25-63   *office*: UB1-room 701
>
> 26th July Corridor, Sheikh Zayed, Giza, Egypt
> *www.nu.edu.eg* <http://www.nu.edu.eg/>* | *
> *www.facebook.com/NileUniversity* <http://www.facebook.com/NileUniversity>
>
>
