Eason Ye created FLINK-36371: -------------------------------- Summary: Field BinaryStringData as keySelector would lead the same records hash shuffle into different tumble windows Key: FLINK-36371 URL: https://issues.apache.org/jira/browse/FLINK-36371 Project: Flink Issue Type: Improvement Components: API / Type Serialization System Affects Versions: 2.0.0 Reporter: Eason Ye
The pipeline is Source(Kafka serialize msg to RowData type) -> FlatMap --(keyby)–> TumbleProcessWindow -> sink. The Source would emit two fields: UserName String, and Amount Integer. FlatMap collects the BinaryRowData type record and emits it to the TumbleProcessWindow operator. KeySelector is using the UserName BinaryStringData. key by code is : {code:java} .keyBy(rowData -> rowData.getString(0)){code} The window result is unexpected, the same username records arrived at TumbleProcessWindow simultaneously, but these records were calculated in the different windows. When I use the below keyBy, the window result is correct. {code:java} .keyBy(rowData -> rowData.getString(0).toString()){code} -- This message was sent by Atlassian Jira (v8.20.10#820010)