[ https://issues.apache.org/jira/browse/FLINK-29091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
lincoln lee updated FLINK-29091:
--------------------------------
    Description:
RAND and RAND_INTEGER are declared as dynamic functions (isDynamicFunction returns true). According to that declaration, they should be evaluated only once at query level (not per record) in batch mode; FLINK-21713 made a similar fix for temporal functions.

But the current behavior is that of a fully non-deterministic function, evaluated per record in both batch and streaming mode. Breaking the current behavior is not a good choice, and the determinism of the RAND function also differs across vendors:

[1] SQL Server: evaluated at query level, although it is treated as a non-deterministic function
https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions?view=sql-server-ver16#built-in-function-determinism
[2] MySQL: evaluated at row level
https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand
[3] Oracle: evaluated at row level if no seed is specified, e.g., DBMS_RANDOM.normal, DBMS_RANDOM.value(1,10)
https://docs.oracle.com/database/timesten-18.1/TTPLP/d_random.htm#TTPLP71231

So simply keeping the current behavior and updating the declaration of these two functions to non-deterministic avoids any impact on users and makes the semantics clear.

  was:
RAND and RAND_INTEGER are declared as dynamic functions (isDynamicFunction returns true). According to that declaration, they should be evaluated only once at query level (not per record) in batch mode; FLINK-21713 made a similar fix for temporal functions.
But the current behavior is that of a fully non-deterministic function, evaluated per record in both batch and streaming mode. Breaking the current behavior is not a good choice, and the determinism of the RAND function also differs across vendors:

[1] SQL Server: evaluated at query level, although it is treated as a non-deterministic function
https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions?view=sql-server-ver16#built-in-function-determinism
[2] MySQL: evaluated at row level
https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand
[3] Oracle: evaluated at row level if no seed is specified, e.g., DBMS_RANDOM.normal, DBMS_RANDOM.value(1,10)
https://docs.oracle.com/database/timesten-18.1/TTPLP/d_random.htm#TTPLP71231

So keeping the current behavior and updating the declaration of these two functions to non-deterministic avoids any impact on users and makes the semantics clear.

> Fix the determinism declaration of the rand function to be consistent with
> the current behavior
> -----------------------------------------------------------------------------------------------
>
>                 Key: FLINK-29091
>                 URL: https://issues.apache.org/jira/browse/FLINK-29091
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / Planner
>            Reporter: lincoln lee
>            Priority: Major
>
> RAND and RAND_INTEGER are declared as dynamic functions (isDynamicFunction
> returns true). According to that declaration, they should be evaluated only
> once at query level (not per record) in batch mode; FLINK-21713 made a
> similar fix for temporal functions.
> But the current behavior is that of a fully non-deterministic function,
> evaluated per record in both batch and streaming mode. Breaking the current
> behavior is not a good choice, and the determinism of the RAND function also
> differs across vendors:
>
> [1] SQL Server: evaluated at query level, although it is treated as a
> non-deterministic function
> https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions?view=sql-server-ver16#built-in-function-determinism
> [2] MySQL: evaluated at row level
> https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand
> [3] Oracle: evaluated at row level if no seed is specified, e.g.,
> DBMS_RANDOM.normal, DBMS_RANDOM.value(1,10)
> https://docs.oracle.com/database/timesten-18.1/TTPLP/d_random.htm#TTPLP71231
>
> So simply keeping the current behavior and updating the declaration of these
> two functions to non-deterministic avoids any impact on users and makes the
> semantics clear.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
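
To make the two evaluation semantics discussed in the description concrete, here is a minimal, self-contained Java sketch. It does not use Flink or Calcite; the class and method names (RandEvaluationSketch, evaluatePerQuery, evaluatePerRecord) are hypothetical and only model the difference between "evaluated once at query level" and "evaluated per record". It is not how the planner implements either behavior.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.DoubleSupplier;

// Illustrative model only; none of these names are Flink or Calcite APIs.
final class RandEvaluationSketch {

    // Query-level ("dynamic function") semantics: the expression is evaluated
    // once for the whole query and the same value is reused for every record.
    static List<Double> evaluatePerQuery(int rowCount, DoubleSupplier rand) {
        double once = rand.getAsDouble();
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            out.add(once);
        }
        return out;
    }

    // Per-record (non-deterministic) semantics: the expression is evaluated
    // again for each record, so each row may see a different value.
    static List<Double> evaluatePerRecord(int rowCount, DoubleSupplier rand) {
        List<Double> out = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            out.add(rand.getAsDouble());
        }
        return out;
    }

    public static void main(String[] args) {
        // Fixed seed only so that repeated runs of the demo print the same values.
        Random random = new Random(42L);
        System.out.println("per query:  " + evaluatePerQuery(3, random::nextDouble));
        System.out.println("per record: " + evaluatePerRecord(3, random::nextDouble));
    }
}
```

In this model, the per-query variant corresponds to what the dynamic-function declaration implies for batch mode, while the per-record variant corresponds to the current RAND and RAND_INTEGER behavior that the issue proposes to keep and declare as non-deterministic.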