lincoln-lil opened a new pull request, #20674:
URL: https://github.com/apache/flink/pull/20674

   ## What is the purpose of the change
   RAND and RAND_INTEGER are declared as dynamic functions (isDynamicFunction 
returns true). Under that declaration they should be evaluated only once at 
query level (not per record) in batch mode. 
[FLINK-21713](https://issues.apache.org/jira/browse/FLINK-21713) made a 
similar fix for temporal functions.
   
   However, the current behavior is that of a fully non-deterministic function 
that is evaluated per record in both batch and streaming mode. Breaking the 
current behavior is not a good choice, and the determinism of the RAND 
function also differs across vendors:
   
   [1] SQL Server: evaluated at query level, although it is treated as a 
non-deterministic function: 
[https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions?view=sql-server-ver16#built-in-function-determinism](https://docs.microsoft.com/en-us/sql/relational-databases/user-defined-functions/deterministic-and-nondeterministic-functions?view=sql-server-ver16#built-in-function-determinism)
   
   [2] MySQL: evaluated at row level: 
[https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand](https://dev.mysql.com/doc/refman/5.7/en/mathematical-functions.html#function_rand)
   
   [3] Oracle: evaluated at row level if no seed is specified, e.g., 
DBMS_RANDOM.normal, DBMS_RANDOM.value(1,10): 
[https://docs.oracle.com/database/timesten-18.1/TTPLP/d_random.htm#TTPLP71231](https://docs.oracle.com/database/timesten-18.1/TTPLP/d_random.htm#TTPLP71231)
   
   So this change only fixes the determinism declaration of RAND and 
RAND_INTEGER to be consistent with the current behavior, and makes that 
behavior clear in the documentation.
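
To make the distinction concrete, here is a minimal sketch of the two evaluation semantics discussed above. It is plain Python and not Flink or Calcite code: `CountingRand`, `eval_per_record`, and `eval_per_query` are hypothetical names introduced only for illustration, and the sketch assumes that "query-level" evaluation means calling the function once and reusing the value for every row.

```python
import random
from typing import Callable, List


def eval_per_record(fn: Callable[[], float], rows: List[int]) -> List[float]:
    """Non-deterministic semantics: invoke fn once for every input row."""
    return [fn() for _ in rows]


def eval_per_query(fn: Callable[[], float], rows: List[int]) -> List[float]:
    """Dynamic-function semantics: invoke fn once, reuse the value for all rows."""
    value = fn()
    return [value for _ in rows]


class CountingRand:
    """Hypothetical stand-in for RAND() that records how often it is called."""

    def __init__(self, seed: int) -> None:
        self._rng = random.Random(seed)
        self.calls = 0

    def __call__(self) -> float:
        self.calls += 1
        return self._rng.random()


rows = [1, 2, 3]

# Current behavior described in this PR: one call per record.
per_record_rand = CountingRand(seed=0)
per_record = eval_per_record(per_record_rand, rows)

# What a dynamic-function declaration would imply in batch mode: one call per query.
per_query_rand = CountingRand(seed=0)
per_query = eval_per_query(per_query_rand, rows)

print(per_record_rand.calls, per_query_rand.calls)  # prints "3 1"
```

With per-query evaluation every row sees the same value, which would change the results of existing queries. That is why the declaration, rather than the behavior, is adjusted here.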
   
   ## Brief change log
   - Does not change any processing logic
   - Only updates the function definitions of RAND and RAND_INTEGER
   
   ## Verifying this change
   Verified by ExpressionReductionRulesTest and NonDeterministicTests.
   
   ## Does this pull request potentially affect one of the following parts:
     - Dependencies (does it add or upgrade a dependency): (no)
     - The public API, i.e., is any changed class annotated with 
@Public(Evolving): (no)
     - The serializers: (no)
     - The runtime per-record code paths (performance sensitive): (no)
     - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
     - The S3 file system connector: (no)
   
   ## Documentation
     - Does this pull request introduce a new feature? (no)
   


