[ https://issues.apache.org/jira/browse/FLINK-22201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318974#comment-17318974 ]
Kurt Young commented on FLINK-22201:
------------------------------------

[~jamii] Thanks for reporting this. Could you provide some example data that would help us find the bug? The query is so simple that I can't recall any potential bug around it.

> Incorrect output for simple sql query
> -------------------------------------
>
> Key: FLINK-22201
> URL: https://issues.apache.org/jira/browse/FLINK-22201
> Project: Flink
> Issue Type: Bug
> Components: Table SQL / API
> Affects Versions: 1.12.2
> Environment: {code:bash}
> [nix-shell:~/streaming-consistency/flink]$ java -version
> openjdk version "1.8.0_265"
> OpenJDK Runtime Environment (build 1.8.0_265-ga)
> OpenJDK 64-Bit Server VM (build 25.265-bga, mixed mode)
> [nix-shell:~/streaming-consistency/flink]$ flink --version
> Version: 1.12.2, Commit ID: 4dedee0
> [nix-shell:~/streaming-consistency/flink]$ nix-info
> system: "x86_64-linux", multi-user?: yes, version: nix-env (Nix) 2.3.10,
> channels(jamie): "", channels(root): "nixos-20.09.3554.f8929dce13e", nixpkgs:
> /nix/var/nix/profiles/per-user/root/channels/nixos
> {code}
> Reporter: Jamie Brandon
> Priority: Major
>
> I'm running this simple query:
> {code:sql}
> CREATE VIEW credits AS
> SELECT
>     to_account AS account,
>     sum(amount) AS credits
> FROM
>     transactions
> GROUP BY
>     to_account;
> CREATE VIEW debits AS
> SELECT
>     from_account AS account,
>     sum(amount) AS debits
> FROM
>     transactions
> GROUP BY
>     from_account;
> CREATE VIEW balance AS
> SELECT
>     credits.account AS account,
>     credits - debits AS balance
> FROM
>     credits,
>     debits
> WHERE
>     credits.account = debits.account;
> CREATE VIEW total AS
> SELECT
>     sum(balance)
> FROM
>     balance;
> {code}
> The `total` view is a sanity check - its value should always be 0 because
> money is only moved from one account to another, never created or destroyed.
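For anyone who wants a small reference point while looking at this, below is a minimal sketch that runs the same four views against a tiny hand-written dataset in SQLite (via Python's standard `sqlite3` module) and checks the `total` sanity condition. This is not the reporter's data or harness and it is not Flink: the table schema, the `TRANSACTIONS` sample rows and the `final_total` helper are assumptions made up for illustration. The sample rows are chosen so that every account appears as both a sender and a receiver, because the inner join in `balance` only keeps accounts that are present in both `credits` and `debits`.

```python
import sqlite3

# Hypothetical sample data, NOT from the issue: (from_account, to_account, amount).
# Every account both sends and receives, so the inner join keeps all of them.
TRANSACTIONS = [
    (1, 2, 10),
    (2, 3, 25),
    (3, 1, 5),
    (1, 3, 7),
]

# The views from the issue. The only change is that the columns in `balance`
# are table-qualified (credits.credits, debits.debits) to avoid any ambiguity.
VIEWS = """
CREATE VIEW credits AS
SELECT to_account AS account, sum(amount) AS credits
FROM transactions GROUP BY to_account;

CREATE VIEW debits AS
SELECT from_account AS account, sum(amount) AS debits
FROM transactions GROUP BY from_account;

CREATE VIEW balance AS
SELECT credits.account AS account,
       credits.credits - debits.debits AS balance
FROM credits, debits
WHERE credits.account = debits.account;

CREATE VIEW total AS
SELECT sum(balance) FROM balance;
"""


def final_total(transactions):
    """Load all transactions, then return the single value of the `total` view."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: the issue does not show the table definition.
        conn.execute(
            "CREATE TABLE transactions ("
            "from_account INTEGER, to_account INTEGER, amount INTEGER)"
        )
        conn.executescript(VIEWS)
        conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", transactions)
        return conn.execute("SELECT * FROM total").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    print(final_total(TRANSACTIONS))
```

With these rows the per-account balances are -12, -15 and 27, so the final total is 0. A dataset in which some account only sends or only receives would be dropped by the join and could legitimately give a nonzero total, which is worth keeping in mind when picking example data.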
> In streaming mode (code
> [here|https://github.com/jamii/streaming-consistency/tree/a0f3b9d7ba178a7e184e6cb60e597a302dc3dd86/flink-table])
> only about 0.04% of the output values are 0. The absolute error in the
> outputs increases roughly linearly with respect to the number of input
> transactions. But after the inputs are finished it does return to 0.
> In batch mode (code
> [here|https://github.com/jamii/streaming-consistency/tree/d3288e27649174c7463829c726be514610bbd056/flink])
> it produces 0 for a while but then has large jumps to incorrect outputs and
> never returns to 0. In this run, the first ~44% of the outputs are correct
> but the final answer is -48811, which amounts to miscounting ~5% of the inputs.
> I also ran a variant of that query which joins on event time. In streaming
> mode it produces similar results to the original. In batch mode only 2 out of
> 1718375 outputs were correct and the final error was similar to the original
> query.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)