Hi kingofthecity, > > But when I change the result-mode from TABLE to CHANGELOG, > there are 4 rows for each update like: > The results are expected when the result-mode is `CHANGELOG`.
> SELECT sector, avg(`value`) as `index` FROM stock INNER JOIN metadata ON > stock.id=metadata.id GROUP BY sector; > In this case, the topology is source -> join -> aggregate. For an update for a stock source, let's say (id1, 1) -> (id1,2). When the input record (id1, 2) arrives at join operator. the output of join would be two record: -U (id1, 1, 'Technology'), + U(id1, 2, 'Technology') Those two record arrives at downstream aggregate, each record would leads to two output record (-U, +U). so each update at source would lead to 4 output records. Hope this helps. Best, JING ZHANG thekingofcity <kingofthec...@protonmail.com> 于2021年10月20日周三 下午5:52写道: > Hi, > > I'm working on a simple sql that contains aggregation. > > The results looks fine in the `SQL Query Result (Table)` > but looks weird when I change the result-mode to CHANGELOG. > > ``` > CREATE TABLE stock ( > id VARCHAR(10) NOT NULL PRIMARY KEY, > `value` DOUBLE NOT NULL > ) WITH ( > 'connector' = 'pravega', > 'controller-uri' = 'tcp://pravega:9090', > 'scope' = 'stock', > 'scan.execution.type' = 'streaming', > 'scan.streams' = 'dbserver1.stock.stock', > 'format' = 'debezium-json' > ); > > CREATE TABLE metadata ( > id VARCHAR(10) NOT NULL PRIMARY KEY, > sector STRING NOT NULL > ) WITH ( > 'connector' = 'pravega', > 'controller-uri' = 'tcp://pravega:9090', > 'scope' = 'stock', > 'scan.execution.type' = 'streaming', > 'scan.streams' = 'dbserver1.stock.metadata', > 'format' = 'debezium-json' > ); > > SELECT sector, avg(`value`) as `index` FROM stock INNER JOIN metadata ON > stock.id=metadata.id GROUP BY sector; > ``` > > The sql should return a continuously updating stock average numbers with > small changes: > > And this is exactly what I see in the `SQL Query Result (Table)`. > > ``` > sector index > Technology 115.95986429850261 > Consumer Cyclical 482.5149917602539 > Industrials 167.49693044026694 > ``` > > But when I change the result-mode from TABLE to CHANGELOG, > there are 4 rows for each update like: > > ``` > -U[Technology, 115.77986653645833] > +U[Technology, 103.54979705810547] > -U[Technology, 103.54979705810547] > +U[Technology, 115.81319681803386] > > -U[Technology, 115.81319681803386] > +U[Technology, 103.42479705810547] > -U[Technology, 103.42479705810547] > +U[Technology, 115.82653299967448] > > -U[Technology, 115.82653299967448] > +U[Technology, 140.48730087280273] > -U[Technology, 140.48730087280273] > +U[Technology, 115.84153493245442] > > -U[Industrials, 167.40963745117188] > +U[Industrials, 140.22450256347656] > -U[Industrials, 140.22450256347656] > +U[Industrials, 167.43633524576822] > ``` > > > I insert blank line every 4 lines to get a better visualization. > > The technology sector should mostly be around 115 +/- 5, > but there are 103, 140, and 167 which are wired since stock value can't > fluctuate like this > and I can be sure because TABLE and my own python code shows the same > correct (small changes) outcome. > > With many thanks, > kingofthecity > >