Hi,

Our job was experiencing high write amplification on aggregates, so we
decided to give mini-batch a go. There are a few things I've noticed that
are different from our previous job, and I would like some clarification.

1) Our operators now say they have Watermarks. We never explicitly added
watermarks, and our state is essentially unbounded across all time since it
consumes from Debezium and reshapes our database data into another store.
Why does it say we have Watermarks then?

2) In our sources I see MiniBatchAssigner(interval=[1000ms],
mode=[ProcTime]). What does that do?

3) I don't really see anything else different yet in the shape of our plan,
even though we've turned on

    configuration.setString(
        "table.optimizer.agg-phase-strategy",
        "TWO_PHASE"
    )

Is there a way to check that this optimization is on? We use user-defined
aggregate functions (UDAFs); does it work for those?
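
For context, the kind of check I have in mind is just scanning the printed
plan for a local/global aggregate pair. Below is a rough sketch in plain Java.
It assumes a two-phase aggregation shows up in the plan text as
LocalGroupAggregate and GlobalGroupAggregate nodes, which is my assumption
and not something I have confirmed. PlanCheck and looksTwoPhase are
hypothetical names of my own, not part of Flink.

```java
// Hypothetical helper: the class and method names are illustrative only.
final class PlanCheck {
    private PlanCheck() {}

    // Returns true when the plan text contains both a local and a global
    // group-aggregate node. The node names are an assumption about how the
    // planner prints a two-phase aggregation, not a documented contract.
    static boolean looksTwoPhase(String planText) {
        return planText.contains("LocalGroupAggregate")
            && planText.contains("GlobalGroupAggregate");
    }
}
```

The idea would be to pass in whatever string the job's explain output
produces. If this returns false for our UDAF queries, that would at least
tell us the split is not being applied there.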

Thanks!

-- 

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/>
FOLLOW US <https://twitter.com/remindhq> | LIKE US <https://www.facebook.com/remindhq>
