Thanks, Guozhang. I've been thinking about the following approach: https://imgur.com/a/pP92Z
Does this approach make sense? A key consideration is that the product dimension table updates must be processed and added to Kafka before the corresponding purchase transaction record is processed.

On 17 October 2017 at 02:15, Guozhang Wang <wangg...@gmail.com> wrote:

> Hello Chris,
>
> The global table described in KIP-99 keeps the most recent snapshot of
> the table when applying updates, i.e. it is like type 1: overwrite. So
> when a table or stream is joined with the global table, it is always
> joined with the most recent values of the global table.
>
> However, note that in the Kafka Streams API, joining streams are
> synchronized based on their incoming records' timestamps (i.e. the
> library will choose which records to process next, either from the
> global dimension table's changelog or from the fact table's changelog,
> based on their stream time, on a best-effort basis), so if you have an
> updated value on the fact table, that update's timestamp will be
> aligned with the current updates on the global table as well.
>
> Guozhang
>
> On Mon, Oct 16, 2017 at 12:51 PM, chris snow <chsnow...@gmail.com> wrote:
>
> > The Streams global KTable wiki page [1] describes a data warehouse style
> > operation whereby dimension tables are joined to fact tables.
> >
> > I'm interested in whether this approach works for type 2 slowly changing
> > dimensions [2]? In type 2 SCD the dimension record history is preserved
> > and the fact table record is joined to the appropriate version of the
> > dimension table record.
> >
> > --
> > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-99%3A+Add+Global+Tables+to+Kafka+Streams
> > [2] https://en.m.wikipedia.org/wiki/Slowly_changing_dimension
>
> --
> -- Guozhang
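For concreteness, the type 2 SCD lookup being discussed, where a fact record joins against the dimension version that was effective at the fact's timestamp rather than the latest version (which is all a type 1 GlobalKTable retains), could be sketched like this. This is a minimal in-memory illustration, not the Kafka Streams API; the data and names are hypothetical:

```python
import bisect

# Hypothetical type 2 SCD history: every version of each product is kept,
# tagged with the timestamp at which it became effective (sorted ascending).
# A type 1 / GlobalKTable store would keep only the latest entry per key.
dim_history = {
    "p1": [
        (100, {"category": "books"}),
        (200, {"category": "ebooks"}),
    ],
}

def lookup(product_id, fact_ts):
    """Return the dimension version effective at fact_ts, or None."""
    versions = dim_history.get(product_id, [])
    timestamps = [ts for ts, _ in versions]
    # Find the last version whose valid_from is <= the fact's timestamp.
    i = bisect.bisect_right(timestamps, fact_ts) - 1
    if i < 0:
        return None  # fact predates any known dimension version
    return versions[i][1]
```

With this layout, a fact timestamped 150 resolves to the "books" version while one timestamped 250 resolves to "ebooks", which is the join behavior a plain global table lookup cannot provide once the dimension row has been overwritten.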