Hi Chris,

You can only join on the key of the table, so I don't think this would work as is. Also, the global table is updated in a different thread, and there is no guarantee that it would have been updated before the purchase is processed.
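To make the first point concrete, here is a rough sketch in plain Java with no Kafka dependency. The class `LatestSnapshotJoin`, the `productTable` map and the `join` helper are all made up for illustration and are not Kafka Streams APIs; the map only mimics a table that keeps the latest value per key and can be looked up by that key alone.

```java
import java.util.HashMap;
import java.util.Map;

final class LatestSnapshotJoin {
    // Stand-in for the global table (an assumption, not the real API):
    // one row per key, and a newer write overwrites the older one.
    static final Map<String, String> productTable = new HashMap<>();

    // Stand-in for the join: the purchase is matched by the table's key only.
    static String join(String productId, int quantity) {
        return quantity + " x " + productTable.get(productId);
    }

    public static void main(String[] args) {
        productTable.put("p1", "Widget @ 10.00");
        productTable.put("p1", "Widget @ 12.00"); // overwrites; the 10.00 row is gone

        // Whichever purchase is looked up now sees only the latest row.
        System.out.println(join("p1", 2)); // prints "2 x Widget @ 12.00"
    }
}
```

With a single key per product there is nothing left to join an older purchase against once the row has been overwritten.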

Perhaps you could do it by making the key of the product table versioned, and then have the purchase reference the versioned key? You would have the version history and be able to join with the appropriate product version. However, there is still the possibility that the data in the global table wasn't updated before the purchase.

Thanks,
Damian

On Tue, 17 Oct 2017 at 12:59 chris snow <chsnow...@gmail.com> wrote:

> Thanks, Guozhang.
>
> I've been thinking about the following approach: https://imgur.com/a/pP92Z
>
> Does this approach make sense?
>
> A key consideration will be that the product dimension table updates are
> processed and added to kafka before the corresponding purchase transaction
> record is processed.
>
> On 17 October 2017 at 02:15, Guozhang Wang <wangg...@gmail.com> wrote:
>
> > Hello Chris,
> >
> > The global table described in KIP-99 will keep the most recent snapshot of
> > the table when applying updates to the table, i.e. it is like type 1:
> > overwrite. So when a table or stream is joined with the global table, it is
> > always joined with the most recent values of the global table.
> >
> > However, note that in Kafka Streams api, joining streams are synchronized
> > based on their incoming record's timestamps (i.e. the library will choose
> > which records to process next, either from the global dimension table's
> > changelog, or from the fact table's changelog, based on their stream time
> > in the best effort), so if you have an updated value on the fact table,
> > that update's timestamp will be aligned with the current updates on the
> > global table as well.
> >
> > Guozhang
> >
> > On Mon, Oct 16, 2017 at 12:51 PM, chris snow <chsnow...@gmail.com> wrote:
> >
> > > The streams global ktable wiki page [1] describes a data warehouse style
> > > operation whereby dimension tables are joined to fact tables.
> > >
> > > I’m interested in whether this approach works for type 2 slowly changing
> > > dimensions [2]? In type 2 scd the dimension record history is preserved
> > > and the fact table record is joined to the appropriate version of the
> > > dimension table record.
> > >
> > > —
> > > [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-99%3A+Add+Global+Tables+to+Kafka+Streams
> > > [2] https://en.m.wikipedia.org/wiki/Slowly_changing_dimension
> >
> > --
> > -- Guozhang
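
For what it's worth, the versioned-key idea from the top of this message could look roughly like the sketch below. It is plain Java with a map standing in for the product table; `VersionedKeyJoin`, `versionedKey` and the `id:version` key format are assumptions for illustration only, not anything Kafka Streams provides.

```java
import java.util.HashMap;
import java.util.Map;

final class VersionedKeyJoin {
    // Stand-in for the product table: each product version has its own key,
    // so an update adds a row instead of overwriting the previous one.
    static final Map<String, String> productVersions = new HashMap<>();

    // Hypothetical composite key format "<productId>:<version>".
    static String versionedKey(String productId, int version) {
        return productId + ":" + version;
    }

    // The purchase carries the versioned key. Returns null when that
    // version has not reached the table yet.
    static String join(String versionedProductKey, int quantity) {
        String product = productVersions.get(versionedProductKey);
        return product == null ? null : quantity + " x " + product;
    }

    public static void main(String[] args) {
        productVersions.put(versionedKey("p1", 1), "Widget @ 10.00");
        productVersions.put(versionedKey("p1", 2), "Widget @ 12.00");

        // A purchase made against version 1 still finds the old row.
        System.out.println(join(versionedKey("p1", 1), 2)); // prints "2 x Widget @ 10.00"

        // A purchase referencing a version the table has not seen yet finds nothing.
        System.out.println(join(versionedKey("p1", 3), 1)); // prints "null"
    }
}
```

The second lookup shows the remaining gap: versioning preserves history, but it does not guarantee that the referenced version is already in the table when the purchase is processed.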