We call this a Slowly Changing Dimensions join. There was a previous effort
<https://docs.google.com/document/d/1LDY_CtsOJ8Y_zNv1QtkP6AGFrtzkj1q5EW_gSChOIvg/edit>
to add this to Beam, and it is partially implemented in Java
<https://github.com/apache/beam/pull/11477>. Unfortunately we haven't
finished the work on the SQL side, and I haven't looked into what is involved
in finishing it. It might be possible to do this just by using PeriodicImpulse
to refresh your static dataset, but it might also require changes to the SQL
extension in BeamSideInputJoinRel
<https://github.com/apache/beam/blob/243128a8fc52798e1b58b0cf1a271d95ee7aa241/sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/impl/rel/BeamSideInputJoinRel.java>.
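
To make the term concrete, here is a rough plain-Java sketch of the semantics
of a Slowly Changing Dimensions join. It deliberately uses no Beam APIs: the
class ScdJoinSketch and its methods refresh and lookup are names made up for
this illustration, and timestamps are plain milliseconds. The idea is that
each stream element is enriched with the version of the "static" table that
was in effect at the element's timestamp.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration only: none of these names come from Beam.
class ScdJoinSketch {
    // Snapshots of the "static" table, keyed by the time each one took effect.
    private final TreeMap<Long, Map<String, String>> versions = new TreeMap<>();

    // Record a refreshed copy of the table, effective from effectiveMillis.
    void refresh(long effectiveMillis, Map<String, String> snapshot) {
        versions.put(effectiveMillis, new HashMap<>(snapshot));
    }

    // Enrich one stream element: find the newest snapshot that took effect
    // at or before the element's timestamp, then look the key up in it.
    Optional<String> lookup(long eventMillis, String key) {
        Map.Entry<Long, Map<String, String>> version = versions.floorEntry(eventMillis);
        if (version == null) {
            return Optional.empty(); // no snapshot existed yet at that time
        }
        return Optional.ofNullable(version.getValue().get(key));
    }
}
```

In a real pipeline the refresh calls would correspond to whatever periodically
reloads the static data, and lookup to the per-element enrichment step. An
element timestamped before the first refresh gets no match here, which is one
of the edge cases an actual implementation would have to decide how to handle.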

You could also take a look at this Stack Overflow post for ideas:
https://stackoverflow.com/questions/41570276/joining-a-stream-against-a-table-in-dataflow

Andrew

On Fri, May 7, 2021 at 9:45 AM Talat Uyarer <tuya...@paloaltonetworks.com>
wrote:

> Hi,
>
> Based on the Join documentation, if I have a Join with Unbounded and Bounded inputs:
>
>> For this type of JOIN bounded input is treated as a side-input by the
>> implementation. This means that window/trigger is inherited from upstream.
>
>
> In my pipeline I don't have any triggering or windowing. I use a global window
> on the Unbounded side. Basically, I read data from Kafka and I want to join it
> with static data to enrich the Kafka messages. Occasionally I want to
> update my static data. I am trying to understand how the join can pick up
> the changes when I update my static data.
>
> Thanks
>
