Hi Andrew, I assume the SQL query itself is not going to change. What changes is the Row schema, by adding new columns or renaming existing ones. If we keep version information somewhere, for example as a KV pair where the key is the schema information and the value is the Row, couldn't we generate the SQL code? The reason I am asking: we have 15k pipelines. When we have a schema change, we restart 15k Dataflow jobs, which is painful. I am looking for a possible way to avoid the job restarts. Do you think it is still not doable?
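To make the KV idea concrete, here is a minimal sketch (not Beam code, all names hypothetical): each record carries a schema version, and a registry maps versions to field lists so a reader on the newest schema can still decode older records:

```python
import json

# Hypothetical schema registry: version -> ordered field names.
SCHEMA_REGISTRY = {
    1: ["user_id", "amount"],
    2: ["user_id", "amount", "currency"],  # column added in v2
}

def encode(version, row_values):
    # Tag each record with its schema version so readers can evolve.
    return json.dumps({"v": version, "row": row_values})

def decode(payload, target_version=2):
    msg = json.loads(payload)
    writer_fields = SCHEMA_REGISTRY[msg["v"]]
    target_fields = SCHEMA_REGISTRY[target_version]
    row = dict(zip(writer_fields, msg["row"]))
    # Columns missing from older schema versions come back as None.
    return {f: row.get(f) for f in target_fields}

# A v1 record decoded against the v2 schema:
# decode(encode(1, [42, 9.5]))
# -> {"user_id": 42, "amount": 9.5, "currency": None}
```

The open question is whether Beam SQL could consume something like this, since the generated Java code and the Row coder are fixed at submission time.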
Thanks

On Mon, Dec 7, 2020 at 6:10 PM Andrew Pilloud <apill...@google.com> wrote:

> Unfortunately we don't have a way to generate the SQL Java code on the
> fly, and even if we did, that wouldn't solve your problem. I believe our
> recommended practice is to run both the old and the new pipeline for some
> time, then pick a window boundary to transition the output from the old
> pipeline to the new one.
>
> Beam doesn't handle changing the format of data sent between intermediate
> steps in a running pipeline. Beam uses "coders" to serialize data between
> steps of the pipeline. The built-in coders (including the schema Row coder
> used by SQL) have a fixed data format and don't handle schema evolution.
> They are optimized for performance at all costs.
>
> Even if you worked around this, the Beam model doesn't support changing
> the structure of the pipeline graph. This would significantly limit the
> changes you could make. It would also require some changes to SQL to try
> to produce the same plan for an updated SQL query.
>
> Andrew
>
> On Mon, Dec 7, 2020 at 5:44 PM Talat Uyarer <tuya...@paloaltonetworks.com>
> wrote:
>
>> Hi,
>>
>> We are using BeamSQL in our pipeline. Our data is written in Avro format,
>> and we generate our Rows based on our Avro schema. Over time the schema
>> changes. I believe Beam SQL generates Java code based on the BeamSchema we
>> define when submitting the pipeline. Do you have any idea how we can
>> handle schema changes without resubmitting our Beam job? Is it possible
>> to generate the SQL Java code on the fly?
>>
>> Thanks
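Andrew's point about fixed-format coders can be illustrated with a toy sketch (again, not Beam's actual Row coder, just an analogous positional encoding): fields are packed by position with no schema metadata, so a reader built against the old schema cannot parse bytes written with the new one:

```python
import struct

# Toy fixed-format "coder": fields packed positionally, no schema metadata,
# mimicking the fixed-layout assumption of a schema Row coder.
def encode_v1(user_id, amount):
    return struct.pack(">qd", user_id, amount)   # 8-byte int, 8-byte float

def decode_v1(data):
    return struct.unpack(">qd", data)

def encode_v2(user_id, amount, currency_code):
    # v2 adds a field; the byte layout is now incompatible with v1 readers.
    return struct.pack(">qdi", user_id, amount, currency_code)

# decode_v1(encode_v2(...)) raises struct.error: the v1 reader expects
# exactly 16 bytes, but the v2 payload is 20 bytes.
```

This is why an in-place schema change mid-pipeline corrupts or rejects records, and why the recommendation is to drain on a window boundary and cut over to a new job.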