You can still be part of an Apache Beam User Experience Research

2020-12-07 Thread Carlos Camacho Frausto
Hello, Are you currently learning how to use Apache Beam? We’d like to invite you to *provide feedback on your experience discovering, learning, and using Apache Beam* via a user experience research study. The goal of this effort is to better understand the friction and pain points that Apache Beam us

About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Talat Uyarer
Hi, We are using Beam SQL in our pipeline. Our data is written in Avro format. We generate our rows based on our Avro schema. Over time the schema changes. I believe Beam SQL generates Java code based on what we define as the Beam schema while submitting the pipeline. Do you have any idea how can we

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Andrew Pilloud
Unfortunately we don't have a way to generate the SQL Java code on the fly, and even if we did, that wouldn't solve your problem. I believe our recommended practice is to run both the old and new pipeline for some time, then pick a window boundary to transition the output from the old pipeline to the n
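The cutover at a window boundary mentioned in this reply can be pictured with a small sketch. This is not Beam code. It is a plain-Python illustration that assumes fixed windows of a hypothetical `WINDOW_SECONDS` size and per-window outputs keyed by window start; `align_to_window` and `merge_outputs` are made-up names used only for illustration.

```python
WINDOW_SECONDS = 60  # assumed fixed-window size, purely illustrative


def align_to_window(ts, size=WINDOW_SECONDS):
    """Return the start of the fixed window that contains timestamp ts."""
    return ts - (ts % size)


def merge_outputs(old_out, new_out, cutover_ts, size=WINDOW_SECONDS):
    """Take windows before the cutover from the old pipeline, the rest from the new one.

    old_out / new_out map window start -> result for that window.
    The cutover is snapped down to a window boundary so no window is split.
    """
    boundary = align_to_window(cutover_ts, size)
    merged = {w: r for w, r in old_out.items() if w < boundary}
    merged.update({w: r for w, r in new_out.items() if w >= boundary})
    return boundary, merged


# Both pipelines ran side by side and produced overlapping windows.
old = {0: "old-a", 60: "old-b", 120: "old-c"}
new = {60: "new-b", 120: "new-c", 180: "new-d"}
boundary, merged = merge_outputs(old, new, cutover_ts=130)
```

Because both pipelines run in parallel for a while, every window on either side of the chosen boundary has exactly one source, so the downstream consumer sees neither gaps nor duplicates in this simplified model.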

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Talat Uyarer
Hi Andrew, I assume the SQL query is not going to change. What changes is the Row schema, by adding new columns or renaming columns. What if we keep version information somewhere, for example in a KV pair where the key is the schema information and the value is the Row? Can't we generate the SQL code then? The reason I am asking is that we have 15k

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Reuven Lax
Can you explain the use case some more? Are you wanting to change your SQL statement as well when the schema changes? If not, what are those new fields doing in the pipeline? What I mean is that your old SQL statement clearly didn't reference those fields in a SELECT statement since they didn't exi

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Talat Uyarer
Hi, For sure, let me explain a little bit about my pipeline. My pipeline is actually simple: Read Kafka -> Convert Avro Bytes to Beam Row (DoFn, Row>) -> Apply Filter (SqlTransform.query(sql)) -> Convert back from Row to Avro (DoFn) -> Write DB/GCS/GRPC etc. In our jobs we have three types of SQL: - SELECT

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Reuven Lax
And when you say schema changes, are these new fields being added to the schema? Or are you making changes to the existing fields?

Re: About Beam SQL Schema Changes and Code generation

2020-12-07 Thread Talat Uyarer
Adding new fields is more common than modifying existing fields. But a type change is also possible for existing fields, such as a regular mandatory field (string, integer) becoming a union (nullable field). No field deletion.
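The kinds of change listed here (new fields, a mandatory field becoming a nullable union, no deletions) can be checked mechanically. Below is a minimal sketch in plain Python, assuming each schema has been reduced to a hypothetical `{field_name: avro_type}` dict, with unions written as lists as in Avro's JSON schema syntax. `is_nullable_widening` and `classify_changes` are illustrative helpers, not part of Beam or Avro.

```python
def is_nullable_widening(old_type, new_type):
    """True if new_type is a union that contains "null" and the old type."""
    return (
        isinstance(new_type, list)
        and "null" in new_type
        and old_type in new_type
    )


def classify_changes(old_fields, new_fields):
    """Compare two {name: type} dicts and bucket the differences."""
    changes = {
        "added": [],
        "removed": [],
        "made_nullable": [],
        "other_type_change": [],
    }
    for name in sorted(set(old_fields) | set(new_fields)):
        if name not in old_fields:
            changes["added"].append(name)
        elif name not in new_fields:
            changes["removed"].append(name)
        elif old_fields[name] != new_fields[name]:
            if is_nullable_widening(old_fields[name], new_fields[name]):
                changes["made_nullable"].append(name)
            else:
                changes["other_type_change"].append(name)
    return changes


# Hypothetical schema versions: "name" becomes nullable, "region" is new.
v1 = {"id": "int", "name": "string"}
v2 = {"id": "int", "name": ["null", "string"], "region": "string"}
result = classify_changes(v1, v2)
```

A check like this could run before a new schema version is rolled out, flagging anything in `removed` or `other_type_change` as a change the thread says is not expected.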