Hi Maminspapin, I just answered a similar question, so let me just c&p it here:
The beauty of Avro lies in having reader and writer schemas plus schema compatibility: if your schema evolves over time (which happens naturally in streaming but is also very common in batch), you can still use your application as-is, without modification. For streaming, this approach also means that you can process elements with different schema versions in the same run, which is mandatory for any non-toy example (a minimal sketch of this reader/writer resolution follows below, after the quoted message).

If you read into this topic, you will realize that it doesn't make sense to read from Avro without specifying your reader schema (except for some generic applications, but those should be written with the DataStream API). If you keep in mind that the same dataset could contain records with different schemas, you will notice that your idea quickly reaches its limits (which schema should it take?).

What you could do is write a small script that generates the schema DDL from the current schema of your actual data if you have very many columns and datasets (a rough sketch of such a mapping also follows below). It would certainly also be interesting to pass a static Avro/JSON schema to the DDL.

Note that in KafkaStreams you have the same issue. You usually generate your Java classes from some schema version, and that becomes your reader schema. You can and should do the same in Flink. Please read [1] for more information.

[1] https://www.baeldung.com/java-apache-avro#read-schema

On Sun, Apr 4, 2021 at 4:21 PM Maminspapin <un...@mail.ru> wrote:

> Hi, @Arvid Heise-4, @Matthias
>
> I really appreciate your attention, guys. And sorry for my late reply.
>
> Yes, Arvid, you are right, the second way does in fact work. I copied the
> schema from the Schema Registry using its API and created the .avsc file.
> And thanks again for explaining why the first way is not compatible.
>
> So, my code to define the schema is (I don't know if it's a good
> decision...):
>
> import java.nio.file.*;
> import org.apache.avro.Schema;
>
> Path path = Paths.get("path_to_schema/schema.avsc");
> String content = new String(Files.readAllBytes(path));
> Schema schema = new Schema.Parser().parse(content);
>
> And it really works.
>
> But I don't understand why I should use two schemas:
> 1. the schema I created (reader schema)
> 2. the schema I get via the SR url (writer schema)
>
> I have some experience with the KafkaStreams library, and there is no need
> to get a reader schema when using it. There is one service to communicate
> with schemas: the Schema Registry. Why not use a single source to get the
> schema in Flink?
>
> Again, the second way is correct, and I can go further with my program.
>
> Thanks.
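P.S. Here is the promised minimal sketch of the reader/writer schema resolution, using plain Avro (no Flink, no Schema Registry). The schema strings and the class name are made up for illustration: a record is written with the old v1 (writer) schema and read back against a newer v2 (reader) schema, and the added field gets filled from its default.

import java.io.ByteArrayOutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaResolutionDemo {

    // v1: the schema the data was written with (writer schema).
    private static final String WRITER_SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
      + "{\"name\":\"name\",\"type\":\"string\"}]}";

    // v2: the schema your application is compiled against (reader schema).
    // The new field has a default, so v1 data still resolves against it.
    private static final String READER_SCHEMA_JSON =
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
      + "{\"name\":\"name\",\"type\":\"string\"},"
      + "{\"name\":\"favoriteColor\",\"type\":\"string\",\"default\":\"unknown\"}]}";

    public static void main(String[] args) throws Exception {
        Schema writerSchema = new Schema.Parser().parse(WRITER_SCHEMA_JSON);
        Schema readerSchema = new Schema.Parser().parse(READER_SCHEMA_JSON);

        // Write a record using only the v1 writer schema.
        GenericRecord user = new GenericData.Record(writerSchema);
        user.put("name", "Maminspapin");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(writerSchema).write(user, encoder);
        encoder.flush();

        // Read it back with BOTH schemas: Avro resolves the v1 bytes
        // into the v2 shape, filling favoriteColor from its default.
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(out.toByteArray(), null);
        GenericRecord resolved =
            new GenericDatumReader<GenericRecord>(writerSchema, readerSchema)
                .read(null, decoder);

        System.out.println(resolved);
        // -> {"name": "Maminspapin", "favoriteColor": "unknown"}
    }
}

This two-schema resolution is roughly what the Registry-based deserializers do for you under the hood: the writer schema is looked up per record, while the reader schema stays fixed to whatever your code was generated from.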
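And here is the rough sketch for generating a Flink SQL DDL from an Avro record schema, as mentioned above. It only handles a few primitive types; unions/nullability, logical types, and nested records are deliberately left out, the class and method names are hypothetical, and the WITH clause is just a placeholder for your connector options.

import org.apache.avro.Schema;

public class AvroToDdl {

    // Naive mapping of a few primitive Avro types to Flink SQL types
    // (hypothetical helper; extend as needed for your real schemas).
    static String sqlType(Schema.Type type) {
        switch (type) {
            case STRING:  return "STRING";
            case INT:     return "INT";
            case LONG:    return "BIGINT";
            case FLOAT:   return "FLOAT";
            case DOUBLE:  return "DOUBLE";
            case BOOLEAN: return "BOOLEAN";
            case BYTES:   return "BYTES";
            default:
                throw new IllegalArgumentException("Unhandled Avro type: " + type);
        }
    }

    // Renders a CREATE TABLE statement for a flat Avro record schema.
    public static String toDdl(String tableName, Schema record) {
        StringBuilder ddl = new StringBuilder("CREATE TABLE " + tableName + " (\n");
        for (Schema.Field field : record.getFields()) {
            ddl.append("  `").append(field.name()).append("` ")
               .append(sqlType(field.schema().getType())).append(",\n");
        }
        ddl.setLength(ddl.length() - 2); // drop the trailing ",\n"
        return ddl.append("\n) WITH (...)").toString(); // connector options omitted
    }
}

Feeding it the reader schema from the previous sketch would yield a CREATE TABLE statement with one STRING column per field.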