Hi all,

If there are no more comments, I'm going to start a vote.

Best,
Jingsong

On Tue, May 10, 2022 at 10:37 AM Jingsong Li <jingsongl...@gmail.com> wrote:
> Hi Jark,
>
> Thanks for your feedback.
>
> > 1) Does table-store support evolve schemas multiple times during a checkpoint?
>
> In this case, the checkpoint is split into multiple commits, e.g.:
> - commit1: write 1 million rows
> - commit2: evolve schema 1
> - commit3: write 1 million rows
> ....
>
> Some work needs to be done on the connector side.
>
> > 2) Does ADD COLUMN support add a NOT-NULL column?
>
> I tend not to support it at this time.
> The other strategy is to support it, but report errors when reading data
> with the new schema, which ensures that data can be read with the old
> schema.
>
> > 3) What's the matrix of type evolution? Do you support modifying a column to any type?
>
> For type evolution, we currently only support types that are supported by
> implicit conversions. (From Flink LogicalTypeCasts)
> The following modes could be supported in the future for the user to select from:
> - Default implicit conversions
> - Allow implicit and explicit conversions
> - Throw exceptions when a cast fails.
> - Return null when a cast fails.
>
> I have updated the FLIP.
>
> Best,
> Jingsong
>
> On Mon, May 9, 2022 at 8:14 PM Jark Wu <imj...@gmail.com> wrote:
>
>> Thanks for proposing this exciting feature, Jingsong!
>>
>> I only have a few questions:
>>
>> 1) Does table-store support evolve schemas multiple times during a checkpoint?
>> For example, cp1 -> write 1M rows (may flush file store) -> evolve schema1 ->
>> write 1M rows (may flush file store again) -> evolve schema2 -> write 1M rows -> cp2
>>
>> That means the schemas of new data files are different in this snapshot.
>> Besides, it may need to register schemas before the checkpoint is complete.
>>
>> 2) Does ADD COLUMN support add a NOT-NULL column?
>>
>> 3) What's the matrix of type evolution? Do you support modifying a column to any type?
>>
>> Best,
>> Jark
>>
>> On Mon, 9 May 2022 at 16:44, Caizhi Weng <tsreape...@gmail.com> wrote:
>>
>> > Hi all!
>> >
>> > +1 for this FLIP. By adding schema information into data files we can not
>> > only support schema evolution, which is a very useful feature for data
>> > storages, but also make it easier for table store to integrate with other
>> > systems.
>> >
>> > For example, the timestamp type in Hive does not support precision. With this
>> > extra schema information, however, we can directly deduce the precision of a
>> > schema column.
>> >
>> > Jingsong Li <jingsongl...@gmail.com> wrote on Fri, Apr 29, 2022 at 17:54:
>> >
>> > > Hi devs,
>> > >
>> > > I want to start a discussion about Schema Evolution on the Flink Table
>> > > Store. [1]
>> > >
>> > > In FLINK-21634, we plan to support many schema changes in Flink SQL.
>> > > But for the current Table Store, they may result in wrong data and
>> > > unclear evolution semantics.
>> > >
>> > > In general, the user has these operations for schema:
>> > > - Add column: Adding a column to a table.
>> > > - Modify column type.
>> > > - Drop column: Drop a column.
>> > > - Rename column: For example, rename the "name_1" column to "name_2".
>> > >
>> > > Another schema change is partition keys. The data changes over
>> > > time: for example, for a table partitioned by day, as the business
>> > > continues to grow, each new daily partition becomes larger and the
>> > > business wants to change to hourly partitions.
>> > >
>> > > A simple approach is to rewrite all the existing data when modifying the
>> > > schema.
>> > > But this expensive approach is not acceptable to users, so we need to
>> > > support schema changes and define them clearly.
>> > > Modifying the schema does not rewrite the existing data; when reading,
>> > > the original data needs to be evolved to the current schema.
>> > >
>> > > Look forward to your feedback!
>> > >
>> > > [1]
>> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-226%3A+Introduce+Schema+Evolution+on+Table+Store
>> > >
>> > > Best,
>> > > Jingsong
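
To make the read-time evolution discussed in this thread concrete (existing data is not rewritten, and old rows are evolved to the current schema when read), here is a minimal Python sketch. It is only an illustration under stated assumptions, not the FLIP's design or Table Store code: the `Field` class, the `evolve_row` function, the use of stable field ids to track renames, and the small `IMPLICIT_CASTS` table are all hypothetical, and the cast table is not Flink's actual LogicalTypeCasts matrix.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

# Hypothetical subset of implicit (widening) casts, for illustration only.
# This is NOT the real Flink LogicalTypeCasts matrix.
IMPLICIT_CASTS = {
    ("INT", "BIGINT"): int,
    ("INT", "DOUBLE"): float,
    ("BIGINT", "DOUBLE"): float,
}


@dataclass(frozen=True)
class Field:
    field_id: int  # assumed stable id that survives renames
    name: str
    type: str


def evolve_row(
    row: Dict[int, Any],
    file_schema: List[Field],
    current_schema: List[Field],
) -> Dict[str, Any]:
    """Project a row (keyed by field id) written under file_schema
    onto current_schema, without rewriting the stored data."""
    old_by_id = {f.field_id: f for f in file_schema}
    result: Dict[str, Any] = {}
    for field in current_schema:
        old = old_by_id.get(field.field_id)
        if old is None:
            # Added column: old files have no value, so read it as null.
            result[field.name] = None
            continue
        value = row.get(old.field_id)
        if value is not None and old.type != field.type:
            cast = IMPLICIT_CASTS.get((old.type, field.type))
            if cast is None:
                # One possible policy: throw when no implicit cast exists.
                raise TypeError(f"no implicit cast {old.type} -> {field.type}")
            value = cast(value)
        # Renamed column: matched by id, emitted under the new name.
        result[field.name] = value
    # Dropped columns are simply never looked up.
    return result


old_schema = [
    Field(0, "id", "INT"),
    Field(1, "name_1", "STRING"),
    Field(2, "obsolete", "INT"),
]
new_schema = [
    Field(0, "id", "BIGINT"),       # type modified
    Field(1, "name_2", "STRING"),   # renamed from name_1
    Field(3, "score", "DOUBLE"),    # added; "obsolete" was dropped
]
old_row = {0: 7, 1: "alice", 2: 99}
evolved = evolve_row(old_row, old_schema, new_schema)
print(evolved)
```

In this sketch an added column reads as null from old files, which is one reason adding a NOT NULL column is awkward: old rows have no value to supply. Raising on a missing cast corresponds to the "throw exceptions" mode mentioned above; a "return null" mode would replace the `raise` with `value = None`.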