Thanks, Godfrey, for starting this discussion, I understand the motivation behind it. No bugfix releases, slow feature reviewing, and no compatibility guaranteed are genuinely blocking the development of Flink SQL.
I think a fork is the last choice before trying our best to cooperate with the Calcite community. But we shouldn't stop here if there is no progress. Therefore, I'm okay with maintaining a fork. However: 1) It should be a temporary solution. We should have a plan to move back to the latest Calcite version at some point (e.g., pushing them to resolve the problems mentioned above). 2) If we maintain the fork in flink-extended, we should determine a groupId for deploying to maven central. The community should have permission to deploy under the groupId. Best, Jark On Sun, 24 Apr 2022 at 16:14, godfrey he <godfre...@gmail.com> wrote: > Hi, Jing > Thanks for sharing the Calcite experiences. > About Calcite version upgrading, we should try not use the latest Calcite > version to avoid the bugs introduced by the new version if possible. > This may be a best practice. > > > Hi, Yun > Thanks for the detailed explanation for the experiences regarding FRocksDB. > I agree with you that the situation with Calcite and RocksDB is a > little difference. > The main pain point for Calcite is that we have to upgrade Calcite to > latest version > to get fix bugs and new features, but the latest version may be > unstable, which is a pain for us. > If we all agree we should maintain a forked Calcite repo, > there are many experiences we can learn from FRocksDB. > > Best, > Godfrey > > Yun Tang <myas...@live.com> 于2022年4月24日周日 11:58写道: > > > > Hi all, > > > > I could share two cents here for how we maintain FRocksDB. > > > > First of all, we also do not prefer to maintain a customized RocksDB > version in Flink, which brings additional overhead for Flink community: > > > > > > 1. RocksDB community switches to circleci for the CI tests after > RocksDB-6.x, which requires additional money to run all tests for reviewing > each PR. > > 2. We need to compile and include all kinds of FRocksDB binaries on > linux32/64, windows, ppc64, ARM and Macos platforms, which is really tough > and boring experiences. > > > > The root reason why we have to maintain a forked RocksDB repo is that > RocksDB community refuses to accept a plugin-like feature based on > compaction filter, which is heavily dependent by Flink's state TTL feature > [1]. From RocksDB-7.0, the community also moves several components to the > plugin repo [2], although this cannot avoid us to release all kinds of > binaries, it can at least decrease our energy to maintain the whole tests > if we follow this trend. > > > > Last but not least, I don't think current discussion on Apache Calcite > is in the same situation as FRocksDB. Current Flink SQL guys complain that > Calcite is released too slowly, which blocks some feature development in > Flink. However, RocksDB community itself actually release new versions more > frequently, and we don't rely on its new version for some new features > currently. Moreover, we're often more careful on upgrading underlying > storage component as it could impact the performance and data correctness. > > > > > > [1] > https://github.com/ververica/frocksdb/commit/3da8249d50c8a3a6ea229f43890d37e098372786 > > [2] https://github.com/facebook/rocksdb/issues/9390 > > > > Best > > Yun Tang > > > > ________________________________ > > From: Jing Zhang <beyond1...@gmail.com> > > Sent: Saturday, April 23, 2022 15:21 > > To: dev <dev@flink.apache.org> > > Cc: Yun Tang <myas...@live.com> > > Subject: Re: [DISCUSS] Maintain a Calcite repository for Flink to > accelerate the development for Flink SQL features > > > > Hi Godfrey, > > I would like to share some problems based on my past experience. > > 1. It's not easy to push new features in the CALCITE community. > > As @Martijn referred, https://issues.apache.org/jira/browse/CALCITE-4865 > / > > https://github.com/apache/calcite/pull/2606 is such an example. > > I tried out many ways, for example, sent review requests in the dev mail > list, left comments in JIRA and in pull requests. > > And had to give up finally. Sorry for that. > > 2. However, some new features of calcite are radical. > > Such as https://issues.apache.org/jira/browse/CALCITE-4173, which had > some strong opposition in the CALCITE community, > > But it was merged finally and caused unexpected problems, such as wrong > results (https://issues.apache.org/jira/browse/FLINK-24708) > > and other related bugs. > > 3. Every time we upgrade the calcite version, we will cross multiple > versions, resulting in a slow upgrade process and > > uncontrolled results, often causing some unexpected problems. > > > > Thank @Godfrey for driving this discussion in a big scope. > > I think it's a good chance to review these problems and find a solution. > > > > Best, > > Jing Zhang > > > > godfrey he <godfre...@gmail.com<mailto:godfre...@gmail.com>> > 于2022年4月22日周五 21:40写道: > > Hi Chesnay, > > > > There is no bug fix version until now. > > You can find the releases in https://github.com/apache/calcite/tags > > > > Best, > > Godfrey > > > > Chesnay Schepler <ches...@apache.org<mailto:ches...@apache.org>> > 于2022年4月22日周五 18:48写道: > > > > > > I find it a bit weird that the supposed only way to get a bug fix is to > > > do a big version upgrade. > > > Is Calcite not creating bugfix releases? > > > > > > On 22/04/2022 12:26, godfrey he wrote: > > > > Thanks for the feedback, guys! > > > > > > > > For Jingsong's feedback: > > > >> ## Do we have the plan to upgrade calcite to 1.31? > > > > I think we will upgrade Calcite to 1.31 only when Flink depends on > > > > some significant features of Calcite. > > > > Such as: new syntax PTF (CALCITE-4865). > > > > > > > > >## Is Cherry-pick costly? > > > > >From the experience of maintaining calcite with our company, the > cost is small. > > > > We only cherry-pick the bug fixes and needed minor features. > > > > For a major feature, we can choose to upgrade the version. > > > > > > > >> ## Are the calcite repository costly to maintain? > > > > >From the experience of @Dann y chen (One PMC of Calcite), publishing > > > > is much easier. > > > > > > > > > > > > For Chesnay's feedback: > > > > I also totally agree that a fork repository will increase the cost of > > > > maintenance. > > > > > > > > Usually, the Calcite community releases a version three months or > more. > > > > I think it's hard to let Calcite change the release cycle > > > > because Calcite supports many compute engines. > > > > > > > > > > > > For Konstantin's feedback: > > > > Some changes in Calcite may cause hundreds of plan changes in Flink, > > > > such as: CALCITE-4173. > > > > We must check whether the change is expected, whether there is > > > > performance regression. > > > > Some of the changes are very subtle, especially in the CBO planner. > > > > This situation also occurs similarly within upgrading from 1.1x to > 1.22. > > > > If you are not familiar with Flink planner and Calcite, it will be > > > > more difficult to upgrade. > > > > > > > > > > > > For Xintong's feedback: > > > > You are right, I will connect Yun for some help, Thanks for the > suggestions. > > > > > > > > > > > > For Martijn's feedback: > > > > I'm also against cherry-pick many features code into the fock > repository, > > > > and I also totally agree we should collaborate closely with the > > > > Calcite community. > > > > I'm just trying to find an approach which can avoid frequent Calcite > upgrades, > > > > but easily support bug fix and minor new feature development. > > > > > > > > As for the CALCITE-4865 case, I think we should upgrade the Calcite > > > > version to support PTF. > > > > > > > > @Jing zhang, can you share some 'feeling' for CALCITE-4865 ? > > > > > > > > Best, > > > > Godfrey > > > > > > > > Martijn Visser <martijnvis...@apache.org<mailto: > martijnvis...@apache.org>> 于2022年4月22日周五 17:31写道: > > > >> Hi everyone, > > > >> > > > >> Overall I'm against the idea of setting up a Calcite fork for the > same > > > >> reasons that Chesnay has mentioned. We've talked extensively about > doing an > > > >> upgrade of Calcite during the Flink 1.15 release period, but there > was a > > > >> lot of pushback by the maintainers against that because of the > required > > > >> efforts. Having our own fork will mean that there will be even more > effort > > > >> required, because not only do we need to perform the upgrade on > Flink's > > > >> end, we also need to maintain this Calcite fork. > > > >> > > > >> I think what we should do is have a closer collaboration with the > Calcite > > > >> community and see if we can also help out with reviewing/merging > PRs and > > > >> more frequent releases. What we're seeing is that already features > that are > > > >> proposed towards Calcite because we need them for Flink, are not > getting > > > >> picked up by the Calcite community. See > > > >> https://issues.apache.org/jira/browse/CALCITE-4865 / > > > >> https://github.com/apache/calcite/pull/2606 which is such an > example. > > > >> > > > >> I would rather invest more in collaborating with the Calcite > community > > > >> instead of maintaining our own fork. I believe that would help us > get new > > > >> features and bug fixes sooner. > > > >> > > > >> Best regards, > > > >> > > > >> Martijn Visser > > > >> https://twitter.com/MartijnVisser82 > > > >> https://github.com/MartijnVisser > > > >> > > > >> > > > >> On Fri, 22 Apr 2022 at 10:46, Xintong Song <tonysong...@gmail.com > <mailto:tonysong...@gmail.com>> wrote: > > > >> > > > >>> BTW, I think this proposal sounds similar to FRocksDB, the Flink's > custom > > > >>> RocksDB build. Maybe folks maintaining FRocksDB can share some > experiences. > > > >>> > > > >>> CC @Yun Tang > > > >>> > > > >>> Thank you~ > > > >>> > > > >>> Xintong Song > > > >>> > > > >>> > > > >>> > > > >>> On Fri, Apr 22, 2022 at 4:35 PM Xintong Song < > tonysong...@gmail.com<mailto:tonysong...@gmail.com>> > > > >>> wrote: > > > >>> > > > >>>> Hi Godfrey, > > > >>>> > > > >>>> > > > >>>>> 1. Where to put the code? https://github.com/flink-extended is > a good > > > >>>>> place. > > > >>>> > > > >>>> Please notice that `flink-extended` is not endorsed by the Apache > Flink > > > >>>> PMC. That means if the proposed new Calcite repository is hosted > there, > > > >>> the > > > >>>> maintenance and release will not be guaranteed by the Apache Flink > > > >>> project. > > > >>>> I guess the question is do we consider another 3rd party Calcite > > > >>> repository > > > >>>> more reliable and convenient than the official Apache Calcite > that we > > > >>> want > > > >>>> to depend on. > > > >>>> > > > >>>> Thank you~ > > > >>>> > > > >>>> Xintong Song > > > >>>> > > > >>>> > > > >>>> > > > >>>> On Fri, Apr 22, 2022 at 4:07 PM Chesnay Schepler < > ches...@apache.org<mailto:ches...@apache.org>> > > > >>>> wrote: > > > >>>> > > > >>>>> I'm overall against the idea of creating a fork. > > > >>>>> It implies quite some maintenance overhead, like dealing with > unstable > > > >>>>> tests, CI, licensing etc. and the overall release overhead. > > > >>>>> > > > >>>>> Is there no alternative where we can collaborate more with the > calcite > > > >>>>> guys, like verifying new features so bugs are caught sooner? > > > >>>>> > > > >>>>> On 22/04/2022 09:31, godfrey he wrote: > > > >>>>>> Dear devs, > > > >>>>>> > > > >>>>>> I would like to open a discussion on the fact that currently > many > > > >>>>>> Flink SQL function > > > >>>>>> development relies on Calcite releases, which seriously > blocks some > > > >>>>>> Flink SQL's features release. > > > >>>>>> Therefore, I would like to discuss whether it is possible to > solve > > > >>> this > > > >>>>> problem > > > >>>>>> by creating Flink's own Calcite repository. > > > >>>>>> > > > >>>>>> Currently, Flink depends on Caclite-1.26, FLIP-204[1] relies on > > > >>>>> Calcite-1.30, > > > >>>>>> and we recently want to support fully join-hints functionatity > in > > > >>>>> Flink-1.16, > > > >>>>>> which relies on Calcite-1.31 (maybe two or three months later > will be > > > >>>>> released). > > > >>>>>> In order to support some new features or fix some bugs, we need > to > > > >>>>> upgrade > > > >>>>>> the Calcite version, but every time we upgrade Calcite version > > > >>>>>> (especially upgrades > > > >>>>>> across multiple versions), the processing is very tough: I > remember > > > >>>>> clearly that > > > >>>>>> the Calcite upgrade from 1.22 to 1.26 took two weeks of > full-time to > > > >>>>> complete. > > > >>>>>> Currently, in order to fix some bugs while not upgrading the > Calcite > > > >>>>> version, > > > >>>>>> we copy the corresponding Calcite class directly into the Flink > > > >>> project > > > >>>>>> and then modify it accordingly.[2] This approach is rather > hacky and > > > >>>>>> hard for code maintenance and upgrades. > > > >>>>>> > > > >>>>>> So, I had an idea whether we could solve this problem by > maintaining a > > > >>>>>> Calcite repository > > > >>>>>> in the Flink community. This approach has been practiced within > my > > > >>>>>> company for many years. > > > >>>>>> There are similar practices in the industry. For example, > Apache > > > >>> Dill > > > >>>>>> also maintains > > > >>>>>> a separate Calcite repository[3]. > > > >>>>>> > > > >>>>>> The following is a brief analysis of the approach and the pros > and > > > >>>>>> cons of maintaining a separate repository. > > > >>>>>> > > > >>>>>> Approach: > > > >>>>>> 1. Where to put the code? https://github.com/flink-extended is > a good > > > >>>>> place. > > > >>>>>> 2. What extra code can be added to this repository? Only bug > fixes and > > > >>>>> features > > > >>>>>> that are already merged into Calcite can be cherry-picked to > this > > > >>>>> repository. > > > >>>>>> We also should try to push bug fixes to the Calcite community. > > > >>>>>> Btw, the copied Calcite class in the Flink project can be > removed. > > > >>>>>> 3. How to upgrade the Calcite version? Check out the target > Calcite > > > >>>>>> release branch > > > >>>>>> and rebase our bug fix code. (As we upgrade, we will maintain > fewer > > > >>>>>> and fewer older bug > > > >>>>>> fixes code.) And then, verify all Calcte's tests and Flink's > tests in > > > >>>>>> the developer's local > > > >>>>>> environment. If all tests are OK, release the Calcite > branch, or fix > > > >>>>>> it in the branch and re-test. > > > >>>>>> After the branch is released, then the version of Calcite in > Flink > > > >>>>>> can be upgraded. For example: > > > >>>>>> checkout calcite-1.26.0-flink-v1-SNAPSHOT branch from > > > >>> calcite-1.26.0, > > > >>>>>> move all the copied > > > >>>>>> Calcite code in Flink to the branch, and pick all the hint > related > > > >>>>>> changes from Calcite-1.31 to > > > >>>>>> the branch. Then we can change the Calcite version in Flink > to > > > >>>>>> calcite-1.26.0-flink-v1-SNAPSHOT, > > > >>>>>> and verify all tests in the locale. Release > calcite-1.26.0-flink-v1 > > > >>>>>> after all tests are successful. > > > >>>>>> At last upgrade the calcite version to > > > >>>>>> calcite-1.26.0-flink-v10-flink-v1, and open a PR. > > > >>>>>> 4. Who will maintain it? The maintenance workload is minimal, > but the > > > >>>>>> upgrade work is > > > >>>>>> laborious (actually, it's similar to before). I can maintain > it in > > > >>>>>> the early stage and standardise the processing. > > > >>>>>> > > > >>>>>> Pros. > > > >>>>>> 1. The release of Flink is decoupled from the release of > Calcite, > > > >>>>>> making feature development and bug fix quicker > > > >>>>>> 2. Reduce the hassle of unnecessary calcite upgrades > > > >>>>>> 3. No hacking in Flink to maintain the Calcite copied code > > > >>>>>> > > > >>>>>> cons. > > > >>>>>> 1. Need to maintain an additional Calcite repository > > > >>>>>> 2. The Upgrades are a little more complicated than before > > > >>>>>> > > > >>>>>> Any feedback is very welcome! > > > >>>>>> > > > >>>>>> > > > >>>>>> [1] > > > >>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-204%3A+Introduce+Hash+Lookup+Join > > > >>>>>> [2] > > > >>> > https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/java/org/apache/calcite > > > >>>>>> [3] https://github.com/apache/drill/blob/master/pom.xml#L64 > > > >>>>>> > > > >>>>>> Best, > > > >>>>>> Godfrey > > > >>>>> > > > >>>>> > > > >