Great work! Thank you for sharing.

On Thu., May 12, 2022 at 17:19, Jeff Zhang <zjf...@gmail.com> wrote:
> That's true, the Scala shell was removed from Flink. Fortunately, Apache
> Zeppelin has its own Scala REPL for Flink. So if Flink can support Scala
> 2.13, I am wondering whether it is possible to integrate it into the Scala
> shell so that users can run Flink Scala code in a notebook, like Spark.
>
> On Thu, May 12, 2022 at 11:06 PM Roman Grebennikov <g...@dfdx.me> wrote:
>
>> Hi,
>>
>> AFAIK the Scala REPL was removed completely in Flink 1.15
>> (https://issues.apache.org/jira/browse/FLINK-24360), so there is nothing
>> to cross-build.
>>
>> Roman Grebennikov | g...@dfdx.me
>>
>> On Thu, May 12, 2022, at 14:55, Jeff Zhang wrote:
>>
>> Great work Roman, do you think it is possible to run it in the Scala
>> shell as well?
>>
>> On Thu, May 12, 2022 at 10:43 PM Roman Grebennikov <g...@dfdx.me> wrote:
>>
>> Hello,
>>
>> As far as I understand the discussions on this mailing list, there are
>> almost no people maintaining the official Scala API in Apache Flink.
>> Due to some technical complexities it will probably be stuck on Scala
>> 2.12 (which is not EOL yet, but quite close to it) for a very long time:
>> * The Traversable serializer relies heavily on CanBuildFrom (so it is
>> read and compiled on restore), which is missing in Scala 2.13 and 3.x;
>> migrating off this approach while maintaining savepoint compatibility
>> can be quite a complex task.
>> * The Scala API uses an implicitly generated TypeInformation, produced
>> by a giant, scary mkTypeInfo macro, which would have to be completely
>> rewritten for Scala 3.x.
>>
>> But even in its current state, Scala support in Flink has issues with
>> ADTs (sealed traits, a popular data-modelling pattern) not being
>> natively supported, so if you use them, you have to fall back to Kryo,
>> which is not that fast: we've seen 3x-4x throughput drops in
>> performance tests.
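The Kryo fallback described above can be illustrated with a short sketch (this is my own illustrative example, not code from the thread; it assumes the stock flink-streaming-scala dependency on Scala 2.12, and the type names are made up):

```scala
// Sketch: a sealed-trait ADT with the stock Flink Scala API.
// The implicit createTypeInformation macro from org.apache.flink.api.scala
// cannot derive a serializer for the sealed hierarchy itself, so the
// trait typically resolves to a generic (Kryo-backed) TypeInformation,
// with the throughput cost mentioned above.
import org.apache.flink.api.scala._

sealed trait Event
case class Click(url: String)       extends Event
case class Purchase(amount: Double) extends Event

object KryoFallbackDemo {
  def main(args: Array[String]): Unit = {
    // Inspect what the macro derives for the ADT root.
    val info = createTypeInformation[Event]
    // Usually prints a GenericTypeInfo, i.e. Kryo serialization.
    println(info)
  }
}
```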
>>
>> At my current company we made a library
>> (https://github.com/findify/flink-adt) which uses Magnolia
>> (https://github.com/softwaremill/magnolia) to do all the compile-time
>> TypeInformation generation, making Scala ADTs nice & fast in Flink.
>> With a couple of community contributions it is now possible to
>> cross-build it for Scala 3 as well.
>>
>> As the Flink 1.15 core is Scala-free, we extracted the DataStream part
>> of the Flink Scala API into a separate project, glued it together with
>> flink-adt and the ClosureCleaner from Spark 3.2 (supporting Scala 2.13
>> and JVM 17), and cross-compiled it for 2.12/2.13/3.x. You can check out
>> the result in this GitHub project:
>> https://github.com/findify/flink-scala-api
>>
>> So technically speaking, it is now possible to migrate a Scala Flink
>> job from 2.12 to 3.x by:
>> * replacing the flink-streaming-scala dependency with flink-scala-api
>> (optional; both libs can co-exist on the classpath on 2.12),
>> * replacing all imports of org.apache.flink.streaming.api.scala._ with
>> ones from the new library,
>> * rebuilding the job for 3.x.
>>
>> The main drawback is that there is no savepoint compatibility, due to
>> CanBuildFrom and the different way of handling ADTs. But if you can
>> afford re-bootstrapping the state, the migration is quite
>> straightforward.
>>
>> The README on GitHub, https://github.com/findify/flink-scala-api#readme,
>> has some more details on how and why this project was done this way.
>> The project is a bit experimental, so if you're interested in Scala 3
>> on Flink, you're welcome to share your feedback and ideas.
>>
>> with best regards,
>> Roman Grebennikov | g...@dfdx.me
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>
> --
> Best Regards
>
> Jeff Zhang

--
https://twitter.com/snntrable
https://github.com/knaufk
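The migration steps Roman lists can be sketched roughly as follows (my own example, not from the thread; the group/artifact coordinates and version string are illustrative guesses — check the linked README for the real ones):

```scala
// build.sbt, before (Scala 2.12 only):
//   libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % "1.15.0"
// after (cross-built for 2.12 / 2.13 / 3.x; coordinates per the README,
// version is illustrative):
//   libraryDependencies += "io.findify" %% "flink-scala-api" % "1.15-2"

// Job code, before:
//   import org.apache.flink.streaming.api.scala._
// after: the same surface API, but with TypeInformation derived at
// compile time via Magnolia, so sealed traits need no Kryo fallback.
import io.findify.flink.api._
import io.findify.flinkadt.api._

sealed trait Event
case class Click(url: String)       extends Event
case class Purchase(amount: Double) extends Event

object MigratedJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // The ADT is serialized with the Magnolia-derived serializer.
    env.fromElements[Event](Click("/home"), Purchase(9.99)).print()
    env.execute("scala3-migration-sketch")
  }
}
```

Note that, as the thread stresses, state written by the old flink-streaming-scala serializers cannot be restored by the new library; the job's state must be re-bootstrapped.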