Hi,

AFAIK the Scala REPL was removed completely in Flink 1.15 
(https://issues.apache.org/jira/browse/FLINK-24360), so there is nothing to 
cross-build.

Roman Grebennikov | g...@dfdx.me


On Thu, May 12, 2022, at 14:55, Jeff Zhang wrote:
> Great work Roman, do you think it is possible to run it in the Scala shell as well?
> 
> On Thu, May 12, 2022 at 10:43 PM Roman Grebennikov <g...@dfdx.me> wrote:
>> Hello,
>> 
>> As far as I understand the discussions on this mailing list, there is almost 
>> no one maintaining the official Scala API in Apache Flink right now. Due to 
>> some technical complexities it will probably be stuck for a very long time on 
>> Scala 2.12 (which is not EOL yet, but quite close to it):
>> * The Traversable serializer relies heavily on CanBuildFrom (it is read and 
>> compiled on restore), which is missing in Scala 2.13 and 3.x; migrating off 
>> this approach while maintaining savepoint compatibility can be quite a 
>> complex task.
>> * The Scala API uses implicitly generated TypeInformation, produced by a giant 
>> scary mkTypeInfo macro, which would have to be completely rewritten for 
>> Scala 3.x (a small illustration follows below).
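>> 
>> (For illustration, this is roughly how the implicit derivation is used today; 
>> the Click case class is just a made-up example:
>> 
>>   import org.apache.flink.api.common.typeinfo.TypeInformation
>>   import org.apache.flink.api.scala._  // brings the createTypeInformation macro into scope
>> 
>>   object ClickInfo {
>>     case class Click(userId: String, count: Int)
>> 
>>     // the mkTypeInfo macro runs at compile time behind createTypeInformation
>>     // and produces a serializer without runtime reflection
>>     implicit val clickInfo: TypeInformation[Click] = createTypeInformation[Click]
>>   }
>> 
>> It is exactly this Scala 2 macro machinery that has no direct Scala 3 port.)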
>> 
>> But even in its current state, Scala support in Flink has some issues: ADTs 
>> (sealed traits, a popular data modelling pattern) are not natively supported, 
>> so if you use them you have to fall back to Kryo, which is not that fast: 
>> we've seen 3x-4x throughput drops in performance tests.
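>> 
>> (A minimal sketch of what I mean, with a made-up event ADT:
>> 
>>   import org.apache.flink.streaming.api.scala._
>> 
>>   sealed trait Event
>>   case class Click(id: String) extends Event
>>   case class Purchase(id: String, amount: Double) extends Event
>> 
>>   object KryoFallback {
>>     def main(args: Array[String]): Unit = {
>>       val env = StreamExecutionEnvironment.getExecutionEnvironment
>>       // TypeInformation[Event] cannot be derived for the sealed trait itself,
>>       // so the stream below ends up with a generic, Kryo-backed serializer
>>       env.fromElements[Event](Click("a"), Purchase("b", 1.0)).print()
>>       env.execute("kryo-fallback")
>>     }
>>   }
>> )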
>> 
>> In my current company we made a library 
>> (https://github.com/findify/flink-adt) which uses Magnolia 
>> (https://github.com/softwaremill/magnolia) to do all the TypeInformation 
>> generation at compile time, making Scala ADTs nice & fast in Flink. With a 
>> couple of community contributions it is now possible to cross-build it for 
>> Scala 3 as well.
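>> 
>> (Usage looks roughly like this; I'm writing the package and method names from 
>> memory, so please double-check them against the flink-adt README:
>> 
>>   import org.apache.flink.api.common.typeinfo.TypeInformation
>>   import io.findify.flinkadt.api._
>> 
>>   object EventInfo {
>>     sealed trait Event
>>     case class Click(id: String) extends Event
>>     case class Purchase(id: String, amount: Double) extends Event
>> 
>>     // Magnolia derives a serializer for the whole sealed hierarchy at
>>     // compile time, with no Kryo and no runtime reflection involved
>>     implicit val eventInfo: TypeInformation[Event] = deriveTypeInformation[Event]
>>   }
>> )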
>> 
>> As the Flink 1.15 core is Scala-free, we extracted the DataStream part of the 
>> Flink Scala API into a separate project, glued it together with flink-adt and 
>> the ClosureCleaner from Spark 3.2 (which supports Scala 2.13 and JVM 17), and 
>> cross-compiled it for 2.12/2.13/3.x. You can check out the result in this 
>> GitHub project: https://github.com/findify/flink-scala-api
>> 
>> So, technically speaking, it is now possible to migrate a Scala Flink job from 
>> 2.12 to 3.x by (a rough sbt sketch follows after these steps):
>> * replacing the flink-streaming-scala dependency with flink-scala-api 
>> (optional, both libs can co-exist on the classpath on 2.12)
>> * replacing all imports of org.apache.flink.streaming.api.scala._ with ones 
>> from the new library
>> * rebuilding the job for 3.x
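>> 
>> In sbt terms the change is roughly the following (the exact coordinates, 
>> version and import packages are written from memory, so treat them as 
>> placeholders and check the README):
>> 
>>   // build.sbt: drop flink-streaming-scala, add the new library
>>   // libraryDependencies += "org.apache.flink" %% "flink-streaming-scala" % flinkVersion
>>   libraryDependencies += "io.findify" %% "flink-scala-api" % "<version from README>"
>>   scalaVersion := "3.1.2"  // or any of the 2.12/2.13/3.x versions it is cross-built for
>> 
>> and in the job code:
>> 
>>   // before: import org.apache.flink.streaming.api.scala._
>>   import io.findify.flink.api._     // DataStream API wrappers
>>   import io.findify.flinkadt.api._  // implicit TypeInformation derivation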
>> 
>> The main drawback is that there is no savepoint compatibility, due to 
>> CanBuildFrom and the different way of handling ADTs. But if you can afford to 
>> re-bootstrap the state, the migration is quite straightforward.
>> 
>> The README on GitHub (https://github.com/findify/flink-scala-api#readme) has 
>> some more details on how and why the project was done this way. The project 
>> is a bit experimental, so if you're interested in Scala 3 on Flink, you're 
>> welcome to share your feedback and ideas. 
>> 
>> with best regards,
>> Roman Grebennikov | g...@dfdx.me
>> 
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang
