Re: Spark Structured Streaming Runner Roadmap

2021-08-03 Thread Etienne Chauchot
Hi, sorry for the late answer: streaming mode in the Spark Structured Streaming runner is blocked by the way the Spark Structured Streaming framework implements watermarks on the Apache Spark project side. See https://echauchot.blogspot.com/2020/11/watermark-architecture-proposal-for.html be

Speeding upload of Uber Jar for Python on Flink on K8s

2021-08-03 Thread Jeremy Lewi
Hi folks, I'm running Beam Python on Flink on Kubernetes. One thing I'm noticing is that jobs take a really long time to start. The slowdown appears to come from uploading the Flink Beam uber jar (~225 MB) to the job server. Is there any way to speed this up? 1. Can the J

Re: Speeding upload of Uber Jar for Python on Flink on K8s

2021-08-03 Thread Jeremy Lewi
Hi Luke and Kyle, Thanks, I think that makes sense. If I run a dedicated Beam job server, I assume I use the PortableRunner rather than the FlinkRunner
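
As a rough illustration of the setup asked about above, here is a minimal sketch of the pipeline arguments a Python job might use to submit through a dedicated job server with the PortableRunner. The helper name `portable_runner_args`, the endpoint `localhost:8099`, and the `LOOPBACK` environment are assumptions for illustration and do not come from the thread; `--runner`, `--job_endpoint`, and `--environment_type` are standard Beam pipeline options.

```python
def portable_runner_args(job_endpoint="localhost:8099", environment_type="LOOPBACK"):
    """Build Beam pipeline flags for submitting via a dedicated job server.

    Hypothetical helper: the default endpoint and environment type are
    assumptions, so adjust them to match the actual deployment.
    """
    return [
        "--runner=PortableRunner",                 # submit through the job server
        f"--job_endpoint={job_endpoint}",          # host:port of the running job server
        f"--environment_type={environment_type}",  # where the Python SDK workers run
    ]


args = portable_runner_args()
print(args)

# With apache_beam installed, the flags would typically be passed like this:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   options = PipelineOptions(args)
```

With this arrangement the job server is already running, so the client only talks to its endpoint. Whether that avoids the slow upload depends on how the job server is deployed.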