[ANNOUNCE] Apache Flink Stateful Functions 2.2.0 released

2020-09-27 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink Stateful Functions 2.2.0. Stateful Functions is an API that simplifies the building of distributed stateful applications with a runtime built for serverless architectures. It's based on functions with persistent state

Re: Flink being used in other open source projects?

2020-09-27 Thread vinuthomas2008
Hi Gezim, Yes, SANSA seems interesting. Can you elaborate on a couple of its business use cases? Right now I understand it's RDF data processing using Flink/Spark. Thanks, Vinu On Mon, Sep 28, 2020 at 3:09 AM Gezim Sejdiu wrote: > Hi Vinu,

Re: Flink being used in other open source projects?

2020-09-27 Thread Gezim Sejdiu
Hi Vinu, we are using Flink in the SANSA project [1] and any contribution towards increasing functionality is more than welcome. But, in case you are still looking for a full list of third-party projects then Community Packages for Apache

Re: SocketException: Too many open files

2020-09-27 Thread mars
Hi, I am using version 1.10.0 of Flink on EMR. I am not using the default Flink sink. I have a sink function on the stream, and within the invoke function I am creating a data structure (VO) and putting it in the Map. The EMR step function I am running is a Spring-based Flink job and I have

Re: Reading from HDFS and publishing to Kafka

2020-09-27 Thread Khachatryan Roman
Hi, 1. Yes, StreamExecutionEnvironment.readFile can be used for files on HDFS. 2. I think this is a valid concern. Besides that, there are plans to deprecate the DataSet API [1]. 4. Yes, the approach looks good. I'm pulling in Aljoscha for your 3rd question (and probably some clarifications on others

Re: Hiring Flink developers

2020-09-27 Thread Khachatryan Roman
Please use the user mailing list for questions related to the use of Flink. See [1] for the other lists. [1] https://flink.apache.org/community.html#mailing-lists Regards, Roman On Sun, Sep 27, 2020 at 8:29 AM Dan Hill wrote: > I'm looking to hire Flink developers (full time or contractors) to wo

Re: I have a job with multiple Kafka sources. They all contain certain historical data.

2020-09-27 Thread Piotr Nowojski
Great, thanks for the update! And please share your feedback on whether it worked or not. Piotrek On Sun, Sep 27, 2020 at 11:20, hao kong wrote: > Thanks for the tip! > I am currently trying to implement a ZooKeeper-based coordinator and use > it to record the current watermark and control streaming

Re: Flink being used in other open source projects?

2020-09-27 Thread Khachatryan Roman
Hi, Apache Beam [1] and Zeppelin [2] can use Flink. I don't think there are Flink setups used by open-source projects. [1] https://beam.apache.org/documentation/runners/flink/ [2] https://zeppelin.apache.org/docs/0.9.0-SNAPSHOT/interpreter/flink.html Regards, Roman On Fri, Sep 25, 2020 at 6:05

Re: Checkpoint dir is not cleaned up after cancel the job with monitoring API

2020-09-27 Thread Eleanore Jin
I have noticed this: if I add Thread.sleep(1500); after the PATCH call returns 202, the directory gets cleaned up. Meanwhile, the job-manager pod shows as completed before being terminated (see screenshot: https://ibb.co/3F8HsvG). So the patch call is async to terminate t

Re: Checkpoint dir is not cleaned up after cancel the job with monitoring API

2020-09-27 Thread Eleanore Jin
Hi Congxian, I am making a REST call to get the checkpoint config: curl -X GET \ http://localhost:8081/jobs/d2c91a44f23efa2b6a0a89b9f1ca5a3d/checkpoints/config and here is the response: { "mode": "at_least_once", "interval": 3000, "timeout": 1, "min_pause": 1000, "max_concur
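The REST call quoted in the preview above could be scripted roughly as follows. This is only a sketch, not part of the original message: the helper names (`checkpoint_config_url`, `fetch_checkpoint_config`, `summarize`) are hypothetical, and only the response fields visible before the preview is cut off are read.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def checkpoint_config_url(base_url: str, job_id: str) -> str:
    """Build the /jobs/<job_id>/checkpoints/config URL used in the curl call."""
    return f"{base_url.rstrip('/')}/jobs/{quote(job_id, safe='')}/checkpoints/config"


def fetch_checkpoint_config(base_url: str, job_id: str, timeout: float = 5.0) -> dict:
    """GET the checkpoint config and decode the JSON body (hypothetical helper)."""
    with urlopen(checkpoint_config_url(base_url, job_id), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def summarize(config: dict) -> str:
    """Format only the fields visible in the quoted, truncated response."""
    visible = ("mode", "interval", "timeout", "min_pause")
    return ", ".join(f"{key}={config[key]}" for key in visible if key in config)


if __name__ == "__main__":
    # Assumes a JobManager REST endpoint at localhost:8081, as in the curl call.
    job_id = "d2c91a44f23efa2b6a0a89b9f1ca5a3d"
    print(summarize(fetch_checkpoint_config("http://localhost:8081", job_id)))
```

Printing a one-line summary makes it easier to compare the configured values across jobs than reading the raw JSON.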

Re: I have a job with multiple Kafka sources. They all contain certain historical data.

2020-09-27 Thread hao kong
Thanks for the tip! I am currently trying to implement a ZooKeeper-based coordinator and use it to record the current watermark and control streaming according to your first suggestion. Piotr Nowojski wrote on Wed, Sep 16, 2020 at 11:56 PM: > Hey, > > If you are worried about increased amount of buffered data by