Re: How to add Flink a Flink connector to stateful functions

2021-09-27 Thread Igal Shilman
Hello Christian, I'm happy to hear that you are trying out StateFun and like the toolset! Currently StateFun supports "out of the box" only Kafka/Kinesis egresses, simply because so far folks didn't requested anything else. I can create a JIRA issue for that and we'll see how the community respon

Re: Write Streaming data to S3 in Parquet files

2021-09-27 Thread Guowei Ma
Hi,Harshvardhan I think you could use some factory such as `ParquetAvroWriters.for` form `ParquetAvroWriters.java` [1]. And you could see more same class in the package `org.apache.flink.formats.parquet.` [1] https://github.com/apache/flink/blob/master/flink-formats/flink-parquet/src/main/jav

AW: How to add Flink a Flink connector to stateful functions

2021-09-27 Thread Christian Krudewig (Corporate Development)
Hello Roman, Well, if that's the way to do it, I can manage to maintain a fork of the statefun repo with these tiny changes. But first my question is if that is the way it should be done? Or if there is another way to activate these connectors. Best, Christian -Ursprüngliche Nachricht

rpc invocation exceeds the maximum akka framesize

2021-09-27 Thread Deshpande, Omkar
Hello, We run a lot of flink applications. Some of them sometimes run into this error on Job Manager- The rpc invocation size exceeds the maximum akka framesize After we increase the framesize the application starts working again. What factors determine the akka framesize? We sometimes see appli

Re: How to add Flink a Flink connector to stateful functions

2021-09-27 Thread Roman Khachatryan
Hi, > Does that mean that I need to build the stateful functions java application > and afterwards the docker image? Yes, you have to rebuild the application after updating the pom, as well as its docker image. Is your concern related to synchronizing local docker images with the official repo?

How to add Flink a Flink connector to stateful functions

2021-09-27 Thread Christian Krudewig (Corporate Development)
Hello everyone, Currently I'm busy setting up a pipeline with Stateful Functions using a deployment of the standard docker image "apache/flink-statefun" to kubernetes. It has been going smoothly so far and I love the whole toolset. But now I want to add Egress modules for both Opensearch (= Ela

Re: Potential bug when assuming roles from AWS EKS when using S3 as RocksDb checkpoint backend?

2021-09-27 Thread Rommel Holmes
Hi, Ingo I was looking into the aws dependeencies, and from https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts-minimum-sdk.html the minimum required version to use the feature is 1.11.704. So 1.11.788 should be sufficient? Can you point it to me where it says that 1.1

Unable to connect to Mesos on mesos-appmaster.sh start

2021-09-27 Thread Javier Vegas
I am trying to start Flink 1.13.2 on Mesos following the instrucions in https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/deployment/resource-providers/mesos/ and using Marathon to deploy a Docker image with both the Flink and my binaries. My entrypoint for the Docker image is: /op

Re: could not stop with a Savepoint.

2021-09-27 Thread Roman Khachatryan
Hi, The above exception may be caused by both savepoint timing out and job termination timing out. To distinguish between these two cases, could you please check the status of the savepoint and the tasks in the Flink Web UI? IIUC, after you get this exception on client, you still have the job runn

Re: Flink Performance Issue

2021-09-27 Thread Arvid Heise
Hi Kamaal, I did a quick test with a local Kafka in docker. With parallelism 1, I can process 20k messages of size 4KB in about 1 min. So if you use parallelism of 15, I'd expect it to take it below 10s even with bigger data skew. What I recommend you to do is to start from scratch and just work

Re: How do I verify data is written to a JDBC sink?

2021-09-27 Thread Roman Khachatryan
Hi, Do I understand correctly, that long checkpointing times are caused by slow queries to the database? If so, async querying might resolve the issue on Flink side, but the unnecessary load on DB will remain. Instead, maybe you can use CDC to stream DB changes and send messages to RabbitMQ when

Re: Flink Performance Issue

2021-09-27 Thread Mohammed Kamaal
Hi Robert, I have removed all the business logic (keyBy and window) operator code and just had a source and sink to test it. The throughput is 20K messages in 2 minutes. It is a simple read from source (kafka topic) and write to sink (kafka topic). Don't you think 2 minutes is also not a better

Re: flink job : TPS drops from 400 to 30 TPS

2021-09-27 Thread JING ZHANG
Hi, About cpu cost, there are several methods: 1. Flink builtin metric: `Status.JVM.CPU.Load` [1] 2. Use `top` command on the target machine which deploys a suspect TaskManager 3. You could use flame graph to do deeper profiler of a JVM [2]. ... About RPC response, I'm not an expert on HBase, I'm

Re: flink job : TPS drops from 400 to 30 TPS

2021-09-27 Thread Ragini Manjaiah
please let me know how to check Does RPC response and CPU cost On Mon, Sep 27, 2021 at 1:19 PM JING ZHANG wrote: > Hi, > Since there is not enough information, you could first check the back > pressure status of the job [1], find the task which caused the back > pressure. > Then try to find out

Re: flink job : TPS drops from 400 to 30 TPS

2021-09-27 Thread JING ZHANG
Hi, Since there is not enough information, you could first check the back pressure status of the job [1], find the task which caused the back pressure. Then try to find out why the task processed data slowly, there are many reasons, for example the following reasons: (1) Does data skew exist, which