Re: flink cluster startup time

2022-03-30 Thread Gyula Fóra
Hi Frank! Thank you for the interest. As the others said, the flink-kubernetes-operator will give you quicker job/cluster startup time together with full support for the application mode. Production readiness is always relative. If I had to build a new production use-case I would not hesitate to

Flink SQL and data shuffling (keyBy)

2022-03-30 Thread Yaroslav Tkachenko
Hey everyone, I'm trying to use Flink SQL to construct a set of transformations for my application. Let's say the topology just has three steps: - SQL Source - SQL SELECT statement - SQL Sink (via INSERT) The sink I'm using (JDBC) would really benefit from data partitioning (by PK ID) to avoid c

Re: flink cluster startup time

2022-03-30 Thread Yang Wang
@Gyula Fóra is preparing the preview release (0.1) of flink-kubernetes-operator. It is now fully functional for application mode. You could give it a try and share feedback with the community. Release 1.0 aims to be production ready. And we still miss some important pieces (e.g. FlinkS

Re: flink docker image (1.14.4) unable to access other pods from flink program (job and task manager access is fine)

2022-03-30 Thread 胡伟华
Glad your issue was resolved. > On March 31, 2022, at 12:45 AM, Jin Yi wrote: > > i ended up debugging this down to a command execution timeout for the lettuce > (redis client) code rather than a connection timeout. we're actually able to > hit the redis server and port, but something wonky is going on w/ th

how to achieve sideOutputLateData() in FlinkSQL?

2022-03-30 Thread liuxiangcao
Hi Flink community, In the Flink DataStream Java API, users can get data that was discarded as late using WindowedStream.sideOutputLateData(OutputTag) (see [1]). I'm wondering what the best way is for users to achieve this in Flink SQL? For background, we are providing pure SQL deployment to our

Re: How to debug Metaspace exception?

2022-03-30 Thread John Smith
Also, if I manually cancel and restart the same job over and over, is it the same as if Flink was restarting a job due to failure? I.e., when I click "Cancel Job" in the UI, is the job completely unloaded, vs. when the job scheduler restarts a job for whatever reason? Like this I'll stop and res

Re: flink docker image (1.14.4) unable to access other pods from flink program (job and task manager access is fine)

2022-03-30 Thread Jin Yi
i ended up debugging this down to a command execution timeout for the lettuce (redis client) code rather than a connection timeout. we're actually able to hit the redis server and port, but something wonky is going on w/ the redis request (command) and reply loop which is meant to be synchronous.

Re: flink cluster startup time

2022-03-30 Thread Frank Dekervel
Hello David, Thanks for the information! So the two main takeaways from your email are to - Move to something supporting application mode. Is https://github.com/apache/flink-kubernetes-operator already ready enough for production deployments ? - wait for flink 1.15 thanks! Frank On

Re: Pyflink elastic search connectors

2022-03-30 Thread Sandeep Sharat
Thank you for your reply. Now I have a better understanding of it. On Wed, 30 Mar, 2022, 5:29 pm LuNing Wang, wrote: > Hi, > > The principle of the python datastream connector is interprocess > communication via py4j. I am blocked by a class loading problem, so I haven't > finished the PR about the

Re: How to debug Metaspace exception?

2022-03-30 Thread 胡伟华
> So if I run the same jobs in my dev env will I still be able to see the > similar dump? I think running the same job in dev should reproduce it, maybe you can give it a try. > If not I would have to wait at a low volume time to do it on production. > Also if I recall the dump is as big as t

Re: How to debug Metaspace exception?

2022-03-30 Thread John Smith
I have 3 task managers (see config below). There are a total of 10 jobs with 25 slots being used. The jobs are 100% ETL, i.e. they load JSON, transform it and push it to JDBC; only 1 job of the 10 is pushing to an Apache Ignite cluster. For jmap: I know that it will pause the task manager. So if I run th

Call for Presentations now open, ApacheCon North America 2022

2022-03-30 Thread Rich Bowen
[You are receiving this because you are subscribed to one or more user or dev mailing list of an Apache Software Foundation project.] ApacheCon draws participants at all levels to explore “Tomorrow’s Technology Today” across 300+ Apache projects and their diverse communities. ApacheCon showcases t

Re: Pyflink elastic search connectors

2022-03-30 Thread LuNing Wang
Hi, The principle of the python datastream connector is interprocess communication via py4j. I am blocked by a class loading problem, so I haven't finished the PR about the Python ES datastream connector yet. Compared with other connectors, ES is a little more troublesome. Because implementing of

Re: Naming sql_statment job

2022-03-30 Thread Zhanghao Chen
Hi Lan, You can just set the configuration 'pipeline.name' = '{job_name}'. You could do that via the -D parameter when you submit the job using the Flink CLI, or set it directly in the code (https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/).

Naming sql_statment job

2022-03-30 Thread lan tran
Hi team, When I was using the Table API to submit the SQL job using execute_query(), the name was created by Flink. However, I wonder whether there is a way to configure that name. I see that in the SQL Client they have this statement: SET 'pipeline.name' = '{job_name}'. I wonder if it is possible to execute this using exec

Re: Pyflink elastic search connectors

2022-03-30 Thread Sandeep Sharat
Hi, I am pretty much a novice in Python, so writing an entire wrapper in Python may be a tough nut to crack for me. But just out of curiosity, I want to ask why the connectors were not implemented in the Python API. Is it because of a very small number of use cases? or most