Re: JobManager Resident memory Always Increasing

2021-08-15 Thread Yun Tang
Hi Pranjul, First of all, you adopted the on-heap state backend HashMapStateBackend, which does not use native off-heap memory. Moreover, the JobManager would not initialize any keyed state backend instance. And if high availability is not enabled, JobManagerCheckpointStorage would also not use direct memory
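For reference, the on-heap setup described above corresponds to configuration roughly like the following (a minimal flink-conf.yaml sketch for Flink 1.13-era versions; the commented checkpoint directory is a placeholder):

```yaml
# On-heap state backend: keyed state lives on the JVM heap, not in native memory
state.backend: hashmap
# With no checkpoint directory configured, JobManagerCheckpointStorage is used
# and completed checkpoints are held on the JobManager heap. Setting a directory
# switches to filesystem checkpoint storage instead:
# state.checkpoints.dir: s3://my-bucket/checkpoints   # placeholder path
```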

JobManager Resident memory Always Increasing

2021-08-15 Thread Pranjul Ahuja
Hi, We are running the JobManager container with 1024MB, out of which 512MB is allocated to the heap. We observe that the JobManager container's resident memory is always increasing and never comes down. Heap usage remains constant and is not rising abruptly. Can anyone help here with where else
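For context, a JobManager memory layout like the one described can be expressed with Flink's (1.11+) memory options; this is a hedged sketch with the sizes from the message, and the metaspace value is an illustrative assumption:

```yaml
# Total memory for the JobManager process: heap + off-heap + metaspace + JVM overhead
jobmanager.memory.process.size: 1024m
# JVM heap portion only; resident memory can legitimately exceed this,
# since metaspace, direct memory and JVM overhead are outside the heap
jobmanager.memory.heap.size: 512m
# Illustrative assumption, not from the original message:
jobmanager.memory.jvm-metaspace.size: 256m
```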

Re: Scaling Flink for batch jobs

2021-08-15 Thread Caizhi Weng
Hi! > if I use parallelism of 2 or 4 - it takes the same time. It might be that there is no data in some parallel subtasks. You can click on the nodes in the Flink web UI and see whether this is the case for each subtask, or you can check the metrics of each operator. > if I don't increase parallelism and
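One common reason extra parallelism does not help: with hash-based key partitioning, the number of distinct keys bounds how many subtasks ever receive data. A self-contained sketch of the effect (a simplified stand-in for Flink's actual key-group assignment, not the real implementation):

```python
def assign_subtask(key: str, parallelism: int) -> int:
    # Simplified stand-in for Flink's hash/key-group mapping
    return hash(key) % parallelism

keys = ["a", "b"]  # only two distinct keys in the input
for parallelism in (2, 4):
    busy = {assign_subtask(k, parallelism) for k in keys}
    # At most len(keys) subtasks receive data, regardless of parallelism
    print(parallelism, len(busy))
```

With two keys, raising parallelism from 2 to 4 cannot put data on more than two subtasks, so wall-clock time stays the same.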

Re: Periodic output at end of stream

2021-08-15 Thread JING ZHANG
Hi Matthias, How often do you register the event-time timer? Is it registered per input record, or is a new timer re-registered after an event-time timer is triggered? Could you please provide your test case code? It would be very helpful for troubleshooting. Best wishes, JING ZHANG Matthias Broeche
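The distinction matters because Flink deduplicates timers registered for the same key and timestamp: registering the same target timestamp once per record still fires only once. A simplified, Flink-independent sketch of that semantics (the class and method names are illustrative, not the PyFlink API):

```python
class SimpleTimerService:
    """Toy model of per-key event-time timers (not the real Flink API)."""

    def __init__(self):
        self._timers = set()  # (key, timestamp) pairs; duplicates coalesce

    def register_event_time_timer(self, key, timestamp):
        self._timers.add((key, timestamp))

    def advance_watermark(self, watermark):
        # Fire all due timers, each (key, timestamp) at most once
        due = sorted(t for t in self._timers if t[1] <= watermark)
        self._timers -= set(due)
        return due

svc = SimpleTimerService()
for _ in range(3):                 # registered once per input record...
    svc.register_event_time_timer("k", 100)
print(svc.advance_watermark(100))  # ...but the timer fires only once
```

To get periodic output, the timer callback has to re-register a new timer for a later timestamp each time it fires.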

Re: Bug with PojoSerializer? java.lang.IllegalArgumentException: Can not set final double field Event.rating to java.lang.Integer

2021-08-15 Thread JING ZHANG
Hi Yu, The type of field `com.twosigma.research.options.optticks.core.types.Event.askPrice` in the original POJO class is Double, right? If so, then the next step is to find out why source.readUTF() is a `java.lang.Integer` instead of a Double. On Fri, Aug 13, 2021 at 9:05 PM, Nathan Yu wrote: > When the

Re: PyFlink performance and deployment issues

2021-08-15 Thread Dian Fu
Hi Wouter, I suspect that it's transferring the file venv.zip and so it may take some time. Does it get stuck there forever? Could you share some log files? Regards, Dian > On Aug 14, 2021 at 4:47 PM, Wouter Zorgdrager wrote: > > Hi all, > > I'm still dealing with the PyFlink deployment issue as described below

Re: ProcessFunctionTestHarnesses for testing Python functions

2021-08-15 Thread Dian Fu
Hi Bogdan, There is still no TestHarness for Python ProcessFunction. However, it seems a good idea to provide a TestHarness for Python functions such as ProcessFunction. I have created https://issues.apache.org/jira/browse/FLINK-23787 as a follow-up. Regards, Dian > On Aug 13, 2021 at 7:11 PM, Matthi

Re: UNION ALL on two LookupTableSources

2021-08-15 Thread JING ZHANG
Hi Yuval, From the exception stack, it seems the original SQL does a UNION ALL over two LookupTableSources and writes to a sink, right? I think it is reasonable to throw an exception, but maybe we could make the error message clearer. `LookupableTableSource` is used to look up data from external s
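For context, a table backed only by a LookupTableSource is probed per input row via a lookup join rather than scanned. A hedged sketch of the supported pattern in Flink SQL (table and column names are hypothetical):

```sql
-- Supported: a lookup join probes the JDBC-backed dimension table per row
SELECT o.id, d.v
FROM orders AS o
JOIN jdbc_dim FOR SYSTEM_TIME AS OF o.proc_time AS d
  ON o.id = d.id;
```

A plain `SELECT ... FROM jdbc_dim UNION ALL ...` requires a full scan, which a pure LookupTableSource cannot provide, hence the planner error.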

Flink installation using flinkoperator

2021-08-15 Thread Dhiru
Hello all, I read some articles and I think many companies using a Flink operator run a separate cluster for each job; can this be achieved using flinkk8soperator? Could you please share some pointers (video/git links) that could help me install it on AWS EKS? I have zookeeper/

UNION ALL on two LookupTableSources

2021-08-15 Thread Yuval Itzchakov
Hi, I'm trying to run a UNION ALL query on two LookupTableSource tables defined with JDBC. When attempting this, Flink complains that this is an unsupported feature: Caused by: org.apache.calcite.plan.RelOptPlanner$CannotPlanException: There are not enough rules to produce a node with desired properties

Re: Running Beam on a native Kubernetes Flink cluster

2021-08-15 Thread Gorjan Todorovski
Yes! That did it. I changed it to localhost and all works fine now. I was wrong in thinking it would try to connect to the Beam SDK worker from my client machine, hence I added the load balancer. Thank you Jan! On Sun, 15 Aug 2021 at 16:45, Jan Lukavský wrote: > Hi Gorjan, > > the address of localhost is

Re: Running Beam on a native Kubernetes Flink cluster

2021-08-15 Thread Jan Lukavský
Hi Gorjan, the address of localhost is hard-coded in the python worker pool (see [1]). There should be no need to set up a load balancer for the worker_pool; if you have it as another container in each TM pod, it should suffice to replace {beam_sdk_url} with 'localhost'. Each TM will then have
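A minimal sketch of that sidecar layout, following Beam's documented worker-pool pattern (image tags, container names and the port are assumptions, not from the thread):

```yaml
# TaskManager pod spec fragment: Beam Python SDK worker pool as a sidecar
containers:
  - name: flink-taskmanager
    image: flink:1.13          # assumption
    # Beam pipeline options then point at the sidecar:
    #   --environment_type=EXTERNAL --environment_config=localhost:50000
  - name: beam-worker-pool
    image: apache/beam_python3.8_sdk:2.31.0   # assumption
    args: ["--worker_pool"]
    ports:
      - containerPort: 50000
```

Because both containers share the pod's network namespace, 'localhost:50000' resolves to the worker pool from inside each TaskManager.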

Getting "NaN" in KafkaConsumer.records-lag-max in Flink

2021-08-15 Thread Pranjul Ahuja
Hi, I am not getting anything in the metric "flink.operator.KafkaConsumer.records-lag-max". I tried checking the value of the MBean "org.apache.flink.taskmanager.job.task.operator.KafkaConsumer.records-lag-max" using JMX, but I always get either NaN or 0. I have checked other MBeans su
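For context, `records-lag-max` follows the Kafka client's sampled Max metric semantics: until a fetch actually records a lag sample within the metric window, the reported value is NaN. A simplified model of that behavior (not the Kafka client code):

```python
import math

def records_lag_max(samples):
    # Simplified model of Kafka's sampled Max metric: NaN when no lag
    # samples have been recorded in the current metrics window
    return max(samples) if samples else float("nan")

print(records_lag_max([3, 7, 5]))        # 7
print(math.isnan(records_lag_max([])))   # True: no fetches recorded yet
```

So NaN typically means the consumer has not fetched any records yet (or the window expired), rather than a broken metrics pipeline.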