flink eventTime, lateness, maxoutoforderness

2017-12-16 Thread chen
eventTime, lateness, and maxOutOfOrderness are all about time. eventTime is the watermark time on the record. Is lateness measured in record time or real-world time? Is maxOutOfOrderness measured in record time or real-world time? dataStream.keyBy(row -> (String)row.getField(0)) .window(TumblingEventTimeWind
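
For what it's worth, both maxOutOfOrderness and allowedLateness are evaluated against the event-time watermark, never against wall-clock time. A minimal, self-contained Java sketch of the pattern discussed in this thread (the timestamps are taken from the test data below; the 10s/30s bounds are illustrative):

```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventTimeWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        env.fromElements(
                Tuple2.of("key1", 1483250640000L),
                Tuple2.of("key1", 1483250649000L),
                Tuple2.of("key1", 1483250642000L))
            // watermark = max event timestamp seen so far - 10s (event time, not wall clock)
            .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(10)) {
                    @Override
                    public long extractTimestamp(Tuple2<String, Long> e) {
                        return e.f1; // event timestamp carried in the record itself
                    }
                })
            .keyBy(0)
            .window(TumblingEventTimeWindows.of(Time.seconds(5)))
            // late records still update the window until watermark > window end + 30s, again in event time
            .allowedLateness(Time.seconds(30))
            .sum(1)
            .print();

        env.execute("event time sketch");
    }
}
```

Whether a record is late is decided purely by comparing its own timestamp with the current watermark, so "real-world" time never enters the picture.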

Re: flink eventTime, lateness, maxoutoforderness

2017-12-22 Thread chen
Hi Eron, Thanks for your help. Actually, I know maxOutOfOrderness and lateness are based on event time, but in my test they are not. Following are my code and test data. "key1|148325064|", "key1|1483250636000|", "key1|1483250649000|", "key1|1483250642000|", "

Re: flink eventTime, lateness, maxoutoforderness

2017-12-22 Thread chen
Code where maxOutOfOrderness takes effect: dataStream.keyBy(row -> (String)row.getField(0)) .window(TumblingEventTimeWindows.of(Time.seconds(5))) .fold(initRow(), new FoldFunction() { @Override public Row fold(Row ret, Ro

Re: flink eventTime, lateness, maxoutoforderness

2017-12-22 Thread chen
Thanks Gordon. Please see the reply. I ran the code, but the result is not what the doc explains.

Deep Copy in FLINK, Kryo Copy is used in the different operator

2018-02-11 Thread chen
Actually, our team has its own stream engine. We tested our engine against Flink and found that when we aggregate the stream data, the throughput decreases very fast. So we captured the stack and found a deep copy in Flink. Between different operators there will be org.apache.flink.api.java.typeutils.runti
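
One commonly suggested mitigation for copy-dominated pipelines, assuming (as here) the copies come from record handover between chained operators, is to enable object reuse; a minimal sketch:

```
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// Skip the defensive deep copy when records are handed between chained operators.
// Only safe if no user function keeps a reference to, or mutates, records it receives.
env.getConfig().enableObjectReuse();
```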

Re: Deep Copy in FLINK, Kryo Copy is used in the different operator

2018-02-21 Thread chen
@Aljoscha Krettek, Thanks Aljoscha, I will try this way to test the performance. The last 7 days were the Chinese Spring Festival; sorry for responding so late.

Re: Deep Copy in FLINK, Kryo Copy is used in the different operator

2018-02-21 Thread chen
@Gábor Gévay, Thanks Gábor. I use Flink in a production environment, but the performance is not good, especially in aggregation. At the beginning I used Java serialization, but it did not work well. Maybe I did not understand Flink very well then. I will try changing the serialization meth

Specially introduced Flink to chinese users in CNCC(China National Computer Congress)

2015-10-24 Thread Liang Chen
Specially introduced Flink to Chinese users at CNCC (China National Computer Congress); many people are interested in Flink and discussed it with me. In the future there may be more and more users from China participating in the Flink project and applying Flink to their big data systems :)

Re: Specially introduced Flink to chinese users in CNCC(China National Computer Congress)

2015-10-24 Thread Liang Chen

Re: Specially introduced Flink to chinese users in CNCC(China National Computer Congress)

2015-11-18 Thread Liang Chen
Two aspects are attracting them: 1. Flink uses Java, so it is easy for most of them to get started, and it is easier to maintain compared to Storm (as Clojure is difficult to maintain and fewer people know it). 2. Users really want a unified system supporting both streaming and batch processing.

Re: [ANNOUNCE] Introducing Apache Flink Taiwan User Group - Flink.tw

2016-01-04 Thread Liang Chen
Awesome! Does it also include Simplified Chinese, or only Traditional Chinese?

XGBoost4J: Portable Distributed XGboost in Flink

2016-03-14 Thread Tianqi Chen
Hi Flink Community: I am sending this email to let you know we just released XGBoost4J, which also runs on Flink. In short, XGBoost is a machine learning package that is used by more than half of the machine learning challenge winning solutions and is already widely used in industry. The distributed versi

flink streaming - window chaining example

2016-03-27 Thread Chen Bekor
Hi all! I'm just starting my way with Flink and I have a design question. I'm trying to aggregate incoming events (source: a Kafka topic) on a 10-minute tumbling window in order to calculate the incoming event rate (total per minute). I would like to take this window and perform an additional window
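
A window's output is itself a DataStream, so it can be keyed and windowed again. A minimal sketch of such chaining, with socketTextStream standing in for the Kafka source:

```
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
DataStream<String> events = env.socketTextStream("localhost", 9999); // stand-in for the Kafka source

// stage 1: events per minute (single global key, so one counter)
DataStream<Tuple2<String, Long>> perMinute = events
    .map(new MapFunction<String, Tuple2<String, Long>>() {
        @Override
        public Tuple2<String, Long> map(String e) {
            return Tuple2.of("all", 1L);
        }
    })
    .keyBy(0)
    .timeWindow(Time.minutes(1))
    .sum(1);

// stage 2: a second window over the first window's output (10-minute totals)
DataStream<Tuple2<String, Long>> perTenMinutes = perMinute
    .keyBy(0)
    .timeWindow(Time.minutes(10))
    .sum(1);
```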

printing datastream to the log - instead of stdout

2016-04-14 Thread Chen Bekor
s like I'm reinventing the wheel for something that should be very elementary. thanks!! Chen.

throttled stream

2016-04-16 Thread Chen Bekor
Is there a way to consume a Kafka stream using Flink with a predefined rate limit (e.g. 5 events per second)? We need this to respect some 3rd-party API rate limits, so even if we have a much larger throughput potential, we must control the consumption rate in order not to over
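
Flink has no built-in source rate limit here, but because backpressure propagates upstream, a blocking throttle operator placed right after the source works. A sketch using Guava's RateLimiter (you would add Guava as your own dependency; the per-subtask budget is the global limit divided by parallelism):

```
import com.google.common.util.concurrent.RateLimiter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class Throttler<T> extends RichMapFunction<T, T> {
    private final double permitsPerSecondPerSubtask;
    private transient RateLimiter limiter;

    public Throttler(double permitsPerSecondPerSubtask) {
        this.permitsPerSecondPerSubtask = permitsPerSecondPerSubtask;
    }

    @Override
    public void open(Configuration parameters) {
        limiter = RateLimiter.create(permitsPerSecondPerSubtask);
    }

    @Override
    public T map(T value) {
        limiter.acquire(); // blocks; backpressure slows the Kafka source upstream
        return value;
    }
}

// usage: a global limit of 5 events/s split across the operator's parallelism
// stream.map(new Throttler<>(5.0 / parallelism))
```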

fan out parallel-able operator sub-task beyond total slots number

2016-04-17 Thread Chen Qin
ows Exception { return true; } }).setParallelism(10).addSink(new SinkFunction() { @Override public void invoke(Long aLong) throws Exception { System.out.println(aLong); } }); env.execute("fan out 100 subtasks for 1s delay mapper"); } Thanks, Chen Qin

design question

2016-04-23 Thread Chen Bekor
Hi all, I have a stream of incoming object versions (objects change over time) and a requirement to fetch from a datastore the last known object version in order to link it with the id of the new version, so that I end up with a linked list of object versions. All object versions contain the sam
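
If the stream is keyed by the object's GUID, keyed state can hold the last seen version and replace the datastore lookup. A sketch with a hypothetical ObjectVersion POJO and its getters/setters:

```
import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class VersionLinker extends RichFlatMapFunction<ObjectVersion, ObjectVersion> {
    private transient ValueState<String> lastVersionId; // per-GUID, survives across events

    @Override
    public void open(Configuration parameters) {
        lastVersionId = getRuntimeContext().getState(
            new ValueStateDescriptor<>("lastVersionId", String.class));
    }

    @Override
    public void flatMap(ObjectVersion current, Collector<ObjectVersion> out) throws Exception {
        current.setPreviousVersionId(lastVersionId.value()); // null for the first version seen
        lastVersionId.update(current.getVersionId());
        out.collect(current);
    }
}

// usage: versions.keyBy(ObjectVersion::getGuid).flatMap(new VersionLinker())
```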

Re: design question

2016-04-24 Thread Chen Bekor
of the number of GUIDs that are eventually tracked. Whether > flink/stream processing is the most effective way to achieve your goal, I > can't say, but I am fairly confident that this particular aspect is not a > problem. > > On Sat, Apr 23, 2016 at 1:13 AM, Chen Bekor wrote:

s3 checkpointing issue

2016-05-03 Thread Chen Qin
scala 2.10, hadoop-aws 2.7.2, aws-java-sdk 1.7.4. Thanks, Chen. Attached is the full log shown on the web dashboard when the job was canceled. java.lang.RuntimeException: Error triggering a checkpoint as the result of receiving checkpoint barrier at org.apache.flink.streaming.runtime.tasks.StreamTask$2.onEvent

Re: s3 checkpointing issue

2016-05-04 Thread Chen Qin
Ufuk & Igor, Thanks for helping out! Yup, it fixed my issue. Chen On Wed, May 4, 2016 at 12:57 PM, Igor Berman wrote: > I think I've had this issue too and fixed it as Ufuk suggested > in core-site.xml > > something like > > fs.s3a.buffer.dir > /tmp > &

s3 statebackend user state size

2016-05-10 Thread Chen Qin
state split yet? Or would implementing the kvstate interface make Flink take care of the large-state problem? Chen java.lang.Exception: The slot in which the task was executed has been released. Probably loss of TaskManager eddbcda03a61f61210063a7cd2148b36 @ 10.163.98.18 - 24 slots - URL: akka.tcp:/

Re: s3 statebackend user state size

2016-05-10 Thread Chen Qin
Hi Ufuk, Yes, it does help with the RocksDB backend! After tuning the checkpoint frequency to align with network throughput, the "TaskManager released" and "job cancelled" issues are gone. Chen > On May 10, 2016, at 10:33 AM, Ufuk Celebi wrote: > >> On Tue, May 10, 2016 at 5:07 PM, Chen Qin wrote: &

rocksdb backend on s3 window operator checkpoint issue

2016-05-16 Thread Chen Qin
re? Or did I mess up with configuration? Thanks, Chen 2016-05-16 17:20:32,132 INFO org.apache.flink.runtime.state.filesystem.FsStateBackend - Initializing file state backend to URI s3://xxx/checkpoints/7e6abf126ce3d18f173733b34eda81a9 2016-05-

Failures due to inevitable high backpressure

2020-08-26 Thread Hubert Chen
Hello, My Flink application has entered into a bad state and I was wondering if I could get some advice on how to resolve the issue. The sequence of events that led to a bad state: 1. A failure occurs (e.g., TM timeout) within the cluster 2. The application successfully recovers from the last co

Native kubernetes setup failed to start job

2020-10-29 Thread Chen Liangde
I created a flink cluster in kubernetes following this guide: https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/native_kubernetes.html The job manager was running. When a job was submitted to the job manager, it spawned a task manager pod, but the task manager failed to c

Re: Native kubernetes setup failed to start job

2020-11-01 Thread Chen Liangde
%classpath% %jvmmem% %jvmopts% %logging% %class% %args%" \ -Dkubernetes.jobmanager.service-account=flink Thanks Liangde Chen On Mon, 2 Nov 2020 at 08:20, Yang Wang wrote: > Could you share the JobManager logs so that we could check whether it > received the > registration fr

Re: Flink[Python] questions

2021-01-14 Thread Shuiqiang Chen
Hi Dc, Thank you for your feedback. 1. Currently, only built-in types are supported in the Python DataStream API; however, you can apply a Row type to represent a custom Python class as a workaround, where field names stand for the names of member variables and field types stand for the types of member

Re: Declaring and using ROW data type in PyFlink DataStream

2021-01-14 Thread Shuiqiang Chen
Hi meneldor, The main cause of the error is a bug in `ctx.timer_service().current_watermark()`. At the beginning of the stream, when the first record comes into KeyedProcessFunction.process_element(), the current_watermark will be Long.MIN_VALUE at the Java side, while at the Python

Re: Declaring and using ROW data type in PyFlink DataStream

2021-01-17 Thread Shuiqiang Chen
>> [1] https://issues.apache.org/jira/browse/FLINK-20647 >> >> Best, >> Xingbo >> >> meneldor wrote on Fri, Jan 15, 2021 at 1:20 AM: >> >>> Thank you for the answer Shuiqiang! >>> I'm using the latest apache-flink version: >>>>

Re: Declaring and using ROW data type in PyFlink DataStream

2021-01-18 Thread Shuiqiang Chen
Hi meneldor, Actually, the return type of on_timer() must be the same as that of process_element(). It seems that the yield value of process_element() is missing the `timestamp` field. And the `output_type_info` has four field names but five field types. Could you align them? Best, Shuiqiang

Flink1.10 Cluster's Received is zero in the web when consume from Kafka0.11

2020-03-25 Thread Jim Chen
Hi all, When I use flink-connector-kafka-0.11 to consume Kafka 0.11, the cluster web UI's Received Records is always 0. However, the log is not empty. Can anyone help me?

Re: Flink1.10 Cluster's Received is zero in the web when consume from Kafka0.11

2020-03-25 Thread Jim Chen
chain after the source / > before the sink, or query the numRecordsOut metric for the source / > numRecordsIn metric for the sink via the WebUI metrics tab or REST API. > > On 25/03/2020 10:49, Jim Chen wrote: > > Hi, all > When I use flink-connector-kafka-0.11 consume Kafka0

How to consume kafka from the last offset?

2020-03-25 Thread Jim Chen
Hi all, I use flink-connector-kafka-0.11 to consume Kafka 0.11. In the KafkaConsumer params I set group.id and auto.offset.reset. In Flink 1.10 I set kafkaConsumer.setStartFromGroupOffsets(). Then I restarted the application and found that the offset did not resume from the last position. Anyone know wh
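
For reference, a sketch of wiring setStartFromGroupOffsets() together with checkpointing (broker, topic, and group names are placeholders); offsets are committed back to Kafka on checkpoint completion, so the group offsets only advance when checkpointing is enabled (or when Kafka's auto-commit is on):

```
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

public class KafkaStart {
    static DataStream<String> kafkaFromGroupOffsets(StreamExecutionEnvironment env) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092"); // placeholder
        props.setProperty("group.id", "my-group");             // placeholder

        FlinkKafkaConsumer011<String> consumer =
            new FlinkKafkaConsumer011<>("my-topic", new SimpleStringSchema(), props);
        consumer.setStartFromGroupOffsets(); // resume from the group's committed offsets

        // offsets are committed back to Kafka when a checkpoint completes,
        // so enable checkpointing for the committed offsets to advance
        env.enableCheckpointing(60_000);
        return env.addSource(consumer);
    }
}
```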

Re: How to consume kafka from the last offset?

2020-03-26 Thread Jim Chen
> Best Regards, > Dom. > > On Wed, Mar 25, 2020 at 11:19, Jim Chen wrote: > >> Hi all, >> I use flink-connector-kafka-0.11 to consume Kafka 0.11. In >> KafkaConsumer params, I set the group.id and auto.offset.reset. In >> Flink 1.10, set the kaf

When i use the Tumbling Windows, find lost some record

2020-03-26 Thread Jim Chen
Hi all, When I use tumbling windows, I find that some records are lost. My code is as follows: env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); env.addSource(FlinkKafkaConsumer011..) .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor(Time.minutes(3)) {
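
By default, once the watermark passes a window's end, later records for that window are silently dropped, which looks exactly like "lost" records. A sketch that makes them visible, assuming a hypothetical MyRecord type with getKey() and merge():

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

public class LateDataCapture {
    static final OutputTag<MyRecord> LATE = new OutputTag<MyRecord>("late-records") {};

    static SingleOutputStreamOperator<MyRecord> window(DataStream<MyRecord> stream) {
        return stream
            .keyBy(MyRecord::getKey)
            .window(TumblingEventTimeWindows.of(Time.minutes(1)))
            .allowedLateness(Time.minutes(3))    // keep windows updatable 3 more minutes
            .sideOutputLateData(LATE)            // capture records otherwise silently dropped
            .reduce((a, b) -> a.merge(b));
    }
}

// inspect what would have been lost:
// LateDataCapture.window(stream).getSideOutput(LateDataCapture.LATE).print();
```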

Re: How to consume kafka from the last offset?

2020-03-26 Thread Jim Chen
. > > On Thu, Mar 26, 2020 at 08:38, Jim Chen wrote: > >> Thanks! >> >> I made a mistake. I forgot to set auto.offset.reset=false. It's my >> fault. >> >> Dominik Wosiński wrote on Wed, Mar 25, 2020 at 6:49 PM: >> >>> Hi Jim, >>&g

Re: Debug Slowness in Async Checkpointing

2020-04-25 Thread Chen Q
* when the upload to the remote file system starts/completes * when the ack is sent to the checkpoint coordinator. For now we only see checkpoint timeouts in the Flink UI when a task can't finish in time, which seems limited for further debugging. Chen On 4/24/20 10:52 PM, Congxian Qiu wrote: Hi If the bottleneck i

Re: Flink consuming rate increases slowly

2020-05-10 Thread Chen Qin
Hi Eyal, It's unclear what the warmup phase does in your use case. Usually we see Flink start consuming at a high rate and then drop to a point downstream can handle. Thanks, Chen > On May 10, 2020, at 12:25 AM, Eyal Pe'er wrote: > > Hi all, > Lately I've added more resources to

Re: Flink operator throttle

2020-05-17 Thread Chen Qin
/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/internals/AbstractFetcher.java#L347 Best, Chen On Sat, May 16, 2020 at 11:48 PM Benchao Li wrote: > Hi, > > > If I want to use the rate limiter in other connectors, such as Kafka > sink, ES sin

Flink Kafka Connector Source Parallelism

2020-05-27 Thread Chen, Mason
Hi all, I’m currently trying to understand Flink’s Kafka Connector and how parallelism affects it. So, I am running the flink playground click count job and the parallelism is set to 2 by default. However, I don’t see the 2nd subtask of the Kafka Connector sending any records: https://imgur.c

Re: Flink Kafka Connector Source Parallelism

2020-05-27 Thread Chen, Mason
it's not actually doing anything. So it seems a general statement can be made that (# kafka partitions) >= (# parallelism of flink kafka source)… well, I guess you could have more parallelism than Kafka partitions, but the extra subtasks will not be doing anything. From: "Chen, Maso

About Flink SQL tumbling window not outputting the result set

2020-05-28 Thread steven chen
The data arrives every time and is aggregated, but why are the insert results not saved into MySQL? Is it a problem with the SQL, or something else? Hoping someone can help. CREATE TABLE user_behavior ( itemCode VARCHAR, ts BIGINT COMMENT 'timestamp', t as TO_TIMESTAMP(FROM_UNIXTIME(ts / 1000, 'yyyy-MM-dd HH:mm:ss')), proctime as PROCTIME(), WATERMARK FOR t as t - INTERVAL '5' SECOND ) WITH ( 'connector.type' = '

HELP ! ! ! When using Flink1.10 to define table with the string type, the query result is null

2020-07-05 Thread Jim Chen
Hi everyone! I use Flink 1.10 to define a table, and I want to define a JSON array as the string type. But the query result is null when I execute the program. The detailed code is as follows: package com.flink; import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; import

Re: Kafka Rate Limit in FlinkConsumer ?

2020-07-06 Thread Chen Qin
timestamp) throws Exception { subtaskRateLimiter.acquire(); if (record == null) { consumerMetricGroup.counter("invalidRecord").inc(); } super.emitRecordWithTimestamp(record, partitionState, offset, timestamp); } }; } Thanks, Chen Pinterest Data > On Jul 6

HELP!!! Flink1.10 sql insert into HBase, error like validateSchemaAndApplyImplicitCast

2020-07-15 Thread Jim Chen
Hi, I use Flink 1.10.1 SQL to connect to HBase 1.4.3. When inserting into HBase, it reports an error like validateSchemaAndApplyImplicitCast, meaning that the query schema and the sink schema are inconsistent. Mainly Row(EXPR$0) in the query schema, which are all expressions, while the sink schema is Row(device_id). I don't k

records-lag-max

2020-07-20 Thread Chen, Mason
Hi all, I am having some trouble with the lag metric from the kafka connector. The gauge value is always reported as NaN although I’m producing events first and then starting the flink job. Anyone know how to fix this? ``` # TYPE flink_taskmanager_job_task_operator_records_lag_max gauge flink_t

Re: records-lag-max

2020-07-20 Thread Chen, Mason
I removed an unnecessary `.keyBy()` and I’m getting the metrics again. Is this a potential bug? From: "Chen, Mason" Date: Monday, July 20, 2020 at 12:41 PM To: "user@flink.apache.org" Subject: records-lag-max Hi all, I am having some trouble with the lag metric from the

RE: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-22 Thread Chen Qin
Could you be more specific about what "failed message" means here? In general, a side output can do something like: def process(ele) { try { biz } catch { sideOutput(ele + exception context) } } process(func).getSideOutput(tag).addSink(kafkaSink) Thanks, Chen From: Eleanore Jin Sent: Wednesday, July
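
Spelled out in Java, the side-output pattern sketched above looks roughly like this (Result, parse(), and the two Kafka sinks are placeholders):

```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class DlqRouting {
    // "dead-letter" records carry the raw message plus exception context
    static final OutputTag<String> DLQ = new OutputTag<String>("dead-letter") {};

    static SingleOutputStreamOperator<Result> route(DataStream<String> input) {
        return input.process(new ProcessFunction<String, Result>() {
            @Override
            public void processElement(String value, Context ctx, Collector<Result> out) {
                try {
                    out.collect(parse(value));        // happy path -> main sink
                } catch (Exception e) {
                    ctx.output(DLQ, value + "|" + e); // failed message -> DLQ side output
                }
            }
        });
    }
    // parse(...) and Result stand in for your deserialization/business logic
}

// usage:
// SingleOutputStreamOperator<Result> ok = DlqRouting.route(input);
// ok.addSink(mainKafkaSink);
// ok.getSideOutput(DlqRouting.DLQ).addSink(dlqKafkaSink);
```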

Re: Job Cluster on Kubernetes with PyFlink

2020-07-28 Thread Shuiqiang Chen
Hi Wojciech, Currently, we are not able to deploy a job cluster for PyFlink jobs on kubernetes, but it will be supported in release-1.12. Best, Shuiqiang

Re: Job Cluster on Kubernetes with PyFlink

2020-07-28 Thread Shuiqiang Chen
to be "org.apache.flink.client.python.PythonDriver". 5. Adding '-pym {the_entry_module_of_your_pyflink_job}' to [job arguments]. Best, Shuiqiang Shuiqiang Chen wrote on Tue, Jul 28, 2020 at 5:55 PM: > Hi Wojciech, > > Currently, we are not able to deploy a job cluster for PyFlink jobs on &g

Re: [DISCUSS] FLIP-133: Rework PyFlink Documentation

2020-08-02 Thread Shuiqiang Chen
Hi Jincheng, Thanks for the discussion. +1 for the FLIP. Well-organized documentation will greatly improve the efficiency and experience for developers. Best, Shuiqiang Hequn Cheng wrote on Sat, Aug 1, 2020 at 8:42 AM: > Hi Jincheng, > > Thanks a lot for raising the discussion. +1 for the FLIP. > > I thin

Re: Python DataStream API Questions -- Java/Scala Interoperability?

2021-03-03 Thread Shuiqiang Chen
Hi Kevin, Thank you for your questions. Currently, users are not able to define custom sources/sinks in Python. This is a great feature that could unify end-to-end PyFlink application development in Python, and it is a large topic that we have no plan to support at present. As you have noticed tha

Re: PyFlink Connection Refused to Kubernetes Session Cluster

2021-03-04 Thread Shuiqiang Chen
Hi Robert, It seems the retrieved address of the JobManager is a cluster-internal IP that can only be accessed inside the cluster. As you said, you might need to create an ingress to expose the JobManager service so that the client can access it from outside the k8s cluster. Best, Shuiqiang Robert

Re: Convert BIGINT to TIMESTAMP in pyflink when using datastream api

2021-03-04 Thread Shuiqiang Chen
Hi Shilpa, There might be something wrong when defining the rowtime field with the connector descriptor; it's recommended to use SQL DDL to create tables and do queries with the Table API. Best, Shuiqiang Shilpa Shankar wrote on Thu, Mar 4, 2021 at 9:29 PM: > Hello, > > We are using pyflink's datastream api v1.

Re: Running Pyflink job on K8s Flink Cluster Deployment?

2021-03-06 Thread Shuiqiang Chen
Hi Kevin, You are able to run PyFlink applications on a Kubernetes cluster; both native k8s mode and resource definition mode are supported since release-1.12.0. Currently, Python and PyFlink are not enabled in the official Flink docker image, so you might need to build a custom image with Python and P

Re: Running Pyflink job on K8s Flink Cluster Deployment?

2021-03-06 Thread Shuiqiang Chen
$ kubectl create -f flink-configuration-configmap.yaml $ kubectl create -f jobmanager-service.yaml # Create the deployments for the cluster $ kubectl create -f job-manager.yaml $ kubectl create -f task-manager.yaml Best, Shuiqiang Shuiqiang Chen wrote on Sat, Mar 6, 2021 at 5:10 PM: > Hi Kevin, > > You

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-12 Thread Shuiqiang Chen
Hi Robert, The Kafka connector has been provided in the Python DataStream API since release-1.12.0. The documentation for it is lacking; we will fill it in soon. The following code shows how to apply a KafkaConsumer and a KafkaProducer: ``` env = StreamExecutionEnvironment.get_execution_environment() env.set_

Re: Python StreamExecutionEnvironment from_collection Kafka example

2021-03-12 Thread Shuiqiang Chen
nks. > > On Fri, Mar 12, 2021 at 1:48 PM Shuiqiang Chen > wrote: > >> Hi Robert, >> >> Kafka Connector is provided in Python DataStream API since >> release-1.12.0. And the documentation for it is lacking, we will make it up >> soon. >> >>

Re: Python DataStream API Questions -- Java/Scala Interoperability?

2021-03-14 Thread Shuiqiang Chen
of the custom sink/source. > > What's the best way to pass arguments to the constructor? > > On Fri, Mar 5, 2021 at 4:29 PM Kevin Lam wrote: > >> Thanks Shuiqiang! That's really helpful, we'll give the connectors a try. >> >> On Wed, Mar 3, 202

Re: Working with DataStreams of Java objects in Pyflink?

2021-03-15 Thread Shuiqiang Chen
Hi Kevin, Currently, POJO types are not supported in the Python DataStream API because it is hard to deal with the conversion between Python objects and Java objects. Maybe you can use a Row type to represent the POJO class, such as Types.ROW_NAMED(['id', 'created_at', 'updated_at'], [Types.LONG(), Types.LONG(),

Re: Pyflink tutorial output

2021-03-23 Thread Shuiqiang Chen
Hi Robert, Have you tried exploring the /tmp/output directory in the task manager pods on your Kubernetes cluster? The StreamingFileSink creates the output directory on the host of the task manager in which the sink tasks are executed. Best, Shuiqiang Robert Cullen wrote on Wed, Mar 24, 2021 at 2:48 AM: > I'm

Re: PyFlink DataStream Example Kafka/Kinesis?

2021-03-24 Thread Shuiqiang Chen
Hi Kevin, The Kinesis connector is not supported yet in the Python DataStream API. We will add it in the future. Best, Shuiqiang Bohinski, Kevin wrote on Thu, Mar 25, 2021 at 5:03 AM: > Is there a kinesis example? > > > > From: "Bohinski, Kevin" > Date: Wednesday, March 24, 2021 at 4:40 PM > To: "Bohinski, Ke

Re: PyFlink DataStream Example Kafka/Kinesis?

2021-03-24 Thread Shuiqiang Chen
; Bin > > On Wed, Mar 24, 2021 at 7:30 PM Shuiqiang Chen > wrote: > >> Hi Kevin, >> >> Kinesis connector is not supported yet in Python DataStream API. We will >> add it in the future. >> >> Best, >> Shuiqiang >> >> Bohinski,

Re: PyFlink DataStream Example Kafka/Kinesis?

2021-03-24 Thread Shuiqiang Chen
mplement the > connector over the weekend? > > I am interested in contributing to Flink, and I think this can be a good > starting point for me > > Best > Bin > > On Wed, Mar 24, 2021 at 7:49 PM Shuiqiang Chen > wrote: > >> I have just created the jira >> h

Re: [EXTERNAL] Re: PyFlink DataStream Example Kafka/Kinesis?

2021-03-25 Thread Shuiqiang Chen
s contribution as you like. Best, Shuiqiang Bohinski, Kevin wrote on Thu, Mar 25, 2021 at 12:00 PM: > Hi Shuiqiang, > > > > Thanks for letting me know. Feel free to send any beginner level > contributions for this effort my way 😊 . > > > > Best, > > kevin > > > > From:

Long checkpoint duration for Kafka source operators

2021-05-13 Thread Hubert Chen
Hello, I have an application that reads from two Kafka sources, joins them, and produces to a Kafka sink. The application is experiencing long end to end checkpoint durations for the Kafka source operators. I'm hoping I could get some direction in how to debug this further. Here is a UI screensho

Prometheus Reporter Enhancement

2021-05-18 Thread Mason Chen
Hi all, Would people appreciate enhancements to the Prometheus reporter to include extra labels via configuration, as a contribution to Flink? I can see it being useful for adding labels that are not job specific but infra specific. The change would be nicely integrated with Flink's Conf

Re: Prometheus Reporter Enhancement

2021-05-19 Thread Mason Chen
/jira/browse/FLINK-17495 > <https://issues.apache.org/jira/browse/FLINK-17495> > > On 5/18/2021 8:16 PM, Andrew Otto wrote: >> Sounds useful! >> >> On Tue, May 18, 2021 at 2:02 PM Mason Chen > <mailto:mason.c...@apple.com>> wrote: >> Hi all, >

Re: Long checkpoint duration for Kafka source operators

2021-05-20 Thread Hubert Chen
uests to inject the barrier. After I increased the JM resources, the long start delays disappeared. On Thu, May 13, 2021 at 1:56 PM Hubert Chen wrote: > Hello, > > I have an application that reads from two Kafka sources, joins them, and > produces to a Kafka sink. The application is

Flink Metrics Naming

2021-05-28 Thread Mason Chen
Can anyone give insight as to why Flink allows 2 metrics with the same "name"? For example, getRuntimeContext().getMetricGroup().addGroup("group", "group1").counter("myMetricName"); and getRuntimeContext().getMetricGroup().addGroup("other_group", "other_group1").counter("myMetricName"); are totally valid. It seems that it has l
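
Concretely, the name alone is not the metric's identity; the enclosing groups are part of the scope, so the two counters below are distinct. A sketch inside a RichFunction:

```
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

public class TwoCounters extends RichMapFunction<String, String> {
    private transient Counter a;
    private transient Counter b;

    @Override
    public void open(Configuration parameters) {
        // identifier ends in ...group.group1.myMetricName
        a = getRuntimeContext().getMetricGroup()
                .addGroup("group", "group1").counter("myMetricName");
        // identifier ends in ...other_group.other_group1.myMetricName
        b = getRuntimeContext().getMetricGroup()
                .addGroup("other_group", "other_group1").counter("myMetricName");
    }

    @Override
    public String map(String value) {
        a.inc();
        b.inc();
        return value;
    }
}
```

Reporters such as Prometheus flatten the scope into labels, which is why only the trailing name looks duplicated.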

Re: Flink Metrics Naming

2021-06-01 Thread Mason Chen
structure and that's why the >> path is exposed via labels if I am not mistaken. So long story short, what >> you are seeing is a combination of how Flink organizes metrics and what can >> be reported to Prometheus. >> >> I am also pulling in Chesnay who is mor

Re: Flink Metrics Naming

2021-06-01 Thread Mason Chen
Upon further inspection, it seems like the user scope is not universal (i.e. comes through the connectors and not UDFs (like rich map function)), but the question still stands if the process makes sense. > On Jun 1, 2021, at 10:38 AM, Mason Chen wrote: > > Makes sense. We are

Re: Prometheus Reporter Enhancement

2021-06-02 Thread Mason Chen
ternal re-structuring that we'd > like to do before extending the metric system further, because we've been > tacking on more and more things since it was released in 1.3.0 (!!!) but > barely refactored things to properly fit together. > > On 5/20/2021 12:58 AM, Mason Chen wro

subscribe

2021-06-03 Thread Boyang Chen

Re: Flink exported metrics scope configuration

2021-06-03 Thread Mason Chen
Hi Kai, You can use the excluded-variables config for the reporter. metrics.reporter.<reporter>.scope.variables.excludes: (optional) A semi-colon (;) separated list of variables that should be ignored by tag-based reporters (e.g., Prometheus, InfluxDB). https://ci.apache.org/projects/flink/flink-docs-rel

unsubscribe

2021-06-21 Thread steven chen
unsubscribe

Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Zili Chen
Congrats Hequn! Best, tison. Jeff Zhang wrote on Wed, Aug 7, 2019 at 5:14 PM: > Congrats Hequn! > > Paul Lam wrote on Wed, Aug 7, 2019 at 5:08 PM: > >> Congrats Hequn! Well deserved! >> >> Best, >> Paul Lam >> >> On Aug 7, 2019, at 16:28, jincheng sun wrote: >> >> Hi everyone, >> >> I'm very happy to announce that Hequn accepted th

Re: some slots are not be available,when job is not running

2019-08-09 Thread Zili Chen
Hi, Could you attach the stack trace in the exception or the relevant logs? Best, tison. pengcheng...@bonc.com.cn wrote on Fri, Aug 9, 2019 at 4:55 PM: > Hi, > > Why are some slots unavailable? > > My cluster mode is standalone, and the high-availability mode is zookeeper. > task.cancellation.timeout: 0 > some slots ar

Re: Why available task slots are not leveraged for pipeline?

2019-08-12 Thread Zili Chen
Hi Cam, If you set parallelism to 60, then you would make use of all 60 slots you have, and in your case each slot executes a chained operator containing 13 tasks. It is not the case that one slot executes at least 60 sub-tasks. Best, tison. Cam Mach wrote on Mon, Aug 12, 2019 at 7:55 PM: > Hi Zhu and Abhishek, >

Re: How can I pass multiple java options in standalone mode ?

2019-08-12 Thread Zili Chen
Hi Vishwas, Replacing ',' with ' ' (space) should work. Best, tison. Vishwas Siravara wrote on Tue, Aug 13, 2019 at 6:50 AM: > Hi guys, > I have this entry in the flink-conf.yaml file for JVM options. > env.java.opts: "-Djava.security.auth.login.config={{ flink_installed_dir > }}/kafka-jaas.conf,-Djava.security.kr

Re: JDBC sink for streams API

2019-08-12 Thread Zili Chen
Hi Eduardo, JDBCSinkFunction is a package-private class which you can make use of through JDBCAppendTableSink. A typical statement could be JDBCAppendTableSink.builder() . ... .build() .consumeDataStream(upstream); Also JDBCUpse
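
Filled in, the builder chain might look like the following sketch (driver, URL, and query are placeholders; parameter types must line up with the '?' placeholders):

```
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.io.jdbc.JDBCAppendTableSink;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.types.Row;

public class JdbcSinkSketch {
    static void attach(DataStream<Row> rowStream) {
        JDBCAppendTableSink sink = JDBCAppendTableSink.builder()
            .setDrivername("org.postgresql.Driver")                   // placeholder driver
            .setDBUrl("jdbc:postgresql://db:5432/mydb")                // placeholder URL
            .setQuery("INSERT INTO events (id, payload) VALUES (?, ?)")
            .setParameterTypes(Types.INT, Types.STRING)                // one per '?'
            .build();

        sink.consumeDataStream(rowStream); // wraps the package-private JDBCSinkFunction
    }
}
```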

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Zili Chen
Congratulations Andrey! Best, tison. Till Rohrmann wrote on Wed, Aug 14, 2019 at 9:26 PM: > Hi everyone, > > I'm very happy to announce that Andrey Zagrebin accepted the offer of the > Flink PMC to become a committer of the Flink project. > > Andrey has been an active community member for more than 15 months

Weekly Community Update 2019/33, Personal Chinese Version

2019-08-18 Thread Zili Chen
Hi community, Inspired by the weekly community update thread, and in view of the growth of the Chinese community and the somewhat different concerns of its members, I'd like to start a personally maintained Chinese version of the Weekly Community Update. Right now I have posted these weeks' updates on this page[1] whe

Re: Recovery from job manager crash using check points

2019-08-19 Thread Zili Chen
Hi Min, I guess you use standalone high-availability, and when a TM fails, the JM can recover the job from an in-memory checkpoint store. However, when the JM fails, since you don't persist state on an HA backend such as ZooKeeper, even if the JM is relaunched by the YARN RM or superseded by a standby, the new one knows not

Re: Weekly Community Update 2019/33, Personal Chinese Version

2019-08-19 Thread Zili Chen
at 13:27, Paul Lam wrote: > >> Hi Tison, >> >> Big +1 for the Chinese Weekly Community Update. The content is >> well-organized, and I believe it would be very helpful for Chinese users to >> get an overview of what’s going on in the community. >

[SURVEY] How do you use high-availability services in Flink?

2019-08-21 Thread Zili Chen
Hi guys, We want to have an accurate idea of how users actually use high-availability services in Flink, especially how you customize high-availability services by HighAvailabilityServicesFactory. Basically there are standalone impl., zookeeper impl., embedded impl. used in MiniCluster, YARN impl

Re: [SURVEY] How do you use high-availability services in Flink?

2019-08-21 Thread Zili Chen
In addition, FLINK-13750[1] also likely introduces a breaking change on high-availability services, so you who might be affected by the change are highly encouraged to share your cases :-) Best, tison. [1] https://issues.apache.org/jira/browse/FLINK-13750 Zili Chen wrote on Wed, Aug 21, 2019 at 3:32 PM: >

Re: Recovery from job manager crash using check points

2019-08-21 Thread Zili Chen
tates with a save point directory? e.g. > > ./bin/flink run myJob.jar -s savepointDirectory > > > > Regards, > > > > Min > > > > From: Zili Chen [mailto:wander4...@gmail.com] > Sent: Tuesday, 20 August 2019 04:16 > To: Biao Liu > Cc: Ta

Re: TaskManager not connecting to ResourceManager in HA mode

2019-08-21 Thread Zili Chen
Hi Aleksandar, based on your log: taskmanager_1 | 2019-08-22 00:05:03,713 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting to ResourceManager akka.tcp://flink@jobmanager:6123/user/jobmanager() . taskmanager_1 | 2019-08-22 00:05:04

Re: TaskManager not connecting to ResourceManager in HA mode

2019-08-21 Thread Zili Chen
hread.html/c0cc07197e6ba30b45d7709cc9e17d8497e5e3f33de504d58dfcafad@%3Cuser.flink.apache.org%3E Zili Chen wrote on Thu, Aug 22, 2019 at 10:16 AM: > Hi Aleksandar, > > based on your log: > > taskmanager_1 | 2019-08-22 00:05:03,713 INFO > org.apache.flink.runtime.taskexecutor.TaskExecutor - Connecting > to ResourceManager >

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Zili Chen
Congratulations! Thanks Gordon and Kurt for being the release managers. Thanks to all the contributors who have made this release possible. Best, tison. Jark Wu wrote on Thu, Aug 22, 2019 at 8:11 PM: > Congratulations! > > Thanks Gordon and Kurt for being the release managers, and thanks a lot to > all the contr

Re: TaskManager not connecting to ResourceManager in HA mode

2019-08-22 Thread Zili Chen
mble now starts fine, I’m working on ironing out the bugs > now. > > I’ll participate in the survey too! > > On Aug 21, 2019, at 7:18 PM, Zili Chen wrote: > > Besides, would you like to participant our survey thread[1] on > user list about "How do you use high-availa

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
Hi Gavin, I also found a problem in the shell: if the directory contains whitespace, the final command to run is incorrect. Could you check the final command to be executed? FYI, here is the ticket[1]. Best, tison. [1] https://issues.apache.org/jira/browse/FLINK-13827 Gavin Lee wrote on Fri, Aug 23, 2019 at

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
ectory under root folder of flink distribution. > > > On Fri, Aug 23, 2019 at 4:10 PM Zili Chen wrote: > >> Hi Gavin, >> >> I also find a problem in shell if the directory contain whitespace >> then the final command to run is incorrect. Could you check the >>

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
ss was removed in flink-dist_2.12-1.9 >> jar >> file. >> Seems broken here for scala 2.12, right? >> >> [1] >> >> http://mirror.bit.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.12.tgz >> >> On Fri, Aug 23, 2019 at 4:37 PM Zili Chen

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-26 Thread Zili Chen
Hi Oytun, I think it is intended to publish flink-queryable-state-client-java without the scala suffix since it is scala-free. An artifact without the scala suffix has been published [2]. See also [1]. Best, tison. [1] https://issues.apache.org/jira/browse/FLINK-12602 [2] https://mvnrepository.com/artifact

Re: [SURVEY] How do you use high-availability services in Flink?

2019-08-28 Thread Zili Chen
icesFactory > FileSystemSubmittedJobGraphStore implements SubmittedJobGraphStore > > Testing so far proved that bringing down a JobManager and bringing it back > up does indeed restore all the running jobs. Job creation/destruction also > works. > > Hope this helps! > > T

Re: [ANNOUNCE] Kostas Kloudas joins the Flink PMC

2019-09-06 Thread Zili Chen
Congrats Klou! Best, tison. Till Rohrmann wrote on Fri, Sep 6, 2019 at 9:23 PM: > Congrats Klou! > > Cheers, > Till > > On Fri, Sep 6, 2019 at 3:00 PM Dian Fu wrote: > >> Congratulations Kostas! >> >> Regards, >> Dian >> >> > On Sep 6, 2019, at 8:58 PM, Wesley Peng wrote: >> > >> > On 2019/9/6 8:55 PM, Fabian Hueske w

Re: Post-processing batch JobExecutionResult

2019-09-06 Thread Zili Chen
Hi spoganshev, If you deploy in per-job mode, OptimizerPlanEnvironment would be used, and thus, as you pointed out, there is _no_ way to post-process the JobExecutionResult. We in the community regard this situation as a shortcoming and are working on an enhancement to enable you to get a JobClient as r

Re: Post-processing batch JobExecutionResult

2019-09-06 Thread Zili Chen
Besides, if you submit the job via the Jar Run REST API, OptimizerPlanEnvironment is also used. So again, _no_ post-processing support at the moment. Zili Chen wrote on Sat, Sep 7, 2019 at 12:51 AM: > Hi spoganshev, > > If you deploy in per-job mode, OptimizerPlanEnvironment would be used, a
