Re: [ANNOUNCE] Apache Flink CDC 3.1.1 released

2024-06-18 Thread Paul Lam
Well done! Thanks a lot for your hard work! Best, Paul Lam > 2024年6月19日 09:47,Leonard Xu 写道: > > Congratulations! Thanks Qingsheng for the release work and all contributors > involved. > > Best, > Leonard > >> 2024年6月18日 下午11:50,Qingsheng Ren 写道: >> &

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Paul Lam
Finally! Thanks to all! Best, Paul Lam > 2023年10月27日 03:58,Alexander Fedulov 写道: > > Great work, thanks everyone! > > Best, > Alexander > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser > wrote: > >> Thank you all who have contributed! >> >

Re: Flink operator task opens threads internally

2023-08-02 Thread Paul Lam
Hi Kamal, It’s okay if you don’t mind the data order. But it’s not very commonly seen to accept client sockets from Flink jobs, as the socket server address is dynamic and requires service discovery. Would you like to share more about the background? Best, Paul Lam > 2023年8月3日 10:26,Ka

[Flink K8s Operator] Trigger nonce missing for manual savepoint info

2023-07-12 Thread Paul Lam
): ``` Last Savepoint: Location: hdfs://.../savepoints/61dfcb2fd7946a0001827c55/savepoint-31c74e-3347299b84ec Time Stamp:1689150184799 Trigger Type: MANUAL ``` Do you have any ideas about what’s causing this? Thanks! Best, Paul Lam

Re: [Flink K8s Operator] Automatic cleanup of terminated deployments

2023-05-21 Thread Paul Lam
Hi Gyula, Thank you and sorry for the late response. My use case is that users may run finite jobs (either batch jobs or finite stream jobs), leaving a lot of deprecated flink deployments around. I’ve filed a ticket[1]. [1] https://issues.apache.org/jira/browse/FLINK-32143 Best, Paul Lam

[Flink K8s Operator] Automatic cleanup of terminated deployments

2023-05-14 Thread Paul Lam
Hi all, Currently, if a job turns into terminated status (e.g. FINISHED or FAILED), the flinkdeployment remains until a manual cleanup is performed. I went through the docs but did not find any way to clean them up automatically. Am I missing something? Thanks! Best, Paul Lam

Re: [Flink K8s Operator] flinkdep stays in DEPLOYED and never turns STABLE

2022-12-06 Thread Paul Lam
Thanks a lot for your input, Gyula! Best, Paul Lam > 2022年12月6日 18:38,Gyula Fóra 写道: > > Hi! > > The stable state is not marked in the reconciliation state field but instead > using the last stable spec field. Deployed simply means that something is > running :) >

[Flink K8s Operator] flinkdep stays in DEPLOYED and never turns STABLE

2022-12-06 Thread Paul Lam
I missed something? Thanks a lot! Best, Paul Lam

Re: [ANNOUNCE] Apache Kyuubi (Incubating) released 1.5.1-incubating

2022-04-21 Thread Paul Lam
Hi Fu, I think the mail might have been sent to Flink user mail list by mistake? Best, Paul Lam > 2022年4月22日 11:00,Fu Chen 写道: > > Hi all, > > The Apache Kyuubi (Incubating) community is pleased to announce that > Apache Kyuubi (Incubating) 1.5.1-incubating has been rel

Re: Setting S3 as State Backend in SQL Client

2022-03-16 Thread Paul Lam
-docs-release-1.14/docs/dev/table/config/ <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/config/> Best, Paul Lam > 2022年3月15日 19:21,dz902 写道: > > Just tried editing flink-conf.yaml and it seems SQL Client does not respect > that also. Is this an

Re: Inaccurate checkpoint trigger time

2022-02-08 Thread Paul Lam
Hi Robert & Yun, Thanks a lot for your helpful analysis! I’ve confirmed that it’s the checkpoint cleanup problem that caused the inaccurate checkpoint trigger time. Best, Paul Lam > 2022年1月30日 19:45,Yun Tang 写道: > > Hi Paul, > > I think Robert's idea might be righ

Re: Inaccurate checkpoint trigger time

2022-01-27 Thread Paul Lam
? Thanks a lot! [1] https://gist.github.com/link3280/5072a054a43b40ba28891837a8fdf995 [2] https://github.com/apache/flink/blob/90e850301e672fc0da293abc55eb446f7ec68ffa/flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java#L743 Best, Paul Lam > 2021年11月23日

Re: Tuning akka.ask.timeout

2022-01-24 Thread Paul Lam
about the cause by profiling. Best, Paul Lam > 2022年1月21日 09:56,Guowei Ma 写道: > > Hi, Paul > > Would you like to share some information such as the Flink version you used > and the memory of TM and JM. > And when does the timeout happen? Such as at begin of the job or dur

Re: Tuning akka.ask.timeout

2022-01-24 Thread Paul Lam
set the akka timeout to 60s as a workaround, and the job now runs well. I’m planning to dig deeper using profile and will get back to the community if I find something. Best, Paul Lam > 2022年1月20日 20:09,Zhilong Hong 写道: > > Hi, Paul: > > Increasing akka.ask.timeout only cove

Tuning akka.ask.timeout

2022-01-20 Thread Paul Lam
and as a response timeout. So I’m wondering what’s the reasonable range for this option? And why would the Actor fail to respond in time (the message was dropped due to pressure)? Any input would be appreciated! Thanks a lot. Best, Paul Lam

Re: Using venv in Pyflink

2021-12-23 Thread Paul Lam
wonder if the installation affects the heartbeats between taskmanager and jobmanager? Best, Paul Lam > 2021年12月23日 19:16,Dian Fu 写道: > > Hi Paul, > > Currently, you need to build venv in an environment where you want to execute > the PyFlink jobs. > > >> Also,

Using venv in Pyflink

2021-12-23 Thread Paul Lam
Debian, causing pyflink to fail. Is there some good practice to avoid this? Also, I wonder if it’s possible for pyflink to optionally provide an automatically created venv for each pyflink job? Thanks! Best, Paul Lam

Re: Java dependencies management in Pyflink

2021-12-14 Thread Paul Lam
Hi Dian, Thanks a lot for your input. That’s a valid solution. We avoid using fat jars in Java API, because it easily leads to class conflicts. But PyFlink is like SQL API, user-imported Java dependencies are comparatively rare, so fat jar is a proper choice. Best, Paul Lam > 2021年12月14日

Java dependencies management in Pyflink

2021-12-14 Thread Paul Lam
the corresponding Java artifact of the imported PyFlink modules via maven dependency plugin. But I wonder if there is some best practice to address the problem? Thanks a lot! Best, Paul Lam

Re: Inaccurate checkpoint trigger time

2021-11-23 Thread Paul Lam
Hi Yun, Thanks a lot for your pointers! I’ll try it out as you suggested and then get back to you. Best, Paul Lam > 2021年11月23日 16:32,Yun Tang 写道: > > Hi Paul, > > This is really weird, from what I know, flink-1.11.0 has a problem of > handling min-pause time [1]

Inaccurate checkpoint trigger time

2021-11-22 Thread Paul Lam
. Please help me narrow down the problem if you have any idea. Best, Paul Lam

Re: Relation between Flink Configuration and TableEnv

2021-09-24 Thread Paul Lam
Hi Caizhi, Thanks a lot for you clarification! Now I understand the design of TableConfig. Best, Paul Lam > 2021年9月24日 15:40,Caizhi Weng 写道: > > Hi! > > TableConfig is for configurations related to the Table and SQL API, > especially the configurations in Optimize

Re: Relation between Flink Configuration and TableEnv

2021-09-23 Thread Paul Lam
Sorry, I mean the relation between Flink Configuration and TableConfig, not TableEnv. Best, Paul Lam Paul Lam 于2021年9月24日周五 上午12:24写道: > Hi all, > > Currently, Flink creates a new Configuration in TableConfig of > StreamTableEnvironment, and synchronizes options in it

Relation between Flink Configuration and TableEnv

2021-09-23 Thread Paul Lam
om/apache/flink/blob/36ff71f5ff63a140acc634dd1d98b2bb47a76ba5/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/environment/StreamExecutionEnvironment.java#L904 Best, Paul Lam

Re: Questions on usage of SQL hints

2021-08-12 Thread Paul Lam
Hi JING, Thanks for your inputs! It helps a lot. Best, Paul Lam > 2021年8月12日 13:13,JING ZHANG 写道: > > Hi Paul, > There are Table hints and Query hints. > Query hints are on the way, there is a JIRA to track this issue [3]. AFAIK, > the issue is almost close to submit

Questions on usage of SQL hints

2021-08-11 Thread Paul Lam
Hi community, I’m trying out SQL hints on DML, but there’s not very much about the supported SQL hints on the docs. Are the SQL hints limited to source/sink tables only at the moment? And where can I find the full list of supported SQL hints? Thanks in advance! Best, Paul Lam

Re: Set job specific resources in one StreamTableEnvironment

2021-07-20 Thread Paul Lam
onfiguration or API can be used to adjust job resources in SQL/Table API? In my case, approaches with DataStream API is also viable, cause I’m submitting the jobs programatically. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-53%3A+Fine+Grained+Operator+Resource+Management Best, Pa

Set job specific resources in one StreamTableEnvironment

2021-07-15 Thread Paul Lam
resources specs of all nodes in the StreamGraph, but it’s problematic because some nodes have a parallelism limit (e.g. can’t be greater than 1). I think I might be missing something and there should be a better way to do this. Please give me some pointers. Thanks a lot! Best, Paul Lam

Re: How can I tell if a record in a bounded job is the last record?

2021-06-29 Thread Paul Lam
/StreamTask.java Best, Paul Lam > 2021年6月30日 10:38,Yik San Chan 写道: > > Hi community, > > I have a batch job that consumes records from a bounded source (e.g., Hive), > walk them through a BufferingSink as described in > [docs](https://ci.apache.org/projects/flink/flink-docs-

Re: Add control mode for flink

2021-06-08 Thread Paul Lam
through the whole DAG, but events needs to be acknowledged by downstream and can overtake records, while stream records are not). So I’m wondering if we plan to unify the two approaches in the new control flow (as Xintong mentioned both in the previous mails)? Best, Paul Lam > 2021年6月8日 14

Re: Avro schema

2021-04-02 Thread Paul Lam
. Best, Paul Lam > 2021年4月2日 11:34,Sumeet Malhotra 写道: > > Just realized, my question was probably not clear enough. :-) > > I understand that the Avro (or JSON for that matter) format can be ingested > as described here: > https://ci.apache.org/projects/flink/flink-

Re: Unexpected latency across operator instances

2021-01-06 Thread Paul Lam
Hi Antonis, Did you try to profile the “bad” taskmanager to see what the task thread was busy doing? And a possible culprit might be gc, if you haven't checked that. I’ve seen gc threads eating up 30% of cpu. Best, Paul Lam > 2020年12月14日 06:24,Antonis Papaioannou 写道: > &

Re: [ANNOUNCE] Apache Flink 1.11.3 released

2020-12-18 Thread Paul Lam
Well done! Thanks to Gordon and Xintong, and everyone that contributed to the release. Best, Paul Lam > 2020年12月18日 19:20,Xintong Song 写道: > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.11.3, which is the third bugfix release for the

Failed job reinitiated with wrong checkpoint after a ZK reconnection

2020-10-22 Thread Paul Lam
restore the job, which rewound the job unexpectedly. I’ve filed an issue[1], and any comments are appreciated. 1. https://issues.apache.org/jira/browse/FLINK-19778 Best, Paul Lam

Re: TM heartbeat timeout due to ResourceManager being busy

2020-10-11 Thread Paul Lam
Sorry for the misspelled name, Xintong Best, Paul Lam > 2020年10月12日 14:46,Paul Lam 写道: > > Hi Xingtong, > > Thanks a lot for the pointer! > > It’s good to see there would be a new IO executor to take care of the TM > contexts. Looking forward to the 1.12 releas

Re: TM heartbeat timeout due to ResourceManager being busy

2020-10-11 Thread Paul Lam
Hi Xingtong, Thanks a lot for the pointer! It’s good to see there would be a new IO executor to take care of the TM contexts. Looking forward to the 1.12 release! Best, Paul Lam > 2020年10月12日 14:18,Xintong Song 写道: > > Hi Paul, > > Thanks for reporting this. > &g

TM heartbeat timeout due to ResourceManager being busy

2020-10-11 Thread Paul Lam
timeout, is there any recommended out of the box approach that can reduce the chance of getting the timeouts? In the long run, is it possible to limit the number of taskmanager contexts that RM creates at a time, so that the heartbeat triggers can chime in? Thanks! Best, Paul Lam

Re: Savepoint incomplete when job was killed after a cancel timeout

2020-09-29 Thread Paul Lam
Hi Till, Thanks a lot for the pointer! I tried to restore the job using the savepoint in a dry run, and it worked! Guess I've misunderstood the configuration option, and confused by the non-existent paths that the metadata contains. Best, Paul Lam Till Rohrmann 于2020年9月29日周二 下午10

Savepoint incomplete when job was killed after a cancel timeout

2020-09-29 Thread Paul Lam
nfig: - Flink 1.11.0 - YARN job cluster - HA via zookeeper - FsStateBackend - Aligned non-incremental checkpoint Any comments and suggestions are appreciated! Thanks! Best, Paul Lam

Re: [ANNOUNCE] New PMC member: Dian Fu

2020-08-27 Thread Paul Lam
Congrats, Dian! Best, Paul Lam > 2020年8月27日 17:42,Marta Paes Moreira 写道: > > Congrats, Dian! > > On Thu, Aug 27, 2020 at 11:39 AM Yuan Mei <mailto:yuanmei.w...@gmail.com>> wrote: > Congrats! > > On Thu, Aug 27, 2020 at 5:38 PM Xingbo Huang &

Re: [DISCUSS] Remove Kafka 0.10.x connector (and possibly 0.11.x)

2020-08-27 Thread Paul Lam
. Stop the job with a savepoint, committing the offset to Kafka brokers. 2. Modify user code, migrate to he universal connector, and change the source operator id to discard the old connector states. 3. Start the job with the savepoint, and read Kafka from group offsets. Best, Paul Lam > 2020年8月

Re: [DISCUSS] Remove Kafka 0.10.x connector (and possibly 0.11.x)

2020-08-24 Thread Paul Lam
, the universal connector is compatible with 0.10 brokers, but I want to double check that. Best, Paul Lam > 2020年8月24日 22:46,Aljoscha Krettek 写道: > > Hi all, > > this thought came up on FLINK-17260 [1] but I think it would be a good idea > in general. The issue reminded us th

Re: Unexpected unnamed sink in SQL job

2020-08-04 Thread Paul Lam
Hi Jingsong, Thanks for your input. Now I understand the design. I think in my case the StreamingFileCommitter is not chained because its upstream operator is not parallelism 1. BTW, it’d be better if it has a more meaningful operator name. Best, Paul Lam > 2020年8月4日 17:11,Jingsong Li

Unexpected unnamed sink in SQL job

2020-08-04 Thread Paul Lam
: - Flink 1.11.0 with Blink planner - Hive 1.1.0 Best, Paul Lam

Re: FileNotFoundException when writting Hive orc tables

2020-07-21 Thread Paul Lam
dir was not created yet. Best, Paul Lam > 2020年7月21日 18:50,Jingsong Li 写道: > > Hi, > > Sorry for this. This work around only works in Hive 2+. > We can only wait for 1.11.2. > > Best, > Jingsong > > On Tue, Jul 21, 2020 at 6:15 PM Rui Li <mailto:lirui.fu

Re: FileNotFoundException when writting Hive orc tables

2020-07-21 Thread Paul Lam
iationUtil.java:586) at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:515) at org.apache.flink.streaming.api.graph.StreamConfig.setStreamOperatorFactory(StreamConfig.java:260) ... 36 more Best, Paul Lam > 2020年7月21日 16:59,Jingsong Li

Re: FileNotFoundException when writting Hive orc tables

2020-07-21 Thread Paul Lam
, Paul Lam > 2020年7月21日 16:24,Rui Li 写道: > > Hey Paul, > > Could you please share more about your job, e.g. the schema of your Hive > table, whether it's partitioned, and the table properties you've set? > > On Tue, Jul 21, 2020 at 4:02 PM Paul Lam <mailt

FileNotFoundException when writting Hive orc tables

2020-07-21 Thread Paul Lam
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:201) Best, Paul Lam

Re: How to debug window states

2020-07-15 Thread Paul Lam
It turns out to be a bug of StateTTL [1]. But I’m still interested in debugging window states. Any suggestions are appreciated. Thanks! 1. https://issues.apache.org/jira/browse/FLINK-16581 <https://issues.apache.org/jira/browse/FLINK-16581> Best, Paul Lam > 2020年7月15日 13:13,Pau

How to debug window states

2020-07-14 Thread Paul Lam
. - Keys amount are capped. - Watermarks are good, no lags. - No back pressure. If I can directly read the states to find out what’s accountable for the state size growth, that would be very intuitional and time-saving. Best, Paul Lam

Re: IllegalAccessError when writing to hive orc table

2020-07-13 Thread Paul Lam
connectors seem to be broken. Should we use https://repo.maven.apache.org/maven2/ <https://repo.maven.apache.org/maven2/> instead? Best, Paul Lam > 2020年7月13日 18:02,Jingsong Li 写道: > > Hi, It looks really weird. > > Is there any possibility of class conflict? > How do you

IllegalAccessError when writing to hive orc table

2020-07-13 Thread Paul Lam
org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:992) Best, Paul Lam

Re: [ANNOUNCE] Apache Flink 1.11.0 released

2020-07-07 Thread Paul Lam
Finally! Thanks for Piotr and Zhijiang being the release managers, and everyone that contributed to the release! Best, Paul Lam > 2020年7月7日 22:06,Zhijiang 写道: > > The Apache Flink community is very happy to announce the release of Apache > Flink 1.11.0, which is the latest m

Re: Dynamic source and sink.

2020-07-02 Thread Paul Lam
tenant configuration. 2. Still restart the job, and optimize the downtime by using session mode. Best, Paul Lam > 2020年7月2日 11:23,C DINESH 写道: > > Hi Danny, > > Thanks for the response. > > In short without restarting we cannot add new sinks or sources. > > For

Re: SQL Timetamp types incompatible after migration to 1.10

2020-03-20 Thread Paul Lam
Filed an issue to track this problem. [1] [1] https://issues.apache.org/jira/browse/FLINK-16693 <https://issues.apache.org/jira/browse/FLINK-16693> Best, Paul Lam > 在 2020年3月20日,17:17,Paul Lam 写道: > > Hi Jark, > > Sorry for my late reply. > > Yes, I’m using the

Re: SQL Timetamp types incompatible after migration to 1.10

2020-03-20 Thread Paul Lam
fix the old planner if it’s not too involved. Best, Paul Lam > 在 2020年3月19日,17:13,Jark Wu 写道: > > Hi Paul, > > Are you using old planner? Did you try blink planner? I guess it maybe a bug > in old planner which doesn't work well on new types. > > Best, > Jar

SQL Timetamp types incompatible after migration to 1.10

2020-03-19 Thread Paul Lam
ql.Timestamp. Is my reasoning correct? And is there any workaround? Thanks a lot! Best, Paul Lam

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Paul Lam
Congrats, Dian! Best, Paul Lam > 在 2020年1月17日,10:49,tison 写道: > > Congratulations! Dian > > Best, > tison. > > > Zhu Zhu mailto:reed...@gmail.com>> 于2020年1月17日周五 > 上午10:47写道: > Congratulations Dian. > > Thanks, > Zhu Zhu > > hailon

Re: Jobmanager not properly fenced when killed by YARN RM

2019-12-16 Thread Paul Lam
for you help! [1] https://issues.apache.org/jira/browse/FLINK-14010 <https://issues.apache.org/jira/browse/FLINK-14010> Best, Paul Lam > 在 2019年12月16日,20:35,Yang Wang 写道: > > Hi Paul, > > I found lots of "Failed to stop Container " logs in the jobmanager.log. It

Jobmanager not properly fenced when killed by YARN RM

2019-12-13 Thread Paul Lam
adoop 2.6.5. As I can remember, I've seen a similar issue that relates to the fencing of JobManager, but I searched the JIRA and couldn't find it. It would be great if someone can point me to the right direction. And any comments are also welcome! Thanks! Best, Paul Lam

Re: StreamingFileSink duplicate data

2019-11-21 Thread Paul Lam
way, we avoid file name conflicts with the previous execution (see[1]). [1] https://github.com/apache/flink/blob/93dfdd05a84f933473c7b22437e12c03239f9462/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/filesystem/Buckets.java#L276 Best, Paul Lam > 在 2019年11月

Re: [DISCUSS] Semantic and implementation of per-job mode

2019-10-30 Thread Paul Lam
easily get access to the logs. Best, Paul Lam > 在 2019年10月30日,16:22,SHI Xiaogang 写道: > > Hi > > Thanks for bringing this. > > The design looks very nice to me in that > 1. In the new per-job mode, we don't need to compile user programs in the > client and can

Re: Flink StreamingFileSink part file behavior

2019-10-24 Thread Paul Lam
://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/sink/filesystem/BucketAssigner.html <https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/functions/sink/filesystem/BucketAssigner.html> Best, Pa

Re: Flink State Migration Version 1.8.2

2019-10-17 Thread Paul Lam
Hi, Could you confirm that you’re using POJOSerializer before and after migration? Best, Paul Lam > 在 2019年10月17日,21:34,ApoorvK 写道: > > It is throwing below error , > the class I am adding variables have other variable as an object of class > which are also in state.

Re: [ANNOUNCE] Zili Chen becomes a Flink committer

2019-09-11 Thread Paul Lam
Congratulations Zili! Best, Paul Lam > 在 2019年9月12日,09:34,Rong Rong 写道: > > Congratulations Zili! > > -- > Rong > > On Wed, Sep 11, 2019 at 6:26 PM Hequn Cheng <mailto:chenghe...@gmail.com>> wrote: > Congratulations! > > Best, Hequn > > On T

Re: StateMigrationException thrown by state processor api

2019-08-27 Thread Paul Lam
bell, I might have read about the limitation that the state processor API is unable to read states in WindowOperator currently. All in all, thanks again. Best, Paul Lam > 在 2019年8月27日,17:32,Yun Tang 写道: > > Hi Paul > > Would you please share more information of the exception

StateMigrationException thrown by state processor api

2019-08-27 Thread Paul Lam
ease point me to the right direction. Thanks a lot! Best, Paul Lam

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-22 Thread Paul Lam
Well done! Thanks to everyone who contributed to the release! Best, Paul Lam Yu Li 于2019年8月22日周四 下午9:03写道: > Thanks for the update Gordon, and congratulations! > > Great thanks to all for making this release possible, especially to our > release managers! > > Best Regards,

Re: Weekly Community Update 2019/33, Personal Chinese Version

2019-08-18 Thread Paul Lam
Hi Tison, Big +1 for the Chinese Weekly Community Update. The content is well-organized, and I believe it would be very helpful for Chinese users to get an overview of what’s going on in the community. Best, Paul Lam > 在 2019年8月19日,12:27,Zili Chen 写道: > > Hi community, > &

Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Paul Lam
Congrats Hequn! Well deserved! Best, Paul Lam > 在 2019年8月7日,16:28,jincheng sun 写道: > > Hi everyone, > > I'm very happy to announce that Hequn accepted the offer of the Flink PMC to > become a committer of the Flink project. > > Hequn has been contributing to

Re: Graceful Task Manager Termination and Replacement

2019-07-11 Thread Paul Lam
712> Best, Paul Lam > 在 2019年7月12日,03:38,Aaron Levin 写道: > > Hello, > > Is there a way to gracefully terminate a Task Manager beyond just killing it > (this seems to be what `./taskmanager.sh stop` does)? Specifically I'm > interested in a way to replace a Task Ma

Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread Paul Lam
Congrats to Rong! Rong has contributed a lot to the community and well deserves it. Best, Paul Lam > 在 2019年7月12日,09:40,JingsongLee 写道: > > Congratulations Rong. > Rong Rong has done a lot of nice work in the past time to the flink community. > >

Re: Setting consumer offset

2019-07-02 Thread Paul Lam
Hi Avi, Yes, it will. The restored state takes priority over the start position. Best, Paul Lam > 在 2019年7月2日,15:11,Avi Levi 写道: > > Hi, > If I set in code the consumer offset e.g consumer.setStartFromTimestamp and I > start the job from a curtain savepoint/checkpoint will th

Re: How to generate a sequential watermark which increases by one unit each time

2019-05-24 Thread Paul Lam
Hi Averell, IMHO, a simple approach would be adding a rich map that holds the sequence value (backed by states) and attach it to the records before the assigner operator. Best, Paul Lam > 在 2019年5月21日,20:37,Averell 写道: > > Hi everyone, > > I have a stream of files, each fi

Re: Upgrading from 1.4 to 1.8, losing Kafka consumer state

2019-05-24 Thread Paul Lam
restored state plus FlinkKafkaConsumer is set to `startFromGroupOffset`. Best, Paul Lam > 在 2019年5月24日,07:50,Nikolas Davis 写道: > > Howdy, > > We're in the process of upgrading to 1.8. When restoring state to the new > cluster (using a savepoint) we are seeing our Kafka consum

Re: Preserve accumulators after failure in DataStream API

2019-05-02 Thread Paul Lam
Hi Wouter, I've met the same issue and finally managed to use operator states to back the accumulators, so they can be restored after restarts. The downside is that we have to update the values in both accumulators and states to make them consistent. FYI. Best, Paul Lam Fabian Hueske 于201

Re: [DISCUSS] Temporarily remove support for job rescaling via CLI action "modify"

2019-04-24 Thread Paul Lam
Hi Gary, + 1 to remove it for now. Actually some users are not aware of that it’s still experimental, and ask quite a lot about the problem it causes. Best, Paul Lam > 在 2019年4月24日,14:49,Stephan Ewen 写道: > > Sounds reasonable to me. If it is a broken feature, then there is not muc

Re: Fast restart of a job with a large state

2019-04-18 Thread Paul Lam
Hi, Have you tried task local recovery [1]? [1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html#retained-checkpoints Best, Paul Lam > 在 2019年4月17日,17:46,Sergey Zhemzhitsky 写道: > > Hi Flinkers, > > Operating different flink jobs I've

Re: Fast restart of a job with a large state

2019-04-18 Thread Paul Lam
The URL in my previous mail is wrong, and it should be: https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery <https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#task-local-recovery> Best, Pa

Re: assignTimestampsAndWatermarks not work after KeyedStream.process

2019-04-17 Thread Paul Lam
the data pipeline. Best, Paul Lam > 在 2019年4月17日,23:04,an0...@gmail.com 写道: > > `assignTimestampsAndWatermarks` before `keyBy` works: > ```java > DataStream trips = >env.addSource(consumer).assignTimestampsAndWatermarks(new > BoundedOutOfOrdernessTimestamp

Re: RocksDBStatebackend does not write checkpoints to backup path

2019-03-28 Thread Paul Lam
Hi Gordon, Thanks for your reply. I’ve found out that it should be a bug of RocksDBStateBackend [1]. [1] https://issues.apache.org/jira/browse/FLINK-12042 <https://issues.apache.org/jira/browse/FLINK-12042> Best, Paul Lam > 在 2019年3月28日,17:03,Tzu-Li (Gordon) Tai 写道: > >

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-28 Thread Paul Lam
Hi Yu, I’ve set `fs.default-scheme` to hdfs, and it's mainly used for simplifying checkpoint / savepoint / HA paths. And I leave the rocksdb local dir empty, so the local snapshot still goes to YARN local cache dirs. Hope that answers your question. Best, Paul Lam > 在 2019年3月28日,15:

Re: RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-27 Thread Paul Lam
tps://issues.apache.org/jira/browse/FLINK-12042> Best, Paul Lam > 在 2019年3月27日,19:06,Paul Lam 写道: > > Hi, > > I’m using Flink 1.6.4 and recently I ran into a weird issue of rocksdb > statebackend. A job that runs fine on a YARN cluster keeps failing on > checkpoint after migrated t

RocksDB local snapshot sliently disappears and cause checkpoint to fail

2019-03-27 Thread Paul Lam
RocksDBKeyedStateBackend (see attachment), and found that the local snapshot performed as expected and the .sst files were written, but when the async task accessed the directory, the whole snapshot directory was gone. What could possibly be the cause? Thanks a lot. Best,Paul Lam 2019-03-27 16:51

Re: Discrepancy between the part length file's length and the part file length during recover

2019-03-26 Thread Paul Lam
e would > need truncation as and when FileStatus on NN recovers ? Actually, most of the time the file needs truncation and I’ve set up a cronjob to do this. Best, Paul Lam > 在 2019年3月26日,21:26,Vishal Santoshi <mailto:vishal.santo...@gmail.com>> 写道: > > Thank you

Re: Discrepancy between the part length file's length and the part file length during recover

2019-03-26 Thread Paul Lam
consistency is required. I’ve just filed a ticket [1], please take a look. [1] https://issues.apache.org/jira/browse/FLINK-12022 <https://issues.apache.org/jira/browse/FLINK-12022> Best, Paul Lam > 在 2019年3月12日,09:24,Vishal Santoshi 写道: > > This seems strange. When I pull th

RocksDBStatebackend does not write checkpoints to backup path

2019-03-26 Thread Paul Lam
checkpoint, doesn’t exist. But there is no error message about failures of rocksdb checkpoint. What could possibly be the cause? Thanks a lot! Best, Paul Lam

Re: flink on yarn log rolling

2019-03-20 Thread Paul Lam
Hi Shengnan, If you have the ssh permission, you could take a look at the container working directories to check if the taskmanager local log4j.properties is as expected. Best, Paul Lam > 在 2019年3月20日,15:30,Shengnan YU 写道: > > Hi all: > I'd like to enable log rolling for

Re: Flink performance drops when async checkpoint is slow

2019-02-28 Thread Paul Lam
, so we didn’t think that way before. Best, Paul Lam > 在 2019年2月28日,16:12,zhijiang 写道: > > Hi Paul, > > I am not sure whether task thread is involverd in some works during > snapshoting states for FsStateBackend. But I have another experience which > might also cause

Flink performance drops when async checkpoint is slow

2019-02-27 Thread Paul Lam
dashboard shows that the buffer during alignment is less than 10 MB, even when back pressure is high. We’ve been struggling with this problem for weeks, and any help is appreciated. Thanks a lot! Best, Paul Lam

Re: Test FileUtilsTest.testDeleteDirectory failed when building Flink

2019-02-27 Thread Paul Lam
Hi Chesnay & Fabian, Thanks for your replies. I found it should be related to the CI runner. I moved to gitlab CI which runs the script as root user by default, so it is always able to remove a write protected file. Best, Paul Lam > 在 2019年2月20日,17:08,Chesnay Schepler 写道: > >

Test FileUtilsTest.testDeleteDirectory failed when building Flink

2019-02-15 Thread Paul Lam
with this? Thanks a lot! WRT the environment: - Flink version: 1.7.1 - JDK: open jdk 1.8.0_111 - OS version: debian 8 Best, Paul Lam

Re: Videos and slides on Flink Forward Beijing

2019-02-01 Thread Paul Lam
Hi Congxian & Yun, Thanks a lot for the pointers! Best, Paul Lam > 在 2019年2月1日,23:07,Yun Tang 写道: > > Hi Paul > > You could find slides here > https://github.com/flink-china/flink-forward-china-2018 > <https://github.com/flink-china/flink-forward-china-2018&

Videos and slides on Flink Forward Beijing

2019-01-31 Thread Paul Lam
Hi, It’s been a while since Flink Forward Beijing, would the videos and slides be available on the website? Thanks! Best, Paul Lam

Back pressure within a operator chain

2019-01-23 Thread Paul Lam
much greater that job parallelism (like 40:1), OOM happens. The the root cause should be Kafka consumer pulling too much data. So I’m wondering if I should separate the source and sink to make the back pressure mechanism working. Best, Paul Lam

Access Flink configuration in user functions

2018-12-27 Thread Paul Lam
) at the user main method? Or the configuration is only used internally? Thanks! Best, Paul Lam

Re: Issue with Flink not able to properly read the ResourceManager address for a HA setup

2018-12-19 Thread Paul Lam
Hi Sai, It looks like the Hadoop config path is not correctly set. You could set the logging level in log4j-cli.properties to debug to get more informations. Best, Paul Lam > 在 2018年12月20日,03:18,Sai Inampudi 写道: > > Hi, I am trying to create a flink cluster on yarn, by running the

Re: Custom S3 endpoint

2018-12-18 Thread Paul Lam
Hi Nick, What version of Hadoop are you using? AFAIK, you must use Hadoop 2.7+ to support custom s3 endpoint, or the `fs.s3a.endpoint` property in core-site.xml would be ignored. Best, Paul Lam > 在 2018年12月19日,06:40,Martin, Nick 写道: > > I’m working on Flink 1.7.0 and I’m trying t

Re: Logging Kafka during exceptions

2018-11-21 Thread Paul Lam
records is not always helpful. What do you think? Best, Paul Lam > 在 2018年11月22日,13:20,Scott Sue 写道: > > Hi Paul, > > Thanks for the quick reply. Ok does that mean that as general practice, I > should be catching all exceptions for the purpose of logging in any of my > O

Re: Logging Kafka during exceptions

2018-11-21 Thread Paul Lam
Hi Scott, I think you can do it by catching the exception in the user function and log the current message that the operator is processing before re-throwing (or not) the exception to Flink runtime. Best, Paul Lam > 在 2018年11月22日,12:59,Scott Sue 写道: > > Hi all, > > When

  1   2   >