Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-07 Thread Jiahui Jiang
operators be able to recover from this checkpoint? Thanks! From: Arvid Heise Sent: Wednesday, July 7, 2021 5:20 AM To: Jiahui Jiang Cc: ro...@apache.org ; user@flink.apache.org Subject: Re: Understanding recovering from savepoint / checkpoint with additional

Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-06 Thread Jiahui Jiang
ental checkpoint is disabled? Thank you! From: Roman Khachatryan Sent: Friday, July 2, 2021 4:59 PM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Understanding recovering from savepoint / checkpoint with additional operators when chaining Hi, Just to cl

Understanding recovering from savepoint / checkpoint with additional operators when chaining

2021-07-02 Thread Jiahui Jiang
Hello Flink, I'm trying to understand the state recovery mechanism when there are extra stateless operators. I'm using flink-sql, and I tested a 'select `number_col` from source' query, where the stream graph looks like: `source (stateful with fixed uid) -> [several stateless operators transla

Re: Discard checkpoint files through a single recursive call

2021-06-15 Thread Jiahui Jiang
From: Yun Tang Sent: Tuesday, June 15, 2021 10:27 PM To: Guowei Ma ; Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Discard checkpoint files through a single recursive call Hi Jiang, Please take a look at FLINK-17860 and FLINK-13856 for previous discussion of this problem. [1] https

Discard checkpoint files through a single recursive call

2021-06-14 Thread Jiahui Jiang
Hello Flink! We are building an infrastructure where we implement our own CompletedCheckpointStore. The read and write to the external storage location of these checkpoints are through HTTP calls to an external service. Recently we noticed some checkpoint file cleanup performance issue when the

Re: SSL setup for YARN deployment when hostnames are unknown.

2020-11-19 Thread Jiahui Jiang
Sent: Thursday, November 12, 2020 2:08 AM To: Jiahui Jiang Cc: matth...@ververica.com ; user@flink.apache.org ; aljos...@apache.org Subject: Re: SSL setup for YARN deployment when hostnames are unknown. Hi Jiahui, using the yarn.container-start-command-template is indeed a good idea. I was

Re: SSL setup for YARN deployment when hostnames are unknown.

2020-11-11 Thread Jiahui Jiang
can generate the keystore, then start the JM process? Will that be allowed given the current Flink architecture? Thanks! From: Jiahui Jiang Sent: Wednesday, November 11, 2020 9:09 AM To: matth...@ververica.com Cc: user@flink.apache.org ; aljos...@apache.org Su

Re: SSL setup for YARN deployment when hostnames are unknown.

2020-11-11 Thread Jiahui Jiang
11, 2020 3:58 AM To: Jiahui Jiang Cc: user@flink.apache.org ; aljos...@apache.org Subject: Re: SSL setup for YARN deployment when hostnames are unknown. Hi Jiahui, thanks for reaching out to the mailing list. This is not something I have expertise in. But have you checked out the Flink SSL Set

Re: SSL setup for YARN deployment when hostnames are unknown.

2020-11-10 Thread Jiahui Jiang
Ping on this 🙂 It there anyway I can run a script or implement some interface to run before the Dispatcher service starts up to dynamically generate the keystore? Thank you! From: Jiahui Jiang Sent: Monday, November 9, 2020 3:19 PM To: user@flink.apache.org

SSL setup for YARN deployment when hostnames are unknown.

2020-11-09 Thread Jiahui Jiang
Hello Flink! We are working on turning on REST SSL for YARN deployments. We built a generic orchestration server that can submit Flink clusters to any YARN clusters given the relevant Hadoop configs. But this means we may not know the hostname the Job Managers can be deployed onto - not even th

RestartStrategy failure count when losing a Task Manager

2020-07-14 Thread Jiahui Jiang
Hello Flink, I have some questions regarding to the guideline on configuring restart strategy. I was testing a job with the following setup: 1. There are many tasks, but currently I'm running with only 2 parallelism, but plenty of task slots (4 TM and 4 task slot in each TM). 2. It's ran

Register time attribute while converting a DataStream to Table

2020-05-12 Thread Jiahui Jiang
Hello Flink friends, I have a retract stream in the format of 'DataStream' that I want to register into my table environment, and also expose processing time column in the table. For a regular datastream, I have being doing 'tableEnvironment.createTemporaryView(path, dataStream, 'field1,field2,

Re: Preserve record orders after WINDOW function

2020-05-11 Thread Jiahui Jiang
s all the elements that fall into this window in their order inside the stream. Is that correct? Thanks again! 😊 From: Jark Wu Sent: Monday, May 11, 2020 8:52 PM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Preserve record orders after WINDOW functio

Preserve record orders after WINDOW function

2020-05-11 Thread Jiahui Jiang
Hello! I'm writing a SQL query with a OVER window function ordered by processing time. I'm wondering since timestamp is only millisecond granularity. For a query using over window and sorted on processing time column, for example, ``` SELECT col1, max(col2) OVER (PARTITION BY col1, ORDER BY

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-30 Thread Jiahui Jiang
I see I see. Thank you so much! From: Xintong Song Sent: Wednesday, April 29, 2020 11:22 PM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10 That's pretty much it. I'm not very fam

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-29 Thread Jiahui Jiang
usage within them, we need to set up a non-zero task.off-heap? Thanks! From: Xintong Song Sent: Wednesday, April 29, 2020 10:53 PM To: Jiahui Jiang Cc: Steven Wu ; user@flink.apache.org Subject: Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.1

Re: Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-29 Thread Jiahui Jiang
configuration if UDFs or libraries of the job uses off-heap memory. Thank you~ Xintong Song On Wed, Apr 29, 2020 at 11:07 AM Jiahui Jiang mailto:qzhzm173...@hotmail.com>> wrote: Hello! We are migrating our pipeline from Flink 1.8 to 1.10 in Kubernetes. In the first try, we simply copie

Configuring taskmanager.memory.task.off-heap.size in Flink 1.10

2020-04-28 Thread Jiahui Jiang
Hello! We are migrating our pipeline from Flink 1.8 to 1.10 in Kubernetes. In the first try, we simply copied the old 'taskmanager.heap.size' over to 'taskmanager.memory.flink.size'. This caused the cluster to OOM. Eventually we had to allocate a small amount of memory to 'taskmanager.memory.tas

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-14 Thread Jiahui Jiang
Good to know! Thank you so much for all the responses again :) From: Jark Wu Sent: Tuesday, April 14, 2020 10:51 PM To: godfrey he Cc: Jiahui Jiang ; user Subject: Re: Setting different idleStateRetentionTime for different queries executed in the same

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-13 Thread Jiahui Jiang
rey he Sent: Monday, April 13, 2020 9:51 PM To: Jiahui Jiang Cc: Jark Wu ; user@flink.apache.org Subject: Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10 Hi Jiahui, Query hint is a way for fine-grained configuration.

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-13 Thread Jiahui Jiang
b ON a.column_name = b.column_name; Is this something Flink SQL may want to support out of the box? (Starting from Calcite 1.22.0<https://calcite.apache.org/news/2020/03/05/release-1.22.0/>, it started to provide first class hint parsing) ________ From: Jiahui Jiang Sent

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-12 Thread Jiahui Jiang
2020 8:45 AM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10 Yes, that's right. Set idleStateRetentionTime on TableConfig before translation should work. On Sat, 11 Apr 2

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-10 Thread Jiahui Jiang
all the continuous queries executing concurrently. Thanks again! From: Jark Wu Sent: Saturday, April 11, 2020 1:24 AM To: Jiahui Jiang Cc: user@flink.apache.org Subject: Re: Setting different idleStateRetentionTime for different queries executed in the same

Re: Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-10 Thread Jiahui Jiang
Just looked into the source code a bit further and realized that for StreamTableEnvironmentImpl, even for sinks it's also doing translation lazily. Any way we can have different transformation to have different queryConfig? From: Jiahui Jiang Sent: Friday,

Setting different idleStateRetentionTime for different queries executed in the same TableEnvironment in Flink 1.10

2020-04-10 Thread Jiahui Jiang
Hello! I'm using Table API to write a pipeline with multiple queries. And I want to set up different idleStateRetentionTime for different queries. In Flink 1.8, it seems to be the case where I can pass in a streamQueryConfig when converting each output table into datastreams. And the translate w