date:20220308

[GitHub] [flink] flinkbot edited a comment on pull request #19004: [FLINK-26490][checkpoint] Adjust the MaxParallelism when operator state don't contain keyed state.

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19004:
URL: https://github.com/apache/flink/pull/19004#issuecomment-1061472524


   
   ## CI report:
   
   * b4fe404fd3eb4563d6e476c83c4d29c947752571 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32669)
 
   * b4d2eac3584665d4e667ed60b5231a3850b18430 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18995: [FLINK-26498][TABLE][WINDOW]emit result when cleanupTime timer call onEventTime if allowlatency e…

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18995:
URL: https://github.com/apache/flink/pull/18995#issuecomment-1060511804


   
   ## CI report:
   
   * 0b607728d4aa0cfbd216575dbc449565de505713 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32662)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jane Chan (Jira)

Jane Chan created FLINK-26535:
-

 Summary: Introduce StoreTableSource and StoreTableSink
 Key: FLINK-26535
 URL: https://issues.apache.org/jira/browse/FLINK-26535
 Project: Flink
  Issue Type: Sub-task
  Components: Table Store
Affects Versions: 0.1.0
Reporter: Jane Chan






--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26527) ClassCastException in TemporaryClassLoaderContext

2022-03-08 Thread Martijn Visser (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502784#comment-17502784
 ] 

Martijn Visser commented on FLINK-26527:


[~tinny] You're suffering from the problem which is explained on the Debugging 
Classloading documentation, see 
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/debugging/debugging_classloading/#x-cannot-be-cast-to-x-exceptions

Could you check if this is indeed your problem?

> ClassCastException in TemporaryClassLoaderContext
> -
>
> Key: FLINK-26527
> URL: https://issues.apache.org/jira/browse/FLINK-26527
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.13.5, 1.14.3
>Reporter: shizhengchao
>Priority: Major
>
> When I try to run sql using flink's classloader, I get the following 
> exception：
> {code:java}
> Exception in thread "main" java.lang.ClassCastException: 
> org.codehaus.janino.CompilerFactory cannot be cast to 
> org.codehaus.commons.compiler.ICompilerFactory
>     at 
> org.codehaus.commons.compiler.CompilerFactoryFactory.getCompilerFactory(CompilerFactoryFactory.java:129)
>  
> ……{code}
> my code is like this：
> {code:java}
> Configuration configuration = new Configuration();
> configuration.set(CoreOptions.CLASSLOADER_RESOLVE_ORDER, "child-first");
> List dependencies = 
> FlinkClassLoader.getFlinkDependencies(FLINK_HOME/lib);
> URLClassLoader classLoader = ClientUtils.buildUserCodeClassLoader(
> dependencies,
> Collections.emptyList(),
> SessionContext.class.getClassLoader(),
> configuration);
> try (TemporaryClassLoaderContext ignored = 
> TemporaryClassLoaderContext.of(classLoader)) {     
>tableEnv.explainSql(sql);
>  
> //CompilerFactoryFactory.getCompilerFactory("org.codehaus.janino.CompilerFactory");
> } {code}
> But, if you change `classloader.resolve-order` to `parent-first`, everything 
> works fine



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jane Chan (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-26535:
--
Description: Introduce StoreTableSource and StoreTableSink, which creates 
StoreSource and StoreSink respectively, and interact with TableStoreFactory,  
in order to enable running stream/batch jobs via SQL.

> Introduce StoreTableSource and StoreTableSink
> -
>
> Key: FLINK-26535
> URL: https://issues.apache.org/jira/browse/FLINK-26535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Jane Chan
>Priority: Major
>
> Introduce StoreTableSource and StoreTableSink, which creates StoreSource and 
> StoreSink respectively, and interact with TableStoreFactory,  in order to 
> enable running stream/batch jobs via SQL.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26524) Elasticsearch (v5.3.3) sink end-to-end test

2022-03-08 Thread Martijn Visser (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser updated FLINK-26524:
---
Description: 
e2e test {{Elasticsearch (v5.3.3) sink end-to-end test}} failed in [this 
build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32627&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=16598]
 on {{release-1.14}} probably because of the following stacktrace showing up in 
the logs:
{code}
Mar 07 15:40:41 2022-03-07 15:40:40,336 WARN  
org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to 
trigger checkpoint 1 for job 3a2fd4c6fb03d5b20929a6f2b7131d82. (0 consecutive 
failed attempts so far)
Mar 07 15:40:41 org.apache.flink.runtime.checkpoint.CheckpointException: 
Checkpoint was declined (task is closing)
Mar 07 15:40:41 at 
org.apache.flink.runtime.messages.checkpoint.SerializedCheckpointException.unwrap(SerializedCheckpointException.java:51)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveDeclineMessage(CheckpointCoordinator.java:988)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$declineCheckpoint$2(ExecutionGraphHandler.java:103)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_322]
Mar 07 15:40:41 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_322]
Mar 07 15:40:41 at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Mar 07 15:40:41 Caused by: org.apache.flink.util.SerializedThrowable: Task name 
with subtask : Source: Sequence Source (Deprecated) -> Flat Map -> Sink: 
Unnamed (1/1)#0 Failure reason: Checkpoint was declined (task is closing)
Mar 07 15:40:41 at 
org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1389) 
~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1382) 
~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.taskmanager.Task.triggerCheckpointBarrier(Task.java:1348)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.taskexecutor.TaskExecutor.triggerCheckpoint(TaskExecutor.java:956)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) ~[?:1.8.0_322]
{code}

  was:
e2e test {{Elasticsearch (v5.3.3) sink end-to-end test}} failed in [this 
build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32627&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=16598]
 on {{master}} probably because of the following stacktrace showing up in the 
logs:
{code}
Mar 07 15:40:41 2022-03-07 15:40:40,336 WARN  
org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to 
trigger checkpoint 1 for job 3a2fd4c6fb03d5b20929a6f2b7131d82. (0 consecutive 
failed attempts so far)
Mar 07 15:40:41 org.apache.flink.runtime.checkpoint.CheckpointException: 
Checkpoint was declined (task is closing)
Mar 07 15:40:41 at 
org.apache.flink.runtime.messages.checkpoint.SerializedCheckpointException.unwrap(SerializedCheckpointException.java:51)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveDeclineMessage(CheckpointCoordinator.java:988)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$declineCheckpoint$2(ExecutionGraphHandler.java:103)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119)
 ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
Mar 07 15:40:41 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[?:1.8.0_322]
Mar 07 15:40:41 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[?:1.8.0_322]
Mar 07 15:40:41 at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
Mar 07 15:40:41 Caused by: org.apache.flink.util.SerializedThrowable: Task name 
with subtask : Source: Sequence Source (Deprecated) -> Flat Map

[jira] [Commented] (FLINK-26524) Elasticsearch (v5.3.3) sink end-to-end test

2022-03-08 Thread Martijn Visser (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502785#comment-17502785
 ] 

Martijn Visser commented on FLINK-26524:


Thanks for clarifying [~guoyangze] - Should have noticed that myself :)

> Elasticsearch (v5.3.3) sink end-to-end test
> ---
>
> Key: FLINK-26524
> URL: https://issues.apache.org/jira/browse/FLINK-26524
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.14.3
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.14.4
>
>
> e2e test {{Elasticsearch (v5.3.3) sink end-to-end test}} failed in [this 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32627&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=16598]
>  on {{master}} probably because of the following stacktrace showing up in the 
> logs:
> {code}
> Mar 07 15:40:41 2022-03-07 15:40:40,336 WARN  
> org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to 
> trigger checkpoint 1 for job 3a2fd4c6fb03d5b20929a6f2b7131d82. (0 consecutive 
> failed attempts so far)
> Mar 07 15:40:41 org.apache.flink.runtime.checkpoint.CheckpointException: 
> Checkpoint was declined (task is closing)
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.messages.checkpoint.SerializedCheckpointException.unwrap(SerializedCheckpointException.java:51)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveDeclineMessage(CheckpointCoordinator.java:988)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$declineCheckpoint$2(ExecutionGraphHandler.java:103)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_322]
> Mar 07 15:40:41   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_322]
> Mar 07 15:40:41   at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
> Mar 07 15:40:41 Caused by: org.apache.flink.util.SerializedThrowable: Task 
> name with subtask : Source: Sequence Source (Deprecated) -> Flat Map -> Sink: 
> Unnamed (1/1)#0 Failure reason: Checkpoint was declined (task is closing)
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1389) 
> ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1382) 
> ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.triggerCheckpointBarrier(Task.java:1348)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskexecutor.TaskExecutor.triggerCheckpoint(TaskExecutor.java:956)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_322]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jane Chan (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-26535:
--
Description: Introduce StoreTableSource and StoreTableSink, which creates 
StoreSource and StoreSink respectively, and interact with TableStoreFactory,  
in order to enable stream/batch jobs via SQL.  (was: Introduce StoreTableSource 
and StoreTableSink, which creates StoreSource and StoreSink respectively, and 
interact with TableStoreFactory,  in order to enable running stream/batch jobs 
via SQL.)

> Introduce StoreTableSource and StoreTableSink
> -
>
> Key: FLINK-26535
> URL: https://issues.apache.org/jira/browse/FLINK-26535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Jane Chan
>Priority: Major
>
> Introduce StoreTableSource and StoreTableSink, which creates StoreSource and 
> StoreSink respectively, and interact with TableStoreFactory,  in order to 
> enable stream/batch jobs via SQL.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jane Chan (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jane Chan updated FLINK-26535:
--
Description: Introduce StoreTableSource and StoreTableSink, which creates 
StoreSource and StoreSink respectively, and interact with TableStoreFactory, to 
enable stream/batch jobs via SQL.  (was: Introduce StoreTableSource and 
StoreTableSink, which creates StoreSource and StoreSink respectively, and 
interact with TableStoreFactory,  in order to enable stream/batch jobs via SQL.)

> Introduce StoreTableSource and StoreTableSink
> -
>
> Key: FLINK-26535
> URL: https://issues.apache.org/jira/browse/FLINK-26535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Jane Chan
>Priority: Major
>
> Introduce StoreTableSource and StoreTableSink, which creates StoreSource and 
> StoreSink respectively, and interact with TableStoreFactory, to enable 
> stream/batch jobs via SQL.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Assigned] (FLINK-26534) shuffle by sink's primary key should cover the case that input changelog stream has a different parallelism

2022-03-08 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-26534:


Assignee: lincoln lee

> shuffle by sink's primary key should cover the case that input changelog 
> stream has a different parallelism
> ---
>
> Key: FLINK-26534
> URL: https://issues.apache.org/jira/browse/FLINK-26534
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.15.0
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Minor
>
> FLINK-20370 fix the wrong result when sink primary key is not the same with 
> query and introduced a new auto-keyby sink's primary key strategy for append 
> stream if the sink's parallelism differs from input stream's.
> But still exists one case to be solved:
> for a changelog stream, its changelog upsert key same as sink's primary key, 
> but sink's parallelism changed by user (via those sinks which implement the 
> `ParallelismProvider` interface, e.g., KafkaDynamicSink), we should fix it.
> And a minor change: keyby canbe omitted when sink has single parallism 
> (because none partitioner will cause worse disorder)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18994: [FLINK-26505][hive] Support non equality condition for left semi join…

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18994:
URL: https://github.com/apache/flink/pull/18994#issuecomment-1060435366


   
   ## CI report:
   
   * 83c89641a59ebb45c7260e04b096f3d79d4e32e0 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32661)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Assigned] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-26535:


Assignee: Jane Chan

> Introduce StoreTableSource and StoreTableSink
> -
>
> Key: FLINK-26535
> URL: https://issues.apache.org/jira/browse/FLINK-26535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>
> Introduce StoreTableSource and StoreTableSink, which creates StoreSource and 
> StoreSink respectively, and interact with TableStoreFactory, to enable 
> stream/batch jobs via SQL.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26535) Introduce StoreTableSource and StoreTableSink

2022-03-08 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee updated FLINK-26535:
-
Fix Version/s: table-store-0.1.0

> Introduce StoreTableSource and StoreTableSink
> -
>
> Key: FLINK-26535
> URL: https://issues.apache.org/jira/browse/FLINK-26535
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
> Fix For: table-store-0.1.0
>
>
> Introduce StoreTableSource and StoreTableSink, which creates StoreSource and 
> StoreSink respectively, and interact with TableStoreFactory, to enable 
> stream/batch jobs via SQL.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] smattheis commented on pull request #18984: [hotfix][docs] Fix HTML formatting in ops/metrics.md

2022-03-08 Thread GitBox



smattheis commented on pull request #18984:
URL: https://github.com/apache/flink/pull/18984#issuecomment-1061513936


   @pnowojski backport to any previous version not required as 
`watermarkAlignmentDrift` is new in 1.15


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-26527) ClassCastException in TemporaryClassLoaderContext

2022-03-08 Thread shizhengchao (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502790#comment-17502790
 ] 

shizhengchao commented on FLINK-26527:
--

[~martijnvisser] Thanks, I've got it, it's not a problem.

The userclassloader should not include flink related dependencies. So I will 
separate userclassloader and application classloader(parent) to solve this 
problem

> ClassCastException in TemporaryClassLoaderContext
> -
>
> Key: FLINK-26527
> URL: https://issues.apache.org/jira/browse/FLINK-26527
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.13.5, 1.14.3
>Reporter: shizhengchao
>Priority: Major
>
> When I try to run sql using flink's classloader, I get the following 
> exception：
> {code:java}
> Exception in thread "main" java.lang.ClassCastException: 
> org.codehaus.janino.CompilerFactory cannot be cast to 
> org.codehaus.commons.compiler.ICompilerFactory
>     at 
> org.codehaus.commons.compiler.CompilerFactoryFactory.getCompilerFactory(CompilerFactoryFactory.java:129)
>  
> ……{code}
> my code is like this：
> {code:java}
> Configuration configuration = new Configuration();
> configuration.set(CoreOptions.CLASSLOADER_RESOLVE_ORDER, "child-first");
> List dependencies = 
> FlinkClassLoader.getFlinkDependencies(FLINK_HOME/lib);
> URLClassLoader classLoader = ClientUtils.buildUserCodeClassLoader(
> dependencies,
> Collections.emptyList(),
> SessionContext.class.getClassLoader(),
> configuration);
> try (TemporaryClassLoaderContext ignored = 
> TemporaryClassLoaderContext.of(classLoader)) {     
>tableEnv.explainSql(sql);
>  
> //CompilerFactoryFactory.getCompilerFactory("org.codehaus.janino.CompilerFactory");
> } {code}
> But, if you change `classloader.resolve-order` to `parent-first`, everything 
> works fine



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26524) Elasticsearch (v5.3.3) sink end-to-end test

2022-03-08 Thread Martijn Visser (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502792#comment-17502792
 ] 

Martijn Visser commented on FLINK-26524:


This probably won't get picked up soonish unless it's going to appear multiple 
times. This implementation has been dropped in newer versions (since it's not 
supported anymore) plus it has been proven stable in the past

> Elasticsearch (v5.3.3) sink end-to-end test
> ---
>
> Key: FLINK-26524
> URL: https://issues.apache.org/jira/browse/FLINK-26524
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.14.3
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: test-stability
> Fix For: 1.14.4
>
>
> e2e test {{Elasticsearch (v5.3.3) sink end-to-end test}} failed in [this 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=32627&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=16598]
>  on {{release-1.14}} probably because of the following stacktrace showing up 
> in the logs:
> {code}
> Mar 07 15:40:41 2022-03-07 15:40:40,336 WARN  
> org.apache.flink.runtime.checkpoint.CheckpointFailureManager [] - Failed to 
> trigger checkpoint 1 for job 3a2fd4c6fb03d5b20929a6f2b7131d82. (0 consecutive 
> failed attempts so far)
> Mar 07 15:40:41 org.apache.flink.runtime.checkpoint.CheckpointException: 
> Checkpoint was declined (task is closing)
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.messages.checkpoint.SerializedCheckpointException.unwrap(SerializedCheckpointException.java:51)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.checkpoint.CheckpointCoordinator.receiveDeclineMessage(CheckpointCoordinator.java:988)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$declineCheckpoint$2(ExecutionGraphHandler.java:103)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.scheduler.ExecutionGraphHandler.lambda$processCheckpointCoordinatorMessage$3(ExecutionGraphHandler.java:119)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [?:1.8.0_322]
> Mar 07 15:40:41   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [?:1.8.0_322]
> Mar 07 15:40:41   at java.lang.Thread.run(Thread.java:750) [?:1.8.0_322]
> Mar 07 15:40:41 Caused by: org.apache.flink.util.SerializedThrowable: Task 
> name with subtask : Source: Sequence Source (Deprecated) -> Flat Map -> Sink: 
> Unnamed (1/1)#0 Failure reason: Checkpoint was declined (task is closing)
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1389) 
> ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.declineCheckpoint(Task.java:1382) 
> ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskmanager.Task.triggerCheckpointBarrier(Task.java:1348)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at 
> org.apache.flink.runtime.taskexecutor.TaskExecutor.triggerCheckpoint(TaskExecutor.java:956)
>  ~[flink-dist_2.11-1.14-SNAPSHOT.jar:1.14-SNAPSHOT]
> Mar 07 15:40:41   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_322]
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19002: [FLINK-26501] extra log

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19002:
URL: https://github.com/apache/flink/pull/19002#issuecomment-1060970183


   
   ## CI report:
   
   * a7efad2153427eeb07dcdede96c89f6db1ec5b7f Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32650)
 
   * cbdee82a50ffe1a7c3fdc3b708bafd6258229afc UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19004: [FLINK-26490][checkpoint] Adjust the MaxParallelism when operator state don't contain keyed state.

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19004:
URL: https://github.com/apache/flink/pull/19004#issuecomment-1061472524


   
   ## CI report:
   
   * b4fe404fd3eb4563d6e476c83c4d29c947752571 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32669)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26137) Create webhook REST api test

2022-03-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26137:
---
Labels: pull-request-available  (was: )

> Create webhook REST api test
> 
>
> Key: FLINK-26137
> URL: https://issues.apache.org/jira/browse/FLINK-26137
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
>  Labels: pull-request-available
>
> Add test to validate the webhook rest endpoint and make sure it returns the 
> expected responses, status codes etc.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #19002: [FLINK-26501] extra log

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19002:
URL: https://github.com/apache/flink/pull/19002#issuecomment-1060970183


   
   ## CI report:
   
   * a7efad2153427eeb07dcdede96c89f6db1ec5b7f Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32650)
 
   * cbdee82a50ffe1a7c3fdc3b708bafd6258229afc Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32672)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19004: [FLINK-26490][checkpoint] Adjust the MaxParallelism when operator state don't contain keyed state.

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19004:
URL: https://github.com/apache/flink/pull/19004#issuecomment-1061472524


   
   ## CI report:
   
   * b4fe404fd3eb4563d6e476c83c4d29c947752571 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32669)
 
   * b4d2eac3584665d4e667ed60b5231a3850b18430 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-26527) ClassCastException in TemporaryClassLoaderContext

2022-03-08 Thread Martijn Visser (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn Visser closed FLINK-26527.
--
Resolution: Not A Problem

Thanks for getting back on this [~tinny] ! Good luck on your next steps

> ClassCastException in TemporaryClassLoaderContext
> -
>
> Key: FLINK-26527
> URL: https://issues.apache.org/jira/browse/FLINK-26527
> Project: Flink
>  Issue Type: Bug
>  Components: Client / Job Submission
>Affects Versions: 1.13.5, 1.14.3
>Reporter: shizhengchao
>Priority: Major
>
> When I try to run sql using flink's classloader, I get the following 
> exception：
> {code:java}
> Exception in thread "main" java.lang.ClassCastException: 
> org.codehaus.janino.CompilerFactory cannot be cast to 
> org.codehaus.commons.compiler.ICompilerFactory
>     at 
> org.codehaus.commons.compiler.CompilerFactoryFactory.getCompilerFactory(CompilerFactoryFactory.java:129)
>  
> ……{code}
> my code is like this：
> {code:java}
> Configuration configuration = new Configuration();
> configuration.set(CoreOptions.CLASSLOADER_RESOLVE_ORDER, "child-first");
> List dependencies = 
> FlinkClassLoader.getFlinkDependencies(FLINK_HOME/lib);
> URLClassLoader classLoader = ClientUtils.buildUserCodeClassLoader(
> dependencies,
> Collections.emptyList(),
> SessionContext.class.getClassLoader(),
> configuration);
> try (TemporaryClassLoaderContext ignored = 
> TemporaryClassLoaderContext.of(classLoader)) {     
>tableEnv.explainSql(sql);
>  
> //CompilerFactoryFactory.getCompilerFactory("org.codehaus.janino.CompilerFactory");
> } {code}
> But, if you change `classloader.resolve-order` to `parent-first`, everything 
> works fine



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 8e46affb05226eeed6b8eb20df971445139654a8 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32535)
 
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   * ca91f03a2674f7041d3a66c618f833809c6e98aa UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19004: [FLINK-26490][checkpoint] Adjust the MaxParallelism when operator state don't contain keyed state.

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19004:
URL: https://github.com/apache/flink/pull/19004#issuecomment-1061472524


   
   ## CI report:
   
   * b4fe404fd3eb4563d6e476c83c4d29c947752571 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32669)
 
   * b4d2eac3584665d4e667ed60b5231a3850b18430 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32673)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Assigned] (FLINK-26137) Create webhook REST api test

2022-03-08 Thread Gyula Fora (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-26137:
--

Assignee: Nicholas Jiang

> Create webhook REST api test
> 
>
> Key: FLINK-26137
> URL: https://issues.apache.org/jira/browse/FLINK-26137
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Nicholas Jiang
>Priority: Major
>  Labels: pull-request-available
>
> Add test to validate the webhook rest endpoint and make sure it returns the 
> expected responses, status codes etc.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26137) Create webhook REST api test

2022-03-08 Thread Gyula Fora (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502798#comment-17502798
 ] 

Gyula Fora commented on FLINK-26137:


done :) 

> Create webhook REST api test
> 
>
> Key: FLINK-26137
> URL: https://issues.apache.org/jira/browse/FLINK-26137
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Nicholas Jiang
>Priority: Major
>  Labels: pull-request-available
>
> Add test to validate the webhook rest endpoint and make sure it returns the 
> expected responses, status codes etc.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26521) Reconsider setting generationAwareEventProcessing = true

2022-03-08 Thread Matyas Orhidi (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502799#comment-17502799
 ] 

Matyas Orhidi commented on FLINK-26521:
---

+1 Since we don't need to trigger savepoints using annotation anymore. I'm in 
favor of this change.

> Reconsider setting generationAwareEventProcessing = true
> 
>
> Key: FLINK-26521
> URL: https://issues.apache.org/jira/browse/FLINK-26521
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> At the moment (if I understand correctly) FlinkDeployment status changes 
> automatically trigger an immediate reconcile step due to having 
> generationAwareEventProcessing = false
> on the FlinkDeploymentController .
> This causes a weird behaviour where reschedule delays and logic around these 
> are not respected properly and many cases.
> This might cause issues as timeout exceptions etc. We should consider whether 
> we need this flag on false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #33: [FLINK-26346] Add statistics collecting to sst files

2022-03-08 Thread GitBox



JingsongLi commented on a change in pull request #33:
URL: https://github.com/apache/flink-table-store/pull/33#discussion_r821432900



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/stats/OrcFileStatsExtractor.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.stats;
+
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.table.data.DecimalData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.table.types.logical.DecimalType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.flink.table.utils.DateTimeUtils;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.orc.BooleanColumnStatistics;
+import org.apache.orc.ColumnStatistics;
+import org.apache.orc.DateColumnStatistics;
+import org.apache.orc.DecimalColumnStatistics;
+import org.apache.orc.DoubleColumnStatistics;
+import org.apache.orc.IntegerColumnStatistics;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.StringColumnStatistics;
+import org.apache.orc.TimestampColumnStatistics;
+import org.apache.orc.TypeDescription;
+
+import java.io.IOException;
+import java.sql.Date;
+import java.util.List;
+import java.util.stream.IntStream;
+
+/** {@link FileStatsExtractor} for orc files. */
+public class OrcFileStatsExtractor implements FileStatsExtractor {

Review comment:
   Hadoop-related dependencies are a headache




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-26521) Reconsider setting generationAwareEventProcessing = true

2022-03-08 Thread Matyas Orhidi (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502800#comment-17502800
 ] 

Matyas Orhidi commented on FLINK-26521:
---

We need to think of a clever way to restart the reconcile loop in case of an 
error, e.g. validation. Annotations provided a way to to this without changing 
the spec.

> Reconsider setting generationAwareEventProcessing = true
> 
>
> Key: FLINK-26521
> URL: https://issues.apache.org/jira/browse/FLINK-26521
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> At the moment (if I understand correctly) FlinkDeployment status changes 
> automatically trigger an immediate reconcile step due to having 
> generationAwareEventProcessing = false
> on the FlinkDeploymentController .
> This causes a weird behaviour where reschedule delays and logic around these 
> are not respected properly and many cases.
> This might cause issues as timeout exceptions etc. We should consider whether 
> we need this flag on false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Closed] (FLINK-26526) Record hasNull and allNull instead of nullCount in FieldStats

2022-03-08 Thread Caizhi Weng (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Caizhi Weng closed FLINK-26526.
---
Resolution: Not A Problem

> Record hasNull and allNull instead of nullCount in FieldStats
> -
>
> Key: FLINK-26526
> URL: https://issues.apache.org/jira/browse/FLINK-26526
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Affects Versions: 0.1.0
>Reporter: Caizhi Weng
>Priority: Major
>
> Currently we aren't strongly relying on {{nullCount}}. Also, some formats 
> (for example orc) does not support {{nullCount}} statistics. So we can record 
> {{hasNull}} and {{allNull}} instead of {{nullCount}}.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Comment Edited] (FLINK-26521) Reconsider setting generationAwareEventProcessing = true

2022-03-08 Thread Matyas Orhidi (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502800#comment-17502800
 ] 

Matyas Orhidi edited comment on FLINK-26521 at 3/8/22, 8:34 AM:


We need to think of a clever way to restart the reconcile loop in case of an 
error, e.g. validation. Annotations provided a way to do this without changing 
the spec.


was (Author: JIRAUSER282074):
We need to think of a clever way to restart the reconcile loop in case of an 
error, e.g. validation. Annotations provided a way to to this without changing 
the spec.

> Reconsider setting generationAwareEventProcessing = true
> 
>
> Key: FLINK-26521
> URL: https://issues.apache.org/jira/browse/FLINK-26521
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> At the moment (if I understand correctly) FlinkDeployment status changes 
> automatically trigger an immediate reconcile step due to having 
> generationAwareEventProcessing = false
> on the FlinkDeploymentController .
> This causes a weird behaviour where reschedule delays and logic around these 
> are not respected properly and many cases.
> This might cause issues as timeout exceptions etc. We should consider whether 
> we need this flag on false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #33: [FLINK-26346] Add statistics collecting to sst files

2022-03-08 Thread GitBox



JingsongLi commented on a change in pull request #33:
URL: https://github.com/apache/flink-table-store/pull/33#discussion_r821433871



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/stats/OrcFileStatsExtractor.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.stats;
+
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.table.data.DecimalData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.table.types.logical.DecimalType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.flink.table.utils.DateTimeUtils;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.orc.BooleanColumnStatistics;
+import org.apache.orc.ColumnStatistics;
+import org.apache.orc.DateColumnStatistics;
+import org.apache.orc.DecimalColumnStatistics;
+import org.apache.orc.DoubleColumnStatistics;
+import org.apache.orc.IntegerColumnStatistics;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.StringColumnStatistics;
+import org.apache.orc.TimestampColumnStatistics;
+import org.apache.orc.TypeDescription;
+
+import java.io.IOException;
+import java.sql.Date;
+import java.util.List;
+import java.util.stream.IntStream;
+
+/** {@link FileStatsExtractor} for orc files. */
+public class OrcFileStatsExtractor implements FileStatsExtractor {
+
+private final RowType rowType;
+
+public OrcFileStatsExtractor(RowType rowType) {
+this.rowType = rowType;
+}
+
+@Override
+public FieldStats[] extract(Path path) throws IOException {
+org.apache.hadoop.fs.Path hadoopPath = new 
org.apache.hadoop.fs.Path(path.toUri());
+Reader reader =
+OrcFile.createReader(hadoopPath, OrcFile.readerOptions(new 
Configuration()));
+
+long rowCount = reader.getNumberOfRows();
+ColumnStatistics[] columnStatistics = reader.getStatistics();
+TypeDescription schema = reader.getSchema();
+List columnNames = schema.getFieldNames();
+List columnTypes = schema.getChildren();
+
+return IntStream.range(0, rowType.getFieldCount())
+.mapToObj(
+i -> {
+RowType.RowField field = 
rowType.getFields().get(i);
+int fieldIdx = 
columnNames.indexOf(field.getName());
+int colId = columnTypes.get(fieldIdx).getId();
+return toFieldStats(field, 
columnStatistics[colId], rowCount);
+})
+.toArray(FieldStats[]::new);
+}
+
+private FieldStats toFieldStats(RowType.RowField field, ColumnStatistics 
stats, long rowCount) {
+long nullCount = rowCount - stats.getNumberOfValues();

Review comment:
   We can check nullCount with `hasNull`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18303: [FLINK-25085][runtime] Add a scheduled thread pool for periodic tasks in RpcEndpoint

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18303:
URL: https://github.com/apache/flink/pull/18303#issuecomment-1008005239


   
   ## CI report:
   
   * ecd80ec7a11dae690b258377aab2428872a0abdb Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32664)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   * ca91f03a2674f7041d3a66c618f833809c6e98aa UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] lincoln-lil opened a new pull request #19006: [FLINK-26534][table-planner] shuffle by sink's primary key should cover the case that input changelog stream has a different parallelism

2022-03-08 Thread GitBox



lincoln-lil opened a new pull request #19006:
URL: https://github.com/apache/flink/pull/19006


   ## What is the purpose of the change
   FLINK-20370 fix the wrong result when sink primary key is not the same with 
query and introduced a new auto-keyby sink's primary key strategy for append 
stream if the sink's parallelism differs from input stream's.
   
   But still exists one case to be solved:
   for a changelog stream, its changelog upsert key same as sink's primary key, 
but sink's parallelism changed by user (via those sinks which implement the 
`ParallelismProvider` interface, e.g., KafkaDynamicSink), we should fix it.
   
   And a minor change: keyby canbe omitted when sink has single parallism 
(because none partitioner will cause worse disorder)
   
   ## Brief change log
   update the logic of applyKeyBy in `CommonExecSink`
   
   ## Verifying this change
   Streaming sql's `TableSinkTest` newly added test cases
   
   ## Does this pull request potentially affect one of the following parts:
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
@Public(Evolving): (no)
 - The serializers: (no )
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   
   ## Documentation
 - Does this pull request introduce a new feature? (no)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26534) shuffle by sink's primary key should cover the case that input changelog stream has a different parallelism

2022-03-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26534:
---
Labels: pull-request-available  (was: )

> shuffle by sink's primary key should cover the case that input changelog 
> stream has a different parallelism
> ---
>
> Key: FLINK-26534
> URL: https://issues.apache.org/jira/browse/FLINK-26534
> Project: Flink
>  Issue Type: Bug
>  Components: Table SQL / Planner
>Affects Versions: 1.15.0
>Reporter: lincoln lee
>Assignee: lincoln lee
>Priority: Minor
>  Labels: pull-request-available
>
> FLINK-20370 fix the wrong result when sink primary key is not the same with 
> query and introduced a new auto-keyby sink's primary key strategy for append 
> stream if the sink's parallelism differs from input stream's.
> But still exists one case to be solved:
> for a changelog stream, its changelog upsert key same as sink's primary key, 
> but sink's parallelism changed by user (via those sinks which implement the 
> `ParallelismProvider` interface, e.g., KafkaDynamicSink), we should fix it.
> And a minor change: keyby canbe omitted when sink has single parallism 
> (because none partitioner will cause worse disorder)



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26496) [JUnit5 Migration] Module: flink-yarn-test

2022-03-08 Thread RocMarshal (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

RocMarshal updated FLINK-26496:
---
Parent: FLINK-25325
Issue Type: Sub-task  (was: Improvement)

> [JUnit5 Migration] Module: flink-yarn-test
> --
>
> Key: FLINK-26496
> URL: https://issues.apache.org/jira/browse/FLINK-26496
> Project: Flink
>  Issue Type: Sub-task
>  Components: Deployment / YARN, Tests
>Reporter: RocMarshal
>Priority: Minor
>  Labels: pull-request-available
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Assigned] (FLINK-26513) Document operator helm chart

2022-03-08 Thread Gyula Fora (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-26513:
--

Assignee: Márton Balassi

> Document operator helm chart
> 
>
> Key: FLINK-26513
> URL: https://issues.apache.org/jira/browse/FLINK-26513
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Márton Balassi
>Priority: Major
>
> The helm chart is growing with many different options to cover a wide set of 
> environments.
> We should try to document these features.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Assigned] (FLINK-26468) Test default binding to localhost

2022-03-08 Thread Joe Moser (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Moser reassigned FLINK-26468:
-

Assignee: Gabor Somogyi

> Test default binding to localhost
> -
>
> Key: FLINK-26468
> URL: https://issues.apache.org/jira/browse/FLINK-26468
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Configuration
>Affects Versions: 1.15.0
>Reporter: Mika Naylor
>Assignee: Gabor Somogyi
>Priority: Blocker
>  Labels: release-testing
> Fix For: 1.15.0
>
>
> Change introduced in: https://issues.apache.org/jira/browse/FLINK-24474
> For security reasons, we have bound the REST and RPC endpoints (for the 
> JobManagers and TaskManagers) to the loopback address (localhost/127.0.0.1) 
> to prevent clusters from being accidentally exposed to the outside world.
> These were:
> * jobmanager.bind-host
> * taskmanager.bind-host
> * rest.bind-address
> Some suggestions to test:
> * Test that spinning up a Flink cluster with the default flink-conf.yaml 
> works correctly locally with different set ups (1 TaskManager, several task 
> managers, default parallelism, > 1 parallelism). Test that the JobManagers 
> and TaskManagers can communicate, and that the REST endpoint is accessable 
> locally. Test that the REST/RPC endpoints are not accessable outside of the 
> local machine.
> * Test that removing the binding configuration for the above mentioned 
> settings means that the cluster binds to 0.0.0.0 and is accessable to the 
> outside world (this may involve also changing rest.address, 
> jobmanager.rpc.address and taskmanager.rpc.address)
> * Test that default Flink setups with docker behave correctly.
> * Test that default Flink setups behave correctly with other resource 
> providers (kubernetes native, etc).



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26424) RequestTimeoutException

2022-03-08 Thread Till Rohrmann (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502804#comment-17502804
 ] 

Till Rohrmann commented on FLINK-26424:
---

I think we will soon solve this problem. There is no PR for it yet, though.

> RequestTimeoutException 
> 
>
> Key: FLINK-26424
> URL: https://issues.apache.org/jira/browse/FLINK-26424
> Project: Flink
>  Issue Type: Bug
>  Components: Stateful Functions
>Affects Versions: statefun-3.2.0
>Reporter: xinchenyuan
>Priority: Major
>
> there is no max retries, all I got is the call timeout
> as doc said, [Transport Spec  
> |https://nightlies.apache.org/flink/flink-statefun-docs-release-3.2/docs/modules/http-endpoint/#transport-1]call
>  will be failed after timeout.
> but when expcetion raised, runtime restart,  I'm confused why a function 
> internal error will cause such a big problem, will MAX RETRIES be a 
> configurable param?
>  
> 2022-02-28 17:58:32
> org.apache.flink.statefun.flink.core.functions.StatefulFunctionInvocationException:
>  An error occurred when attempting to invoke function FunctionType(tendoc, 
> AlertNotificationIngressCkafka).
> at 
> org.apache.flink.statefun.flink.core.functions.StatefulFunction.receive(StatefulFunction.java:50)
> at 
> org.apache.flink.statefun.flink.core.functions.ReusableContext.apply(ReusableContext.java:73)
> at 
> org.apache.flink.statefun.flink.core.functions.FunctionActivation.applyNextPendingEnvelope(FunctionActivation.java:50)
> at 
> org.apache.flink.statefun.flink.core.functions.LocalFunctionGroup.processNextEnvelope(LocalFunctionGroup.java:61)
> at 
> org.apache.flink.statefun.flink.core.functions.Reductions.processEnvelopes(Reductions.java:164)
> at 
> org.apache.flink.statefun.flink.core.functions.AsyncSink.drainOnOperatorThread(AsyncSink.java:119)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:50)
> at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:90)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxExecutorImpl.yield(MailboxExecutorImpl.java:86)
> at 
> org.apache.flink.statefun.flink.core.functions.FunctionGroupOperator.processElement(FunctionGroupOperator.java:88)
> at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.pushToOperator(ChainingOutput.java:108)
> at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:89)
> at 
> org.apache.flink.streaming.runtime.tasks.ChainingOutput.collect(ChainingOutput.java:39)
> at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:50)
> at 
> org.apache.flink.streaming.api.operators.CountingOutput.collect(CountingOutput.java:28)
> at 
> org.apache.flink.statefun.flink.core.feedback.FeedbackUnionOperator.sendDownstream(FeedbackUnionOperator.java:186)
> at 
> org.apache.flink.statefun.flink.core.feedback.FeedbackUnionOperator.processElement(FeedbackUnionOperator.java:86)
> at 
> org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:205)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
> at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:623)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
> at java.base/java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.IllegalStateException: Failure forwarding a message to a 
> remote function Address(tendoc, AlertNotificationIngressCkafka, cls-message)
> at 
> org.apache.flink.statefun.flink.core.reqreply.RequestReplyFunction.onAsyncResult(RequestReplyFunction.java:170)
> at 
> org.apache.flink.statefun.flink.core.reqreply.RequestReplyFunction.invoke(RequestReplyFunction.java:124)
> at 
> org.apach

[jira] [Commented] (FLINK-23190) Make task-slot allocation much more evenly

2022-03-08 Thread Till Rohrmann (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502807#comment-17502807
 ] 

Till Rohrmann commented on FLINK-23190:
---

Sorry for not getting back to your PR [~loyi]. Unfortunately, I have left my 
previous company and am no longer working very actively on the Flink project. 
The best thing would be to reach out to the Flink community and look for a new 
shepherd. Sorry for the inconveniences I've caused here.

> Make task-slot allocation much more evenly
> --
>
> Key: FLINK-23190
> URL: https://issues.apache.org/jira/browse/FLINK-23190
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.3, 1.13.1
>Reporter: loyi
>Assignee: loyi
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-07-16-10-34-30-700.png
>
>
> FLINK-12122 only guarantees spreading out tasks across the set of TMs which 
> are registered at the time of scheduling, but our jobs are all runing on 
> active yarn mode， the job with smaller source parallelism offen cause 
> load-balance issues. 
>  
> For this job:
> {code:java}
> //  -ys 4 means 10 taskmanagers
> env.addSource(...).name("A").setParallelism(10).
>  map(...).name("B").setParallelism(30)
>  .map(...).name("C").setParallelism(40)
>  .addSink(...).name("D").setParallelism(20);
> {code}
>  
>  Flink-1.12.3 task allocation: 
> ||operator||tm1 ||tm2||tm3||tm4||tm5||5m6||tm7||tm8||tm9||tm10||
> |A| 
> 1|{color:#de350b}2{color}|{color:#de350b}2{color}|1|1|{color:#de350b}3{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|
> |B|3|3|3|3|3|3|3|3|{color:#de350b}2{color}|{color:#de350b}4{color}|
> |C|4|4|4|4|4|4|4|4|4|4|
> |D|2|2|2|2|2|{color:#de350b}1{color}|{color:#de350b}1{color}|2|2|{color:#de350b}4{color}|
>  
> Suggestions:
> When TaskManger start register slots to slotManager , current processing 
> logic will choose  the first pendingSlot which meet its resource 
> requirements.  The "random" strategy usually causes uneven task allocation 
> when source-operator's parallelism is significantly below process-operator's. 
>   A simple feasible idea  is  {color:#de350b}partition{color} the current  
> "{color:#de350b}pendingSlots{color}" by their "JobVertexIds" (such as  let 
> AllocationID bring the detail)  , then allocate the slots proportionally to 
> each JobVertexGroup.
>  
> For above case, the 40 pendingSlots could be divided into 4 groups：
> [ABCD]: 10        // A、B、C、D reprents  {color:#de350b}jobVertexId{color}
> [BCD]: 10
> [CD]: 10
> [D]: 10
>  
> Every taskmanager will provide 4 slots one time, and each group will get 1 
> slot according their proportion (1/4), the final allocation result is below:
> [ABCD] : deploye on 10 different taskmangers
> [BCD]: deploye on 10 different taskmangers
> [CD]: deploye on 10  different taskmangers
> [D]: deploye on 10 different taskmangers
>  
> I have implement a [concept 
> code|https://github.com/saddays/flink/commit/dc82e60a7c7599fbcb58c14f8e3445bc8d07ace1]
>   based on Flink-1.12.3 ,  the patch version has {color:#de350b}fully 
> evenly{color} task allocation , and works well on my workload .  Are there 
> other point that have not been considered or  does it conflict with future 
> plans?      Sorry for my poor english.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Closed] (FLINK-26377) Extract Reconciler interface

2022-03-08 Thread Gyula Fora (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-26377.
--
Resolution: Fixed

> Extract Reconciler interface
> 
>
> Key: FLINK-26377
> URL: https://issues.apache.org/jira/browse/FLINK-26377
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Aitozi
>Priority: Major
>  Labels: pull-request-available
>
> We should extract a common interface for the different reconciler classes 
> (Job and Session for now) and create the reconciler instance on the fly based 
> on the FlinkDeployment.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Closed] (FLINK-26471) Deleting the operator while jobs are running causes the jobs to fail

2022-03-08 Thread Gyula Fora (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-26471.
--
Resolution: Fixed

merged: 8abb9f7af486e6e2cc729b24645d0ed36e8482fc

> Deleting the operator while jobs are running causes the jobs to fail
> 
>
> Key: FLINK-26471
> URL: https://issues.apache.org/jira/browse/FLINK-26471
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Márton Balassi
>Priority: Critical
>
> Deleting the operator helm chart triggers the following error in the 
> jobmanager:
> Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation: 
> [list]  for kind: [ConfigMap]  with name: [null]  in namespace: [default]  
> failed.
> ...
> Caused by: java.io.FileNotFoundException: /opt/flink/.kube/config (No such 
> file or directory)
> Need to look into why this is happening and should try to avoid it. 
> cc [~wangyang0918] 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #17959: [hotfix][doc] Remove redundant document description

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #17959:
URL: https://github.com/apache/flink/pull/17959#issuecomment-982471429


   
   ## CI report:
   
   * e7dfb46f6688fdb610a54f1058b0d6069d0c93d3 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32663)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Created] (FLINK-26536) PyFlink RemoteKeyedStateBackend#merge_namespaces bug

2022-03-08 Thread Juntao Hu (Jira)

Juntao Hu created FLINK-26536:
-

 Summary: PyFlink RemoteKeyedStateBackend#merge_namespaces bug
 Key: FLINK-26536
 URL: https://issues.apache.org/jira/browse/FLINK-26536
 Project: Flink
  Issue Type: Bug
  Components: API / Python
Affects Versions: 1.14.3
Reporter: Juntao Hu
 Fix For: 1.15.0


There's two bugs in RemoteKeyedStateBackend#merge_namespaces:
 * data in source namespaces are not commited before merging
 * target namespace is added at the head of sources_bytes, which causes java 
side SimpleStateRequestHandler to leave one source namespace unmerged



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot commented on pull request #19006: [FLINK-26534][table-planner] shuffle by sink's primary key should cover the case that input changelog stream has a different parallelism

2022-03-08 Thread GitBox



flinkbot commented on pull request #19006:
URL: https://github.com/apache/flink/pull/19006#issuecomment-1061561518


   
   ## CI report:
   
   * a110bde8a2512a88f6f77b0d4b1d6952f599ac43 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-25684) Support enhanced show databases syntax

2022-03-08 Thread Jingsong Lee (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-25684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502814#comment-17502814
 ] 

Jingsong Lee commented on FLINK-25684:
--

[~jark] 

> Support enhanced show databases syntax
> --
>
> Key: FLINK-25684
> URL: https://issues.apache.org/jira/browse/FLINK-25684
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / API
>Reporter: Moses
>Priority: Major
>  Labels: pull-request-available
>
> Enhanced `show databases` statement like ` show databasesfrom like 'db%' ` 
> has been supported broadly in many popular SQL engine like Spark SQL/MySQL.
> We could use such statement to easily show the databases that we wannted.
> h3. SHOW DATABSES [ LIKE regex_pattern ]
> Examples:
> {code:java}
> Flink SQL> create database db1;
> [INFO] Execute statement succeed.
> Flink SQL> create database db1_1;
> [INFO] Execute statement succeed.
> Flink SQL> create database pre_db;
> [INFO] Execute statement succeed.
> Flink SQL> show databases;
> +--+
> |database name |
> +--+
> | default_database |
> |  db1 |
> |db1_1 |
> |   pre_db |
> +--+
> 4 rows in set
> Flink SQL> show databases like 'db1';
> +---+
> | database name |
> +---+
> |   db1 |
> +---+
> 1 row in set
> Flink SQL> show databases like 'db%';
> +---+
> | database name |
> +---+
> |   db1 |
> | db1_1 |
> +---+
> 2 rows in set
> Flink SQL> show databases like '%db%';
> +---+
> | database name |
> +---+
> |   db1 |
> | db1_1 |
> |pre_db |
> +---+
> 3 rows in set
> Flink SQL> show databases like '%db';
> +---+
> | database name |
> +---+
> |pre_db |
> +---+
> 1 row in set
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   * ca91f03a2674f7041d3a66c618f833809c6e98aa UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19006: [FLINK-26534][table-planner] shuffle by sink's primary key should cover the case that input changelog stream has a different parallel

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19006:
URL: https://github.com/apache/flink/pull/19006#issuecomment-1061561518


   
   ## CI report:
   
   * a110bde8a2512a88f6f77b0d4b1d6952f599ac43 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32674)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] LadyForest opened a new pull request #19007: [FLINK-26495][table-planner] Prohibit hints(dynamic table options) on view

2022-03-08 Thread GitBox



LadyForest opened a new pull request #19007:
URL: https://github.com/apache/flink/pull/19007


   ## What is the purpose of the change
   
   This PR aims to throw an explicit exception when querying a view with hints 
to improve user experience. Currently, the hints on view are swallowed quietly 
during the planning phase without any trace, and users complain that the hints 
don't work on view. 
   
   ## Brief changelog
   
 - Add check on `ExpandingPreparingTable` to throw a `ValidationException`
 - A monitor fix for `CliClientITCase` to correct the assertion order for 
expected and actual
   
   ## Verifying this change
   
   This change added tests and can be verified as follows:
   
 - `TableEnvironmentTest#testQueryViewWithHints`
 - `OptionHintsTest#testOptionsHintOnTableApiView` and 
`#testOptionsHintOnSQLView`
 -  Add test case in `view.q`, which can be verified by running 
`CliClientITCase`
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (yes / **no**)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (yes / **no**)
 - The serializers: (yes / **no** / don't know)
 - The runtime per-record code paths (performance sensitive): (yes / **no** 
/ don't know)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / **no** / don't 
know)
 - The S3 file system connector: (yes / **no** / don't know)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (yes / **no**)
 - If yes, how is the feature documented? (**not applicable** / docs / 
JavaDocs / not documented)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26495) Dynamic table options does not work for view

2022-03-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26495:
---
Labels: pull-request-available  (was: )

> Dynamic table options does not work for view
> 
>
> Key: FLINK-26495
> URL: https://issues.apache.org/jira/browse/FLINK-26495
> Project: Flink
>  Issue Type: Improvement
>  Components: Table SQL / Planner
>Affects Versions: 1.15.0
>Reporter: Jane Chan
>Assignee: Jane Chan
>Priority: Major
>  Labels: pull-request-available
>
> The dynamic table options (aka. table hints) syntax
> {code:java}
> table_identifier /*+ OPTIONS(key=val [, key=val]*) */ {code}
> does not work for the view without any exception thrown or suggestions to 
> users. It is not user-friendly and misleading. We should either throw a 
> meaningful exception or support this feature for view.
>  
> h4. How to reproduce
> Run the following statements in SQL CLI
> {code:java}
> Flink SQL> create table datagen (f0 int, f1 double) with ('connector' = 
> 'datagen', 'number-of-rows' = '5');
> [INFO] Execute statement succeed.
> Flink SQL> create view my_view as select * from datagen;
> [INFO] Execute statement succeed.
> Flink SQL> explain plan for select * from my_view /*+ 
> OPTIONS('number-of-rows' = '1') */;
> == Abstract Syntax Tree ==
> LogicalProject(f0=[$0], f1=[$1])
> +- LogicalTableScan(table=[[default_catalog, default_database, datagen]])
> == Optimized Physical Plan ==
> TableSourceScan(table=[[default_catalog, default_database, datagen]], 
> fields=[f0, f1])
> == Optimized Execution Plan ==
> TableSourceScan(table=[[default_catalog, default_database, datagen]], 
> fields=[f0, f1]) {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26521) Reconsider setting generationAwareEventProcessing = true

2022-03-08 Thread Yang Wang (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502816#comment-17502816
 ] 

Yang Wang commented on FLINK-26521:
---

Maybe we could introduce a new reconciler in the future, which only responses 
to the annotations. It just works like we send a "control command" to the flink 
k8s operator and then it finishes its work.

> Reconsider setting generationAwareEventProcessing = true
> 
>
> Key: FLINK-26521
> URL: https://issues.apache.org/jira/browse/FLINK-26521
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> At the moment (if I understand correctly) FlinkDeployment status changes 
> automatically trigger an immediate reconcile step due to having 
> generationAwareEventProcessing = false
> on the FlinkDeploymentController .
> This causes a weird behaviour where reschedule delays and logic around these 
> are not respected properly and many cases.
> This might cause issues as timeout exceptions etc. We should consider whether 
> we need this flag on false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   * ca91f03a2674f7041d3a66c618f833809c6e98aa Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32675)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19007: [FLINK-26495][table-planner] Prohibit hints(dynamic table options) on view

2022-03-08 Thread GitBox



flinkbot commented on pull request #19007:
URL: https://github.com/apache/flink/pull/19007#issuecomment-1061572668


   
   ## CI report:
   
   * 7e6f04294c00b3ec399aef1b3d8f71dfe150c579 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Closed] (FLINK-26521) Reconsider setting generationAwareEventProcessing = true

2022-03-08 Thread Gyula Fora (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-26521.
--
Resolution: Fixed

closed: a2412c0f2434d8005be12318ec347cbe60502753

> Reconsider setting generationAwareEventProcessing = true
> 
>
> Key: FLINK-26521
> URL: https://issues.apache.org/jira/browse/FLINK-26521
> Project: Flink
>  Issue Type: Sub-task
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
>
> At the moment (if I understand correctly) FlinkDeployment status changes 
> automatically trigger an immediate reconcile step due to having 
> generationAwareEventProcessing = false
> on the FlinkDeploymentController .
> This causes a weird behaviour where reschedule delays and logic around these 
> are not respected properly and many cases.
> This might cause issues as timeout exceptions etc. We should consider whether 
> we need this flag on false.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] Vancior opened a new pull request #19008: [FLINK-26536] [python] Fix PyFlink RemoteKeyedStateBackend#merge_namespaces bugs

2022-03-08 Thread GitBox



Vancior opened a new pull request #19008:
URL: https://github.com/apache/flink/pull/19008


   
   
   ## What is the purpose of the change
   
   This pull request fix two bugs in PyFlink 
RemoteKeyedStateBackend#merge_namespaces, which lead to incorrect result of 
session window merging.
   
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): (no)
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: (no)
 - The serializers: (no)
 - The runtime per-record code paths (performance sensitive): (no)
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
 - The S3 file system connector: (no)
   
   ## Documentation
   
 - Does this pull request introduce a new feature? (no)
 - If yes, how is the feature documented? (not applicable)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-26536) PyFlink RemoteKeyedStateBackend#merge_namespaces bug

2022-03-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26536:
---
Labels: pull-request-available  (was: )

> PyFlink RemoteKeyedStateBackend#merge_namespaces bug
> 
>
> Key: FLINK-26536
> URL: https://issues.apache.org/jira/browse/FLINK-26536
> Project: Flink
>  Issue Type: Bug
>  Components: API / Python
>Affects Versions: 1.14.3
>Reporter: Juntao Hu
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.15.0
>
>
> There's two bugs in RemoteKeyedStateBackend#merge_namespaces:
>  * data in source namespaces are not commited before merging
>  * target namespace is added at the head of sources_bytes, which causes java 
> side SimpleStateRequestHandler to leave one source namespace unmerged



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] SteNicholas commented on pull request #18892: [FLINK-25177][table] Support 'DESCRIBE TABLE EXTENDED' with managed table

2022-03-08 Thread GitBox



SteNicholas commented on pull request #18892:
URL: https://github.com/apache/flink/pull/18892#issuecomment-1061576277


   @JingsongLi, could you please help to review this pull request?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18892: [FLINK-25177][table] Support 'DESCRIBE TABLE EXTENDED' with managed table

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18892:
URL: https://github.com/apache/flink/pull/18892#issuecomment-1048407140


   
   ## CI report:
   
   * 82dfead6209ea99d6642abac1534bb2a89dc9591 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32112)
 
   * 9e092ffef298becebae353b0690d9bc5bdb3e516 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19003: [FLINK-26517][runtime] Normalize the decided parallelism to power of 2 when using adaptive batch scheduler

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19003:
URL: https://github.com/apache/flink/pull/19003#issuecomment-1061377385


   
   ## CI report:
   
   * d412278f22b0f79c0af6481398cb697a82da0ad5 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32668)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19007: [FLINK-26495][table-planner] Prohibit hints(dynamic table options) on view

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19007:
URL: https://github.com/apache/flink/pull/19007#issuecomment-1061572668


   
   ## CI report:
   
   * 7e6f04294c00b3ec399aef1b3d8f71dfe150c579 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32676)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-table-store] tsreaper commented on a change in pull request #33: [FLINK-26346] Add statistics collecting to sst files

2022-03-08 Thread GitBox



tsreaper commented on a change in pull request #33:
URL: https://github.com/apache/flink-table-store/pull/33#discussion_r821481242



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/stats/OrcFileStatsExtractor.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.file.stats;
+
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.table.data.DecimalData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.table.types.logical.DecimalType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.flink.table.utils.DateTimeUtils;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.orc.BooleanColumnStatistics;
+import org.apache.orc.ColumnStatistics;
+import org.apache.orc.DateColumnStatistics;
+import org.apache.orc.DecimalColumnStatistics;
+import org.apache.orc.DoubleColumnStatistics;
+import org.apache.orc.IntegerColumnStatistics;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.StringColumnStatistics;
+import org.apache.orc.TimestampColumnStatistics;
+import org.apache.orc.TypeDescription;
+
+import java.io.IOException;
+import java.sql.Date;
+import java.util.List;
+import java.util.stream.IntStream;
+
+/** {@link FileStatsExtractor} for orc files. */
+public class OrcFileStatsExtractor implements FileStatsExtractor {
+
+private final RowType rowType;
+
+public OrcFileStatsExtractor(RowType rowType) {
+this.rowType = rowType;
+}
+
+@Override
+public FieldStats[] extract(Path path) throws IOException {
+org.apache.hadoop.fs.Path hadoopPath = new 
org.apache.hadoop.fs.Path(path.toUri());
+Reader reader =
+OrcFile.createReader(hadoopPath, OrcFile.readerOptions(new 
Configuration()));
+
+long rowCount = reader.getNumberOfRows();
+ColumnStatistics[] columnStatistics = reader.getStatistics();
+TypeDescription schema = reader.getSchema();
+List columnNames = schema.getFieldNames();
+List columnTypes = schema.getChildren();
+
+return IntStream.range(0, rowType.getFieldCount())
+.mapToObj(
+i -> {
+RowType.RowField field = 
rowType.getFields().get(i);
+int fieldIdx = 
columnNames.indexOf(field.getName());
+int colId = columnTypes.get(fieldIdx).getId();
+return toFieldStats(field, 
columnStatistics[colId], rowCount);
+})
+.toArray(FieldStats[]::new);
+}
+
+private FieldStats toFieldStats(RowType.RowField field, ColumnStatistics 
stats, long rowCount) {
+long nullCount = rowCount - stats.getNumberOfValues();

Review comment:
   I don't know why java docs of `ColumnStatistics#getNumberOfValues` state 
like this. But according to the implementation in `TreeWriterBase#writeBatch` 
and according to the unit tests current implementation is alright.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19008: [FLINK-26536] [python] Fix PyFlink RemoteKeyedStateBackend#merge_namespaces bugs

2022-03-08 Thread GitBox



flinkbot commented on pull request #19008:
URL: https://github.com/apache/flink/pull/19008#issuecomment-1061583264


   
   ## CI report:
   
   * 290c71daa6e7502d0715349a784d43ee27c1c2dd UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] gaoyunhaii commented on pull request #18805: [FLINK-26516][streaming] Recover GlobalCommittables with Sink V1 GlobalCommittable serializer

2022-03-08 Thread GitBox



gaoyunhaii commented on pull request #18805:
URL: https://github.com/apache/flink/pull/18805#issuecomment-1061584175


   I added a fix for the failed azure test: previously if there are no 
committables the combine and commit would be skipped, otherwise there might 
create an empty global committable. The test failed due to this reason. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Commented] (FLINK-26458) Rename Accumulator to MergeFunction

2022-03-08 Thread Jane Chan (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502822#comment-17502822
 ] 

Jane Chan commented on FLINK-26458:
---

I can take this ticket.

> Rename Accumulator to MergeFunction
> ---
>
> Key: FLINK-26458
> URL: https://issues.apache.org/jira/browse/FLINK-26458
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Priority: Major
> Fix For: table-store-0.1.0
>
>
> See org.apache.flink.table.store.file.mergetree.compact.Accumulator.
> Actually, it is not an accumulator, but a merger. The naming of the 
> accumulator is misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] honnix opened a new pull request #19009: [hotfix] [docs] Refer to Running in an IDE section early on

2022-03-08 Thread GitBox



honnix opened a new pull request #19009:
URL: https://github.com/apache/flink/pull/19009


   ## What is the purpose of the change
   
   When following this doc, I found myself searching on Google instead of 
reading the final section that is pretty further away. It might be useful to 
have an early reference.
   
   ## Brief change log
   
   Refer to "Running in an IDE" section when we first mentioning running 
directly inside IDE.
   
   ## Verifying this change
   
   This change is a trivial doc change without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
   NA.
   
   ## Documentation
   
   NA.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] zhuzhurk commented on a change in pull request #19003: [FLINK-26517][runtime] Normalize the decided parallelism to power of 2 when using adaptive batch scheduler

2022-03-08 Thread GitBox



zhuzhurk commented on a change in pull request #19003:
URL: https://github.com/apache/flink/pull/19003#discussion_r821468782



##
File path: 
flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
##
@@ -531,7 +531,9 @@
 .withDescription(
 Description.builder()
 .text(
-"The lower bound of allowed 
parallelism to set adaptively if %s has been set to %s",
+"The lower bound of allowed 
parallelism to set adaptively if %s has been set to %s. "
++ "Currently, this option 
should be configured as a power of 2, "
++ "otherwise it will also 
be rounded up to a power of 2 by framework.",

Review comment:
   maybe: by framework -> automatically

##
File path: 
flink-core/src/main/java/org/apache/flink/configuration/JobManagerOptions.java
##
@@ -556,14 +560,16 @@
 Documentation.Sections.EXPERT_SCHEDULING,
 Documentation.Sections.ALL_JOB_MANAGER
 })
-public static final ConfigOption 
ADAPTIVE_BATCH_SCHEDULER_DATA_VOLUME_PER_TASK =
-key("jobmanager.adaptive-batch-scheduler.data-volume-per-task")
+public static final ConfigOption 
ADAPTIVE_BATCH_SCHEDULER_AVG_DATA_VOLUME_PER_TASK =
+key("jobmanager.adaptive-batch-scheduler.avg-data-volume-per-task")
 .memoryType()
 .defaultValue(MemorySize.ofMebiBytes(1024))
 .withDescription(
 Description.builder()
 .text(
-"The size of data volume to expect 
each task instance to process if %s has been set to %s",
+"The average size of data volume 
to expect each task instance to process if %s has been set to %s. "

Review comment:
   maybe also note that the actually processed data of some tasks may be 
much larger if there is data skew, or if the data is too large while the 
parallelism has reached the upper bound?

##
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/adaptivebatch/DefaultVertexParallelismDecider.java
##
@@ -109,54 +112,85 @@ private int calculateParallelism(List 
consumedResults) {
 + " Use {} as the size of broadcast data to decide 
the parallelism.",
 new MemorySize(broadcastBytes),
 new MemorySize(expectedMaxBroadcastBytes),
-
JobManagerOptions.ADAPTIVE_BATCH_SCHEDULER_DATA_VOLUME_PER_TASK.key(),
+
JobManagerOptions.ADAPTIVE_BATCH_SCHEDULER_AVG_DATA_VOLUME_PER_TASK.key(),
 CAP_RATIO_OF_BROADCAST,
 new MemorySize(expectedMaxBroadcastBytes));
 
 broadcastBytes = expectedMaxBroadcastBytes;
 }
 
-int parallelism =
+int initialParallelism =
 (int) Math.ceil((double) nonBroadcastBytes / 
(dataVolumePerTask - broadcastBytes));
+int parallelism = normalizeParallelism(initialParallelism);
 
 LOG.debug(
 "The size of broadcast data is {}, the size of non-broadcast 
data is {}, "
-+ "the initially decided parallelism is {}.",
++ "the initially decided parallelism is {}, after 
normalize is {}",
 new MemorySize(broadcastBytes),
 new MemorySize(nonBroadcastBytes),
+initialParallelism,
 parallelism);
 
 if (parallelism < minParallelism) {
 LOG.info(
-"The initially decided parallelism {} is smaller than the 
minimum parallelism {} "
-+ "(which is configured by '{}'). Use {} as the 
finally decided parallelism.",
+"The initially normalized parallelism {} is smaller than 
the normalized minimum parallelism {}. "
++ "Use {} as the finally decided parallelism.",
 parallelism,
 minParallelism,
-
JobManagerOptions.ADAPTIVE_BATCH_SCHEDULER_MIN_PARALLELISM.key(),
 minParallelism);
 parallelism = minParallelism;
 } else if (parallelism > maxParallelism) {
 LOG.info(
-"The initially decided parallelism {} is larger than the 
maximum parallelism {} "
-+ "(which is configured by '{}'). Use {} as the 
finally decided parallelism.",
+"The initially normalized parallelism {} is larger than 
the normalized maximum parallelism {}. "
++ "Use {} as the finally decided parallelism.",

[GitHub] [flink] honnix commented on a change in pull request #19009: [hotfix] [docs] Refer to Running in an IDE section early on

2022-03-08 Thread GitBox



honnix commented on a change in pull request #19009:
URL: https://github.com/apache/flink/pull/19009#discussion_r821491306



##
File path: docs/content/docs/try-flink/datastream.md
##
@@ -902,4 +903,4 @@ You should see the following output in your task manager 
logs:
 
 Running the project in an IDE may result in a `java.lang.NoClassDefFoundError` 
exception. This is probably because you do not have all required Flink 
dependencies implicitly loaded into the classpath.
 
-* IntelliJ IDEA: Go to Run > Edit Configurations > Modify options > Select 
`include dependencies with "Provided" scope`. This run configuration will now 
include all required classes to run the application from within the IDE.
\ No newline at end of file
+* IntelliJ IDEA: Go to Run > Edit Configurations > Modify options > Select 
`include dependencies with "Provided" scope`. This run configuration will now 
include all required classes to run the application from within the IDE.

Review comment:
   I guess github gets confused. There is no change.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18863: [FLINK-26033][flink-connector-kafka]Fix the problem that robin does not take effect due to upgrading kafka client to 2.4.1 since Flin

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18863:
URL: https://github.com/apache/flink/pull/18863#issuecomment-1046819200


   
   ## CI report:
   
   * 8cda572f57f1ee4a969e84d51b05d1e1ba74887f Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32670)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18805: [FLINK-26516][streaming] Recover GlobalCommittables with Sink V1 GlobalCommittable serializer

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18805:
URL: https://github.com/apache/flink/pull/18805#issuecomment-1041554745


   
   ## CI report:
   
   * e0cc9d1d2036b964077a274f7425cab180404583 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32636)
 
   * 617d6deb36c374af35b6db6ef36b5d4f024a0ca0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18892: [FLINK-25177][table] Support 'DESCRIBE TABLE EXTENDED' with managed table

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18892:
URL: https://github.com/apache/flink/pull/18892#issuecomment-1048407140


   
   ## CI report:
   
   * 82dfead6209ea99d6642abac1534bb2a89dc9591 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32112)
 
   * 9e092ffef298becebae353b0690d9bc5bdb3e516 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32677)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 610f23f625a4559bef6f000916a2de37a5cb3b38 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32666)
 
   * ca91f03a2674f7041d3a66c618f833809c6e98aa Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32675)
 
   * b6fd8d95b532e070c17d4cdbfc7ad85125d9df38 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19008: [FLINK-26536] [python] Fix PyFlink RemoteKeyedStateBackend#merge_namespaces bugs

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19008:
URL: https://github.com/apache/flink/pull/19008#issuecomment-1061583264


   
   ## CI report:
   
   * 290c71daa6e7502d0715349a784d43ee27c1c2dd Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32678)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] paul8263 opened a new pull request #19010: [FLINK-26381][docs] Wrong document order of Chinese version

2022-03-08 Thread GitBox



paul8263 opened a new pull request #19010:
URL: https://github.com/apache/flink/pull/19010


   ## What is the purpose of the change
   
   Compared English with Chinese version, the orders of articles locates in TOC 
for docs learn Flink are different.
   
   ## Brief change log
   
   - docs/content.zh/docs/learn-flink/datastream_api.md
   - docs/content.zh/docs/learn-flink/event_driven.md
   - docs/content.zh/docs/learn-flink/fault_tolerance.md
   - docs/content.zh/docs/learn-flink/streaming_analytics.md
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   ## Does this pull request potentially affect one of the following parts:
   
 - Dependencies (does it add or upgrade a dependency): no
 - The public API, i.e., is any changed class annotated with 
`@Public(Evolving)`: no
 - The serializers: no
 - The runtime per-record code paths (performance sensitive): no
 - Anything that affects deployment or recovery: JobManager (and its 
components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
 - The S3 file system connector: no
   
   ## Documentation
   
 - Does this pull request introduce a new feature? no
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Assigned] (FLINK-26458) Rename Accumulator to MergeFunction

2022-03-08 Thread Jingsong Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jingsong Lee reassigned FLINK-26458:


Assignee: Jane Chan

> Rename Accumulator to MergeFunction
> ---
>
> Key: FLINK-26458
> URL: https://issues.apache.org/jira/browse/FLINK-26458
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jane Chan
>Priority: Major
> Fix For: table-store-0.1.0
>
>
> See org.apache.flink.table.store.file.mergetree.compact.Accumulator.
> Actually, it is not an accumulator, but a merger. The naming of the 
> accumulator is misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Commented] (FLINK-26458) Rename Accumulator to MergeFunction

2022-03-08 Thread Jingsong Lee (Jira)



[ 
https://issues.apache.org/jira/browse/FLINK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17502830#comment-17502830
 ] 

Jingsong Lee commented on FLINK-26458:
--

[~qingyue] Thanks

> Rename Accumulator to MergeFunction
> ---
>
> Key: FLINK-26458
> URL: https://issues.apache.org/jira/browse/FLINK-26458
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table Store
>Reporter: Jingsong Lee
>Assignee: Jane Chan
>Priority: Major
> Fix For: table-store-0.1.0
>
>
> See org.apache.flink.table.store.file.mergetree.compact.Accumulator.
> Actually, it is not an accumulator, but a merger. The naming of the 
> accumulator is misleading.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26381) Wrong document order of Chinese version

2022-03-08 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-26381:
---
Labels: pull-request-available  (was: )

> Wrong document order of Chinese version
> ---
>
> Key: FLINK-26381
> URL: https://issues.apache.org/jira/browse/FLINK-26381
> Project: Flink
>  Issue Type: Bug
>  Components: chinese-translation
>Affects Versions: 1.14.3
>Reporter: tonny
>Priority: Major
>  Labels: pull-request-available
>
> The chapter named "流式分析"(streaming analytics) and "数据管道 & ETL"(Data Pipelines 
> & ETL) under the "实践练习"(Learn Flink) are transposed compared to the English 
> version. 
> It causes important concepts such as "keyed state", "map" missed to solve the 
> exercise in the chapter of streaming analytics.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] echauchot commented on pull request #18973: [FLINK-25771][connectors][Cassandra][test] Raise the Cassandra driver timeouts at the Cluster level

2022-03-08 Thread GitBox



echauchot commented on pull request #18973:
URL: https://github.com/apache/flink/pull/18973#issuecomment-1061605587


   Hi @zentol, WDYT ?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18805: [FLINK-26516][streaming] Recover GlobalCommittables with Sink V1 GlobalCommittable serializer

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18805:
URL: https://github.com/apache/flink/pull/18805#issuecomment-1041554745


   
   ## CI report:
   
   * e0cc9d1d2036b964077a274f7425cab180404583 Azure: 
[FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32636)
 
   * 617d6deb36c374af35b6db6ef36b5d4f024a0ca0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32679)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #33: [FLINK-26346] Add statistics collecting to sst files

2022-03-08 Thread GitBox



JingsongLi commented on a change in pull request #33:
URL: https://github.com/apache/flink-table-store/pull/33#discussion_r821524301



##
File path: 
flink-table-store-core/src/main/java/org/apache/flink/table/store/file/stats/FieldStatsCollector.java
##
@@ -35,7 +35,7 @@ public FieldStatsCollector(RowType rowType) {
 this.minValues = new Object[numFields];
 this.maxValues = new Object[numFields];
 this.nullCounts = new long[numFields];
-this.converter = new RowDataToObjectArrayConverter(rowType);
+this.converter = new RowDataToObjectArrayConverter(rowType, true);

Review comment:
   I think we can just copy the field in `minValues[i] = c;` and 
`maxValues[i] = c;`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * ca91f03a2674f7041d3a66c618f833809c6e98aa Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32675)
 
   * 47cdf91d5086b292f20d40700c7bd9052e32a6e0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19005: [FLINK-26531][kafka] KafkaWriterITCase.testMetadataPublisher failed on azure

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19005:
URL: https://github.com/apache/flink/pull/19005#issuecomment-1061480820


   
   ## CI report:
   
   * a259c21a8693b3bc8b6732f3f5a9372dc846a658 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32671)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink-table-store] JingsongLi commented on a change in pull request #33: [FLINK-26346] Add statistics collecting to sst files

2022-03-08 Thread GitBox



JingsongLi commented on a change in pull request #33:
URL: https://github.com/apache/flink-table-store/pull/33#discussion_r821528073



##
File path: 
flink-table-store-orc/src/main/java/org/apache/flink/table/store/orc/OrcFileStatsExtractor.java
##
@@ -0,0 +1,183 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.flink.table.store.orc;
+
+import org.apache.flink.core.fs.Path;
+import org.apache.flink.table.data.DecimalData;
+import org.apache.flink.table.data.StringData;
+import org.apache.flink.table.data.TimestampData;
+import org.apache.flink.table.store.file.stats.FieldStats;
+import org.apache.flink.table.store.file.stats.FileStatsExtractor;
+import org.apache.flink.table.types.logical.DecimalType;
+import org.apache.flink.table.types.logical.RowType;
+import org.apache.flink.table.utils.DateTimeUtils;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.orc.BooleanColumnStatistics;
+import org.apache.orc.ColumnStatistics;
+import org.apache.orc.DateColumnStatistics;
+import org.apache.orc.DecimalColumnStatistics;
+import org.apache.orc.DoubleColumnStatistics;
+import org.apache.orc.IntegerColumnStatistics;
+import org.apache.orc.OrcFile;
+import org.apache.orc.Reader;
+import org.apache.orc.StringColumnStatistics;
+import org.apache.orc.TimestampColumnStatistics;
+import org.apache.orc.TypeDescription;
+
+import java.io.IOException;
+import java.sql.Date;
+import java.util.List;
+import java.util.stream.IntStream;
+
+/** {@link FileStatsExtractor} for orc files. */
+public class OrcFileStatsExtractor implements FileStatsExtractor {
+
+private final RowType rowType;
+
+public OrcFileStatsExtractor(RowType rowType) {
+this.rowType = rowType;
+}
+
+@Override
+public FieldStats[] extract(Path path) throws IOException {
+org.apache.hadoop.fs.Path hadoopPath = new 
org.apache.hadoop.fs.Path(path.toUri());
+Reader reader =
+OrcFile.createReader(hadoopPath, OrcFile.readerOptions(new 
Configuration()));
+
+long rowCount = reader.getNumberOfRows();
+ColumnStatistics[] columnStatistics = reader.getStatistics();
+TypeDescription schema = reader.getSchema();
+List columnNames = schema.getFieldNames();
+List columnTypes = schema.getChildren();
+
+return IntStream.range(0, rowType.getFieldCount())
+.mapToObj(
+i -> {
+RowType.RowField field = 
rowType.getFields().get(i);
+int fieldIdx = 
columnNames.indexOf(field.getName());
+int colId = columnTypes.get(fieldIdx).getId();
+return toFieldStats(field, 
columnStatistics[colId], rowCount);
+})
+.toArray(FieldStats[]::new);
+}
+
+private FieldStats toFieldStats(RowType.RowField field, ColumnStatistics 
stats, long rowCount) {
+long nullCount = rowCount - stats.getNumberOfValues();

Review comment:
   Add a check here, if !stats.hasNull() nullCount should be zero.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19009: [hotfix] [docs] Refer to Running in an IDE section early on

2022-03-08 Thread GitBox



flinkbot commented on pull request #19009:
URL: https://github.com/apache/flink/pull/19009#issuecomment-1061628385


   
   ## CI report:
   
   * 282572e53fc0db37c831a9486742e1b5a51663e0 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot commented on pull request #19010: [FLINK-26381][docs] Wrong document order of Chinese version

2022-03-08 Thread GitBox



flinkbot commented on pull request #19010:
URL: https://github.com/apache/flink/pull/19010#issuecomment-1061632219


   
   ## CI report:
   
   * 68d688153e4d21853ce41a4894fd6d66e1347fab UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * ca91f03a2674f7041d3a66c618f833809c6e98aa Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32675)
 
   * 47cdf91d5086b292f20d40700c7bd9052e32a6e0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32680)
 
   * 1240d4313817b4ae0bfca21090eb6f1cbd40f78e UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[jira] [Updated] (FLINK-18779) Support the SupportsFilterPushDown interface for LookupTableSource

2022-03-08 Thread Flink Jira Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-18779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-18779:
---
Labels: auto-unassigned pull-request-available stale-assigned  (was: 
auto-unassigned pull-request-available)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issue is assigned but has not 
received an update in 30 days, so it has been labeled "stale-assigned".
If you are still working on the issue, please remove the label and add a 
comment updating the community on your progress.  If this issue is waiting on 
feedback, please consider this a reminder to the committer/reviewer. Flink is a 
very active project, and so we appreciate your patience.
If you are no longer working on the issue, please unassign yourself so someone 
else may work on it.


> Support the SupportsFilterPushDown interface for LookupTableSource
> --
>
> Key: FLINK-18779
> URL: https://issues.apache.org/jira/browse/FLINK-18779
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Reporter: Jark Wu
>Assignee: Yuan Zhu
>Priority: Major
>  Labels: auto-unassigned, pull-request-available, stale-assigned
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-12639) FLIP-42: Rework Documentation

2022-03-08 Thread Flink Jira Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-12639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-12639:
---
  Labels: auto-deprioritized-major auto-deprioritized-minor 
pull-request-available usability  (was: auto-deprioritized-major 
pull-request-available stale-minor usability)
Priority: Not a Priority  (was: Minor)

This issue was labeled "stale-minor" 7 days ago and has not received any 
updates so it is being deprioritized. If this ticket is actually Minor, please 
raise the priority and ask a committer to assign you the issue or revive the 
public discussion.


> FLIP-42: Rework Documentation
> -
>
> Key: FLINK-12639
> URL: https://issues.apache.org/jira/browse/FLINK-12639
> Project: Flink
>  Issue Type: Improvement
>  Components: Documentation
>Reporter: Konstantin Knauf
>Priority: Not a Priority
>  Labels: auto-deprioritized-major, auto-deprioritized-minor, 
> pull-request-available, usability
>  Time Spent: 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-26272) Elasticsearch7SinkITCase.testWriteJsonToElasticsearch fails with socket timeout

2022-03-08 Thread Flink Jira Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-26272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-26272:
---
Labels: stale-critical test-stability  (was: test-stability)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issues has been marked as 
Critical but is unassigned and neither itself nor its Sub-Tasks have been 
updated for 14 days. I have gone ahead and marked it "stale-critical". If this 
ticket is critical, please either assign yourself or give an update. 
Afterwards, please remove the label or in 7 days the issue will be 
deprioritized.


> Elasticsearch7SinkITCase.testWriteJsonToElasticsearch fails with socket 
> timeout
> ---
>
> Key: FLINK-26272
> URL: https://issues.apache.org/jira/browse/FLINK-26272
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / ElasticSearch
>Affects Versions: 1.15.0
>Reporter: Matthias Pohl
>Priority: Critical
>  Labels: stale-critical, test-stability
>
> We observed a test failure in [this 
> build|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=31883&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d&l=12917]
>  with {{Elasticsearch7SinkITCase.testWriteJsonToElasticsearch}} failing due 
> to a {{SocketTimeoutException}}:
> {code}
> Feb 18 18:04:20 [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, 
> Time elapsed: 80.248 s <<< FAILURE! - in 
> org.apache.flink.connector.elasticsearch.sink.Elasticsearch7SinkITCase
> Feb 18 18:04:20 [ERROR] 
> org.apache.flink.connector.elasticsearch.sink.Elasticsearch7SinkITCase.testWriteJsonToElasticsearch(BiFunction)[1]
>   Time elapsed: 31.525 s  <<< ERROR!
> Feb 18 18:04:20 org.apache.flink.runtime.client.JobExecutionException: Job 
> execution failed.
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:141)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$1(AkkaInvocationHandler.java:259)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> Feb 18 18:04:20   at 
> org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1389)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
> Feb 18 18:04:20   at 
> java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47)
> Feb 18 18:04:20   at akka.dispatch.OnComplete.internal(Future.scala:300)
> Feb 18 18:04:20   at akka.dispatch.OnComplete.internal(Future.scala:297)
> Feb 18 18:04:20   at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:224)
> Feb 18 18:04:20   at 
> akka.dispatch.japi$CallbackBridge.apply(Future.scala:221)
> Feb 18 18:04:20   at 
> scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60)
> Feb 18 18:04:20   at 
> org.apache.flink.runtime.concurrent.akka.Ak

[jira] [Updated] (FLINK-25456) FlinkKafkaProducerITCase.testScaleDownBeforeFirstCheckpoint

2022-03-08 Thread Flink Jira Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-25456:
---
Labels: stale-critical test-stability  (was: test-stability)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issues has been marked as 
Critical but is unassigned and neither itself nor its Sub-Tasks have been 
updated for 14 days. I have gone ahead and marked it "stale-critical". If this 
ticket is critical, please either assign yourself or give an update. 
Afterwards, please remove the label or in 7 days the issue will be 
deprioritized.


> FlinkKafkaProducerITCase.testScaleDownBeforeFirstCheckpoint
> ---
>
> Key: FLINK-25456
> URL: https://issues.apache.org/jira/browse/FLINK-25456
> Project: Flink
>  Issue Type: Bug
>  Components: Connectors / Kafka
>Affects Versions: 1.15.0, 1.14.2
>Reporter: Till Rohrmann
>Priority: Critical
>  Labels: stale-critical, test-stability
>
> The test {{FlinkKafkaProducerITCase.testScaleDownBeforeFirstCheckpoint}} 
> fails with
> {code}
> 2021-12-27T02:54:54.8464375Z Dec 27 02:54:54 [ERROR] Tests run: 15, Failures: 
> 1, Errors: 0, Skipped: 0, Time elapsed: 285.279 s <<< FAILURE! - in 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase
> 2021-12-27T02:54:54.8465354Z Dec 27 02:54:54 [ERROR] 
> testScaleDownBeforeFirstCheckpoint  Time elapsed: 85.514 s  <<< FAILURE!
> 2021-12-27T02:54:54.8468827Z Dec 27 02:54:54 java.lang.AssertionError: 
> Detected producer leak. Thread name: kafka-producer-network-thread | 
> producer-MockTask-002a002c-18
> 2021-12-27T02:54:54.8469779Z Dec 27 02:54:54  at 
> org.junit.Assert.fail(Assert.java:89)
> 2021-12-27T02:54:54.8470485Z Dec 27 02:54:54  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.checkProducerLeak(FlinkKafkaProducerITCase.java:847)
> 2021-12-27T02:54:54.8471842Z Dec 27 02:54:54  at 
> org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducerITCase.testScaleDownBeforeFirstCheckpoint(FlinkKafkaProducerITCase.java:381)
> 2021-12-27T02:54:54.8472724Z Dec 27 02:54:54  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2021-12-27T02:54:54.8473509Z Dec 27 02:54:54  at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2021-12-27T02:54:54.8474704Z Dec 27 02:54:54  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2021-12-27T02:54:54.8475523Z Dec 27 02:54:54  at 
> java.base/java.lang.reflect.Method.invoke(Method.java:566)
> 2021-12-27T02:54:54.8476258Z Dec 27 02:54:54  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> 2021-12-27T02:54:54.8476949Z Dec 27 02:54:54  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2021-12-27T02:54:54.8477632Z Dec 27 02:54:54  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> 2021-12-27T02:54:54.8478451Z Dec 27 02:54:54  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2021-12-27T02:54:54.8479282Z Dec 27 02:54:54  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> 2021-12-27T02:54:54.8479976Z Dec 27 02:54:54  at 
> org.apache.flink.testutils.junit.RetryRule$RetryOnFailureStatement.evaluate(RetryRule.java:135)
> 2021-12-27T02:54:54.8480696Z Dec 27 02:54:54  at 
> org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> 2021-12-27T02:54:54.8481410Z Dec 27 02:54:54  at 
> org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> 2021-12-27T02:54:54.8482009Z Dec 27 02:54:54  at 
> org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> 2021-12-27T02:54:54.8482636Z Dec 27 02:54:54  at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> 2021-12-27T02:54:54.8483267Z Dec 27 02:54:54  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> 2021-12-27T02:54:54.8483900Z Dec 27 02:54:54  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> 2021-12-27T02:54:54.8484574Z Dec 27 02:54:54  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> 2021-12-27T02:54:54.8485214Z Dec 27 02:54:54  at 
> org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> 2021-12-27T02:54:54.8485838Z Dec 27 02:54:54  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> 2021-12-27T02:54:54.8486441Z Dec 27 02:54:54  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> 2021-12-27T02:54:54.84

[jira] [Updated] (FLINK-25306) Flink CLI end-to-end test timeout on azure

2022-03-08 Thread Flink Jira Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-25306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flink Jira Bot updated FLINK-25306:
---
Labels: auto-deprioritized-critical stale-major test-stability  (was: 
auto-deprioritized-critical test-stability)

I am the [Flink Jira Bot|https://github.com/apache/flink-jira-bot/] and I help 
the community manage its development. I see this issues has been marked as 
Major but is unassigned and neither itself nor its Sub-Tasks have been updated 
for 60 days. I have gone ahead and added a "stale-major" to the issue". If this 
ticket is a Major, please either assign yourself or give an update. Afterwards, 
please remove the label or in 7 days the issue will be deprioritized.


> Flink CLI end-to-end test timeout on azure
> --
>
> Key: FLINK-25306
> URL: https://issues.apache.org/jira/browse/FLINK-25306
> Project: Flink
>  Issue Type: Bug
>  Components: Build System / Azure Pipelines, Client / Job Submission
>Affects Versions: 1.15.0
>Reporter: Yun Gao
>Priority: Major
>  Labels: auto-deprioritized-critical, stale-major, test-stability
>
> {code:java}
> Dec 14 02:14:48 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:16:59 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:19:10 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:21:21 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:23:32 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:25:43 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:27:54 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:30:05 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:32:16 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:34:27 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:36:38 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:38:49 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:41:01 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:43:12 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:45:23 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:47:34 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:49:45 Waiting for Dispatcher REST endpoint to come up...
> Dec 14 02:49:46 Dispatcher REST endpoint has not started within a timeout of 
> 30 sec
> Dec 14 02:49:46 [FAIL] Test script contains errors.
> Dec 14 02:49:46 Checking for errors...
> Dec 14 02:49:46 No errors in log files.
> Dec 14 02:49:46 Checking for exceptions...
> Dec 14 02:49:46 No exceptions in log files.
> Dec 14 02:49:46 Checking for non-empty .out files...
> Dec 14 02:49:46 No non-empty .out files.
> Dec 14 02:49:46 
> Dec 14 02:49:46 [FAIL] 'Flink CLI end-to-end test' failed after 65 minutes 
> and 35 seconds! Test exited with exit code 1
> Dec 14 02:49:46 
> 02:49:46 ##[group]Environment Information
> Dec 14 02:49:46 Searching for .dump, .dumpstream and related files in 
> '/home/vsts/work/1/s'
> dmesg: read kernel buffer failed: Operation not permitted
> Dec 14 02:49:48 Stopping taskexecutor daemon (pid: 93858) on host 
> fv-az231-497.
> Dec 14 02:49:49 Stopping standalonesession daemon (pid: 93605) on host 
> fv-az231-497.
> The STDIO streams did not close within 10 seconds of the exit event from 
> process '/usr/bin/bash'. This may indicate a child process inherited the 
> STDIO streams and has not yet exited.
> ##[error]Bash exited with code '1'.
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=28066&view=logs&j=2ffee335-fb12-54a6-1ba9-9610c8a56b81&t=ad628523-4b0b-5f7d-41f5-e8e2e6921343&l=108



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[jira] [Updated] (FLINK-23190) Make task-slot allocation much more evenly

2022-03-08 Thread loyi (Jira)



 [ 
https://issues.apache.org/jira/browse/FLINK-23190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

loyi updated FLINK-23190:
-
Flags:   (was: Patch)

> Make task-slot allocation much more evenly
> --
>
> Key: FLINK-23190
> URL: https://issues.apache.org/jira/browse/FLINK-23190
> Project: Flink
>  Issue Type: Improvement
>  Components: Runtime / Coordination
>Affects Versions: 1.14.0, 1.12.3, 1.13.1
>Reporter: loyi
>Assignee: loyi
>Priority: Major
>  Labels: pull-request-available, stale-assigned
> Attachments: image-2021-07-16-10-34-30-700.png
>
>
> FLINK-12122 only guarantees spreading out tasks across the set of TMs which 
> are registered at the time of scheduling, but our jobs are all runing on 
> active yarn mode， the job with smaller source parallelism offen cause 
> load-balance issues. 
>  
> For this job:
> {code:java}
> //  -ys 4 means 10 taskmanagers
> env.addSource(...).name("A").setParallelism(10).
>  map(...).name("B").setParallelism(30)
>  .map(...).name("C").setParallelism(40)
>  .addSink(...).name("D").setParallelism(20);
> {code}
>  
>  Flink-1.12.3 task allocation: 
> ||operator||tm1 ||tm2||tm3||tm4||tm5||5m6||tm7||tm8||tm9||tm10||
> |A| 
> 1|{color:#de350b}2{color}|{color:#de350b}2{color}|1|1|{color:#de350b}3{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|{color:#de350b}0{color}|
> |B|3|3|3|3|3|3|3|3|{color:#de350b}2{color}|{color:#de350b}4{color}|
> |C|4|4|4|4|4|4|4|4|4|4|
> |D|2|2|2|2|2|{color:#de350b}1{color}|{color:#de350b}1{color}|2|2|{color:#de350b}4{color}|
>  
> Suggestions:
> When TaskManger start register slots to slotManager , current processing 
> logic will choose  the first pendingSlot which meet its resource 
> requirements.  The "random" strategy usually causes uneven task allocation 
> when source-operator's parallelism is significantly below process-operator's. 
>   A simple feasible idea  is  {color:#de350b}partition{color} the current  
> "{color:#de350b}pendingSlots{color}" by their "JobVertexIds" (such as  let 
> AllocationID bring the detail)  , then allocate the slots proportionally to 
> each JobVertexGroup.
>  
> For above case, the 40 pendingSlots could be divided into 4 groups：
> [ABCD]: 10        // A、B、C、D reprents  {color:#de350b}jobVertexId{color}
> [BCD]: 10
> [CD]: 10
> [D]: 10
>  
> Every taskmanager will provide 4 slots one time, and each group will get 1 
> slot according their proportion (1/4), the final allocation result is below:
> [ABCD] : deploye on 10 different taskmangers
> [BCD]: deploye on 10 different taskmangers
> [CD]: deploye on 10  different taskmangers
> [D]: deploye on 10 different taskmangers
>  
> I have implement a [concept 
> code|https://github.com/saddays/flink/commit/dc82e60a7c7599fbcb58c14f8e3445bc8d07ace1]
>   based on Flink-1.12.3 ,  the patch version has {color:#de350b}fully 
> evenly{color} task allocation , and works well on my workload .  Are there 
> other point that have not been considered or  does it conflict with future 
> plans?      Sorry for my poor english.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

[GitHub] [flink] flinkbot edited a comment on pull request #18978: [FLINK-25549][flink-dstl] [JUnit5 Migration] Module: flink-dstl

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18978:
URL: https://github.com/apache/flink/pull/18978#issuecomment-1059111709


   
   ## CI report:
   
   * d55c277b2822bf631c2cb22483c646ff2104d322 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32562)
 
   * 8659b3083d14aab62c04ac96611897fe596a5026 UNKNOWN
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19009: [hotfix] [docs] Refer to Running in an IDE section early on

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19009:
URL: https://github.com/apache/flink/pull/19009#issuecomment-1061628385


   
   ## CI report:
   
   * 282572e53fc0db37c831a9486742e1b5a51663e0 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32681)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #19010: [FLINK-26381][docs] Wrong document order of Chinese version

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #19010:
URL: https://github.com/apache/flink/pull/19010#issuecomment-1061632219


   
   ## CI report:
   
   * 68d688153e4d21853ce41a4894fd6d66e1347fab Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32682)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] zoucao commented on a change in pull request #18966: [FLINK-26422][docs-zh][table]update chinese doc with the new TablePipeline docs

2022-03-08 Thread GitBox



zoucao commented on a change in pull request #18966:
URL: https://github.com/apache/flink/pull/18966#discussion_r821541527



##
File path: docs/content.zh/docs/dev/table/data_stream_api.md
##
@@ -596,13 +596,13 @@ pipeline or a statement set:
 
 ```java
 // execute with explicit sink
-tableEnv.from("InputTable").executeInsert("OutputTable")
+tableEnv.from("InputTable").insertInto("OutputTable").execute()

Review comment:
   Hi, @RocMarshal , could you help me confirm whether the ';' should be 
added for one line java code,
   for example :
   not added originally：
   
   > - Start from the earliest available message in the topic.
   >   ```java
   >   StartCursor.earliest()
   >   ```
   > - Start from the latest available message in the topic.
   >   ```java
   >   StartCursor.latest()
   >   ```
   > - Start from a specified message between the earliest and the latest.
   >   Pulsar connector would consume from the latest available message if the 
message id doesn't exist.
   > 
   >   The start message is included in consuming result.
   >   ```java
   >   StartCursor.fromMessageId(MessageId)
   >   ```
   added originally：
   
   > 
   > ```java
   > // discover new partitions per 10 seconds
   > PulsarSource.builder()
   > 
.setConfig(PulsarSourceOptions.PULSAR_PARTITION_DISCOVERY_INTERVAL_MS, 1);
   > ```
   All were chosen from pulsar.md.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18978: [FLINK-25549][flink-dstl] [JUnit5 Migration] Module: flink-dstl

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18978:
URL: https://github.com/apache/flink/pull/18978#issuecomment-1059111709


   
   ## CI report:
   
   * d55c277b2822bf631c2cb22483c646ff2104d322 Azure: 
[SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32562)
 
   * 8659b3083d14aab62c04ac96611897fe596a5026 Azure: 
[PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32683)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [flink] flinkbot edited a comment on pull request #18957: [FLINK-26444][python]Window allocator supporting pyflink datastream API

2022-03-08 Thread GitBox



flinkbot edited a comment on pull request #18957:
URL: https://github.com/apache/flink/pull/18957#issuecomment-1056202149


   
   ## CI report:
   
   * ec1e0a435186082e5ac1481bc093f9bdd9d94d70 UNKNOWN
   * 47cdf91d5086b292f20d40700c7bd9052e32a6e0 Azure: 
[CANCELED](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=32680)
 
   
   
   Bot commands
 The @flinkbot bot supports the following commands:
   
- `@flinkbot run azure` re-run the last Azure build
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

1 2 3 4 >

1 - 100 of 370 matches

Mail list logo