[jira] [Updated] (FLINK-23978) FieldAccessor has direct scala dependency
[ https://issues.apache.org/jira/browse/FLINK-23978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-23978: - Parent: FLINK-23986 Issue Type: Sub-task (was: Technical Debt) > FieldAccessor has direct scala dependency > - > > Key: FLINK-23978 > URL: https://issues.apache.org/jira/browse/FLINK-23978 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Fix For: 1.15.0 > > > The FieldAccessor class in flink-streaming-java has a hard dependency on > Scala. It would be ideal if we could restrict this dependency to > flink-streaming-scala. > We could move the SimpleProductFieldAccessor & RecursiveProductFieldAccessor > to flink-streaming-scala, and load them in the FieldAccessorFactory via > reflection. > This is one of a few steps that would allow the Java DataStream API to be > used without Scala being on the classpath. -- This message was sent by Atlassian Jira (v8.3.4#803005)
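For illustration only, a minimal sketch of the reflection-based loading this ticket proposes. The fully-qualified class name, the constructor shape, and the helper class below are assumptions made up for the example, not Flink's actual implementation:

{code:java}
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.streaming.util.typeutils.FieldAccessor;

/**
 * Hypothetical sketch: FieldAccessorFactory looks up the Scala product accessor
 * reflectively, so flink-streaming-java keeps no compile-time Scala dependency.
 */
final class ReflectiveAccessorLoader {

    // Assumed fully-qualified name after the move to flink-streaming-scala.
    private static final String SIMPLE_PRODUCT_ACCESSOR =
            "org.apache.flink.streaming.util.typeutils.SimpleProductFieldAccessor";

    static FieldAccessor<?, ?> forProductField(TypeInformation<?> typeInfo, int pos) {
        try {
            Class<?> clazz = Class.forName(SIMPLE_PRODUCT_ACCESSOR);
            // Assumed constructor shape (field position + type info).
            return (FieldAccessor<?, ?>)
                    clazz.getConstructor(int.class, TypeInformation.class)
                            .newInstance(pos, typeInfo);
        } catch (ClassNotFoundException e) {
            // flink-streaming-scala is not on the classpath.
            throw new IllegalStateException(
                    "Accessing Scala product fields requires flink-streaming-scala", e);
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException("Could not create product field accessor", e);
        }
    }
}
{code}

With this shape, Java-only programs never touch the Scala classes, and the reflective lookup cost is only paid when a Scala product type is actually used.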
[jira] [Updated] (FLINK-23968) Remove unused code in TestBaseUtils
[ https://issues.apache.org/jira/browse/FLINK-23968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-23968: - Parent: FLINK-23986 Issue Type: Sub-task (was: Technical Debt) > Remove unused code in TestBaseUtils > --- > > Key: FLINK-23968 > URL: https://issues.apache.org/jira/browse/FLINK-23968 > Project: Flink > Issue Type: Sub-task > Components: Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The TestBaseUtils class in flink-test-utils contains unused code, which also causes > some Scala references. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23979) Exceptions with Kotlin 1.5.0 and higher
[ https://issues.apache.org/jira/browse/FLINK-23979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405007#comment-17405007 ] Hank commented on FLINK-23979: -- The serialization issue was fixed in 1.5.10; I will update this ticket. > Exceptions with Kotlin 1.5.0 and higher > --- > > Key: FLINK-23979 > URL: https://issues.apache.org/jira/browse/FLINK-23979 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.13.2 >Reporter: Hank >Priority: Major > > *Summary* > The keyBy(..) function triggers exceptions when using Kotlin. Different Kotlin > compiler versions give different exceptions. > > *Reproduce* > See below. > > *Using Kotlin 1.5.20* > > When using > {code:java} > .keyBy(...){code} > the following runtime exception occurs: > > > {code:java} > Exception in thread "main" > org.apache.flink.api.common.functions.InvalidTypesException: The types of the > interface org.apache.flink.api.java.functions.KeySelector could not be > inferred. Support for synthetic interfaces, lambdas, and generic or raw types > is limited at this point > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1244) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:1268) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1231) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:789) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:587) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:436) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:429) > at > org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:118) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23){code} > > > *Using Kotlin 1.5.0* > Using > {code:java} > .keyBy(...) > {code} > gives the following runtime exception: > {code:java} > Exception in thread "main" > org.apache.flink.api.common.InvalidProgramException: Object > FraudDetectionKt$$Lambda$138/0x0008001e6440@7d446ed1 is not serializable > at > org.apache.flink.api.java.ClosureCleaner.ensureSerializable(ClosureCleaner.java:180) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1901) > at > org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:189) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23) > {code} > With an older version of Kotlin, e.g. 1.4.32, this exception does not occur > and the program runs fine. > > Some research points to this changelog, which might have something to do with > these exceptions: > [https://kotlinlang.org/docs/whatsnew15.html#lambdas-via-invokedynamic] > > *Reproduce* > Use the code from the tutorial: > [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try-flink/datastream/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
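As background for the exceptions discussed here: the usual workaround is to hand Flink an explicit (anonymous) KeySelector class, whose generic interface signature survives compilation, rather than a lambda compiled via invokedynamic. A minimal Java sketch of that pattern follows; the Transaction type and its getAccountId() accessor are assumptions borrowed from the fraud-detection tutorial, and the analogous object : KeySelector<...> expression is the equivalent from Kotlin.

{code:java}
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.walkthrough.common.entity.Transaction;

class KeyByWorkaround {
    static KeyedStream<Transaction, Long> keyByAccount(DataStream<Transaction> transactions) {
        // An anonymous class (unlike a synthetic lambda class) retains its
        // generic interface signature in the bytecode, so the TypeExtractor
        // can infer <Transaction, Long> instead of failing.
        return transactions.keyBy(
                new KeySelector<Transaction, Long>() {
                    @Override
                    public Long getKey(Transaction transaction) {
                        return transaction.getAccountId();
                    }
                });
    }
}
{code}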
[jira] [Updated] (FLINK-23979) Exceptions with Kotlin 1.5.0 and higher
[ https://issues.apache.org/jira/browse/FLINK-23979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hank updated FLINK-23979: - Description: *Summary* The keyBy(..) function triggers exceptions when using Kotlin. Different Kotlin compiler versions give different exceptions. *Reproduce* See below. *Using Kotlin 1.5.20 and 1.5.30* When using {code:java} .keyBy(...){code} the following runtime exception occurs: {code:java} Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The types of the interface org.apache.flink.api.java.functions.KeySelector could not be inferred. Support for synthetic interfaces, lambdas, and generic or raw types is limited at this point at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1244) at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:1268) at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1231) at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:789) at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:587) at org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:436) at org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:429) at org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:118) at org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) at FraudDetectionKt.main(FraudDetection.kt:23){code} *Using Kotlin 1.5.0 – +FIXED in 1.5.10+* Using {code:java} .keyBy(...) {code} gives the following runtime exception: {code:java} Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Object FraudDetectionKt$$Lambda$138/0x0008001e6440@7d446ed1 is not serializable at org.apache.flink.api.java.ClosureCleaner.ensureSerializable(ClosureCleaner.java:180) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1901) at org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:189) at org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) at FraudDetectionKt.main(FraudDetection.kt:23) {code} With an older version of Kotlin, e.g. 1.4.32, this exception does not occur and the program runs fine. Some research points to this changelog, which might have something to do with these exceptions: [https://kotlinlang.org/docs/whatsnew15.html#lambdas-via-invokedynamic] *Reproduce* Use the code from the tutorial: [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try-flink/datastream/]
[GitHub] [flink] pnowojski commented on a change in pull request #16990: [FLINK-23776][datastream] Optimize source metric calculation
pnowojski commented on a change in pull request #16990: URL: https://github.com/apache/flink/pull/16990#discussion_r696349341 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/metrics/groups/SourceListener.java ## @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.runtime.metrics.groups; + +import org.apache.flink.annotation.Internal; + +/** + * Listens to changes inside the source. This is currently only used as an abstraction for {@link + * InternalSourceReaderMetricGroup}. + * + * Please be cautious to add more implementations of this interface to production code as it's + * performance-sensitive. Ideally, we only have one implementation that can be inlined. + */ +@Internal +public interface SourceListener { Review comment: You are introducing this interface only for the sake of creating a test implementation, and then commenting that it shouldn't be extended. Do we really need to do that? Could we not just have a final class tracking the last emitted timestamp and watermark? I can see a second reason, a bit cleaner code: with this interface you don't need to pass the whole `InternalSourceReaderMetricGroup` to the outputs. However... both of those issues seem to me a symptom of using inheritance instead of composition in `InternalSourceReaderMetricGroup implements SourceListener`. Wouldn't it solve both of those problems if you introduced a `public final class SourceListener` and injected it as a parameter into `InternalSourceReaderMetricGroup` and both `SourceOutputWithWatermarks` and `TimestampsOnlyOutput`? This would also render `NoopSourceListener` unnecessary, as instantiating the production class would be just as easy. `public final class` wouldn't optimise anything BUT it would produce a compile error and force someone to look at the code, and maybe notice this javadoc, if they were about to violate it. And we would be sure that there is no problem with `NoopSourceListener` leaking into the class loader (for example in benchmarks, which include test dependencies). -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
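A rough sketch of the composition being suggested; the class name is taken from the review comment, while the field and method shapes below are illustrative assumptions rather than the PR's code:

```java
// Hypothetical final class, injected as a constructor parameter into
// InternalSourceReaderMetricGroup, SourceOutputWithWatermarks, and
// TimestampsOnlyOutput alike. Being final, it stays monomorphic and
// trivially inlinable for the JIT, which is the performance property
// the interface javadoc warns about.
public final class SourceListener {

    private long lastEmittedEventTimestamp = Long.MIN_VALUE;
    private long lastEmittedWatermark = Long.MIN_VALUE;

    public void onEventEmitted(long eventTimestamp) {
        this.lastEmittedEventTimestamp = eventTimestamp;
    }

    public void onWatermarkEmitted(long watermark) {
        this.lastEmittedWatermark = watermark;
    }

    public long getLastEmittedEventTimestamp() {
        return lastEmittedEventTimestamp;
    }

    public long getLastEmittedWatermark() {
        return lastEmittedWatermark;
    }
}
```

Tests could then instantiate the production class directly, which is what would make a `NoopSourceListener` unnecessary.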
[jira] [Updated] (FLINK-23979) Exceptions with Kotlin 1.5.0 and higher
[ https://issues.apache.org/jira/browse/FLINK-23979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hank updated FLINK-23979: - Description: *Summary* The keyBy(..) function triggers exceptions when using Kotlin. Different Kotlin compiler versions give different exceptions. *Reproduce* See below. *Using Kotlin 1.5.20 and 1.5.30* When using {code:java} .keyBy(...){code} the following runtime exception occurs: {code:java} Exception in thread "main" org.apache.flink.api.common.functions.InvalidTypesException: The types of the interface org.apache.flink.api.java.functions.KeySelector could not be inferred. Support for synthetic interfaces, lambdas, and generic or raw types is limited at this point at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1244) at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:1268) at org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1231) at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:789) at org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:587) at org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:436) at org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:429) at org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:118) at org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) at FraudDetectionKt.main(FraudDetection.kt:23){code} Update: this seems to have been an issue in Kotlin < 1.4.0 as well: https://github.com/classpass/flink-kotlin#lambdas-kotlin-pre-14-and-invalidtypesexception *Using Kotlin 1.5.0 – +FIXED in 1.5.10+* Using {code:java} .keyBy(...) {code} gives the following runtime exception: {code:java} Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: Object FraudDetectionKt$$Lambda$138/0x0008001e6440@7d446ed1 is not serializable at org.apache.flink.api.java.ClosureCleaner.ensureSerializable(ClosureCleaner.java:180) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1901) at org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:189) at org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) at FraudDetectionKt.main(FraudDetection.kt:23) {code} With an older version of Kotlin, e.g. 1.4.32, this exception does not occur and the program runs fine. Some research points to this changelog, which might have something to do with these exceptions: [https://kotlinlang.org/docs/whatsnew15.html#lambdas-via-invokedynamic] *Reproduce* Use the code from the tutorial: [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try-flink/datastream/]
[jira] [Commented] (FLINK-23979) Exceptions with Kotlin 1.5.0 and higher
[ https://issues.apache.org/jira/browse/FLINK-23979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405009#comment-17405009 ] Hank commented on FLINK-23979: -- InvalidTypesException seems to have been an issue with Kotlin < 1.4.0 as well: https://github.com/classpass/flink-kotlin#lambdas-kotlin-pre-14-and-invalidtypesexception > Exceptions with Kotlin 1.5.0 and higher > --- > > Key: FLINK-23979 > URL: https://issues.apache.org/jira/browse/FLINK-23979 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.13.2 >Reporter: Hank >Priority: Major > > *Summary* > The keyBy(..) function triggers exceptions when using Kotlin. Different Kotlin > compiler versions give different exceptions. > > *Reproduce* > See below. > > *Using Kotlin 1.5.20 and 1.5.30* > > When using > {code:java} > .keyBy(...){code} > the following runtime exception occurs: > {code:java} > Exception in thread "main" > org.apache.flink.api.common.functions.InvalidTypesException: The types of the > interface org.apache.flink.api.java.functions.KeySelector could not be > inferred. Support for synthetic interfaces, lambdas, and generic or raw types > is limited at this point > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1244) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:1268) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1231) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:789) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:587) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:436) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:429) > at > org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:118) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23){code} > > Update: this seems to have been an issue in Kotlin < 1.4.0 as well: > https://github.com/classpass/flink-kotlin#lambdas-kotlin-pre-14-and-invalidtypesexception > > *Using Kotlin 1.5.0 – +FIXED in 1.5.10+* > Using > {code:java} > .keyBy(...) > {code} > gives the following runtime exception: > {code:java} > Exception in thread "main" > org.apache.flink.api.common.InvalidProgramException: Object > FraudDetectionKt$$Lambda$138/0x0008001e6440@7d446ed1 is not serializable > at > org.apache.flink.api.java.ClosureCleaner.ensureSerializable(ClosureCleaner.java:180) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1901) > at > org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:189) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23) > {code} > With an older version of Kotlin, e.g. 1.4.32, this exception does not occur > and the program runs fine. > > Some research points to this changelog, which might have something to do with > these exceptions: 
> [https://kotlinlang.org/docs/whatsnew15.html#lambdas-via-invokedynamic] > > *Reproduce* > Use the code from the tutorial: > [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try-flink/datastream/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-13973) Checkpoint recovery failed after user set uidHash
[ https://issues.apache.org/jira/browse/FLINK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405010#comment-17405010 ] Yun Gao commented on FLINK-13973: - Fix on master via 8bf1fbd4d4bb0c0ce664aeb36395238612341a44 > Checkpoint recovery failed after user set uidHash > - > > Key: FLINK-13973 > URL: https://issues.apache.org/jira/browse/FLINK-13973 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.8.0, 1.8.1, 1.9.0, 1.13.0 >Reporter: Zhouyu Pei >Assignee: Yun Gao >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > Checkpoint recovery failed after user set uidHash, the possible reasons are > as follows: > If altOperatorID is not null, operatorState will be obtained by altOperatorID > and will not be given -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23979) Exceptions with Kotlin 1.5.0 and higher
[ https://issues.apache.org/jira/browse/FLINK-23979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405012#comment-17405012 ] Hank commented on FLINK-23979: -- Kotlin bug tracker: https://youtrack.jetbrains.com/issue/KT-48422 > Exceptions with Kotlin 1.5.0 and higher > --- > > Key: FLINK-23979 > URL: https://issues.apache.org/jira/browse/FLINK-23979 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.13.2 >Reporter: Hank >Priority: Major > > *Summary* > The keyBy(..) function triggers exceptions when using Kotlin. Different Kotlin > compiler versions give different exceptions. > > *Reproduce* > See below. > > *Using Kotlin 1.5.20 and 1.5.30* > > When using > {code:java} > .keyBy(...){code} > the following runtime exception occurs: > {code:java} > Exception in thread "main" > org.apache.flink.api.common.functions.InvalidTypesException: The types of the > interface org.apache.flink.api.java.functions.KeySelector could not be > inferred. Support for synthetic interfaces, lambdas, and generic or raw types > is limited at this point > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1244) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterTypeFromGenericType(TypeExtractor.java:1268) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getParameterType(TypeExtractor.java:1231) > at > org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:789) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getUnaryOperatorReturnType(TypeExtractor.java:587) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:436) > at > org.apache.flink.api.java.typeutils.TypeExtractor.getKeySelectorTypes(TypeExtractor.java:429) > at > org.apache.flink.streaming.api.datastream.KeyedStream.(KeyedStream.java:118) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23){code} > > Update: this seems to have been an issue in Kotlin < 1.4.0 as well: > https://github.com/classpass/flink-kotlin#lambdas-kotlin-pre-14-and-invalidtypesexception > > *Using Kotlin 1.5.0 – +FIXED in 1.5.10+* > Using > {code:java} > .keyBy(...) > {code} > gives the following runtime exception: > {code:java} > Exception in thread "main" > org.apache.flink.api.common.InvalidProgramException: Object > FraudDetectionKt$$Lambda$138/0x0008001e6440@7d446ed1 is not serializable > at > org.apache.flink.api.java.ClosureCleaner.ensureSerializable(ClosureCleaner.java:180) > at > org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.clean(StreamExecutionEnvironment.java:1901) > at > org.apache.flink.streaming.api.datastream.DataStream.clean(DataStream.java:189) > at > org.apache.flink.streaming.api.datastream.DataStream.keyBy(DataStream.java:296) > at FraudDetectionKt.main(FraudDetection.kt:23) > {code} > With an older version of Kotlin, e.g. 1.4.32, this exception does not occur > and the program runs fine. > > Some research points to this changelog, which might have something to do with > these exceptions: > [https://kotlinlang.org/docs/whatsnew15.html#lambdas-via-invokedynamic] > > *Reproduce* > Use the code from the tutorial: > [https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/try-flink/datastream/] > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23987) Move Powered By section into separate page
Chesnay Schepler created FLINK-23987: Summary: Move Powered By section into separate page Key: FLINK-23987 URL: https://issues.apache.org/jira/browse/FLINK-23987 Project: Flink Issue Type: Technical Debt Components: Project Website Reporter: Chesnay Schepler The PMC was just informed that it is not allowed to have a Powered By section on the main homepage. We need to move it into a dedicated page. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] luryson commented on pull request #4827: [FLINK-7840] [build] Shade netty in akka
luryson commented on pull request #4827: URL: https://github.com/apache/flink/pull/4827#issuecomment-906156030 Is there any way to shade all Akka dependencies? In my project, Scala has been shaded into another jar, which causes Flink's initialization to fail. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-7840) Shade Akka's Netty Dependency
[ https://issues.apache.org/jira/browse/FLINK-7840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-7840: -- Labels: pull-request-available (was: ) > Shade Akka's Netty Dependency > - > > Key: FLINK-7840 > URL: https://issues.apache.org/jira/browse/FLINK-7840 > Project: Flink > Issue Type: Improvement > Components: Build System >Reporter: Stephan Ewen >Assignee: Stephan Ewen >Priority: Major > Labels: pull-request-available > Fix For: 1.4.0 > > > In order to avoid clashes between different Netty versions we should shade > Akka's Netty away. > These dependency version clashed manifest themselves in very subtle ways, > like occasional deadlocks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann commented on a change in pull request #16971: [FLINK-23794][tests] Fix InMemoryReporter instantiation and memory consumption.
tillrohrmann commented on a change in pull request #16971: URL: https://github.com/apache/flink/pull/16971#discussion_r696353483 ## File path: flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/source/reader/SourceMetricsITCase.java ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.flink.connector.base.source.reader; + +import org.apache.flink.api.common.eventtime.BoundedOutOfOrdernessWatermarks; +import org.apache.flink.api.common.eventtime.SerializableTimestampAssigner; +import org.apache.flink.api.common.eventtime.WatermarkOutput; +import org.apache.flink.api.common.eventtime.WatermarkStrategy; +import org.apache.flink.api.connector.source.Boundedness; +import org.apache.flink.api.connector.source.Source; +import org.apache.flink.configuration.Configuration; +import org.apache.flink.connector.base.source.reader.mocks.MockBaseSource; +import org.apache.flink.connector.base.source.reader.mocks.MockRecordEmitter; +import org.apache.flink.core.execution.JobClient; +import org.apache.flink.metrics.Gauge; +import org.apache.flink.metrics.Metric; +import org.apache.flink.metrics.groups.OperatorMetricGroup; +import org.apache.flink.runtime.metrics.MetricNames; +import org.apache.flink.runtime.testutils.InMemoryReporter; +import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration; +import org.apache.flink.streaming.api.datastream.DataStream; +import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; +import org.apache.flink.streaming.api.functions.sink.DiscardingSink; +import org.apache.flink.test.util.MiniClusterWithClientResource; +import org.apache.flink.testutils.junit.SharedObjects; +import org.apache.flink.testutils.junit.SharedReference; + +import org.hamcrest.Matcher; +import org.junit.After; +import org.junit.Before; +import org.junit.Rule; +import org.junit.Test; + +import java.time.Duration; +import java.util.List; +import java.util.Map; +import java.util.concurrent.CyclicBarrier; + +import static org.apache.flink.metrics.testutils.MetricMatchers.isCounter; +import static org.apache.flink.metrics.testutils.MetricMatchers.isGauge; +import static org.hamcrest.CoreMatchers.equalTo; +import static org.hamcrest.CoreMatchers.nullValue; +import static org.hamcrest.Matchers.both; +import static org.hamcrest.Matchers.greaterThan; +import static org.hamcrest.Matchers.hasSize; +import static org.hamcrest.Matchers.lessThan; +import static org.junit.Assert.assertThat; + +/** Tests whether all provided metrics of a {@link Source} are of the expected values (FLIP-33). 
*/ +public class SourceMetricsITCase { +private static final int DEFAULT_PARALLELISM = 4; +// since integration tests depend on wall clock time, use huge lags +private static final long EVENTTIME_LAG = Duration.ofDays(100).toMillis(); +private static final long WATERMARK_LAG = Duration.ofDays(1).toMillis(); +private static final long EVENTTIME_EPSILON = Duration.ofDays(20).toMillis(); +// this basically is the time a build is allowed to be frozen before the test fails +private static final long WATERMARK_EPSILON = Duration.ofHours(6).toMillis(); +@Rule +public final SharedObjects sharedObjects = SharedObjects.create(); +private InMemoryReporter reporter; + +private MiniClusterWithClientResource miniClusterResource; + +@Before +public void setup() throws Exception { +reporter = InMemoryReporter.createWithRetainedMetrics(); +Configuration configuration = new Configuration(); +reporter.addToConfiguration(configuration); Review comment: Nice :-) ## File path: flink-connectors/flink-connector-base/src/test/java/org/apache/flink/connector/base/source/reader/SourceMetricsITCase.java ## @@ -0,0 +1,273 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance
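Stripped of the surrounding test class, the reporter wiring under review amounts to roughly the following. This sketch reuses only calls that appear in the diff above (imports as listed there); the builder options are assumptions added to make the fragment self-contained:

```java
// Sketch of the @Before setup: register an in-memory metric reporter with
// the MiniCluster configuration before the cluster starts, so that the test
// can later assert on the FLIP-33 source metrics it captured.
reporter = InMemoryReporter.createWithRetainedMetrics();
Configuration configuration = new Configuration();
reporter.addToConfiguration(configuration);

miniClusterResource =
        new MiniClusterWithClientResource(
                new MiniClusterResourceConfiguration.Builder()
                        .setConfiguration(configuration)
                        .setNumberTaskManagers(1)
                        .setNumberSlotsPerTaskManager(DEFAULT_PARALLELISM)
                        .build());
miniClusterResource.before();
```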
[GitHub] [flink] flinkbot edited a comment on pull request #16861: [FLINK-23808][checkpoint] Bypass operators when advanceToEndOfEventTime for both legacy and new source tasks
flinkbot edited a comment on pull request #16861: URL: https://github.com/apache/flink/pull/16861#issuecomment-900179085 ## CI report: * a10c86eef00f4ac5e3d3594b9e939d55ab7f16fc Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22870) * 7165c3df04d5b5520cced9f44b34d7bd0ad22c02 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16971: [FLINK-23794][tests] Fix InMemoryReporter instantiation and memory consumption.
flinkbot edited a comment on pull request #16971: URL: https://github.com/apache/flink/pull/16971#issuecomment-904969319 ## CI report: * 7960af7d4e53785f300d77acdbf372f9c91d3204 UNKNOWN * d9da2add065aea3ac802bb72872aa8fdeb7e1275 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22878) * ca12cee214cb00e89f8175cc0c3d8645ed350d51 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405019#comment-17405019 ] Yingjie Cao commented on FLINK-23466: - >>> Thanks for the investigation and writing down your suspicions. If the >>>problem is with `BufferManager`, the bug was previously unnoticed, because >>>exclusive buffers were never returned to the buffer pool? That's why only >>>`buffersPerChannel=0` is surfacing the problem? I think you are right, if there are exclusive buffers, the floating buffers are optional. >>> In a retrospect, maybe we should have randomised the number of exclusive >>>buffers in {{org.apache.flink.streaming.util.TestStreamEnvironment}} to >>>provide test coverage for 0 exclusive buffers mode in the ITCases? I think it is a good idea. > UnalignedCheckpointITCase hangs on Azure > > > Key: FLINK-23466 > URL: https://issues.apache.org/jira/browse/FLINK-23466 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Yingjie Cao >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 -- This message was sent by Atlassian Jira (v8.3.4#803005)
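A sketch of what such randomization could look like, assuming the standard buffers-per-channel option; where exactly TestStreamEnvironment would apply it is left open:

{code:java}
import java.util.concurrent.ThreadLocalRandom;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.configuration.NettyShuffleEnvironmentOptions;

final class ExclusiveBufferRandomization {

    /**
     * Randomly runs ITCases with either 0 exclusive buffers per channel
     * (all-floating mode, as in FLINK-23466) or the default of 2.
     */
    static Configuration withRandomizedExclusiveBuffers(Configuration conf) {
        int buffersPerChannel = ThreadLocalRandom.current().nextBoolean() ? 0 : 2;
        conf.setInteger(
                NettyShuffleEnvironmentOptions.NETWORK_BUFFERS_PER_CHANNEL,
                buffersPerChannel);
        return conf;
    }
}
{code}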
[GitHub] [flink] tillrohrmann commented on pull request #16981: [FLINK-23965][e2e] Store logs under FLINK_DIR/log
tillrohrmann commented on pull request #16981: URL: https://github.com/apache/flink/pull/16981#issuecomment-906164798 Thanks for the review @AHeise. Merging this PR now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann closed pull request #16981: [FLINK-23965][e2e] Store logs under FLINK_DIR/log
tillrohrmann closed pull request #16981: URL: https://github.com/apache/flink/pull/16981 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on pull request #16982: [BP-1.13][FLINK-23965][e2e] Store logs under FLINK_DIR/log
tillrohrmann commented on pull request #16982: URL: https://github.com/apache/flink/pull/16982#issuecomment-906167576 Thanks for the review @AHeise. Merging this PR now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann merged pull request #16982: [BP-1.13][FLINK-23965][e2e] Store logs under FLINK_DIR/log
tillrohrmann merged pull request #16982: URL: https://github.com/apache/flink/pull/16982 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-23965) E2E do not execute locally on MacOS
[ https://issues.apache.org/jira/browse/FLINK-23965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann closed FLINK-23965. - Resolution: Fixed Fixed via 1.14.0: e5de4a2b003f2cacb89d04b6f1a379cd555d6270 1.13.3: e67acb5f26894a77d5f4faac6c858de5c4b6b502 > E2E do not execute locally on MacOS > --- > > Key: FLINK-23965 > URL: https://issues.apache.org/jira/browse/FLINK-23965 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.14.0, 1.13.2 >Reporter: Till Rohrmann >Assignee: Till Rohrmann >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0, 1.13.3 > > > After FLINK-21346, the e2e tests are no longer executing locally on MacOS. > The problem seems to be that the e2e configure a log directory that does not > exist and this fails starting a Flink cluster. > I suggest to change the directory to the old directory {{FLINK_DIR/log}} > instead of {{FLINK_DIR/logs}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann commented on pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
tillrohrmann commented on pull request #16986: URL: https://github.com/apache/flink/pull/16986#issuecomment-906172028 The test failure is unrelated. Merging this PR in order to improve the debug log output of `test_queryable_state_restart_tm.sh`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zentol commented on a change in pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
zentol commented on a change in pull request #16986: URL: https://github.com/apache/flink/pull/16986#discussion_r696372014 ## File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateClient.java ## @@ -61,6 +61,8 @@ public static void main(final String[] args) throws Exception { TypeInformation.of(new TypeHint() {}), TypeInformation.of(new TypeHint() {})); +System.out.println("Wait until the state can be queried."); Review comment: is this intentionally not using logging? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
tillrohrmann commented on a change in pull request #16986: URL: https://github.com/apache/flink/pull/16986#discussion_r696376120 ## File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateClient.java ## @@ -61,6 +61,8 @@ public static void main(final String[] args) throws Exception { TypeInformation.of(new TypeHint() {}), TypeInformation.of(new TypeHint() {})); +System.out.println("Wait until the state can be queried."); Review comment: Yes, because we call the `QsStateClient` from the `test_queryable_state_restart_tm.sh` w/o proper logging set up. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] tillrohrmann commented on a change in pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
tillrohrmann commented on a change in pull request #16986: URL: https://github.com/apache/flink/pull/16986#discussion_r696376827 ## File path: flink-end-to-end-tests/flink-queryable-state-test/src/main/java/org/apache/flink/streaming/tests/queryablestate/QsStateClient.java ## @@ -61,6 +61,8 @@ public static void main(final String[] args) throws Exception { TypeInformation.of(new TypeHint() {}), TypeInformation.of(new TypeHint() {})); +System.out.println("Wait until the state can be queried."); Review comment: I know it is not beautiful but I didn't want to spend more time on overriding shade exclusions and adding log4j configuration files to the `QsStateClient.jar`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-19303) Disable WAL in RocksDB recovery
[ https://issues.apache.org/jira/browse/FLINK-19303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li reassigned FLINK-19303: - Assignee: Juha Mynttinen Thanks [~juha.mynttinen], I've assigned the JIRA back to you. > Disable WAL in RocksDB recovery > --- > > Key: FLINK-19303 > URL: https://issues.apache.org/jira/browse/FLINK-19303 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Juha Mynttinen >Assignee: Juha Mynttinen >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned > > During recovery of {{RocksDBStateBackend}} the recovery mechanism puts the > key-value pairs into local RocksDB instance(s). To speed up the process, the > recovery uses RocksDB's write batch mechanism. The [RocksDB > WAL|https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log] is enabled > during this process. > During normal operations, i.e. when the state backend has been recovered and > the Flink application is running (on the RocksDB state backend), WAL is disabled. > The recovery process doesn't need WAL. In fact, recovery should be much > faster without WAL. Thus, WAL should be disabled in the recovery process. > AFAIK the last thing that was done with WAL during recovery was an attempt to > remove it. That removal was later reverted because it caused stability issues > (https://issues.apache.org/jira/browse/FLINK-8922). > Unfortunately the root cause of why disabling WAL causes a segfault during > recovery is unknown. After all, WAL is not used during normal operations. > A potential explanation is some kind of bug in RocksDB's write batch when using > WAL. It is possible that later RocksDB versions have fixes / workarounds for the > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
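For reference, RocksDB's Java API disables the WAL per write via WriteOptions. A minimal sketch of a WAL-free batched write follows; this is illustrative only, not Flink's actual restore code path:

{code:java}
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.WriteBatch;
import org.rocksdb.WriteOptions;

final class WalFreeWrite {

    static void putBatchWithoutWal(RocksDB db, byte[] key, byte[] value)
            throws RocksDBException {
        try (WriteOptions options = new WriteOptions().setDisableWAL(true);
                WriteBatch batch = new WriteBatch()) {
            batch.put(key, value);
            // No WAL entry is written for this batch. On a crash the data may
            // be lost, which is acceptable during recovery, since the restore
            // can simply be repeated from the checkpoint files.
            db.write(options, batch);
        }
    }
}
{code}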
[GitHub] [flink] tillrohrmann edited a comment on pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
tillrohrmann edited a comment on pull request #16986: URL: https://github.com/apache/flink/pull/16986#issuecomment-906172028 Thanks for the review @zentol. The test failure is unrelated. Merging this PR in order to improve the debug log output of `test_queryable_state_restart_tm.sh`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-19303) Disable WAL in RocksDB recovery
[ https://issues.apache.org/jira/browse/FLINK-19303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yu Li updated FLINK-19303: -- Labels: (was: auto-deprioritized-major auto-unassigned) > Disable WAL in RocksDB recovery > --- > > Key: FLINK-19303 > URL: https://issues.apache.org/jira/browse/FLINK-19303 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Reporter: Juha Mynttinen >Assignee: Juha Mynttinen >Priority: Minor > > During recovery of {{RocksDBStateBackend}} the recovery mechanism puts the > key-value pairs into local RocksDB instance(s). To speed up the process, the > recovery uses RocksDB's write batch mechanism. The [RocksDB > WAL|https://github.com/facebook/rocksdb/wiki/Write-Ahead-Log] is enabled > during this process. > During normal operations, i.e. when the state backend has been recovered and > the Flink application is running (on the RocksDB state backend), WAL is disabled. > The recovery process doesn't need WAL. In fact, recovery should be much > faster without WAL. Thus, WAL should be disabled in the recovery process. > AFAIK the last thing that was done with WAL during recovery was an attempt to > remove it. That removal was later reverted because it caused stability issues > (https://issues.apache.org/jira/browse/FLINK-8922). > Unfortunately the root cause of why disabling WAL causes a segfault during > recovery is unknown. After all, WAL is not used during normal operations. > A potential explanation is some kind of bug in RocksDB's write batch when using > WAL. It is possible that later RocksDB versions have fixes / workarounds for the > issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann closed pull request #16986: [FLINK-23954][tests] Add more logging to test_queryable_state_restart_tm.sh for debugging purposes
tillrohrmann closed pull request #16986: URL: https://github.com/apache/flink/pull/16986 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23954) Queryable state (rocksdb) with TM restart end-to-end test fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405032#comment-17405032 ] Till Rohrmann commented on FLINK-23954: --- Added more debug logs via 2fef9eda1e76cf08f4c16ea104e7ce0f0b79eae9 6a7b5e8079cfced519ac5e106f0fb7277131f8bd Once the test case is fixed/hardened, 6a7b5e8079cfced519ac5e106f0fb7277131f8bd can be reverted. > Queryable state (rocksdb) with TM restart end-to-end test fails on azure > > > Key: FLINK-23954 > URL: https://issues.apache.org/jira/browse/FLINK-23954 > Project: Flink > Issue Type: Bug > Components: Runtime / Queryable State >Affects Versions: 1.14.0 >Reporter: Xintong Song >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22714&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=070ff179-953e-5bda-71fa-d6599415701c&l=11090 > {code} > Aug 24 09:54:17 > == > Aug 24 09:54:17 Running 'Queryable state (rocksdb) with TM restart end-to-end > test' > Aug 24 09:54:17 > == > Aug 24 09:54:17 TEST_DATA_DIR: > /home/vsts/work/1/s/flink-end-to-end-tests/test-scripts/temp-test-directory-17362481511 > Aug 24 09:54:17 Flink dist directory: > /home/vsts/work/1/s/flink-dist/target/flink-1.14-SNAPSHOT-bin/flink-1.14-SNAPSHOT > Aug 24 09:54:17 Adding flink-queryable-state-runtime to lib/ > Aug 24 09:54:17 Starting cluster. > Aug 24 09:54:18 Starting standalonesession daemon on host fv-az123-794. > Aug 24 09:54:19 Starting taskexecutor daemon on host fv-az123-794. > Aug 24 09:54:19 Waiting for Dispatcher REST endpoint to come up... > Aug 24 09:54:20 Waiting for Dispatcher REST endpoint to come up... > Aug 24 09:54:21 Waiting for Dispatcher REST endpoint to come up... > Aug 24 09:54:22 Waiting for Dispatcher REST endpoint to come up... > Aug 24 09:54:23 Waiting for Dispatcher REST endpoint to come up... > Aug 24 09:54:24 Dispatcher REST endpoint is up. > Aug 24 09:54:31 Job (f230364470a082b78dc34147f8da38f1) is running. > Aug 24 09:54:31 Starting to wait for completion of 10 checkpoints > Aug 24 09:54:31 2/10 completed checkpoints > Aug 24 09:54:33 6/10 completed checkpoints > Aug 24 09:54:35 6/10 completed checkpoints > Aug 24 09:54:37 SERVER: 127.0.0.1 > Aug 24 09:54:37 PORT: 9069 > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. > Aug 24 09:54:38 MapState has 17 entries > Aug 24 09:54:38 TaskManager 412715 killed. > Aug 24 09:54:38 Number of running task managers 1 is not yet 0. > Aug 24 09:54:42 Number of running task managers has reached 0. > Aug 24 09:54:42 Latest snapshot count was 22 > Aug 24 09:54:43 Starting taskexecutor daemon on host fv-az123-794. > Aug 24 09:54:43 Number of running task managers 0 is not yet 1. > Aug 24 09:54:47 Number of running task managers has reached 1. > Aug 24 09:54:49 Job (f230364470a082b78dc34147f8da38f1) is running. > Aug 24 09:54:49 Starting to wait for completion of 16 checkpoints > Aug 24 09:54:49 11/16 completed checkpoints > Aug 24 09:54:51 11/16 completed checkpoints > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further > details. 
> Aug 24 09:54:54 after: 21 > Aug 24 09:54:54 An error occurred > Aug 24 09:54:54 [FAIL] Test script contains errors. > Aug 24 09:54:54 Checking of logs skipped. > Aug 24 09:54:54 > Aug 24 09:54:54 [FAIL] 'Queryable state (rocksdb) with TM restart end-to-end > test' failed after 0 minutes and 37 seconds! Test exited with exit code 1 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] zhuzhurk commented on a change in pull request #16959: [FLINK-23912] Remove unnecessary "Clearing resource requirements of job" in SlotManager
zhuzhurk commented on a change in pull request #16959: URL: https://github.com/apache/flink/pull/16959#discussion_r696381040 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/DefaultDeclarativeSlotPool.java ## @@ -117,7 +117,7 @@ public void increaseResourceRequirementsBy(ResourceCounter increment) { @Override public void decreaseResourceRequirementsBy(ResourceCounter decrement) { -if (decrement.isEmpty()) { +if (decrement.isEmpty() || totalResourceRequirements.isEmpty()) { Review comment: This looks to me like a critical bug. If a TM is disconnected while it has available slots during job execution, the job may never be able to declare the right resource requirements again. I wrote a rough test to show this case. You can check the resource declarations in the log to see that after a failover, a job with parallelism=3 will only declare resources for 2 slots. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] zhuzhurk commented on a change in pull request #16959: [FLINK-23912] Remove unnecessary "Clearing resource requirements of job" in SlotManager
zhuzhurk commented on a change in pull request #16959: URL: https://github.com/apache/flink/pull/16959#discussion_r696381188 ## File path: flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/slotpool/DefaultDeclarativeSlotPool.java ## @@ -117,7 +117,7 @@ public void increaseResourceRequirementsBy(ResourceCounter increment) { @Override public void decreaseResourceRequirementsBy(ResourceCounter decrement) { -if (decrement.isEmpty()) { +if (decrement.isEmpty() || totalResourceRequirements.isEmpty()) { Review comment: https://github.com/zhuzhurk/flink/commit/a61434092b2980116398067428da7e4124856952 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
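For readers without the full diff at hand, the guard under discussion has roughly this shape. Everything past the early return is an assumption about the surrounding method, reconstructed for illustration from the review context:

```java
// DefaultDeclarativeSlotPool (simplified). The review concern is the added
// totalResourceRequirements.isEmpty() early-out: if the tracked requirements
// are transiently empty, e.g. around a TaskManager disconnect during a
// failover, a legitimate decrement is dropped without re-declaring
// requirements, and the declared resources can stay wrong afterwards.
@Override
public void decreaseResourceRequirementsBy(ResourceCounter decrement) {
    if (decrement.isEmpty() || totalResourceRequirements.isEmpty()) {
        return;
    }
    totalResourceRequirements = totalResourceRequirements.subtract(decrement);
    declareResourceRequirements();
}
```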
[GitHub] [flink] twalthr closed pull request #16960: [FLINK-23915][table] Fix catalog and primary key resolution of temporary tables
twalthr closed pull request #16960: URL: https://github.com/apache/flink/pull/16960 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-23221) Migrate Docker images to Debian Bullseye
[ https://issues.apache.org/jira/browse/FLINK-23221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler closed FLINK-23221. Fix Version/s: (was: 1.13.3) 1.11.4 1.12.5 1.13.2 Resolution: Fixed All Flink images for 1.11.4/1.12.5/1.13.2 were automatically updated to Bullseye. > Migrate Docker images to Debian Bullseye > > > Key: FLINK-23221 > URL: https://issues.apache.org/jira/browse/FLINK-23221 > Project: Flink > Issue Type: Improvement > Components: flink-docker >Affects Versions: 1.13.1 > Environment: Issue was discovered by AWS ECR image scanning on > apache/flink:1.13.1-scala_2.12 >Reporter: Razvan AGAPE >Assignee: Chesnay Schepler >Priority: Critical > Labels: docker, flink, glibc > Fix For: 1.14.0, 1.13.2, 1.12.5, 1.11.4 > > > The AWS ECR image scanning reports some HIGH vulnerabilities on the > apache/flink:1.13.1-scala_2.12 Docker image. In addition, all versions prior > to this one have these issues. > The vulnerabilities are the following: > # [CVE-2021-33574|https://security-tracker.debian.org/tracker/CVE-2021-33574] > # [CVE-2019-25013 - for this one a patch has been released in glibc version > 2.31-9|https://security-tracker.debian.org/tracker/CVE-2019-25013] > Our security policy does not allow us to deploy images that have security > vulnerabilities. Searching the Internet, I found that for the first > problem, a patch containing the solution will be released this year. > Do you plan to release a new image containing the newer glibc version in > order to solve these issues? > Also, I checked and the Alpine-based Flink images do not have these > vulnerabilities. Do you plan to release newer versions of Flink based on > Alpine (the latest one is flink:1.8.x)? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #14395: [FLINK-16491][formats] Add compression support for ParquetAvroWriters
flinkbot edited a comment on pull request #14395: URL: https://github.com/apache/flink/pull/14395#issuecomment-745744218 ## CI report: * cf4d3c1ec2008e17ad6b8b6a86d0c399d1e9c48d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22528) * 966c45118322ce5b9610f5e898843d3997e41357 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23967) Remove scala references in OneInputStreamTaskTest
[ https://issues.apache.org/jira/browse/FLINK-23967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chesnay Schepler updated FLINK-23967: - Parent: FLINK-23986 Issue Type: Sub-task (was: Technical Debt) > Remove scala references in OneInputStreamTaskTest > - > > Key: FLINK-23967 > URL: https://issues.apache.org/jira/browse/FLINK-23967 > Project: Flink > Issue Type: Sub-task > Components: API / DataStream, Tests >Reporter: Chesnay Schepler >Assignee: Chesnay Schepler >Priority: Minor > Labels: pull-request-available > Fix For: 1.14.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] tillrohrmann commented on a change in pull request #8527: [FLINK-12610] [table-planner-blink] Introduce planner rules about aggregate
tillrohrmann commented on a change in pull request #8527: URL: https://github.com/apache/flink/pull/8527#discussion_r696390658 ## File path: flink-table/flink-table-planner-blink/src/test/scala/org/apache/flink/table/runtime/batch/sql/agg/AggregateReduceGroupingITCase.scala ## @@ -0,0 +1,405 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.flink.table.runtime.batch.sql.agg + +import org.apache.flink.api.java.typeutils.RowTypeInfo +import org.apache.flink.table.api.{PlannerConfigOptions, TableConfigOptions, Types} +import org.apache.flink.table.plan.stats.FlinkStatistic +import org.apache.flink.table.runtime.utils.BatchTestBase +import org.apache.flink.table.runtime.utils.BatchTestBase.row +import org.apache.flink.table.util.DateTimeTestUtil.UTCTimestamp + +import org.junit.{Before, Test} + +import java.sql.Date + +import scala.collection.JavaConverters._ +import scala.collection.Seq + +class AggregateReduceGroupingITCase extends BatchTestBase { + + @Before + def before(): Unit = { +registerCollection("T1", + Seq(row(2, 1, "A", null), +row(3, 2, "A", "Hi"), +row(5, 2, "B", "Hello"), +row(6, 3, "C", "Hello world")), + new RowTypeInfo(Types.INT, Types.INT, Types.STRING, Types.STRING), + "a1, b1, c1, d1", + Array(true, true, true, true), + FlinkStatistic.builder().uniqueKeys(Set(Set("a1").asJava).asJava).build() +) + +registerCollection("T2", + Seq(row(1, 1, "X"), +row(1, 2, "Y"), +row(2, 3, null), +row(2, 4, "Z")), + new RowTypeInfo(Types.INT, Types.INT, Types.STRING), + "a2, b2, c2", + Array(true, true, true), + FlinkStatistic.builder() +.uniqueKeys(Set(Set("b2").asJava, Set("a2", "b2").asJava).asJava).build() +) + +registerCollection("T3", + Seq(row(1, 10, "Hi", 1L), +row(2, 20, "Hello", 1L), +row(2, 20, "Hello world", 2L), +row(3, 10, "Hello world, how are you?", 1L), +row(4, 20, "I am fine.", 2L), +row(4, null, "Luke Skywalker", 2L)), + new RowTypeInfo(Types.INT, Types.INT, Types.STRING, Types.LONG), + "a3, b3, c3, d3", + Array(true, true, true, true), + FlinkStatistic.builder().uniqueKeys(Set(Set("a1").asJava).asJava).build() +) + +registerCollection("T4", + Seq(row(1, 1, "A", UTCTimestamp("2018-06-01 10:05:30"), "Hi"), +row(2, 1, "B", UTCTimestamp("2018-06-01 10:10:10"), "Hello"), +row(3, 2, "B", UTCTimestamp("2018-06-01 10:15:25"), "Hello world"), +row(4, 3, "C", UTCTimestamp("2018-06-01 10:36:49"), "I am fine.")), + new RowTypeInfo(Types.INT, Types.INT, Types.STRING, Types.SQL_TIMESTAMP, Types.STRING), + "a4, b4, c4, d4, e4", + Array(true, true, true, true, true), + FlinkStatistic.builder().uniqueKeys(Set(Set("a4").asJava).asJava).build() +) + +registerCollection("T5", + Seq(row(2, 1, "A", null), +row(3, 2, "B", "Hi"), +row(1, null, "C", "Hello"), +row(4, 3, "D", "Hello world"), +row(3, 1, "E", "Hello world, how are you?"), +row(5, null, "F", null), +row(7, 2, "I", "hahaha"), +row(6, 1, "J", "I am fine.")), + new RowTypeInfo(Types.INT, Types.INT, Types.STRING, Types.STRING), + "a5, b5, c5, d5", + Array(true, true, true, true), + FlinkStatistic.builder().uniqueKeys(Set(Set("c5").asJava).asJava).build() +) + +registerCollection("T6", + (0 until 5).map( +i => row(i, 1L, if (i % 500 == 0) null else s"Hello$i", "Hello world", 10, + new Date(i + 153182000L))), + new RowTypeInfo(Types.INT, Types.LONG, Types.STRING, Types.STRING, Types.INT, Types.SQL_DATE), + "a6, b6, c6, d6, e6, f6", + Array(true, true, true, true, true, true), + FlinkStatistic.builder().uniqueKeys(Set(Set("a6").asJava).asJava).build() +) +// HashJoin is disabled due to translateToPlanInternal method is not implemented yet + tEnv.getConfig.getConf.setString(TableConfigOptions.SQL_EXEC_DISABLED_OPERATORS, "HashJoin") + } + + @Test + def testSingleAggOnTable_SortAgg(): Unit = { + tEnv.getConfig.getConf.setString(TableConfigOp
[GitHub] [flink] flinkbot edited a comment on pull request #16787: [FLINK-18080][docs-zh] Translate "security-kerberos" page into Chinese
flinkbot edited a comment on pull request #16787: URL: https://github.com/apache/flink/pull/16787#issuecomment-897334392 ## CI report: * 19eb37d2ae8019e2075cec18a36f36a2518fa891 Azure: [SUCCESS](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22875) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16861: [FLINK-23808][checkpoint] Bypass operators when advanceToEndOfEventTime for both legacy and new source tasks
flinkbot edited a comment on pull request #16861: URL: https://github.com/apache/flink/pull/16861#issuecomment-900179085 ## CI report: * a10c86eef00f4ac5e3d3594b9e939d55ab7f16fc Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22870) * 7165c3df04d5b5520cced9f44b34d7bd0ad22c02 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22879) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16900: [FLINK-23877][connector/pulsar] Provide an embedded pulsar broker for testing.
flinkbot edited a comment on pull request #16900: URL: https://github.com/apache/flink/pull/16900#issuecomment-902199681 ## CI report: * f3180e650ed121eb4706dffbd4584d2a08189e24 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22876) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16971: [FLINK-23794][tests] Fix InMemoryReporter instantiation and memory consumption.
flinkbot edited a comment on pull request #16971: URL: https://github.com/apache/flink/pull/16971#issuecomment-904969319 ## CI report: * 7960af7d4e53785f300d77acdbf372f9c91d3204 UNKNOWN * d9da2add065aea3ac802bb72872aa8fdeb7e1275 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22878) * ca12cee214cb00e89f8175cc0c3d8645ed350d51 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22880) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16993: [FLINK-23818][python][docs] Add documentation about tgz files for python archives
flinkbot edited a comment on pull request #16993: URL: https://github.com/apache/flink/pull/16993#issuecomment-906106746 ## CI report: * 8c5f1ede48695bbb00c785cca78aac13d095922a Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22877) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] syhily commented on pull request #16900: [FLINK-23877][connector/pulsar] Provide an embedded pulsar broker for testing.
syhily commented on pull request #16900: URL: https://github.com/apache/flink/pull/16900#issuecomment-906194276 These typos have been fixed. I'm writing documentation on top of this PR. I hope it can be merged sooner rather than later. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23457) Sending the buffer of the right size for broadcast
[ https://issues.apache.org/jira/browse/FLINK-23457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23457: --- Fix Version/s: 1.15.0 > Sending the buffer of the right size for broadcast > -- > > Key: FLINK-23457 > URL: https://issues.apache.org/jira/browse/FLINK-23457 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Assignee: zhengyu.lou >Priority: Major > Fix For: 1.15.0 > > > It is not enough to know just the number of available buffers (credits) for > the downstream, because the sizes of these buffers can differ. So we are > proposing to resolve this problem in the following way: if the downstream > buffer size is changed, then the upstream should send a buffer of a size not > greater than the new one, regardless of how big the current buffer on the > upstream is (pollBuffer should receive a parameter like bufferSize and return > a buffer not greater than it). > The downstream will be able to support any buffer size < max buffer size, so > it should be good enough to request a BufferBuilder with the new size after > receiving the announcement, leaving existing BufferBuilders/BufferConsumers > unchanged. In other words, the code in {{PipelinedSubpartition(View)}} doesn't > need to be changed (apart from forwarding the new buffer size to the > {{BufferWritingResultPartition}}). All buffer size adjustments can be > implemented exclusively in {{BufferWritingResultPartition}}. > If different downstream subtasks have different throughput and hence > different desired buffer sizes, then a single upstream subtask has to support > having two different subpartitions with different buffer sizes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
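As a rough illustration of the mechanism described in this ticket (hypothetical names only; not Flink's actual BufferWritingResultPartition code), the upstream caps each newly handed-out buffer at the last announced downstream size:

{code:java}
// Sketch under stated assumptions: all names here are placeholders.
class DebloatingPartitionSketch {
    private volatile int announcedBufferSize = 32 * 1024; // assumed default

    // Called when the downstream announces a newly desired buffer size.
    void onBufferSizeAnnouncement(int newSize) {
        announcedBufferSize = newSize;
    }

    // pollBuffer-style method: returns a buffer no larger than both the
    // pool's maximum and the announced size; buffers already in flight
    // are left unchanged, as the ticket describes.
    byte[] pollBuffer(int maxBufferSize) {
        return new byte[Math.min(maxBufferSize, announcedBufferSize)];
    }
}
{code}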
[jira] [Updated] (FLINK-23457) Sending the buffer of the right size for broadcast
[ https://issues.apache.org/jira/browse/FLINK-23457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23457: --- Parent Issue: FLINK-23973 (was: FLINK-23451) > Sending the buffer of the right size for broadcast > -- > > Key: FLINK-23457 > URL: https://issues.apache.org/jira/browse/FLINK-23457 > Project: Flink > Issue Type: Sub-task >Reporter: Anton Kalashnikov >Assignee: zhengyu.lou >Priority: Major > > It is not enough to know just the number of available buffers (credits) for > the downstream, because the sizes of these buffers can differ. So we are > proposing to resolve this problem in the following way: if the downstream > buffer size is changed, then the upstream should send a buffer of a size not > greater than the new one, regardless of how big the current buffer on the > upstream is (pollBuffer should receive a parameter like bufferSize and return > a buffer not greater than it). > The downstream will be able to support any buffer size < max buffer size, so > it should be good enough to request a BufferBuilder with the new size after > receiving the announcement, leaving existing BufferBuilders/BufferConsumers > unchanged. In other words, the code in {{PipelinedSubpartition(View)}} doesn't > need to be changed (apart from forwarding the new buffer size to the > {{BufferWritingResultPartition}}). All buffer size adjustments can be > implemented exclusively in {{BufferWritingResultPartition}}. > If different downstream subtasks have different throughput and hence > different desired buffer sizes, then a single upstream subtask has to support > having two different subpartitions with different buffer sizes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23457) Sending the buffer of the right size for broadcast
[ https://issues.apache.org/jira/browse/FLINK-23457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23457: --- Affects Version/s: 1.14.0 > Sending the buffer of the right size for broadcast > -- > > Key: FLINK-23457 > URL: https://issues.apache.org/jira/browse/FLINK-23457 > Project: Flink > Issue Type: Sub-task >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Assignee: zhengyu.lou >Priority: Major > > It is not enough to know just the number of available buffers (credits) for > the downstream, because the sizes of these buffers can differ. So we are > proposing to resolve this problem in the following way: if the downstream > buffer size is changed, then the upstream should send a buffer of a size not > greater than the new one, regardless of how big the current buffer on the > upstream is (pollBuffer should receive a parameter like bufferSize and return > a buffer not greater than it). > The downstream will be able to support any buffer size < max buffer size, so > it should be good enough to request a BufferBuilder with the new size after > receiving the announcement, leaving existing BufferBuilders/BufferConsumers > unchanged. In other words, the code in {{PipelinedSubpartition(View)}} doesn't > need to be changed (apart from forwarding the new buffer size to the > {{BufferWritingResultPartition}}). All buffer size adjustments can be > implemented exclusively in {{BufferWritingResultPartition}}. > If different downstream subtasks have different throughput and hence > different desired buffer sizes, then a single upstream subtask has to support > having two different subpartitions with different buffer sizes. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23988) Test corner cases
Piotr Nowojski created FLINK-23988: -- Summary: Test corner cases Key: FLINK-23988 URL: https://issues.apache.org/jira/browse/FLINK-23988 Project: Flink Issue Type: Sub-task Components: Runtime / Network Reporter: Piotr Nowojski Check how debloating behaves in case of: * data skew * multiple/two/union inputs -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23973) FLIP-183: Buffer debloat 1.1
[ https://issues.apache.org/jira/browse/FLINK-23973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23973: --- Affects Version/s: 1.14.0 > FLIP-183: Buffer debloat 1.1 > > > Key: FLINK-23973 > URL: https://issues.apache.org/jira/browse/FLINK-23973 > Project: Flink > Issue Type: New Feature >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Priority: Major > > Second umbrella ticket for > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment] > This ticket should collect tasks for improving the first version of the > buffer debloating feature -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23973) FLIP-183: Buffer debloat 1.1
[ https://issues.apache.org/jira/browse/FLINK-23973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23973: --- Fix Version/s: 1.15.0 > FLIP-183: Buffer debloat 1.1 > > > Key: FLINK-23973 > URL: https://issues.apache.org/jira/browse/FLINK-23973 > Project: Flink > Issue Type: New Feature >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Priority: Major > Fix For: 1.15.0 > > > Second umbrella ticket for > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment] > This ticket should collect tasks for improving the first version of the > buffer debloating feature -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23973) FLIP-183: Buffer debloating 1.1
[ https://issues.apache.org/jira/browse/FLINK-23973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23973: --- Summary: FLIP-183: Buffer debloating 1.1 (was: FLIP-183: Buffer debloat 1.1) > FLIP-183: Buffer debloating 1.1 > --- > > Key: FLINK-23973 > URL: https://issues.apache.org/jira/browse/FLINK-23973 > Project: Flink > Issue Type: New Feature > Components: Runtime / Network >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Priority: Major > Fix For: 1.15.0 > > > Second umbrella ticket for > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment] > This ticket should collect tasks for improving the first version of the > buffer debloating feature -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23973) FLIP-183: Buffer debloat 1.1
[ https://issues.apache.org/jira/browse/FLINK-23973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23973: --- Component/s: Runtime / Network > FLIP-183: Buffer debloat 1.1 > > > Key: FLINK-23973 > URL: https://issues.apache.org/jira/browse/FLINK-23973 > Project: Flink > Issue Type: New Feature > Components: Runtime / Network >Affects Versions: 1.14.0 >Reporter: Anton Kalashnikov >Priority: Major > Fix For: 1.15.0 > > > Second umbrella ticket for > [https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment.|https://cwiki.apache.org/confluence/display/FLINK/FLIP-183%3A+Dynamic+buffer+size+adjustment] > This ticket should collect tasks for improving the first version of the > buffer debloating feature -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23821) Test loopback mode to allow Python UDF worker and client reuse the same Python VM
[ https://issues.apache.org/jira/browse/FLINK-23821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405054#comment-17405054 ] Liu commented on FLINK-23821: - Hello, [~hxbks2ks], I am willing to do this ticket. Can you assign it to me? Thanks. > Test loopback mode to allow Python UDF worker and client reuse the same > Python VM > - > > Key: FLINK-23821 > URL: https://issues.apache.org/jira/browse/FLINK-23821 > Project: Flink > Issue Type: Improvement > Components: API / Python >Reporter: Huang Xingbo >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > The newly introduced feature allows users to debug their python functions > directly in IDEs such as PyCharm. > For the details of debugging, you can refer to > [doc|https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/python/debugging/#local-debug] > and for the details of how to debug in PyCharm, you can refer to the > [doc|https://www.jetbrains.com/help/pycharm/debugging-your-first-python-application.html] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] flinkbot edited a comment on pull request #14395: [FLINK-16491][formats] Add compression support for ParquetAvroWriters
flinkbot edited a comment on pull request #14395: URL: https://github.com/apache/flink/pull/14395#issuecomment-745744218 ## CI report: * cf4d3c1ec2008e17ad6b8b6a86d0c399d1e9c48d Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22528) * 966c45118322ce5b9610f5e898843d3997e41357 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22881) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16900: [FLINK-23877][connector/pulsar] Provide an embedded pulsar broker for testing.
flinkbot edited a comment on pull request #16900: URL: https://github.com/apache/flink/pull/16900#issuecomment-902199681 ## CI report: * f3180e650ed121eb4706dffbd4584d2a08189e24 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22876) * 958dbafc8a14a0f46e54db04fd9cbd0f21f5d57f UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16971: [FLINK-23794][tests] Fix InMemoryReporter instantiation and memory consumption.
flinkbot edited a comment on pull request #16971: URL: https://github.com/apache/flink/pull/16971#issuecomment-904969319 ## CI report: * 7960af7d4e53785f300d77acdbf372f9c91d3204 UNKNOWN * ca12cee214cb00e89f8175cc0c3d8645ed350d51 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22880) * cc30231665f03047e4098e2b8760dccce6cf9e43 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (FLINK-23989) Flink SQL visibility
wangzhihao created FLINK-23989: -- Summary: Flink SQL visibility Key: FLINK-23989 URL: https://issues.apache.org/jira/browse/FLINK-23989 Project: Flink Issue Type: New Feature Components: Runtime / Queryable State Reporter: wangzhihao It is desirable to inspect the internal state generated by SQL, especially for debugging purposes. We propose to add a new feature to the Flink UI console: when the user clicks on a vertex of the job DAG, we list all states the vertex has in a separate panel (let's call it the *states panel*). On this panel users can query the states by key. The returned value shall be a human-readable string instead of an opaque binary. In particular, we need to expose the states as queryable. But currently the user experience of queryable states is cumbersome: only the serialized value is returned to the client, and users need to handle deserialization by themselves. What's worse, the client needs to construct the serializer and type manually. To improve this situation, we propose: # Have a new API to find all queryable states associated with a job vertex. This can be done by checking against the KvStateLocationRegistry, which stores the mapping between JobVertexId and states. # Have a new API to allow users to get the types of queryable states: for a registered name (String), the Queryable State server will return the types of key and value ([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]). # To generate a human-readable string with the API in step 2, we can 1) generate a [TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java] from the LogicalType, so as to handle Serde automatically, and 2) convert internal data structures to external data structures to generate a printable string (with converters such as [DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html]). With all these steps and some modifications to the Web UI/REST, we can enable users to query SQL internal states. -- This message was sent by Atlassian Jira (v8.3.4#803005)
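To make the described pain point concrete, this is roughly what a query looks like with today's client, based on the documented QueryableStateClient API; the host, port, job id, key and state names below are placeholders:

{code:java}
import org.apache.flink.api.common.JobID;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.queryablestate.client.QueryableStateClient;

import java.util.concurrent.CompletableFuture;

// Sketch: the caller must know and construct the key type and the state
// descriptor (and thus the value serializer) manually -- the cumbersome
// part that this ticket proposes to automate.
public class QueryStateExample {
    public static void main(String[] args) throws Exception {
        QueryableStateClient client =
                new QueryableStateClient("proxy-host", 9069); // default proxy port
        JobID jobId =
                JobID.fromHexString("00000000000000000000000000000000"); // placeholder
        CompletableFuture<ValueState<Long>> result =
                client.getKvState(
                        jobId,
                        "my-query-name", // name used in asQueryableState(...)
                        42,              // key to look up
                        Types.INT,       // key type, supplied by hand
                        new ValueStateDescriptor<>("counter", Types.LONG));
        System.out.println(result.get().value());
        client.shutdownAndWait();
    }
}
{code}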
[jira] [Commented] (FLINK-21003) Flink add Sink to AliyunOSS doesn't work
[ https://issues.apache.org/jira/browse/FLINK-21003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405061#comment-17405061 ] Alex Z commented on FLINK-21003: I think the PR for FLINK-11388 seems to solve this problem, but I don't know why the PR was closed and not merged. > Flink add Sink to AliyunOSS doesn't work > > > Key: FLINK-21003 > URL: https://issues.apache.org/jira/browse/FLINK-21003 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.11.0 >Reporter: zhangyunyun >Priority: Minor > Labels: auto-deprioritized-major > > When I add a sink to OSS using the code below: > {code:java} > String path = "oss:///"; > StreamingFileSink streamingFileSink = StreamingFileSink > .forRowFormat(new Path(path), new SimpleStringEncoder("UTF-8")) > .withRollingPolicy( > DefaultRollingPolicy.builder() > .withRolloverInterval(TimeUnit.MINUTES.toMillis(5)) > .withInactivityInterval(TimeUnit.MINUTES.toMillis(1)) > .withMaxPartSize(1024 * 1024 * 10) > .build() > ).build(); > strStream.addSink(streamingFileSink);{code} > it produces an error: > {code:java} > Recoverable writers on Hadoop are only supported for HDFS > {code} > Is there any mistake I made? > OR > I want to use Aliyun OSS to store the stream data, split into different files. > The official Flink documentation's example is shown below: > {code:java} > // Write to OSS bucket > stream.writeAsText("oss:///") > {code} > How can I use this to split the output into different files by the data's attributes? > > Thanks! > > > > > > > > > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
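Regarding the last question in the ticket (splitting the output into different files by record attributes), the usual approach with StreamingFileSink is a custom BucketAssigner; a sketch assuming a hypothetical Event record with a category field:

{code:java}
import org.apache.flink.core.io.SimpleVersionedSerializer;
import org.apache.flink.streaming.api.functions.sink.filesystem.BucketAssigner;
import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.SimpleVersionedStringSerializer;

// Sketch: each record is routed to a sub-directory ("bucket") derived from
// one of its attributes. "Event" and "category" are hypothetical names.
public class CategoryBucketAssigner implements BucketAssigner<Event, String> {

    @Override
    public String getBucketId(Event element, Context context) {
        return "category=" + element.category; // becomes a sub-directory
    }

    @Override
    public SimpleVersionedSerializer<String> getSerializer() {
        return SimpleVersionedStringSerializer.INSTANCE;
    }
}

// Hypothetical record type used above.
class Event {
    public String category;
}

// Usage (sketch): StreamingFileSink.forRowFormat(new Path(path), encoder)
//     .withBucketAssigner(new CategoryBucketAssigner())
//     .build();
{code}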
[jira] [Commented] (FLINK-23740) SQL Full Outer Join bug
[ https://issues.apache.org/jira/browse/FLINK-23740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405065#comment-17405065 ] wangzhihao commented on FLINK-23740: This SIM is currently depending on https://issues.apache.org/jira/browse/FLINK-23989 > SQL Full Outer Join bug > --- > > Key: FLINK-23740 > URL: https://issues.apache.org/jira/browse/FLINK-23740 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.13.1, 1.13.2 >Reporter: Fu Kai >Priority: Critical > > Hi team, > We encountered an issue with FULL OUTER JOIN in Flink SQL, which happens > occasionally, at very low probability, where join output records cannot be > correctly updated. We cannot locate the root cause for now by glancing at the > SQL join logic in > [StreamingJoinOperator.|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/join/stream/StreamingJoinOperator.java#L198] > It cannot be stably reproduced, and it does happen with massive data volume. > The reason we suspect it's a FULL OUTER join problem instead of something like > LEFT OUTER join is that the issue only arises after we introduced FULL > OUTER into the join flow. The query we are using is like the following. There > are two join code pieces below; the first one contains solely left joins (though > nested) and there is no issue detected; the second one contains both > left and full outer joins (nested as well), and the problem is that sometimes an > update from the left table A (and other tables before the full outer join > operator) cannot be reflected in the final output. We suspect it could be the > introduction of the full outer join that caused the problem, although at a very low > probability (~10 out of ~30 million). > The root cause of the bug could be something else; the suspicion of FULL OUTER > join is based on the results of our current experiments and observations. > {code:java} > create table A( > k1 int, > k2 int, > k3 int, > k4 int, > k5 int, > PRIMARY KEY (k1, k2, k3, k4, k5) NOT ENFORCED > ) WITH (); > create table B( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table C( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table D( > k1 int, > k2 int, > PRIMARY KEY (k1, k2) NOT ENFORCED > ) WITH (); > // query with left join, no issue detected > select * from A > left outer join > (select * from B > left outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > // query with full outer join combined with left outer join, record updates > from left table A cannot be updated in the final output record sometimes > select * from A > left outer join > (select * from B > full outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > {code} > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23969) Test Pulsar source end 2 end
[ https://issues.apache.org/jira/browse/FLINK-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405067#comment-17405067 ] Liu commented on FLINK-23969: - Hello, [~arvid]. I am willing to write the tests. Can you assign this ticket to me? Thanks. > Test Pulsar source end 2 end > > > Key: FLINK-23969 > URL: https://issues.apache.org/jira/browse/FLINK-23969 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Arvid Heise >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Write a test application using Pulsar Source and execute it in distributed > fashion. Check fault-tolerance by crashing and restarting a TM. > Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-23740) SQL Full Outer Join bug
[ https://issues.apache.org/jira/browse/FLINK-23740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405065#comment-17405065 ] wangzhihao edited comment on FLINK-23740 at 8/26/21, 8:38 AM: -- This SIM is currently depending on FLINK-23989 was (Author: zhihao): This SIM is currently depending on https://issues.apache.org/jira/browse/FLINK-23989 > SQL Full Outer Join bug > --- > > Key: FLINK-23740 > URL: https://issues.apache.org/jira/browse/FLINK-23740 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.13.1, 1.13.2 >Reporter: Fu Kai >Priority: Critical > > Hi team, > We encountered an issue about FULL OUTER JOIN of Flink SQL, which happens > occasionally at very low probability that join output records cannot be > correctly updated. We cannot locate the root cause for now by glancing at the > SQL join logic in > [StreamingJoinOperator.|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/join/stream/StreamingJoinOperator.java#L198] > It cannot be stably reproduced and it does happen with massive data volume. > The reason we suspect it's the FULL OUER join problem instead of others like > LEFT OUTER join is because the issue only arises after we introduced FULL > OUTER into the join flow. The query we are using is like the following. The > are two join code pieces below, the fist one contains solely left join(though > with nested) and there is no issue detected; the second one contains both > left and full outer join(nested as well), and the problem is that sometimes > update from the left table A(and other tables before the full outer join > operator) cannot be reflected in the final output. We suspect it could be the > introduce of full outer join that caused the problem, although at a very low > probability(~10 out of ~30million). > The root cause of the bug could be something else, the suspecting of FULL OUT > join is based on the result of our current experiment and observation. > {code:java} > create table A( > k1 int, > k2 int, > k3 int, > k4 int, > k5 int, > PRIMARY KEY (k1, k2, k3, k4, k5) NOT ENFORCED > ) WITH (); > create table B( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table C( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table D( > k1 int, > k2 int, > PRIMARY KEY (k1, k2) NOT ENFORCED > ) WITH (); > // query with left join, no issue detected > select * from A > left outer join > (select * from B > left outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > // query with full outer join combined with left outer join, record updates > from left table A cannot be updated in the final output record some times > select * from A > left outer join > (select * from B > full outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > {code} > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23989) Flink SQL visibility
[ https://issues.apache.org/jira/browse/FLINK-23989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangzhihao updated FLINK-23989: --- Description: It’s desired to inspect into the internal states generated by SQL especially for debug purpose. We propose to add a new feature in Flink UI console, when the users click on a Vertex of the Job DAG. We shall list all states the vertex has in a separate panel (let’s call *states panel*). On this panel users can query the states with some keys. The returned value share be a human readable string instead of opaque binary. Particularly, we need to expose the states as queryable. But currently the user experience of [queryable states|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/queryable_state/] is cumbersome. Only the serialized value is returned to client and users need to handle deserialization by themselves. What’s worse, the client need to construct the serializer and type manually. To improve this situation. We propose: # Have a new API to find all queryable states associated to a job vertex. This can be done to check against the KvStateLocationRegistry, which store the mapping between JobVertexId and states. # Have a new API to allow users get the types of queryable states: For a register name (String), Queryable Server will return the type of key and value ([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]). # To generate human readable string automatically with API in step 2, we can 1) generate [TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java] from the LogicalType, so as to handle Serde automatically. 2) to convert internal data structures to external data structures to generate printable string. (with converters [DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html] ) With all these steps and some modifications to Web UI/ Rest, we can enable users to query SQL internal states. was: It’s desired to inspect into the internal states generated by SQL especially for debug purpose. We propose to add a new feature in Flink UI console, when the users click on a Vertex of the Job DAG. We shall list all states the vertex has in a separate panel (let’s call *states panel*). On this panel users can query the states with some keys. The returned value share be a human readable string instead of opaque binary. Particularly, we need expose the states as queryable. But currently the user experience of queryable states is cumbersome. Only the serialized value is returned to client and users need to handle deserialization by themselves. What’s worse, the client need to construct the serializer and type manually. To improve this situation. We propose: # Have a new API to find all queryable states associated to a job vertex. This can be done to check against the KvStateLocationRegistry, which store the mapping between JobVertexId and states. # Have a new API to allow users get the types of queryable states: For a register name (String), Queryable Server will return the type of key and value ([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]). 
# To generate human readable string with API in step 2, we can 1) generate [TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java] from the LogicalType, so as to handle Serde automatically. 2) to convert internal data structures to external data structures to generate printable string. (with converters [DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html] ) With all these steps and some modifications to Web UI/ Rest, we can enable users to query SQL internal states. > Flink SQL visibility > - > > Key: FLINK-23989 > URL: https://issues.apache.org/jira/browse/FLINK-23989 > Project: Flink > Issue Type: New Feature > Components: Runtime / Queryable State >Reporter: wangzhihao >Priority: Major > > It’s desired to inspect into the internal states generated by SQL especially > for debug purpose. We propose to add a new feature in Flink UI console, when > the users click on a Vertex of the Job DAG. We shall list all states the > vertex has in a separate panel (let’s call *states panel*). On this panel
[jira] [Commented] (FLINK-11388) Add an OSS RecoverableWriter
[ https://issues.apache.org/jira/browse/FLINK-11388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405071#comment-17405071 ] Alex Z commented on FLINK-11388: I want to know why the PR was closed and not merged. I think the PR is necessary for cloud users in China. > Add an OSS RecoverableWriter > > > Key: FLINK-11388 > URL: https://issues.apache.org/jira/browse/FLINK-11388 > Project: Flink > Issue Type: Sub-task > Components: Connectors / FileSystem >Affects Versions: 1.7.1 >Reporter: wujinhu >Assignee: wujinhu >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > OSS offers persistence only after uploads or multi-part uploads complete. In > order to let streaming jobs use OSS as a sink, we should implement a Recoverable > writer. This writer will snapshot and store multi-part upload information and > recover from that information when a failure occurs -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23990) Replace custom monaco editor component
Ingo Bürk created FLINK-23990: - Summary: Replace custom monaco editor component Key: FLINK-23990 URL: https://issues.apache.org/jira/browse/FLINK-23990 Project: Flink Issue Type: Improvement Components: Runtime / Web Frontend Affects Versions: 1.14.0 Reporter: Ingo Bürk After the upgrade to Angular 12 we should investigate if we can't replace the custom flink-monaco-editor component by the one shipped with ng-zorro. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Airblader commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
Airblader commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696419462 ## File path: flink-runtime-web/web-dashboard/src/app/pages/job/exceptions/job-exceptions.component.less ## @@ -32,12 +32,15 @@ nz-table { flink-monaco-editor { height: calc(~"100vh - 346px"); + &.subtask { height: 300px; } } .expand-td { + display: block; + width: 100%; Review comment: width: 100% is the default for block elements. Why is this needed? ## File path: flink-runtime-web/web-dashboard/src/app/share/common/monaco-editor/monaco-editor.component.ts ## @@ -74,7 +74,7 @@ export class MonacoEditorComponent implements AfterViewInit, OnDestroy { ngAfterViewInit() { if ((window as any).monaco) { - this.setupMonaco(); + setTimeout(() => this.setupMonaco()); Review comment: Could you quickly explain why we need to delay this by a tick now? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Comment Edited] (FLINK-23740) SQL Full Outer Join bug
[ https://issues.apache.org/jira/browse/FLINK-23740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405065#comment-17405065 ] Zhihao Wang edited comment on FLINK-23740 at 8/26/21, 8:44 AM: --- This JIRA is now depending on FLINK-23989 was (Author: zhihao): This SIM is currently depending on FLINK-23989 > SQL Full Outer Join bug > --- > > Key: FLINK-23740 > URL: https://issues.apache.org/jira/browse/FLINK-23740 > Project: Flink > Issue Type: Bug > Components: Table SQL / Runtime >Affects Versions: 1.13.1, 1.13.2 >Reporter: Fu Kai >Priority: Critical > > Hi team, > We encountered an issue about FULL OUTER JOIN of Flink SQL, which happens > occasionally at very low probability that join output records cannot be > correctly updated. We cannot locate the root cause for now by glancing at the > SQL join logic in > [StreamingJoinOperator.|https://github.com/apache/flink/blob/master/flink-table/flink-table-runtime/src/main/java/org/apache/flink/table/runtime/operators/join/stream/StreamingJoinOperator.java#L198] > It cannot be stably reproduced and it does happen with massive data volume. > The reason we suspect it's the FULL OUER join problem instead of others like > LEFT OUTER join is because the issue only arises after we introduced FULL > OUTER into the join flow. The query we are using is like the following. The > are two join code pieces below, the fist one contains solely left join(though > with nested) and there is no issue detected; the second one contains both > left and full outer join(nested as well), and the problem is that sometimes > update from the left table A(and other tables before the full outer join > operator) cannot be reflected in the final output. We suspect it could be the > introduce of full outer join that caused the problem, although at a very low > probability(~10 out of ~30million). > The root cause of the bug could be something else, the suspecting of FULL OUT > join is based on the result of our current experiment and observation. > {code:java} > create table A( > k1 int, > k2 int, > k3 int, > k4 int, > k5 int, > PRIMARY KEY (k1, k2, k3, k4, k5) NOT ENFORCED > ) WITH (); > create table B( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table C( > k1 int, > k2 int, > k3 int, > PRIMARY KEY (k1, k2, k3) NOT ENFORCED > ) WITH (); > create table D( > k1 int, > k2 int, > PRIMARY KEY (k1, k2) NOT ENFORCED > ) WITH (); > // query with left join, no issue detected > select * from A > left outer join > (select * from B > left outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > // query with full outer join combined with left outer join, record updates > from left table A cannot be updated in the final output record some times > select * from A > left outer join > (select * from B > full outer join C > on > B.k1 = C.k1 > B.k2 = C.k2 > B.k3 = C.k3 > ) as BC > on > A.k1 = BC.k1 > A.k2 = BC.k2 > A.k3 = BC.k3 > left outer join D > on > A.k1 = D.k1 > A.k2 = D.k2 > ; > {code} > > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23991) Specifying yarn.staging-dir fails when staging scheme is different from default fs scheme
Junfan Zhang created FLINK-23991: Summary: Specifying yarn.staging-dir fails when staging scheme is different from default fs scheme Key: FLINK-23991 URL: https://issues.apache.org/jira/browse/FLINK-23991 Project: Flink Issue Type: Bug Components: Deployment / YARN Affects Versions: 1.13.2 Reporter: Junfan Zhang -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23989) Flink SQL visibility
[ https://issues.apache.org/jira/browse/FLINK-23989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhihao Wang updated FLINK-23989: Description: It’s desired to inspect into the internal states generated by SQL especially for debug purpose. We propose to add a new feature in Flink UI console, when the users click on a Vertex of the Job DAG. We shall list all states the vertex has in a separate panel (let’s call *states panel*). On this panel users can query the states with some keys. The returned value share be a human readable string instead of opaque binary. Particularly, we need to expose the states as queryable. But currently the user experience of [queryable states|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/queryable_state/] is cumbersome. Only the serialized value is returned to client and users need to handle deserialization by themselves. What’s worse, the client need to construct the serializer and type manually. To improve this situation. We propose: # Have a new API to find all queryable states associated to a job vertex. This can be done to check against the [KvStateLocationRegistry|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/query/KvStateLocationRegistry.java], which store the mapping between JobVertexId and states. # Have a new API to allow users get the types of queryable states: For a register name (String), Queryable Server will return the type of key and value ([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]). # To generate human readable string automatically with API in step 2, we can 1) generate [TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java] from the LogicalType, so as to handle Serde automatically. 2) to convert internal data structures to external data structures to generate printable string. (with converters [DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html] ) With all these steps and some modifications to Web UI/ Rest, we can enable users to query SQL internal states. was: It’s desired to inspect into the internal states generated by SQL especially for debug purpose. We propose to add a new feature in Flink UI console, when the users click on a Vertex of the Job DAG. We shall list all states the vertex has in a separate panel (let’s call *states panel*). On this panel users can query the states with some keys. The returned value share be a human readable string instead of opaque binary. Particularly, we need to expose the states as queryable. But currently the user experience of [queryable states|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/fault-tolerance/queryable_state/] is cumbersome. Only the serialized value is returned to client and users need to handle deserialization by themselves. What’s worse, the client need to construct the serializer and type manually. To improve this situation. We propose: # Have a new API to find all queryable states associated to a job vertex. This can be done to check against the KvStateLocationRegistry, which store the mapping between JobVertexId and states. 
# Have a new API to allow users get the types of queryable states: For a register name (String), Queryable Server will return the type of key and value ([LogicalType|https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/logical/LogicalType.java]). # To generate human readable string automatically with API in step 2, we can 1) generate [TypeSerializer|https://github.com/apache/flink/blob/6c9818323b41a84137c52822d2993df788dbc9bb/flink-core/src/main/java/org/apache/flink/api/common/typeutils/TypeSerializer.java] from the LogicalType, so as to handle Serde automatically. 2) to convert internal data structures to external data structures to generate printable string. (with converters [DataStructureConverter|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/table/data/conversion/DataStructureConverter.html] ) With all these steps and some modifications to Web UI/ Rest, we can enable users to query SQL internal states. > Flink SQL visibility > - > > Key: FLINK-23989 > URL: https://issues.apache.org/jira/browse/FLINK-23989 > Project: Flink > Issue Type: New Feature > Components: Runtime / Queryable State >Reporter: Zhihao Wang >Priority: Major > > It’s desired to inspect into the inte
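As a rough sketch of how steps 2 and 3 of the proposal above could fit together on the client side (assuming the server can ship the value's LogicalType; the utility names below are the current table-runtime classes, but this is not a worked-out design):

{code:java}
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.core.memory.DataInputDeserializer;
import org.apache.flink.table.data.conversion.DataStructureConverter;
import org.apache.flink.table.data.conversion.DataStructureConverters;
import org.apache.flink.table.runtime.typeutils.InternalSerializers;
import org.apache.flink.table.types.DataType;
import org.apache.flink.table.types.logical.LogicalType;
import org.apache.flink.table.types.utils.TypeConversions;

// Sketch under stated assumptions: serializedValue is the raw state value
// returned by the queryable state server, valueType its shipped LogicalType.
public class ReadableStateSketch {
    public static Object toExternal(LogicalType valueType, byte[] serializedValue)
            throws Exception {
        // Step 2/3a: derive a serializer from the logical type, so the
        // client no longer constructs serializers and types manually.
        TypeSerializer<Object> serializer = InternalSerializers.create(valueType);
        Object internal = serializer.deserialize(new DataInputDeserializer(serializedValue));

        // Step 3b: convert the internal data structure to an external,
        // printable one (e.g. a Row whose toString() is human-readable).
        DataType dataType = TypeConversions.fromLogicalToDataType(valueType);
        DataStructureConverter<Object, Object> converter =
                DataStructureConverters.getConverter(dataType);
        converter.open(ReadableStateSketch.class.getClassLoader());
        return converter.toExternal(internal);
    }
}
{code}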
[GitHub] [flink] zuston opened a new pull request #16994: [FLINK-23991] Specifying yarn.staging-dir fails when staging scheme is…
zuston opened a new pull request #16994: URL: https://github.com/apache/flink/pull/16994 … different from default fs scheme ## What is the purpose of the change *(For example: This pull request makes task deployment go through the blob server, rather than through RPC. That way we avoid re-transferring them on each deployment (during recovery).)* ## Brief change log *(for example:)* - *The TaskInfo is stored in the blob store on job creation time as a persistent artifact* - *Deployments RPC transmits only the blob storage reference* - *TaskManagers retrieve the TaskInfo from the blob cache* ## Verifying this change *(Please pick either of the following options)* This change is a trivial rework / code cleanup without any test coverage. *(or)* This change is already covered by existing tests, such as *(please describe tests)*. *(or)* This change added tests and can be verified as follows: *(example:)* - *Added integration tests for end-to-end deployment with large payloads (100MB)* - *Extended integration test for recovery after master (JobManager) failure* - *Added test that validates that TaskInfo is transferred only once across recoveries* - *Manually verified the change by running a 4 node cluser with 2 JobManagers and 4 TaskManagers, a stateful streaming program, and killing one JobManager and two TaskManagers during the execution, verifying that recovery happens correctly.* ## Does this pull request potentially affect one of the following parts: - Dependencies (does it add or upgrade a dependency): (yes / no) - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (yes / no) - The serializers: (yes / no / don't know) - The runtime per-record code paths (performance sensitive): (yes / no / don't know) - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (yes / no / don't know) - The S3 file system connector: (yes / no / don't know) ## Documentation - Does this pull request introduce a new feature? (yes / no) - If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23991) Specifying yarn.staging-dir fails when staging scheme is different from default fs scheme
[ https://issues.apache.org/jira/browse/FLINK-23991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated FLINK-23991: --- Labels: pull-request-available (was: ) > Specifying yarn.staging-dir fails when staging scheme is different from > default fs scheme > > > Key: FLINK-23991 > URL: https://issues.apache.org/jira/browse/FLINK-23991 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.13.2 >Reporter: Junfan Zhang >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Myasuka commented on pull request #16846: [FLINK-23800][rocksdb] Expose live-sst-files-size to RocksDB native metrics
Myasuka commented on pull request #16846: URL: https://github.com/apache/flink/pull/16846#issuecomment-906219947 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23943) "/jobs/:jobid/stop" in REST API can't stop the target job.
[ https://issues.apache.org/jira/browse/FLINK-23943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roc Marshal updated FLINK-23943: Labels: rest (was: ) > "/jobs/:jobid/stop" in REST API can't stop the target job. > -- > > Key: FLINK-23943 > URL: https://issues.apache.org/jira/browse/FLINK-23943 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.13.2 >Reporter: Roc Marshal >Priority: Major > Labels: rest > Attachments: flink-roc-standalonesession-0-MacBook-Pro.local.log, > flink-roc-taskexecutor-0-MacBook-Pro.local.log > > > * "/jobs/:jobid/stop" in REST API can't stop the target job. > * It triggers a savepoint with the given parameters, but the job is not stopped. > * The interface documentation link: > https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-stop -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23943) "/jobs/:jobid/stop" in REST API can't stop the target job.
[ https://issues.apache.org/jira/browse/FLINK-23943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Roc Marshal updated FLINK-23943: Labels: REST_API (was: rest) > "/jobs/:jobid/stop" in REST API can't stop the target job. > -- > > Key: FLINK-23943 > URL: https://issues.apache.org/jira/browse/FLINK-23943 > Project: Flink > Issue Type: Bug > Components: Runtime / REST >Affects Versions: 1.13.2 >Reporter: Roc Marshal >Priority: Major > Labels: REST_API > Attachments: flink-roc-standalonesession-0-MacBook-Pro.local.log, > flink-roc-taskexecutor-0-MacBook-Pro.local.log > > > * "/jobs/:jobid/stop" in REST API can't stop the target job. > * It triggers a savepoint with the given parameters, but the job is not stopped. > * The interface documentation link: > https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-stop -- This message was sent by Atlassian Jira (v8.3.4#803005)
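For reproducing the report above, the documented stop endpoint can be called directly. A minimal sketch using only the JDK HTTP client; the request body field names ("targetDirectory", "drain") follow the linked REST API documentation, while the host, port, job id, and savepoint directory are placeholders:
{code:java}
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StopJobExample {
    public static void main(String[] args) throws Exception {
        String jobId = "<job-id>"; // placeholder: id of the running job
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/jobs/" + jobId + "/stop"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"targetDirectory\": \"file:///tmp/savepoints\", \"drain\": false}"))
                .build();
        // The endpoint replies with a trigger id; whether the job actually
        // stopped (the behavior questioned in the ticket) must be checked separately.
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
{code}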
[GitHub] [flink] flinkbot commented on pull request #16994: [FLINK-23991] Specifying yarn.staging-dir fail when staging scheme is…
flinkbot commented on pull request #16994: URL: https://github.com/apache/flink/pull/16994#issuecomment-906225993 Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community to review your pull request. We will use this comment to track the progress of the review. ## Automated Checks Last check on commit 3c616589d6e315d0f29513fb0c7132f67b11d3bb (Thu Aug 26 09:01:10 UTC 2021) **Warnings:** * No documentation files were touched! Remember to keep the Flink docs up to date! * **This pull request references an unassigned [Jira ticket](https://issues.apache.org/jira/browse/FLINK-23991).** According to the [code contribution guide](https://flink.apache.org/contributing/contribute-code.html), tickets need to be assigned before starting with the implementation work. Mention the bot in a comment to re-run the automated checks. ## Review Progress * ❓ 1. The [description] looks good. * ❓ 2. There is [consensus] that the contribution should go into Flink. * ❓ 3. Needs [attention] from. * ❓ 4. The change fits into the overall [architecture]. * ❓ 5. Overall code [quality] is good. Please see the [Pull Request Review Guide](https://flink.apache.org/contributing/reviewing-prs.html) for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required. Bot commands The @flinkbot bot supports the following commands: - `@flinkbot approve description` to approve one or more aspects (aspects: `description`, `consensus`, `architecture` and `quality`) - `@flinkbot approve all` to approve all aspects - `@flinkbot approve-until architecture` to approve everything until `architecture` - `@flinkbot attention @username1 [@username2 ..]` to require somebody's attention - `@flinkbot disapprove architecture` to remove an approval you gave earlier -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] dianfu closed pull request #16993: [FLINK-23818][python][docs] Add documentation about tgz files for python archives
dianfu closed pull request #16993: URL: https://github.com/apache/flink/pull/16993 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Closed] (FLINK-23818) Add documentation about tgz files support for python archives
[ https://issues.apache.org/jira/browse/FLINK-23818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu closed FLINK-23818. --- Resolution: Fixed Merged to master via 5289b0e3ce7fad03225d064afceda35462454fe2 > Add documentation about tgz files support for python archives > - > > Key: FLINK-23818 > URL: https://issues.apache.org/jira/browse/FLINK-23818 > Project: Flink > Issue Type: Improvement > Components: API / Python, Documentation >Reporter: Dian Fu >Assignee: Dian Fu >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23819) Testing tgz file for python archives
[ https://issues.apache.org/jira/browse/FLINK-23819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dian Fu updated FLINK-23819: Description: This feature adds support for tar.gz files as python archives; previously, only zip files were supported. This feature can be tested as follows:

1) Build the PyFlink packages from source according to the documentation: [https://ci.apache.org/projects/flink/flink-docs-master/docs/flinkdev/building/#build-pyflink]

2) Prepare a tar.gz file which contains a conda Python virtual environment:
- Install MiniConda in your environment: [https://conda.io/projects/conda/en/latest/user-guide/install/macos.html]
- Install conda-pack: [https://conda.github.io/conda-pack/]
- Create the conda environment and install the PyFlink built in the step above into it:
{code}
conda create --name myenv
conda activate myenv
conda install python=3.8
python -m pip install ~/code/src/apache/flink/flink-python/apache-flink-libraries/dist/apache-flink-libraries-1.14.dev0.tar.gz
python -m pip install ~/code/src/apache/flink/flink-python/dist/apache_flink-1.14.dev0-cp38-cp38-macosx_10_9_x86_64.whl
{code}
- You can verify the packages installed in the conda env **myenv** as follows:
{code}
conda list -n myenv
{code}
- Package the conda virtual environment into a tgz file (this generates a file named myenv.tar.gz):
{code}
conda pack -n myenv
{code}

3) Prepare a PyFlink job; here is an example:
{code:python}
import time

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment, CoMapFunction
from pyflink.table import StreamTableEnvironment, DataTypes, Schema


def test_chaining():
    env = StreamExecutionEnvironment.get_execution_environment()
    t_env = StreamTableEnvironment.create(stream_execution_environment=env)

    # 1. create source Table
    t_env.execute_sql("""
        CREATE TABLE datagen (
            id INT,
            data STRING
        ) WITH (
            'connector' = 'datagen',
            'rows-per-second' = '100',
            'fields.id.kind' = 'sequence',
            'fields.id.start' = '1',
            'fields.id.end' = '1000'
        )
    """)

    # 2. create sink Tables
    t_env.execute_sql("""
        CREATE TABLE print (
            id BIGINT,
            data STRING,
            flag STRING
        ) WITH (
            'connector' = 'blackhole'
        )
    """)
    t_env.execute_sql("""
        CREATE TABLE print_2 (
            id BIGINT,
            data STRING,
            flag STRING
        ) WITH (
            'connector' = 'blackhole'
        )
    """)

    # 3. query from source table and perform calculations
    # create a Table from a Table API query:
    source_table = t_env.from_path("datagen")

    ds = t_env.to_append_stream(
        source_table,
        Types.ROW([Types.INT(), Types.STRING()]))

    ds1 = ds.map(lambda i: (i[0] * i[0], i[1]))
    ds2 = ds.map(lambda i: (i[0], i[1][2:]))

    class MyCoMapFunction(CoMapFunction):

        def map1(self, value):
            return value

        def map2(self, value):
            return value

    ds3 = ds1.connect(ds2).map(
        MyCoMapFunction(),
        output_type=Types.TUPLE([Types.LONG(), Types.STRING()]))

    ds4 = ds3.map(
        lambda i: (i[0], i[1], "left"),
        output_type=Types.TUPLE([Types.LONG(), Types.STRING(), Types.STRING()]))
    ds5 = ds3.map(lambda i: (i[0], i[1], "right")) \
        .map(lambda i: i,
             output_type=Types.TUPLE([Types.LONG(), Types.STRING(), Types.STRING()]))

    schema = Schema.new_builder() \
        .column("f0", DataTypes.BIGINT()) \
        .column("f1", DataTypes.STRING()) \
        .column("f2", DataTypes.STRING()) \
        .build()

    result_table_3 = t_env.from_data_stream(ds4, schema)
    statement_set = t_env.create_statement_set()
    statement_set.add_insert("print", result_table_3)

    result_table_4 = t_env.from_data_stream(ds5, schema)
    statement_set.add_insert("print_2", result_table_4)

    statement_set.execute()


if __name__ == "__main__":
    start_ts = time.time()
    test_chaining()
    end_ts = time.time()
    print("--- %s seconds ---" % (end_ts - start_ts))
{code}

4) Submit the PyFlink job using the generated myenv.tar.gz:
{code}
./bin/flink run -d -m localhost:8081 -py test_pyflink.py -pyarch myenv.tar.gz#myenv -pyexec myenv/bin/python -pyclientexec myenv/bin/python
{code}

5) The job should run normally, and you should see log entries like the following in the TaskManager log file:
{code}
2021-08-26 11:14:19,295 INFO org.apache.beam.runners.fnexecution.environment.ProcessEnvironmentFactory [] - Still waiting for startup of environment '/private/var/folders/jq/brl84gld47ngmcfyvwh2gtj4gp/T/python-dist-a61682a6-79b0-443c-b3c8-f9dade55e5d6/python-archives/myenv/lib/python3.8/site-packages/pyflink/bin/pyflink-udf-runner.sh' for worker id 1-1
{code}
It d
[GitHub] [flink] flinkbot edited a comment on pull request #16723: [FLINK-22885][table] Supports 'SHOW COLUMNS' syntax.
flinkbot edited a comment on pull request #16723: URL: https://github.com/apache/flink/pull/16723#issuecomment-893364145 ## CI report: * feca59d37ac86df05f11b8ecec57b93e4356f2e8 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22871) * e547c306fd49c553ac3dc0c58528eaff47510d26 UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16846: [FLINK-23800][rocksdb] Expose live-sst-files-size to RocksDB native metrics
flinkbot edited a comment on pull request #16846: URL: https://github.com/apache/flink/pull/16846#issuecomment-899526733 ## CI report: * 57e020c5f8de174d9e2f55983fb51936e1657f3e Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22872) Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22885) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16861: [FLINK-23808][checkpoint] Bypass operators when advanceToEndOfEventTime for both legacy and new source tasks
flinkbot edited a comment on pull request #16861: URL: https://github.com/apache/flink/pull/16861#issuecomment-900179085 ## CI report: * 7165c3df04d5b5520cced9f44b34d7bd0ad22c02 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22879) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16900: [FLINK-23877][connector/pulsar] Provide an embedded pulsar broker for testing.
flinkbot edited a comment on pull request #16900: URL: https://github.com/apache/flink/pull/16900#issuecomment-902199681 ## CI report: * f3180e650ed121eb4706dffbd4584d2a08189e24 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22876) * 958dbafc8a14a0f46e54db04fd9cbd0f21f5d57f Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22882) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot edited a comment on pull request #16971: [FLINK-23794][tests] Fix InMemoryReporter instantiation and memory consumption.
flinkbot edited a comment on pull request #16971: URL: https://github.com/apache/flink/pull/16971#issuecomment-904969319 ## CI report: * 7960af7d4e53785f300d77acdbf372f9c91d3204 UNKNOWN * ca12cee214cb00e89f8175cc0c3d8645ed350d51 Azure: [FAILURE](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22880) * cc30231665f03047e4098e2b8760dccce6cf9e43 Azure: [PENDING](https://dev.azure.com/apache-flink/98463496-1af2-4620-8eab-a2ecc1a2e6fe/_build/results?buildId=22883) Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] flinkbot commented on pull request #16994: [FLINK-23991] Specifying yarn.staging-dir fail when staging scheme is…
flinkbot commented on pull request #16994: URL: https://github.com/apache/flink/pull/16994#issuecomment-906233672 ## CI report: * 3c616589d6e315d0f29513fb0c7132f67b11d3bb UNKNOWN Bot commands The @flinkbot bot supports the following commands: - `@flinkbot run travis` re-run the last Travis build - `@flinkbot run azure` re-run the last Azure build -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yangjunhan commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
yangjunhan commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696449877 ## File path: flink-runtime-web/web-dashboard/src/app/share/common/monaco-editor/monaco-editor.component.ts ## @@ -74,7 +74,7 @@ export class MonacoEditorComponent implements AfterViewInit, OnDestroy { ngAfterViewInit() { if ((window as any).monaco) { - this.setupMonaco(); + setTimeout(() => this.setupMonaco()); Review comment: The method `setupMonaco()` tries to get the `elementRef.nativeElement` of the monaco-editor component and create an editor instance based on it. However, the editor component is not yet rendered when expanding the table row on the page **job -> exceptions -> exception history**, so the creation of the editor instance must be delayed to the next tick, after its template container is properly rendered. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yangjunhan commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
yangjunhan commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696449877 ## File path: flink-runtime-web/web-dashboard/src/app/share/common/monaco-editor/monaco-editor.component.ts ## @@ -74,7 +74,7 @@ export class MonacoEditorComponent implements AfterViewInit, OnDestroy { ngAfterViewInit() { if ((window as any).monaco) { - this.setupMonaco(); + setTimeout(() => this.setupMonaco()); Review comment: The method `setupMonaco()` tries to get the `elementRef.nativeElement` of the monaco-editor component and create an editor instance based on it. However, the editor component is not yet rendered when expanding the table row on the page **job -> exceptions -> exception history**, so the creation of the editor instance must be delayed to the next tick, after it can properly get the `elementRef`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] gaoyunhaii commented on pull request #16861: [FLINK-23808][checkpoint] Bypass operators when advanceToEndOfEventTime for both legacy and new source tasks
gaoyunhaii commented on pull request #16861: URL: https://github.com/apache/flink/pull/16861#issuecomment-906240714 @flinkbot run azure -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski commented on a change in pull request #16988: [FLINK-23458][docs] Added the network buffer documentation along wit…
pnowojski commented on a change in pull request #16988: URL: https://github.com/apache/flink/pull/16988#discussion_r696380229 ## File path: docs/content/docs/ops/state/network_buffer.md ## @@ -0,0 +1,108 @@ +--- +title: "Network Buffers" +weight: 100 +type: docs +aliases: + - /ops/state/network_buffer.html +--- + + +# Network buffer + +## Overview + +Each record in flink is sent to the next subtask not individually but compounded in Network buffer, +the smallest unit inter subtask communication. Also, in order to keep high throughput, Review comment: ```suggestion the smallest unit for communication between subtasks. Also, in order to keep consistent high throughput, ``` ## File path: docs/content/docs/ops/state/network_buffer.md ## @@ -0,0 +1,108 @@ +--- +title: "Network Buffers" +weight: 100 +type: docs +aliases: + - /ops/state/network_buffer.html +--- + + +# Network buffer + +## Overview + +Each record in flink is sent to the next subtask not individually but compounded in Network buffer, +the smallest unit inter subtask communication. Also, in order to keep high throughput, +flink supports the network buffers queue(so called in-flight data) as for output stream as well as for the input stream. Review comment: ```suggestion Flink uses the network buffer queues (so called in-flight data) both on the output as well as on the input side. ``` ## File path: docs/content/docs/ops/state/network_buffer.md ## @@ -0,0 +1,108 @@ +--- +title: "Network Buffers" +weight: 100 +type: docs +aliases: + - /ops/state/network_buffer.html +--- + + +# Network buffer + +## Overview + +Each record in flink is sent to the next subtask not individually but compounded in Network buffer, +the smallest unit inter subtask communication. Also, in order to keep high throughput, +flink supports the network buffers queue(so called in-flight data) as for output stream as well as for the input stream. +In the result each subtask have an input queue waiting for the consumption and an output queue +waiting for sending to the next subtask. Having a full input queue can guarantee busyness of +the subtask and high level of the throughput but it has negative effect for the latency and +what more important for the checkpoint time. The long checkpoint time issue can be explained as +long waiting in network buffers queue(the subtask requires some time to consume all data from +the queue before it can handle the checkpoint barrier), as well as the data skew - +when different input queue for one subtask contain different number of the data which leads to +increasing the checkpoint alignment time and as result to increasing the whole checkpoint time. + +## Buffer debloat + +The buffer debloat is mechanism which automatically adjust the buffer size in order to keep configured +checkpoint time along with high throughput. More precisely, the buffer debloat calculate the maximum possible throughput +(the maximum throughput which would be if the subtask was always busy) +for the subtask and adapt the buffer size in such a way that the time for consumption of in-flight data would be equal to configured one. + +The most useful settings: +* The buffer debloat can be enabled by setting the property `taskmanager.network.memory.buffer-debloat.enabled` to `true`. +* Desirable time of the consumption in-flight data can be configured by setting `taskmanager.network.memory.buffer-debloat.target` to `duration`. 
+ +The settings for the optimization: + +* `taskmanager.network.memory.buffer-debloat.period` - the minimum time between buffer size recalculation. +It can be decreased if the throughput is pretty volatile by some reason and the high speed on the changes required. +* `taskmanager.network.memory.buffer-debloat.samples` - The measure of calculation smoothing. +It can be decreased if the buffer size is changed undesirable slower compare to the instant throughput or it can be increased otherwise. Review comment: ```suggestion * `taskmanager.network.memory.buffer-debloat.samples` - Adjust the number of samples over which throughput measurements are averaged out. The frequency of the collected samples can be adjusted via `taskmanager.network.memory.buffer-debloat.period`. The fewer samples, the faster reaction time of the debloating mechanism, but a higher chance of a sudden spike or drop of the throughput to cause the buffer debloating to miscalculate the best amount of the in-flight data. ``` ## File path: docs/content/docs/ops/state/network_buffer.md ## @@ -0,0 +1,108 @@ +--- +title: "Network Buffers" +weight: 100 +type: docs +aliases: + - /ops/state/network_buffer.html +--- + + +# Network buffer + +## Overview + +Each record in flink is sent to the next subtask not individually but compounded in Network buffer, +the smallest unit inter subtask communication. Also, in order to keep high th
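As context for the settings discussed in this review, a short sketch of what enabling the mechanism could look like programmatically; the option keys are the ones quoted above, while the values are purely illustrative, not recommendations:
```java
import org.apache.flink.configuration.Configuration;

public class BufferDebloatConfigSketch {
    public static Configuration bufferDebloatConfig() {
        Configuration conf = new Configuration();
        // Turn on automatic buffer-size adjustment.
        conf.setString("taskmanager.network.memory.buffer-debloat.enabled", "true");
        // Desired time to consume the in-flight data.
        conf.setString("taskmanager.network.memory.buffer-debloat.target", "1 s");
        // Minimum time between buffer size recalculations.
        conf.setString("taskmanager.network.memory.buffer-debloat.period", "200 ms");
        // Number of throughput samples averaged for the calculation.
        conf.setString("taskmanager.network.memory.buffer-debloat.samples", "20");
        return conf;
    }
}
```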
[GitHub] [flink] pnowojski commented on pull request #16988: [FLINK-23458][docs] Added the network buffer documentation along wit…
pnowojski commented on pull request #16988: URL: https://github.com/apache/flink/pull/16988#issuecomment-906244628 Also, as we discussed offline, I think the best place for these docs is as a sibling of this page: https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_tuning/ And can you add cross-references from the configuration and memory configuration docs to this network memory tuning page? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] yangjunhan commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
yangjunhan commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696457528 ## File path: flink-runtime-web/web-dashboard/src/app/share/common/monaco-editor/monaco-editor.component.ts ## @@ -74,7 +74,7 @@ export class MonacoEditorComponent implements AfterViewInit, OnDestroy { ngAfterViewInit() { if ((window as any).monaco) { - this.setupMonaco(); + setTimeout(() => this.setupMonaco()); Review comment: Previously it worked because the rendering and `setupMonaco()` invocation could be done in the same tick under the older versions of Angular and NG-Zorro-Antd. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Updated] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Piotr Nowojski updated FLINK-23466: --- Priority: Blocker (was: Critical) > UnalignedCheckpointITCase hangs on Azure > > > Key: FLINK-23466 > URL: https://issues.apache.org/jira/browse/FLINK-23466 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Yingjie Cao >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] yangjunhan commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
yangjunhan commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696458715 ## File path: flink-runtime-web/web-dashboard/src/app/pages/job/exceptions/job-exceptions.component.less ## @@ -32,12 +32,15 @@ nz-table { flink-monaco-editor { height: calc(~"100vh - 346px"); + &.subtask { height: 300px; } } .expand-td { + display: block; + width: 100%; Review comment: My mistake. I was experimenting with the stylesheet and forgot to remove the redundant style. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (FLINK-23466) UnalignedCheckpointITCase hangs on Azure
[ https://issues.apache.org/jira/browse/FLINK-23466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405097#comment-17405097 ] Piotr Nowojski commented on FLINK-23466: I've bumped the priority to release blocker, as I think we can hit this problem even with the default configuration options if there are simply not enough buffers in the buffer pool. Before FLINK-16641, Flink would fail if there were no exclusive buffers for the input channels. Now that's allowed, but it can lead to this deadlock. > UnalignedCheckpointITCase hangs on Azure > > > Key: FLINK-23466 > URL: https://issues.apache.org/jira/browse/FLINK-23466 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Dawid Wysakowicz >Assignee: Yingjie Cao >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=20813&view=logs&j=a57e0635-3fad-5b08-57c7-a4142d7d6fa9&t=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7&l=16016 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] Airblader commented on a change in pull request #16980: [FLINK-23111][runtime-web] Fix monaco editor async setup
Airblader commented on a change in pull request #16980: URL: https://github.com/apache/flink/pull/16980#discussion_r696462684 ## File path: flink-runtime-web/web-dashboard/src/app/share/common/monaco-editor/monaco-editor.component.ts ## @@ -74,7 +74,7 @@ export class MonacoEditorComponent implements AfterViewInit, OnDestroy { ngAfterViewInit() { if ((window as any).monaco) { - this.setupMonaco(); + setTimeout(() => this.setupMonaco()); Review comment: Thanks. I'm OK with this as a hotfix; however, it's more of a workaround, and the component should probably be smart enough to only rely on its dimensions once it has actually been rendered, and to relayout itself when they change. I've also raised a FLINK issue to investigate replacing this custom component altogether with the one now included in ng-zorro. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Assigned] (FLINK-22002) AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail because of submitting task time-out.
[ https://issues.apache.org/jira/browse/FLINK-22002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann reassigned FLINK-22002: - Assignee: Till Rohrmann > AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail > because of submitting task time-out. > - > > Key: FLINK-22002 > URL: https://issues.apache.org/jira/browse/FLINK-22002 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.12.2, 1.14.0 >Reporter: Guowei Ma >Assignee: Till Rohrmann >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15634&view=logs&j=955770d3-1fed-5a0a-3db6-0c7554c910cb&t=14447d61-56b4-5000-80c1-daa459247f6a&l=6424 > {code:java} > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase > 2021-03-29T00:27:25.3406344Z [ERROR] > testSingleAggOnTable_HashAgg_WithLocalAgg(org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase) > Time elapsed: 21.908 s <<< ERROR! > 2021-03-29T00:27:25.3407190Z java.lang.RuntimeException: Failed to fetch next > result > 2021-03-29T00:27:25.3407792Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109) > 2021-03-29T00:27:25.3408502Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80) > 2021-03-29T00:27:25.3409188Z at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:117) > 2021-03-29T00:27:25.3416724Z at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:350) > 2021-03-29T00:27:25.3417510Z at > java.util.Iterator.forEachRemaining(Iterator.java:115) > 2021-03-29T00:27:25.3418416Z at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:108) > 2021-03-29T00:27:25.3419031Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:298) > 2021-03-29T00:27:25.3419657Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:138) > 2021-03-29T00:27:25.3420638Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:104) > 2021-03-29T00:27:25.3421384Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:182) > 2021-03-29T00:27:25.3422284Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg(AggregateReduceGroupingITCase.scala:135) > 2021-03-29T00:27:25.3422975Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-03-29T00:27:25.3423504Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-03-29T00:27:25.3424298Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-03-29T00:27:25.3425229Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-03-29T00:27:25.3426107Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2021-03-29T00:27:25.3426756Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2021-03-29T00:27:25.3427743Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 
2021-03-29T00:27:25.3428520Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2021-03-29T00:27:25.3429128Z at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2021-03-29T00:27:25.3429715Z at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2021-03-29T00:27:25.3433435Z at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > 2021-03-29T00:27:25.3433977Z at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2021-03-29T00:27:25.3434476Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2021-03-29T00:27:25.3435607Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 2021-03-29T00:27:25.3436460Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 2021-03-29T00:27:25.3437054Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2021-03-29T00:27:25.3437673Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2021-03-29T00:27:25.3438765Z at > org.junit.runners.Paren
[jira] [Commented] (FLINK-22002) AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail because of submitting task time-out.
[ https://issues.apache.org/jira/browse/FLINK-22002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405103#comment-17405103 ] Till Rohrmann commented on FLINK-22002: --- The test failure has a different cause {code} Aug 26 00:51:29 [ERROR] testSingleAggOnTable_HashAgg_WithoutLocalAgg Time elapsed: 29.979 s <<< ERROR! Aug 26 00:51:29 java.lang.RuntimeException: Failed to fetch next result Aug 26 00:51:29 at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109) Aug 26 00:51:29 at org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80) Aug 26 00:51:29 at org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:370) Aug 26 00:51:29 at java.util.Iterator.forEachRemaining(Iterator.java:115) Aug 26 00:51:29 at org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:109) Aug 26 00:51:29 at org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:300) Aug 26 00:51:29 at org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:140) Aug 26 00:51:29 at org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:106) Aug 26 00:51:29 at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:179) Aug 26 00:51:29 at org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithoutLocalAgg(AggregateReduceGroupingITCase.scala:143) Aug 26 00:51:29 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) Aug 26 00:51:29 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) Aug 26 00:51:29 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) Aug 26 00:51:29 at java.lang.reflect.Method.invoke(Method.java:498) Aug 26 00:51:29 at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) Aug 26 00:51:29 at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) Aug 26 00:51:29 at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) Aug 26 00:51:29 at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) Aug 26 00:51:29 at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) Aug 26 00:51:29 at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) Aug 26 00:51:29 at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) Aug 26 00:51:29 at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) Aug 26 00:51:29 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) Aug 26 00:51:29 at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) Aug 26 00:51:29 at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) Aug 26 00:51:29 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) Aug 26 00:51:29 at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) Aug 26 00:51:29 at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) Aug 26 00:51:29 at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) Aug 26 00:51:29 at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) Aug 26 
00:51:29 at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) Aug 26 00:51:29 at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) Aug 26 00:51:29 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) Aug 26 00:51:29 at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) Aug 26 00:51:29 at org.junit.rules.RunRules.evaluate(RunRules.java:20) Aug 26 00:51:29 at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) {code} Looking at the logs we do see the following: {code} 00:50:40,740 [flink-akka.actor.default-dispatcher-10] INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Offer reserved slots to the leader of job e3bd09da12185c0e070507670d2939ee. 00:50:40,740 [flink-akka.actor.default-dispatcher-10] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Custom Source -> SourceConversion(table=[default_catalog.default_database.T6], fields=[a6, b6, c6, d6, e6, f6]) -> Calc(select=[a6, d6, b6, c6, e6]) (1/1) (a4a17596e5831da44e9ba35
[jira] [Updated] (FLINK-22002) AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail because of submitting task time-out.
[ https://issues.apache.org/jira/browse/FLINK-22002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-22002: -- Fix Version/s: 1.13.3 > AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail > because of submitting task time-out. > - > > Key: FLINK-22002 > URL: https://issues.apache.org/jira/browse/FLINK-22002 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.12.2, 1.14.0 >Reporter: Guowei Ma >Assignee: Till Rohrmann >Priority: Major > Labels: test-stability > Fix For: 1.14.0, 1.13.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15634&view=logs&j=955770d3-1fed-5a0a-3db6-0c7554c910cb&t=14447d61-56b4-5000-80c1-daa459247f6a&l=6424 > {code:java} > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase > 2021-03-29T00:27:25.3406344Z [ERROR] > testSingleAggOnTable_HashAgg_WithLocalAgg(org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase) > Time elapsed: 21.908 s <<< ERROR! > 2021-03-29T00:27:25.3407190Z java.lang.RuntimeException: Failed to fetch next > result > 2021-03-29T00:27:25.3407792Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109) > 2021-03-29T00:27:25.3408502Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80) > 2021-03-29T00:27:25.3409188Z at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:117) > 2021-03-29T00:27:25.3416724Z at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:350) > 2021-03-29T00:27:25.3417510Z at > java.util.Iterator.forEachRemaining(Iterator.java:115) > 2021-03-29T00:27:25.3418416Z at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:108) > 2021-03-29T00:27:25.3419031Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:298) > 2021-03-29T00:27:25.3419657Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:138) > 2021-03-29T00:27:25.3420638Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:104) > 2021-03-29T00:27:25.3421384Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:182) > 2021-03-29T00:27:25.3422284Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg(AggregateReduceGroupingITCase.scala:135) > 2021-03-29T00:27:25.3422975Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-03-29T00:27:25.3423504Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-03-29T00:27:25.3424298Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-03-29T00:27:25.3425229Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-03-29T00:27:25.3426107Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2021-03-29T00:27:25.3426756Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2021-03-29T00:27:25.3427743Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 
2021-03-29T00:27:25.3428520Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2021-03-29T00:27:25.3429128Z at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2021-03-29T00:27:25.3429715Z at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2021-03-29T00:27:25.3433435Z at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > 2021-03-29T00:27:25.3433977Z at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2021-03-29T00:27:25.3434476Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2021-03-29T00:27:25.3435607Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 2021-03-29T00:27:25.3436460Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 2021-03-29T00:27:25.3437054Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2021-03-29T00:27:25.3437673Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2021-03-29T00:27:25.3438765Z at > org.junit.runners.Parent
[jira] [Updated] (FLINK-22002) AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail because of submitting task time-out.
[ https://issues.apache.org/jira/browse/FLINK-22002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Till Rohrmann updated FLINK-22002: -- Affects Version/s: (was: 1.12.2) 1.13.2 > AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg fail > because of submitting task time-out. > - > > Key: FLINK-22002 > URL: https://issues.apache.org/jira/browse/FLINK-22002 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination >Affects Versions: 1.14.0, 1.13.2 >Reporter: Guowei Ma >Assignee: Till Rohrmann >Priority: Major > Labels: test-stability > Fix For: 1.14.0, 1.13.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=15634&view=logs&j=955770d3-1fed-5a0a-3db6-0c7554c910cb&t=14447d61-56b4-5000-80c1-daa459247f6a&l=6424 > {code:java} > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase > 2021-03-29T00:27:25.3406344Z [ERROR] > testSingleAggOnTable_HashAgg_WithLocalAgg(org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase) > Time elapsed: 21.908 s <<< ERROR! > 2021-03-29T00:27:25.3407190Z java.lang.RuntimeException: Failed to fetch next > result > 2021-03-29T00:27:25.3407792Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.nextResultFromFetcher(CollectResultIterator.java:109) > 2021-03-29T00:27:25.3408502Z at > org.apache.flink.streaming.api.operators.collect.CollectResultIterator.hasNext(CollectResultIterator.java:80) > 2021-03-29T00:27:25.3409188Z at > org.apache.flink.table.planner.sinks.SelectTableSinkBase$RowIteratorWrapper.hasNext(SelectTableSinkBase.java:117) > 2021-03-29T00:27:25.3416724Z at > org.apache.flink.table.api.internal.TableResultImpl$CloseableRowIteratorWrapper.hasNext(TableResultImpl.java:350) > 2021-03-29T00:27:25.3417510Z at > java.util.Iterator.forEachRemaining(Iterator.java:115) > 2021-03-29T00:27:25.3418416Z at > org.apache.flink.util.CollectionUtil.iteratorToList(CollectionUtil.java:108) > 2021-03-29T00:27:25.3419031Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.executeQuery(BatchTestBase.scala:298) > 2021-03-29T00:27:25.3419657Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.check(BatchTestBase.scala:138) > 2021-03-29T00:27:25.3420638Z at > org.apache.flink.table.planner.runtime.utils.BatchTestBase.checkResult(BatchTestBase.scala:104) > 2021-03-29T00:27:25.3421384Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable(AggregateReduceGroupingITCase.scala:182) > 2021-03-29T00:27:25.3422284Z at > org.apache.flink.table.planner.runtime.batch.sql.agg.AggregateReduceGroupingITCase.testSingleAggOnTable_HashAgg_WithLocalAgg(AggregateReduceGroupingITCase.scala:135) > 2021-03-29T00:27:25.3422975Z at > sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > 2021-03-29T00:27:25.3423504Z at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > 2021-03-29T00:27:25.3424298Z at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > 2021-03-29T00:27:25.3425229Z at > java.lang.reflect.Method.invoke(Method.java:498) > 2021-03-29T00:27:25.3426107Z at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > 2021-03-29T00:27:25.3426756Z at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > 2021-03-29T00:27:25.3427743Z at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > 
2021-03-29T00:27:25.3428520Z at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > 2021-03-29T00:27:25.3429128Z at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > 2021-03-29T00:27:25.3429715Z at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > 2021-03-29T00:27:25.3433435Z at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) > 2021-03-29T00:27:25.3433977Z at > org.junit.rules.RunRules.evaluate(RunRules.java:20) > 2021-03-29T00:27:25.3434476Z at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > 2021-03-29T00:27:25.3435607Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > 2021-03-29T00:27:25.3436460Z at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > 2021-03-29T00:27:25.3437054Z at > org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > 2021-03-29T00:27:25.3437673Z at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > 2021-03-29T00:2
[jira] [Assigned] (FLINK-23969) Test Pulsar source end 2 end
[ https://issues.apache.org/jira/browse/FLINK-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23969: --- Assignee: Liu > Test Pulsar source end 2 end > > > Key: FLINK-23969 > URL: https://issues.apache.org/jira/browse/FLINK-23969 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Arvid Heise >Assignee: Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Write a test application using Pulsar Source and execute it in distributed > fashion. Check fault-tolerance by crashing and restarting a TM. > Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23969) Test Pulsar source end 2 end
[ https://issues.apache.org/jira/browse/FLINK-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17405107#comment-17405107 ] Arvid Heise commented on FLINK-23969: - I assigned you; please wait for the documentation (it should be done this week) and check if it can be improved. > Test Pulsar source end 2 end > > > Key: FLINK-23969 > URL: https://issues.apache.org/jira/browse/FLINK-23969 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Arvid Heise >Assignee: Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Write a test application using Pulsar Source and execute it in distributed > fashion. Check fault-tolerance by crashing and restarting a TM. > Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [flink] pnowojski commented on a change in pull request #16905: [FLINK-23884][checkpoint] Fix the concurrent problem between triggering savepoint with drain and task finishing
pnowojski commented on a change in pull request #16905: URL: https://github.com/apache/flink/pull/16905#discussion_r696469744 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/MultipleInputStreamTask.java ## @@ -244,6 +234,28 @@ protected void createInputProcessor( return resultFuture; } +private CompletableFuture<Void> triggerStopWithSavepointWithDrainAsync( +CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions) { + +List<CompletableFuture<Void>> sourceFinishedFutures = new ArrayList<>(); +mainMailboxExecutor.execute( +() -> { + setSynchronousSavepoint(checkpointMetaData.getCheckpointId(), true); +operatorChain +.getSourceTaskInputs() +.forEach(s -> sourceFinishedFutures.add(s.getOperator().stop())); +}, +"stop chained Flip-27 source for stop-with-savepoint --drain"); + +CompletableFuture<Void> sourcesStopped = FutureUtils.waitForAll(sourceFinishedFutures); Review comment: Alternatively, you could have implemented a pattern similar to the one you used in the `SourceOperatorStreamTask`. You could have `FutureUtils.waitForAll(sourceFinishedFutures)` inside the mailbox action, and inside the mailbox action use `FutureUtils.forward(FutureUtils.waitForAll(sourceFinishedFutures), sourcesStopped);`. ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/MultipleInputStreamTask.java ## @@ -244,6 +234,28 @@ protected void createInputProcessor( return resultFuture; } +private CompletableFuture<Void> triggerStopWithSavepointWithDrainAsync( +CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions) { + +List<CompletableFuture<Void>> sourceFinishedFutures = new ArrayList<>(); +mainMailboxExecutor.execute( +() -> { + setSynchronousSavepoint(checkpointMetaData.getCheckpointId(), true); +operatorChain +.getSourceTaskInputs() +.forEach(s -> sourceFinishedFutures.add(s.getOperator().stop())); +}, +"stop chained Flip-27 source for stop-with-savepoint --drain"); + +CompletableFuture<Void> sourcesStopped = FutureUtils.waitForAll(sourceFinishedFutures); Review comment: You still have a race condition: you are accessing the non-thread-safe `sourceFinishedFutures` from two different threads. In my previous suggestion I meant to use something like `Future<T> MailboxExecutor#submit(java.util.concurrent.Callable<T>, java.lang.String)`. You can submit a callable that will collect the source-stopping futures and return them as a `Future<List<CompletableFuture<Void>>>`, which you could chain on in the line below. One caveat is that you might need to either modify `submit()` to return `CompletableFuture`, or introduce another version that does just that. `MailboxExecutor` is `@PublicEvolving`, probably not used very widely, and only by power users, so we should be OK with changing its interface. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [flink] pnowojski commented on a change in pull request #16905: [FLINK-23884][checkpoint] Fix the concurrent problem between triggering savepoint with drain and task finishing
pnowojski commented on a change in pull request #16905: URL: https://github.com/apache/flink/pull/16905#discussion_r696468015 ## File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/MultipleInputStreamTask.java ## @@ -244,6 +234,28 @@ protected void createInputProcessor( return resultFuture; } +private CompletableFuture<Void> triggerStopWithSavepointWithDrainAsync( +CheckpointMetaData checkpointMetaData, CheckpointOptions checkpointOptions) { + +List<CompletableFuture<Void>> sourceFinishedFutures = new ArrayList<>(); +mainMailboxExecutor.execute( +() -> { + setSynchronousSavepoint(checkpointMetaData.getCheckpointId(), true); +operatorChain +.getSourceTaskInputs() +.forEach(s -> sourceFinishedFutures.add(s.getOperator().stop())); +}, +"stop chained Flip-27 source for stop-with-savepoint --drain"); + +CompletableFuture<Void> sourcesStopped = FutureUtils.waitForAll(sourceFinishedFutures); Review comment: I think you have a race condition: you are accessing the non-thread-safe `sourceFinishedFutures` from two different threads. In my previous suggestion I meant to use something like `Future<T> MailboxExecutor#submit(java.util.concurrent.Callable<T>, java.lang.String)`. You can submit a callable that will collect the source-stopping futures and return them as a `Future<List<CompletableFuture<Void>>>`, which you could chain on in the line below. One caveat is that you might need to either modify `submit()` to return `CompletableFuture`, or introduce another version that does just that. `MailboxExecutor` is `@PublicEvolving`, probably not used very widely, and only by power users, so we should be OK with changing its interface. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
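To make the suggested pattern concrete, a self-contained toy in plain JDK terms: the list of stop futures is created and filled inside a single action running on the mailbox thread (modelled here by a single-threaded executor), and the caller only chains on the returned future. `CompletableFuture.allOf` stands in for `FutureUtils.waitForAll`, and the Flink-specific pieces (MailboxExecutor, operatorChain, the sources) are replaced by stand-ins:
```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MailboxStopPatternSketch {
    public static void main(String[] args) {
        // Stand-in for the task's single mailbox thread.
        ExecutorService mailbox = Executors.newSingleThreadExecutor();

        CompletableFuture<Void> sourcesStopped =
                CompletableFuture.supplyAsync(() -> {
                            // Runs only on the mailbox thread, so building the
                            // non-thread-safe list here is race-free.
                            List<CompletableFuture<Void>> stopFutures = new ArrayList<>();
                            for (int i = 0; i < 3; i++) {
                                // Stand-in for s.getOperator().stop().
                                stopFutures.add(CompletableFuture.completedFuture(null));
                            }
                            return stopFutures;
                        }, mailbox)
                        // Analogue of FutureUtils.waitForAll on the collected futures.
                        .thenCompose(futures -> CompletableFuture.allOf(
                                futures.toArray(new CompletableFuture[0])));

        sourcesStopped.join();
        System.out.println("all sources stopped");
        mailbox.shutdown();
    }
}
```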