[jira] [Created] (FLINK-18292) Savepoint for job error

2020-06-15 Thread Kayo (Jira)
Kayo created FLINK-18292:


 Summary: Savepoint for job error
 Key: FLINK-18292
 URL: https://issues.apache.org/jira/browse/FLINK-18292
 Project: Flink
  Issue Type: Bug
  Components: Command Line Client
Affects Versions: 1.9.3
Reporter: Kayo


{quote}Triggering savepoint for job ea4ff34da1a64412e515518fdcc0bdc1.
Waiting for response...


 The program finished with the following exception:

org.apache.flink.util.FlinkException: Triggering a savepoint for the job 
ea4ff34da1a64412e515518fdcc0bdc1 failed.
 at 
org.apache.flink.client.cli.CliFrontend.triggerSavepoint(CliFrontend.java:708)
 at 
org.apache.flink.client.cli.CliFrontend.lambda$savepoint$9(CliFrontend.java:687)
 at 
org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:943)
 at org.apache.flink.client.cli.CliFrontend.savepoint(CliFrontend.java:684)
 at 
org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1023)
 at 
org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1081)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1911)
 at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
 at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1081)
Caused by: org.apache.flink.runtime.concurrent.FutureUtils$RetryException: 
Could not complete the operation. Number of retries has been exhausted.
 at 
org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$8(FutureUtils.java:273)
 at 
java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760)
 at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736)
 at 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
 at java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:561)
 at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:929)
 at 
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.CompletionException: 
org.apache.flink.runtime.rest.util.RestClientException: Response was neither of 
the expected type([simple type, class 
org.apache.flink.runtime.rest.handler.async.TriggerResponse]) nor an error.
 at 
java.util.concurrent.CompletableFuture.encodeRelay(CompletableFuture.java:326)
 at 
java.util.concurrent.CompletableFuture.completeRelay(CompletableFuture.java:338)
 at java.util.concurrent.CompletableFuture.uniRelay(CompletableFuture.java:911)
 at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:953)
 at 
java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:926)
 ... 4 more
Caused by: org.apache.flink.runtime.rest.util.RestClientException: Response was 
neither of the expected type([simple type, class 
org.apache.flink.runtime.rest.handler.async.TriggerResponse]) nor an error.
 at org.apache.flink.runtime.rest.RestClient.parseResponse(RestClient.java:398)
 at 
org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:373)
 at 
java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:952)
 ... 5 more
Caused by: 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.ValueInstantiationException:
 Cannot construct instance of 
`org.apache.flink.runtime.rest.handler.async.TriggerResponse`, problem: 
`java.lang.NullPointerException`
 at [Source: UNKNOWN; line: -1, column: -1]
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.exc.ValueInstantiationException.from(ValueInstantiationException.java:47)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.DeserializationContext.instantiationException(DeserializationContext.java:1732)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapAsJsonMappingException(StdValueInstantiator.java:491)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.rewrapCtorProblem(StdValueInstantiator.java:514)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:285)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.ValueInstantiator.createFromObjectWith(ValueInstantiator.java:229)
 at 
org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreat

Re: [DISCUSS] Semantics of our JIRA fields

2020-06-15 Thread Piotr Nowojski


> On 12 Jun 2020, at 15:44, Robert Metzger  wrote:
> 
> Piotrek, do you agree with my "affects version" explanation? I would like
> to bring this discussion to a conclusion.
> 

+0 for this semantic from my side.

> On Tue, May 26, 2020 at 4:51 PM Till Rohrmann  wrote:
> 
>> If we change the meaning of the priority levels, then I would suggest to
>> have a dedicated discussion for it. This would also be more visible than
>> compared to being hidden in some lengthy discussion thread. I think the
>> proposed definitions of priority levels differ slightly from how the
>> community worked in the past.
>> 
>> Cheers,
>> Till
>> 
>> On Tue, May 26, 2020 at 4:30 PM Robert Metzger 
>> wrote:
>> 
>>> Hi,
>>> 
>>> 1. I'm okay with updating the definition of the priorities for the reason
>>> you've mentioned.
>>> 
>>> 2. "Affects version"
>>> 
>>> The reason why I like to mark affects version on unreleased versions is to
>>> clearly indicate which branch is affected by a bug. Given the current
>> Flink
>>> release status, if there's a bug only in "release-1.11", but not in
>>> "master", there is no way of figuring that out, if we only allow released
>>> versions for "affects version" (In my proposal, you would set "affects
>>> version" to '1.11.0', '1.12.0' to indicate that).
>>> 
>>> What we could do is introduce "1.12-SNAPSHOT" as version to mark issues
>> on
>>> unreleased versions. (But then people might accidentally set the "fix
>>> version" to a "-SNAPSHOT" version.)
>>> 
>>> I'm still in favor of my proposal. I have never heard a report from a
>>> confused user about our Jira fields (I guess they usually check bugs for
>>> released versions only)
>>> 
>>> 
>>> On Tue, May 26, 2020 at 12:39 PM Piotr Nowojski 
>>> wrote:
>>> 
 Hi,
 
 Sorry for the slightly late response. I have two concerns:
 
 1. Priority
 
 I would propose to stretch priorities that we are using to
>> differentiate
 between things that must be fixed for a given release:
 
 BLOCKER - drop anything you are doing, this issue must be fixed right
>> now
 CRITICAL - release can not happen without fixing it, but can be fixed a
 bit later (for example without context switching and dropping whatever
>>> I’m
 doing right now)
 MAJOR - default, nice to have
 Anything below - meh
 
 We were already using this semantic for tracking test instabilities
>>> during
 the 1.11 release cycle. Good examples:
 
 BLOCKER - master branch not compiling, very frequent test failures (for
 example almost every build affected), …
 CRITICAL - performance regression/bug that we introduced in some
>> feature,
 but which is not affecting other developers as much
 MAJOR - freshly discovered test instability with unknown
>> impact/frequency
 (could be happening once a year),
 
 2. Affects version
 
 If a bug is only on the master branch, does it affect an unreleased
>>> version?
 
 So far I was assuming that it doesn’t - unreleased bugs would have
>> empty
 “affects version” field. My reasoning was that this field should be
>> used
 for Flink users, to check which RELEASED Flink versions are affected by
 some bug, that user is searching for. Otherwise it might be a bit
>>> confusing
 if there are lots of bugs with both affects version and fix version set
>>> to
 the same value.
 
 Piotrek
 
> On 25 May 2020, at 16:40, Robert Metzger 
>> wrote:
> 
> Hi all,
> thanks a lot for the feedback. The majority of responses are very
 positive
> to my proposal.
> 
> I have put my proposal into our wiki:
> 
 
>>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=154995514
> 
> Regarding the comments so far:
> @Jark: I clarified this in the wiki.
> 
> @Israel: I have not considered bulk changing all 3000 resolved
>> tickets
 to
> closed yet, but after consideration I don't think it is necessary. If
> others in the community would like to change them, please speak up in
 this
> thread :)
> 
> @Flavio: I agree that we can not ask new or infrequent users to fully
> adhere to these definitions. I added a note in the Wiki.
> Using the resolved state for indicating "PR available" is problematic
> because there are plenty of cases where PRs are stale (and this
>> ticket
> would then appear as a "resolved"). The Apache tools are adding a
>> link
>>> to
> the PR, and some contributors are setting the ticket to "In
>> Progress".
>>> I
> don't see a problem that we need to solve here.
> 
> @Yun: Thank you for your comment. I added an example clarifying how I
 would
> handle such a case. It is slightly different from your proposal: You
> suggested using x.y.0 versions, I used "the next supported,
>> unreleased
> version", because that's how we've done it so far (and I don't want
>> to
> change things, I

Interested in contributing to Apache Flink

2020-06-15 Thread Vivin Peris
Respected Sir

I am currently a final year undergrad student at National Institute of
Technology Surathkal, India.

I have extensively worked on Apache Flink in my previous internship where I
had to create a database of billions of IPs in the world along with data
associated with it.
The power of Flink is just unbelievable and helped me do work which would
take months in just a matter of two days.

With this background, I am deeply interested in applying to your esteemed
organization under Google Season of Developers. Since I am new to technical
writing, I am willing to work hard.

I would like to know what I can work on so that I can equip myself for this
task.

Thank you for your time and consideration

Sincerely
Vivin


Re: request create flip permission for flink es bounded source/lookup source connector

2020-06-15 Thread Robert Metzger
Ah, sorry. The permission system of Confluence is a bit annoying.
I gave you "Add Attachment" permissions.

On Mon, Jun 15, 2020 at 8:57 AM Jacky Lau  wrote:

> Hi Robert:
>  When I edit the FLIP and upload the images, it shows the prompting
> message "You'll need to ask permission to insert files here".
> Could you also help me get the permission for uploading images to the FLIP
> wiki?
>
> Robert Metzger wrote
> > Hi,
> > I gave you access to the Wiki!
> >
> > On Fri, Jun 12, 2020 at 11:50 AM Jacky Lau <
>
> > liuyongvs@
>
> > > wrote:
> >
> >> Hi Jack:
> >>Thank you so much. My wiki name is jackylau
> >>
> >>
> >> Jark Wu-2 wrote
> >> > Hi Jacky,
> >> >
> >> > What's your username in wiki? So that I can give the permission to
> you.
> >> >
> >> > Best,
> >> > Jark
> >> >
> >> > On Fri, 12 Jun 2020 at 11:38, Jacky Lau <
> >>
> >> > liuyongvs@
> >>
> >> > > wrote:
> >> >
> >> >> hi all:
> >> >>After this simple discussion here
> >> >>
> >> >>
> >>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-td42082.html#a42106
> >> >> ,
> >> >>and I should create FLIP-127 to track this. But I don't have
> >> create
> >> >> FLIP permission.
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Sent from:
> >> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >> >>
> >>
> >>
> >> --
> >> Sent from:
> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >>
>
>
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>


Re: Moving in and Modify Dependency Source

2020-06-15 Thread Robert Metzger
Hi Austin,
Thanks for working on the RMQ connector! There seem to be a few users
affected by that issue.

The GitHub page confirms that users can choose from the three licenses:
https://github.com/rabbitmq/rabbitmq-java-client#license:

> This means that the user can consider the library to be licensed under any
> of the licenses from the list above. For example, you may choose the
> Apache Public License 2.0 and include this client into a commercial
> product. Projects that are licensed under the GPLv2 may choose GPLv2, and
> so on.


Best,
Robert

On Mon, Jun 15, 2020 at 8:59 AM Till Rohrmann  wrote:

> Hi Austin,
>
> usually if source code is multi-licensed, this means that the user can
> choose the license under which they want to use it. In our case it would be
> the Apache License version 2. But you should check the license text to make
> sure that this has not been explicitly forbidden.
>
> When copying code from another project, the practice is to annotate it with
> a comment stating from where the code was obtained. So in your case you
> would give these files the ASL license header and add a comment to the
> source code from where it was copied.
>
> Cheers,
> Till
>
> On Sat, Jun 13, 2020 at 10:41 PM Austin Cawley-Edwards <
> austin.caw...@gmail.com> wrote:
>
> > Hi all,
> >
> > I'm working on [FLINK-10195] on the RabbitMQ connector which involves
> > modifying some of the RMQ client source code (that has been moved out of
> > that package) and bringing it into Flink. The RMQ client code is
> > triple-licensed under Mozilla Public License 1.1 ("MPL"), the GNU General
> > Public License version 2 ("GPL"), and the Apache License version 2
> ("ASL").
> >
> > Does anyone have experience doing something similar/ what I would need to
> > do in terms of the license headers in the Flink source files?
> >
> > Thank you,
> > Austin
> >
> > [FLINK-10195]: https://issues.apache.org/jira/browse/FLINK-10195
> >
>
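As an illustration of the practice Till describes, the copied files could carry
the standard ASF license header followed by a provenance note; the exact wording
below is only a sketch, not an official template:

/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. [standard ASF header text continues]
 *
 * This file is based on source code from the RabbitMQ Java client
 * (https://github.com/rabbitmq/rabbitmq-java-client), which is
 * triple-licensed under MPL 1.1, GPL v2 and the Apache License 2.0,
 * and is included here under the Apache License, Version 2.0.
 */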


Flink 1.10 with GCS for checkpoints

2020-06-15 Thread Ramya Ramamurthy
Hi,

We are trying to upgrade our Flink from 1.7 to 1.10. We have our
checkpoints on Google Cloud Storage today. But this is not working well
with 1.10.
And below is the error we get.
Any help here would be appreciated.
We followed the below blog for GCS related configurations.
https://www.ververica.com/blog/getting-started-with-da-platform-on-google-kubernetes-engine


Excerpt from the error:

org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:64)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
Could not find a file system implementation for scheme 'gs'. The scheme is
not directly supported by Flink and no Hadoop file system to support this
scheme could be loaded.
 at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
 at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
 at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
 at org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)

Complete ERROR stack:

2020-06-15 04:46:00,783 WARN  org.apache.flink.configuration.Configuration
 - Config uses deprecated configuration key
'high-availability.zookeeper.storageDir' instead of proper key
'high-availability.storageDir'
2020-06-15 04:46:00,804 INFO
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting
StandaloneSessionClusterEntrypoint down with application status FAILED.
Diagnostics java.io.IOException: Could not create FileSystem for highly
available storage path
(gs://ss-enigma-bucket/flink/flink/checkpoints/fs.default_ns)
at
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
at
org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
at
org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
at
org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
at
org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:64)
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
Could not find a file system implementation for scheme 'gs'. The scheme is
not directly supported by Flink and no Hadoop file system to support this
scheme could be loaded.
at
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
at
org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
... 10 more
Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
Hadoop is not in the classpath/dependencies.
at
org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:58)
at
org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)
... 13 more
.
2020-06-15 04:46:00,816 INFO
 org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Stopping
Akka RPC service.
2020-06-15 04:46:00,901 INFO
 akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting
down remote daemon.
2020-06-15 04:46:00,903 INFO
 akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote
daemon shut down; proceeding with flushing remote transports.
2020-06-15 04:46:00,948 INFO
 akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting
shut down.
2020-06-15 04:46:01,006 ERROR
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Could not
start cluster entrypoint StandaloneSessionClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to
initialize the cluster entrypoint StandaloneSessionClusterEntrypoint.
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:187)
at
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
at
org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:64)
Caused by: java.io.IOException: Could not create FileSyst

[jira] [Created] (FLINK-18293) TaskExecutor offering non empty slots can lead to resource violation

2020-06-15 Thread Till Rohrmann (Jira)
Till Rohrmann created FLINK-18293:
-

 Summary: TaskExecutor offering non empty slots can lead to 
resource violation
 Key: FLINK-18293
 URL: https://issues.apache.org/jira/browse/FLINK-18293
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Coordination
Affects Versions: 1.10.1, 1.11.0
Reporter: Till Rohrmann
 Fix For: 1.12.0


When a {{JobMaster}} loses leadership, then the {{TaskExecutor}} will fail all 
running tasks belonging to this job and transition all slots belonging to this 
job from {{ACTIVE}} into {{ALLOCATED}}. The idea is that these slots can be 
re-offered to the new leader of the very same job.

A problem arises when the {{Task}} cancellation takes longer than the election 
of the new leader. In this case, the slot containing a {{CANCELLING}} task will 
be offered to the new {{JobMaster}} as empty. The {{JobMaster}}, not knowing 
that the slot still contains a resource consumer, might deploy new tasks into 
it, believing that these tasks can use all of the available resources. In the 
best case, the newly deployed {{Tasks}} will simply get fewer resources than 
expected. In the worst case, this will lead to a resource violation.

Without the {{JobMaster}} being able to reconcile the state of already deployed 
{{Tasks}} into {{Slots}}, I believe that we should only re-offer the slot when 
it is free. One might model this scenario by introducing a new 
{{TaskSlotState.CLEANING}}. {{CLEANING}} means that the slot is still allocated 
for a given job, but that there are still some resources which need to be 
cleaned up before it can be re-offered (transition to state {{ALLOCATED}}).
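A minimal sketch of the proposed state (illustration only, not the actual Flink enum; {{ACTIVE}} and {{ALLOCATED}} are taken from the description above, everything else is an assumption):

{code:java}
enum TaskSlotState {
    ALLOCATED, // slot is bound to a job, but no task is running in it
    ACTIVE,    // slot is running tasks for the current leader of the job
    CLEANING   // proposed: a cancelled task is still releasing resources,
               // so the slot must not be re-offered until it transitions
               // back to ALLOCATED
}
{code}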



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18294) Log java processes and disk space after each e2e test

2020-06-15 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-18294:


 Summary: Log java processes and disk space after each e2e test
 Key: FLINK-18294
 URL: https://issues.apache.org/jira/browse/FLINK-18294
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


To debug interference between e2e tests it would be helpful to log disk usage 
and leftover Java processes.
I've seen instances where, right before the Java e2e tests are run, there is 
still a Kafka process running, and in one abnormal case we used 13 GB more disk 
space.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Flink 1.10 with GCS for checkpoints

2020-06-15 Thread Till Rohrmann
Hi Ramya,

it looks as if Flink cannot find the Hadoop dependencies. Could you make
sure that you start Flink with HADOOP_CLASSPATH defined, or point it to
the Hadoop conf directory via HADOOP_CONF_DIR? See this link [1] for more
information on how to add Hadoop support.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/deployment/hadoop.html

Cheers,
Till

On Mon, Jun 15, 2020 at 10:44 AM Ramya Ramamurthy  wrote:

> Hi,
>
> We are trying to upgrade our Flink from 1.7 to 1.10. We have our
> checkpoints on Google Cloud Storage today. But this is not working well
> with 1.10.
> And below is the error we get.
> any help here would be appreciated.
> We followed the below blog for GCS related configurations.
>
> https://www.ververica.com/blog/getting-started-with-da-platform-on-google-kubernetes-engine
>
>
> Excerpt from the error:
>
>
>
>
>
>
> *org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:64)
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Could not find a file system implementation for scheme 'gs'. The scheme is
> not directly supported by Flink and no Hadoop file system to support this
> scheme could be loaded. at
>
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362) at
> org.apache.flink.core.fs.Path.getFileSystem(Path.java:298) at
>
> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
>  *
>
> *Complete ERROR Stack:*
>
> 2020-06-15 04:46:00,783 WARN  org.apache.flink.configuration.Configuration
>  - Config uses deprecated configuration key
> 'high-availability.zookeeper.storageDir' instead of proper key
> 'high-availability.storageDir'
> 2020-06-15 04:46:00,804 INFO
>  org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting
> StandaloneSessionClusterEntrypoint down with application status FAILED.
> Diagnostics java.io.IOException: Could not create FileSystem for highly
> available storage path
> (gs://ss-enigma-bucket/flink/flink/checkpoints/fs.default_ns)
> at
>
> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:103)
> at
>
> org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:89)
> at
>
> org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:125)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:305)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:263)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:207)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:169)
> at
>
> org.apache.flink.runtime.security.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:30)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:168)
> at
>
> org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:518)
> at
>
> org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint.main(StandaloneSessionClusterEntrypoint.java:64)
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Could not find a file system implementation for scheme 'gs'. The scheme is
> not directly supported by Flink and no Hadoop file system to support this
> scheme could be loaded.
> at
>
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:450)
> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:362)
> at org.apache.flink.core.fs.Path.getFileSystem(Path.java:298)
> at
>
> org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:100)
> ... 10 more
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException:
> Hadoop is not in the classpath/dependencies.
> at
>
> org.apache.flink.core.fs.UnsupportedSchemeFactory.create(UnsupportedSchemeFactory.java:58)
> at
>
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:446)
> ... 13 more
> .
> 2020-06-15 04:46:00,816 INFO
>  org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Stopping
> Akka RPC service.
> 2020-06-15 04:46:00,901 INFO
>  akka.remote.RemoteActorRefProvider$RemotingTerminator - Shutting
> down remote daemon.
> 2020-06-15 04:46:00,903 INFO
>  akka.remote.RemoteActorRefProvider$RemotingTerminator - Remote
> daemon shut down; proceeding with flushing remote transports.
> 2020-06-15 04:46:00,948 INFO
>  akka.remote.RemoteActorRefProvider$RemotingTerminator - Remoting
> shut down.
> 2020-06-15 04:46:01,006 ERROR
> org.apache.flink

[jira] [Created] (FLINK-18295) Remove the hack logics of result consumers

2020-06-15 Thread Zhu Zhu (Jira)
Zhu Zhu created FLINK-18295:
---

 Summary: Remove the hack logics of result consumers
 Key: FLINK-18295
 URL: https://issues.apache.org/jira/browse/FLINK-18295
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Coordination
Reporter: Zhu Zhu
Assignee: Zhu Zhu
 Fix For: 1.12.0


Currently an {{IntermediateDataSet}} can have multiple {{JobVertex}} consumers. 
That's why the consumers of an `IntermediateResultPartition` are kept in 
the form of {{List>}}.

However, in the scheduler/{{ExecutionGraph}} there is an assumption that one 
`IntermediateResultPartition` can be consumed by only one `ExecutionJobVertex`. 
This results in a lot of hacky logic which assumes the partition consumers 
contain a single list.

We should remove this hacky logic. The idea is to change 
`IntermediateResultPartition#consumers` to be `List`. 
The `ExecutionGraph` building logic should be adjusted accordingly with the 
assumption that an `IntermediateResult` can have only one consumer vertex. In 
`JobGraph`, there should also be a check for this assumption.
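To illustrate the shape of the change (a sketch under assumed names; the element type is assumed to be ExecutionEdge here and may differ from the final implementation):

{code:java}
// before: nested list, one inner list per (potential) consumer vertex
// List<List<ExecutionEdge>> consumers;

// after: a single flat list, making the one-consumer-vertex assumption explicit
// List<ExecutionEdge> consumers;
{code}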



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18296) Add support for datatype TIMESTAMP_WITH_LOCAL_ZONE for Json format

2020-06-15 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-18296:
-

 Summary: Add support for datatype TIMESTAMP_WITH_LOCAL_ZONE for 
Json format
 Key: FLINK-18296
 URL: https://issues.apache.org/jira/browse/FLINK-18296
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.11.0
Reporter: Shengkai Fang


Currently we still can't use the datatype TIMESTAMP_WITH_LOCAL_ZONE in the JSON 
format. According to the documented description of TIMESTAMP_WITH_LOCAL_ZONE, I 
think the behaviour of this type should work like this when processing JSON:
 # it can read a timestamp with a timezone in ISO-8601 or RFC 3339 format and 
transfer the timestamp into UTC time or a time with a user-defined zone for processing;
 # it can output the timestamp in a format like "yyyy-MM-ddTHH:mm:ss.s\{precision}" 
or "yyyy-MM-dd HH:mm:ss.s\{precision}"



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18297) SQL client: setting execution.type to invalid value shuts down the session

2020-06-15 Thread David Anderson (Jira)
David Anderson created FLINK-18297:
--

 Summary: SQL client: setting execution.type to invalid value shuts 
down the session
 Key: FLINK-18297
 URL: https://issues.apache.org/jira/browse/FLINK-18297
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Affects Versions: 1.10.0
Reporter: David Anderson


Flink SQL> SET execution.type=foo;

Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Invalid configuration entry.

 at 
org.apache.flink.table.client.config.entries.ConfigEntry.(ConfigEntry.java:41)

 at 
org.apache.flink.table.client.config.entries.ExecutionEntry.(ExecutionEntry.java:112)

 at 
org.apache.flink.table.client.config.entries.ExecutionEntry.enrich(ExecutionEntry.java:375)

 at 
org.apache.flink.table.client.config.Environment.enrich(Environment.java:295)

 at 
org.apache.flink.table.client.gateway.local.LocalExecutor.setSessionProperty(LocalExecutor.java:284)

 at org.apache.flink.table.client.cli.CliClient.callSet(CliClient.java:370)

 at org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:262)

 at java.util.Optional.ifPresent(Optional.java:159)

 at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:200)

 at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:125)

 at org.apache.flink.table.client.SqlClient.start(SqlClient.java:104)

 at org.apache.flink.table.client.SqlClient.main(SqlClient.java:178)

Caused by: org.apache.flink.table.api.ValidationException: Unknown value for 
property 'type'.

Supported values are [streaming, batch] but was: foo

 at 
org.apache.flink.table.descriptors.DescriptorProperties.lambda$validateEnum$34(DescriptorProperties.java:1254)

 at 
org.apache.flink.table.descriptors.DescriptorProperties.validateOptional(DescriptorProperties.java:1520)

 at 
org.apache.flink.table.descriptors.DescriptorProperties.validateEnum(DescriptorProperties.java:1247)

 at 
org.apache.flink.table.descriptors.DescriptorProperties.validateEnumValues(DescriptorProperties.java:1266)

 at 
org.apache.flink.table.client.config.entries.ExecutionEntry.validate(ExecutionEntry.java:123)

 at 
org.apache.flink.table.client.config.entries.ConfigEntry.(ConfigEntry.java:39)

 ... 11 more

 

Shutting down the session...

done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18298) Rename TableResult headers of SHOW statements

2020-06-15 Thread Fabian Hueske (Jira)
Fabian Hueske created FLINK-18298:
-

 Summary: Rename TableResult headers of SHOW statements
 Key: FLINK-18298
 URL: https://issues.apache.org/jira/browse/FLINK-18298
 Project: Flink
  Issue Type: Improvement
  Components: Table SQL / API
Affects Versions: 1.11.0
Reporter: Fabian Hueske


The SHOW TABLES and SHOW FUNCTIONS commands list all tables and 
functions of the currently selected database.
 With FLIP-84, the result is passed back as a TableResult object that includes 
the schema of the result as a TableSchema.

The column name for the result of SHOW TABLES and SHOW FUNCTIONS is "result":
  
{code:java}
SHOW TABLES;
result
myTable1
myTable2
{code}
 
 I think this name is not very descriptive and too generic. 
 IMO it would be nice to change it to "table name" and "function name", 
respectively.
{code:java}
SHOW TABLES; 
table names
myTable1 
myTable2
{code}
Would be nice to get this little improvement in before the 1.11 release.

cc [~godfreyhe], [~twalthr]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18299) Add keyword in json format to parse timestamp in different standard

2020-06-15 Thread Shengkai Fang (Jira)
Shengkai Fang created FLINK-18299:
-

 Summary: Add keyword in json format to parse timestamp in 
different standard
 Key: FLINK-18299
 URL: https://issues.apache.org/jira/browse/FLINK-18299
 Project: Flink
  Issue Type: Bug
  Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.11.0
Reporter: Shengkai Fang


Add a keyword such as 'timestamp-format.standard' in the JSON format to parse 
timestamps in different formats. In this issue, we will support the values 
'SQL' and 'ISO-8601' (see the sketch below):
 * Option 'SQL' will parse the input timestamp in 'yyyy-MM-dd 
HH:mm:ss.s\{precision}' format, e.g. '2020-12-30 12:13:14.123', and output the 
timestamp in the same way.
 * Option 'ISO-8601' will parse the input timestamp in 
'yyyy-MM-ddTHH:mm:ss.s\{precision}' format, e.g. '2020-12-30T12:13:14.123', and 
output the timestamp in the same way.
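For reference, the two layouts correspond to the following java.time patterns (a standalone sketch, not the proposed Flink implementation):

{code:java}
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampFormats {
    public static void main(String[] args) {
        // 'SQL' style: a space between date and time
        DateTimeFormatter sql = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");
        LocalDateTime t1 = LocalDateTime.parse("2020-12-30 12:13:14.123", sql);

        // 'ISO-8601' style: a 'T' between date and time
        LocalDateTime t2 = LocalDateTime.parse("2020-12-30T12:13:14.123",
                DateTimeFormatter.ISO_LOCAL_DATE_TIME);

        System.out.println(t1 + " / " + t2);
    }
}
{code}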



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18300) SQL Client doesn't support ALTER VIEW

2020-06-15 Thread Rui Li (Jira)
Rui Li created FLINK-18300:
--

 Summary: SQL Client doesn't support ALTER VIEW
 Key: FLINK-18300
 URL: https://issues.apache.org/jira/browse/FLINK-18300
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Rui Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Re-renaming "Flink Master" back to JobManager

2020-06-15 Thread Aljoscha Krettek

Hi All,

This came to my mind because of the master/slave discussion in [1] and 
the larger discussions about inequality/civil rights happening right now 
in the world. I think for this reason alone we should use a name that 
does not include "master".


We could rename it back to JobManager, which was the name mostly used 
before 2019. Since the beginning of Flink, TaskManager was the term used 
for the worker component/node and JobManager was the term used for the 
orchestrating component/node.


Currently our glossary [2] defines these terms (paraphrased by me):

 - "Flink Master": it's the orchestrating component that consists of 
resource manager, dispatcher, and JobManager


 - JobManager: it's the thing that manages a single job and runs as 
part of a "Flink Master"


 - TaskManager: it's the worker process

Prior to the introduction of the glossary the definition of JobManager 
would have been:


 - It's the orchestrating component that manages execution of jobs and 
schedules work on TaskManagers.


Quite some parts in the code and documentation/configuration options 
still use that older meaning of JobManager. Newer parts of the 
documentation use "Flink Master" instead.


I'm proposing to go back to calling the orchestrating component 
JobManager, which would mean that we have to touch up the documentation 
to remove mentions of "Flink Master". I'm also proposing not to mention 
the internal components such as resource manager and dispatcher in the 
glossary because they are transparent to users.


I'm proposing to go back to JobManager instead of an alternative name 
also because switching to yet another name would mean many more changes 
to code/documentation/peoples minds.


What do you all think?

Best,
Aljoscha


[1] https://issues.apache.org/jira/browse/FLINK-18209
[2] 
https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html


[jira] [Created] (FLINK-18301) Backup Kafka logs on e2e failure

2020-06-15 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-18301:


 Summary: Backup Kafka logs on e2e failure
 Key: FLINK-18301
 URL: https://issues.apache.org/jira/browse/FLINK-18301
 Project: Flink
  Issue Type: Improvement
  Components: Test Infrastructure
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.12.0


Similarly to the Flink logs, back up the Kafka logs if the test fails.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] Re-renaming "Flink Master" back to JobManager

2020-06-15 Thread Konstantin Knauf
Hi Aljoscha,

sounds good to me. Let’s also make sure we don’t refer to the JobMaster as
Jobmanager anywhere then (code, config).

I am not sure we can avoid mentioning the Flink ResourceManagers in user
facing docs completely. For JobMaster and Dispatcher this seems doable.

Best,

Konstantin

On Mon 15. Jun 2020 at 12:56, Aljoscha Krettek  wrote:

> Hi All,
>
> This came to my mind because of the master/slave discussion in [1] and
> the larger discussions about inequality/civil rights happening right now
> in the world. I think for this reason alone we should use a name that
> does not include "master".
>
> We could rename it back to JobManager, which was the name mostly used
> before 2019. Since the beginning of Flink, TaskManager was the term used
> for the worker component/node and JobManager was the term used for the
> orchestrating component/node.
>
> Currently our glossary [2] defines these terms (paraphrased by me):
>
>   - "Flink Master": it's the orchestrating component that consists of
> resource manager, dispatcher, and JobManager
>
>   - JobManager: it's the thing that manages a single job and runs as
> part of a "Flink Master"
>
>   - TaskManager: it's the worker process
>
> Prior to the introduction of the glossary the definition of JobManager
> would have been:
>
>   - It's the orchestrating component that manages execution of jobs and
> schedules work on TaskManagers.
>
> Quite some parts in the code and documentation/configuration options
> still use that older meaning of JobManager. Newer parts of the
> documentation use "Flink Master" instead.
>
> I'm proposing to go back to calling the orchestrating component
> JobManager, which would mean that we have to touch up the documentation
> to remove mentions of "Flink Master". I'm also proposing not to mention
> the internal components such as resource manager and dispatcher in the
> glossary because they are transparent to users.
>
> I'm proposing to go back to JobManager instead of an alternative name
> also because switching to yet another name would mean many more changes
> to code/documentation/peoples minds.
>
> What do you all think?
>
> Best,
> Aljoscha
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-18209
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-master/concepts/glossary.html
>
-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


[jira] [Created] (FLINK-18302) Sql client uses wrong class loader when execute INSERT statements

2020-06-15 Thread Jark Wu (Jira)
Jark Wu created FLINK-18302:
---

 Summary: Sql client uses wrong class loader when execute INSERT 
statements
 Key: FLINK-18302
 URL: https://issues.apache.org/jira/browse/FLINK-18302
 Project: Flink
  Issue Type: Bug
  Components: Table SQL / Client
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.11.0


The SQL client does not use the user class loader from ExecutionContext when 
executing INSERT statements. This makes it impossible to run queries containing 
a UDF if the dependencies are added with the {{--jar}} flag.

This can be reproduced when migrating {{SQLClientKafkaITCase}} to use DDL 
(FLINK-18086).

It gives this exception:

{code}
Exception in thread "main" org.apache.flink.table.client.SqlClientException: 
Unexpected exception. This is a bug. Please consider filing an issue.
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:213)
Caused by: java.lang.RuntimeException: Error running SQL job.
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:595)
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdate(LocalExecutor.java:515)
at 
org.apache.flink.table.client.cli.CliClient.callInsert(CliClient.java:596)
at 
org.apache.flink.table.client.cli.CliClient.callCommand(CliClient.java:315)
at java.util.Optional.ifPresent(Optional.java:159)
at org.apache.flink.table.client.cli.CliClient.open(CliClient.java:212)
at org.apache.flink.table.client.SqlClient.openCli(SqlClient.java:142)
at org.apache.flink.table.client.SqlClient.start(SqlClient.java:114)
at org.apache.flink.table.client.SqlClient.main(SqlClient.java:201)
Caused by: java.lang.RuntimeException: Could not execute program.
at 
org.apache.flink.table.client.gateway.local.ProgramDeployer.deploy(ProgramDeployer.java:84)
at 
org.apache.flink.table.client.gateway.local.LocalExecutor.executeUpdateInternal(LocalExecutor.java:592)
... 8 more
Caused by: org.apache.flink.util.FlinkRuntimeException: 
org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
compiled. This is a bug. Please file an issue.
at 
org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:68)
at 
org.apache.flink.table.runtime.generated.GeneratedClass.compile(GeneratedClass.java:78)
at 
org.apache.flink.table.runtime.generated.GeneratedClass.getClass(GeneratedClass.java:96)
at 
org.apache.flink.table.runtime.operators.CodeGenOperatorFactory.getStreamOperatorClass(CodeGenOperatorFactory.java:51)
at 
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.preValidate(StreamingJobGraphGenerator.java:217)
at 
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:154)
at 
org.apache.flink.streaming.api.graph.StreamingJobGraphGenerator.createJobGraph(StreamingJobGraphGenerator.java:109)
at 
org.apache.flink.streaming.api.graph.StreamGraph.getJobGraph(StreamGraph.java:850)
at 
org.apache.flink.client.StreamGraphTranslator.translateToJobGraph(StreamGraphTranslator.java:52)
at 
org.apache.flink.client.FlinkPipelineTranslationUtil.getJobGraph(FlinkPipelineTranslationUtil.java:43)
at 
org.apache.flink.client.deployment.executors.PipelineExecutorUtils.getJobGraph(PipelineExecutorUtils.java:55)
at 
org.apache.flink.client.deployment.executors.AbstractSessionClusterExecutor.execute(AbstractSessionClusterExecutor.java:57)
at 
org.apache.flink.table.client.gateway.local.ProgramDeployer.deploy(ProgramDeployer.java:82)
... 9 more
Caused by: 
org.apache.flink.shaded.guava18.com.google.common.util.concurrent.UncheckedExecutionException:
 org.apache.flink.api.common.InvalidProgramException: Table program cannot be 
compiled. This is a bug. Please file an issue.
at 
org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2203)
at 
org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache.get(LocalCache.java:3937)
at 
org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4739)
at 
org.apache.flink.table.runtime.generated.CompileUtils.compile(CompileUtils.java:66)
... 21 more
Caused by: org.apache.flink.api.common.InvalidProgramException: Table program 
cannot be compiled. This is a bug. Please file an issue.
at 
org.apache.flink.table.runtime.generated.CompileUtils.doCompile(CompileUtils.java:81)
at 
org.apache.flink.table.runtime.generated.CompileUtils.lambda$compile$1(CompileUtils.java:66)
at 
org.apache.flink.shaded.guava18.com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4742)
at 
org.apache.

[jira] [Created] (FLINK-18303) Filesystem connector doesn't flush part files after rolling interval

2020-06-15 Thread Jark Wu (Jira)
Jark Wu created FLINK-18303:
---

 Summary: Filesystem connector doesn't flush part files after 
rolling interval
 Key: FLINK-18303
 URL: https://issues.apache.org/jira/browse/FLINK-18303
 Project: Flink
  Issue Type: Bug
  Components: Connectors / FileSystem, Table SQL / Ecosystem
Reporter: Jark Wu
Assignee: Jark Wu
 Fix For: 1.11.0


I have set "execution.checkpointing.interval" to "5s" and 
"sink.rolling-policy.time-interval" to "2s". However, it still take 60 seconds 
to see the first part file. 

This can be reproduced by the following code in SQL CLI:

{code:sql}
CREATE TABLE CsvTable (
  event_timestamp STRING,
  `user` STRING,
  message STRING,
  duplicate_count BIGINT,
  constant STRING
) WITH (
  'connector' = 'filesystem',
  'path' = '$RESULT',
  'format' = 'csv',
  'sink.rolling-policy.time-interval' = '2s'
);

INSERT INTO CsvTable -- read from Kafka Avro, and write into Filesystem Csv
SELECT AvroTable.*, RegReplace('Test constant folding.', 'Test', 'Success') AS 
constant
FROM AvroTable;
{code}

This is found when I migrate SQLClientKafkaITCase to use DDL (FLINK-18086).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] SQL Syntax for Table API StatementSet

2020-06-15 Thread Fabian Hueske
Hi everyone,

FLIP-84 [1] added the concept of a "statement set" to group multiple INSERT
INTO statements (SQL or Table API) together. The statements in a statement
set are jointly optimized and executed as a single Flink job.
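For context, the Table API side of this (added by FLIP-84) looks roughly like
the sketch below; "src", "sink1" and "sink2" are placeholder tables that are
assumed to have been registered, e.g. via CREATE TABLE DDL:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.TableEnvironment;

public class StatementSetExample {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        StatementSet stmtSet = tEnv.createStatementSet();
        stmtSet.addInsertSql("INSERT INTO sink1 SELECT * FROM src");
        stmtSet.addInsertSql("INSERT INTO sink2 SELECT * FROM src");
        stmtSet.execute(); // jointly optimized and submitted as one Flink job
    }
}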

I would like to start a discussion about a SQL syntax to group multiple
INSERT INTO statements in a statement set. The use case would be to expose
the statement set feature to a purely text-based client for Flink SQL such
as Flink's SQL CLI [2].

During the discussion of FLIP-84, we had briefly talked about such a syntax
[3].

START STATEMENT SET;
INSERT INTO ... SELECT ...;
INSERT INTO ... SELECT ...;
...
END STATEMENT SET;

We didn't follow up on this proposal, to keep the focus on the FLIP-84
Table API changes and to not dive into a discussion about multiline SQL
query support [4].

While this feature is clearly based on multiple SQL queries, I think it is
a bit different from what we usually understand as multiline SQL support.
That's because a statement set ends up being a single Flink job. Hence,
there is no need on the Flink side to coordinate the execution of multiple
jobs (incl. the discussion about blocking or async execution of queries).
Flink would treat the queries in a STATEMENT SET as a single query.

I would like to start a discussion about supporting the [START|END]
STATEMENT SET syntax (or a different syntax with equivalent semantics) in
Flink.
I don't have a strong preference whether this should be implemented in
Flink's SQL core or be a purely client side implementation in the CLI
client. It would be good though to have parser support in Flink for this.

What do others think?

[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
[2]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sqlClient.html
[3]
https://docs.google.com/document/d/1ueLjQWRPdLTFB_TReAyhseAX-1N3j4WYWD0F02Uau0E/edit#heading=h.al86t1h4ecuv
[4]
https://lists.apache.org/thread.html/rf494e227c47010c91583f90eeaf807d3a4c3eb59d105349afd5fdc31%40%3Cdev.flink.apache.org%3E


[jira] [Created] (FLINK-18304) Document default reporter interval

2020-06-15 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-18304:


 Summary: Document default reporter interval
 Key: FLINK-18304
 URL: https://issues.apache.org/jira/browse/FLINK-18304
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18305) Add interval configuration to all reporter examples that support it

2020-06-15 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-18305:


 Summary: Add interval configuration to all reporter examples that 
support it
 Key: FLINK-18305
 URL: https://issues.apache.org/jira/browse/FLINK-18305
 Project: Flink
  Issue Type: Improvement
  Components: Documentation, Runtime / Metrics
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0


The documentation is not clear on which reporters support the interval 
configuration. We can extend the configuration examples to make this clearer.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18306) how to satisfy the node-sass dependency when compiling runtime-web?

2020-06-15 Thread appleyuchi (Jira)
appleyuchi created FLINK-18306:
--

 Summary: how to satisfy the node-sass dependency when compiling 
runtime-web?
 Key: FLINK-18306
 URL: https://issues.apache.org/jira/browse/FLINK-18306
 Project: Flink
  Issue Type: Bug
  Components: Runtime / Web Frontend
Affects Versions: 1.12.0
Reporter: appleyuchi


There are 2 commands in flink-runtime-web's pom.xml; they are

npm ci

npm run build

npm ci needs node-sass v4.11.0, as pinned in package-lock.json.

When compiling, it tells me that it needs
node-sass/v4.11.0/linux-x64-72_binding.node.

The author of node-sass has already deleted linux-x64-72_binding.node:
https://github.com/sass/node-sass/issues/2653

List of node-sass/v4.11.0 releases:
https://github.com/sass/node-sass/releases/tag/v4.11.0

Question: how to satisfy the requirement above?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[ANNOUNCE] Weekly Community Update 2020/23-24

2020-06-15 Thread Konstantin Knauf
Dear community,

happy to share this community update on the last two weeks including the
release of Stateful Functions 2.1, a table source for ElasticSearch, and a
bit more. The community is still working on release testing for Apache
Flink 1.11, so it is still comparatively quiet. Expecting the first feature
discussions for Flink 1.12 to start soon, though.

Flink Development
==

* [releases] Apache Flink Stateful Functions 2.1.0 was released last
Tuesday. [1] For details checkout the release blog post [2].

* [connectors] Jacky Lau started a discussion on extending the
ElasticSearch connector by adding a TableSource. The source would be
bounded (similar to JDBC and Hbase) and scannable as well as lookupable. [3]

* [documentation] Robert has started a conversation on adding a Japanese
translation of the documentation. The discussion showed that a) there does
not seem to be anyone who could review these contributions and b) there are
a couple of issues in keeping the English and Chinese translation in sync
that would need to be addressed prior to adding another language. [4]

* [documentation] Seth proposed to move the code backing the "walkthroughs"
from the main *flink* repository to the *flink-playgrounds* repository. [5]

* [development process] Aljoscha proposes to update the checked-in
EditorConfig to the current code and checkstyle configuration of Apache
Flink [6]

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-Apache-Flink-Stateful-Functions-2-1-0-released-tp42345.html
[2] https://flink.apache.org/news/2020/06/09/release-statefun-2.1.0.html
[3]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discussion-flink-elasticsearch-connector-supports-tp42082p42238.html
[4]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Add-Japanese-translation-of-the-flink-apache-org-website-tp42279.html
[5]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Discuss-Migrate-walkthroughs-to-flink-playgrounds-for-1-11-tp42360.html
[6]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Update-our-EditorConfig-file-tp42409.html

Notable Bugs
==

A lot of activity around release testing for Flink 1.11.0, but nothing
notable for any of our release versions.

Events, Blog Posts, Misc
===

* Benchao Li and Xintong Song are Apache Flink Committers now.
Congratulations! [7,8]

* Marta has published a new quarterly community update blog post on the
Flink blog. [9]

[7]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-New-Flink-Committer-Benchao-Li-tp42312p42353.html
[8]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/ANNOUNCE-New-Apache-Flink-Committer-Xintong-Song-tp42194p42207.html
[9] https://flink.apache.org/news/2020/06/11/community-update.html

Cheers,

Konstantin

-- 

Konstantin Knauf

https://twitter.com/snntrable

https://github.com/knaufk


NullPointer Exception while trying to access or read ReadOnly ctx in processElement method in KeyedBroadCastProcessFunction in Apache Flink

2020-06-15 Thread bujjirahul45 .
I have an interesting scenario: I am working on pattern matching in Flink,
evaluating the incoming data against a set of patterns using a
KeyedBroadcastProcessFunction. When I run the program in the IDE I get a
NullPointerException in the processElement method when trying to access the
ReadOnlyContext, but the same program runs fine in the Flink terminal. Below is
my KeyedBroadcastProcessFunction:

public class TestProcess extends KeyedBroadcastProcessFunction,
Tuple2>, Tuple2> {

public static final MapStateDescriptor >
ruleDescriptor =
new MapStateDescriptor <>("RuleDiscriptor",
,BasicTypeInfo.STRING_TYPE_INFO
,new MapTypeInfo<>(String.class,String.class));

@Override
public void processElement(Tuple2 value,
ReadOnlyContext ctx, Collector> out) throws Exception {

System.out.println("sampleSignal: " +value.f1.toString());

String Context = ctx.getBroadcastState(ruleDescriptor).toString();

Map incomingRule = new Hashmap<>();

incomingPattern = ctx.getBroadcastState(ruleDescriptor).get(Key);

/*It's hitting nullpointer exception when printing the size of
hashmpa*/
System.out.println("Map Size: " +incomingRule.size());

System.out.println("Context: " +Context);

System.out.println("Before Rule Iterator");

/*I tried below way to print the values in broadcaststream just to
print the values
  in broadcast state it don't print anything*/
for(Map.Entry> rules:
ctx.getBroadcastState(ruleDescriptor).immutableEntries()){
System.out.println("Key: " +rules.getKey());
System.out.println("Value: "+rules.getValue());
}


for(Map.Entry rules: incomingRule.entrySet()){

System.out.println("Key: " +rules.getKey());
System.out.println("Value: "+rules.getValue());
}

out.collect(new Tuple2<>(value.f0,value.f1));

}

@Override
public void processBroadcastElement(Tuple2>
value, Context ctx,
Collector> out) throws Exception {

System.out.println("BroadCastState Key: " +value.f0);
System.out.println("BroadCastState Value: " +value.f1);
ctx.getBroadcastState(ruleDescriptor).put(value.f0,value.f1);

}
}
Below is the IDE terminal output with the exception:

/*Its prints below data in BroadCastState in processBroadcastElement*/
BroadCastState Key: Key
BroadCastState Value: {"RuleKey":"RuleValue"}


/*Its printing below data in processElement*/

sampleSignal: {SignalData}

When it hits the Map in which I am storing the rule name and rule condition,
it throws a NullPointerException; below is the stack trace of the error:

Exception in thread "main"
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
at
org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:146)
at
org.apache.flink.runtime.minicluster.MiniCluster.executeJobBlocking(MiniCluster.java:638)
at
org.apache.flink.streaming.api.environment.LocalStreamEnvironment.execute(LocalStreamEnvironment.java:123)
at com.westpac.itm.eq.pattern.App.main(App.java:34)
Caused by: java.lang.NullPointerException
at
com.westpac.itm.eq.pattern.TestProcess.processElement(TestProcess.java:35)
at
com.westpac.itm.eq.pattern.TestProcess.processElement(TestProcess.java:15)
at
org.apache.flink.streaming.api.operators.co.CoBroadcastWithKeyedOperator.processElement1(CoBroadcastWithKeyedOperator.java:113)
at
org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:238)
Caused by: java.lang.NullPointerException

at
org.apache.flink.streaming.runtime.tasks.TwoInputStreamTask.run(TwoInputStreamTask.java:117)
at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
at java.lang.Thread.run(Thread.java:748)


Please help me solve this issue.

Thanks,
Rahul.


Any python example with json data from Kafka using flink-statefun

2020-06-15 Thread Sunil Sattiraju
Hi, based on the example from
https://github.com/apache/flink-statefun/tree/master/statefun-examples/statefun-python-greeter-example
I am trying to ingest JSON data from Kafka, but I have been unable to achieve this based on
the examples.

event-generator.py

def produce():
    request = {}
    request['id'] = "abc-123"
    request['field1'] = "field1-1"
    request['field2'] = "field2-2"
    request['field3'] = "field3-3"
    if len(sys.argv) == 2:
        delay_seconds = int(sys.argv[1])
    else:
        delay_seconds = 1
    producer = KafkaProducer(bootstrap_servers=[KAFKA_BROKER])
    for request in random_requests_dict():
        producer.send(topic='test-topic',
                      value=json.dumps(request).encode('utf-8'))
        producer.flush()
        time.sleep(delay_seconds)

Below is the proto definition of the JSON data (I don't always know all the
fields, but I know the id field definitely exists).
message.proto

message MyRow {
  string id = 1;
  google.protobuf.Struct message = 2;
}

Below is the greeter that receives the data.
tokenizer.py (same as greeter.py, but saving the state of id instead of counting):


@app.route('/statefun', methods=['POST'])
def handle():
    my_row = MyRow()
    data = my_row.ParseFromString(request.data)  # Is this the right way to do it?
    response_data = handler(request.data)
    response = make_response(response_data)
    response.headers.set('Content-Type', 'application/octet-stream')
    return response


However, below is the error message. I am a newbie with proto and would appreciate
any help.

11:55:17,996 tokenizer ERROR Exception on /statefun [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447, in 
wsgi_app
response = self.full_dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952, in 
full_dispatch_request
rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821, in 
handle_user_exception
reraise(exc_type, exc_value, tb)
  File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39, in 
reraise
raise value
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1950, in 
full_dispatch_request
rv = self.dispatch_request()
  File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1936, in 
dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
  File "/app/tokenizer.py", line 101, in handle
response_data = handler(data)
  File "/usr/local/lib/python3.8/site-packages/statefun/request_reply.py", line 
38, in __call__
request.ParseFromString(request_bytes)
  File "/usr/local/lib/python3.8/site-packages/google/protobuf/message.py", 
line 199, in ParseFromString
return self.MergeFromString(serialized)
  File 
"/usr/local/lib/python3.8/site-packages/google/protobuf/internal/python_message.py",
 line 1131, in MergeFromString
serialized = memoryview(serialized)
TypeError: memoryview: a bytes-like object is required, not 'int'



Re: Any python example with json data from Kafka using flink-statefun

2020-06-15 Thread Igal Shilman
Hi,

The values must be valid, encoded Protobuf messages [1], while in your
attached code snippet you are sending UTF-8 encoded JSON strings.
You can take a look at this example with a generator that produces Protobuf
messages [2][3].

[1] https://developers.google.com/protocol-buffers/docs/pythontutorial
[2]
https://github.com/apache/flink-statefun/blob/8376afa6b064bfa2374eefbda5e149cd490985c0/statefun-examples/statefun-python-greeter-example/generator/event-generator.py#L37
[3]
https://github.com/apache/flink-statefun/blob/8376afa6b064bfa2374eefbda5e149cd490985c0/statefun-examples/statefun-python-greeter-example/greeter/messages.proto#L25
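
For reference, a minimal sketch of what the producer side could look like when the
topic carries Protobuf-encoded values instead of JSON strings. This assumes
message.proto has been compiled with protoc into a message_pb2 module; the function
and variable names are illustrative, not part of the linked example project:

from kafka import KafkaProducer

# Hypothetical module generated by: protoc --python_out=. message.proto
from message_pb2 import MyRow


def produce_protobuf(records, broker, topic='test-topic'):
    producer = KafkaProducer(bootstrap_servers=[broker])
    for record in records:
        row = MyRow()
        row.id = record['id']
        # Copy the remaining free-form fields into the google.protobuf.Struct field.
        row.message.update({k: v for k, v in record.items() if k != 'id'})
        # Send the serialized Protobuf bytes rather than a JSON string.
        producer.send(topic, value=row.SerializeToString())
    producer.flush()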

On Mon, Jun 15, 2020 at 4:25 PM Sunil Sattiraju 
wrote:

> Hi, Based on the example from
> https://github.com/apache/flink-statefun/tree/master/statefun-examples/statefun-python-greeter-example
> I am trying to ingest json data in kafka, but unable to achieve based on
> the examples.
>
> event-generator.py
>
> def produce():
> request = {}
> request['id'] = "abc-123"
> request['field1'] = "field1-1"
> request['field2'] = "field2-2"
> request['field3'] = "field3-3"
> if len(sys.argv) == 2:
> delay_seconds = int(sys.argv[1])
> else:
> delay_seconds = 1
> producer = KafkaProducer(bootstrap_servers=[KAFKA_BROKER])
> for request in random_requests_dict():
> producer.send(topic='test-topic',
>   value=json.dumps(request).encode('utf-8'))
> producer.flush()
> time.sleep(delay_seconds)
>
> Below is the proto definition of the json data ( i dont always know all
> the fields, but i know id fields definitely exists)
> message.proto
>
> message MyRow {
> string id = 1;
> google.protobuf.Struct message = 2;
> }
>
> Below is greeter that received the data
> tokenizer.py ( same like greeter.py saving state of id instead of counting
> )
>
>
> @app.route('/statefun', methods=['POST'])
> def handle():
> my_row = MyRow()
> data = my_row.ParseFromString(request.data) // Is this the right way
> to do it?
> response_data = handler(request.data)
> response = make_response(response_data)
> response.headers.set('Content-Type', 'application/octet-stream')
> return response
>
>
> but, below is the error message. I am a newbie with proto and appreciate
> any help
>
> 11:55:17,996 tokenizer ERROR Exception on /statefun [POST]
> Traceback (most recent call last):
>   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447,
> in wsgi_app
> response = self.full_dispatch_request()
>   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952,
> in full_dispatch_request
> rv = self.handle_user_exception(e)
>   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821,
> in handle_user_exception
> reraise(exc_type, exc_value, tb)
>   File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39,
> in reraise
> raise value
>   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1950,
> in full_dispatch_request
> rv = self.dispatch_request()
>   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1936,
> in dispatch_request
> return self.view_functions[rule.endpoint](**req.view_args)
>   File "/app/tokenizer.py", line 101, in handle
> response_data = handler(data)
>   File "/usr/local/lib/python3.8/site-packages/statefun/request_reply.py",
> line 38, in __call__
> request.ParseFromString(request_bytes)
>   File
> "/usr/local/lib/python3.8/site-packages/google/protobuf/message.py", line
> 199, in ParseFromString
> return self.MergeFromString(serialized)
>   File
> "/usr/local/lib/python3.8/site-packages/google/protobuf/internal/python_message.py",
> line 1131, in MergeFromString
> serialized = memoryview(serialized)
> TypeError: memoryview: a bytes-like object is required, not 'int'
>
>


[jira] [Created] (FLINK-18307) Replace "slave" file name with "workers"

2020-06-15 Thread Stephan Ewen (Jira)
Stephan Ewen created FLINK-18307:


 Summary: Replace "slave" file name with "workers"
 Key: FLINK-18307
 URL: https://issues.apache.org/jira/browse/FLINK-18307
 Project: Flink
  Issue Type: Sub-task
  Components: Deployment / Scripts
Reporter: Stephan Ewen
Assignee: Stephan Ewen
 Fix For: 1.11.0


See parent issue for a discussion of the rationale.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18308) KafkaProducerTestBase->Kafka011ProducerExactlyOnceITCase. testExactlyOnceCustomOperator hangs in Azure

2020-06-15 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-18308:
---

 Summary: KafkaProducerTestBase->Kafka011ProducerExactlyOnceITCase. 
testExactlyOnceCustomOperator hangs in Azure
 Key: FLINK-18308
 URL: https://issues.apache.org/jira/browse/FLINK-18308
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: 1.11.0
Reporter: Andrey Zagrebin


[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3267&view=logs&j=d44f43ce-542c-597d-bf94-b0718c71e5e8&t=03dca39c-73e8-5aaf-601d-328ae5c35f20]

For the last 3.5 hours, the test log ends with about 5000 entries like this:
{code:java}
2020-06-11T11:04:54.3299945Z 11:04:54,328 [FailingIdentityMapper Status 
Printer] INFO  
org.apache.flink.streaming.connectors.kafka.testutils.FailingIdentityMapper [] 
- > Failing mapper  0: count=690, 
totalCount=1000{code}
The problem was observed not on master but in [this 
PR|https://github.com/apache/flink/pull/12596]. The PR is a simple fatal-error-handling 
refactoring in the TM, so it looks unrelated. Another run of 
this PR in my Azure CI 
[passes|https://dev.azure.com/azagrebin/azagrebin/_build/results?buildId=214&view=results].



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18309) Recommend avoiding uppercase to emphasise statements in doc style

2020-06-15 Thread Andrey Zagrebin (Jira)
Andrey Zagrebin created FLINK-18309:
---

 Summary: Recommend avoiding uppercase to emphasise statements in 
doc style
 Key: FLINK-18309
 URL: https://issues.apache.org/jira/browse/FLINK-18309
 Project: Flink
  Issue Type: Improvement
  Components: Project Website
Reporter: Andrey Zagrebin
Assignee: Andrey Zagrebin


Some contributions tend to use uppercase in user docs to highlight and/or 
emphasise statements. For example: "you MUST use the latest version". This 
style may appear somewhat aggressive to users.

Therefore, I suggest adding a recommendation not to use uppercase in user docs. 
We could highlight these statements as note paragraphs or with a less 'shouting' 
style, e.g. italics, to draw the user's attention.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Any python example with json data from Kafka using flink-statefun

2020-06-15 Thread Sunil Sattiraju
Thanks Igal,
I don't have control over the data source inside Kafka (the current Kafka topic 
contains either JSON or Avro formats only; I am trying to reproduce this 
scenario using my test data generator).

Is it possible to convert the JSON to proto at the receiving end of the statefun 
application?
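
A minimal sketch of such a conversion itself (assuming the same hypothetical
message_pb2 module generated from message.proto; where exactly this would plug into
the StateFun runtime is a separate question):

import json

from message_pb2 import MyRow  # hypothetical module generated by protoc


def json_bytes_to_myrow(raw_bytes):
    # Decode the JSON payload coming from the existing topic.
    payload = json.loads(raw_bytes.decode('utf-8'))
    row = MyRow()
    row.id = payload.pop('id')
    # Everything else goes into the free-form google.protobuf.Struct field.
    row.message.update(payload)
    return row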

On 2020/06/15 14:51:01, Igal Shilman  wrote: 
> Hi,
> 
> The values must be valid encoded Protobuf messages [1], while in your
> attached code snippet you are sending utf-8 encoded JSON strings.
> You can take a look at this example with a generator that produces Protobuf
> messages [2][3]
> 
> [1] https://developers.google.com/protocol-buffers/docs/pythontutorial
> [2]
> https://github.com/apache/flink-statefun/blob/8376afa6b064bfa2374eefbda5e149cd490985c0/statefun-examples/statefun-python-greeter-example/generator/event-generator.py#L37
> [3]
> https://github.com/apache/flink-statefun/blob/8376afa6b064bfa2374eefbda5e149cd490985c0/statefun-examples/statefun-python-greeter-example/greeter/messages.proto#L25
> 
> On Mon, Jun 15, 2020 at 4:25 PM Sunil Sattiraju 
> wrote:
> 
> > Hi, Based on the example from
> > https://github.com/apache/flink-statefun/tree/master/statefun-examples/statefun-python-greeter-example
> > I am trying to ingest json data in kafka, but unable to achieve based on
> > the examples.
> >
> > event-generator.py
> >
> > def produce():
> > request = {}
> > request['id'] = "abc-123"
> > request['field1'] = "field1-1"
> > request['field2'] = "field2-2"
> > request['field3'] = "field3-3"
> > if len(sys.argv) == 2:
> > delay_seconds = int(sys.argv[1])
> > else:
> > delay_seconds = 1
> > producer = KafkaProducer(bootstrap_servers=[KAFKA_BROKER])
> > for request in random_requests_dict():
> > producer.send(topic='test-topic',
> >   value=json.dumps(request).encode('utf-8'))
> > producer.flush()
> > time.sleep(delay_seconds)
> >
> > Below is the proto definition of the json data ( i dont always know all
> > the fields, but i know id fields definitely exists)
> > message.proto
> >
> > message MyRow {
> > string id = 1;
> > google.protobuf.Struct message = 2;
> > }
> >
> > Below is greeter that received the data
> > tokenizer.py ( same like greeter.py saving state of id instead of counting
> > )
> >
> >
> > @app.route('/statefun', methods=['POST'])
> > def handle():
> > my_row = MyRow()
> > data = my_row.ParseFromString(request.data) // Is this the right way
> > to do it?
> > response_data = handler(request.data)
> > response = make_response(response_data)
> > response.headers.set('Content-Type', 'application/octet-stream')
> > return response
> >
> >
> > but, below is the error message. I am a newbie with proto and appreciate
> > any help
> >
> > 11:55:17,996 tokenizer ERROR Exception on /statefun [POST]
> > Traceback (most recent call last):
> >   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 2447,
> > in wsgi_app
> > response = self.full_dispatch_request()
> >   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1952,
> > in full_dispatch_request
> > rv = self.handle_user_exception(e)
> >   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1821,
> > in handle_user_exception
> > reraise(exc_type, exc_value, tb)
> >   File "/usr/local/lib/python3.8/site-packages/flask/_compat.py", line 39,
> > in reraise
> > raise value
> >   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1950,
> > in full_dispatch_request
> > rv = self.dispatch_request()
> >   File "/usr/local/lib/python3.8/site-packages/flask/app.py", line 1936,
> > in dispatch_request
> > return self.view_functions[rule.endpoint](**req.view_args)
> >   File "/app/tokenizer.py", line 101, in handle
> > response_data = handler(data)
> >   File "/usr/local/lib/python3.8/site-packages/statefun/request_reply.py",
> > line 38, in __call__
> > request.ParseFromString(request_bytes)
> >   File
> > "/usr/local/lib/python3.8/site-packages/google/protobuf/message.py", line
> > 199, in ParseFromString
> > return self.MergeFromString(serialized)
> >   File
> > "/usr/local/lib/python3.8/site-packages/google/protobuf/internal/python_message.py",
> > line 1131, in MergeFromString
> > serialized = memoryview(serialized)
> > TypeError: memoryview: a bytes-like object is required, not 'int'
> >
> >
> 


Rest handler redirect problem

2020-06-15 Thread Wong Lucent
Hi,


Recently, our team upgraded our Flink cluster from version 1.7 to 1.10, and we 
ran into some problems when calling the Flink REST API.

1) We deploy our Flink cluster in standalone mode on Kubernetes and use two 
JobManagers for HA.

2) We deployed a Kubernetes service for the two JobManagers to provide a 
unified URL.

3) We use the REST API to operate the Flink cluster.

After upgrading to 1.10, we found a difference from 1.7 when 
processing the savepoint status request. For example, if we send a savepoint 
trigger request to the leader JobManager, in 1.7 we can query the standby 
JobManager to get the status of the savepoint, while in 1.10 it will return a 
404 response.

In 1.7 all requests to the standby JobManager were forwarded to the leader in 
"RedirectHandler", while in 1.10 the requests are forwarded via RPC in 
"LeaderRetrievalHandler". But there seems to be an issue in 
"AbstractAsynchronousOperationHandlers": this handler keeps a local 
in-memory cache, "completedOperationCache", that stores the pending savepoint 
operation before the request is redirected to the leader JobManager, and this 
cache does not seem to be synced between the JobManagers. As a result, only the 
JobManager that received the savepoint trigger request can look up the status of 
the savepoint, while the others can only return 404.

As this breaks our design for operating the Flink cluster with the REST API, we 
cannot use a Kubernetes service to hide the standby JobManager any more. We would 
like to know whether this behavior is by design or really a bug.


Thanks and Best Regards
Lucent Wong


[jira] [Created] (FLINK-18310) Failure while parsing reporter interval does not reliably revert back to default

2020-06-15 Thread Chesnay Schepler (Jira)
Chesnay Schepler created FLINK-18310:


 Summary: Failure while parsing reporter interval does not reliably 
revert back to default
 Key: FLINK-18310
 URL: https://issues.apache.org/jira/browse/FLINK-18310
 Project: Flink
  Issue Type: Improvement
  Components: Runtime / Metrics
Affects Versions: 1.10.0
Reporter: Chesnay Schepler
Assignee: Chesnay Schepler
 Fix For: 1.11.0, 1.10.2


The reporter interval is parsed in 2 steps, one for the time amount, one for 
the time unit.
The result of the first step is used regardless of whether the second step 
succeeds.

This means that if you configure "3000 ms", we fall back to the default time 
unit (seconds), but stick to the configured amount (3000).
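
For illustration only (the actual parsing lives in Flink's Java metrics code; this is 
a language-agnostic sketch of the intended fallback behaviour, with an assumed default 
of 10 seconds):
{code}
DEFAULT_INTERVAL = (10, "SECONDS")  # assumed default, purely for illustration

UNIT_TO_SECONDS = {"SECONDS": 1.0, "MILLISECONDS": 0.001, "MINUTES": 60.0}

def parse_reporter_interval(value):
    # Parse amount and unit together: if either part is invalid, revert to the
    # default as a whole instead of keeping a partially parsed amount combined
    # with a defaulted unit.
    try:
        parts = value.strip().split()
        amount = int(parts[0])
        unit = parts[1].upper() if len(parts) > 1 else DEFAULT_INTERVAL[1]
        return amount * UNIT_TO_SECONDS[unit]
    except (ValueError, KeyError, IndexError):
        return DEFAULT_INTERVAL[0] * UNIT_TO_SECONDS[DEFAULT_INTERVAL[1]]
{code}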



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: Rest handler redirect problem

2020-06-15 Thread Chesnay Schepler

I think this is not intentional and simply a case we did not consider.

Please file a JIRA.

On 15/06/2020 19:01, Wong Lucent wrote:

Hi,


Recently, our team upgraged our filnk cluster from version 1.7 to 1.10. And we 
met some problem when calling the flink rest api.

1) We deploy our flink cluster in standlone mode on kubernetes and use two 
Jobmanagers for HA.

2) We deployed a kubernetes service for the two jobmanagers to provide a 
unified url.

3) We use restful api to operate the flink cluster.

Afther upgraded to 1.10,  we found there is some difference between 1.7 when 
processing the savepoint query request. For example, if we send a savepoint 
trigger request to the leader jobmanager, in 1.7 we can query the standby 
jobmanager to get the status of the checkpoint, while in 1.10 it will return a 
404 response.

In 1.7 all the requests to standby Jobmanager will be forward to the leader in "RedirectHandler", while in 
1.10 the requesets will be forward with RPC in "LeaderRetrievalHandler". But there seems a issue in 
"AbstractAsynchronousOperationHandlers", in this handler, there is a local memory cache 
"completedOperationCache" to store the pending savpoint opeartion before redirect the request to the leader 
jobmanager, which seems not synced between all the jobmanagers. This makes only the jobmanager which receive the 
savepoint trigger requset can lookup the status of the savpoint, while the others can only return 404.

As this breaks our design in operating the flink cluster with restful API, we 
cannot use kubernetes service to hide the standby jobmanager any more. We hope 
to know is this behavior by design or it's really a bug?


Thanks and Best Regards
Lucent Wong





Re: Moving in and Modify Dependency Source

2020-06-15 Thread Austin Cawley-Edwards
Hey Robert,

Thanks for getting back to me! Just wasn't sure on the license header
requirements for the CI checks in Flink. Not too experienced with working
with licenses, especially in large open-source projects. Since we would be
using APL 2 (and from this link[1] we need to state changes, include the
copyright, add to a notice file, add to licenses), would I just include
their copyright at the top of the file and then state the changes I've made
there, or somewhere else? Do I need to create a new NOTICE file/ licenses
in the RMQ connector resources or add it to another file?

Sorry for all the questions! ... is there anywhere in the docs that
addresses this?

Best + thanks again,
Austin

[1]: https://tldrlegal.com/license/apache-license-2.0-%28apache-2.0%29

On Mon, Jun 15, 2020 at 4:20 AM Robert Metzger  wrote:

> Hi Austin,
> Thanks for working on the RMQ connector! There seem to be a few users
> affected by that issue.
>
> The GitHub page confirms that users can choose from the three licenses:
> https://github.com/rabbitmq/rabbitmq-java-client#license:
>
> > This means that the user can consider the library to be licensed under
> any
> > of the licenses from the list above. For example, you may choose the
> > Apache Public License 2.0 and include this client into a commercial
> > product. Projects that are licensed under the GPLv2 may choose GPLv2, and
> > so on.
>
>
> Best,
> Robert
>
> On Mon, Jun 15, 2020 at 8:59 AM Till Rohrmann 
> wrote:
>
> > Hi Austin,
> >
> > usually if source code is multi licensed then this means that the user
> can
> > choose the license under which he wants it to use. In our case it would
> be
> > the Apache License version 2. But you should check the license text to
> make
> > sure that this has not been forbidden explicitly.
> >
> > When copying code from another project, the practice is to annotate it
> with
> > a comment stating from where the code was obtained. So in your case you
> > would give these files the ASL license header and add a comment to the
> > source code from where it was copied.
> >
> > Cheers,
> > Till
> >
> > On Sat, Jun 13, 2020 at 10:41 PM Austin Cawley-Edwards <
> > austin.caw...@gmail.com> wrote:
> >
> > > Hi all,
> > >
> > > I'm working on [FLINK-10195] on the RabbitMQ connector which involves
> > > modifying some of the RMQ client source code (that has been moved out
> of
> > > that package) and bringing it into Flink. The RMQ client code is
> > > triple-licensed under Mozilla Public License 1.1 ("MPL"), the GNU
> General
> > > Public License version 2 ("GPL"), and the Apache License version 2
> > ("ASL").
> > >
> > > Does anyone have experience doing something similar/ what I would need
> to
> > > do in terms of the license headers in the Flink source files?
> > >
> > > Thank you,
> > > Austin
> > >
> > > [FLINK-10195]: https://issues.apache.org/jira/browse/FLINK-10195
> > >
> >
>


Re: Moving in and Modify Dependency Source

2020-06-15 Thread Austin Cawley-Edwards
Ah, missed Till's response -- thanks as well!

I'll add those headers to the files, so now I'm just wondering about including
the licenses/notice in the RMQ connector resources.

On Mon, Jun 15, 2020 at 7:40 PM Austin Cawley-Edwards <
austin.caw...@gmail.com> wrote:

> Hey Robert,
>
> Thanks for getting back to me! Just wasn't sure on the license header
> requirements for the CI checks in Flink. Not too experienced with working
> with licenses, especially in large open-source projects. Since we would be
> using APL 2 (and from this link[1] we need to state changes, include the
> copyright, add to a notice file, add to licenses), would I just include
> their copyright at the top of the file and then state the changes I've made
> there, or somewhere else? Do I need to create a new NOTICE file/ licenses
> in the RMQ connector resources or add it to another file?
>
> Sorry for all the questions! ... is there anywhere in the docs that
> addresses this?
>
> Best + thanks again,
> Austin
>
> [1]: https://tldrlegal.com/license/apache-license-2.0-%28apache-2.0%29
>
> On Mon, Jun 15, 2020 at 4:20 AM Robert Metzger 
> wrote:
>
>> Hi Austin,
>> Thanks for working on the RMQ connector! There seem to be a few users
>> affected by that issue.
>>
>> The GitHub page confirms that users can choose from the three licenses:
>> https://github.com/rabbitmq/rabbitmq-java-client#license:
>>
>> > This means that the user can consider the library to be licensed under
>> any
>> > of the licenses from the list above. For example, you may choose the
>> > Apache Public License 2.0 and include this client into a commercial
>> > product. Projects that are licensed under the GPLv2 may choose GPLv2,
>> and
>> > so on.
>>
>>
>> Best,
>> Robert
>>
>> On Mon, Jun 15, 2020 at 8:59 AM Till Rohrmann 
>> wrote:
>>
>> > Hi Austin,
>> >
>> > usually if source code is multi licensed then this means that the user
>> can
>> > choose the license under which he wants it to use. In our case it would
>> be
>> > the Apache License version 2. But you should check the license text to
>> make
>> > sure that this has not been forbidden explicitly.
>> >
>> > When copying code from another project, the practice is to annotate it
>> with
>> > a comment stating from where the code was obtained. So in your case you
>> > would give these files the ASL license header and add a comment to the
>> > source code from where it was copied.
>> >
>> > Cheers,
>> > Till
>> >
>> > On Sat, Jun 13, 2020 at 10:41 PM Austin Cawley-Edwards <
>> > austin.caw...@gmail.com> wrote:
>> >
>> > > Hi all,
>> > >
>> > > I'm working on [FLINK-10195] on the RabbitMQ connector which involves
>> > > modifying some of the RMQ client source code (that has been moved out
>> of
>> > > that package) and bringing it into Flink. The RMQ client code is
>> > > triple-licensed under Mozilla Public License 1.1 ("MPL"), the GNU
>> General
>> > > Public License version 2 ("GPL"), and the Apache License version 2
>> > ("ASL").
>> > >
>> > > Does anyone have experience doing something similar/ what I would
>> need to
>> > > do in terms of the license headers in the Flink source files?
>> > >
>> > > Thank you,
>> > > Austin
>> > >
>> > > [FLINK-10195]: https://issues.apache.org/jira/browse/FLINK-10195
>> > >
>> >
>>
>


[jira] [Created] (FLINK-18311) StreamingKafkaITCase stalls indefinitely

2020-06-15 Thread Dian Fu (Jira)
Dian Fu created FLINK-18311:
---

 Summary: StreamingKafkaITCase stalls indefinitely
 Key: FLINK-18311
 URL: https://issues.apache.org/jira/browse/FLINK-18311
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka, Tests
Affects Versions: 1.11.0, 1.12.0
Reporter: Dian Fu


CI: 
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=3537&view=logs&s=ae4f8708-9994-57d3-c2d7-b892156e7812&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee]

{code}
020-06-15T21:01:59.0792207Z [INFO] 
org.apache.flink:flink-sql-connector-kafka-0.10_2.11:1.11-SNAPSHOT:jar already 
exists in 
/home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/target/dependencies
2020-06-15T21:01:59.0793580Z [INFO] 
org.apache.flink:flink-sql-connector-kafka-0.11_2.11:1.11-SNAPSHOT:jar already 
exists in 
/home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/target/dependencies
2020-06-15T21:01:59.0794931Z [INFO] 
org.apache.flink:flink-sql-connector-kafka_2.11:1.11-SNAPSHOT:jar already 
exists in 
/home/vsts/work/1/s/flink-end-to-end-tests/flink-end-to-end-tests-common-kafka/target/dependencies
2020-06-15T21:01:59.0795686Z [INFO] 
2020-06-15T21:01:59.0796403Z [INFO] --- maven-surefire-plugin:2.22.1:test 
(end-to-end-tests) @ flink-end-to-end-tests-common-kafka ---
2020-06-15T21:01:59.0869911Z [INFO] 
2020-06-15T21:01:59.0871981Z [INFO] 
---
2020-06-15T21:01:59.0874203Z [INFO]  T E S T S
2020-06-15T21:01:59.0875086Z [INFO] 
---
2020-06-15T21:02:00.0134000Z [INFO] Running 
org.apache.flink.tests.util.kafka.StreamingKafkaITCase
2020-06-15T21:45:33.4889677Z ##[error]The operation was canceled.
2020-06-15T21:45:33.4902658Z ##[section]Finishing: Run e2e tests
2020-06-15T21:45:33.5058601Z ##[section]Starting: Cache Maven local repo
2020-06-15T21:45:33.5164621Z 
==
2020-06-15T21:45:33.5164972Z Task : Cache
2020-06-15T21:45:33.5165250Z Description  : Cache files between runs
2020-06-15T21:45:33.5165497Z Version  : 2.0.1
2020-06-15T21:45:33.5165769Z Author   : Microsoft Corporation
2020-06-15T21:45:33.5166079Z Help : https://aka.ms/pipeline-caching-docs
2020-06-15T21:45:33.5166442Z 
==
2020-06-15T21:45:34.0475096Z ##[section]Finishing: Cache Maven local repo
2020-06-15T21:45:34.0502436Z ##[section]Starting: Checkout 
flink-ci/flink-mirror@release-1.11 to s
2020-06-15T21:45:34.0506976Z 
==
2020-06-15T21:45:34.0507297Z Task : Get sources
2020-06-15T21:45:34.0507642Z Description  : Get sources from a repository. 
Supports Git, TfsVC, and SVN repositories.
2020-06-15T21:45:34.0507965Z Version  : 1.0.0
2020-06-15T21:45:34.0508198Z Author   : Microsoft
2020-06-15T21:45:34.0508559Z Help : [More 
Information](https://go.microsoft.com/fwlink/?LinkId=798199)
2020-06-15T21:45:34.0508934Z 
==
2020-06-15T21:45:34.3924966Z Cleaning any cached credential from repository: 
flink-ci/flink-mirror (GitHub)
2020-06-15T21:45:34.3990430Z ##[section]Finishing: Checkout 
flink-ci/flink-mirror@release-1.11 to s
2020-06-15T21:45:34.4049857Z ##[section]Starting: Finalize Job
2020-06-15T21:45:34.4086754Z Cleaning up task key
2020-06-15T21:45:34.4087951Z Start cleaning up orphan processes.
2020-06-15T21:45:34.4481307Z Terminate orphan process: pid (11772) (java)
2020-06-15T21:45:34.4548480Z Terminate orphan process: pid (12132) (java)
2020-06-15T21:45:34.4632331Z Terminate orphan process: pid (30726) (bash)
2020-06-15T21:45:34.4660351Z Terminate orphan process: pid (30728) (bash)
2020-06-15T21:45:34.4710124Z Terminate orphan process: pid (68958) (java)
2020-06-15T21:45:34.4751577Z Terminate orphan process: pid (119102) (java)
2020-06-15T21:45:34.4800161Z Terminate orphan process: pid (129546) (sh)
2020-06-15T21:45:34.4830588Z Terminate orphan process: pid (129548) (java)
2020-06-15T21:45:34.4833955Z ##[section]Finishing: Finalize Job
2020-06-15T21:45:34.4877321Z ##[section]Finishing: e2e_ci
{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18313) Hive dialect doc should mention that views created in Flink cannot be used in Hive

2020-06-15 Thread Rui Li (Jira)
Rui Li created FLINK-18313:
--

 Summary: Hive dialect doc should mention that views created in 
Flink cannot be used in Hive
 Key: FLINK-18313
 URL: https://issues.apache.org/jira/browse/FLINK-18313
 Project: Flink
  Issue Type: Task
  Components: Connectors / Hive, Documentation
Reporter: Rui Li
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18312) SavepointStatusHandler and StaticFileServerHandler not redirect

2020-06-15 Thread Yu Wang (Jira)
Yu Wang created FLINK-18312:
---

 Summary: SavepointStatusHandler and StaticFileServerHandler not 
redirect 
 Key: FLINK-18312
 URL: https://issues.apache.org/jira/browse/FLINK-18312
 Project: Flink
  Issue Type: Bug
  Components: Runtime / REST
Affects Versions: 1.10.0, 1.9.0, 1.8.0
 Environment: 1. Deploy the Flink cluster in standalone mode on Kubernetes 
and use two JobManagers for HA.
2. Deploy a Kubernetes service for the two JobManagers to provide a unified URL.
Reporter: Yu Wang


Savepoint:

1. Deploy the Flink cluster in standalone mode on Kubernetes and use two 
JobManagers for HA.

2. Deploy a Kubernetes service for the two JobManagers to provide a unified URL.

3. Send a savepoint trigger request to the leader JobManager.

4. Query the savepoint status from the leader JobManager; you get a correct response.

5. Query the savepoint status from the standby JobManager; the response will be 404.

JobManager log:

1. Query the log from the leader JobManager; you get the leader's log.

2. Query the log from the standby JobManager; you get the standby's log.

Both of these requests were redirected to the leader in 1.7.

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] SQL Syntax for Table API StatementSet

2020-06-15 Thread Jark Wu
Hi Fabian,

Thanks for starting this discussion. I think this is a very important
syntax to support file mode and multi-statement for SQL Client.
I'm +1 to introduce a syntax to group SQL statements to execute together.

As a reference, traditional database systems also have similar syntax, such
as "START/BEGIN TRANSACTION ... COMMIT" to group statements as a
transaction [1],
and also "BEGIN ... END" [2] [3] to group a set of SQL statements that
execute together.

Maybe we can also use "BEGIN ... END" syntax which is much simpler?

Regarding where to implement, I also prefer to have it in Flink SQL core,
here are some reasons from my side:
1) I think many downstream projects (e.g Zeppelin) will have the same
requirement. It would be better to have it in core instead of reinventing
the wheel by users.
2) Having it in SQL CLI means it is a standard syntax to support statement
set in Flink. So I think it makes sense to have it in core too, otherwise,
it looks like a broken feature.
In 1.10, CREATE VIEW is only supported in SQL CLI, not supported in
TableEnvironment, which confuses many users.
3) Currently, we are moving statement parsing to use sql-parser
(FLINK-17728). Calcite has good support for parsing multi-statements.
It would be tricky to parse multi-statements only in the SQL Client.

Best,
Jark

[1]:
https://docs.microsoft.com/en-us/sql/t-sql/language-elements/begin-transaction-transact-sql?view=sql-server-ver15
[2]:
https://www.sqlservertutorial.net/sql-server-stored-procedures/sql-server-begin-end/
[3]: https://dev.mysql.com/doc/refman/8.0/en/begin-end.html
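
For reference, a rough sketch of the programmatic form such a syntax would map to,
using the FLIP-84 statement set API (shown here via the Python Table API; this assumes
a TableEnvironment named table_env with the referenced tables already registered, and
the table names are purely illustrative):

# Assumes: table_env is an existing pyflink.table.TableEnvironment and the
# tables sink_a, sink_b, source_1, source_2 are already registered.
stmt_set = table_env.create_statement_set()
stmt_set.add_insert_sql("INSERT INTO sink_a SELECT * FROM source_1")
stmt_set.add_insert_sql("INSERT INTO sink_b SELECT * FROM source_2")
# The statements in the set are jointly optimized and submitted as one job.
stmt_set.execute()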

On Mon, 15 Jun 2020 at 20:50, Fabian Hueske  wrote:

> Hi everyone,
>
> FLIP-84 [1] added the concept of a "statement set" to group multiple INSERT
> INTO statements (SQL or Table API) together. The statements in a statement
> set are jointly optimized and executed as a single Flink job.
>
> I would like to start a discussion about a SQL syntax to group multiple
> INSERT INTO statements in a statement set. The use case would be to expose
> the statement set feature to a solely text based client for Flink SQL such
> as Flink's SQL CLI [1].
>
> During the discussion of FLIP-84, we had briefly talked about such a syntax
> [3].
>
> START STATEMENT SET;
> INSERT INTO ... SELECT ...;
> INSERT INTO ... SELECT ...;
> ...
> END STATEMENT SET;
>
> We didn't follow up on this proposal, to keep the focus on the FLIP-84
> Table API changes and to not dive into a discussion about multiline SQL
> query support [4].
>
> While this feature is clearly based on multiple SQL queries, I think it is
> a bit different from what we usually understand as multiline SQL support.
> That's because a statement set ends up to be a single Flink job. Hence,
> there is no need on the Flink side to coordinate the execution of multiple
> jobs (incl. the discussion about blocking or async execution of queries).
> Flink would treat the queries in a STATEMENT SET as a single query.
>
> I would like to start a discussion about supporting the [START|END]
> STATEMENT SET syntax (or a different syntax with equivalent semantics) in
> Flink.
> I don't have a strong preference whether this should be implemented in
> Flink's SQL core or be a purely client side implementation in the CLI
> client. It would be good though to have parser support in Flink for this.
>
> What do others think?
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/sqlClient.html
> [3]
>
> https://docs.google.com/document/d/1ueLjQWRPdLTFB_TReAyhseAX-1N3j4WYWD0F02Uau0E/edit#heading=h.al86t1h4ecuv
> [4]
>
> https://lists.apache.org/thread.html/rf494e227c47010c91583f90eeaf807d3a4c3eb59d105349afd5fdc31%40%3Cdev.flink.apache.org%3E
>


[jira] [Created] (FLINK-18314) There are some problems in docs

2020-06-15 Thread jinxin (Jira)
jinxin created FLINK-18314:
--

 Summary: There are some problems in docs
 Key: FLINK-18314
 URL: https://issues.apache.org/jira/browse/FLINK-18314
 Project: Flink
  Issue Type: Bug
  Components: Documentation
Affects Versions: 1.11.0
Reporter: jinxin


On this page: 
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connectors/kafka.html

the Maven dependency should be flink-connector-kafka-0.11 instead of 
flink-connector-kafka-011, which is missing a `.`.

flink-connector-kafka-010_2.11 has the same problem.

I read the source code; the content of kafka.md is wrong.

On the same page, the DDL option should be `properties.bootstrap.servers` instead of 
`properties.bootstrap.server`.

When I used `properties.bootstrap.server`, I got an exception:

Caused by: org.apache.flink.table.api.ValidationException: One or more required 
options are missing.

Missing required options are:

properties.bootstrap.servers



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18315) Insert into partitioned table can fail with values

2020-06-15 Thread Danny Chen (Jira)
Danny Chen created FLINK-18315:
--

 Summary: Insert into partitioned table can fail with values
 Key: FLINK-18315
 URL: https://issues.apache.org/jira/browse/FLINK-18315
 Project: Flink
  Issue Type: Task
  Components: Table SQL / API
Affects Versions: 1.10.0
Reporter: Danny Chen
 Fix For: 1.11.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (FLINK-18316) Add a dynamic state registration primitive for Stateful Functions

2020-06-15 Thread Tzu-Li (Gordon) Tai (Jira)
Tzu-Li (Gordon) Tai created FLINK-18316:
---

 Summary: Add a dynamic state registration primitive for Stateful 
Functions
 Key: FLINK-18316
 URL: https://issues.apache.org/jira/browse/FLINK-18316
 Project: Flink
  Issue Type: New Feature
  Components: Stateful Functions
Reporter: Tzu-Li (Gordon) Tai
Assignee: Tzu-Li (Gordon) Tai


Currently, using the {{PersistedValue}} / {{PersistedTable}} / 
{{PersistedAppendingBuffer}} primitives, the user can only eagerly define 
states prior to function instance activation using the {{Persisted}} field 
annotation.

We propose to add a primitive that allows them to register states dynamically 
after activation (i.e. during runtime), along the lines of:
{code}
public class MyStateFn implements StatefulFunction {

    @Persisted
    private final PersistedStateProvider provider = new PersistedStateProvider();

    public MyStateFn() {
        PersistedValue valueState = provider.getValue(...);
    }

    @Override
    public void invoke(Context context, Object input) {
        PersistedValue anotherValueState = provider.getValue(...);
    }
}
{code}

Note how you can register state during instantiation (in the constructor) and 
in the invoke method. Both registrations should be picked up by the runtime and 
bound to Flink state.

This will be useful for a few scenarios:
- Could enable us to get rid of eager state spec definitions in the YAML 
modules for remote functions in the future.
- Will allow new state to be registered in remote functions, without shutting 
down the StateFun cluster.
- Moreover, this approach allows us to differentiate which functions have 
dynamic state and which ones have only eager state, which might be handy in the 
future in case there is a need to differentiate.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Request for assignment: FLINK-18119 Fix unlimitedly growing state for ...

2020-06-15 Thread Hyeonseop Lee
Hello,


I have experienced unboundedly growing state in my streaming query in the
Table API and identified that the current implementation of the time-range-bounded
over aggregate function was the cause. I was able to fix it by
modifying a couple of functions in flink-table-runtime-blink.


I am running several streaming applications in production and would like this
fix to be merged into the official Flink. I have provided a detailed issue
statement and fix plan in [FLINK-18119](
https://issues.apache.org/jira/browse/FLINK-18119) but didn't get any reaction
for days. Please consider assigning the ticket to me.


Regards,

Hyeonseop

--
Hyeonseop Lee


Does FlinkKinesisConsumer not retry on NoHttpResponseException?

2020-06-15 Thread Singh Aulakh, Karanpreet KP
Hello!

(Apache Flink1.8 on AWS EMR release label 5.28.x)

Our data source is an AWS Kinesis stream (with 450 shards, if that matters). We 
use the FlinkKinesisConsumer to read the Kinesis stream. Our application 
occasionally (once every couple of days) crashes with a "Target server failed 
to respond" error. The full stack trace is at the bottom.

Looking deeper into the codebase, I found that 
'ProvisionedThroughputExceededException' is the only exception type that is 
retried on (in the connector's consumer code).
1. Why is a transient HTTP response exception not retried by the 
Kinesis connector?
2. Is there a way I can pass in a retry configuration that will retry on these 
errors?

As a side note, we set the following retry configuration -

env.setRestartStrategy(RestartStrategies.failureRateRestart(12,
        org.apache.flink.api.common.time.Time.of(60, TimeUnit.MINUTES),
        org.apache.flink.api.common.time.Time.of(300, TimeUnit.SECONDS)));

Full stack trace of the exception -

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1201)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1147)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:796)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:764)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:738)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:698)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:680)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:544)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:524)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.doInvoke(AmazonKinesisClient.java:2809)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2776)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.invoke(AmazonKinesisClient.java:2765)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.executeGetRecords(AmazonKinesisClient.java:1292)

at 
org.apache.flink.kinesis.shaded.com.amazonaws.services.kinesis.AmazonKinesisClient.getRecords(AmazonKinesisClient.java:1263)

at 
org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxy.getRecords(KinesisProxy.java:250)

at 
org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.getRecords(ShardConsumer.java:400)

at 
org.apache.flink.streaming.connectors.kinesis.internals.ShardConsumer.run(ShardConsumer.java:243)

at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)

at java.util.concurrent.FutureTask.run(FutureTask.java:266)

at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)

at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)

at java.lang.Thread.run(Thread.java:748)

(Shamelessly copy pasting from the stack overflow question I posted 
https://stackoverflow.com/questions/62399248/flinkkinesisconsumer-does-not-retry-on-nohttpresponseexception
 )

--
KP


Re: Request for assignment: FLINK-18119 Fix unlimitedly growing state for ...

2020-06-15 Thread Benchao Li
Hi Hyeonseop,

I'm sorry to hear that you did not get a reaction in time.
We can move to the issue you mentioned for further discussion.

Hyeonseop Lee  wrote on Tue, Jun 16, 2020 at 1:00 PM:

> Hello,
>
>
> I have experienced an unlimitedly growing state in my streaming query in
> table API and identified that the current implementation of time range
> bounded over aggregate function was the cause. I was able to fix it by
> modifying a couple of functions in flink-table-runtime-blink.
>
>
> I am running several streaming applications in production and desiring this
> fix to be merged to the official Flink. I have stated detailed issue
> statements and fix plans in [FLINK-18119](
> https://issues.apache.org/jira/browse/FLINK-18119) but didn't get
> reactions
> for days. Please consider assigning the ticket.
>
>
> Regards,
>
> Hyeonseop
>
> --
> Hyeonseop Lee
>