Re: Building with Hadoop 3

2019-12-04 Thread Chesnay Schepler
There's no JIRA and no one actively working on it. I'm not aware of any investigations on the matter; hence the first step would be to just try it out. A flink-shaded artifact isn't a hard requirement; Flink will work with any 2.X Hadoop distribution (provided that there aren't any dependency

Re: Building with Hadoop 3

2019-12-04 Thread Márton Balassi
Wearing my Cloudera hat, I can tell you that we have done this exercise for our distros of the 3.0 and 3.1 Hadoop versions. We have not contributed these back just yet, but we are open to doing so. If the community is interested we can contribute those changes back to flink-shaded and suggest the nece

Re: A problem of open in AggregateFunction

2019-12-04 Thread Guobao Li
Hi, Thanks for the quick response. In fact, I am using a branch forked from version 1.6. I implemented an AggregateFunction UDF which is used in Flink SQL. The idea is to do a groupBy with this UDF and add a metric inside the UDF to track the size of the aggregated array. I have overridden the ope

Re: A problem of open in AggregateFunction

2019-12-04 Thread Biao Liu
Hi Guobao, I just re-checked the DataStream API. There is an interesting restriction on `AggregateFunction`: it cannot be a rich function. There are already several relevant issues [1][2][3]. I guess your scenario might be related to this restriction (I assume you are using table/SQL API)

Re: Accessing fields in a POJO stream

2019-12-04 Thread Chris Miller
Thank you, adding the missing constructor has done the trick! (FWIW: my 'real' code is in Kotlin and I had a data class with no @JvmOverloads or empty secondary constructor). I haven't seen the fx.get('currency') field access syntax anywhere in the documentation, do you happen to know where I

Flink on YARN: Where to install Flink binaries?

2019-12-04 Thread Piper Piper
Hello, I have a YARN/Hadoop 2.7.6 cluster, on which I plan to run Flink in Job mode using: Flink 1.9.1 (with Flink application programs written in Java) Prebundled Hadoop 2.7.5 Question 1: Which Scala version must I choose for the Flink 1.9.1 binary (2.11 or 2.12)? Secondly, I had read a documen

Re: Flink on YARN: Where to install Flink binaries?

2019-12-04 Thread Till Rohrmann
Hi Piper, Answer 1: You should pick the Scala version you are using in your user program. If you don't use Scala at all, then pick 2.11. Answer 2: Flink does not need to be installed on the Yarn nodes. The client is the machine from which you start the Flink cluster. The client machine needs to ha

Re: Building with Hadoop 3

2019-12-04 Thread vino yang
Hi Marton, Thanks for your explanation. Personally, I look forward to your contribution! Best, Vino On Wed, Dec 4, 2019 at 5:15 PM, Márton Balassi wrote: > Wearing my Cloudera hat I can tell you that we have done this exercise for > our distros of the 3.0 and 3.1 Hadoop versions. We have not contributed > t

Re: Flink 'Job Cluster' mode Ui Access

2019-12-04 Thread Chesnay Schepler
hmm...this is quite odd. Let's try to narrow things down a bit. Could you try starting a local cluster (using the same distribution) and checking whether the UI is accessible? Could you also check whether the flink-dist.jar in /lib contains web/index.html? On 04/12/2019 06:02, Jatin Banger
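
The check suggested above (whether the distribution jar contains web/index.html) can also be done programmatically. The following is only a minimal sketch using the JDK's `java.util.zip.ZipFile`; the class name `JarEntryCheck` and the helper `containsEntry` are hypothetical names introduced here, and the jar path is passed as an argument because the exact file name of the flink-dist jar depends on the distribution.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

// Hypothetical helper, not part of Flink: looks for a named entry in a jar/zip file.
final class JarEntryCheck {

    /** Returns true if the archive at jarPath has an entry with exactly this name. */
    static boolean containsEntry(Path jarPath, String entryName) throws IOException {
        try (ZipFile zip = new ZipFile(jarPath.toFile())) {
            // getEntry returns null when no entry with that name exists
            return zip.getEntry(entryName) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the flink-dist jar inside the distribution's lib directory
        Path jar = Paths.get(args[0]);
        System.out.println(containsEntry(jar, "web/index.html"));
    }
}
```

Running it with the path of the jar as the only argument prints `true` when the entry is present and `false` otherwise.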

Re: Flink on YARN: Where to install Flink binaries?

2019-12-04 Thread Piper Piper
Thank you, Till! On Wed, Dec 4, 2019, 5:51 AM Till Rohrmann wrote: > Hi Piper, > > Answer 1: You should pick the Scala version you are using in your user > program. If you don't use Scala at all, then pick 2.11. > Answer 2: Flink does not need to be installed on the Yarn nodes. The > client is t

Re: What happens to a Source's Operator State if it stops being initialized and snapshotted? Accidentally exponential?

2019-12-04 Thread Aaron Levin
Thanks for the clarification. I'll try to find some time to write a reproducible test case and submit a ticket. While it may not be able to delete the non-referenced ones, I'm surprised it's exponentially replicating them, and so it's probably worth documenting in a ticket. On Wed, Nov 27, 2019 at

Need help using AggregateFunction instead of FoldFunction

2019-12-04 Thread devinbost
Hi, In my use case, I am attempting to create a keyedStream (on a string) and then window that stream (which represents keyed JSON objects) with EventTimeSessionWindows (so that I have a separate window for each set of JSON messages, according to the key), and then concatenate the JSON objects by
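
As a rough illustration of the accumulator pattern this question is about, here is a minimal, Flink-free sketch. The local `Aggregator` interface is a hypothetical stand-in that copies the four methods of Flink's `AggregateFunction<IN, ACC, OUT>` (`createAccumulator`, `add`, `getResult`, `merge`); `JsonConcat` and `JsonConcatDemo` are likewise made-up names, and joining the messages into a JSON array string is an assumption, since the message above is cut off before it says how the objects should be concatenated.

```java
import java.util.ArrayList;
import java.util.List;

/** Local stand-in with the same four methods as Flink's AggregateFunction<IN, ACC, OUT>. */
interface Aggregator<IN, ACC, OUT> {
    ACC createAccumulator();
    ACC add(IN value, ACC accumulator);
    OUT getResult(ACC accumulator);
    ACC merge(ACC a, ACC b);
}

/** Collects JSON strings and emits them as one JSON array string (assumed output format). */
final class JsonConcat implements Aggregator<String, List<String>, String> {
    @Override
    public List<String> createAccumulator() {
        return new ArrayList<>();
    }

    @Override
    public List<String> add(String json, List<String> acc) {
        acc.add(json);
        return acc;
    }

    @Override
    public String getResult(List<String> acc) {
        return "[" + String.join(",", acc) + "]";
    }

    @Override
    public List<String> merge(List<String> a, List<String> b) {
        // Session windows can be merged, so two partial accumulators must be combinable.
        a.addAll(b);
        return a;
    }
}

final class JsonConcatDemo {
    public static void main(String[] args) {
        JsonConcat agg = new JsonConcat();
        List<String> acc = agg.createAccumulator();
        acc = agg.add("{\"a\":1}", acc);
        acc = agg.add("{\"b\":2}", acc);
        System.out.println(agg.getResult(acc)); // prints [{"a":1},{"b":2}]
    }
}
```

Unlike a fold, which only needs an initial value and a step function, this shape also requires `merge`, which is what lets it work with merging windows such as session windows.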

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-04 Thread Jark Wu
+1 for dropping, also cc'ed user mailing list. Best, Jark On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf wrote: > Hi Chesnay, > > +1 for dropping. I have not heard from any user using 0.8 or 0.9 for a long > while. > > Cheers, > > Konstantin > > On Wed, Dec 4, 2019 at 1:57 PM Chesnay Schepler

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-04 Thread jincheng sun
+1 for dropping it, and thanks for bringing up this discussion, Chesnay! Best, Jincheng On Thu, Dec 5, 2019 at 10:19 AM, Jark Wu wrote: > +1 for dropping, also cc'ed user mailing list. > > > Best, > Jark > > On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf > wrote: > > > Hi Chesnay, > > > > +1 for dropping. I have not

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-04 Thread vino yang
+1 On Thu, Dec 5, 2019 at 10:26 AM, jincheng sun wrote: > +1 for dropping it, and thanks for bringing up this discussion, Chesnay! > > Best, > Jincheng > > On Thu, Dec 5, 2019 at 10:19 AM, Jark Wu wrote: > >> +1 for dropping, also cc'ed user mailing list. >> >> >> Best, >> Jark >> >> On Thu, 5 Dec 2019 at 03:39, Konstantin Knauf

Re: Need help using AggregateFunction instead of FoldFunction

2019-12-04 Thread vino yang
Hi devinbost, Here are two example links: - the example code in the official documentation [1]; - a StackOverflow answer to a similar question [2]. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/stream/operators/windows.html#aggregatefunction [2]: https://stackove

What S3 Permissions does StreamingFileSink need?

2019-12-04 Thread Li Peng
Hey folks, I'm trying to use StreamingFileSink with S3, using IAM roles for auth. Does anyone know what permissions the role needs on the specified S3 bucket for this to work properly? I've been getting some auth errors, and I suspect I'm missing some permissions: data "aws_iam_policy_document" "s3

Re: [DISCUSS] Drop Kafka 0.8/0.9

2019-12-04 Thread Dian Fu
+1 for dropping them. Just FYI: there was a similar discussion a few months ago [1]. [1] http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/DISCUSS-Drop-older-versions-of-Kafka-Connectors-0-9-0-10-for-Flink-1-10-td29916.html#a29997

Re: Need help using AggregateFunction instead of FoldFunction

2019-12-04 Thread devinbost
Thanks for the help. I was able to make more progress (based on the documentation you provided), but now I'm getting this exception: org.apache.pulsar.client.impl.DefaultBatcherBuilder@3b5fad2d is not serializable. The object probably contains or references non serializable fields. org.apa

Re: Need help using AggregateFunction instead of FoldFunction

2019-12-04 Thread devinbost
It turns out that the exception that I was getting is actually related to Pulsar since I'm using the Pulsar Flink connector. I found the exact issue reported here: https://github.com/apache/pulsar/issues/4721 devinbost wrote > I was able to make more progress (based on the documentation you > pro

Re: Flink 'Job Cluster' mode Ui Access

2019-12-04 Thread Jatin Banger
I have already tried that using "$FLINK_HOME/bin/jobmanager.sh" start-foreground, and the UI comes up fine with it, which means web/index.html is present. On Wed, Dec 4, 2019 at 9:01 PM Chesnay Schepler wrote: > hmm...this is quite odd. > > Let's try to narrow things down a bit. > > Could you try sta

User program failures cause JobManager to be shutdown

2019-12-04 Thread Dongwon Kim
Hi, I tried to run a program by uploading a jar via the Flink UI. When I intentionally pass a wrong parameter to my program, the JobManager dies. Below are all the log messages I can get from the JobManager; it dies as soon as it prints the second line: 2019-12-05 04:47:58,623 WARN > org.apache.flink.runt