Re: Auto Scaling in Flink

2019-12-03 Thread vino yang
Hi Akash, The key difference between Pravega and Kafka is: Kafka is a messaging system, while Pravega is a streaming system.[1] The official documentation also states their differences on its FAQ page.[2] [1]: https://siliconangle.com/2017/04/17/dell-emc-takes-on-streaming-storage-with-open-

Flink authentication hbase use kerberos

2019-12-03 Thread venn
Hi guys: I wonder whether it works for Flink on YARN, deployed on a Hadoop cluster without authentication, to access HBase deployed on a Kerberos-authenticated Hadoop cluster? If it works, what do I need to do? I have already configured the flink-conf.yaml properties "security.kerberos.login.keytab" and "security.kerbe
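For reference, the Kerberos settings in flink-conf.yaml look roughly like the sketch below. The keytab path and principal are placeholders, and whether this alone is enough for an unauthenticated YARN cluster talking to a Kerberized HBase is exactly the open question of this thread.

```yaml
# Hedged sketch of the flink-conf.yaml properties referenced above;
# the keytab path and principal are placeholders for illustration only.
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
# Only needed for JAAS-based connectors (e.g. ZooKeeper, Kafka), not for HBase itself:
# security.kerberos.login.contexts: Client
```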

Re: Building with Hadoop 3

2019-12-03 Thread vino yang
cc @Chesnay Schepler to answer this question. Foster, Craig wrote on Wed, Dec 4, 2019 at 1:22 AM: > Hi: > > I don’t see a JIRA for Hadoop 3 support. I see a comment on a JIRA here > from a year ago that no one is looking into Hadoop 3 support [1]. Is there > a document or JIRA that now exists which would poi

Re: [DISCUSS] Drop RequiredParameters and OptionType

2019-12-03 Thread vino yang
+1. One concern: these two classes are marked with the `@PublicEvolving` annotation. Shall we mark them with the `@Deprecated` annotation first? Best, Vino Dian Fu wrote on Tue, Dec 3, 2019 at 8:56 PM: > +1 to remove them. It seems that we should also drop the class Option as > it's currently only used in Require

How to explain the latency at different source injection rate?

2019-12-03 Thread Wang, Rui2
Hi there, I am confused by an issue with the end-to-end latency statistics. When I run a benchmark (similar to the Yahoo streaming benchmark, but without Redis; data is generated by Flink and sinks to Kafka), the key parameters are set as below: Task manager (short: tm) number: 3, slots per tm: 3, paral

Re: [ANNOUNCE] Weekly Community Update 2019/48

2019-12-03 Thread Dongwon Kim
Dear community, As Konstantin mentioned, the second Apache Flink meetup in Seoul is coming up [1] and we just finalized the list of speakers. Here I share the talk lineup with a brief introduction. 1. "Mastering Flink's interval join" by Jihye Yeom (Kakao corp.): Kakao corp. recently adopted Apach

Re: Flink 'Job Cluster' mode Ui Access

2019-12-03 Thread Jatin Banger
Hi, I am using the Flink binary directly. I am using this command to deploy it: "$FLINK_HOME/bin/standalone-job.sh" start-foreground --job-classname ${ARGS_FOR_JOB} where ARGS_FOR_JOB contains the job class name and all other necessary details needed by the job. Best rega

Re: Auto Scaling in Flink

2019-12-03 Thread Akash Goel
Hi, If my application is already running on Kafka, do I need to replace it with Pravega, or can Pravega read directly from Kafka? I have also reached out to the Pravega community but am just checking here. Thanks, Akash Goel On Fri, Nov 29, 2019 at 11:14 AM Caizhi Weng wrote: > Hi Akash, > > Fli

Re: Add time attribute column to POJO stream

2019-12-03 Thread Jingsong Lee
Hi Chris, First, FxRate is not a POJO; a POJO should have a constructor without arguments. That way, you can read from a POJO DataStream directly. Second, if you want to get a field from a POJO, please use the get function, like fx.get('currency'); if you have a POJO field, you can use this way to ge
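As a hedged illustration of the POJO requirement Jingsong describes, a minimal FxRate class could look like the sketch below; the field names are assumptions based on the thread, not the original code.

```java
// A minimal sketch of a Flink-recognizable POJO, assuming the FxRate fields
// hinted at in this thread (the actual fields in Chris's code may differ).
// Flink's POJO rules: public class, public no-argument constructor, and
// public fields (or standard getters/setters).
public class FxRate {
    public String currency;
    public double rate;
    public long timestamp;

    public FxRate() {}  // the no-argument constructor Jingsong refers to

    public FxRate(String currency, double rate, long timestamp) {
        this.currency = currency;
        this.rate = rate;
        this.timestamp = timestamp;
    }
}
```

With timestamps and watermarks assigned on the stream, an event-time attribute can then be appended when converting the stream to a table, e.g. via a field expression ending in `.rowtime` in `tableEnv.fromDataStream(...)` (Flink 1.9 string-expression syntax).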

Re: A problem of open in AggregateFunction

2019-12-03 Thread Jingsong Li
Hi Guobao, Looks like this is from the Table/SQL API. You can override public void open(FunctionContext context). It should work. Can you provide more information? Like: - version - which planner - what the problem is (e.g. the open method never being invoked)? Best, Jingsong Lee On Wed, Dec 4, 2019 at 11:09 AM Biao
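For illustration, a table AggregateFunction that overrides open(FunctionContext) to register a metric could look roughly like this sketch (Flink 1.9-era API; the accumulator logic and metric name are made up for the example):

```java
import org.apache.flink.metrics.Counter;
import org.apache.flink.table.functions.AggregateFunction;
import org.apache.flink.table.functions.FunctionContext;

// Hedged sketch: a table/SQL AggregateFunction registering a counter in open().
// The aggregation itself (a plain count) is illustrative only.
public class CountingAgg extends AggregateFunction<Long, CountingAgg.Acc> {

    public static class Acc {
        public long count;
    }

    private transient Counter accumulateCalls;

    @Override
    public void open(FunctionContext context) throws Exception {
        // This is the override Jingsong mentions; it gives access to the metric group.
        accumulateCalls = context.getMetricGroup().counter("accumulateCalls");
    }

    @Override
    public Acc createAccumulator() {
        return new Acc();
    }

    public void accumulate(Acc acc, Long value) {
        accumulateCalls.inc();
        acc.count++;
    }

    @Override
    public Long getValue(Acc acc) {
        return acc.count;
    }
}
```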

Re: A problem of open in AggregateFunction

2019-12-03 Thread Biao Liu
Hi Guobao, Are you using the Table API? I'm not familiar with the Table API, but for the DataStream API, generally speaking a user can do some initialization through the "open" method of a "Rich" function, like "RichAggregateFunction". Thanks, Biao /'bɪ.aʊ/ On Tue, 3 Dec 2019 at 22:44, Guobao Li wrote: > Hi

How does Flink handle backpressure in EMR

2019-12-03 Thread Nguyen, Michael
Hello all, How does Flink handle backpressure (caused by an increase in traffic) in a Flink job when it’s being hosted in an EMR cluster? Does Flink detect the backpressure and auto-scale the EMR cluster to handle the workload and relieve the backpressure? Once the backpressure is gone, then th

Localenvironment jobcluster ha High availability

2019-12-03 Thread Eric HOFFMANN
Hi, I use a job cluster (1 manager and 1 worker) in Kubernetes for a streaming application, and I would like the lightest possible solution. Is it possible to use a LocalEnvironment (manager and worker embedded) and still have HA with ZooKeeper in this mode? I mean Kubernetes will restart the j

Re: Event Timestamp corrupted by timezone

2019-12-03 Thread Lasse Nedergaard
Hi. We have the same challenges. I asked at Flink Forward and it’s a known problem. We input in UTC but Flink outputs in local machine time. We have created a function that converts it back to UTC before collecting to the downstream. Best regards Lasse Nedergaard > Den 3. de
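A rough sketch of the kind of "convert back to UTC" helper Lasse describes is below. It assumes the output Timestamp's getTime() was shifted by the local zone offset; the direction of the correction is an assumption, so verify it against your own output before relying on it.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

// Hedged sketch of a "back to UTC" helper, assuming the Timestamp produced by the
// sink was shifted by the local zone offset; verify the sign for your environment.
public final class UtcFix {
    private UtcFix() {}

    public static long toUtcMillis(Timestamp localShifted) {
        long millis = localShifted.getTime();
        return millis + TimeZone.getDefault().getOffset(millis);
    }
}
```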

Building with Hadoop 3

2019-12-03 Thread Foster, Craig
Hi: I don’t see a JIRA for Hadoop 3 support. I see a comment on a JIRA here from a year ago that no one is looking into Hadoop 3 support [1]. Is there a document or JIRA that now exists which would point to what needs to be done to support Hadoop 3? Right now builds with Hadoop 3 don’t work obvi

Add time attribute column to POJO stream

2019-12-03 Thread Chris Miller
I'm having trouble dealing with a DataStream of POJOs. In particular, when I perform SQL operations on it I can't figure out the syntax for referring to individual fields within the POJO. Below is an example that illustrates the problem and the various approaches I've tried. Can anyone please

A problem of open in AggregateFunction

2019-12-03 Thread Guobao Li
Hi community, I am trying to register a metric in an aggregate UDF by overriding the open function. According to the documentation, the open function can be overridden in order to retrieve the metric group and do the metric registration. But it works only on ScalarFunction, not on AggregateFunction

Event Timestamp corrupted by timezone

2019-12-03 Thread Wojciech Indyk
Hi! I use Flink 1.8 with Scala. I think I've found a problem with event timestamps in the Table API. When I mark my timestamp: Long as .rowtime and then save it back to a stream as sql.Timestamp, I get a wrong .getTime result. The gist for reproduction is here: https://gist.github.com/woj-i/b1dfbb71590b

Re: [DISCUSS] Drop RequiredParameters and OptionType

2019-12-03 Thread Dian Fu
+1 to remove them. It seems that we should also drop the class Option as it's currently only used in RequiredParameters. > On Dec 3, 2019, at 8:34 PM, Robert Metzger wrote: > > +1 on removing it. > > On Tue, Dec 3, 2019 at 12:31 PM Stephan Ewen > wrote: > I just stumbled across

Re: [DISCUSS] Drop RequiredParameters and OptionType

2019-12-03 Thread Robert Metzger
+1 on removing it. On Tue, Dec 3, 2019 at 12:31 PM Stephan Ewen wrote: > I just stumbled across these classes recently and was looking for sample > uses. > No examples or other tests in the code base seem to > use RequiredParameters and OptionType. > > They also seem quite redundant with how Pa

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-12-03 Thread Piotr Nowojski
Hi, Yes, it is only related to **batch** jobs, but not necessarily only to DataSet API jobs. If you are using, for example, the Blink SQL/Table API to process some bounded data streams (tables), it could also be visible/affected there. If not, I would suggest starting a new user mailing list question

Re: Table/SQL API to read and parse JSON, Java.

2019-12-03 Thread Zhenghua Gao
The Kafka connector jar is missing from your classpath. *Best Regards,* *Zhenghua Gao* On Mon, Dec 2, 2019 at 2:14 PM srikanth flink wrote: > Hi there, > > I'm following the link > > to read JSON data from Kafka

[DISCUSS] Drop RequiredParameters and OptionType

2019-12-03 Thread Stephan Ewen
I just stumbled across these classes recently and was looking for sample uses. No examples or other tests in the code base seem to use RequiredParameters and OptionType. They also seem quite redundant with how ParameterTool itself works (tool.getRequired()). Should we drop them, in an attempt to
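For context, the ParameterTool path Stephan refers to already covers the "required parameter" use case with very little code; a minimal sketch (the parameter names "input" and "parallelism" are invented for the example):

```java
import org.apache.flink.api.java.utils.ParameterTool;

// Hedged sketch of the ParameterTool usage alluded to above.
public class ParamsExample {
    public static void main(String[] args) {
        ParameterTool params = ParameterTool.fromArgs(args);
        String input = params.getRequired("input");        // throws if missing
        int parallelism = params.getInt("parallelism", 4);  // optional, with default
        System.out.println("input=" + input + ", parallelism=" + parallelism);
    }
}
```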

Re: [DISCUSSION] Using `DataStreamUtils.reinterpretAsKeyedStream` on a pre-partitioned Kafka topic

2019-12-03 Thread Robin Cassan
Thanks for your answers, we will have a look at adapting the Kafka source to assign the input partitions depending on the assigned key groups. If anyone has already done such a thing I'd love your advice! Cheers Robin On Mon, Dec 2, 2019 at 08:48, Gyula Fóra wrote: > Hi! > > As far as I know,
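For anyone following along, the API in question is used roughly as in the sketch below. It inserts no shuffle: Flink simply trusts that records are already partitioned consistently with its key groups, which is exactly the part this thread wants to enforce on the Kafka source. The element type and key selector here are placeholders.

```java
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;
import org.apache.flink.streaming.api.datastream.KeyedStream;

// Hedged sketch of DataStreamUtils.reinterpretAsKeyedStream on a pre-partitioned
// stream; the Tuple2<String, Long> element type and the key selector are placeholders.
public class ReinterpretSketch {
    public static KeyedStream<Tuple2<String, Long>, String> asKeyed(
            DataStream<Tuple2<String, Long>> prePartitioned) {
        return DataStreamUtils.reinterpretAsKeyedStream(
                prePartitioned,
                new KeySelector<Tuple2<String, Long>, String>() {
                    @Override
                    public String getKey(Tuple2<String, Long> value) {
                        return value.f0; // key by the first tuple field
                    }
                });
    }
}
```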