async io parallelism

2020-02-21 Thread Alexey Trenikhun
Hello, Let's say, my elements are simple key-value pairs, elements are coming from Kafka, where they were partitioned by "key", then I do processing using KeyedProcessFunction (keyed by same "key"), then I enrich elements using ordered RichAsyncFunction, then output to another KeyedProcessFuncti

StreamingFileSink Not Flushing All Data

2020-02-21 Thread Austin Cawley-Edwards
Hi there, Using Flink 1.9.1, trying to write .tgz files with the StreamingFileSink#BulkWriter. It seems like flushing the output stream doesn't flush all the data written. I've verified I can create valid files using the same APIs and data on their own, so thinking it must be something I'm doing w

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-21 Thread Robert Metzger
Hey, you are right. I'm also seeing this exception now. It was hidden in other log output. The solution to all this confusion is simple: DataStreamUtils.collect() is like an execute(). The stream graph is cleared on each execute(). That's why collect() and then execute() lead to the "no operators

Re: JDBC source running continuously

2020-02-21 Thread Fanbin Bu
Jark, Thank you for the reply. By running continuously, I meant the source operator does not finish after all the data is read. Similar to ContinuousFileMonitoringFunction, I'm thinking of a continuous database monitoring function. The reason for doing this is to enable savepoint for my pipeli

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Fabian Hueske
Congrats Jingsong! Cheers, Fabian Am Fr., 21. Feb. 2020 um 17:49 Uhr schrieb Rong Rong : > Congratulations Jingsong!! > > Cheers, > Rong > > On Fri, Feb 21, 2020 at 8:45 AM Bowen Li wrote: > > > Congrats, Jingsong! > > > > On Fri, Feb 21, 2020 at 7:28 AM Till Rohrmann > > wrote: > > > >> Congr

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Peter Huang
Congrats Jingsong! On Fri, Feb 21, 2020 at 8:49 AM Rong Rong wrote: > Congratulations Jingsong!! > > Cheers, > Rong > > On Fri, Feb 21, 2020 at 8:45 AM Bowen Li wrote: > >> Congrats, Jingsong! >> >> On Fri, Feb 21, 2020 at 7:28 AM Till Rohrmann >> wrote: >> >>> Congratulations Jingsong! >>> >

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Rong Rong
Congratulations Jingsong!! Cheers, Rong On Fri, Feb 21, 2020 at 8:45 AM Bowen Li wrote: > Congrats, Jingsong! > > On Fri, Feb 21, 2020 at 7:28 AM Till Rohrmann > wrote: > >> Congratulations Jingsong! >> >> Cheers, >> Till >> >> On Fri, Feb 21, 2020 at 4:03 PM Yun Gao wrote: >> >>> Congr

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Bowen Li
Congrats, Jingsong! On Fri, Feb 21, 2020 at 7:28 AM Till Rohrmann wrote: > Congratulations Jingsong! > > Cheers, > Till > > On Fri, Feb 21, 2020 at 4:03 PM Yun Gao wrote: > >> Congratulations Jingsong! >> >>Best, >>Yun >> >>

Re: FlinkCEP questions - architecture

2020-02-21 Thread Oytun Tez
Amazing content, thanks for asking and answering. On Fri, Feb 21, 2020 at 5:04 AM Juergen Donnerstag < juergen.donners...@gmail.com> wrote: > thanks a lot > Juergen > > On Mon, Feb 17, 2020 at 11:08 AM Kostas Kloudas > wrote: > >> Hi Juergen, >> >> I will reply to your questions inline. As a gen

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-21 Thread Niels Basjes
I tried this in Flink 1.10.0 : @Test public void experimentalTest() throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); DataStream input = env.fromElements("One", "Two"); //DataStream input = env.addSource(

Re: [DISCUSS] Drop Savepoint Compatibility with Flink 1.2

2020-02-21 Thread Till Rohrmann
+1 for dropping savepoint compatibility with Flink 1.2. Cheers, Till On Thu, Feb 20, 2020 at 6:55 PM Stephan Ewen wrote: > Thank you for the feedback. > > Here is the JIRA issue with some more explanation also about the > background and implications: > https://jira.apache.org/jira/browse/FLINK-

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Till Rohrmann
Congratulations Jingsong! Cheers, Till On Fri, Feb 21, 2020 at 4:03 PM Yun Gao wrote: > Congratulations Jingsong! > >Best, >Yun > > -- > From:Jingsong Li > Send Time:2020 Feb. 21 (Fri.) 21:42 > To:Hequn Cheng > Cc:Y

Re: Emit message at start and end of event time session window

2020-02-21 Thread Till Rohrmann
Hi Manas and Rafi, you are right that when using merging windows, as event time session windows are, Flink requires that any state the Trigger keeps is of type MergingState. This constraint allows the state to be merged whenever two windows get merged. Rafi, you are right. With the curr

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Yun Gao
Congratulations Jingsong! Best, Yun -- From:Jingsong Li Send Time:2020 Feb. 21 (Fri.) 21:42 To:Hequn Cheng Cc:Yang Wang ; Zhijiang ; Zhenghua Gao ; godfrey he ; dev ; user Subject:Re: [ANNOUNCE] Jingsong Lee becomes a Fli

Re: Flink's Either type information

2020-02-21 Thread Yun Gao
Hi Jacopo, Robert, Very sorry for missing the previous email and not responding in time. I think exactly as Robert has pointed out with the example: using an inline anonymous subclass of KeyedBroadcastProcessFunction should not cause the problem. As far as I know, the possible reason

Re: Running Flink Cluster on PaaS

2020-02-21 Thread Yang Wang
> I always wonder what do you guys mean by "Standalone Flink session" or "Standalone Cluster" ... "Standalone Flink session" usually means an empty Flink cluster is started and can accept multiple job submissions from the Flink client or web UI. Even after all the jobs have finished, the session cluster wil

Re: Question: Determining Total Recovery Time

2020-02-21 Thread Arvid Heise
Hi Morgan, sorry for the late reply. In general, that should work. You need to ensure that the same task is processing the same record though. The local copy needs to be state or else the last message would be lost upon restart. Performance will take a hit, but whether that is significant depends on the re

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Jingsong Li
Thanks everyone~ It's my pleasure to be part of the community. I hope I can make a better contribution in future. Best, Jingsong Lee On Fri, Feb 21, 2020 at 2:48 PM Hequn Cheng wrote: > Congratulations Jingsong! Well deserved. > > Best, > Hequn > > On Fri, Feb 21, 2020 at 2:42 PM Yang Wang wr

Re: FsStateBackend vs RocksDBStateBackend

2020-02-21 Thread Robert Metzger
I would try the FsStateBackend in this scenario, as you have enough memory available. On Thu, Jan 30, 2020 at 5:26 PM Ran Zhang wrote: > Hi Gordon, > > Thanks for your reply! Regarding state size - we are at 200-300gb but we > have 120 parallelism which will make each task handle ~2 - 3 gb state

Re: [Flink 1.10] How do I use LocalCollectionOutputFormat now that writeUsingOutputFormat is deprecated?

2020-02-21 Thread Robert Metzger
Hey Niels, This minimal Flink job executes in Flink 1.10: public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); DataStream input = env.addSource(new StringSourceFunction()); List result = new

Re: Flink's Either type information

2020-02-21 Thread Robert Metzger
Hey Jacopo, can you post an example to reproduce the issue? I've tried it, but it worked in this artificial example: MapStateDescriptor state = new MapStateDescriptor<>("test", BasicTypeInfo.STRING_TYPE_INFO, BasicTypeInfo.STRING_TYPE_INFO); DataStream> result = input .map((MapFunction>) val

Re: Tests in FileUtilsTest while building Flink in local

2020-02-21 Thread Andrey Zagrebin
These tests also fail on my Mac. It may be some macOS setup related issue. I created a JIRA ticket for that: https://issues.apache.org/jira/browse/FLINK-16198 > On 20 Feb 2020, at 12:03, Chesnay Schepler wrote: > > Is the stacktrace identical in both tests? > > Did these fail on the command-l

Re: Flink Kafka connector consume from a single kafka partition

2020-02-21 Thread Robert Metzger
Hey Hemant, Are you able to reconstruct the ordering of the event, for example based on time or some sequence number? If so, you could create as many Kafka partitions as you need (for proper load distribution), disregarding any ordering at that point. Then you keyBy your stream in Flink, and order
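The preview above is cut off after "and order", so the following is only a rough sketch of one way the per-key reordering step could look, assuming each event carries a sequence number that starts at 0 for every key. It is plain Java with no Flink APIs; the class name PerKeyReorderBuffer and its fields are hypothetical, and in a real job the two maps would live in keyed state behind a keyBy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical plain-Java stand-in for per-key reordering; no Flink APIs are used. */
final class PerKeyReorderBuffer {
    // Next sequence number expected for each key (assumption: sequences start at 0).
    private final Map<String, Long> nextSeq = new HashMap<>();
    // Out-of-order events waiting for their predecessors, sorted by sequence number.
    private final Map<String, TreeMap<Long, String>> pending = new HashMap<>();

    /** Accepts one event and returns the payloads that can now be emitted in order. */
    List<String> accept(String key, long seq, String payload) {
        long expected = nextSeq.getOrDefault(key, 0L);
        List<String> ready = new ArrayList<>();
        if (seq < expected) {
            // Already emitted (duplicate or late replay): drop it.
            return ready;
        }
        TreeMap<Long, String> buffer = pending.computeIfAbsent(key, k -> new TreeMap<>());
        buffer.put(seq, payload);
        // Drain the buffer for as long as the next expected sequence number is present.
        while (!buffer.isEmpty() && buffer.firstKey() == expected) {
            ready.add(buffer.pollFirstEntry().getValue());
            expected++;
        }
        nextSeq.put(key, expected);
        return ready;
    }
}
```

Because ordering is restored per key after the shuffle, the number of Kafka partitions can be chosen for load distribution alone. The buffer grows while a gap is open, so a real implementation would likely also need a timeout or watermark to give up on missing events.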

Re: AWS Client Builder with default credentials

2020-02-21 Thread Robert Metzger
There are multiple ways of passing configuration parameters to your user defined code in Flink: a) use getRuntimeContext().getUserCodeClassLoader().getResource() to load a config file from your user code jar or the classpath. b) use getRuntimeContext().getExecutionConfig().getGlobalJobParameters(
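As a hedged illustration of option a), the sketch below reads a properties file through a class loader. It is plain Java: the class name UserCodeConfig and the resource names are hypothetical, and it uses the standard ClassLoader.getResourceAsStream() rather than the getResource() call named in the message. Inside a rich function, the class loader argument would be the one returned by getRuntimeContext().getUserCodeClassLoader().

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

/** Hypothetical helper: loads key=value settings bundled with the user code jar. */
final class UserCodeConfig {
    /** Parses a properties stream; a null stream yields empty properties. */
    static Properties parse(InputStream in) throws IOException {
        Properties props = new Properties();
        if (in != null) {
            props.load(in);
        }
        return props;
    }

    /** Looks up resourceName via the given class loader; a missing file yields empty properties. */
    static Properties load(ClassLoader classLoader, String resourceName) throws IOException {
        // try-with-resources tolerates a null resource, so a missing file is not an error here.
        try (InputStream in = classLoader.getResourceAsStream(resourceName)) {
            return parse(in);
        }
    }
}
```

Returning empty properties for a missing file is a design choice of this sketch; failing fast with an exception may be preferable when the configuration is mandatory.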

Re: FlinkCEP questions - architecture

2020-02-21 Thread Juergen Donnerstag
thanks a lot Juergen On Mon, Feb 17, 2020 at 11:08 AM Kostas Kloudas wrote: > Hi Juergen, > > I will reply to your questions inline. As a general comment I would > suggest to also have a look at [3] so that you have an idea of some of > the alternatives. > With that said, here come the answers :

Re: JDBC source running continuously

2020-02-21 Thread Jark Wu
Hi Fanbin, .iterate() is not available on Table API, it's an API of DataStream. Currently, the JDBC source is a bounded source (a snapshot of the table at the execution time), so the job will finish when it processes all the data. Regarding your requirement, "running continuously with JDBC source"

Re: Running Flink Cluster on PaaS

2020-02-21 Thread KristoffSC
Thank you Yang Wang, Regarding [1] and a sentence from that doc. "This page describes deploying a standalone Flink session" I always wonder what do you guys mean by "Standalone Flink session" or "Standalone Cluster" that can be found here [2]. I'm using a Docker with Job Cluster approach, I know