Re: late element and expired state

2019-02-06 Thread Yun Tang
Hi Ajay, From your description, I think watermarks[1], which indicate that all earlier events have arrived, might meet your needs in a way. But this means you should use windows and have event-time in your stream job. If you don't want to introduce the concept of window, I think you can

Re: Exactly Once Guarantees with StreamingFileSink to S3

2019-02-06 Thread Kostas Kloudas
Hi Kaustubh, Your general understanding is correct. In this case though, the sink will call the S3Committer#commitAfterRecovery() method. After failing to commit the MPU, this method checks whether the file is there and whether its length is correct, and if everything is ok (which is the case in yo

graceful shutdown of taskmanager

2019-02-06 Thread Bernd.Winterstein
Hi, Is there a way to gracefully remove a taskmanager from a running cluster? My idea would be to trigger affected jobs to restart via a savepoint on the remaining taskmanagers. Once the taskmanager is idle, it can be stopped without jobs falling back to an older checkpoint. Regards Bern

Re: No resource available error while testing HA

2019-02-06 Thread Gary Yao
Hi Averell, That log file does not look complete. I do not see any INFO level log messages such as [1]. Best, Gary [1] https://github.com/apache/flink/blob/46326ab9181acec53d1e9e7ec8f4a26c672fec31/flink-yarn/src/main/java/org/apache/flink/yarn/YarnResourceManager.java#L544 On Fri, Feb 1, 2019 a

Get nested Rows from Json string

2019-02-06 Thread françois lacombe
Hi all, I currently get a JSON string from my pgsql source with nested objects to be converted into Flink's Row. Nested JSON objects should go in nested Rows. An Avro schema defines the structure my source should conform to. According to this JSON: { "a":"b", "c":"d", "e":{ "f":"g"

Using custom evictor and trigger on Table API

2019-02-06 Thread eastcirclek
Hi all, I’m looking into the Table API for my new project. It looks like a sweet spot between the DataStream API and SQL. However, it doesn’t seem like the expressivity of the Table API equals that of the DataStream API. My previous Flink projects were building simple pipelines using the DataStream API with custom

Re: JDBCAppendTableSink on Data stream

2019-02-06 Thread Stefan Richter
Hi, That should be no problem; for example, the `JDBCAppendTableSinkTest` also uses it with a data stream. Best, Stefan > On 6. Feb 2019, at 07:29, Chirag Dewan wrote: > > Hi, > > In the documentation, the JDBC sink is mentioned as a source on Table > API/stream. > > Can I use the same s

Re: AssertionError: mismatched type $5 TIMESTAMP(3)

2019-02-06 Thread Timo Walther
Hi Chris, the error that you've observed is a bug that might be related to another bug that is not easily solvable. I created an issue for it nevertheless: https://issues.apache.org/jira/browse/FLINK-11543 In general, I think you need to adapt your program in any case. Because you are aggr

Re: H-A Deployment : Job / task manager configuration

2019-02-06 Thread Stefan Richter
Hi, You only need to do the configuration in conf/flink-conf.yaml on the job manager. The configuration will be shipped to the TMs. Best, Stefan > On 5. Feb 2019, at 16:59, bastien dine wrote: > > Hello everyone, > > I would like to know what exactly I need to configure on my job / task >

Re: Exactly Once Guarantees with StreamingFileSink to S3

2019-02-06 Thread Kaustubh Rudrawar
Hi Kostas, Thanks for the response! Yes - I see the commitAfterRecovery being called when a Bucket is restored. I confused myself in thinking that 'onSuccessfulCompletionOfCheckpoint' is called on restore as well, which led me to believe that we were only calling commit and not commitAfterRecovery

Re: AssertionError: mismatched type $5 TIMESTAMP(3)

2019-02-06 Thread Chris Miller
Hi Timo, Thanks for the pointers, bug reports and slides, much appreciated. I'll read up to get a better understanding of the issue and hopefully figure out a more appropriate solution for what I'm trying to achieve. I'll report back if I come up with anything that others might find useful. Reg

Re: Get nested Rows from Json string

2019-02-06 Thread Rong Rong
Hi François, I wasn't exactly sure whether this is a JSON object or a JSON string you are trying to process. For a JSON string, this [1] article might help. For a JSON object, assuming you are trying to convert it into a TableSource and process it using the Table/SQL API, you could probably use the example

Re: Using custom evictor and trigger on Table API

2019-02-06 Thread Rong Rong
Hi Dongwon, There was a previous thread regarding this[1]; unfortunately, this is not supported yet. However, there are some recent development proposals[2,3] to enhance the Table API which might be able to support your use case. -- Rong [1] http://apache-flink-user-mailing-list-archive.2336050.n4.

HA HDFS

2019-02-06 Thread Steven Nelson
I am working on a POC High Availability installation of Flink on top of Kubernetes with HDFS as a data storage location. I am not finding much documentation on doing this, or I am finding the documentation in parts and maybe getting it put together correctly. I think it falls between being an HDFS