ProcessFunction example

2017-03-08 Thread Philippe Caparroy
I think there is an error in the code snippet describing the ProcessFunction time out example : https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/stream/process_function.html @Override public void onTimer(long timestamp, OnTimerContext ctx, Collector> out) throws

Re: Frequent Full GC's in case of FSStateBackend

2017-03-08 Thread saiprasad mishra
Thanks Vinay for the quick reply. Yes rocksdb version is working perfectly without any issues but it needs more planning on the hardware side for the servers running the job As you said and observed FsStateBackend is not useful for large state which does not fit memory, we do have very large stat

Re: Frequent Full GC's in case of FSStateBackend

2017-03-08 Thread vinay patil
Hi Sai, If you are sure that your state will not exceed the memory limit of nodes then you should consider FSStatebackend otherwise you should go for RocksDB What is the configuration of your cluster ? On Mar 9, 2017 7:31 AM, "saiprasad mishra [via Apache Flink User Mailing List archive.]" wrot

Re: Frequent Full GC's in case of FSStateBackend

2017-03-08 Thread saiprasad mishra
Hi All I am also seeing issues with FsStateBackend as it stalls coz of full gc. We have very large state, Does this mean the below doc should not claim that FsStateBackend is encouraged for large state. https://ci.apache.org/projects/flink/flink-docs-release-1.2/ops/state_backends.html#the-fsstat

Re: AWS exception serialization problem

2017-03-08 Thread Bruno Aranda
Hi Stephan, we are running Flink 1.2.0 on Yarn (AWS EMR cluster) On Wed, 8 Mar 2017, 21:41 Stephan Ewen, wrote: > @Bruno: How are you running Flink? On yarn, standalone, mesos, docker? > > On Wed, Mar 8, 2017 at 2:13 PM, Bruno Aranda > wrote: > > Hi, > > We have seen something similar in Flink

Re: Connecting workflows in batch

2017-03-08 Thread Shannon Carey
It may not return for batch jobs, either. See my post http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Job-completion-or-failure-callback-td12123.html In short, if Flink returned an OptimizerPlanEnvironment from your call to ExecutionEnvironment.getExecutionEnvironment, when y

Job completion or failure callback?

2017-03-08 Thread Shannon Carey
Hi, Is there any way we can run a callback on job completion or failure without leaving the client running during job execution? For example, when we submit the job via the web UI the main() method's call to ExecutionEnvironment#execute() appears to by asynchronous with the job execution. Ther

Re: AWS exception serialization problem

2017-03-08 Thread Stephan Ewen
@Bruno: How are you running Flink? On yarn, standalone, mesos, docker? On Wed, Mar 8, 2017 at 2:13 PM, Bruno Aranda wrote: > Hi, > > We have seen something similar in Flink 1.2. We have an operation that > parses some JSON, and when it fails to parse it, we can see the > ClassNotFoundException f

Re: window function not working when control stream broadcast

2017-03-08 Thread Sam Huang
Hi Timo, The window function sinks the data into InfluxDB, and it's not triggered. If I comment the ".timeWindow", and print results after the reduce function, it works Code for window function is here: private static class WindowFunImpl implements WindowFunction { @Override public void a

Re: Flink 1.2 Proper Packaging of flink-table with SBT

2017-03-08 Thread Justin Yan
Hi Timo, Regarding the dependency issue, looking at flink-table's pom.xml, I believe the issue is the dependency on flink-streaming-scala, which then transitively depends on almost all of the core flink modules. If it had not been for the aforementioned JIRA issue about OOM errors, I probably wou

Re: Remove Accumulators at runtime

2017-03-08 Thread PedroMrChaves
Hi, I'm building a system that maintains a set of rules that can be dynamically added/removed. I wanted to count every element that matched each rule in an accumulator ( I have several parallel instances). If the rule is removed so should the accumulator. - Best Regards, Pedro Chaves -- V

Re: Isolate Tasks - Run Distinct Tasks in Different Task Managers

2017-03-08 Thread PedroMrChaves
Thanks for the response. I would like to assure that the map operator is not in the same task manager as the window/apply operator, regardless of the number of slots of each task manager. - Best Regards, Pedro Chaves -- View this message in context: http://apache-flink-user-mailing-list-a

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Ufuk Celebi
OK, the thing is that the JVMs are not shut down when you cancel the task. Therefore no shut down hook is executed when you cancel. You would have to execute bin/stop-cluster.sh to stop the JVM. Does that make sense? On Wed, Mar 8, 2017 at 3:34 PM, Dominik Safaric wrote: > I’m not using YARN bu

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Dominik Safaric
I’m not using YARN but instead of starting the cluster using bin/start-cluster.sh > On 8 Mar 2017, at 15:32, Ufuk Celebi wrote: > > On Wed, Mar 8, 2017 at 3:19 PM, Dominik Safaric > wrote: >> The cluster consists of 4 workers and a master node. > > Are you starting the cluster via bin/start-

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Ufuk Celebi
On Wed, Mar 8, 2017 at 3:19 PM, Dominik Safaric wrote: > The cluster consists of 4 workers and a master node. Are you starting the cluster via bin/start-cluster.sh or are you using YARN etc.?

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Ufuk Celebi
How are you deploying your job? Shutdown hooks are executed when the JVM terminates whereas the cancel command only cancels the Flink job and the JVM process potentially keeps running. For example, running a standalone cluster would keep the JVMs running. On Wed, Mar 8, 2017 at 9:36 AM, Timo Walt

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Dominik Safaric
I’m deploying the job from the master node of the cluster itself using bin/flink run -c . The cluster consists of 4 workers and a master node. Dominik > On 8 Mar 2017, at 15:16, Ufuk Celebi wrote: > > How are you deploying your job? > > Shutdown hooks are executed when the JVM terminates

Re: TTL for State Entries / FLINK-3089

2017-03-08 Thread Ufuk Celebi
Looping in Aljoscha and Kostas who are the expert on this. :-) On Mon, Mar 6, 2017 at 6:06 PM, Johannes Schulte wrote: > Hi, > > I am trying to achieve a stream-to-stream join with big windows and are > searching for a way to clean up state of old keys. I am already using a > RichCoProcessFunctio

Re: Isolate Tasks - Run Distinct Tasks in Different Task Managers

2017-03-08 Thread Ufuk Celebi
Internally, Flink defines through SlotSharingGroup which tasks may share a task manager slot. By configuring each TaskManager to have a single slot and configuring the slot sharing groups accordingly, you can get the desired behaviour. You can specify the slot sharing group for an operator like ma

Re: AWS exception serialization problem

2017-03-08 Thread Bruno Aranda
Hi, We have seen something similar in Flink 1.2. We have an operation that parses some JSON, and when it fails to parse it, we can see the ClassNotFoundException for the relevant exception (in our case JsResultException from the play-json library). The library is indeed in the shaded JAR, otherwis

Re: AWS exception serialization problem

2017-03-08 Thread Tzu-Li (Gordon) Tai
Hi Shannon, Just to clarify: From the error trace, it seems like that the messages fetched from Kafka are serialized `AmazonS3Exception`s, and you’re emitting a stream of `AmazonS3Exception` as records from FlinkKafkaConsumer? Is this correct? If so, I think we should just make sure that the `

Re: Remove Accumulators at runtime

2017-03-08 Thread Ufuk Celebi
Hey Pedro! No, this is not possible. What your use case for this? On Wed, Mar 8, 2017 at 10:52 AM, PedroMrChaves wrote: > Hello, > > We can add an accumulator using the following call: > getRuntimeContext().addAccumulator(NAME, ACCUMULATOR); > > Is there a way to remove the added accumulators at

Re: Starting flink HA cluster with start-cluster.sh

2017-03-08 Thread Ufuk Celebi
Shouldn't the else branch ``` else HIGH_AVAILABILITY=${DEPRECATED_HA} fi ``` set it to `zookeeper`? Of course, the truth is whatever the script execution prints out. ;-) PS Emails like this should either go to the dev list or it's also fine to open an issue and discuss there (and potentially

Remove Accumulators at runtime

2017-03-08 Thread PedroMrChaves
Hello, We can add an accumulator using the following call: getRuntimeContext().addAccumulator(NAME, ACCUMULATOR); Is there a way to remove the added accumulators at runtime? Regards, Pedro Chaves - Best Regards, Pedro Chaves -- View this message in context: http://apache-flink-user-mail

Re: Flink 1.2 Proper Packaging of flink-table with SBT

2017-03-08 Thread Timo Walther
Hi Justin, thank you for reporting your issues. I never tried the Table API with SBT but `flink-table` should not declare dependencies to core modules, this is only done in `test` scope, maybe you have to specify the right scope manually? You are right, the mentioned Jira should be fixed asap,

Isolate Tasks - Run Distinct Tasks in Different Task Managers

2017-03-08 Thread PedroMrChaves
Hello, Assuming that I have the following Job Graph, (Source) -> (map) -> (KeyBy | Window | apply) -> (Sink) Is there a way to assure that the map operator (and all its subtasks) run on a different task manager than the operator (map | window | apply)? This would allow JVM memory isolati

Starting flink HA cluster with start-cluster.sh

2017-03-08 Thread Dawid Wysakowicz
Hi, I've tried to start cluster with HA mode as described in the doc, but with a current state of bin/config.sh I failed. I think there is a bug with configuring the HIGH_AVAILABILITY variable in block (bin/config.sh): if [ -z "${HIGH_AVAILABILITY}" ]; then HIGH_AVAILABILITY=$(readFromConfi

Re: Flink checkpointing gets stuck

2017-03-08 Thread Ufuk Celebi
Added this JIRA https://issues.apache.org/jira/browse/FLINK-5993 to track this. Would be great to comment there if you have any other issues that should be covered in a Azure deployment section. On Tue, Mar 7, 2017 at 7:10 PM, Stephan Ewen wrote: > Great to hear it! > > What do you think about a

Re: Integrate Flink with S3 on EMR cluster

2017-03-08 Thread vinay patil
Hi , @Shannon - I am not facing any issue while writing to S3, was getting NoClassDef errors when reading the file from S3. ''Hadoop File System" - I mean I am using FileSystem class of Hadoop to read the file from S3. @Stephan - I tried with 1.1.4 , was getting the same issue. The easiest way

Re: window function not working when control stream broadcast

2017-03-08 Thread Timo Walther
Hi Sam, could you explain the behavior a bit more? How does the window function behave? Is it not triggered or what is the content? What is the result if you don't use a window function? Timo Am 08/03/17 um 02:59 schrieb Sam Huang: btw, the reduce function works well, I've printed out the

Re: Appropriate State to use to buffer events in ProcessFunction

2017-03-08 Thread Timo Walther
Hi Yassine, have you thought about using a ListState? As far as I know, it keeps at least the insertion order. You could sort it once your trigger event has arrived. If you use a RocksDB as state backend, 100+ GB of state should not be a problem. Have you thought about using Flink's CEP librar

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Timo Walther
Hi Dominik, did you take a look into the logs? Maybe the exception is not shown in the CLI but in the logs. Timo Am 07/03/17 um 23:58 schrieb Dominik Safaric: Hi all, I would appreciate for any help or advice in regard to default Java runtime shutdown hooks and canceling Flink jobs. Namel