Re: Java types

2018-01-10 Thread Boris Lublinsky
More questions. In Scala my DataProcessor is defined as: class DataProcessorKeyed extends CoProcessFunction[WineRecord, ModelToServe, Double] with CheckpointedFunction { And it is used as follows: val models = modelsStream.map(ModelToServe.fromByteArray(_)) .flatMap(BadDataHandler[ModelToServe])
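
A minimal Java sketch of the corresponding class declaration, assuming placeholder WineRecord and ModelToServe types in place of the poster's domain classes: in the Java API one extends the abstract CoProcessFunction and implements the CheckpointedFunction interface.

    import org.apache.flink.runtime.state.FunctionInitializationContext;
    import org.apache.flink.runtime.state.FunctionSnapshotContext;
    import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
    import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
    import org.apache.flink.util.Collector;

    // Placeholder domain types standing in for the poster's classes.
    class WineRecord {}
    class ModelToServe {}

    public class DataProcessorKeyed
            extends CoProcessFunction<WineRecord, ModelToServe, Double>
            implements CheckpointedFunction {

        @Override
        public void processElement1(WineRecord record, Context ctx, Collector<Double> out) throws Exception {
            // score the incoming record with the currently served model and emit the result
        }

        @Override
        public void processElement2(ModelToServe model, Context ctx, Collector<Double> out) throws Exception {
            // swap in the newly received model
        }

        @Override
        public void snapshotState(FunctionSnapshotContext context) throws Exception {
            // write the current model into state
        }

        @Override
        public void initializeState(FunctionInitializationContext context) throws Exception {
            // restore the model from state on recovery
        }
    }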

Java types

2018-01-10 Thread Boris Lublinsky
I am trying to convert Scala code (which works fine) to Java. The Scala code is: // create Kafka consumers // Data val dataConsumer = new FlinkKafkaConsumer010[Array[Byte]]( DATA_TOPIC, new ByteArraySchema, dataKafkaProps ) // Model val modelConsumer = new FlinkKafkaConsumer010[Array[Byte]]
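
A hedged Java sketch of the same consumer setup; ByteArraySchema here is a hand-rolled pass-through stand-in (the poster's class isn't shown), and the imports assume Flink 1.4, where the schema interfaces live in org.apache.flink.api.common.serialization (on 1.3 they are under org.apache.flink.streaming.util.serialization).

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.DeserializationSchema;
    import org.apache.flink.api.common.typeinfo.PrimitiveArrayTypeInfo;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

    // Pass-through schema that hands the raw Kafka bytes to the job unchanged.
    class ByteArraySchema implements DeserializationSchema<byte[]> {
        @Override
        public byte[] deserialize(byte[] message) {
            return message;
        }

        @Override
        public boolean isEndOfStream(byte[] nextElement) {
            return false;
        }

        @Override
        public TypeInformation<byte[]> getProducedType() {
            return PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO;
        }
    }

    class KafkaConsumers {
        // topic and properties mirror DATA_TOPIC / dataKafkaProps from the Scala snippet
        static FlinkKafkaConsumer010<byte[]> dataConsumer(String dataTopic, Properties dataKafkaProps) {
            return new FlinkKafkaConsumer010<>(dataTopic, new ByteArraySchema(), dataKafkaProps);
        }
    }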

Re: Anyone got Flink working in EMR with KinesisConnector

2018-01-10 Thread xiatao123
Got the issue fixed after applying the patch from https://github.com/apache/flink/pull/4150 -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Anyone got Flink working in EMR with KinesisConnector

2018-01-10 Thread Tao Xia
Hi All, I ran into an exception after deploying our app in EMR. It seems like the connection to Kinesis failed. Anyone got Flink KinesisConnector working in EMR? Release label: emr-5.11.0 Hadoop distribution: Amazon 2.7.3 Applications: Flink 1.3.2 java.lang.IllegalStateException: Socket not create

Trigger not firing when using BoundedOutOfOrdernessTimestampExtractor

2018-01-10 Thread Jayant Ameta
Hi, When using a BoundedOutOfOrdernessTimestampExtractor, the trigger is not firing. However, the trigger fires when using a custom timestamp extractor with a similar watermark. Sample code below: 1. Assigner as anonymous class, which works fine: AssignerWithPeriodicWatermarks> assigner = new AssignerWi
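
For comparison, a minimal sketch of the built-in extractor, assuming a hypothetical Event type carrying its own event-time field. With periodic assigners the trigger only fires once a watermark past the window end has been emitted, which also depends on the time characteristic being set to event time and on env.getConfig().setAutoWatermarkInterval(...).

    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // Hypothetical event type carrying its own event-time field.
    class Event {
        long eventTime;
    }

    class EventTimestampExtractor extends BoundedOutOfOrdernessTimestampExtractor<Event> {

        EventTimestampExtractor() {
            super(Time.seconds(10)); // tolerate events up to 10 seconds out of order
        }

        @Override
        public long extractTimestamp(Event element) {
            return element.eventTime;
        }
    }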

Re: BucketingSink broken in flink 1.4.0 ?

2018-01-10 Thread Kyle Hamlin
I'm having similar issues after moving from 1.3.2 to 1.4.0. My mailing list thread: BucketingSink doesn't work anymore moving from 1.3.2 to 1.4.0. I'm not ac

Re: Datastream broadcast with KeyBy

2018-01-10 Thread Fabian Hueske
Hi Anuj, connecting a keyed stream and a broadcasted stream is not supported at the moment but there is work in progress [1] to add this functionality for the next release (Flink 1.5.0). Best, Fabian [1] https://issues.apache.org/jira/browse/FLINK-3659 2018-01-10 12:21 GMT+01:00 Piotr Nowojski

Re: BucketingSink broken in flink 1.4.0 ?

2018-01-10 Thread Chesnay Schepler
Your analysis looks correct; the code in question will never properly detect Hadoop file systems. I'll open a JIRA. Your suggestion to replace it with getUnguardedFileSystem() was my first instinct as well. Good job debugging this. On 10.01.2018 14:17, jelmer wrote: Hi I am trying to conver

BucketingSink broken in flink 1.4.0 ?

2018-01-10 Thread jelmer
Hi, I am trying to convert some jobs from Flink 1.3.2 to Flink 1.4.0, but I am running into the issue that the bucketing sink will always try to connect to hdfs://localhost:12345/ instead of the HDFS URL I have specified in the constructor. If I look at the code at https://github.com/apache/flink/
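
For reference, a sketch of the kind of sink construction being described, with a hypothetical namenode address; per the report, on 1.4.0 the sink ends up talking to hdfs://localhost:12345/ regardless of the URL passed to the constructor.

    import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;
    import org.apache.flink.streaming.connectors.fs.bucketing.DateTimeBucketer;

    class SinkFactory {
        static BucketingSink<String> hdfsSink() {
            // Hypothetical namenode address and base path.
            BucketingSink<String> sink = new BucketingSink<>("hdfs://namenode:8020/flink/output");
            sink.setBucketer(new DateTimeBucketer<String>("yyyy-MM-dd--HH"));
            sink.setBatchSize(128 * 1024 * 1024); // roll part files at ~128 MB
            return sink;
        }
    }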

Re: Stream job failed after increasing number retained checkpoints

2018-01-10 Thread Piotr Nowojski
Hi, This Task Manager log suggests that the problem lies on the Job Manager side (no visible gap in the logs, the GC time reported is accumulated, and 31 seconds accumulated over 963 GC collections is a low value). Could you show the Job Manager log itself? Probably it’s the one that’s causing the T

Re: Datastream broadcast with KeyBy

2018-01-10 Thread Piotr Nowojski
Hi, Could you elaborate on the problem that you are having? What exception(s) are you getting? I have tested such a simple example and it seems to be working as expected: DataStreamSource input = env.fromElements(1, 2, 3, 4, 5, 1, 2, 3); DataStreamSource confStream = env.fromE
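
A self-contained sketch along the lines of the test described, assuming the config stream is broadcast and connected to the non-keyed data stream (connecting it to a keyed stream is the unsupported case Fabian mentions above):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.util.Collector;

    public class BroadcastConnectExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<Integer> input = env.fromElements(1, 2, 3, 4, 5, 1, 2, 3);
            DataStream<String> confStream = env.fromElements("conf-a", "conf-b");

            input.connect(confStream.broadcast())
                 .flatMap(new CoFlatMapFunction<Integer, String, String>() {
                     private String currentConf = "none";

                     @Override
                     public void flatMap1(Integer value, Collector<String> out) {
                         // data elements arrive here and see the latest config
                         out.collect(value + " @ " + currentConf);
                     }

                     @Override
                     public void flatMap2(String conf, Collector<String> out) {
                         // broadcast config updates reach every parallel instance
                         currentConf = conf;
                     }
                 })
                 .print();

            env.execute("broadcast-connect-example");
        }
    }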

Re: Stream job failed after increasing number retained checkpoints

2018-01-10 Thread Jose Miguel Tejedor Fernandez
Hi, > I wonder what reason you might have that you ever want such a huge number > of retained checkpoints? The Flink jobs running on the EMR cluster require a checkpoint at midnight. (In our use case we need to sync a loaded delta to a third-party partner with the streamed data.) The delta load t

Re: Queryable State - Count within Time Window

2018-01-10 Thread Velumani Duraisamy
Thank you, Fabian, for the references. This is helpful. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: akka.remote.ShutDownAssociation: Shut down address: akka.tcp://flink@fps-flink-jobmanager:45652

2018-01-10 Thread Till Rohrmann
Hi, I'm actually not sure what's going on there. I suspect that it must have something to do with the local setup. I'm not sure where you submit your job from. If you submitted the job from fps-flink-jobmanager, then this could be the client actor system, which shuts down after submitting the job because y

Re: hadoop-free hdfs config

2018-01-10 Thread Till Rohrmann
Hi Sasha, you're right that if you want to access HDFS from the user code only, it should be possible to use the Hadoop-free Flink version and bundle the Hadoop dependencies with your user code. However, if you want to use Flink's file system state backend as you did, then you have to start the Fli
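
A short sketch of the setup in question, using a hypothetical checkpoint URI; the point is that the hdfs:// scheme is resolved by the Flink runtime itself, so this only works when the Hadoop dependencies are on Flink's own classpath rather than only bundled in the user jar.

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointSetup {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // Hypothetical namenode address; resolving the "hdfs" scheme happens inside
            // the Flink runtime, not inside user code, which is why the Hadoop-free
            // distribution fails here.
            env.setStateBackend(new FsStateBackend("hdfs://namenode:8020/flink/checkpoints"));
            env.enableCheckpointing(60_000); // checkpoint every 60 seconds

            // ... build and execute the job ...
        }
    }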

Re: is it possible to convert "retract" datastream to table

2018-01-10 Thread Timo Walther
Hi Yan, there are no table source interfaces that allow for creating a retract stream directly yet. Such an interface has to be carefully designed because built-in operators assume that only records that have been emitted previously are retracted. However, they are planned for future Flink ve

Re: is it possible to convert "retract" datastream to table

2018-01-10 Thread Fabian Hueske
Hi, Unfortunately, converting a retraction stream into a Table is not supported yet. However, this is definitely on our roadmap and will be added in a future version. Best, Fabian 2018-01-10 1:04 GMT+01:00 Yan Zhou [FDS Science] : > Hi, > > > There are APIs to convert a dynamic table to retract
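
The direction that is supported today is Table to retract stream, sketched below against the Flink 1.4 Java Table API; each emitted record carries a Boolean flag marking it as an accumulate (true) or retract (false) message.

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    class RetractStreamExample {
        static DataStream<Tuple2<Boolean, Row>> toRetract(StreamTableEnvironment tableEnv, Table result) {
            // true = add/update, false = retraction of a previously emitted row
            return tableEnv.toRetractStream(result, Row.class);
        }
    }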

Datastream broadcast with KeyBy

2018-01-10 Thread anujk
Currently we have a Flink pipeline running with Data-Src -> KeyBy -> ProcessFunction. State management (with RocksDB) and timers are working well. Now we have to extend this by having another config stream which we want to broadcast to all process operators. So we wanted to connect the data stream w
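
A compact sketch of the existing KeyBy -> ProcessFunction stage, assuming a hypothetical (key, value) tuple stream; keyed state and timers work as described, and the open question is only how to additionally feed a broadcast config stream into such an operator (see the connect/broadcast sketch under Piotr's reply above).

    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.util.Collector;

    // Counts elements per key and registers a processing-time timer on every update.
    class CountingProcessFunction extends ProcessFunction<Tuple2<String, Long>, String> {

        private transient ValueState<Long> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
        }

        @Override
        public void processElement(Tuple2<String, Long> value, Context ctx, Collector<String> out) throws Exception {
            Long current = count.value();
            count.update(current == null ? 1L : current + 1);
            ctx.timerService().registerProcessingTimeTimer(ctx.timerService().currentProcessingTime() + 60_000);
        }

        @Override
        public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
            out.collect("count so far: " + count.value());
        }
    }

Used as input.keyBy(0).process(new CountingProcessFunction()) on a DataStream<Tuple2<String, Long>>, with the RocksDB state backend configured on the environment.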

Re: Flink 1.4.0 Mapr libs issues

2018-01-10 Thread Fabian Hueske
Great, thanks for reporting back! 2018-01-09 18:05 GMT+01:00 ani.desh1512 : > Hi Fabian, > Thanks a lot for the reply. Setting the classloader.resolve-order > configuration seems to have done the trick. For anybody else having the > same problem, this is the config that I set: > > /*clas
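
The quoted config is cut off, but the key in question is the Flink 1.4 class-loading option; a likely flink-conf.yaml entry (the value is an assumption, not taken from the thread) would be:

    # Flink 1.4 defaults to child-first class loading; switching to parent-first
    # lets the libraries on the cluster classpath win over the ones bundled in the job jar.
    classloader.resolve-order: parent-first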

Re: Custom Partitioning for Keyed Streams

2018-01-10 Thread Piotr Nowojski
Hi, I don’t think it is possible to enforce scheduling of two keys to different nodes, since all of that is based on hashes. For some cases, doing the pre-aggregation step (initial aggregation done before keyBy, which is followed by final aggregation after the keyBy) can be the solution for ha
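
A rough sketch of the pre-aggregation idea for a simple count per key, under the assumption that partial counts are buffered in a local map and flushed every N elements, so that only the partials go through keyBy for the final aggregation:

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.util.Collector;

    // Emits partial counts every FLUSH_SIZE inputs to take pressure off hot keys downstream.
    class PreAggregator extends RichFlatMapFunction<String, Tuple2<String, Long>> {

        private static final int FLUSH_SIZE = 1000;
        private transient Map<String, Long> partials;
        private transient int seen;

        @Override
        public void open(Configuration parameters) {
            partials = new HashMap<>();
            seen = 0;
        }

        @Override
        public void flatMap(String key, Collector<Tuple2<String, Long>> out) {
            partials.merge(key, 1L, Long::sum);
            if (++seen >= FLUSH_SIZE) {
                for (Map.Entry<String, Long> e : partials.entrySet()) {
                    out.collect(Tuple2.of(e.getKey(), e.getValue()));
                }
                partials.clear();
                seen = 0;
            }
        }
    }

Downstream this would be combined as stream.flatMap(new PreAggregator()).keyBy(0).sum(1); note that partials buffered at the moment of a failure are not checkpointed in this naive version.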

Re: Stream job failed after increasing number retained checkpoints

2018-01-10 Thread Stefan Richter
Hi, there is no known limitation in the strict sense, but you might run out of DFS space or Job Manager memory if you keep around a huge number of checkpoints. I wonder what reason you might have that you ever want such a huge number of retained checkpoints? Usually keeping one checkpoint should d
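
For context, the number of retained checkpoints is controlled by a flink-conf.yaml entry along these lines (the value shown is illustrative, not from the thread):

    # Number of completed checkpoints to keep around (default is 1).
    state.checkpoints.num-retained: 10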

Re: What's the meaning of "Registered `TaskManager` at akka://flink/deadLetters " ?

2018-01-10 Thread Piotr Nowojski
Hi, Search both the Job Manager and Task Manager logs for the IP address(es) and port(s) that have timed out. First of all, make sure that the nodes are visible to each other using a simple ping. Afterwards, please check that those timed-out ports are open and not blocked by some firewall (telnet). You