Failed to submit 0.10.1

2016-02-08 Thread Andrew Ge Wu
Hi All I’m new to flink and come to the step to submit to a remote cluster, and it failed with following message: Association with remote system [akka.tcp://flink@127.0.0.1:61231] has failed, address is now gated for [5000] ms. Reason is: [scala.Option; local class incompatible: stream classde

Re: Failed to submit 0.10.1

2016-02-08 Thread Maximilian Michels
Hi Andrew, It appears that you're using two different versions of the Scala library in your Flink job. Please make sure you use either 2.10 or 2.11 but not both at the same time. Best, Max On Mon, Feb 8, 2016 at 10:30 AM, Andrew Ge Wu wrote: > Hi All > > I’m new to flink and come to the step to

Re: Error while reading binary file

2016-02-08 Thread Till Rohrmann
Hi Saliya, in order to set the file path for the SerializedInputFormat you first have to create it and then explicitly call setFilePath. final SerializedInputFormat inputFormat = new SerializedInputFormat(); inputFormat.setFilePath(PATH_TO_FILE); env.createInput(inputFormat, myTypeInfo); Cheers

Re: Error while reading binary file

2016-02-08 Thread Maximilian Michels
Hi Saliya, Thanks for your question. Flink's type analyzer couldn't extract the type information. You may implement the ResultTypeQueryable interface in your custom source. That way you can manually specify the correct type. If that doesn't help you, could you please share more of the stack trace?

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Maximilian Michels
Hi Nick, SinkFunction just implements user-defined functions on incoming elements. OutputFormat offers more lifecycle methods. Thus it is a more powerful interface. The OutputFormat originally comes from the batch API, whereas the SinkFunction originates from streaming. Those were more separate co

Re: Distribution of sinks among the nodes

2016-02-08 Thread Aljoscha Krettek
Hi, I just merged the new feature, so once this makes it into the 1.0-SNAPSHOT builds you should be able to use: env.setParallelism(4); env .addSource(kafkaSource) .rescale() .map(mapper).setParallelism(16); .rescale() .addSink(kafkaSink); to get your desired behavior. For t

Re: Failed to submit 0.10.1

2016-02-08 Thread Andrew Ge Wu
Thanks Max My local and remote environment are running: Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL And I downloaded binary 2.11(//apache.mirrors.spacedump.net/flink/flink-0.10.1/flink-0.10.1-bin-hadoop27-scala_2.11.tgz), Is there a different version of client lib for scal

Re: Failed to submit 0.10.1

2016-02-08 Thread Andrew Ge Wu
Yes, found a special dependency for 2.11, Thanks! org.apache.flink flink-streaming-java_2.11 ${apache.flink.versin} Andrew > On 08 Feb 2016, at 14:18, Andrew Ge Wu wrote: > > Thanks Max > > My local and remote environment are running: Scala code runner versi

Re: Failed to submit 0.10.1

2016-02-08 Thread Maximilian Michels
You're welcome. As of recent changes, all Maven artifact names are now suffixed with the Scala major version. However, the old artifacts are still available for the snapshot version. I've just pushed an empty flink-streaming-java for 1.0-SNAPSHOT to prevent users from compiling code which would fai

Re: Error while reading binary file

2016-02-08 Thread Saliya Ekanayake
Thank you Till and Max. I'll try the set file path method and let you know. On Feb 8, 2016 5:45 AM, "Maximilian Michels" wrote: > Hi Saliya, > > Thanks for your question. Flink's type analyzer couldn't extract the > type information. You may implement the ResultTypeQueryable interface > in your c

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Robert Metzger
You said earlier that you are using Flink 0.10. The feature is only available in 1.0-SNAPSHOT. On Mon, Feb 8, 2016 at 4:53 PM, Pieter Hameete wrote: > Ive tried setting the yarn.application-master.port property in > flink-conf.yaml to a range suggested in > https://ci.apache.org/projects/flink/f

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
Matter of RTFM eh ;-) thx and sorry for the bother. 2016-02-08 17:06 GMT+01:00 Robert Metzger : > You said earlier that you are using Flink 0.10. The feature is only > available in 1.0-SNAPSHOT. > > On Mon, Feb 8, 2016 at 4:53 PM, Pieter Hameete wrote: > >> Ive tried setting the yarn.application

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
Ive tried setting the yarn.application-master.port property in flink-conf.yaml to a range suggested in https://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html#running-flink-on-yarn-behind-fi rewalls The JobManager does not seem to be picking the property up. Am I setting this

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
After downloading and building the 1.0-SNAPSHOT from the master branch I do run into another problem when starting a YARN cluster. The startup now infinitely loops at the following step: 17:39:12,369 INFO org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider - Failing over to rm2 17:39:

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Nick Dimiduk
In my case, I have my application code that is calling addSink, for which I'm writing a test that needs to use LocalCollectionOutputFormat. Having two separate class hierarchies is not helpful, hence the adapter. Much of this code already exists in the implementation of FileSinkFunction, so the pro

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Maximilian Michels
Changing the class hierarchy would break backwards-compatibility of the API. However, we could add another method to DataStream to easily use OutputFormats in streaming. How did you write your adapter? I came up with the one below. Admittedly, it is sort of a hack but works fine. By the way, you c

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Robert Metzger
Mh, that's weird. Maybe both resource managers are marked as "standby"? Not sure what can cause this issue. Which YARN version are you using? Maybe you need to build Flink against that specific hadoop version yourself. On Mon, Feb 8, 2016 at 5:50 PM, Pieter Hameete wrote: > After downloading an

Re: Error while reading binary file

2016-02-08 Thread Saliya Ekanayake
Till, I am still having trouble getting this to work. Here's my code ( https://github.com/esaliya/flinkit) String binaryFile = "src/main/resources/sample.bin"; SerializedInputFormat sif = new SerializedInputFormat<>(); sif.setFilePath(binaryFile); DataSet ds = env.createInput(sif); System.out.pri

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Nick Dimiduk
On Mon, Feb 8, 2016 at 9:51 AM, Maximilian Michels wrote: > Changing the class hierarchy would break backwards-compatibility of the > API. However, we could add another method to DataStream to easily use > OutputFormats in streaming. > Indeed, that's why I suggested deprecating one and moving to

Re: Flink on YARN: Stuck on "Trying to register at JobManager"

2016-02-08 Thread Pieter Hameete
Solved: indeed it needed to be built for YARN 2.7.1 specifically. Cheers! 2016-02-08 19:13 GMT+01:00 Robert Metzger : > Mh, that's weird. Maybe both resource managers are marked as "standby"? > Not sure what can cause this issue. > > Which YARN version are you using? Maybe you need to build Flink

Kafka partition alignment for event time

2016-02-08 Thread shikhar
My Flink job is doing aggregations on top of event-time based windowing across Kafka partitions. As I have been developing and restarting it, the state for the catch-up periods becomes unreliable -- lots of duplicate emits for time windows already seen before, that I have to discard since my sink c

Re: Error while reading binary file

2016-02-08 Thread Fabian Hueske
Hi, please try to replace DataSet ds = env.createInput(sif); by DataSet ds = env.createInput(sif, ValueTypeInfo.SHORT_VALUE_TYPE_INFO); Best, Fabian 2016-02-08 19:33 GMT+01:00 Saliya Ekanayake : > Till, > > I am still having trouble getting this to work. Here's my code ( > https://github.com/es

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Aljoscha Krettek
Hi, one problem that I see with OutputFormat is that they are not made for a streaming world. By this, I mean that they don’t handle failure well and don’t consider fault-torelant streaming, i.e. exactly once semantics. For example, what would be expected to happen if a job with a FileOutputForm

Re: Error while reading binary file

2016-02-08 Thread Saliya Ekanayake
Thank you, Fabian. It solved the compilation error, but at runtime I get an end-of-file exception. I've put up a sample code with data at Github https://github.com/esaliya/flinkit. The data file is a binary file containing 64 Short values. 02/08/2016 16:01:19 CHAIN DataSource (at main(WordCount.j

Re: Kafka partition alignment for event time

2016-02-08 Thread shikhar
Things make more sense after coming across http://mail-archives.apache.org/mod_mbox/flink-user/201512.mbox/%3CCANC1h_vVUT3BkFFck5wJA2ju_sSenxmE=Fiizr=ds6tbasy...@mail.gmail.com%3E I need to ensure the parallelism is at least the number of partitions. This seems like a gotcha that could be better d

Re: Kafka partition alignment for event time

2016-02-08 Thread Aljoscha Krettek
Hi, what do you mean by this? I think it should also work when setting parallelism to 1. If not, then there is either a problem with Flink or maybe something in the Data. -Aljoscha > On 08 Feb 2016, at 21:43, shikhar wrote: > > Things make more sense after coming across > http://mail-archives.

Re: Kafka partition alignment for event time

2016-02-08 Thread shikhar
Stephan explained in that thread that we're picking the min watermark when doing operations that join streams from multiple sources. If we have m:n partition-source assignment where m>n, the source is going to end up with the max watermark. Having m<=n ensures that the lowest watermark is used. Re

Re: Error while reading binary file

2016-02-08 Thread Fabian Hueske
The SerializedInputFormat extends the BinaryInputFormat which expects a special block-wise encoding and certain metadata fields. It is not suited to read arbitrary binary files such as a file with 64 short values. I suggest to implement a custom input format based on FileInputFormat. Best, Fabian

Re: Error while reading binary file

2016-02-08 Thread Saliya Ekanayake
Thank you, Fabian. I'll try to do it. On Mon, Feb 8, 2016 at 4:37 PM, Fabian Hueske wrote: > The SerializedInputFormat extends the BinaryInputFormat which expects a > special block-wise encoding and certain metadata fields. > It is not suited to read arbitrary binary files such as a file with 64

Re: OutputFormat vs SinkFunction

2016-02-08 Thread Nick Dimiduk
I think this depends on the implementation of the OutputFormat. For instance, an HBase, Cassandra or ES OF will handle most operations as idempotent when the scheme is designed appropriately. You are (rightly) focusing on FileOF's, which also depend on the semantics of their implementation. MR alw