Starting out with Flink from a Scala background, I would like to use Typesafe-style
configuration, e.g. https://github.com/pureconfig/pureconfig;
however,
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/best_practices.html
recommends setting up:
env.getConfig().setGlobalJobParameters(...)
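For illustration, a minimal sketch of registering a loaded Typesafe Config as global job
parameters (the wrapper class and key handling are assumptions, not taken from the docs):

import java.util.{HashMap => JHashMap, Map => JMap}
import com.typesafe.config.ConfigFactory
import org.apache.flink.api.common.ExecutionConfig
import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
import scala.collection.JavaConverters._

// Hypothetical wrapper exposing flattened Typesafe Config entries as GlobalJobParameters.
class TypesafeConfigParameters(entries: Map[String, String])
    extends ExecutionConfig.GlobalJobParameters {
  override def toMap: JMap[String, String] = {
    val m = new JHashMap[String, String]()
    entries.foreach { case (k, v) => m.put(k, v) }
    m
  }
}

object RegisterConfig {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Load application.conf from the classpath and flatten it to string pairs.
    val flat = ConfigFactory.load().entrySet().asScala
      .map(e => e.getKey -> e.getValue.unwrapped().toString)
      .toMap
    // Make the values visible to all operators (and the web UI) via the execution config.
    env.getConfig.setGlobalJobParameters(new TypesafeConfigParameters(flat))
  }
}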
Hello everyone!
I want to implement a streaming algorithm like Misra-Gries or Space Saving in
Flink. The goal is to maintain the heavy hitters for my (possibly unbounded)
input streams for the whole time my app runs. More precisely, I want to
have a non-stop running task that runs the Space Sa
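For what it's worth, a minimal, non-Flink-specific sketch of the Misra-Gries summary
(the parameter k and the class name are just illustrative) could look like:

import scala.collection.mutable

// Misra-Gries summary keeping at most k-1 counters; counts are under-estimates.
class MisraGries[T](k: Int) {
  private val counters = mutable.Map.empty[T, Long]

  def add(item: T): Unit = {
    if (counters.contains(item)) {
      counters(item) += 1
    } else if (counters.size < k - 1) {
      counters(item) = 1L
    } else {
      // No free counter: decrement everything and drop counters that reach zero.
      counters.keys.toList.foreach { key =>
        counters(key) -= 1
        if (counters(key) == 0L) counters.remove(key)
      }
    }
  }

  // Candidate heavy hitters with their estimated counts.
  def candidates: Map[T, Long] = counters.toMap
}

Inside a long-running Flink job such a summary would typically live in a
RichFlatMapFunction (checkpointed via operator or keyed state), emitting the
candidates periodically.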
Hi,
I am running a Beam pipeline on Flink 1.2 and facing an issue in
restoring a job from a checkpoint. If I modify my Beam pipeline to add a
new operator and try to restore from the externalized checkpoint, I get
the error
java.lang.IllegalStateException: Invalid number of operator
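One thing that usually matters when restoring a modified pipeline is that operators have
stable IDs. A minimal Flink-level sketch (the Beam runner generates these IDs itself, so
this only illustrates the underlying mechanism; the source and names are made up):

import org.apache.flink.streaming.api.scala._

object UidExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    env.socketTextStream("localhost", 9999)  // hypothetical source
      .uid("raw-input")                      // stable id used to match state on restore
      .map(_.toUpperCase)
      .uid("to-upper")
      .print()

    env.execute("uid example")
  }
}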
Would you suggest https://github.com/ottogroup/flink-spector for unit
tests?
Georg Heiler wrote on Wed., 29 Nov 2017 at 22:33:
> Thanks, this sounds like a good idea - can you recommend such a project?
>
> Jörn Franke wrote on Wed., 29 Nov 2017 at 22:30:
>
>> If you want to rea
Thanks, this sounds like a good idea - can you recommend such a project?
Jörn Franke wrote on Wed., 29 Nov 2017 at 22:30:
> If you really want to learn, then I recommend starting with a Flink
> project that contains unit tests and integration tests (maybe augmented
> with https://wiki.
If you really want to learn, then I recommend starting with a Flink project
that contains unit tests and integration tests (maybe augmented with
https://wiki.apache.org/hadoop/HowToDevelopUnitTests to simulate an HDFS cluster
during unit tests). It should also include coverage reporting. These
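As a rough sketch of what such tests could look like (assuming ScalaTest; names and
data are made up):

import org.apache.flink.api.scala._
import org.scalatest.{FlatSpec, Matchers}

class WordCountSpec extends FlatSpec with Matchers {

  // Keep the logic in a named function so it can be unit-tested without a cluster.
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\W+").filter(_.nonEmpty)

  "tokenize" should "split and normalize words" in {
    tokenize("Hello, Flink!") shouldBe Seq("hello", "flink")
  }

  "the batch job" should "count words when run on a local environment" in {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val counts = env.fromElements("a b", "a")
      .flatMap(tokenize(_))
      .map((_, 1))
      .groupBy(0)
      .sum(1)
      .collect()
    counts.toSet shouldBe Set(("a", 2), ("b", 1))
  }
}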
Getting started with Flink / scala, I wonder whether the scala base library
should be excluded as a best practice:
https://github.com/tillrohrmann/flink-project/blob/master/build.sbt#L32
// exclude Scala library from assembly
assemblyOption in assembly := (assemblyOption in
  assembly).value.copy(includeScala = false)
Hi Ebru,
the count() operator is a very simple utility function that calls
execute() internally. If you want to have a more complex pipeline you
can take a look at how our WordCount [0] example works. The general
concept is to emit a 1 for every record and sum the ones in parallel. If
you ne
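A minimal sketch of that pattern on the DataSet API (data and output paths are
placeholders), so several counts can share one program and a single env.execute():

import org.apache.flink.api.scala._

object ManualCount {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    val words       = env.fromElements("a", "b", "a")
    val evenLengths = words.filter(_.length % 2 == 0)

    // Emit a 1 per record and sum instead of calling count(), which would execute immediately.
    val totalCount = words.map(_ => 1L).reduce(_ + _)
    val evenCount  = evenLengths.map(_ => 1L).reduce(_ + _)

    totalCount.writeAsText("/tmp/total-count")  // placeholder output paths
    evenCount.writeAsText("/tmp/even-count")
    env.execute("two counts in one job")
  }
}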
This issue sounds strikingly similar to FLINK-6965.
TL;DR: You must place classes that are loaded during serialization by the Kafka
connector under /lib.
On 29.11.2017 16:15, Timo Walther wrote:
Hi Bart,
usually, this error means that your Maven project configuration is not
correct. Is your custom
Hi Bart,
usually, this error means that your Maven project configuration is not
correct. Is your custom class included in the jar file that you submit
to the cluster?
It might make sense to share your pom.xml with us.
Regards,
Timo
On 11/29/17 at 2:44 PM, Bart Kastermans wrote:
I have a
We also saw issues in the failure detection/quarantining with some Hadoop
versions because of a subtle runtime Netty version conflict. Flink 1.4
shades Flink's / Akka's Netty; in Flink 1.3 you may need to explicitly exclude
the Netty dependency pulled in through Hadoop.
Also, Hadoop version mismatc
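In an sbt build, such an exclusion could look roughly like this (the artifact names
and version are illustrative; check your actual dependency tree first):

// build.sbt sketch: drop the Netty that a Hadoop dependency pulls in transitively.
libraryDependencies += ("org.apache.hadoop" % "hadoop-client" % "2.7.3")
  .exclude("io.netty", "netty")
  .exclude("io.netty", "netty-all")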
Hi all,
We are trying to use more than one count operator on a DataSet, but it executes the
first count and skips the other operations. We also call env.execute().
How can we solve this problem?
-Ebru
Thanks Fabian and Tony for the info. It's very helpful.
Looks like the general approach is to implement a job topology containing
parameterized (CoXXXMapFunction) operators. The user defined parameters
will be ingested using the extra input that the CoXXXMapFunction takes.
Ken
On Wed, Nov 29, 2017 at
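A minimal sketch of that pattern with a CoFlatMapFunction (the streams, types and
threshold semantics are made up for illustration):

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

// Control stream carries parameter updates, data stream carries events.
class ThresholdFilter extends CoFlatMapFunction[Int, Int, Int] {
  // Plain field for brevity; a real job would keep this in Flink state.
  private var threshold: Int = 0

  override def flatMap1(event: Int, out: Collector[Int]): Unit =
    if (event >= threshold) out.collect(event)

  override def flatMap2(newThreshold: Int, out: Collector[Int]): Unit =
    threshold = newThreshold
}

object ParameterizedJob {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events   = env.fromElements(1, 5, 10, 20)  // hypothetical data stream
    val controls = env.fromElements(8)             // hypothetical parameter stream

    // Broadcast the control stream so every parallel instance sees the updates.
    events.connect(controls.broadcast).flatMap(new ThresholdFilter).print()
    env.execute("parameterized operator sketch")
  }
}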
I have a custom serializer for writing to/reading from Kafka. I am setting
this up in main with code as follows:
val kafkaConsumerProps = new Properties()
kafkaConsumerProps.setProperty("bootstrap.servers", kafka_bootstrap)
kafkaConsumerProps.setProperty("group.id",s"normalize-call-even
Hi,
you could also try increasing the heartbeat timeout via
`akka.watch.heartbeat.pause`. Maybe this helps to overcome the GC pauses.
Cheers,
Till
On Wed, Nov 29, 2017 at 12:41 PM, T Obi wrote:
> Warnings from the Datanode did not appear in all timeout cases. They seem
> to be raised only by timeou
Hi all,
I've just come across a FlinkKafkaProducer misconfiguration issue: when a
FlinkKafkaProducer is created without specifying a Kafka partitioner, a
FlinkFixedPartitioner instance is used, and all messages end up in a
single Kafka partition (in case I have a single task manage
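A minimal sketch of a custom partitioner that spreads records instead of the fixed
mapping (the hashing choice is just an example, and the exact producer constructor to
pass it to depends on the connector version):

import org.apache.flink.streaming.connectors.kafka.partitioner.FlinkKafkaPartitioner

// Hash the record across all available partitions instead of pinning each
// sink subtask to a single partition like FlinkFixedPartitioner does.
class HashPartitioner[T] extends FlinkKafkaPartitioner[T] {
  override def partition(record: T, key: Array[Byte], value: Array[Byte],
                         targetTopic: String, partitions: Array[Int]): Int = {
    val idx = math.abs(record.hashCode % partitions.length)
    partitions(idx)
  }
}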
Warnings from the Datanode did not appear in all timeout cases. They seem
to be raised only by timeouts while snapshotting.
We output GC logs on the taskmanagers and found that something kicks off
System.gc() every hour.
So a full GC runs every hour, and it takes about a minute or more
in our cases...
When
Hi Wangsan,
I opened an issue to document the behavior properly in the future
(https://issues.apache.org/jira/browse/FLINK-8169). Basically, both your
event-time and processing-time timestamps should be GMT. We plan to
support offsets for windows in the future
(https://issues.apache.org/jira/
Hi Oriol,
That estimate is not accurate and the whole plan is a bit outdated.
It was based on a time-based release model that the community tried,
but it did not yield the expected results,
so we changed it.
You can follow the release voting for 1.4 in the dev mailing list. And the
archived
Thanks, it helped a lot!
But I've seen on
https://cwiki.apache.org/confluence/display/FLINK/Flink+Release+and+Feature+Plan#FlinkReleaseandFeaturePlan-Flink1.4
that they estimated releasing 1.4 in September. Do you know if it will be
released this year, or might we have to wait longer?
Thanks a
Another example is King's RBEA platform [1] which was built on Flink.
In a nutshell, RBEA runs a single large Flink job, to which users can add
queries that should be computed.
Of course, the query language is restricted because the queries must match
on the structure of the running job.
Hope thi
The monitoring REST interface provides detailed stats about a job, its
tasks, and processing vertices including their start and end time [1].
However, it is not trivial to make sense of the execution times because
Flink uses pipelined shuffles by default.
That means that the execution of multiple
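For example, the per-vertex timings can be fetched roughly like this (host, port and
job id are placeholders):

import scala.io.Source

object JobTimings {
  def main(args: Array[String]): Unit = {
    val jobId = "<job-id>"  // placeholder
    val json  = Source.fromURL(s"http://localhost:8081/jobs/$jobId").mkString
    // Each entry of the "vertices" array carries "start-time", "end-time" and "duration".
    println(json)
  }
}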
Hi Oriol,
As you may have seen from the mailing list, we are currently in the process of
releasing Flink 1.4. This is going
to be a Hadoop-free distribution, which means that it should work with any
Hadoop version, including Hadoop 2.9.0.
Given this, I would recommend trying out the release cand
Hi Timo,
What I am doing is extracting a timestamp field (maybe in string format such as
“2017-11-28 11:00:00”, or a long value based on my current timezone) as the
event-time attribute. So in timestampAndWatermarkAssigner, for the string format
I should parse the date-time string using GMT, and for the long value
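A minimal sketch of such an assigner for the string case (the Event case class, field
names and out-of-orderness bound are assumptions):

import java.text.SimpleDateFormat
import java.util.TimeZone
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.windowing.time.Time

case class Event(eventTime: String, payload: String)

class GmtTimestampExtractor
    extends BoundedOutOfOrdernessTimestampExtractor[Event](Time.seconds(10)) {

  // One formatter per parallel extractor instance, pinned to GMT as suggested above.
  @transient private lazy val fmt = {
    val f = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss")
    f.setTimeZone(TimeZone.getTimeZone("GMT"))
    f
  }

  override def extractTimestamp(e: Event): Long = fmt.parse(e.eventTime).getTime
}

It would be attached with stream.assignTimestampsAndWatermarks(new GmtTimestampExtractor).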
Hi, I'm currently working on designing a data-processing cluster, and one of
the distributed processing tools we want to use is Flink.
As we're building our cluster from scratch, without relying on any Hadoop
distribution such as Hortonworks or Cloudera, we want to use Flink with Hadoop
2.9
Hi Wangsan,
currently the timestamps in Flink SQL do not depend on a timezone. All
calculations happen on the UTC timestamp. This also guarantees that an
input with Timestamp.valueOf("XXX") remains consistent when parsing and
outputting it with toString().
Regards,
Timo
On 11/29/17 at 3:43
Hey Dominik,
yes, we should definitely add this to the docs.
@Nico: You recently updated the Flink S3 setup docs. Would you mind
adding these hints for eu-central-1 from Steve? I think that would be
super helpful!
Best,
Ufuk
On Tue, Nov 28, 2017 at 10:00 PM, Dominik Bruhn wrote:
> Hey Stephan