Fwd: Decouple Kafka partitions and Flink parallelism for ordered streams

2017-10-13 Thread Sanne de Roever
.sum(1) val output2 = sideOutputStream2.map { (_, 1) }.slotSharingGroup("map2") .keyBy(0) .sum(1) output1.print() output2.print() env.execute("Scala SocketTextStreamWordCount Example") } } On Thu, Oct 12, 2017 at 12:09 PM, S

Decouple Kafka partitions and Flink parallelism for ordered streams

2017-10-11 Thread Sanne de Roever
Hi, Currently we need 75 Kafka partitions per topic and a parallelism of 75 to meet required performance, increasing the partitions and parallelism gives diminished returns Currently the performance is approx. 1500 msg/s per core, having one pipeline (source, map, sink) deployed as one instance p

Re: Flink Kafka producer with a topic per message

2016-12-07 Thread Sanne de Roever
Hi Gordon, Yes, this has been addressed in 1.0.0; and in a very nice way. Thank you. Cheers, Sanne On Wed, Dec 7, 2016 at 11:11 AM, Sanne de Roever wrote: > Hi Gordon, > > Sounds very close, I will have look; thx. > > Cheers, > > Sanne > > On Wed, Dec 7, 2016

Re: Flink Kafka producer with a topic per message

2016-12-07 Thread Sanne de Roever
Or have I > misunderstood what you have in mind? > > Cheers, > Gordon > > > On December 7, 2016 at 5:55:24 PM, Sanne de Roever ( > sanne.de.roe...@gmail.com) wrote: > > The next step would be to determine the impact on the interface of a Sink. > Currently a Kafka sink has one topic, for example: > >

Re: Flink Kafka producer with a topic per message

2016-12-07 Thread Sanne de Roever
such that with each message optionally a key-value map can be passed. This optional key-value map would allow the sink to alter its behavior given the hints in the map. On Wed, Dec 7, 2016 at 10:55 AM, Sanne de Roever wrote: > A first sketch > > Central to this functionality i

Re: Flink Kafka producer with a topic per message

2016-12-07 Thread Sanne de Roever
more control, but less abstraction; it is for advanced applications. Any abstraction attempts will only create less transparency as far as I can see. The contract would not likely work on other queuing providers. On Wed, Dec 7, 2016 at 10:27 AM, Sanne de Roever wrote: > Good questions

Re: Flink Kafka producer with a topic per message

2016-12-07 Thread Sanne de Roever
t; > > > On Tue, Dec 6, 2016 at 2:23 PM, Sanne de Roever > wrote: > >> Hi, >> >> Kafka producer clients for 0.10 allow the following syntax: >> >> producer.send(new ProducerRecord("my-topic", >> Integer.toString(i), Integer.toString(i)

Flink Kafka producer with a topic per message

2016-12-06 Thread Sanne de Roever
Hi, Kafka producer clients for 0.10 allow the following syntax: producer.send(new ProducerRecord("my-topic", Integer.toString(i), Integer.toString(i))); The gist is that one producer can send messages to different topics; it is useful for event routing ao. It makes the creation generic endpoints

Re: Using Flink and Cassandra with Scala

2016-09-30 Thread Sanne de Roever
Thanks Chesnay. Have a good weekend. On Thu, Sep 29, 2016 at 5:03 PM, Chesnay Schepler wrote: > the cassandra sink only supports java tuples and POJO's. > > > On 29.09.2016 16:33, Sanne de Roever wrote: > >> Hi, >> >> Does the Cassandra sink support

Using Flink and Cassandra with Scala

2016-09-29 Thread Sanne de Roever
Hi, Does the Cassandra sink support Scala and case classes? It looks like using Java is at the moment best practice. Cheers, Sanne

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever
ill create cleaner code, easier debugging and allow us to > adjust things as required at any time. > > On 08.04.2016 12:39, Sanne de Roever wrote: > > I forgot to add some extra information, still all tentative. > > Earlier (erm, 14 years ago to be honest), I also worked on a JMX >

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever
echos this sentiment, and proposes the Yammer route: http://www.slideshare.net/NaderGan/cassandra-jmxexpresshow-do-we-monitor-cassandra-using-graphite-leveraging-yammer-codahale-library On Fri, Apr 8, 2016 at 12:12 PM, Sanne de Roever wrote: > Thanks Chesnay. > > Might I make a tenta

Re: Yammer metrics

2016-04-08 Thread Sanne de Roever
JMX as this > effectively covers all use-cases with the help of JMXTrans and similar > tools. > > Reporting to specific systems is something we want to add as well though. > > Regards, > Chesnay Schepler > > > On 08.04.2016 09:32, Sanne de Roever wrote: > > Hi, &g

Yammer metrics

2016-04-08 Thread Sanne de Roever
Hi, I´m looking into setting up monitoring for our (Flink) environment and realized that both Kafka and Cassandra use the yammer metrics library. This library enables the direct export of all metrics to Graphite (and possibly statsd). Does Flink use Yammer metrics? Cheers, Sanne

Re: State in external db (dynamodb)

2016-04-05 Thread Sanne de Roever
FYI Cassandra has a TTL on data: https://docs.datastax.com/en/cql/3.1/cql/cql_using/use_expire_t.html On Wed, Apr 6, 2016 at 7:55 AM, Shannon Carey wrote: > Hi, new Flink user here! > > I found a discussion on user@flink.apache.org about using DynamoDB as a > sink. However, as noted, sinks have