Re: [DISCUSS] Flink 1.6 features

2018-06-05 Thread Rico Bergmann
+1 on K8s integration > On 06.06.2018 at 00:01, Hao Sun wrote: > > adding my vote to K8S Job mode, maybe it is this? > > Smoothen the integration in Container environment, like "Flink as a > > Library", and easier integration with Kubernetes services and other proxies. > > > >> On Mon, J

Re: Flink 1.4.2 in Zeppelin Notebook

2018-04-09 Thread Rico Bergmann
FYI: I finally managed to get the new Flink version running in Zeppelin. Besides adding the parameters mentioned below, you have to build Zeppelin with the scala-2.11 profile and the new Flink version 1.4.2. Best, Rico. On 09.04.2018 at 14:43, Rico Bergmann wrote: > > The error mess
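A build invocation along these lines should match the advice above; the exact profile and property names are an assumption and should be checked against the Zeppelin pom of the release you use:

```shell
# Build Zeppelin against Scala 2.11 and Flink 1.4.2.
# -Pscala-2.11 and -Dflink.version are assumed property/profile names;
# verify them in your Zeppelin checkout before relying on this.
mvn clean package -DskipTests -Pscala-2.11 -Dflink.version=1.4.2
```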

Re: Flink 1.4.2 in Zeppelin Notebook

2018-04-09 Thread Rico Bergmann
you click the > run (flink code) button after making these changes for flink > interpreter config (I assume you restart the interpreter)? > > Regards, > Kedar > > On Mon, Apr 9, 2018 at 12:50 AM, Rico Bergmann <mailto:i...@ricobergmann.de>> wrote: > > Hi. 

Re: Flink 1.4.2 in Zeppelin Notebook

2018-04-09 Thread Rico Bergmann
ee, it is > confusing because the properties named "host" and "port" already available, > but the names of the useful properties are different): > > > Could you please try this and let us know if it works for you? > > Regards, > Kedar > > >> On

Flink 1.4.2 in Zeppelin Notebook

2018-04-06 Thread Dipl.-Inf. Rico Bergmann
Hi! Has someone successfully integrated Flink 1.4.2 into Zeppelin notebook (using Flink in cluster mode, not local mode)? Best, Rico.

Re: Performance Issue

2015-09-24 Thread Rico Bergmann
link program or some external > queue, such as Kafka? > > Cheers, > Aljoscha > >> On Thu, 24 Sep 2015 at 13:47 Rico Bergmann wrote: >> And as side note: >> >> The problem with duplicates seems also to be solved! >> >> Cheers Rico. >>

Re: Performance Issue

2015-09-24 Thread Rico Bergmann
And as a side note: the problem with duplicates also seems to be solved! Cheers Rico. > On 24.09.2015 at 12:21, Rico Bergmann wrote: > > I took a first glance. > > I ran 2 test setups. One with a limited test data generator that outputs > around 200 events per second. I

Re: Performance Issue

2015-09-24 Thread Rico Bergmann
che/flink/pull/1175 >>> >>> It should fix the issues and offer vastly improved performance (up to 50x >>> faster). For now, it supports time windows, but we will support the other >>> cases in the next days. >>> >>> I'll ping you once

Re: Performance Issue

2015-09-24 Thread Rico Bergmann
I'll ping you once it is merged, I'd be curious if it fixes your issue. Sorry > that you ran into this problem... > > Greetings, > Stephan > > >> On Mon, Sep 7, 2015 at 12:00 PM, Rico Bergmann wrote: >> Hi! >> >> While working with g

Re: Performance Issue

2015-09-08 Thread Rico Bergmann
s, are there > many unique keys, do the keys keep evolving, i.e. is it always new and > different keys? > > Cheers, > Aljoscha > >> On Tue, 8 Sep 2015 at 13:44 Rico Bergmann wrote: >> I also see in the TM overview the CPU load is still around 25% although >>

Re: Performance Issue

2015-09-08 Thread Rico Bergmann
Now 10 minutes are over ... I think there must be something wrong with Flink... > On 08.09.2015 at 13:32, Rico Bergmann wrote: > > The marksweep value is very high, the scavenge very low. If this helps ;-) > > > > >> On 08.09.2015 at 11:27, Robert Me

Re: Performance Issue

2015-09-08 Thread Rico Bergmann
On Tue, Sep 8, 2015 at 10:34 AM, Rico Bergmann wrote: >> Where can I find this information? I can see the memory usage and CPU load. >> But where is the information on the GC? >> >> >> >>> On 08.09.2015 at 09:34, Robert Metzger wrote: >>>

Re: Performance Issue

2015-09-08 Thread Rico Bergmann
M spent with garbage collection. > Can you check whether the number of GC calls + the time spent goes up after > 30 minutes? > >> On Tue, Sep 8, 2015 at 8:37 AM, Rico Bergmann wrote: >> Hi! >> >> I also think it's a GC problem. In the KeySelector I don't in

Re: Performance Issue

2015-09-07 Thread Rico Bergmann
it locally and use something like jstat to rule this out. > > cheers Martin > >> On Mon, Sep 7, 2015 at 12:00 PM, Rico Bergmann wrote: >> Hi! >> >> While working with grouping and windowing I encountered a strange behavior. >> I'm doing: >>> dataStream.g

Performance Issue

2015-09-07 Thread Rico Bergmann
Hi! While working with grouping and windowing I encountered a strange behavior. I'm doing: > dataStream.groupBy(KeySelector).window(Time.of(x, > TimeUnit.SECONDS)).mapWindow(toString).flatten() When I run the program containing this snippet it initially outputs data at a rate around 150 events
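The intended semantics of the snippet above (group a window's events by a key, then render each group as one string) can be sketched in plain Java, independent of the Flink API; all names below are illustrative stand-ins, not Flink classes:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class WindowSketch {
    // Groups one time window's events by key and renders each group as a
    // string, mimicking groupBy(KeySelector).window(...).mapWindow(toString).
    static <T> List<String> mapWindow(List<T> windowEvents,
                                      Function<T, String> keySelector) {
        Map<String, List<T>> groups = new LinkedHashMap<>();
        for (T e : windowEvents) {
            groups.computeIfAbsent(keySelector.apply(e),
                                   k -> new ArrayList<>()).add(e);
        }
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, List<T>> g : groups.entrySet()) {
            out.add(g.getKey() + "=" + g.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Key: even vs odd, standing in for the custom KeySelector.
        List<Integer> window = Arrays.asList(1, 2, 3, 4, 5, 6);
        System.out.println(mapWindow(window, e -> e % 2 == 0 ? "even" : "odd"));
        // prints: [odd=[1, 3, 5], even=[2, 4, 6]]
    }
}
```

This only models one window's worth of data; the throughput issue discussed in the thread concerns how Flink's runtime evaluates such windows continuously.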

Re: Duplicates in Flink

2015-09-03 Thread Rico Bergmann
the KafkaSink comes into play? At what point do the > duplicates come up? > >> On Thu, Sep 3, 2015 at 2:09 PM, Rico Bergmann wrote: >> No. I mean the KafkaSink. >> >> A bit more insight to my program: I read from a Kafka topic with >> flinkKafkaConsumer082, the

Re: Duplicates in Flink

2015-09-03 Thread Rico Bergmann
> On Thu, Sep 3, 2015 at 1:10 PM, Rico Bergmann wrote: >> Hi! >> >> Testing it with the current 0.10 snapshot is not easily possible atm >> >> But I deactivated checkpointing in my program and still get duplicates in my >> output. So it seems not only to come

Re: Duplicates in Flink

2015-09-03 Thread Rico Bergmann
ld work well there. > > Could you maybe try running on the 0.10-SNAPSHOT release and see if the > problems persist there? > > Cheers, > Aljoscha > >> On Tue, 1 Sep 2015 at 14:38 Dipl.-Inf. Rico Bergmann >> wrote: >> Hi! >> >> I still have an issue...

Duplicates in Flink

2015-09-01 Thread Dipl.-Inf. Rico Bergmann
Hi! I still have an issue... I was now using 0.9.1 and the new KafkaConnector. But I still get duplicates in my Flink program. Here's the relevant part: final FlinkKafkaConsumer082<String> kafkaSrc = new FlinkKafkaConsumer082<String>( kafkaTopicIn, new SimpleStringSchema(), properties);

Re: Problem with Windowing

2015-09-01 Thread Rico Bergmann
> On Mon, Aug 31, 2015 at 6:40 PM, Matthias J. Sax >> wrote: >> Maybe you could include some log statements in your user code to see >> which parts of the program receive data and which do not. To narrow down >> the problematic part... >> >> On 08/31/2015 06:03 PM

Re: Problem with Windowing

2015-08-31 Thread Rico Bergmann
ode example, the assignment is missing -- but maybe it is just > missing in your email. > > -Matthias > > >> On 08/31/2015 04:38 PM, Dipl.-Inf. Rico Bergmann wrote: >> Hi! >> >> I have a problem that I cannot really track down. I'll try to describe >

Problem with Windowing

2015-08-31 Thread Dipl.-Inf. Rico Bergmann
Hi! I have a problem that I cannot really track down. I'll try to describe the issue. My streaming Flink program computes something. At the end I'm doing the following on my DataStream ds: ds.window(2, TimeUnit.SECONDS) .groupBy(/*custom KeySelector converting input to a String representation*

Re: Flink to ingest from Kafka to HDFS?

2015-08-25 Thread Rico Bergmann
no seek for >>> write. Not sure how to solve this, other than writing to tmp files and >>> copying upon success. >>> >>> Apache Flume must have solved this issue in some way, it may be worth >>> looking into how they solved it. >>>

Re: when use broadcast variable and run on bigdata display this error please help

2015-08-20 Thread Rico Bergmann
Because the broadcasted variable is completely stored at each operator. If you use a hash join, then both inputs can be hash partitioned. This reduces the amount of memory needed for each operator, I think. > On 20.08.2015 at 12:14, hagersaleh wrote: > > why this is not good broadcast v

Re: when use broadcast variable and run on bigdata display this error please help

2015-08-20 Thread Rico Bergmann
As you can see from the exceptions, your broadcast variable is too large to fit into main memory. I think storing that amount of data in a broadcast variable is not the best approach. I would suggest using a DataSet for this instead. > On 20.08.2015 at 11:56, hagersaleh wrote: > > pleas

Re: Flink to ingest from Kafka to HDFS?

2015-08-20 Thread Rico Bergmann
cation about a > completed checkpoint is received, the contents of this file would be moved (or > appended) to the actual destination. > > Do you have any ideas about the rolling files/checkpointing? > >> On Thu, 20 Aug 2015 at 09:44 Rico Bergmann wrote: >> I'm t

Re: Flink to ingest from Kafka to HDFS?

2015-08-20 Thread Rico Bergmann
I'm thinking about implementing this. After looking into the Flink code I would basically subclass FileOutputFormat into, let's say, a KeyedFileOutputFormat that gets an additional KeySelector object. The path in the file system is then suffixed with the string the KeySelector returns. U think thi
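The path-derivation part of this proposal can be sketched in plain Java; KeyedFileOutputFormat is the hypothetical class from the email above, and the helper and key format here are illustrative, not Flink API:

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Function;

public class KeyedPathSketch {
    // Appends the key selector's result to the base output path, as the
    // proposed KeyedFileOutputFormat would do when routing each record.
    static <T> Path outputPathFor(Path basePath, T record,
                                  Function<T, String> keySelector) {
        return basePath.resolve(keySelector.apply(record));
    }

    public static void main(String[] args) {
        Path base = Paths.get("/data/out");
        // Hypothetical key: the record's first character.
        Path p = outputPathFor(base, "alpha", r -> r.substring(0, 1));
        System.out.println(p); // /data/out/a on Unix-style paths
    }
}
```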

Re: Custom Class for state checkpointing

2015-08-19 Thread Rico Bergmann
ly the map, you can just use the HashMap<> as the state. >> >> If you have more data, you can use TupleX, for example: >> >> Tuple2, Long>(myMap, myLong); >> >> >>> On Tue, Aug 18, 2015 at 12:21 PM, Rico Bergmann >>> wrote: >>&
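The advice rests on java.util.HashMap implementing Serializable, so the map can serve directly as checkpoint state. A quick round-trip check in plain Java (no Flink involved; the helper name is illustrative):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;

public class HashMapStateCheck {
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T value) throws Exception {
        // Serialize to bytes, as a state backend would persist the snapshot...
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(value);
        }
        // ...then restore it on recovery.
        try (ObjectInputStream ois = new ObjectInputStream(
                 new ByteArrayInputStream(bos.toByteArray()))) {
            return (T) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        HashMap<String, Long> counts = new HashMap<>();
        counts.put("events", 42L);
        System.out.println(roundTrip(counts)); // prints: {events=42}
    }
}
```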

Re: Custom Class for state checkpointing

2015-08-18 Thread Rico Bergmann
> wrote: >>>> Java's HashMap is serializable. >>>> If it is only the map, you can just use the HashMap<> as the state. >>>> >>>> If you have more data, you can use TupleX, for example: >>>> >>>> Tuple2, Long>

Re: Custom Class for state checkpointing

2015-08-18 Thread Rico Bergmann
tion the return type of > the snapshotState method (the generic parameter of Checkpointed) has to be > Java Serializable. I suspect that is the problem here. This is a limitation > that we plan to lift soon. > > Marton > >> On Tue, Aug 18, 2015 at 11:32 AM, Rico Bergmann
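The limitation can be demonstrated in plain Java: a snapshot type that does not implement Serializable fails exactly at serialization time. The class name here is illustrative, not from the thread:

```java
import java.io.ByteArrayOutputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class StateSerializabilityDemo {
    // Hypothetical custom state class that forgot to implement Serializable.
    static class CustomState {
        long counter = 7;
    }

    public static void main(String[] args) throws Exception {
        try (ObjectOutputStream oos =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            // Throws NotSerializableException, mirroring the failure a
            // checkpoint of such state would hit.
            oos.writeObject(new CustomState());
        } catch (NotSerializableException e) {
            System.out.println("cannot snapshot: " + e.getMessage());
        }
    }
}
```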

Re: Custom Class for state checkpointing

2015-08-18 Thread Rico Bergmann
found, since this case is not > yet tested (afaik). > We'll fix the issue asap; until then, are you able to encapsulate your state > in something that is available in Flink, for example a TupleX, or just > serialize it yourself into a byte[]? > >> On Tue, Aug 18, 2015

Custom Class for state checkpointing

2015-08-18 Thread Rico Bergmann
Hi! Is it possible to use your own class? I'm using the file state handler at the Jobmanager and implemented the Checkpointed interface. I tried this and got an exception: Error: java.lang.RuntimeException: Failed to deserialize state handle and setup initial operator state. at org.apache.flin

optimal deployment model for Flink Streaming programs

2015-07-29 Thread Dipl.-Inf. Rico Bergmann
Hi! We want to build an infrastructure for automated deployment of Flink Streaming programs to a dedicated environment. This includes automated tests (unit and integration) via Jenkins, and in case of a successful build and test run the program should be deployed to the execution environment. Since s