RE: ype of TypeVariable could not be determined

2016-03-08 Thread Radu Tudoran
Hi, The issue is that this problem appears when I want to create a stream source. StreamExecutionEnvironment.addSource(new MySourceFunction()) … Where the stream source class is MySourceFunction implements SourceFunction { … } In such a case I am not sure how I can pass the outertype nor how I

Re: ype of TypeVariable could not be determined

2016-03-08 Thread Timo Walther
Hi Radu, the exception can have multiple causes. It would be great if you could share some example code. In most cases the problem is the following: public class MapFunction { } new MapFunction(); The type WhatEverType is type erasured by Java. The type must not be declared in the "new"

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Maximilian Bode
Hi Aljoscha, yeah I should have been clearer. I did mean those accumulators but am not trusting them in the sense of total number (as you said, they are reset on failure). On the other hand, if they do not change for a while it is pretty obvious that the job has ingested everything in the queue

JDBCInputFormat preparation with Flink 1.1-SNAPSHOT and Scala 2.11

2016-03-08 Thread Prez Cannady
I’m attempting to create a stream using JDBCInputFormat. Objective is to convert each record into a tuple and then serialize for input into a Kafka topic. Here’s what I have so far. ``` val env = StreamExecutionEnvironment.getExecutionEnvironment val inputFormat = JDBCInputFormat.buildJDBCInp

protobuf messages from Kafka to elasticsearch using flink

2016-03-08 Thread Madhukar Thota
Friends, Can someone guide me or share an example on how to consume protobuf message from kafka and index into Elasticsearch using flink?

Re: [ANNOUNCE] Flink 1.0.0 has been released

2016-03-08 Thread Igor Berman
Congratulations! Very nice work, very interesting features. One question regarding CEP: do you think it's feasible to define pattern over window of 1 month or even more? Is there some deep explanation regarding how this partial states are saved? I mean events that create "funnel" might be separate

Re: ype of TypeVariable could not be determined

2016-03-08 Thread Wang Yangjun
Hi Radu, I met this issue also. The reason is outTypeInfo couldn't be created base on generic type when a transform applied. public SingleOutputStreamOperator transform(String operatorName, TypeInformation outTypeInfo, OneInputStreamOperator operator) The solution would be passed Calss to you

Stack overflow from self referencing Avro schema

2016-03-08 Thread David Kim
Hello all, I'm running into a StackOverflowError using flink 1.0.0. I have an Avro schema that has a self reference. For example: item.avsc { "namespace": "..." "type": "record" "name": "Item", "fields": [ { "name": "parent" "type": ["null, "Item"] } ] } When ru

ype of TypeVariable could not be determined

2016-03-08 Thread Radu Tudoran
Hi, I am trying to create a custom stream source. I first build this with generic and I run into problems regarding type extraction. I tried to put concrete types but run into the same issue (see errors below). Can anyone provide a solution to solve this issue. Caused by: org.apache.flink.api.

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Aljoscha Krettek
Hi, with accumulator you mean the ones you get from RuntimeContext.addAccumulator/getAccumulator? I’m afraid these are not fault-tolerant which means that the count in these probably doesn’t reflect the actual number of elements that were processed. When a job fails and restarts the accumulator

Re: Window apply problem

2016-03-08 Thread Aljoscha Krettek
Hi, there is also PurgingTrigger, which turns any Trigger into a trigger that also purges when firing. Use it like this: .trigger(PurgingTrigger.of(CountTrigger.of(5))) Cheers, Aljoscha > On 08 Mar 2016, at 17:23, Marcela Charfuelan > wrote: > > Thanks Jun, > Very useful, I was confusing the

Re: Window apply problem

2016-03-08 Thread Marcela Charfuelan
Thanks Jun, Very useful, I was confusing the parameters because my input is tuples, which I might not need in the end... I have now what I wanted (non-parallel and not so efficient I guess, any suggestion to improve is welcome) and I have to modify the trigger so to FIRE_AND_PURGE when it rea

RE: [ANNOUNCE] Flink 1.0.0 has been released

2016-03-08 Thread Radu Tudoran
Hi, Do you have also a linkedin post that I could share - or should I make a blogpost in which I take this announcement? Dr. Radu Tudoran Research Engineer - Big Data Expert IT R&D Division HUAWEI TECHNOLOGIES Duesseldorf GmbH European Research Center Riesstrasse 25, 80992 München E-mail:

[ANNOUNCE] Flink 1.0.0 has been released

2016-03-08 Thread Kostas Tzoumas
Hi everyone! As you might have noticed, Apache Flink 1.0.0 has been released and announced! You can read more about the release at the ASF blog and the Flink blog - https://blogs.apache.org/foundation/entry/the_apache_software_foundation_announces88 - http://flink.apache.org/news/2016/03/08/relea

Re: rebalance of streaming job after taskManager restart

2016-03-08 Thread Aljoscha Krettek
Yes, there are plans to make this more streamlined but we are not there yet, unfortunately. > On 08 Mar 2016, at 16:07, Maciek Próchniak wrote: > > Hi, > > thanks for quick answer - yes, I does what I want to accomplish, > but I was hoping for some "easier" solution. > Are there any plans for "

Re: rebalance of streaming job after taskManager restart

2016-03-08 Thread Maciek Próchniak
Hi, thanks for quick answer - yes, I does what I want to accomplish, but I was hoping for some "easier" solution. Are there any plans for "restart" button/command or sth similar? I mean, the whole process of restarting is ready as I understand - as it's triggered when task manager dies. thank

Re: rebalance of streaming job after taskManager restart

2016-03-08 Thread Aljoscha Krettek
Hi, I think what you can do is make a savepoint of your program, then cancel it and restart it from the savepoint. This should make Flink redistribute it on all TaskManagers. See https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/savepoints.html and https://ci.apache.org/proj

rebalance of streaming job after taskManager restart

2016-03-08 Thread Maciek Próchniak
Hi, we have streaming job with paralelism 2 and two task managers. The job is occupying one slot on each task manager. When I stop manager2 the job is restarted and it runs on manager1 - occupying two of it's slots. How can I trigger restart (or other similar process) that will cause the job t

Re: Webclient script misses building from source

2016-03-08 Thread Andrea Sella
I missed it! Thank you, Andrea 2016-03-08 14:27 GMT+01:00 Aljoscha Krettek : > Hi Andrea, > in Flink 1.0 there is no more a separate web client. The web client is > part of the default JobManager dashboard now. > > You can also disable the web client part of the JobManager dashboard by > setting

Re: Webclient script misses building from source

2016-03-08 Thread Aljoscha Krettek
Hi Andrea, in Flink 1.0 there is no more a separate web client. The web client is part of the default JobManager dashboard now. You can also disable the web client part of the JobManager dashboard by setting: jobmanager.web.submit.enable: false in flink-conf.yaml. Cheers, Aljoscha > On 08 Mar

Webclient script misses building from source

2016-03-08 Thread Andrea Sella
Hi, I've built Flink from source but I was not able to find in build-target/bin the script start-webclient.sh to launch the WebUI. The script is available just in the binaries or I have to add an argument to trigger its generation? Thanks in advance, Andrea

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Maximilian Bode
Hi, thanks for the fast answer. Answers inline. > Am 08.03.2016 um 13:31 schrieb Aljoscha Krettek : > > Hi, > a missing part file for one of the parallel sinks is not necessarily a > problem. This can happen if that parallel instance of the sink never received > data after the job successfully

Re: Flink streaming throughput

2016-03-08 Thread Aljoscha Krettek
Hi, Another interesting test would be a combination of 3) and 2). I.e. no JSON parsing and no sink. This would show what the raw throughput can be before being slowed down by writing to Elasticsearch. Also .print() is also not feasible for production since it just prints every element to the st

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Aljoscha Krettek
Hi, a missing part file for one of the parallel sinks is not necessarily a problem. This can happen if that parallel instance of the sink never received data after the job successfully restarted. Missing data, however, is a problem. Maybe I need some more information about your setup: - When

Re: Flink streaming throughput

2016-03-08 Thread おぎばやしひろのり
Stephan, Sorry for the delay in my response. I tried 3 cases you suggested. This time, I set parallelism to 1 for simpicity. 0) base performance (same as the first e-mail): 1,480msg/sec 1) Disable checkpointing : almost same as 0) 2) No ES sink. just print() : 1,510msg/sec 3) JSON to TSV : 8,000

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Maximilian Bode
Hi Aljoscha, oh I see. I was under the impression this file was used internally and the output being completed at the end. Ok, so I extracted the relevant lines using for i in part-*; do head -c $(cat "_$i.valid-length" | strings) "$i" > "$i.final"; done which seems to do the trick. Unf

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-08 Thread Aljoscha Krettek
Hi, are you taking the “.valid-length” files into account. The problem with doing “exactly-once” with HDFS is that before Hadoop 2.7 it was not possible to truncate files. So the trick we’re using is to write the length up to which a file is valid if we would normally need to truncate it. (If th

RE: realtion between operator and task

2016-03-08 Thread Radu Tudoran
Hi, Thanks for the answer. Can you point me to the code where the operators are being assign to tasks. Thanks From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of Stephan Ewen Sent: Monday, March 07, 2016 8:29 PM To: user@flink.apache.org Subject: Re: realtion between operato

Re: Window apply problem

2016-03-08 Thread Wang Yangjun
Hello Marcela, I am not sure what is the “parameters mismatch” here. From the example you shown, it seems that you just want do a window word count. Right? Could you try this code and is it want you want? Best, Jun - StreamExecutionEnvironment en