Re: How to get help on ClassCastException when re-submitting a job

2017-01-30 Thread Giuliano Caliari
Quick update: I've closed the issue after confirming that Yuri's workaround fixed it for us.

RE: Calling external services/databases from DataStream API

2017-01-30 Thread Diego Fustes Villadóniga
Hi Stephan, Thanks a lot for your response. I'll study the options that you mention; I'm not sure the “changelog stream” will be easy to implement, since the lookup is based on matching IP ranges and not just keys. Regards, Diego From: Stephan Ewen [mailto:se...@apache.org] Sent: Monday,
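[The range-based lookup Diego describes can be done without per-record service calls. Below is a minimal, hypothetical sketch, not Flink code: it assumes the GeoIP data can be loaded as sorted, non-overlapping IPv4 ranges, and finds the matching range by binary search. All names here are illustrative.]

```java
import java.util.Optional;

public class IpRangeLookup {
    // A closed range [start, end] of IPv4 addresses, stored as unsigned 32-bit values in a long.
    public record Range(long start, long end, String location) {}

    // Convert a dotted IPv4 string to its numeric value.
    public static long ipToLong(String ip) {
        long value = 0;
        for (String octet : ip.split("\\.")) {
            value = (value << 8) | Long.parseLong(octet);
        }
        return value;
    }

    // Binary-search sorted, non-overlapping ranges for the one containing ip.
    public static Optional<String> lookup(Range[] ranges, long ip) {
        int lo = 0, hi = ranges.length - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (ip < ranges[mid].start()) hi = mid - 1;
            else if (ip > ranges[mid].end()) lo = mid + 1;
            else return Optional.of(ranges[mid].location());
        }
        return Optional.empty();
    }
}
```

[Since the ranges are sorted once up front, each lookup is O(log n), which keeps a per-record enrichment in a map function cheap.]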

Connection refused error when writing to socket?

2017-01-30 Thread Li Peng
Hi there, I'm trying to test a couple of things by having one stream write to a socket and another stream read from it, but the writer keeps failing to connect: Caused by: java.net.ConnectException: Connection refused (Connection refused) at
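[“Connection refused” almost always means nothing was listening at the target host:port when the writing side tried to connect — a socket sink connects as a client, so a server must be accepting first. A minimal stdlib sketch of the required ordering (listener up before the writer connects); this is plain Java, not Flink's socket sink, and the names are illustrative:]

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class SocketRoundTrip {
    // Opens a listener FIRST, then connects as a client (which is what a
    // socket-writing sink does) and returns the line the listener received.
    public static String roundTrip(String message) throws Exception {
        try (ServerSocket server = new ServerSocket(0)) { // 0 = pick any free port
            final String[] received = new String[1];
            Thread reader = new Thread(() -> {
                try (Socket conn = server.accept();
                     BufferedReader in = new BufferedReader(
                             new InputStreamReader(conn.getInputStream()))) {
                    received[0] = in.readLine();
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            reader.start();
            // Only now is it safe for the writing side to connect.
            try (Socket client = new Socket("localhost", server.getLocalPort());
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                out.println(message);
            }
            reader.join();
            return received[0];
        }
    }
}
```

[If both ends are streams in the same job, note that a socket source must likewise have its peer listening before the job starts, or the connect will be refused in the same way.]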

allowed lateness on windowed join?

2017-01-30 Thread Saiph Kappa
Hi all, Is it possible to specify allowed lateness for a window join like the following one: val tweetsAndWarning = warningsPerStock.join(tweetsPerStock).where(_.symbol).equalTo(_.symbol) .window(SlidingEventTimeWindows.of(Time.of(windowDurationSec, TimeUnit.SECONDS), Time.of(windowDurationS
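[Whether the windowed-join API itself exposes an allowed-lateness setting depends on the Flink version, so treat that part of the question as version-specific. The underlying rule, though, is fixed: an element belonging to a window [start, end) is dropped once the watermark has passed end + allowedLateness. A standalone sketch of that rule (plain Java, not Flink API code):]

```java
// Conceptual sketch of the allowed-lateness rule used by event-time windows:
// an element for a window ending at windowEndMs is dropped once the
// watermark has advanced past windowEndMs + allowedLatenessMs.
public class LatenessRule {
    public static boolean isDropped(long windowEndMs, long allowedLatenessMs, long watermarkMs) {
        return watermarkMs > windowEndMs + allowedLatenessMs;
    }
}
```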

Re: Calling external services/databases from DataStream API

2017-01-30 Thread Stephan Ewen
Hi! The distributed cache would indeed be nice to add to the DataStream API. Since the runtime parts for that are all in place, the code would mainly be on the "client" side that sets up the JobGraph to be submitted and executed. For the problem of scaling this, there are two solutions t

Re: Kafka data not read in FlinkKafkaConsumer09 second time from command line

2017-01-30 Thread Sujit Sakre
Hi Robert, Aljoscha, Many thanks for pointing that out. Watermark generation is the problem: our code, in trying to cover all records, inadvertently generated watermarks far ahead of the current year. We have tried fixing this with other combinations of genera
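[The failure mode here, watermarks running far ahead of the data, is exactly what the bounded-out-of-orderness pattern avoids: the watermark trails the maximum timestamp seen so far by a fixed bound, so it can never outrun the records. A standalone sketch of that rule (the class name and shape are illustrative, not Flink's API):]

```java
// Sketch of the bounded-out-of-orderness watermark rule: the watermark
// always equals (max timestamp seen) - (out-of-orderness bound).
public class BoundedOutOfOrderness {
    private final long maxOutOfOrdernessMs;
    private long maxTimestamp;

    public BoundedOutOfOrderness(long maxOutOfOrdernessMs) {
        this.maxOutOfOrdernessMs = maxOutOfOrdernessMs;
        // Start offset avoids underflow before the first element arrives.
        this.maxTimestamp = Long.MIN_VALUE + maxOutOfOrdernessMs;
    }

    public void onElement(long timestampMs) {
        maxTimestamp = Math.max(maxTimestamp, timestampMs);
    }

    public long currentWatermark() {
        return maxTimestamp - maxOutOfOrdernessMs;
    }
}
```

[Because the watermark is derived only from observed timestamps, a single buggy arithmetic step can no longer push it years into the future; at worst it lags by the configured bound.]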

Re: Calling external services/databases from DataStream API

2017-01-30 Thread Jonas
I have a similar use case where I (for the purposes of this discussion) have a GeoIP database that is not fully available from the start but will eventually be "full". The GeoIP tuples are coming in one after another. After ~4M tuples the GeoIP database is complete. I also need to do the same query
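[One way to handle a lookup side that only becomes complete after ~4M tuples is to buffer the main-stream events until the lookup table signals readiness, then flush. A minimal, generic sketch of that buffer-until-ready pattern (plain Java, not Flink operators; in a real job this state would live in a connected/co-processing function):]

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

// Events arriving before the lookup table is complete are parked,
// then enriched and flushed once it is.
public class BufferUntilReady<E, R> {
    private final Function<E, R> lookup;
    private final Queue<E> pending = new ArrayDeque<>();
    private boolean ready = false;

    public BufferUntilReady(Function<E, R> lookup) { this.lookup = lookup; }

    // Returns the enriched results that may be emitted now (empty while buffering).
    public List<R> onEvent(E event) {
        if (ready) return Collections.singletonList(lookup.apply(event));
        pending.add(event);
        return Collections.emptyList();
    }

    // Called once the lookup side is complete; flushes everything buffered.
    public List<R> markReady() {
        ready = true;
        List<R> out = new ArrayList<>();
        while (!pending.isEmpty()) out.add(lookup.apply(pending.poll()));
        return out;
    }
}
```

[The trade-off is unbounded buffering until readiness, so in practice the pending queue should be backed by managed state rather than heap memory.]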

Calling external services/databases from DataStream API

2017-01-30 Thread Diego Fustes Villadóniga
Hi all, I'm working on an application that enriches network connections with Geolocations, using the GeoIP database, which is stored as a file in HDFS. To do so, I map every connection in my stream, using this function: def enrichIp(ip: String): Location = { val location = service.getLocati
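[The quoted function is truncated, so the exact service call is unknown. Independently of that call, a per-record lookup against a service in a map function usually benefits from a local memo cache, since the same IP tends to recur. A minimal sketch, where lookupFn is a hypothetical stand-in for the underlying service call:]

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Memoize the per-record enrichment so repeated keys hit a local cache
// instead of the underlying service.
public class CachedEnricher<K, V> {
    private final Function<K, V> lookupFn;
    private final Map<K, V> cache = new HashMap<>();
    public int misses = 0; // counts actual calls to the underlying service

    public CachedEnricher(Function<K, V> lookupFn) { this.lookupFn = lookupFn; }

    public V enrich(K key) {
        return cache.computeIfAbsent(key, k -> { misses++; return lookupFn.apply(k); });
    }
}
```

[In a Flink map function this cache would be a transient field initialized in open(), and for range-based GeoIP data the cached value could be the resolved range rather than a per-IP entry.]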