Flink-ml multiple linear regression fit

2015-09-18 Thread Florian Heyl
Hey Guys need your help again, I am currently having problems with the multiple linear regression from the flink-ml on the HDFS. Locally it works fine with the 0.9-SNAPSHOT. The cluster runs with the 0.10-SNAPSHOT. The code is the following: // set linear regression with parameters: val mlr = Mu

Re: Expressions for Table operations (select, filter)?

2015-09-18 Thread Aljoscha Krettek
Hi Stefan, I added a section in the documentation that describes the syntax of the expressions. It is a bit bare bones but I hope it helps nonetheless. https://ci.apache.org/projects/flink/flink-docs-master/libs/table.html Cheers, Aljoscha On Wed, 16 Sep 2015 at 14:55 Aljoscha Krettek wrote: >

Re: Window based on tuple timestamps

2015-09-18 Thread Aljoscha Krettek
Hi Philipp, am I correct to assume that your tuples do not arrive in the order of the timestamp that you extract. Unfortunately, for that case the current windowing implementation does not work correctly. We are working hard on fixing this for the upcoming 0.10 release, though. If you are intereste

Re: FlinkKafkaConsumer and multiple topics

2015-09-18 Thread Robert Metzger
Hi, did you manually add a Kafka dependency into your project? Maybe you are overwriting the Kafka version to a lower version? I'm sorry that our consumer is crashing when its supposed to read an invalid topic .. but In general, thats a good behavior ;) Maybe you can check whether the topic exis

Re: FlinkKafkaConsumer and multiple topics

2015-09-18 Thread Jakob Ericsson
Hit another problem. It is probably related to a topic that still exist in zk but is not used anymore (therefore no partitions) or I want to start a listener for a topic that hasn't yet been created. I would like it not to crash. Also, some funny Scala <-> Java Exception in thread "main" java.lan

Re: Kinesis Connector

2015-09-18 Thread Stephan Ewen
Hi Giancarlo! Cool that you are working on a Kinesis connector, very exciting :-) To have a look at the Kafka fault tolerance, you can check out this blog post, it explains it in one of the later sections: http://data-artisans.com/kafka-flink-a-practical-how-to/ A general overview of checkpointi

Window based on tuple timestamps

2015-09-18 Thread Philipp Goetze
Hello again, another question from me =). Could you provide an example on how to correctly use windows based on timestamps on the tuples (i.e. non-realtime)? As a simple example I tried something like this: val w4 = rdfSource.window(Time.of(20L, customTimestamp, 121962510L)).map

Re: Kinesis Connector

2015-09-18 Thread Giancarlo Pagano
Hi Stephan, I’m not a lot familiar with Kafka on the other hand, but I think they offer a very similar abstraction. Kinesis has a low-level api and an high level consumer, the Kinesis Client Library (KCL). I‘ve implemented a first version of the connector using the KCL, that I’ve been using for

Re: Joining Windowed Data Streams

2015-09-18 Thread Philipp Goetze
Hey Stephan, the query would require to either join windowed data streams or doing self joins within one window. If you take a look at Q1 of SRBench you will see, that it is a quite short query. Another option would be to support SPARQL queries as Flink operat

Re: FlinkKafkaConsumer and multiple topics

2015-09-18 Thread Jakob Ericsson
That will work. We have some utility classes for exposing the ZK-info. On Fri, Sep 18, 2015 at 10:50 AM, Robert Metzger wrote: > Hi Jakob, > > currently, its not possible to subscribe to multiple topics with one > FlinkKafkaConsumer. > > So for now, you have to create a FKC for each topic .. so

Re: Joining Windowed Data Streams

2015-09-18 Thread Stephan Ewen
Hi Philipp! I don't think this currently works. We are starting to rework windowing and window streams, which means we can take feedback for the future implementation. What kind of requirements would the benchmark query have? Greetings, Stephan On Thu, Sep 17, 2015 at 11:02 AM, Philipp Goetze

Re: FlinkKafkaConsumer and multiple topics

2015-09-18 Thread Robert Metzger
Hi Jakob, currently, its not possible to subscribe to multiple topics with one FlinkKafkaConsumer. So for now, you have to create a FKC for each topic .. so you'll end up with 50 sources. As soon as Kafka releases the new consumer, it will support subscribing to multiple topics (I think even wit

FlinkKafkaConsumer and multiple topics

2015-09-18 Thread Jakob Ericsson
Hi, Would it be possible to get the FlinkKafkaConsumer to support multiple topics, like a list? Or would it be better to instantiate one FlinkKafkaConsumers per topic and add as a source? We have about 40-50 topics to listen for one job. Or even better, supply a regexp pattern that defines the qu