Re: Flink program without a line of code

2016-04-22 Thread Aljoscha Krettek
Hi, I think if the Table API/SQL API evolves enough, it should be possible to supply a Flink program as just a SQL query with source/sink definitions. Hopefully, in the future. :-) Cheers, Aljoscha On Fri, 22 Apr 2016 at 23:10 Fabian Hueske wrote: > Hi Alex, > > welcome to the Flink community! > Ri

Re: Thanks everyone

2016-04-22 Thread Márton Balassi
Hi Prez, Thanks for sharing, the community is always glad to welcome new Flink users. Best, Marton On Sat, Apr 23, 2016 at 6:01 AM, Prez Cannady wrote: > We’ve completed our first full sweep on a five node Flink cluster and it > went beautifully. On behalf of my team, thought I’d say thanks

Thanks everyone

2016-04-22 Thread Prez Cannady
We’ve completed our first full sweep on a five node Flink cluster and it went beautifully. On behalf of my team, thought I’d say thanks for all the support. Lots more learning and work to do, so we look forward to working with you all. Prez Cannady p: 617 500 3378 e: revp...@opencorrelate.

Re: Count windows missing last elements?

2016-04-22 Thread Konstantin Kulagin
I was trying to implement this (force Flink to handle all values from the input) but had no success... Probably I am not getting something about Flink's windowing mechanism. I've created my 'finishing' trigger, which is basically a copy of the purging trigger, but was not able to make it work: https://gist.github.c

Re: implementing a continuous time window

2016-04-22 Thread Jonathan Yom-Tov
Thanks for taking the time. That seems like it would be complicated without good knowledge of the overall architecture. I might give it a shot anyway. On Fri, Apr 22, 2016 at 4:22 PM, Fabian Hueske wrote: > Hi Jonathan, > > I thought about your use case again. I'm afraid, the approach I proposed >

Re: java.lang.ClassCastException: org.apache.flink.streaming.api.watermark.Watermark cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2016-04-22 Thread Fabian Hueske
Hi Konstantin, this exception is thrown if you do not set the time characteristic to event time and assign timestamps. Please try to add > env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) after you have obtained the StreamExecutionEnvironment. Best, Fabian 2016-04-22 15:47 GMT+02:00 Ko

Re: Flink program without a line of code

2016-04-22 Thread Fabian Hueske
Hi Alex, welcome to the Flink community! Right now, there is no way to specify a Flink program without writing code (Java, Scala, Python(beta)). In principle it is possible to put such functionality on top of the DataStream or DataSet APIs. This has been done before for other programming APIs (Fl

Getting java.lang.Exception when try to fetch data from Kafka

2016-04-22 Thread prateekarora
Hi, I am sending data using the KafkaProducer API: imageRecord = new ProducerRecord(topic, messageKey, imageData); producer.send(imageRecord); And in the Flink program I try to fetch the data using FlinkKafkaConsumer08. Below is the sample code: def main(a

YARN terminating TaskNode

2016-04-22 Thread Timur Fayruzov
Hello, Next issue in a string of things I'm solving is that my application fails with the message 'Connection unexpectedly closed by remote task manager'. Yarn log shows the following: Container [pid=4102,containerID=container_1461341357870_0004_01_15] is running beyond physical memory limit

Re: Access to a shared resource within a mapper

2016-04-22 Thread Timur Fayruzov
Actually, a follow-up question: is the map function single-threaded (within one task manager, that is)? If it's not, then lazy initialization won't work, is that right? On Fri, Apr 22, 2016 at 11:50 AM, Stephan Ewen wrote: > You may also be able to initialize the client only in the parallel > execution

Re: Access to a shared resource within a mapper

2016-04-22 Thread Stephan Ewen
You may also be able to initialize the client only in the parallel execution by making it a "lazy" variable in Scala. On Fri, Apr 22, 2016 at 11:46 AM, Timur Fayruzov wrote: > Outstanding! Thanks, Aljoscha. > > On Fri, Apr 22, 2016 at 2:06 AM, Aljoscha Krettek > wrote: > >> Hi, >> you could use
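Stephan's suggestion amounts to deferring client construction until first use on the worker, so the (typically non-serializable) client is never created on the driver. A minimal self-contained sketch of that idea in plain Java, with a hypothetical `Client` class standing in for whatever resource is being shared:

```java
// Lazy one-time initialization of a shared client, so construction happens
// on first use in the JVM that actually runs the task, not when the function
// object is serialized. Client is a hypothetical stand-in resource.
public class LazyClientHolder {
    static int initCount = 0; // exposed only to demonstrate single initialization

    static class Client {
        Client() { initCount++; }
        String call() { return "ok"; }
    }

    private static Client client; // created on first access

    static synchronized Client getClient() {
        if (client == null) {
            client = new Client(); // runs once per JVM, however many calls follow
        }
        return client;
    }

    public static void main(String[] args) {
        getClient().call();
        getClient().call();
        System.out.println("initialized " + initCount + " time(s)");
    }
}
```

In Scala, a `lazy val client = new Client()` field inside the function gives the same once-per-JVM behavior without the explicit holder.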

Flink program without a line of code

2016-04-22 Thread Alexander Smirnov
Hi guys! I’m new to Flink, and actually to this mailing list as well :) this is my first message. I’m still reading the documentation and I would say Flink is an amazing system!! Thanks everybody who participated in the development! The information I didn’t find in the documentation - if it is

Re: Join DataStream with dimension tables?

2016-04-22 Thread Lohith Samaga M
Hi, Cassandra could be used as a distributed cache. Lohith. Sent from my Sony Xperia™ smartphone Aljoscha Krettek wrote Hi Srikanth, that's an interesting use case. It's not possible to do something like this out-of-box but I'm actually working on API for such cases. In the mean ti

Re: Join DataStream with dimension tables?

2016-04-22 Thread Aljoscha Krettek
Hi Srikanth, that's an interesting use case. It's not possible to do something like this out of the box, but I'm actually working on an API for such cases. In the meantime, I programmed a short example that shows how something like this can be done using the API that is currently available. It requ

Re: Access to a shared resource within a mapper

2016-04-22 Thread Timur Fayruzov
Outstanding! Thanks, Aljoscha. On Fri, Apr 22, 2016 at 2:06 AM, Aljoscha Krettek wrote: > Hi, > you could use a RichMapFunction that has an open method: > > data.map(new RichMapFunction[...]() { > def open(): Unit = { > // initialize client > } > > def map(input: IN): OUT = { > // u

Re: Flink on Yarn - ApplicationMaster command

2016-04-22 Thread Theofilos Kakantousis
Hi Max, I managed to get the jobManagerAddress from FlinkYarnCluster, however when I submit a job using the code below, the jobID is null. Is there something wrong in the way I submit the job? Otherwise, any ideas on which direction I should investigate further? The runBlocking call returns al

Re: Threads waiting on LocalBufferPool

2016-04-22 Thread Maciek Próchniak
On 21/04/2016 16:46, Aljoscha Krettek wrote: Hi, I would be very happy about improvements to our RocksDB performance. What are the RocksDB Java benchmarks that you are running? In Flink, we also have to serialize/deserialize every time that we access RocksDB using our TypeSerializer. Maybe t

Re: Count windows missing last elements?

2016-04-22 Thread Konstantin Kulagin
No problem at all, there are not many Flink people and a lot of people asking questions - it must be hard to keep track of each person's issues :) Yes, it is not as easy as a 'contains' operator: I need to collect some amount of tuples in order to create an in-memory lucene index. After that I will filter entrie

java.lang.ClassCastException: org.apache.flink.streaming.api.watermark.Watermark cannot be cast to org.apache.flink.streaming.runtime.streamrecord.StreamRecord

2016-04-22 Thread Konstantin Kulagin
Hi guys, trying to run this example: StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); DataStreamSource> source = env.addSource(new SourceFunction>() { @Override public void run(SourceContext> ctx) throws Exception { LongStream.rang

Re: Count windows missing last elements?

2016-04-22 Thread Aljoscha Krettek
Hi, I'm afraid I don't understand your use case yet. In your example you want to preserve only the elements where the string value contains a "3"? This can be done using a filter, as in source.filter( value -> value.f1.contains("3") ) This is probably too easy, though, and I'm misunderstanding the
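The filter Aljoscha suggests can be sketched without Flink at all; here is the same predicate applied with plain Java streams over a list of (id, text) pairs (the `Event` record and sample values are illustrative):

```java
import java.util.List;
import java.util.stream.Collectors;

public class FilterSketch {
    // Stand-in for a Tuple2<Long, String> stream element
    record Event(long id, String text) {}

    // Keep only elements whose text contains "3", mirroring
    // source.filter(value -> value.f1.contains("3"))
    static List<Event> keepContaining3(List<Event> events) {
        return events.stream()
                .filter(e -> e.text().contains("3"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Event> in = List.of(
                new Event(1, "13"), new Event(2, "20"), new Event(3, "31"));
        System.out.println(keepContaining3(in).size()); // prints 2
    }
}
```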

Re: implementing a continuous time window

2016-04-22 Thread Fabian Hueske
Hi Jonathan, I thought about your use case again. I'm afraid the approach I proposed is not working due to limitations of the Evictor interface. The only way that I see to implement your use case is to implement a custom stream operator by extending AbstractStreamOperator and implementing the OneI

Re: Flink on Yarn - ApplicationMaster command

2016-04-22 Thread Maximilian Michels
Hi Theofilos, Assuming you have the FlinkYarnCluster after the call to deploy(), you can get the JobManager address using: InetSocketAddress address = cluster.getJobManagerAddress(); Then create a Configuration with this address: Configuration config = new Configuration(); config.setString(C
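The handoff Max describes, taking the resolved JobManager address apart and copying host and port into configuration entries, can be sketched with only JDK types. The config key names below happen to match Flink's conventional `jobmanager.rpc.*` keys, but treat the map itself as a placeholder for the real Configuration object:

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

public class JobManagerAddressSketch {
    // Split a resolved JobManager address into host/port config entries,
    // as you would before submitting a job against a deployed YARN cluster.
    static Map<String, String> toConfig(InetSocketAddress address) {
        Map<String, String> config = new HashMap<>();
        config.put("jobmanager.rpc.address", address.getHostString());
        config.put("jobmanager.rpc.port", Integer.toString(address.getPort()));
        return config;
    }

    public static void main(String[] args) {
        // createUnresolved avoids a DNS lookup for this illustrative host name
        InetSocketAddress addr = InetSocketAddress.createUnresolved("jm-host", 6123);
        Map<String, String> config = toConfig(addr);
        System.out.println(config.get("jobmanager.rpc.address")
                + ":" + config.get("jobmanager.rpc.port"));
    }
}
```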

Re: How to fetch kafka Message have [KEY,VALUE] pair

2016-04-22 Thread Robert Metzger
If you've serialized your data with a custom format, you can also implement a custom deserializer using the KeyedDeserializationSchema. On Fri, Apr 22, 2016 at 2:35 PM, Till Rohrmann wrote: > Depending on how the key value pair is encoded, you could use the > TypeInformationKeyValueSerialization

Re: How to fetch kafka Message have [KEY,VALUE] pair

2016-04-22 Thread Till Rohrmann
Depending on how the key value pair is encoded, you could use the TypeInformationKeyValueSerializationSchema where you provide the BasicTypeInfo.STRING_TYPE_INFO and PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO as the key and value type information. But this only works if your data was ser

Re: Programatic way to get version

2016-04-22 Thread Till Rohrmann
But be aware that this method only returns a non-null string if the binaries have been built with Maven. Otherwise it will return null. Cheers, Till On Fri, Apr 22, 2016 at 12:12 AM, Trevor Grant wrote: > dug through the codebase, in case any others want to know: > > import org.apache.flink.run

Re: Flink YARN job manager web port

2016-04-22 Thread Till Rohrmann
Hi Shannon, if you need this feature (assigning a range of web server ports) for your use case, then we would have to add it. If you want to do it, then it would help us a lot. I think the documentation is a bit outdated here. The port is either chosen from the range of ports or an ephemeral port is

Re: jobmanager.web.* properties for long running yarn session

2016-04-22 Thread Aljoscha Krettek
Hi, I'm afraid you found a bug. I opened a Jira issue for it: https://issues.apache.org/jira/browse/FLINK-3803 Cheers, Aljoscha On Fri, 22 Apr 2016 at 13:20 Aljoscha Krettek wrote: > Hi, > I'm investigating. > > Cheers, > Aljoscha > > On Tue, 19 Apr 2016 at 13:08 Konstantin Knauf < > konstantin

Re: jobmanager.web.* properties for long running yarn session

2016-04-22 Thread Aljoscha Krettek
Hi, I'm investigating. Cheers, Aljoscha On Tue, 19 Apr 2016 at 13:08 Konstantin Knauf wrote: > Hi everyone, > > we are using a long running yarn session and changed > jobmanager.web.checkpoints.history to 20. On the dashboard's job manager > panel I can see the changed config, but the checkpoin

Re: FoldFunction accumulator checkpointing

2016-04-22 Thread Aljoscha Krettek
Hi Ron, I see that this leads to a bit of a hassle for you. I'm very reluctant to allow the general RichFunction interface in functions that are used inside state because this has quite some implications. Maybe we can add a simplified interface just for functions that are used inside state to allow

Re: Replays message in Kafka topics with FlinkKafkaConsumer09

2016-04-22 Thread Aljoscha Krettek
Hi, I think the "auto.offset.reset" parameter is only used if your consumer has never read from a topic. To simulate being a new consumer you can set the "group.id" property to a new random value. Cheers, Aljoscha On Fri, 22 Apr 2016 at 03:10 Jack Huang wrote: > Hi all, > > I am trying to force my j
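Aljoscha's suggestion, pairing `auto.offset.reset` with a fresh random `group.id` so the consumer looks brand new to Kafka, can be sketched as a properties builder. Property names follow the Kafka 0.9 consumer (which FlinkKafkaConsumer09 wraps); the bootstrap server and group prefix are placeholders:

```java
import java.util.Properties;
import java.util.UUID;

public class ReplayProperties {
    // Build consumer properties that replay a topic from the beginning:
    // a never-before-seen group.id has no committed offsets, so
    // auto.offset.reset=earliest takes effect.
    static Properties freshConsumerProperties() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("auto.offset.reset", "earliest");
        props.setProperty("group.id", "replay-" + UUID.randomUUID());
        return props;
    }

    public static void main(String[] args) {
        Properties p = freshConsumerProperties();
        System.out.println(p.getProperty("group.id"));
    }
}
```

These properties would then be passed to the FlinkKafkaConsumer09 constructor in place of the usual fixed-group configuration.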

Re: Access to a shared resource within a mapper

2016-04-22 Thread Aljoscha Krettek
Hi, you could use a RichMapFunction that has an open method: data.map(new RichMapFunction[...]() { def open(): Unit = { // initialize client } def map(input: IN): OUT = { // use client } } the open() method is called before any elements are passed to the function. The counterpart
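The contract Aljoscha describes, open() is invoked once before any element reaches map(), is the whole point of the rich-function interface. A self-contained illustration of that lifecycle in plain Java (no Flink classes; the interface and runner are stand-ins for Flink's RichMapFunction and its operator):

```java
import java.util.ArrayList;
import java.util.List;

public class OpenBeforeMapSketch {
    // Stand-in for Flink's RichMapFunction lifecycle
    interface RichMapper<IN, OUT> {
        void open();          // called once, before the first element
        OUT map(IN input);    // called once per element
    }

    // Stand-in for the framework: guarantees open() runs before any map()
    static <IN, OUT> List<OUT> run(RichMapper<IN, OUT> fn, List<IN> inputs) {
        fn.open();
        List<OUT> out = new ArrayList<>();
        for (IN in : inputs) {
            out.add(fn.map(in));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        RichMapper<Integer, Integer> doubler = new RichMapper<>() {
            public void open() { log.add("open"); } // e.g. create the client here
            public Integer map(Integer x) { log.add("map"); return x * 2; }
        };
        System.out.println(run(doubler, List.of(1, 2)) + " " + log);
    }
}
```

In real Flink code the framework plays the role of `run`, and `close()` is the counterpart hook for releasing the client.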

Re: Custom state values in CEP

2016-04-22 Thread Till Rohrmann
Hi Sowmya, I'm afraid at the moment it is not possible to store custom state in the filter or select function. If you need access to the whole sequence of matching events then you have to put this code in the select clause of your pattern stream. Cheers, Till On Fri, Apr 22, 2016 at 7:55 AM, Sow