How to Broadcast a very large model object (used in iterative scoring in recommendation system) in Flink

2016-09-29 Thread Anchit Jatana
Hi All, I'm building a recommendation system streaming application for which I need to broadcast a very large model object (used in iterative scoring) among all the task managers performing the operation in parallel for the operator. I'm doing this operation in map1 of CoMapFunction. Please sugge

Counting latest state of stateful entities in streaming

2016-09-29 Thread Simone Robutti
Hello, in the last few days I tried to create my first real-time analytics job in Flink. The approach is kappa-architecture-like, so I have my raw data on Kafka, where we receive a message for every change of state of any entity. So the messages are of the form (id, newStatus, timestamp). We want
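For readers skimming this thread: the subject line suggests the goal is a running count of entities per latest status. A minimal, dependency-free sketch of that bookkeeping is below. It is not Flink code and not the poster's solution. The class name, the status values used with it, and the rule of ignoring out-of-order (older) messages are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: tracks the latest status per entity id and keeps
// a count of how many entities are currently in each status.
final class LatestStateCounter {
    private final Map<String, String> latestStatus = new HashMap<>();
    private final Map<String, Long> latestTimestamp = new HashMap<>();
    private final Map<String, Integer> countByStatus = new HashMap<>();

    // Handles one (id, newStatus, timestamp) message.
    void onMessage(String id, String newStatus, long timestamp) {
        Long seen = latestTimestamp.get(id);
        if (seen != null && seen > timestamp) {
            return; // assumption: a message older than the latest one seen is ignored
        }
        String previous = latestStatus.put(id, newStatus);
        latestTimestamp.put(id, timestamp);
        if (previous != null) {
            countByStatus.merge(previous, -1, Integer::sum); // entity leaves its old status
        }
        countByStatus.merge(newStatus, 1, Integer::sum);     // entity enters its new status
    }

    // Number of entities whose latest known status equals the given one.
    int count(String status) {
        return countByStatus.getOrDefault(status, 0);
    }

    public static void main(String[] args) {
        LatestStateCounter counter = new LatestStateCounter();
        counter.onMessage("e1", "OPEN", 1L);
        counter.onMessage("e2", "OPEN", 2L);
        counter.onMessage("e1", "CLOSED", 3L);
        System.out.println("OPEN=" + counter.count("OPEN") + " CLOSED=" + counter.count("CLOSED"));
    }
}
```

In a real job this state would presumably live in keyed state per entity id rather than in plain maps; the sketch only shows the decrement-old, increment-new step.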

Exceptions from collector.collect after cancelling job

2016-09-29 Thread Shannon Carey
When I cancel a job, I get many exceptions that look like this: java.lang.RuntimeException: Could not forward element to next operator at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:376) at org.apache.flink.streaming.runtime.tasks.Opera

Re: Error while adding data to RocksDB: No more bytes left

2016-09-29 Thread Shannon Carey
Hi Stephan! The failure appeared to occur every 10 minutes, which is also the interval for checkpointing. However, I agree with you that the stack trace appears to be independent. Could this perhaps be an issue with multithreading, where the checkpoint mechanism is somehow interfering with ongo

Re: Flink-HBase connectivity through Kerberos keytab | Simple get/put sample implementation

2016-09-29 Thread Stephan Ewen
Hi! In Flink prior to 1.2, you can use Kerberos with HBase via Hadoop's mechanism: https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#kerberos In Flink 1.2-SNAPSHOT, keytabs are installed in the Java security context, from which HBase can probably pick it up. This is quite

Re: Flink-HBase connectivity through Kerberos keytab | Simple get/put sample implementation

2016-09-29 Thread Stephan Ewen
Hi! In Flink prior to 1.2, you can use Kerberos with HBase via Hadoop's mechanism: https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#kerberos In Flink 1.2-SNAPSHOT, keytabs are installed in the Java security context, from which HBase can probably pick it up. This is quite n

Re: Flink-HBase connectivity through Kerberos keytab | Simple get/put sample implementation

2016-09-29 Thread Anchit Jatana
Hi Fabian, Right, I hope the committers take into account the kerberised access as well. Thanks for the update! Regards, Anchit On Thu, Sep 29, 2016 at 6:15 AM, Fabian Hueske wrote: > Hi Anchit, > > Flink does not yet have a streaming sink connector for HBase. Some members > of the community a

Re: How to interact with a running flink application?

2016-09-29 Thread Anchit Jatana
Hi Ufuk, Thanks for your help, I'm working on using the suggested approach to address my use case. Regards, Anchit On Wed, Sep 28, 2016 at 12:48 AM, Ufuk Celebi wrote: > Hey Anchit, > > the usual recommendation for this is to use a CoMap/CoFlatMap > operator, where the second input are the lo
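The quoted recommendation (a two-input CoMap/CoFlatMap operator) can be illustrated without any Flink dependency. The sketch below only mimics the shape of such an operator: the method names flatMap1 and flatMap2 are borrowed from Flink's CoFlatMapFunction, but the class does not implement that interface and uses a plain Consumer in place of Flink's Collector. Treating the first input as feature vectors and the second as model weights, and scoring with a dot product, are assumptions chosen to keep the example small.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical two-input operator: input 1 carries records to score,
// input 2 carries model updates that replace the operator's local state.
final class TwoInputScorerSketch {
    private double[] weights = new double[0]; // empty until the first update arrives

    // First input: score a feature vector with the current model.
    void flatMap1(double[] features, Consumer<Double> out) {
        if (weights.length == 0 || weights.length != features.length) {
            return; // assumption: records are dropped until a compatible model is present
        }
        double score = 0.0;
        for (int i = 0; i < features.length; i++) {
            score += weights[i] * features[i];
        }
        out.accept(score);
    }

    // Second input: replace the model; nothing is emitted downstream.
    void flatMap2(double[] newWeights, Consumer<Double> out) {
        weights = newWeights.clone();
    }

    public static void main(String[] args) {
        TwoInputScorerSketch op = new TwoInputScorerSketch();
        List<Double> emitted = new ArrayList<>();
        op.flatMap2(new double[] {1.0, 2.0}, emitted::add);
        op.flatMap1(new double[] {3.0, 4.0}, emitted::add);
        System.out.println(emitted);
    }
}
```

The sketch says nothing about how updates reach every parallel instance; in Flink that would depend on how the second stream is partitioned or broadcast.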

Re: Unsubscribe

2016-09-29 Thread Matthias J. Sax
You need to send an email to user-unsubscr...@flink.apache.org to unsubscribe. See: https://flink.apache.org/community.html#mailing-lists -Matthias On 09/29/2016 05:44 AM, Vaidyanathan Sivasubramanian wrote:

Re: Events B33/35 in Parallel Streams Diagram

2016-09-29 Thread Fabian Hueske
Sure, that would be great! Thanks! 2016-09-29 17:43 GMT+02:00 Neil Derraugh <neil.derra...@intellifylearning.com>: > Hi Fabian, > > Yes. Thanks! I think it would be helpful to indicate that on the graph. > Call it “key” or “key_id” instead of just “id”, as it is in fact the key of > the stream

Re: Events B33/35 in Parallel Streams Diagram

2016-09-29 Thread Neil Derraugh
Hi Fabian, Yes. Thanks! I think it would be helpful to indicate that on the graph. Call it “key” or “key_id” instead of just “id”, as it is in fact the key of the stream and not the id of the event? Probably seems trivial, but I struggled with this one. haha. I’ll submit a PR for the docs

Re: Events B33/35 in Parallel Streams Diagram

2016-09-29 Thread Fabian Hueske
Hi Neil, "B" only refers to the key part of the record; the number is the timestamp (as you assumed). The payload of the record is not displayed in the figure. So B35 and B31 are two different records with identical key. The keyBy() operation sends all records with the same key to the same sub
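The point in the reply above, that records sharing a key end up at the same parallel instance, can be shown with a small dependency-free sketch. This is a simplification: it uses String.hashCode() modulo the parallelism, which is an assumption for illustration and not Flink's actual key-to-subtask assignment. The class name and the parallelism of 4 are likewise made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of hash partitioning by key.
final class KeyRoutingSketch {
    // A record as drawn in the figure: key plus timestamp, payload omitted.
    static final class Event {
        final String key;
        final long timestamp;

        Event(String key, long timestamp) {
            this.key = key;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return key + timestamp;
        }
    }

    // Simplified partitioner: the same key always maps to the same subtask index.
    static int subtaskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Groups events by the subtask index their key maps to.
    static Map<Integer, List<Event>> route(List<Event> events, int parallelism) {
        Map<Integer, List<Event>> bySubtask = new TreeMap<>();
        for (Event e : events) {
            bySubtask.computeIfAbsent(subtaskFor(e.key, parallelism), k -> new ArrayList<>()).add(e);
        }
        return bySubtask;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(new Event("B", 35), new Event("A", 30), new Event("B", 31));
        System.out.println(route(events, 4));
    }
}
```

Here B35 and B31 are distinct records, but because they share the key "B" they are grouped under the same subtask index.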

Events B33/35 in Parallel Streams Diagram

2016-09-29 Thread Neil Derraugh
Hi, I’m confused about the meaning of event(s?) B33 and B35 in the Parallel Streams Diagram (https://ci.apache.org/projects/flink/flink-docs-master/dev/event_time.html#watermarks-in-parallel-streams

Re: AW: Problem with CEPPatternOperator when taskmanager is killed

2016-09-29 Thread Fabian Hueske
Great, thanks! I gave you contributor permissions in JIRA. You can now also assign issues to yourself if you decide to continue to contribute. Best, Fabian 2016-09-29 16:48 GMT+02:00 jaxbihani : > Hi Fabian > > My JIRA user is: jaxbihani > I have created a pull request for the fix : > https://gi

Re: Using Flink and Cassandra with Scala

2016-09-29 Thread Chesnay Schepler
The Cassandra sink only supports Java tuples and POJOs. On 29.09.2016 16:33, Sanne de Roever wrote: > Hi, > > Does the Cassandra sink support Scala and case classes? It looks like using Java is at the moment best practice. > > Cheers, > > Sanne

Re: AW: Problem with CEPPatternOperator when taskmanager is killed

2016-09-29 Thread jaxbihani
Hi Fabian My JIRA user is: jaxbihani I have created a pull request for the fix : https://github.com/apache/flink/pull/2568

Using Flink and Cassandra with Scala

2016-09-29 Thread Sanne de Roever
Hi, Does the Cassandra sink support Scala and case classes? It looks like using Java is at the moment best practice. Cheers, Sanne

Re: Iterations vs. combo source/sink

2016-09-29 Thread Fabian Hueske
Hi Ken, you can certainly have partitioned sources and sinks. You can control the parallelism by calling .setParallelism() method. If you need a partitioned sink, you can call .keyBy() to hash partition. I did not completely understand the requirements of your program. Can you maybe provide pseud

Re: Flink-HBase connectivity through Kerberos keytab | Simple get/put sample implementation

2016-09-29 Thread Fabian Hueske
Hi Anchit, Flink does not yet have a streaming sink connector for HBase. Some members of the community are working on this though [1]. I think we resolved a similar issue for the Kafka connector recently [2]. Maybe the related commits contain some relevant code for your problem. Best, Fabian [1]

Re: flink 1.1.2 RichFunction not working

2016-09-29 Thread Stephan Ewen
Sorry for that inconvenience. You are right about mentioning that in the release notes (adding it even after the release). We'll take that as feedback for the next release. On Tue, Sep 27, 2016 at 9:42 PM, Chen Bekor wrote: > thanks. worth mentioning in the release notes of 1.1.2 that file so

Unsubscribe

2016-09-29 Thread Vaidyanathan Sivasubramanian

Flink-HBase connectivity through Kerberos keytab | Simple get/put sample implementation

2016-09-29 Thread Anchit Jatana
Hi All, I'm trying to link my Flink application with HBase for simple read/write operations. I need to implement Flink-to-HBase connectivity through Kerberos using the keytab. Can somebody share (or link me to some resource) sample code or an implementation showing how to achieve Flink to HBase connect