Yahoo! Streaming Benchmark with Flink

2016-10-27 Thread Eric Fukuda
Hi, I have two questions on the blog post on Yahoo! Streaming Benchmark with Flink [1]. First is about the join operation to associate ad_ids and campaign_ids. In flink.benchmark.state.AdvertisingTopologyFlinkStateHighKeyCard, I don't see this being done. Is there a reason for this? Second is ab

Re: FlinkKafkaConsumerBase - Received confirmation for unknown checkpoint

2016-10-27 Thread Stefan Richter
Hi, I tracked down the problem and have a fix in this PR https://github.com/apache/flink/pull/2706 . Besides the misleading warning, the code should also still behave correctly in the old version. Best, Stefan > Am 27.10.2016 um 17:25 schrieb Robert

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Greg Hogan
Could be. I had thought TypeInfoParser was closely related to TypeExtractor. On Thu, Oct 27, 2016 at 10:20 AM, Fabian Hueske wrote: > Wouldn't that be orthogonal to adding it to the TypeInfoParser? > > 2016-10-27 15:22 GMT+02:00 Greg Hogan : > >> Fabian, >> >> Should we instead add this as a reg

Re: About stateful transformations

2016-10-27 Thread Juan Rodríguez Hortalá
Hi Aljoscha, Thanks for your answer. At least by keeping only the latest one we don't have retention problems with the state backend, and for now I guess we could use manually triggered savepoints if we needed to store the history of the state. Thanks, Juan On Tue, Oct 25, 2016 at 6:58 AM, Aljo

Re: Can we do batch writes on cassandra using flink while leveraging the locality?

2016-10-27 Thread Shannon Carey
It certainly seems possible to write a Partitioner that does what you describe. I started implementing one but didn't have time to finish it. I think the main difficulty is in properly dealing with partition ownership changes in Cassandra… if you are maintaining state in Flink and the partitioni
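For context, a minimal sketch of what such a custom Flink Partitioner could look like. This is not Shannon's implementation; CassandraKeyPartitioner and getCassandraPartitionKey() are made-up names, and the hash below is only a placeholder for a real token-to-replica lookup via the driver's cluster metadata:

    import org.apache.flink.api.common.functions.Partitioner;

    public class CassandraKeyPartitioner implements Partitioner<String> {
        @Override
        public int partition(String cassandraPartitionKey, int numPartitions) {
            // Placeholder: spread keys over the Flink subtasks by hash.
            // To actually leverage locality, replace this with a lookup of the
            // replica that owns the key's token (from the driver's cluster
            // metadata) and return the subtask co-located with that replica,
            // refreshing the mapping when partition ownership changes.
            return Math.abs(cassandraPartitionKey.hashCode() % numPartitions);
        }
    }

    // usage: stream.partitionCustom(new CassandraKeyPartitioner(),
    //                               event -> event.getCassandraPartitionKey())
    //              .addSink(...);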

Re: FlinkKafkaConsumerBase - Received confirmation for unknown checkpoint

2016-10-27 Thread Robert Metzger
Hi, it would be nice if you could check with a stable version as well. Thank you! On Thu, Oct 27, 2016 at 9:58 AM, PedroMrChaves wrote: > Hello, > > I Am using the version 1.2-SNAPSHOT. > I will try with a stable version to see if the problem persists. > > Regards, > Pedro Chaves. > > > > -- >

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-10-27 Thread Manu Zhang
Hi, It's what I'm seeing. If timers are not fired at the end of the window, a state (in the window) whose timestamp is *after* the timer will also be emitted. That's a problem for the event-time trigger. Thanks, Manu On Thu, Oct 27, 2016 at 10:29 PM Aljoscha Krettek wrote: > Hi, > is that example inp

Re: Testing iterative data flows

2016-10-27 Thread Ufuk Celebi
Hey Ken! Unfortunately, no. But Paris just posted a proposal to improve this: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-13-Consistent-Processing-with-Loops-tt14149.html On Wed, Oct 26, 2016 at 11:10 PM, Ken Krugler wrote: > Hi all, > > What’s the recommended way

emit partial state in window (streaming)

2016-10-27 Thread Luis Mariano Guerra
hi, I need to calculate some counts for the day but also emit the partial counts periodically. I think triggers may help me; I'm searching around but there's not much content about it, any tip? For example, I'm counting accesses by location to different services, and I want to accumulate accesses during
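To make the setup concrete, a minimal sketch of the pipeline being described, assuming a simple AccessEvent type with location and service fields (made-up names); without a custom trigger the count is only emitted when the daily window closes:

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;

    DataStream<Tuple3<String, String, Long>> dailyCounts = events
        .map(new MapFunction<AccessEvent, Tuple3<String, String, Long>>() {
            @Override
            public Tuple3<String, String, Long> map(AccessEvent e) {
                return new Tuple3<>(e.location, e.service, 1L);
            }
        })
        .keyBy(0, 1)                                        // key by (location, service)
        .window(TumblingEventTimeWindows.of(Time.days(1)))  // one window per day
        .sum(2);                                            // count accesses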

Re: emit partial state in window (streaming)

2016-10-27 Thread Fabian Hueske
Hi Luis, these blog posts should help you with the periodic partial-result trigger [1] [2]. Regarding the second question: time windows are by default aligned to 1970-01-01 00:00:00, so a 24-hour window will always start at 00:00. Best, Fabian [1] http://flink.apache.org/news/2015/12/04/Introduc
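A sketch of how such an early-firing trigger could be wired into the daily window, building on the mapped (location, service, 1L) stream from the question above (here called accessTuples, a made-up name); ContinuousProcessingTimeTrigger emits the current partial count at a fixed wall-clock interval without purging the window:

    import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousProcessingTimeTrigger;

    DataStream<Tuple3<String, String, Long>> partialCounts = accessTuples
        .keyBy(0, 1)
        .window(TumblingEventTimeWindows.of(Time.days(1)))               // aligned to the epoch, so it starts at 00:00
        .trigger(ContinuousProcessingTimeTrigger.of(Time.minutes(10)))   // emit the partial count every 10 minutes
        .sum(2);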

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-10-27 Thread Aljoscha Krettek
Hi, is that example input/output what you would like to achieve or what you are currently seeing with Flink? I think for your use case a custom Trigger would be required that works like the event-time trigger but additionally registers timers for each element where you want to emit. Cheers, Aljosc
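An untested sketch of the kind of trigger Aljoscha describes: like the built-in EventTimeTrigger, but additionally firing once the watermark passes each element's own timestamp (the class name is made up):

    import org.apache.flink.streaming.api.windowing.triggers.Trigger;
    import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

    public class PerElementEventTimeTrigger extends Trigger<Object, TimeWindow> {

        @Override
        public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) {
            ctx.registerEventTimeTimer(timestamp);               // fire when the watermark passes this element
            ctx.registerEventTimeTimer(window.maxTimestamp());   // and, as usual, at the end of the window
            return TriggerResult.CONTINUE;
        }

        @Override
        public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
            return TriggerResult.FIRE;                           // emit the current window contents
        }

        @Override
        public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
            return TriggerResult.CONTINUE;                       // processing time is not used here
        }

        @Override
        public void clear(TimeWindow window, TriggerContext ctx) {
            ctx.deleteEventTimeTimer(window.maxTimestamp());
        }
    }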

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Fabian Hueske
Wouldn't that be orthogonal to adding it to the TypeInfoParser? 2016-10-27 15:22 GMT+02:00 Greg Hogan : > Fabian, > > Should we instead add this as a registered TypeInfoFactory? > > Greg > > On Thu, Oct 27, 2016 at 3:55 AM, Fabian Hueske wrote: > >> Yes, I think you are right. >> TypeInfoParser

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Greg Hogan
Fabian, Should we instead add this as a registered TypeInfoFactory? Greg On Thu, Oct 27, 2016 at 3:55 AM, Fabian Hueske wrote: > Yes, I think you are right. > TypeInfoParser needs to be extended to parse the java.sql.* types into the > corresponding TypeInfos. > > Can you open a JIRA for that?
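A sketch of what such a factory could look like, assuming SqlTimeTypeInfo (Flink's type information for the java.sql.* types) is the target; since java.sql.Timestamp is a JDK class it cannot carry the @TypeInfo annotation itself, so the registration mechanism being discussed here would still be needed for Flink to pick the factory up:

    import java.lang.reflect.Type;
    import java.util.Map;
    import org.apache.flink.api.common.typeinfo.SqlTimeTypeInfo;
    import org.apache.flink.api.common.typeinfo.TypeInfoFactory;
    import org.apache.flink.api.common.typeinfo.TypeInformation;

    public class TimestampTypeInfoFactory extends TypeInfoFactory<java.sql.Timestamp> {
        @Override
        public TypeInformation<java.sql.Timestamp> createTypeInfo(
                Type t, Map<String, TypeInformation<?>> genericParameters) {
            // Map java.sql.Timestamp to Flink's SQL timestamp type information.
            return SqlTimeTypeInfo.TIMESTAMP;
        }
    }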

Re: Broadcast Config-Values through connected Configuration Stream

2016-10-27 Thread Aljoscha Krettek
Hi, yes it can be done, in fact I have code like this in the Beam-on-Flink runner: // we have to manually construct the two-input transform because we're not // allowed to have only one input keyed, normally. TwoInputTransformation< WindowedValue>, RawUnionValue, WindowedValue>> rawFli

Re: Broadcast Config-Values through connected Configuration Stream

2016-10-27 Thread Gyula Fóra
Hi, I agree with Aljoscha that this needs to be solved properly, but it is technically possible to do it now as well (he actually had a PR a while back doing this). You need to manually change how the transform method works on the connected stream to allow setting the key on only one input. You need to u

Re: Broadcast Config-Values through connected Configuration Stream

2016-10-27 Thread Aljoscha Krettek
Hi Julian, I think it's currently not possible to do that in a fault-tolerant way. (The problem is that the state that results from the broadcast input also needs to be checkpointed, which is not possible right now.) A while back, I created an issue for that: https://issues.apache.org/jira/browse/F

Re: Broadcast Config-Values through connected Configuration Stream

2016-10-27 Thread Julian Bauß
Hi Ufuk, Thanks for your response. Unfortunately that does not work. Having ValueStateDescriptors in the CoFlatMap on the connected Stream requires a keyBy on the connected Stream. Another solution I can think of would be this: stream1.connect(stream2) .map(new MergeStreamsMapFunctio
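For reference, a sketch of the workaround that works today without keying, with the limitation Aljoscha points out: the configuration held in the function field is plain Java state, not Flink state, so it is lost on failure and must be re-broadcast (Event, Config and EnrichedEvent are made-up types):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
    import org.apache.flink.util.Collector;

    DataStream<EnrichedEvent> out = events
        .connect(configUpdates.broadcast())      // every parallel instance sees every config update
        .flatMap(new CoFlatMapFunction<Event, Config, EnrichedEvent>() {

            private transient Config currentConfig;   // NOT checkpointed

            @Override
            public void flatMap1(Event event, Collector<EnrichedEvent> out) {
                if (currentConfig != null) {
                    out.collect(new EnrichedEvent(event, currentConfig));
                }
            }

            @Override
            public void flatMap2(Config config, Collector<EnrichedEvent> out) {
                currentConfig = config;               // remember the latest config
            }
        });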

Can we do batch writes on cassandra using flink while leveraging the locality?

2016-10-27 Thread kant kodali
Can we do batch writes on Cassandra using Flink while leveraging the locality? For example, batch writes in Cassandra will put pressure on the coordinator, but since the connectors are built by leveraging the locality, I was wondering if we could do a batch of writes on a node where the batch belong

Re: Unit testing a Kafka stream based application?

2016-10-27 Thread Niels Basjes
Thanks. This is exactly what I needed. Niels On Wed, Oct 26, 2016 at 11:03 AM, Robert Metzger wrote: > Hi Niels, > > Sorry for the late response. > > you can launch a Kafka Broker within a JVM and use it for testing purposes. > Flink's Kafka connector is using that a lot for integration tests.
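A rough sketch of that approach, assuming curator-test (for an in-process ZooKeeper) and the Kafka broker classes are on the test classpath; the exact broker config keys vary between Kafka versions, so treat the properties below as illustrative:

    import java.util.Properties;
    import kafka.server.KafkaConfig;
    import kafka.server.KafkaServerStartable;
    import org.apache.curator.test.TestingServer;

    public class EmbeddedKafkaTestSetup {
        public static void main(String[] args) throws Exception {
            TestingServer zookeeper = new TestingServer();   // in-process ZooKeeper

            Properties props = new Properties();
            props.put("zookeeper.connect", zookeeper.getConnectString());
            props.put("broker.id", "0");
            props.put("port", "9092");
            props.put("log.dirs", System.getProperty("java.io.tmpdir") + "/embedded-kafka-logs");
            props.put("offsets.topic.replication.factor", "1");   // single-broker setup

            KafkaServerStartable broker = new KafkaServerStartable(new KafkaConfig(props));
            broker.startup();

            // ... produce test records, run the Flink job against localhost:9092, assert results ...

            broker.shutdown();
            zookeeper.stop();
        }
    }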

Re: FlinkKafkaConsumerBase - Received confirmation for unknown checkpoint

2016-10-27 Thread PedroMrChaves
Hello, I am using version 1.2-SNAPSHOT. I will try with a stable version to see if the problem persists. Regards, Pedro Chaves. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/FlinkKafkaConsumerBase-Received-confirmation-for-unknown-c

RE: TIMESTAMP TypeInformation

2016-10-27 Thread Radu Tudoran
OK. I will open the JIRA. From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: Thursday, October 27, 2016 9:56 AM To: user@flink.apache.org Subject: Re: TIMESTAMP TypeInformation Yes, I think you are right. TypeInfoParser needs to be extended to parse the java.sql.* types into the corresponding T

Re: TIMESTAMP TypeInformation

2016-10-27 Thread Fabian Hueske
Yes, I think you are right. TypeInfoParser needs to be extended to parse the java.sql.* types into the corresponding TypeInfos. Can you open a JIRA for that? Thanks, Fabian 2016-10-27 9:31 GMT+02:00 Radu Tudoran : > Hi, > > > > I dig meanwhile more through this and I think I found a bug actuall

Re: Flink Cassandra Connector is not working

2016-10-27 Thread Fabian Hueske
Hi, a NoSuchMethod error indicates that you are using incompatible versions. You should check that the versions of your job dependencies and the version of the cluster you want to run the job on are the same. Best, Fabian 2016-10-27 7:13 GMT+02:00 NagaSaiPradeep : > Hi, > I am working on connecting Flink

RE: TIMESTAMP TypeInformation

2016-10-27 Thread Radu Tudoran
Hi, I dug deeper into this in the meantime and I think I actually found a bug. The scenario I was trying to describe is something like: 1. You have a generic DataStream with a Tuple (alternatively I could move to Row, I guess) and you get the data from whatever stream (e.g., Kafka, network
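Until TypeInfoParser understands the java.sql.* types, one possible workaround (a sketch; the Long + Timestamp field layout is only an example) is to construct the TupleTypeInfo by hand with SqlTimeTypeInfo instead of parsing a type string, and pass it to Flink explicitly:

    import java.sql.Timestamp;
    import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
    import org.apache.flink.api.common.typeinfo.SqlTimeTypeInfo;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.typeutils.TupleTypeInfo;

    TupleTypeInfo<Tuple2<Long, Timestamp>> rowType = new TupleTypeInfo<>(
            BasicTypeInfo.LONG_TYPE_INFO,    // e.g. an id field
            SqlTimeTypeInfo.TIMESTAMP);      // the java.sql.Timestamp field

    // attach it where Flink cannot infer the type on its own, e.g.:
    // DataStream<Tuple2<Long, Timestamp>> typed = rawStream.map(parseFn).returns(rowType);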