Re: dynamic tables in cassandra sink

2018-05-03 Thread Michael Latta
If you restart the job each month you can build the string dynamically. If you want data to flow into the log based on a date in the record you will need to do something fancier. I have not used the casandra connector so I can’t help on the details. Can you subclass the connector and build the q

dynamic tables in cassandra sink

2018-05-03 Thread Meghashyam Sandeep V
Hi there, I have a flink stream from kafka writing to Cassandra. We use monthly tables in Cassandra to avoid TTL and tombstones that come with it. Tables would be like table_05_2018, table_06_2018 and so on. How do I dynamically register this table name in the following snippet? CassandraSink .ad

Re: Classloader and removal of native libraries

2018-05-03 Thread Martin Eden
Hi, I'm reviving this thread because I am probably hitting a similar issue with loading a native library. However I am not able to resolve it with the suggestions here. I am using Flink 1.3.2 and the Jep library to call Cpython from a RichFlatMapFunction with a parallelism of 10. I am instantiati

Re: Stream.union doesn't change the parallelism of the new stream?

2018-05-03 Thread Fabian Hueske
Hi, Union is not an actual operator in Flink. Instead, the operator that is applied on the unioned stream ingests its input from all unioned streams. The parallelism of that operator is the configured default parallelism (can be specified at the execution environment) unless it is explicitly defin

Stream.union doesn't change the parallelism of the new stream?

2018-05-03 Thread Ajay Tripathy
It seems, at least in Flink 1.3.2, unioning two streams together doesn't change the parallelism, and that the new stream just retains the parallelism of the stream in the first. Does it make sense for the parallelism of the new stream to be set to the max of the two streams parallelism? Do I have t

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Julio Biason
Hey Gary (again), Yup, that worked. Now I can launch apps again. ... but that's not something actually good. I mean, I have my own test environment, which doesn't need HA -- after all, I don't need to worry about this, this is a framework job, not my pipeline job. Which means now I'll need to ei

Flink + Marathon (Mesos) Memory Issues

2018-05-03 Thread ani.desh1512
*Background*: We have a setup of Flink 1.4.0. We run this flink cluster via /flink-jobmanager.sh foreground/ and /flink-taskmanager.sh foreground/ command via Marathon (which launches them as mesos jobs). So, basically, jobmanager and taskmanagers run as mesos tasks. Now, say, we run the flink

Re: How to set fix JobId for my application.

2018-05-03 Thread shashank734
Thanks for the response, Actually I know that. The main thing I have to use that job id of app A in the queryable state in other application B. In that case, I don't want to redeploy application B whenever I change something in application A. -- Sent from: http://apache-flink-user-mailing-list

Stashing key with AggregateFunction

2018-05-03 Thread Ken Krugler
Hi list, I was trying different ways to implement a moving average (count based, not time based). The blunt instrument approach is to create a custom FlatMapFunction that keeps track of the last N values. It seemed like using an AggregateFunction would be most consistent with the Flink API, a

Re: intentional back-pressure (or a poor man's side-input)

2018-05-03 Thread Derek VerLee
Thanks for the thoughts Piotr. Seems I have a talent for asking (nearly) the same question as someone else at the same time, and the check-pointing was raised in that thread as well. I guess one way to conceptualize it is that you have is a stream job that has "

Re: IOException: Size of the state is larger than the maximum permitted memory-backed state

2018-05-03 Thread Gary Yao
Hi, It looks like you are still using the MemoryStateBackend. Are you overriding the state backend settings from within your job? To debug this, it would be helpful to see the JobManager logs and the contents of your flink-conf.yaml Best, Gary On Wed, May 2, 2018 at 3:25 AM, syed wrote: > Hi;

Re: How to set fix JobId for my application.

2018-05-03 Thread Gary Yao
Hi, AFAIK it is not possible to set the job id manually. You could get the job id via the REST API. For example, http://host:port/joboverview/running gives you a list of running jobs [1] which you can filter by name. Would that work for you? Best, Gary [1] https://ci.apache.org/projects/flink/f

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-03 Thread Nico Kruber
Also, please have a look at the other TaskManagers' logs, in particular the one that is running the operator that was mentioned in the exception. You should look out for the ID 98f5976716234236dc69fb0e82a0cc34. Nico PS: Flink logs files should compress quite nicely if they grow too big :) On 0

Re: intentional back-pressure (or a poor man's side-input)

2018-05-03 Thread Piotr Nowojski
Maybe it could work with Flink’s 1.5 credit base flow control. But you would need a way to express state “block one input side of the CoProcessFunction”, pass this information up to the input gate and handle it probably similar to how `org.apache.flink.streaming.runtime.io.CachedBufferBlocker` b

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Julio Biason
Hey Gary, Yes, I was still running with the `-m` flag on my dev machine -- partially configured like prod, but without the HA stuff. I never thought it could be a problem, since even the web interface can redirect from the secondary back to primary. Currently I'm still running 1.4.0 (and I plan t

Re: Cannot submit jobs on a HA Standalone JobManager

2018-05-03 Thread Gary Yao
Hi Julio, Are you using the -m flag of "bin/flink run" by any chance? In HA mode, you cannot manually specify the JobManager address. The client determines the leader through ZooKeeper. If you did not configure the ZooKeeper quorum in the flink-conf.yaml on the machine from which you are submittin

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-03 Thread Stephan Ewen
Google Drive would be great. Thanks! On Thu, May 3, 2018 at 1:33 PM, Amit Jain wrote: > Hi Stephan, > > Size of JM log file is 122 MB. Could you provide me other media to > post the same? We can use Google Drive if that's fine with you. > > -- > Thanks, > Amit > > On Thu, May 3, 2018 at 12:58 P

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-03 Thread Amit Jain
Hi Stephan, Size of JM log file is 122 MB. Could you provide me other media to post the same? We can use Google Drive if that's fine with you. -- Thanks, Amit On Thu, May 3, 2018 at 12:58 PM, Stephan Ewen wrote: > Hi Amit! > > Thanks for sharing this, this looks like a regression with the netwo

Re: Configure provided libraries with maven

2018-05-03 Thread Chesnay Schepler
If you use IntelliJ you can use a profile that sets dependencies to compile if executed in the IDE. We use this in our quickstarts . On 03.05.2018 12:56, Georg

Re: ConnectedIterativeStreams and processing state 1.4.2

2018-05-03 Thread Lasse Nedergaard
I could but the external Rest call is done with async operator and I want to reduce the number of objects going to async and it would require that I store the state in the async operator to. Med venlig hilsen / Best regards Lasse Nedergaard > Den 3. maj 2018 kl. 13.09 skrev Aljoscha Krettek :

Re: ConnectedIterativeStreams and processing state 1.4.2

2018-05-03 Thread Aljoscha Krettek
Couldn't you do that in one operator then? I mean doing the calls and caching the results? > On 3. May 2018, at 12:28, Lasse Nedergaard wrote: > > Hi. > > The idea is to cache the latest enrichment data to reuse them and thereby > limit the number of external enrichment calls a local cache i

Configure provided libraries with maven

2018-05-03 Thread Georgi Stoyanov
Hi guys, In our project we have setup that uses flink-core, flink-java, rocksdb, slf4j (and some more libraries, that are on the machines, where the jars will be deployed) with scope “provided”. This is a bit tricky cause when we run the tests in IntelliJ there’s no logging (cause of missing sl

RE: use of values of previously accepted event

2018-05-03 Thread Esa Heikkinen
Hi Thanks for the reply. I have tried to understand IterativeCondition, but I have not yet fully understood. How can it apply to my case ? If I have more (than one) variables to set (and read) in pattern, is that possible by IterativeCondition ? Are there exist more examples how to use it ? o

Re: ConnectedIterativeStreams and processing state 1.4.2

2018-05-03 Thread Lasse Nedergaard
Hi. The idea is to cache the latest enrichment data to reuse them and thereby limit the number of external enrichment calls a local cache in Flink as many of our data objects are enriched with the same data. An alternative solution could be to store the enriched data in Kafka and then stream

Re: ConnectedIterativeStreams and processing state 1.4.2

2018-05-03 Thread Aljoscha Krettek
Hi, Why do you want to do the enrichment downstream and send the data back up? The problem is that feedback edges (or iterations, they are the same in Flink) have some issues with fault-tolerance. Could you maybe outline a bit more in-depths what you're doing and what the flow of data and enric

Re: Odd job failure

2018-05-03 Thread Stephan Ewen
Concerning the connectivity issue - it is hard to say anything more without any logs or details. Does the JM log that it is trying to send tasks to the 3rd TM, but the TM does not show signs of executing them? On Thu, May 3, 2018 at 10:22 AM, Stephan Ewen wrote: > Hi Elias! > > Concerning the

Re: Odd job failure

2018-05-03 Thread Stephan Ewen
Hi Elias! Concerning the spilling of alignment data to disk: - In 1.4.x , you can set an upper limit via " task.checkpoint.alignment.max-size ". See [1]. - In 1.5.x, the default is a back-pressure based alignment, which does not spill any more. Best, Stephan [1] https://ci.apache.org/projec

Re: Window over events defined by a time range

2018-05-03 Thread Fabian Hueske
Hi, Flink can add events to mulitple windows. For instance, the built-in sliding windows are doing this. You can address your use case by implementing a custom WindowAssigner [1]. Best, Fabian [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/windows.html#windo

Re: Batch job stuck in Canceled state in Flink 1.5

2018-05-03 Thread Stephan Ewen
Hi Amit! Thanks for sharing this, this looks like a regression with the network stack changes. The log you shared from the TaskManager gives some hint, but that exception alone should not be a problem. That exception can occur under a race between deployment of some tasks while the whole job is e

How to set fix JobId for my application.

2018-05-03 Thread shashank734
How I can set Fixed JobId for my flink Job. Cause queryable client required Job ID. So whenever I'll update or redeploy my queryable state job than JobID will change and i have to change and redeploy in queryable client app. Is there any way I can fix the jobID or dynamically pass in the client a