Re: Batch job per stream message?

2017-11-01 Thread Tomas Mazukna
Hi Fabian, thanks for pointing me in the right direction. Reading through the documentation here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/asyncio.html it seems like I can accomplish what I need with an async call to a REST service or a JDBC query per stream item being pr

Re: JobManager web interface redirect strategy when running in HA

2017-11-01 Thread Chesnay Schepler
We intend to change the redirection behavior such that any jobmanager (leading or not) can accept requests and communicate internally with the leader. In this model you could set up the flink.domain.tld to point to any jobmanager (or distribute requests among them). Would this work for you?
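To make the proposed model concrete, here is a toy sketch in plain Java. It is not Flink code: the class and method names (LeaderForwardingSketch, JobManagerNode, handle) are hypothetical and only illustrate the idea that a non-leading node accepts a request and passes it to the leader internally instead of redirecting the client.

```java
public class LeaderForwardingSketch {

    // Hypothetical stand-in for a jobmanager; not a Flink class.
    static final class JobManagerNode {
        private final String name;
        // null means this node is itself the leader (an assumption of this sketch).
        private JobManagerNode leader;

        JobManagerNode(String name) {
            this.name = name;
        }

        void setLeader(JobManagerNode leader) {
            this.leader = leader;
        }

        // Every node accepts the request; only the leader actually serves it.
        String handle(String request) {
            if (leader == null) {
                return name + " served " + request;
            }
            // Non-leader: forward internally rather than redirecting the client.
            return leader.handle(request);
        }
    }

    public static void main(String[] args) {
        JobManagerNode jm1 = new JobManagerNode("jm-1"); // leader
        JobManagerNode jm2 = new JobManagerNode("jm-2"); // standby
        jm2.setLeader(jm1);

        // The client may talk to either node and gets the leader's answer.
        System.out.println(jm1.handle("GET /jobs"));
        System.out.println(jm2.handle("GET /jobs"));
    }
}
```

With this shape, a single hostname can point at any node (or at several), since the client never needs to know which one currently leads.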

Re: Flink send checkpointing message in IT

2017-11-01 Thread Chesnay Schepler
You could trigger a savepoint, which from the viewpoint of sources/operators/sinks is the same thing as a checkpoint. How to do this depends a bit on how your test case is written, but you can take a look at the SavepointMigrationTestBase#executeAndSavepoint which is all about running jobs and

Re: Using Flink Ml with DataStream

2017-11-01 Thread Chesnay Schepler
I don't believe this to be possible. The ML library works exclusively with the Batch API. On 30.10.2017 12:52, Adarsh Jain wrote: Hi, Is there a way to use Stochastic Outlier Selection (SOS) and/or SVM using CoCoA with streaming data. Please suggest and give pointers. Regards, Adarsh

Re: Job Manager Configuration

2017-11-01 Thread Chesnay Schepler
AFAIK there is no theoretical limit on the size of the plan, it just depends on the available resources. The job submission times out since it takes too long to deploy all the operators that the job defines. With 300 flows, each with 6 operators, you're looking at potentially (1800 * paralleli

Re: Reprocessing the data after config change

2017-11-01 Thread Fabian Hueske
Hi Tomasz, that sounds like a sound design. You have to make sure that the output of the application is idempotent such that the reprocessing job overwrites all output data of the earlier job. Best, Fabian 2017-10-23 16:24 GMT+02:00 Tomasz Dobrzycki : > Hi all, > > I'm currently working on a
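The idempotency requirement in the reply above can be illustrated with a minimal plain-Java sketch. It assumes the output is written by key into a store with upsert semantics; the class name IdempotentOutput and the in-memory map standing in for the external store are hypothetical and not part of Flink.

```java
import java.util.HashMap;
import java.util.Map;

public class IdempotentOutput {

    // Hypothetical stand-in for an external key-value store or upsert-capable table.
    private final Map<String, Long> store = new HashMap<>();

    // Writing by key replaces any earlier value, so repeating or
    // redoing a write never produces a second record for the same key.
    public void write(String key, long value) {
        store.put(key, value);
    }

    // Returns a copy so callers cannot modify the store directly.
    public Map<String, Long> snapshot() {
        return new HashMap<>(store);
    }

    public static void main(String[] args) {
        IdempotentOutput out = new IdempotentOutput();

        // First run of the job with the old configuration.
        out.write("user-1", 10L);
        out.write("user-2", 20L);

        // Reprocessing run after the config change: same keys, new results.
        out.write("user-1", 11L);
        out.write("user-2", 22L);

        // Only the reprocessed values remain, one per key.
        System.out.println(out.snapshot());
    }
}
```

An append-only output would instead keep both the old and the new results, which is the situation the reply warns against.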

Re: Batch job per stream message?

2017-11-01 Thread Fabian Hueske
Hi Tomas, triggering a batch DataSet job from a DataStream program for each input record doesn't sound like a good idea to me. You would have to make sure that the cluster always has sufficient resources and handle failures. It would be preferable to have all data processing in a DataStream job.

Re: Help on RowTypeInfo?

2017-11-01 Thread Fabian Hueske
Hi Paul, The *.scala.StreamTableEnvironment is for Scala programs, the *.java.StreamTableEnvironment for Java programs and the third is the common basis of the Scala and Java environment. TableEnvironment.getTableEnvironment automatically creates the appropriate TableEnvironment based on the provi