Regarding your question about how to run the programs one after another: in the main method of your program, you can simply put a loop around the part between "getExecutionEnvironment()" and "env.execute()". Each loop iteration then triggers one program, so the programs run one after another.
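To make the control flow concrete, here is a minimal sketch of such a loop. It deliberately avoids any Flink dependency so that it runs as-is: `runGrep` is a hypothetical stand-in for everything between "getExecutionEnvironment()" and "env.execute()", and the class name, method names, and sample data are assumptions made up for illustration. In a real job, the body of `runGrep` would build the plan on a fresh environment and call "env.execute()".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SequentialRuns {

    // Hypothetical stand-in for one complete program, i.e. the part between
    // getExecutionEnvironment() and env.execute() in a real Flink job.
    // It returns only after the run has finished, so the caller's loop
    // cannot start the next run early.
    public static List<String> runGrep(List<String> lines, String pattern) {
        List<String> matches = new ArrayList<>();
        for (String line : lines) {
            if (line.contains(pattern)) {
                matches.add(line);
            }
        }
        return matches;
    }

    // The loop from the main method: one full run per pattern, in order.
    public static Map<String, List<String>> runAll(List<String> lines, List<String> patterns) {
        // LinkedHashMap keeps the results in the order the runs were triggered.
        Map<String, List<String>> results = new LinkedHashMap<>();
        for (String pattern : patterns) {
            results.put(pattern, runGrep(lines, pattern));
        }
        return results;
    }

    public static void main(String[] args) {
        // Made-up sample input; a real job would read its input from a file.
        List<String> lines = Arrays.asList(
                "error: disk full", "warning: low memory", "error: timeout");
        Map<String, List<String>> results = runAll(lines, Arrays.asList("error", "warning"));
        for (Map.Entry<String, List<String>> entry : results.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue().size() + " matching lines");
        }
    }
}
```

The point of the sketch is only the structure: the environment is set up and executed inside the loop body, once per pattern, rather than once for the whole program.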
On Wed, Feb 4, 2015 at 9:34 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> Hi Stefan,
>
> Flink uses only one broadcast variable for all parallel tasks on one machine.
> Flink can also load the broadcast variable into a custom data structure.
>
> Have a look at the getBroadcastVariableWithInitializer() method:
>
> /**
>  * Returns the result bound to the broadcast variable identified by the
>  * given {@code name}. The broadcast variable is returned as a shared data structure
>  * that is initialized with the given {@link BroadcastVariableInitializer}.
>  * <p>
>  * IMPORTANT: The broadcast variable data structure is shared between the parallel
>  * tasks on one machine. Any access that modifies its internal state needs to
>  * be manually synchronized by the caller.
>  *
>  * @param name The name under which the broadcast variable is registered;
>  * @param initializer The initializer that creates the shared data structure of the broadcast
>  *                    variable from the sequence of elements.
>  * @return The broadcast variable, materialized as a list of elements.
>  */
> <T, C> C getBroadcastVariableWithInitializer(String name,
>         BroadcastVariableInitializer<T, C> initializer);
>
> Right now, there is no easy way to run multiple tasks one after the other
> that I am aware of.
> However, we are working on materializing intermediate results. Once this
> feature is available, it should be easy to do the grep steps one by one.
>
> Cheers, Fabian