Hi Flavio,

This is not really possible at the moment, though there is a
workaround. You can create a dummy jar file (it may be empty). Then you can use

    ./flink run -C hdfs:///path/to/cluster.jar -c org.package.SampleClass /path/to/dummy.jar

That way Flink will include your cluster jar and you can load all the
classes you need.

Alternatively, using the RemoteEnvironment, it looks like this:

    import java.lang.reflect.Method;
    import java.net.URL;
    import java.net.URLClassLoader;
    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.api.java.RemoteEnvironment;
    import org.apache.flink.configuration.Configuration;

    public static void main(String[] args) throws Exception {
        // classpath URLs handed to the cluster; reused below for the local class loader
        final URL[] classpaths = new URL[]{
            new URL("file:///path/to/sample.jar"),
            new URL("file:///Users/max/Dev/flink/build-target/lib/flink-dist_2.10-1.2-SNAPSHOT.jar")};

        final RemoteEnvironment env = new RemoteEnvironment(
            "remoteHost", 6123, new Configuration(), new String[0], classpaths);

        URLClassLoader classLoader = new URLClassLoader(classpaths);
        Class<?> clazz = classLoader.loadClass("org.package.sample.SampleClass");
        Method main = clazz.getDeclaredMethod("sampleMethod", ExecutionEnvironment.class);

        // pass environment as an argument to your sample method
        // the method should return the results of the execution
        Object sampleResult = main.invoke(null, env);
    }

Beware, this is extremely hacky. We should have a better way to invoke
jar files remotely. Honestly, the best thing is to keep a local copy
of your sampling jars and work directly with them.

Cheers,
Max

On Tue, Sep 27, 2016 at 12:25 PM, Flavio Pompermaier <pomperma...@okkam.it> wrote:
> Hi Max,
> actually I have a jar containing sampling jobs and I need to collect
> results from a client.
> I've tried to use ExecutionEnvironment.createRemoteEnvironment but I fear
> that it's not the right way to do that because
> I just need to tell the cluster the main class and the parameters to run
> the job (and where the jar file is on HDFS).
>
> Best,
> Flavio
>
> On Tue, Sep 27, 2016 at 12:06 PM, Maximilian Michels <m...@apache.org>
> wrote:
>
>> Hi Flavio,
>>
>> Do you want to sample from a running batch job? That would be like
>> Queryable State in streaming jobs but it is not supported in batch
>> mode.
>>
>> Cheers,
>> Max
>>
>>
>> On Mon, Sep 26, 2016 at 6:13 PM, Flavio Pompermaier
>> <pomperma...@okkam.it> wrote:
>> > Hi to all,
>> >
>> > I have a use case where I need to tell a Flink cluster to give me a sample
>> > of X records using parametrizable sampling functions. Is there any best
>> > practice or advice to do that?
>> >
>> > Should I create a Remote ExecutionEnvironment or should I use the Flink
>> > client (I don't know if it uses REST services or RPC or whatever)?
>> > Is there any java snippet for that?
>> >
>> > Best,
>> > Flavio