Hi Max,
that's exactly what I was looking for. What do you mean by 'the best thing
is if you keep a local copy of your sampling jars and work directly with
them'?

Best,
Flavio

On Tue, Sep 27, 2016 at 2:35 PM, Maximilian Michels <m...@apache.org> wrote:

> Hi Flavio,
>
> This is not really possible at the moment, though there is a workaround.
> You can create a dummy jar file (it may be empty). Then you can use
>
> ./flink run -C hdfs:///path/to/cluster.jar -c org.package.SampleClass \
>     /path/to/dummy.jar
>
> That way Flink will include your cluster jar and you can load all classes
> necessary.
>
> Alternatively, with a RemoteEnvironment, it looks like this:
>
> public static void main(String[] args) throws Exception {
>
>    final RemoteEnvironment env = new RemoteEnvironment(
>       "remoteHost",
>       6123,
>       new Configuration(),
>       new String[0],
>       new URL[]{
>          new URL("file:///path/to/sample.jar"),
>          new URL("file:///Users/max/Dev/flink/build-target/lib/flink-dist_2.10-1.2-SNAPSHOT.jar")});
>    URLClassLoader classLoader = new URLClassLoader(env.globalClasspaths.toArray(new URL[0]));
>
>    Class<?> clazz = classLoader.loadClass("org.package.sample.SampleClass");
>
>    Method main = clazz.getDeclaredMethod("sampleMethod", ExecutionEnvironment.class);
>
>    // pass environment as an argument to your sample method
>    // the method should return the results of the execution
>    Object sampleResult = main.invoke(null, env);
> }
>
>
> Beware, this is extremely hacky. We should have a better way to invoke jar
> files remotely. Honestly, the best thing is if you keep a local copy of
> your sampling jars and work directly with them.
>
> Cheers,
> Max
>
> On Tue, Sep 27, 2016 at 12:25 PM, Flavio Pompermaier <pomperma...@okkam.it>
> wrote:
>
>> Hi Max,
>> actually I have a jar containing sampling jobs and I need to collect
>> results from a client.
>> I've tried to use ExecutionEnvironment.createRemoteEnvironment but I
>> fear that it's not the right way to do that because
>> I just need to tell the cluster the main class and the parameters to run
>> the job (and where the jar file is on HDFS).
>>
>> Best,
>> Flavio
>>
>> On Tue, Sep 27, 2016 at 12:06 PM, Maximilian Michels <m...@apache.org>
>> wrote:
>>
>>> Hi Flavio,
>>>
>>> Do you want to sample from a running batch job? That would be like
>>> Queryable State in streaming jobs but it is not supported in batch
>>> mode.
>>>
>>> Cheers,
>>> Max
>>>
>>>
>>> On Mon, Sep 26, 2016 at 6:13 PM, Flavio Pompermaier
>>> <pomperma...@okkam.it> wrote:
>>> > Hi to all,
>>> >
>>> > I have a use case where I need to tell a Flink cluster to give me a sample
>>> > of X records using parametrizable sampling functions. Is there any best
>>> > practice or advice to do that?
>>> >
>>> > Should I create a Remote ExecutionEnvironment or should I use the Flink
>>> > client (I don't know if it uses REST services or RPC or whatever)?
>>> > Is there any Java snippet for that?
>>> >
>>> > Best,
>>> > Flavio
>>> >
>>>
>>
>>
>>
>>
>
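
For the dummy-jar workaround mentioned in the thread, a minimal sketch of creating an empty jar programmatically might look like the following. The class name DummyJar and the method createDummyJar are hypothetical names chosen for illustration. Only the standard java.util.jar API is used. That a jar with no classes is acceptable as the program jar is an assumption taken from Max's remark that the dummy jar may be empty.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

public class DummyJar {

    // Hypothetical helper: writes a jar that contains only a manifest.
    static Path createDummyJar(Path target) throws IOException {
        Manifest manifest = new Manifest();
        // Without a Manifest-Version attribute the manifest is written out empty.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");

        try (OutputStream out = Files.newOutputStream(target);
             JarOutputStream jar = new JarOutputStream(out, manifest)) {
            // Intentionally no entries: the jar only serves as a placeholder.
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        Path jar = createDummyJar(Files.createTempFile("dummy", ".jar"));
        System.out.println("Wrote " + jar);
    }
}
```

The resulting file would then take the place of /path/to/dummy.jar in the flink run command quoted above, with the real classes supplied through -C.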
