Hi Cory!

There is no flag to define the BlobServer port right now, but we should
definitely add this: https://issues.apache.org/jira/browse/FLINK-2996

If your setup is such that the firewall problem is only between client and
master node (and the workers can reach the master on all ports), then you
can try two workarounds:

1) Start the program in the cluster (or on the master node, via ssh).

2) Add the program jar to the lib directory of Flink, and start your
program with the RemoteExecutor, without a jar attachment (see the sketch
below). Then it only needs to communicate with the actor system (RPC) port,
which is not random in standalone mode (6123 by default).
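
For example, a minimal sketch using ExecutionEnvironment.createRemoteEnvironment
(which uses the RemoteExecutor internally); "jobmanager-host" and the example
program are placeholders, and the job's classes must already be in Flink's lib
directory:

  import org.apache.flink.api.java.ExecutionEnvironment;

  public class RemoteJob {
      public static void main(String[] args) throws Exception {
          // Connect to the JobManager's RPC port (6123 by default).
          // No jar files are attached, so the job's classes must already be
          // on the cluster's classpath (e.g. in Flink's lib directory).
          ExecutionEnvironment env =
                  ExecutionEnvironment.createRemoteEnvironment("jobmanager-host", 6123);

          env.fromElements(1, 2, 3).print();
      }
  }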

Stephan




On Tue, Nov 10, 2015 at 8:46 PM, Cory Monty <cory.mo...@getbraintree.com>
wrote:

> I'm also running into an issue with a non-YARN cluster. When submitting a
> JAR to Flink, we'll need to have an arbitrary port open on all of the
> hosts, which we don't know about until the socket attempts to bind; a bit
> of a problem for us.
>
> Are there ways to submit a JAR to Flink that bypass the need for the
> BlobServer's random port binding? Or, to control the port the BlobServer
> binds to?
>
> Cheers,
>
> Cory
>
> On Thu, Nov 5, 2015 at 8:07 AM, Niels Basjes <ni...@basjes.nl> wrote:
>
>> That is what I tried. Couldn't find that port though.
>>
>> On Thu, Nov 5, 2015 at 3:06 PM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi,
>>>
>>> cool, that's good news.
>>>
>>> The RM proxy is only for the web interface of the AM.
>>>
>>>  I'm pretty sure that the MapReduce AM has at least two ports:
>>> - one for the web interface (accessible through the RM proxy, so behind
>>> the firewall)
>>> - one for the AM RPC (and that port is allocated within the configured
>>> range, open through the firewall).
>>>
>>> You can probably find the RPC port in the log file of the running
>>> MapReduce AM (to find that, identify the NodeManager running the AM, access
>>> the NM web interface and retrieve the logs of the container running the AM).
>>>
>>> Maybe the mapreduce client also logs the AM RPC port when querying the
>>> status of a running job.
>>>
>>>
>>> On Thu, Nov 5, 2015 at 2:59 PM, Niels Basjes <ni...@basjes.nl> wrote:
>>>
>>>> Hi,
>>>>
>>>> I checked and this setting has been set to a limited port range of only
>>>> 100 port numbers.
>>>>
>>>> I tried to find the actual port an AM is running on and couldn't find
>>>> it (I'm not the admin on that cluster)
>>>>
>>>> The url to the AM that I use to access it always looks like this:
>>>>
>>>> http://master-001.xxxxxx.net:8088/proxy/application_1443166961758_85492/index.html
>>>>
>>>> As you can see I never connect directly; always via the proxy that runs
>>>> over the master on a single fixed port.
>>>>
>>>> Niels
>>>>
>>>> On Thu, Nov 5, 2015 at 2:46 PM, Robert Metzger <rmetz...@apache.org>
>>>> wrote:
>>>>
>>>>> While discussing the issue with my colleagues today, we came up with
>>>>> another approach to resolve it:
>>>>>
>>>>> d) Upload the job jar to HDFS (or another FS) and trigger the
>>>>> execution of the jar using an HTTP request to the web interface.
>>>>>
>>>>> We could add some tooling into the /bin/flink client to submit a job
>>>>> like this transparently, so users would not need to bother with the file
>>>>> upload and request sending.
>>>>> Also, Sachin started a discussion on the dev@ list to add support for
>>>>> submitting jobs over the web interface, so maybe we can base the fix for
>>>>> FLINK-2960 on that.
>>>>>
>>>>> I've also looked into the Hadoop MapReduce code, and it seems they do
>>>>> the following:
>>>>> When submitting a job, they upload the job jar file to HDFS. They also
>>>>> upload a configuration file that contains all the config options of the
>>>>> job. Then, they submit all of this together as an application to YARN.
>>>>> So far, no firewall has been involved. They establish a connection
>>>>> between the JobClient and the ApplicationMaster when the user queries
>>>>> the current job status, but I could not find any special code for
>>>>> getting the status over HTTP.
>>>>>
>>>>> But I found the following configuration parameter:
>>>>> "yarn.app.mapreduce.am.job.client.port-range", so it seems that they try
>>>>> to allocate the AM port within that range (if specified).
>>>>> Niels, can you check if this configuration parameter is set in your
>>>>> environment? I assume your firewall allows outside connections from that
>>>>> port range.
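>>>>> For example, a small sketch to read that parameter from the cluster
>>>>> configuration (the config file path is an assumption and may differ on
>>>>> your cluster):
>>>>>
>>>>>   import org.apache.hadoop.conf.Configuration;
>>>>>   import org.apache.hadoop.fs.Path;
>>>>>
>>>>>   public class CheckPortRange {
>>>>>       public static void main(String[] args) {
>>>>>           Configuration conf = new Configuration();
>>>>>           // Assumed location of the MapReduce client config; adjust as needed.
>>>>>           conf.addResource(new Path("/etc/hadoop/conf/mapred-site.xml"));
>>>>>           System.out.println(
>>>>>               conf.get("yarn.app.mapreduce.am.job.client.port-range", "<not set>"));
>>>>>       }
>>>>>   }
>>>>>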
>>>>> So we also have a new approach:
>>>>>
>>>>> f) Allocate the YARN application master (and blob manager) within a
>>>>> user-specified port-range.
>>>>>
>>>>> This would be really easy to implement, because we would just need to
>>>>> go through the range until we find an available port.
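>>>>> A rough sketch of that scan (plain Java, not actual Flink code; a real
>>>>> implementation would keep the socket open or otherwise handle the race
>>>>> between finding and using the port):
>>>>>
>>>>>   import java.io.IOException;
>>>>>   import java.net.ServerSocket;
>>>>>
>>>>>   public class PortRangeScan {
>>>>>       // Return the first free port in [from, to].
>>>>>       static int findFreePort(int from, int to) throws IOException {
>>>>>           for (int port = from; port <= to; port++) {
>>>>>               try (ServerSocket socket = new ServerSocket(port)) {
>>>>>                   return socket.getLocalPort();  // bind succeeded, port is free
>>>>>               } catch (IOException e) {
>>>>>                   // port already in use, try the next one
>>>>>               }
>>>>>           }
>>>>>           throw new IOException("No free port in range " + from + "-" + to);
>>>>>       }
>>>>>
>>>>>       public static void main(String[] args) throws IOException {
>>>>>           System.out.println(findFreePort(50100, 50200));
>>>>>       }
>>>>>   }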
>>>>>
>>>>>
>>>>> On Tue, Nov 3, 2015 at 1:06 PM, Niels Basjes <ni...@basjes.nl> wrote:
>>>>>
>>>>>> Great!
>>>>>>
>>>>>> I'll watch the issue and give it a test once I see a working patch.
>>>>>>
>>>>>> Niels Basjes
>>>>>>
>>>>>> On Tue, Nov 3, 2015 at 1:03 PM, Maximilian Michels <m...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi Niels,
>>>>>>>
>>>>>>> Thanks a lot for reporting this issue. I think it is a very common
>>>>>>> setup in corporate infrastructure to have restrictive firewall settings.
>>>>>>> For Flink 1.0 (and probably in a minor 0.10.X release) we will have to
>>>>>>> address this issue to ensure proper integration of Flink.
>>>>>>>
>>>>>>> I've created a JIRA to keep track:
>>>>>>> https://issues.apache.org/jira/browse/FLINK-2960
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Max
>>>>>>>
>>>>>>> On Tue, Nov 3, 2015 at 11:02 AM, Niels Basjes <ni...@basjes.nl>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I forgot to answer your other question:
>>>>>>>>
>>>>>>>> On Mon, Nov 2, 2015 at 4:34 PM, Robert Metzger <rmetz...@apache.org
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> so the problem is that you can not submit a job to Flink using the
>>>>>>>>> "/bin/flink" tool, right?
>>>>>>>>> I assume Flink and its TaskManagers properly start and connect to
>>>>>>>>> each other (the number of TaskManagers is shown correctly in the web
>>>>>>>>> interface).
>>>>>>>>>
>>>>>>>>
>>>>>>>> Correct. Flink starts (I see the JobManager UI), but the actual job
>>>>>>>> is not started.
>>>>>>>>
>>>>>>>> Niels Basjes
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Best regards / Met vriendelijke groeten,
>>>>>>
>>>>>> Niels Basjes
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Best regards / Met vriendelijke groeten,
>>>>
>>>> Niels Basjes
>>>>
>>>
>>>
>>
>>
>> --
>> Best regards / Met vriendelijke groeten,
>>
>> Niels Basjes
>>
>
>
