Hi Arvid, thanks for the reply.

Our stores are world-readable, so I don’t think that it’s an access issue. All 
of our clients have the stores present through a shared mount as well. I’m able 
to see the shipped stores in the directory.info output when pulling the YARN 
logs, and can confirm the account submitting the application has correct 
privileges.

The exception I shared occurs during the cluster deployment phase. Here’s the 
full stacktrace:

2021-04-26 13:37:17,468 [main] ERROR ClusterEntrypoint - Could not start 
cluster entrypoint YarnSessionClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to 
initialize the cluster entrypoint YarnSessionClusterEntrypoint.
        at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:182)
        at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:501)
        at 
org.apache.flink.yarn.entrypoint.YarnSessionClusterEntrypoint.main(YarnSessionClusterEntrypoint.java:93)
Caused by: org.apache.flink.util.FlinkException: Could not create the 
DispatcherResourceManagerComponent.
        at 
org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:257)
        at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:210)
        at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:164)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
        at 
org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at 
org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:163)
        ... 2 more
Caused by: org.apache.flink.util.ConfigurationException: Failed to initialize 
SSLEngineFactory for REST server endpoint.
        at 
org.apache.flink.runtime.rest.RestServerEndpointConfiguration.fromConfiguration(RestServerEndpointConfiguration.java:162)
        at 
org.apache.flink.runtime.rest.SessionRestEndpointFactory.createRestEndpoint(SessionRestEndpointFactory.java:54)
        at 
org.apache.flink.runtime.entrypoint.component.AbstractDispatcherResourceManagerComponentFactory.create(AbstractDispatcherResourceManagerComponentFactory.java:150)
        ... 9 more
Caused by: java.nio.file.NoSuchFileException: 
/home/user/ssl/deploy-keys/rest.keystore
        at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at 
sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.newByteChannel(Files.java:407)
        at 
java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
        at java.nio.file.Files.newInputStream(Files.java:152)
        at 
org.apache.flink.runtime.net.SSLUtils.getKeyManagerFactory(SSLUtils.java:266)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:392)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:365)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestServerSSLEngineFactory(SSLUtils.java:163)
        at 
org.apache.flink.runtime.rest.RestServerEndpointConfiguration.fromConfiguration(RestServerEndpointConfiguration.java:160)
        ... 11 more

Given the number of machines in our YARN compute cluster, we’d really like to 
avoid having to copy the stores to each machine, as that would add another 
configuration step each time a machine is replaced, added, etc. The YARN 
shipping feature is really what we need.

The documentation [1] says that we should be able to ship the stores directly 
from our client:

flink run -m yarn-cluster -yt deploy-keys/ flinkapp.jar

But it doesn’t provide an example of the requisite change made in the 
flink-conf.yaml that supports shipped stores.

If we consider that we have the stores available in a local directory called 
/home/user/ssl/deploy-keys/, and we’re shipping the directory through the -yt 
option, what do the values of:

1.     security.ssl.rest.keystore

2.     security.ssl.rest.truststore

need to be in order for this to work? Happy to share our failed application’s 
YARN logs with you if you require them.

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/security-ssl.html#tips-for-yarn--mesos-deployment

// ah

From: Arvid Heise <ar...@apache.org>
Sent: Wednesday, April 21, 2021 1:05 PM
To: Hailu, Andreas [Engineering] <andreas.ha...@ny.email.gs.com>
Cc: user@flink.apache.org
Subject: Re: [1.9.2] Flink SSL on YARN - NoSuchFileException

Hi Andreas,

I'd check where the exception occurs (not clear from what you posted) and 
double-check that that part of the system can access the given path 
deploy-keys/rest.keystore.

The brute-force solution is to manually copy the files into the respective 
directory on all worker nodes + potentially the client.

On Mon, Apr 19, 2021 at 4:45 PM Hailu, Andreas [Engineering] 
<andreas.ha...@gs.com<mailto:andreas.ha...@gs.com>> wrote:
Hi Flink team,

I’m trying to configure Flink on YARN with SSL enabled. I’ve followed the 
documentation’s instructions [1] to generate a keystore and truststore locally, 
and added the properties to my flink-conf.yaml:
security.ssl.rest.keystore: /home/user/ssl/deploy-keys/rest.keystore
security.ssl.rest.truststore: /home/user/ssl/deploy-keys/rest.truststore

I’ve also added the yarnship option so that the keystore and truststore are 
deployed as suggested in [1].

-m yarn-cluster --class <class> [...] -yt /home/user/ssl/deploy-keys/

However, starting the Flink cluster results in a NoSuchFileException:
Caused by: java.nio.file.NoSuchFileException: 
/home/user/ssl/deploy-keys/rest.keystore
            at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
            at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
            at 
sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
            at 
sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
            at java.nio.file.Files.newByteChannel(Files.java:361)
            at java.nio.file.Files.newByteChannel(Files.java:407)
            at 
java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
            at java.nio.file.Files.newInputStream(Files.java:152)
            at 
org.apache.flink.runtime.net.SSLUtils.getKeyManagerFactory(SSLUtils.java:266)
            at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:392)
            at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:365)
            at 
org.apache.flink.runtime.net.SSLUtils.createRestServerSSLEngineFactory(SSLUtils.java:163)
            at 
org.apache.flink.runtime.rest.RestServerEndpointConfiguration.fromConfiguration(RestServerEndpointConfiguration.java:160)

I’m able to see in launch_container.sh that the shipped directory was created 
successfully:

mkdir -p deploy-keys
ln -sf 
"/fs/htmp/yarn/local/usercache/delp/appcache/application_1618711298408_2664/filecache/16/rest.truststore"
 "deploy-keys/rest.truststore"
mkdir -p deploy-keys
ln -sf 
"/fs/htmp/yarn/local/usercache/delp/appcache/application_1618711298408_2664/filecache/13/rest.keystore"
 "deploy-keys/rest.keystore"

So given the above logs, I tried editing flink-conf.yaml to reflect what I saw:
security.ssl.rest.keystore: deploy-keys/rest.keystore
security.ssl.rest.truststore: deploy-keys/rest.truststore
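
For reference, the reasoning behind the relative form: a relative path is 
resolved against the working directory of the process that opens it, and 
launch_container.sh creates deploy-keys/ relative to wherever it runs. The 
Python sketch below only simulates that layout in a temporary directory. The 
names container_dir and elsewhere are made up for illustration, and it assumes 
the script runs in the container's working directory.

```python
import os
import tempfile
from pathlib import Path


def resolves_from(cwd: Path, relative: str) -> bool:
    """Return True if `relative` names an existing file when resolved against `cwd`."""
    return (cwd / relative).is_file()


def build_fake_container(root: Path) -> Path:
    """Mimic the launch_container.sh steps: a localized file plus a symlink to it."""
    cache = root / "filecache" / "13"        # stand-in for the YARN filecache
    cache.mkdir(parents=True)
    (cache / "rest.keystore").write_bytes(b"placeholder")

    container = root / "container_dir"       # hypothetical container working dir
    (container / "deploy-keys").mkdir(parents=True)
    os.symlink(cache / "rest.keystore", container / "deploy-keys" / "rest.keystore")
    return container


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    container = build_fake_container(root)
    elsewhere = root / "elsewhere"           # any other working directory
    elsewhere.mkdir()

    # Same relative string, two different working directories.
    in_container = resolves_from(container, "deploy-keys/rest.keystore")
    from_elsewhere = resolves_from(elsewhere, "deploy-keys/rest.keystore")
    print(in_container, from_elsewhere)
```

The same relative string is found from the simulated container directory and 
not from the other one, so whether the relative setting works depends on which 
process reads it and from where.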

But that didn’t seem to work, either:
Caused by: java.nio.file.NoSuchFileException: deploy-keys/rest.truststore
        at 
sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at 
sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:214)
        at java.nio.file.Files.newByteChannel(Files.java:361)
        at java.nio.file.Files.newByteChannel(Files.java:407)
        at 
java.nio.file.spi.FileSystemProvider.newInputStream(FileSystemProvider.java:384)
        at java.nio.file.Files.newInputStream(Files.java:152)
        at 
org.apache.flink.runtime.net.SSLUtils.getTrustManagerFactory(SSLUtils.java:233)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:397)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestNettySSLContext(SSLUtils.java:365)
        at 
org.apache.flink.runtime.net.SSLUtils.createRestClientSSLEngineFactory(SSLUtils.java:181)
        at 
org.apache.flink.runtime.rest.RestClientConfiguration.fromConfiguration(RestClientConfiguration.java:106)
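
Note that this second trace goes through createRestClientSSLEngineFactory and 
RestClientConfiguration rather than the REST server classes, which may mean it 
was raised by a different process than the first one (possibly the submitting 
client). That is an inference from the class names, not something I have 
confirmed. A small pre-flight check can show which configured store paths 
exist from a given working directory. The Python sketch below is a 
hypothetical helper, not part of Flink, and it assumes flink-conf.yaml holds 
flat "key: value" lines.

```python
from pathlib import Path

STORE_KEYS = ("security.ssl.rest.keystore", "security.ssl.rest.truststore")


def read_store_paths(conf_text: str) -> dict:
    """Extract the two REST store settings from flat 'key: value' lines."""
    found = {}
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() in STORE_KEYS:
            found[key.strip()] = value.strip()
    return found


def missing_stores(conf_text: str, cwd: Path) -> list:
    """List the configured keys whose path does not exist when resolved from cwd."""
    missing = []
    for key, value in read_store_paths(conf_text).items():
        # An absolute value is used as-is; a relative one is resolved against cwd.
        if not (cwd / value).is_file():
            missing.append(key)
    return missing


sample_conf = (
    "security.ssl.rest.keystore: deploy-keys/rest.keystore\n"
    "security.ssl.rest.truststore: deploy-keys/rest.truststore\n"
)
print(read_store_paths(sample_conf))
```

Running missing_stores(Path("flink-conf.yaml").read_text(), Path.cwd()) from 
the directory where the flink command is invoked, and again from a container 
working directory, would show on which side each path fails to resolve.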

What needs to be done to get the YARN application to point to the right 
keystore and truststore?

[1] 
https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/security-ssl.html#tips-for-yarn--mesos-deployment

____________

Andreas Hailu
Data Lake Engineering | Goldman Sachs & Co.


________________________________

Your Personal Data: We may collect and process information about you that may 
be subject to data protection laws. For more information about how we use and 
disclose your personal data, how we protect your information, our legal basis 
to use your information, your rights and who you can contact, please refer to: 
www.gs.com/privacy-notices<http://www.gs.com/privacy-notices>
