[ 
https://issues.apache.org/jira/browse/FLINK-36370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniruddh J updated FLINK-36370:
-------------------------------
     Attachment: flink-cert-issue.log
    Description: 
Hi, in my Kubernetes cluster I have flink-kubernetes-operator v1.7.0 and 
apache-flink v1.18.1 installed. In the FlinkDeployment CR I enable Kubernetes 
high availability services together with mTLS, with something like the following:


{code:java}
high-availability.type: kubernetes
high-availability: org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: 'file:///mnt/pv/ha'
security.ssl.rest.authentication-enabled: 'true'{code}


I end up with an *SSLHandshakeException with empty client certificate*.
 
Each of the two features works fine when enabled on its own. After enabling 
*{{-Djavax.net.debug=all}}* I observed the client-server communication and 
figured out that 
[https://github.com/apache/flink/blob/release-1.18/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java]
 is where the client gets set up, and that this happens from the operator side in 
[https://github.com/apache/flink-kubernetes-operator/blob/b081b75b72ddde643710e869b95b214912882363/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L750]
 (please correct me if I am wrong).
 
When both mTLS and HA are enabled, the client does not seem to get set up. 
Moreover, client creation does not follow the same code path. Below is the part 
of the SSL handshake log just before the error (the full SSL handshake log is 
attached):
{code:java}
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 15:16:12.508 GMT|null:-1|Produced CertificateRequest handshake message (
"CertificateRequest": {
  "certificate types": [ecdsa_sign, rsa_sign, dss_sign]
  "supported signature algorithms": [ecdsa_secp256r1_sha256, .., rsa_sha224, dsa_sha224, ecdsa_sha1, rsa_pkcs1_sha1, dsa_sha1]
  "certificate authorities": [CN=FlinkCA, O=Apache Flink]
}
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 15:16:12.512 GMT|null:-1|Raw read (
0000: 1603030007 0B 000003000000 ............
)
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 15:16:12.513 GMT|null:-1|READ: TLSv1.2 handshake, length = 7
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 15:16:12.513 GMT|null:-1|Consuming client Certificate handshake message (
"Certificates": <empty list>
)
javax.net.ssl|ERROR|53|flink-rest-server-netty-worker-thread-1|2024-09-19 15:16:12.514 GMT|null:-1|Fatal (BAD_CERTIFICATE): Empty server certificate chain (
"throwable" : {
javax.net.ssl.SSLHandshakeException: Empty server certificate chain
{code}
At first glance, it seems that when the Flink server requests a certificate 
from the client, the client sends nothing back, possibly because it has no 
certificate matching the CA?
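
To illustrate that suspicion, here is a minimal sketch using only the standard 
JDK TLS classes. It is not taken from Flink or the operator; the class name 
EmptyClientKeyStoreDemo and the helper chooseAlias are hypothetical, and the 
empty key store merely stands in for a client that has no usable key material. 
It asks the client-side key manager which certificate it would present for the 
CA named in the CertificateRequest above:

```java
import java.security.KeyStore;
import java.security.Principal;
import javax.net.ssl.KeyManager;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.X509KeyManager;
import javax.security.auth.x500.X500Principal;

public class EmptyClientKeyStoreDemo {

    /**
     * Hypothetical helper: returns the key store alias a TLS client would pick
     * when the server lists the given issuers in its CertificateRequest, or
     * null if the key store holds no matching key entry.
     */
    static String chooseAlias(KeyStore keyStore, char[] password, Principal[] issuers)
            throws Exception {
        KeyManagerFactory kmf =
                KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, password);
        for (KeyManager km : kmf.getKeyManagers()) {
            if (km instanceof X509KeyManager) {
                // The socket argument may be null; it is only a selection hint.
                return ((X509KeyManager) km)
                        .chooseClientAlias(new String[] {"RSA", "EC"}, issuers, null);
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: an empty in-memory key store models a client whose
        // keystore was never configured or loaded.
        KeyStore empty = KeyStore.getInstance(KeyStore.getDefaultType());
        empty.load(null, null);

        // Issuer taken from the "certificate authorities" entry in the log.
        Principal[] issuers = {new X500Principal("CN=FlinkCA, O=Apache Flink")};

        System.out.println("chosen alias: " + chooseAlias(empty, new char[0], issuers));
    }
}
```

When no alias is chosen, a JSSE client answers the CertificateRequest with an 
empty Certificate message, which would be consistent with the 
"Certificates": <empty list> entry in the log. Whether this is what actually 
happens inside the operator's REST client is still an open question.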
 
Some client is sending a REST request to the Flink server, which the Netty 
library handles, but until we identify that client we cannot tell whether the 
problem is the client's truststore or something else that we don't see here. 
Thanks!

  was:
Hi, in my kubernetes cluster I have flink-kubernetes-operator v1.7.0 and 
apache-flink v1.18.1 installed. In the FlinkDeployment CR when I enable 
Kubernetes high availability services with mTLS something like below:
```
high-availability.type: kubernetes
high-availability: 
org.apache.flink.kubernetes.highavailability.KubernetesHaServicesFactory
high-availability.storageDir: 'file:///mnt/pv/ha'
security.ssl.rest.authentication-enabled: 'true'
```
I am ending up with `SSLHandshakeException with empty client certificate` .
 
Though both of them work fine when implemented individually. Upon enabling  
`-Djavax.net.debug=all` observed 
client server communication and figured out  
[https://github.com/apache/flink/blob/release-1.18/flink-runtime/src/main/java/org/apache/flink/runtime/rest/RestClient.java]
 is where Client gets setup and it happens from the operator side 
[https://github.com/apache/flink-kubernetes-operator/blob/b081b75b72ddde643710e869b95b214912882363/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/service/AbstractFlinkService.java#L750]
 (correct me here please)
 
When we enable both mTLS and HA the client doesn't seem to be getting setup. 
Not only that, it doesn't follow the same path of client creation. Below is the 
part of the ssl handshake log before getting the error:
```
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 
15:16:12.508 GMT|null:-1|Produced CertificateRequest handshake message (
"CertificateRequest": {
"certificate types": [ecdsa_sign, rsa_sign, dss_sign]
"supported signature algorithms": [ecdsa_secp256r1_sha256, .., rsa_sha224, 
dsa_sha224, ecdsa_sha1, rsa_pkcs1_sha1, dsa_sha1]
"certificate authorities": [CN=FlinkCA, O=Apache Flink]
}
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 
15:16:12.512 GMT|null:-1|Raw read (
0000: 1603030007 0B 000003000000 ............
)
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 
15:16:12.513 GMT|null:-1|READ: TLSv1.2 handshake, length = 7
javax.net.ssl|DEBUG|53|flink-rest-server-netty-worker-thread-1|2024-09-19 
15:16:12.513 GMT|null:-1|Consuming client Certificate handshake message (
"Certificates": <empty list>
)
javax.net.ssl|ERROR|53|flink-rest-server-netty-worker-thread-1|2024-09-19 
15:16:12.514 GMT|null:-1|Fatal (BAD_CERTIFICATE): Empty server certificate 
chain (
"throwable" : {
javax.net.ssl.SSLHandshakeException: Empty server certificate chain
 
```
From the initial looks it seems when Flink server is requesting for 
certificates from Client, the client doesn't send anything back since it does 
not have matching CAs?
 
Some client is sending a REST request to Flink server which the netty library 
is handling but until we figure out the client we don't know whether it's the 
truststore on client that's a problem or something else we don't see here. 
Thanks!


> Flink 1.18 fails with Empty server certificate chain when High Availability 
> and mTLS both enabled
> -------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-36370
>                 URL: https://issues.apache.org/jira/browse/FLINK-36370
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Aniruddh J
>            Priority: Minor
>         Attachments: flink-cert-issue.log



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
