rybakovanton-metta opened a new issue, #11460:
URL: https://github.com/apache/cloudstack/issues/11460

   ### problem
   
   <!--
     Verified this issue exists in main branch and affects cluster communication in VRF/multi-homed environments.
     -->
   
     ##### ISSUE TYPE
     * Bug Report
   
     ##### COMPONENT NAME
     ~~~
     Cluster Communication / Management Server
     ~~~
   
     ##### CLOUDSTACK VERSION
     ~~~
     4.20.1.0
     main branch (current)
     All versions with cluster communication feature
     ~~~
   
     ##### CONFIGURATION
     Advanced networking, VRF/multi-homed management server configuration
   
     ##### OS / ENVIRONMENT
     Linux with VRF (Virtual Routing and Forwarding) or multiple network interfaces
   
     ##### SUMMARY
     The `bind.interface` setting in server.properties only affects server-side binding (the Jetty HTTP server). It is ignored for the client-side HTTP connections used in cluster communication, so outbound cluster requests leave from whatever source address the default route selects. This causes cluster communication failures in VRF or multi-homed environments where management traffic should be isolated to a specific interface.
   
     ##### STEPS TO REPRODUCE
     ~~~
     1. Configure management server with VRF or multiple interfaces
     2. Set bind.interface=192.168.100.10 in server.properties (VRF interface)
     3. Set cluster.node.IP=192.168.100.10 in db.properties
     4. Start CloudStack management server
     5. Observe cluster communication attempts
     ~~~
   
     ##### EXPECTED RESULTS
     ~~~
     - Management server binds to 192.168.100.10:8080 (server-side) ✓
     - Management server binds to 192.168.100.10:9090 (cluster service) ✓
     - Outbound cluster HTTP connections originate from 192.168.100.10 ✓
     - Cluster communication works within the same network namespace/VRF ✓
     ~~~
   
     ##### ACTUAL RESULTS
     ~~~
     - Management server binds to 192.168.100.10:8080 (server-side) ✓
     - Management server binds to 192.168.100.10:9090 (cluster service) ✓
     - Outbound cluster HTTP connections originate from default interface (10.0.1.2) ❌
     - Connection timeout: "Connect to 192.168.100.10:9090 [/192.168.100.10] failed: Connection timed out"
   
     ERROR [c.c.c.ClusterServiceServletImpl] Exception from : https://192.168.100.10:9090/clusterservice
     org.apache.http.conn.ConnectTimeoutException: Connect to 192.168.100.10:9090 [/192.168.100.10] failed: Connection timed out
   
     ss output shows asymmetric routing:
     SYN-SENT [::ffff:10.0.1.2]:57104 -> [::ffff:192.168.100.10]:9090
     ~~~
   
     **Root Cause**: `ClusterServiceServletImpl.getHttpClient()` creates the HttpClient without any source-address binding. Outbound connections therefore take their source address from the system default route instead of the configured `bind.interface`.
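
     To illustrate the mechanism the root cause refers to, here is a minimal sketch using only `java.net`. It binds a client socket to a chosen source address before connecting and reports which source address the listener saw. The class and method names (`SourceBindDemo`, `observedSource`) are hypothetical and not part of CloudStack; the loopback addresses are stand-ins for the VRF address.

     ```java
     import java.io.IOException;
     import java.net.InetAddress;
     import java.net.InetSocketAddress;
     import java.net.ServerSocket;
     import java.net.Socket;

     // Hypothetical demo class, not CloudStack code.
     final class SourceBindDemo {

         /**
          * Starts a throwaway listener on listenIp, connects to it from a client
          * socket explicitly bound to sourceIp, and returns the source address
          * the listener observed for that connection.
          */
         static String observedSource(String listenIp, String sourceIp) throws IOException {
             InetAddress listenAddr = InetAddress.getByName(listenIp);
             InetAddress sourceAddr = InetAddress.getByName(sourceIp);
             try (ServerSocket listener = new ServerSocket(0, 1, listenAddr);
                  Socket client = new Socket()) {
                 // Port 0 lets the OS choose an ephemeral port; only the address is pinned.
                 client.bind(new InetSocketAddress(sourceAddr, 0));
                 client.connect(new InetSocketAddress(listenAddr, listener.getLocalPort()), 2000);
                 try (Socket accepted = listener.accept()) {
                     return accepted.getInetAddress().getHostAddress();
                 }
             }
         }

         public static void main(String[] args) throws IOException {
             System.out.println(observedSource("127.0.0.1", "127.0.0.1"));
         }
     }
     ```

     Without the `bind` call the kernel picks the source address from the routing table, which is the behaviour reported above. On Linux, where all of 127.0.0.0/8 is normally loopback, passing a different loopback source such as 127.0.0.2 should make the effect visible, though that depends on the host configuration.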
   
     **Files Affected**:
     - `framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java:180-183`
     - `client/src/main/java/org/apache/cloudstack/ServerDaemon.java:246` (server binding works correctly)
   
     **Suggested Fix**:
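     Add source-address binding to the HttpClient configuration: read `bind.interface` from server.properties and apply it to outbound connections, either through a custom `ConnectionSocketFactory` that binds the socket to that address or, possibly more simply, through `RequestConfig.Builder.setLocalAddress()` in HttpClient 4.x.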
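
     As a rough sketch of the "read `bind.interface`" step, the helper below resolves the property to an `InetAddress` using only the JDK. The names `BindInterfaceSketch` and `resolveBindAddress` are hypothetical, and how CloudStack actually loads server.properties is not shown here. Treating an unset or wildcard value as "no explicit binding" is an assumption of this sketch.

     ```java
     import java.io.IOException;
     import java.io.StringReader;
     import java.net.InetAddress;
     import java.net.UnknownHostException;
     import java.util.Optional;
     import java.util.Properties;

     // Hypothetical helper, not CloudStack code.
     final class BindInterfaceSketch {

         /**
          * Returns the address configured under bind.interface, or empty when the
          * property is missing, blank, or a wildcard (0.0.0.0 or ::), in which
          * case the caller should leave the source address to the OS.
          */
         static Optional<InetAddress> resolveBindAddress(Properties props) throws UnknownHostException {
             String raw = props.getProperty("bind.interface");
             if (raw == null || raw.trim().isEmpty()) {
                 return Optional.empty();
             }
             InetAddress address = InetAddress.getByName(raw.trim());
             // Binding a client socket to the wildcard address pins nothing.
             return address.isAnyLocalAddress() ? Optional.empty() : Optional.of(address);
         }

         public static void main(String[] args) throws IOException {
             Properties props = new Properties();
             props.load(new StringReader("bind.interface=192.168.100.10\n"));
             System.out.println(resolveBindAddress(props));
         }
     }
     ```

     Returning an `Optional` keeps the existing behaviour as the fallback: when nothing usable is configured, the HTTP client can be built exactly as it is today.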
   
     **Example Diff**:
     ```diff
     --- a/framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java
     +++ b/framework/cluster/src/main/java/com/cloud/cluster/ClusterServiceServletImpl.java
     @@ -177,8 +177,26 @@ public class ClusterServiceServletImpl implements ClusterService {
                      .setConnectionRequestTimeout(timeout)
                      .setSocketTimeout(timeout).build();
      
     +            // Read bind.interface from server.properties (getBindInterface() is a helper still to be added)
     +            String bindInterface = getBindInterface();
     +            SSLConnectionSocketFactory socketFactory = new SSLConnectionSocketFactory(sslContext);
     +
     +            if (bindInterface != null) {
     +                InetAddress localAddress = InetAddress.getByName(bindInterface);
     +                socketFactory = new SSLConnectionSocketFactory(sslContext) {
     +                    @Override
     +                    public Socket connectSocket(int connectTimeout, Socket sock, HttpHost host,
     +                            InetSocketAddress remoteAddress, InetSocketAddress localSocketAddress,
     +                            HttpContext context) throws IOException {
     +                        // Let super create and bind the socket; binding here as well could bind it twice
     +                        return super.connectSocket(connectTimeout, sock, host, remoteAddress,
     +                                new InetSocketAddress(localAddress, 0), context);
     +                    }
     +                };
     +            }
     +
                  s_client = HttpClientBuilder.create()
                          .setDefaultRequestConfig(config)
     -                    .setSSLContext(sslContext)
     +                    .setSSLSocketFactory(socketFactory)
                          .build();
              }
     ```
   
     ---
     P.S.: Example code provided by Claude AI assistant during issue analysis.
   
   ### versions
   
   cloudstack 4.20.1.0
   ubuntu 25
   
   ### The steps to reproduce the bug
   
     1. Configure management server with VRF or multiple interfaces
     2. Set bind.interface=192.168.100.10 in server.properties (VRF interface)
     3. Set cluster.node.IP=192.168.100.10 in db.properties
     4. Start CloudStack management server
     5. Observe cluster communication attempts
   
   
   ### What to do about it?
   
   _No response_


