Gunter Van de Velde has entered the following ballot position for
draft-ietf-rtgwg-net2cloud-problem-statement-41: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to 
https://www.ietf.org/about/groups/iesg/statements/handling-ballot-positions/ 
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-rtgwg-net2cloud-problem-statement/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

# Gunter Van de Velde, RTG AD, comments for
draft-ietf-rtgwg-net2cloud-problem-statement-41

# Thanks for writing up this work to make cloud DCs more mainstream and
connected to enterprises and to open the discussion on routing aspects.

# Please find below my blocking DISCUSS observations from processing the
draft, along with some non-blocking comments.

#DISCUSS
#=======
# I support the DISCUSS from John and Paul: (1) requirements do not belong in a
use-case document; (2) there is information in the document which will not age
well.

# [DISCUSS1] Section 3 is incomplete with respect to the routing issues of
connecting to Cloud DCs. Connecting to cloud data centers presents various
routing challenges, including scalability, security, latency, routing policy
consistency, and multi-cloud complexity. Enterprises need to carefully plan and
manage their routing architecture to ensure reliable, efficient, and secure
connections between on-premises infrastructure and cloud data centers.
Solutions like dedicated connections, BGP security enhancements, and dynamic
routing policies can help mitigate some of these challenges, but they also add
complexity to the overall network architecture. I believe that a use-case
document should address, or at least position, all of these. When the focus is
on a small subset, the bigger picture may be lost. Connecting to and between
Cloud DCs is a complex, multi-dimensional, routing-aware problem space. See my
note [DISCUSS1] below.

# [DISCUSS2] The treatment of Cloud DC security considerations is incomplete.
There are many aspects to consider; see note [DISCUSS2]. Example topics include
Encryption of Data in Transit, Authentication and Access Control, Secure
Routing Protocols, Network Segmentation, Data Encryption at Rest, Visibility
and Monitoring, DDoS Protection, Firewalls and Security Groups, Zero Trust
Security Model, Compliance and Regulatory Considerations, Network Access
Control, Patch Management and Vulnerability Scanning, Distributed Workloads and
Traffic Control, and an Incident Response Plan.

# [DISCUSS3] Vendor-specific Cloud DC products and explicit behaviors are
documented in this document. IETF documents should be vendor agnostic,
especially when very specific behaviors are documented. Vendor behavior will
change over time, making the information provided in the draft stale, outdated,
and potentially harmful to the referenced (unaware) cloud DC vendors.


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

#DETAILED COMMENTS
#=================
## classified as [minor] and [major]

72         3. Issues and Mitigation Methods of Connecting to Cloud DCs.......4
73            3.1. Increased BGP Peering Errors and Mitigation Methods.......4
74            3.2. Site Failures and Methods to Minimize Impacts.............6
75            3.3. Limitations of DNS-based Cloud DC Location Selection......6
76            3.4. Network Issues for 5G Edge Clouds and Mitigation Methods..7
77            3.5. DNS Practices for Hybrid Workloads........................8
78            3.6. NAT Practices for Accessing Cloud Services................9
79            3.7. Cloud Discovery Practices................................10
80         4. Dynamic Connecting Enterprise Sites with Cloud DCs............10
81            4.1. Sites to Cloud DC........................................11
82            4.2. Inter-Cloud Connection...................................13
83            4.3. Extending Private VPNs to Hybrid Cloud DCs...............14
84         5. Methods to Scale IPsec Tunnels to Cloud DCs...................15
85            5.1. Scale IPsec Tunnels Management...........................16
86            5.2. CPEs Interconnection Over the Public Internet............16
.....
98      1. Introduction
99         With the advent of widely available Cloud data centers (DCs)
100        providing services in various geographic locations and advanced
101        tools for monitoring and predicting application behaviors, it is
102        tempting for enterprises to instantiate applications and workloads
103        in Cloud DCs. Some enterprises prefer specific applications to be
104        located close to the end users accessing these services, as the
105        proximity can improve end-to-end latency. In addition, applications
106        and workloads in Cloud DCs can be shut down or moved along with end
107        users in motion thereby modifying the networking connection of
108        subsequently relocated applications and workloads.
109        Cloud services are typically on-demand and designed to be scalable,
110        highly available, and billed based on usage. Most Cloud Operators
111        offer various network functions, such as virtual Firewall services,
112        virtual private clouds services, and virtual Private Branch eXchange
113        (PBX) services, including voice and video conferencing systems. A
114        Cloud DC is a shared infrastructure that hosts services for multiple
115        customers.
116        This document describes the network-related problems enterprises
117        face at the time of writing this document when interconnecting their
118        branch offices with dynamic workloads in Cloud DCs and the
119        mitigation practices to get around those problems.

[major]
Cloud data centers offer numerous benefits, but they also have several
downsides or challenges that organizations need to consider. While cloud data
centers offer scalability, flexibility, and cost-efficiency, organizations must
weigh these benefits against potential downsides such as security risks,
unpredictable costs, limited control, and regulatory compliance challenges. It
is not just about network access to Cloud DCs.

Some of the suggested key downsides are not strictly of a routing technical
nature, while others are, and these have not been adequately addressed in
Section 3. Including these issues in the document, along with an explicit
indication of which are within scope and which are outside scope, will provide
greater clarity to readers and enhance their understanding of the problem space
being discussed:

1. Security and Privacy Concerns:
* Data Breaches: Storing sensitive data in cloud environments increases the
risk of data breaches, as cloud data centers are prime targets for
cyberattacks. Organizations may face issues of unauthorized access if cloud
security is compromised.
* Shared Responsibility: In cloud environments, security is a shared
responsibility between the cloud provider and the customer. Misconfigurations
or failures on either side can lead to vulnerabilities.
* Data Sovereignty: Data stored in cloud data centers may be subject to the
laws and regulations of the country where the data center is located, which
can lead to compliance issues regarding data privacy.

2. Downtime and Availability:
* Service Outages: Even the most reliable cloud providers can experience
downtime, which can lead to service disruptions for organizations relying on
cloud infrastructure. High availability is typically guaranteed, but 100%
uptime is rarely achieved.
* Network Latency: Cloud data centers are remote, so applications that require
low-latency performance might face challenges, especially if the data center
is far from end-users.

3. Cost Management:
* Unpredictable Costs: While cloud services are often marketed as
cost-effective, costs can quickly add up if resources are not properly managed.
Unexpected charges for data egress, scaling, or additional services can lead to
budget overruns.
* Long-Term Costs: Over time, running workloads in the cloud might be more
expensive than on-premises solutions, particularly for organizations with
steady and predictable workloads.

4. Lack of Control:
* Limited Customization: Cloud services typically offer standardized
environments, which may limit an organization’s ability to customize
infrastructure or configurations to meet specific needs. This lack of control
can be problematic for highly specialized applications.
* Vendor Lock-In: Many cloud providers offer proprietary services that can make
it difficult or costly to migrate to another provider or move workloads back
on-premises.

5. Data Transfer and Performance Issues:
* Data Transfer Costs: Transferring large volumes of data to and from the cloud
can be expensive and time-consuming, particularly when dealing with bandwidth
limitations or the cost of data egress.
* Performance Variability: In multi-tenant cloud environments, performance can
fluctuate depending on the overall usage of resources by other clients. This
can impact critical workloads if performance varies unexpectedly.

6. Compliance and Legal Issues:
* Regulatory Compliance: Organizations in highly regulated industries (e.g.,
healthcare, finance) must ensure that their use of cloud services complies with
specific regulations such as GDPR, HIPAA, or PCI-DSS. Ensuring compliance can
be complicated by the global nature of cloud data centers.
* Data Jurisdiction: Storing data in foreign cloud data centers might expose
organizations to jurisdictional issues, where the data becomes subject to
foreign laws and regulations.

7. Dependence on Internet Connectivity:
* Connectivity Issues: Cloud services require reliable internet access. If an
organization experiences internet outages or slow connectivity, access to
cloud-hosted applications and data may be compromised, impacting productivity.

8. Complexity in Hybrid Environments:
* Integration Challenges: Managing hybrid cloud environments (where some
resources are in the cloud and others on-premises) can be complex, especially
when it comes to data synchronization, security policies, and monitoring.

150        SD-WAN      An overlay connectivity service that optimizes transport
151                    of IP Packets over one or more Underlay Connectivity
152                    Services by recognizing applications (Application Flows)
153                    and determining forwarding behavior by applying Policies
154                    to them. [MEF-70.1]

[major]
The definition fails to say that SD-WAN stands for "Software-Defined Wide Area
Network".

Maybe the following could be added to describe what SD-WAN is:

"
SD-WAN (Software-Defined Wide Area Network) is a networking technology that
simplifies the management and operation of a wide area network (WAN) by
decoupling the network hardware from its control mechanism. It allows
enterprises to securely and efficiently connect users to applications,
particularly across multiple branch locations, data centers, and cloud
environments. "

156        VPC:        A Virtual Private Cloud is a virtual network dedicated
157                    to one client account. It is logically isolated from
158                    other virtual networks in a Cloud DC. Each client can
159                    launch his/her desired resources, such as compute,
160                    storage, or network functions into his/her VPC. At the
161                    time of writing this document, most Cloud operators'
162                    VPCs only support private addresses, some support IPv4
163                    only, others support IPv4/IPv6 dual stack.

[minor]
A simpler proposal to describe a VPC:

"
A VPC (Virtual Private Cloud) is a secure, isolated segment of a public cloud,
where users can deploy and manage resources such as virtual machines,
databases, and applications. VPCs offer the flexibility of using the public
cloud's infrastructure while providing more control over networking and
security. "

165     3. Issues and Mitigation Methods of Connecting to Cloud DCs

167        This section identifies some high-level problems that the IETF could
168        address, especially within the Routing area. Other Cloud DC problems
169        (e.g., managing cloud spending) are out of the scope of this
170        document.

[DISCUSS1]
Connecting to cloud data centers presents various routing challenges, including
scalability, security, latency, routing policy consistency, and multi-cloud
complexity. Enterprises need to carefully plan and manage their routing
architecture to ensure reliable, efficient, and secure connections between
on-premises infrastructure and cloud data centers. Solutions like dedicated
connections, BGP security enhancements, and dynamic routing policies can help
mitigate some of these challenges, but they also add complexity to the overall
network architecture. Not all of these high-level, enterprise-related concerns
are addressed in draft-ietf-rtgwg-net2cloud-problem-statement-41.

Key Routing Issues of interest to enterprises when Connecting to Cloud Data
Centers:

1. Latency and Path Optimization:
* Suboptimal Routing: Traffic between on-premises data centers and cloud
providers may traverse multiple ISPs or intermediary networks, leading to
increased latency. Default internet paths may not always be the most optimal,
which can negatively impact performance for latency-sensitive applications.
* Traffic Engineering: Enterprises may struggle to optimize routes for specific
applications. This can be critical when performance demands, such as low
latency for real-time applications, are high.

2. Multi-Cloud and Hybrid Cloud Connectivity:
* Inter-Cloud Routing Complexity: Routing between multiple cloud providers
(multi-cloud) or between on-premises environments and the cloud (hybrid cloud)
is challenging. Each cloud provider may use different routing policies,
protocols, and architectures, complicating consistent policy enforcement and
efficient routing across different environments.
* Vendor-Specific Routing Mechanisms: Cloud providers like AWS, Microsoft
Azure, and Google Cloud have their own proprietary routing mechanisms, such as
AWS Transit Gateway or Azure Virtual WAN. Managing routing across different
clouds requires expertise in each platform’s unique setup.

3. BGP Complexity:
* BGP Configuration: Enterprises often use Border Gateway Protocol (BGP) to
connect their on-premises networks with cloud DCs. However, configuring BGP for
efficient and secure communication can be complex, especially when dealing with
cloud providers’ route limitations, filtering, and peering configurations.
* BGP Route Convergence: If there is a network topology change, BGP may take
time to converge on a new optimal route, which could cause temporary routing
loops or black holes, leading to downtime or degraded performance.
* BGP Security: Routing security issues like BGP hijacking can be a concern. If
not properly secured, attackers can manipulate routes, potentially intercepting
or redirecting traffic between an enterprise and a cloud data center.

4. Overlapping IP Addresses:
* IP Address Conflicts: When connecting multiple cloud environments or when
integrating with on-premises networks, organizations may encounter overlapping
private IP address spaces (e.g., two networks using the same RFC1918 address
space). This creates routing conflicts and requires address translation (e.g.,
NAT) or careful IP planning.
* NAT Complexity: Network Address Translation (NAT) is often used to resolve
overlapping IPs, but it adds complexity to routing, and troubleshooting
connectivity issues can become more difficult.
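
As an illustration of the overlap problem, here is a minimal sketch that
detects conflicting address spaces before networks are interconnected. The site
names and prefixes are hypothetical and chosen only for illustration; the
helper find_overlaps is not taken from the draft.

```python
import ipaddress
from itertools import combinations


def find_overlaps(prefixes):
    """Return (name, name) pairs whose prefixes share address space."""
    nets = {name: ipaddress.ip_network(p) for name, p in prefixes.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(nets.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical address plan: one on-premises site and two cloud VPCs.
sites = {
    "on-prem-hq": "10.0.0.0/16",
    "cloud-a-vpc": "10.0.128.0/20",   # falls inside the on-premises /16
    "cloud-b-vpc": "172.16.0.0/16",
}

conflicts = find_overlaps(sites)
print(conflicts)
```

Any pair reported here would need renumbering or NAT before routes can be
exchanged between the two networks.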

5. Routing Scalability:
* Large Route Tables: Cloud environments often host a large number of subnets,
virtual machines (VMs), and applications, which results in significant route
table growth. On-premises routers may struggle to handle the large number of
routes advertised by cloud data centers.
* Route Aggregation: To manage large routing tables, route aggregation is
essential, but improper aggregation can lead to suboptimal routing or create
security issues by allowing unintended access to broader network segments.
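
The aggregation trade-off can be sketched as follows. The prefixes are
hypothetical, and covering_supernet is an illustrative helper, not a function
of any routing stack. Exact aggregation only merges prefixes that are fully
covered, whereas a single covering aggregate also announces address space that
was never in use.

```python
import ipaddress


def covering_supernet(nets):
    """Smallest single prefix that contains every network in nets."""
    candidate = nets[0]
    while not all(n.subnet_of(candidate) for n in nets):
        candidate = candidate.supernet()
    return candidate


# Hypothetical cloud subnets to be advertised on-premises.
advertised = [
    ipaddress.ip_network(p)
    for p in ("10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/24")
]

# Exact aggregation: no address outside the originals is announced.
exact = list(ipaddress.collapse_addresses(advertised))

# Single covering aggregate: shorter table, but it also covers unused space.
aggregate = covering_supernet(advertised)
extra_addresses = aggregate.num_addresses - sum(n.num_addresses for n in exact)

print([str(n) for n in exact], str(aggregate), extra_addresses)
```

In this sketch the covering aggregate also includes 10.1.3.0/24, which is the
kind of unintended reachability that improper aggregation can create.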

6. East-West Traffic Optimization:
* East-West Traffic Challenges: Modern cloud workloads often involve
significant east-west traffic (i.e., traffic between different applications or
services within the cloud). Efficiently routing this traffic between cloud
regions or between an on-premises data center and the cloud can be challenging,
especially if cross-region bandwidth or routing constraints exist.

7. Latency and Bandwidth Considerations:
* Performance Over Public Internet: Connecting to a cloud DC over the public
internet introduces unpredictable latency and limited control over the routing
path. Enterprises may use dedicated connectivity solutions like AWS Direct
Connect or Azure ExpressRoute to avoid the public internet and achieve more
predictable performance, but these solutions come with additional cost and
complexity.
* Bandwidth Costs: Cloud providers often charge for egress traffic (traffic
leaving the cloud data center). Suboptimal routing can increase data transfer
costs if traffic is unnecessarily routed through expensive pathways.

8. Route Propagation and Policy Enforcement:
* Consistent Route Propagation: Propagating routes between an on-premises
network and a cloud data center can be inconsistent, especially when using
complex routing policies. Enterprises need to carefully manage route
redistribution between different routing domains (e.g., BGP on-premises and
cloud provider proprietary routing).
* Policy Control: Implementing consistent routing policies (e.g., security,
load balancing, and traffic engineering policies) across cloud and on-premises
environments can be challenging due to the different tools and mechanisms used
by cloud providers.

9. Routing Security:
* Securing Routing Information: When using BGP to connect to cloud data
centers, securing routing information is crucial. BGP hijacking and route leaks
can lead to malicious traffic redirection. Organizations need to implement
security measures like BGP authentication, RPKI (Resource Public Key
Infrastructure), and route filtering to prevent unauthorized route
advertisements.
* Encryption and Privacy: Data traveling between an enterprise and the cloud
may need encryption to protect against eavesdropping. Implementing encrypted
tunnels (e.g., IPsec VPN) can add complexity to the routing setup.

10. Failover and High Availability:
* Redundancy and Failover: Ensuring high availability in cloud connectivity
involves setting up redundant links and implementing fast failover mechanisms
to ensure traffic is re-routed quickly in the event of a link failure. However,
configuring effective failover paths that meet performance and cost
requirements can be complex, especially across different clouds or between
cloud and on-premises environments.
* Dynamic Failover: In hybrid environments, ensuring that routes dynamically
change during failover scenarios can be difficult due to the different routing
protocols or static routes used in cloud environments.
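
A minimal sketch of the failover decision, assuming each redundant link
carries an operator-assigned preference and a health flag fed by some
monitoring mechanism. The Link type, the link names, and select_active are
hypothetical; real deployments would rely on BGP attributes, BFD, or
provider-specific health checks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    name: str
    preference: int   # lower value is preferred (assumption of this sketch)
    healthy: bool


def select_active(links: List[Link]) -> Optional[str]:
    """Pick the most preferred healthy link, or None if all are down."""
    candidates = [link for link in links if link.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda link: link.preference).name


links = [
    Link("dedicated-circuit", preference=10, healthy=True),
    Link("ipsec-over-internet", preference=20, healthy=True),
]
print(select_active(links))   # primary path while it is healthy

links[0].healthy = False      # simulate a failure of the dedicated circuit
print(select_active(links))   # traffic falls back to the backup path
```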

11. Geographic Routing and Data Residency:
* Compliance and Regulation: Enterprises may face legal and regulatory
challenges regarding where data is routed. For instance, data residency
requirements (e.g., GDPR) may mandate that certain data be routed or stored
only within specific geographical regions. Ensuring that routing policies
comply with these regulations across cloud and on-premises environments can be
a complex issue.
* Geographic Load Balancing: Routing traffic to cloud data centers in different
regions to optimize for performance or compliance requires careful planning
and monitoring.

743     7. Security Considerations
744
745        The security issues in terms of networking to Cloud DCs include

[DISCUSS2]
Enterprises connecting to cloud data centers must address a wide range of
security concerns, from ensuring encrypted communications and controlling
access, to securing routing protocols and complying with regulatory
requirements. By employing robust encryption, strong access controls,
comprehensive monitoring, and segmentation strategies, organizations can
mitigate risks and securely connect their on-premises infrastructure to cloud
environments. Additionally, leveraging the security tools and services provided
by cloud vendors can help ensure that the network and data remain protected. A
security section should investigate these to provide a holistic security
overview. While not all of these have a direct impact on routing, or should
even be standardized, it is important for enterprises to have a secure and
robust cloud DC experience.

1. Encryption of Data in Transit:
* End-to-End Encryption: Data traveling between on-premises infrastructure and
the cloud should be encrypted to protect against interception and
eavesdropping. Common methods include using IPsec VPNs, SSL/TLS, or private
connectivity options like AWS Direct Connect or Azure ExpressRoute, which
provide secure, dedicated connections to the cloud.
* Encrypted Tunnels: Secure tunnels (IPsec, SSL, or GRE) can be used to ensure
data confidentiality and integrity during transmission. Encryption helps
mitigate man-in-the-middle attacks.

2. Authentication and Access Control:
* Strong Authentication Mechanisms: Employ strong, multi-factor authentication
(MFA) for accessing both on-premises and cloud resources. Implement VPN access
control to ensure only authorized users and devices can establish connections
to cloud environments.
* Identity and Access Management (IAM): Use IAM policies to control who can
access resources in the cloud. Ensure that IAM roles are tightly controlled and
that users and applications only have the minimum permissions they need
(principle of least privilege).

3. Secure Routing Protocols:
* BGP Security: If using Border Gateway Protocol (BGP) to connect to cloud
services, protect the routing protocol by implementing BGP authentication
(using for example TCP-AO) and route filtering to prevent unauthorized or
incorrect routing information from being accepted.
* Route Filtering: Control which routes are propagated between on-premises
networks and the cloud to prevent route leaks, which could expose sensitive
routes to external parties or misdirect traffic.
* RPKI (Resource Public Key Infrastructure): Consider using RPKI to prevent BGP
hijacking, ensuring that the routes being advertised are valid and have not
been tampered with.
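
To make the RPKI point concrete, here is a simplified sketch of route origin
validation in the style of RFC 6811: a route is "valid" if some ROA covers its
prefix, permits its prefix length, and names its origin AS; "invalid" if it is
covered but nothing matches; and "not-found" otherwise. The Roa type, the
function name, and the documentation prefixes and AS numbers are illustrative
assumptions, and the sketch ignores how ROAs are fetched and verified.

```python
import ipaddress
from typing import List, NamedTuple


class Roa(NamedTuple):
    prefix: str       # prefix authorized by the ROA
    max_length: int   # longest prefix length the origin may announce
    asn: int          # authorized origin AS


def validate_origin(route_prefix: str, origin_asn: int, roas: List[Roa]) -> str:
    """Classify a route as 'valid', 'invalid', or 'not-found'."""
    route = ipaddress.ip_network(route_prefix)
    covered = False
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # Skip ROAs of the other address family or that do not cover the route.
        if route.version != roa_net.version or not route.subnet_of(roa_net):
            continue
        covered = True
        if route.prefixlen <= roa.max_length and origin_asn == roa.asn:
            return "valid"
    return "invalid" if covered else "not-found"


# Hypothetical ROA set using documentation prefixes and AS numbers.
roas = [Roa("198.51.100.0/24", 24, 64496)]

print(validate_origin("198.51.100.0/24", 64496, roas))    # authorized origin
print(validate_origin("198.51.100.0/24", 64511, roas))    # wrong origin AS
print(validate_origin("198.51.100.128/25", 64496, roas))  # too specific
print(validate_origin("192.0.2.0/24", 64496, roas))       # no covering ROA
```

Note that origin validation only checks the originating AS and prefix length;
it does not by itself protect the rest of the AS path.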

4. Network Segmentation:
* Isolating Traffic: Use Virtual Private Clouds (VPCs) and subnet segmentation
to isolate traffic between different departments, workloads, or tenants. This
ensures that sensitive data is not exposed to unauthorized users within the
same cloud environment.
* Private Connectivity: Use private connectivity options (e.g., AWS Direct
Connect, Azure ExpressRoute) to avoid sending sensitive data over the public
internet, reducing the risk of exposure to attacks.

5. Data Encryption at Rest:
* Cloud Data Encryption: Ensure that data stored in the cloud is encrypted at
rest. Many cloud providers offer encryption services (e.g., AWS Key Management
Service, Azure Key Vault) to manage encryption keys securely. Consider using
customer-managed keys for additional control over encryption processes.
* Compliance with Encryption Standards: Ensure that encryption protocols comply
with industry standards and regulatory requirements (e.g., AES-256 encryption
for sensitive data).

6. Visibility and Monitoring:
* Traffic Monitoring: Use tools like cloud network traffic analyzers or
intrusion detection systems to monitor traffic between on-premises
infrastructure and cloud environments. Detect anomalous behavior or
unauthorized access attempts by maintaining visibility into network traffic.
* Logging and Auditing: Enable comprehensive logging of all access and
configuration changes in both on-premises and cloud environments. Cloud
providers often offer logging services like AWS CloudTrail or Azure Monitor to
track user activity and help detect security breaches.
* Threat Detection and Response: Deploy security tools that offer threat
detection, real-time monitoring, and automated response. Solutions like SIEM
(Security Information and Event Management) systems can help correlate events
across the hybrid cloud to detect security incidents.

7. DDoS Protection:
* Distributed Denial of Service (DDoS) Protection: Cloud data centers can be
targets of DDoS attacks, which can disrupt network services. Cloud providers
offer DDoS mitigation services (e.g., AWS Shield, Azure DDoS Protection) that
can protect both the cloud environment and the connection to on-premises
infrastructure.
* Rate Limiting: Implement rate limiting and other traffic control mechanisms
to prevent network saturation during potential attacks.
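
One common way to implement rate limiting is a token bucket; the sketch below
is a generic illustration, not the mechanism of any particular cloud provider.
The class name, the parameters, and the explicit "now" timestamp (passed in so
the behavior is deterministic) are assumptions of this example.

```python
class TokenBucket:
    """Allow bursts up to `burst` requests, refilled at `rate` tokens/second."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst   # start with a full bucket
        self.last = 0.0       # timestamp of the previous call, in seconds

    def allow(self, now: float, cost: float = 1.0) -> bool:
        # Refill according to elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=1.0, burst=2.0)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0),
             bucket.allow(1.0)]
print(decisions)
```

The first two requests consume the burst allowance, the third is rejected, and
one more is admitted after a second of refill.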

8. Firewalls and Security Groups:
* Network Firewalls: Use firewalls to control traffic flowing between
on-premises networks and cloud environments. Cloud providers offer virtual
firewalls that can be configured to enforce strict access controls.
* Security Groups: Implement security groups and network ACLs (Access Control
Lists) to control inbound and outbound traffic at the VPC or subnet level.
These mechanisms should be used to restrict access to only those IP addresses
or protocols that are necessary.
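
The "allow only what is necessary" idea can be sketched as an allow-list with
an implicit default deny. This is a generic model and not any provider's
security group API; AllowRule, is_allowed, and the example rules are
hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AllowRule:
    source: str      # permitted source prefix
    protocol: str    # e.g. "tcp" or "udp"
    port_min: int
    port_max: int


def is_allowed(rules: List[AllowRule], src_ip: str, protocol: str, port: int) -> bool:
    """Return True only if some rule explicitly permits the traffic."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if (
            addr in ipaddress.ip_network(rule.source)
            and protocol == rule.protocol
            and rule.port_min <= port <= rule.port_max
        ):
            return True
    return False   # default deny: anything not listed is dropped


# Hypothetical inbound policy: HTTPS from the whole site, SSH from one subnet.
rules = [
    AllowRule("10.0.0.0/16", "tcp", 443, 443),
    AllowRule("10.0.5.0/24", "tcp", 22, 22),
]

print(is_allowed(rules, "10.0.5.7", "tcp", 22))
print(is_allowed(rules, "10.0.9.9", "tcp", 22))
```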

9. Zero Trust Security Model:
* Zero Trust: Adopt a Zero Trust model that assumes no network (internal or
external) is automatically trusted. Every access request should be verified,
and users, devices, and applications should be authenticated before being
allowed access to resources.
* Microsegmentation: Use microsegmentation to further isolate workloads within
the cloud, ensuring that even if an attacker gains access to one part of the
network, they cannot easily move laterally.

10. Compliance and Regulatory Considerations:
* Data Sovereignty and Residency: Ensure compliance with data sovereignty laws
(e.g., GDPR) by enforcing routing policies that keep sensitive data within
specified geographical regions.
* Encryption for Compliance: Encrypt sensitive data both in transit and at rest
to meet regulatory requirements like HIPAA, PCI-DSS, or GDPR. Cloud providers
often offer compliance certification, but it's important to ensure the proper
configurations are in place.
* Auditing and Reporting: Regularly audit the security posture of the hybrid
cloud environment to ensure ongoing compliance with security standards and
regulations.

11. Network Access Control:
* VPN Access: Use VPN gateways to securely connect on-premises networks to
cloud environments, encrypting traffic between the two endpoints.
* Multi-Factor Authentication (MFA): Implement MFA for users and administrators
accessing cloud resources remotely to add an extra layer of security.

12. Patch Management and Vulnerability Scanning:
* Patch Cloud Resources: Ensure that virtual machines, containers, and other
cloud resources are regularly patched to protect against vulnerabilities.
Leverage automated tools for patch management across both on-premises and cloud
environments.
* Vulnerability Scanning: Regularly scan cloud environments for vulnerabilities
and misconfigurations that could be exploited by attackers.

13. Distributed Workloads and Traffic Control:
* Load Balancing: Use cloud-based load balancers to evenly distribute traffic
across multiple servers and data centers, reducing the risk of congestion or
single points of failure.
* Content Delivery Networks (CDNs): Use CDNs to distribute content closer to
users, reducing latency and improving performance while also offering security
benefits such as DDoS protection and content encryption.

14. Incident Response Plan:
* Develop a Cloud-Specific Incident Response Plan: Ensure that the
organization's incident response plan accounts for both on-premises and cloud
environments. This includes identifying responsibilities, communication
channels, and the tools needed to detect, investigate, and respond to security
incidents.
* Automated Responses: Consider automating certain responses, such as shutting
down suspicious instances, revoking access, or blocking traffic, based on
pre-defined security rules.


