Virtualizing the physical network and treating the resulting logical networks differently based on business needs and security policies has been around for multiple decades. The technologies available to create such segmentation differ across areas of the network. For example, in Local Area Networks (LANs), VLANs have been the predominant technology, whereas in Wide Area Networks (WANs) the choices have ranged from IPsec-based VPNs to provider-managed MPLS-based VPNs.
In my last blog post I covered how the physical infrastructures of large-scale cloud service providers are built. In this post I explore the network virtualization aspects of present and future infrastructure builds, focusing mainly on datacenter-oriented networks.
As a part of this exploration, I plan to cover the following:
- Why incumbent technologies like VLAN fall short?
- What are the new network virtualization requirements?
- What’s missing in current approaches?
Why incumbent technologies like VLAN fall short?
Even though new approaches are emerging, a huge number of existing datacenters and enterprise networks are still being built out of VLAN technologies that expose operators to substantial scaling and operational risks. These have been beaten to death and most people have a good understanding of the issues. However, for people looking for a refresher I am going to cover them in more detail at the end of this blog post in the section “What’s wrong with VLAN based approaches?”.
What are the new network virtualization requirements?
Given the inability of incumbent VLAN technologies to support machine segmentation at scale, it is imperative to take a closer look at the problem space itself. Is there a true need for machine segmentation at the lower layers (L2/L3) in a cloud-based infrastructure? I think the need for segmentation has become ever more pressing and prominent for common multi-tenant and enterprise use cases.
Here are some of the network virtualization requirements of the new infrastructure builds.
- Machine compartmentalization
The compute and storage resources of modern datacenters are being shared by numerous public and private entities that have variable needs and security practices. For example, some entities store and process mission-critical data, which cannot be compromised, whereas others might process data whose loss will have minimal business impact. Depending on the sensitivity of the data, the level of security hygiene varies. For example, someone could run an OS with known security vulnerabilities, exposing other machines in the same domain to hackers.
- Application of value-added services for machine pools
Network virtualization can also be leveraged to impose service-chaining capabilities. For example, consider a set of virtual machines that carry sensitive data, and other virtual machines that have different roles and are exposed to security exploits to different degrees. In such cases one might want to scrub all traffic going from the second group of machines towards the first using DPI (Deep Packet Inspection) capabilities. In a complicated and distributed setup that carries a variety of machine pools (virtual networks), having a well-orchestrated, policy-based service-chaining mechanism is important.
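To make the service-chaining idea concrete, here is a minimal sketch of a policy table that maps a (source virtual network, destination virtual network) pair to an ordered list of services. The virtual-network names, the "dpi" service label and the function name are all hypothetical and chosen only for illustration; no real orchestration system is being described.

```python
from typing import Dict, List, Tuple

# Hypothetical names: "vn-general", "vn-sensitive" and "dpi" are illustrative only.
PolicyTable = Dict[Tuple[str, str], List[str]]

POLICY: PolicyTable = {
    # Traffic from the less trusted pool towards the sensitive pool is scrubbed by DPI.
    ("vn-general", "vn-sensitive"): ["dpi"],
    # Return traffic is allowed without any intermediate service.
    ("vn-sensitive", "vn-general"): [],
}


def path_for(src_vn: str, dst_vn: str, policy: PolicyTable = POLICY) -> List[str]:
    """Return the ordered hops a flow traverses, or raise if no policy permits it."""
    if src_vn == dst_vn:
        return [src_vn]  # traffic inside one virtual network is not chained
    key = (src_vn, dst_vn)
    if key not in policy:
        raise PermissionError(f"no policy permits {src_vn} -> {dst_vn}")
    return [src_vn, *policy[key], dst_vn]


print(path_for("vn-general", "vn-sensitive"))
```

The point of the sketch is that the chain is a property of the pair of virtual networks, not of where the machines physically sit, which is what lets an orchestrator apply it consistently across a distributed setup.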
- Dynamic spanning of machine pools
In the older paradigm, groups of machines had to be physically co-located, as mandated by VLAN technology. Today, however, these machines can reside anywhere within a multi-datacenter (and possibly multi-cloud) build-out. So there needs to be a capability by which these virtual networks can span dynamically based on application needs.
- Extension of Enterprise networks into Cloud
Enterprise networks are actively looking at public or private clouds to extend their networks as a mechanism to flexibly access more resources. In many cases, however, an enterprise network running a block of RFC1918 space may want to extend seamlessly into a private or public cloud environment without altering its addressing scheme. The address block used is typically ingrained in the security policies of these enterprise networks. In some cases the enterprise might be a heavy user of L3VPN (RFC2547/4364) services from its providers and just want the cloud services to resemble a set of VLANs connected to its VPN cloud. Hence, mechanisms are required that can enable automated and seamless extension of these networks. Some large providers have used IPsec-based point-to-point tunnels to connect these address spaces, but as we have learned from the CPE (customer premises equipment) based VPN space, provisioning and operating scalable L3-based mechanisms (i.e. provider L3VPNs) is a lot easier than operating an IPsec tunnel mesh.
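A rough way to see why a tunnel mesh is harder to operate is to count what has to be provisioned. The sketch below assumes any-to-any connectivity built as a full mesh of point-to-point tunnels, and assumes one attachment per site for the provider L3VPN case. Both function names are made up for this illustration, and a hub-and-spoke tunnel design would need fewer tunnels at the cost of hairpinning traffic through the hub.

```python
def full_mesh_tunnels(sites: int) -> int:
    """Point-to-point tunnels needed so every site reaches every other site directly."""
    return sites * (sites - 1) // 2


def l3vpn_attachments(sites: int) -> int:
    """Assumption: each site needs a single attachment to the provider L3VPN."""
    return sites


for n in (5, 20, 100):
    print(f"{n} sites: {full_mesh_tunnels(n)} tunnels vs {l3vpn_attachments(n)} attachments")
```

The tunnel count grows quadratically with the number of sites while the attachment count grows linearly, and every added site touches the configuration of all existing sites in the mesh case.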
What’s missing in current approaches?
Recently, some proprietary overlay approaches have tried to tackle the above problem spaces. These approaches remind me of the VL2 work that I did at Microsoft in the 2008-2009 timeframe. It was supposed to introduce network virtualization within a single datacenter by encapsulating the payload after doing some smart lookups against a central machine, where all the information was derived from a central database. NVGRE was used for the overlay in the data plane in VL2, while other newer approaches have proposed VXLAN. Each of these data-plane methods has been proposed in the IETF for standardization. Every cloud operator and SDN vendor has its own proprietary machinery that cannot interoperate with another system: even though systems using a standards-based overlay (VXLAN or NVGRE) may be able to exchange packets, the lack of a common control plane prevents disparate systems from interpreting them in any meaningful way.
Over the last year, however, some approaches have shown up at the IETF that leverage constructs proven to scale well within the Internet. Proper control-plane interactions using standardized mechanisms foster a multi-vendor environment that keeps the vendors honest and enables the virtualization models to scale. I see promise in the L3VPN-End-System draft as well as in some of the E-VPN work that is happening. However, I have never been a fan of extending L2 services beyond a rack or so. To my mind, the less the L2 capabilities bleed in, the better. Applications should shed any dependency on L2 services such as broadcast.
In addition to the lack of a fully standardized, interoperable control plane, I have observed that each of these newer overlay-based systems implements its own independent northbound API for provisioning systems and applications/services. This works well for a cloud provider with significant internal resources to build the entire system end-to-end (think Google, Amazon, etc.). However, it can present a major challenge to enterprise customers as well as emerging cloud providers, as they get locked into a vendor or a provider.
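To illustrate how little the data-plane format alone conveys, here is a minimal sketch that builds and parses the 8-byte VXLAN header as laid out in the specification later published as RFC 7348: one flags byte with the "I" bit set, 24 reserved bits, a 24-bit VXLAN Network Identifier (VNI) and 8 more reserved bits. The function names are hypothetical, and the outer Ethernet/IP/UDP headers are deliberately left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, 24 reserved bits left as zero.
    # Word 2: VNI in the top 24 bits, low 8 reserved bits left as zero.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag not set")
    return vni_word >> 8


hdr = vxlan_header(5000)
print(hdr.hex(), parse_vni(hdr))
```

Any two implementations can agree on these eight bytes. What the header does not say is which tunnel endpoint currently hosts a given machine or which policy applies to a given VNI, and that mapping is exactly the control-plane information that is not shared between systems.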
To summarize, what is missing is the capability to orchestrate topology, segmentation and policy in an automated and standards-based way, both within a datacenter and on a multi-datacenter/multi-cloud basis. This becomes quite a challenging and complex problem, as different cloud operators have their own variants of compute and storage orchestration systems. However, since the network is the all-pervasive and fundamentally critical glue, standardization is extremely important in this area.
This field is not yet mature, and there will be many constructive and innovative efforts to shape it. Initial innovations around network virtualization and software-defined networking suggest a much larger inflection in scale-out system design and architectural thinking in the coming days.
What’s wrong with VLAN based approaches?
- Physical Span of VLANs within Datacenters:
VLANs are built on spanning-tree based protocols, which are quite temperamental and fundamentally weak as control-plane protocols. A switch domain with more than two layers of hierarchical switches and more than a handful of topology loops is prone to operational issues.
In general, two or more disparate L2 switching domains are never interconnected within a datacenter. So a VLAN built in a particular co-location area within a datacenter is seldom spanned beyond that area. Even if a VLAN needs to host more machines and the co-location area has run out of space, VLANs rarely get extended. This in turn results in the VLAN-based network segment getting fragmented into multiple VLANs, which complicates the packet filtering rules on the routers exponentially.
- Scale of VLANs:
The best practice is not to carry more than 500 hosts in a VLAN. This is due to the excessive cumulative broadcast traffic that gets generated, as well as the unicast flooding that hits each box when the switching system loses a MAC address. Additionally, the maximum number of VLANs supported by the Ethernet frame format is 4K. But operators can't even approach 4K VLANs in most cases, since gateway routers run out of HSRP/VRRP scaling long before the 4K limit. I have seen well-established and popular datacenter aggregation platforms unable to scale beyond 500-600 HSRP/VRRP sessions.
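The arithmetic behind these limits can be sketched as follows. The 12-bit width of the VLAN ID field, with IDs 0 and 4095 reserved, comes from the 802.1Q tag format. The one-gateway-session-per-VLAN relationship and the function names are simplifying assumptions made for this illustration.

```python
VLAN_ID_BITS = 12        # width of the VLAN ID field in the 802.1Q tag
HOSTS_PER_VLAN = 500     # best-practice ceiling quoted above


def usable_vlans(gateway_session_limit: int) -> int:
    """VLANs a gateway pair can serve, assuming one HSRP/VRRP session per VLAN."""
    id_space = 2 ** VLAN_ID_BITS - 2  # IDs 0 and 4095 are reserved
    return min(id_space, gateway_session_limit)


def max_hosts(gateway_session_limit: int) -> int:
    """Upper bound on hosts behind that gateway pair under the best practice."""
    return usable_vlans(gateway_session_limit) * HOSTS_PER_VLAN


print(usable_vlans(600), max_hosts(600))
```

Under these assumptions the gateway session limit, not the 4K ID space, is the binding constraint: a platform that tops out at 600 sessions serves 600 VLANs regardless of how many IDs the frame format allows.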
- Scale and complexity of Packet filters on routers:
In many cases, the packet filters (ACLs) on routers are built by simplistic scripts. Many environments face situations where the ACLs are completely out of control. People can neither clean them up nor consolidate them because of the uncertainty that doing so would introduce into the operational environment. The inability to manipulate router ACLs in an automated and nimble, yet reliable, way has turned out to be a nightmare for many operators.
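As a small illustration of what safe, automated consolidation could look like, the sketch below merges adjacent prefixes using `ipaddress.collapse_addresses` from the Python standard library. It assumes the entries being merged share the same action and the same remaining match fields. Real ACLs are order-sensitive and match on more than a source prefix, so this is a toy example rather than a tool, and the prefixes and function name are made up.

```python
import ipaddress
from typing import List


def consolidate(prefixes: List[str]) -> List[str]:
    """Merge adjacent or overlapping prefixes that share one permit/deny action."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    return [str(n) for n in ipaddress.collapse_addresses(networks)]


# Hypothetical "permit" entries accumulated as a segment was fragmented over time.
permit_sources = ["10.1.0.0/24", "10.1.1.0/24", "10.1.2.0/23", "10.2.0.0/24"]
print(consolidate(permit_sources))
```

Because the merged list covers exactly the same addresses as the original one, a check like this can be automated and verified before anything is pushed to a router, which is the kind of confidence operators currently lack.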
- Spanning-tree in CLOS network:
Most modern cloud datacenters run networks with CLOS-based architectures, which inherently don't work well with spanning-tree types of protocols. Spanning-tree operates by putting links into a blocking state (for one or more VLANs) to prevent frame loops, whereas CLOS architectures rely on distributing flows equally over multiple parallel paths in a semi-mesh topology. People have looked into other L2 control-plane protocols (TRILL etc.) to replace spanning-tree in a CLOS-type environment, but those approaches have demanded new hardware and vendor-specific customizations, which has made operators shy away. Multi-chassis port-channel approaches have introduced some sanity into the spanning-tree world, but they are not as applicable in CLOS architectures.
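The contrast can be sketched in a few lines. In a CLOS fabric each flow is hashed onto one of several parallel uplinks so that all of them carry traffic, whereas spanning-tree would leave only one of those uplinks forwarding for a given VLAN. The uplink names, the flow tuple layout and the use of SHA-256 are assumptions for illustration; real switches use their own hardware hash functions.

```python
import hashlib
from collections import Counter
from typing import Sequence, Tuple

Flow = Tuple[str, str, int, int, int]  # src IP, dst IP, protocol, src port, dst port


def pick_uplink(flow: Flow, uplinks: Sequence[str]) -> str:
    """Hash the flow tuple so that all packets of one flow use the same uplink."""
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    return uplinks[int.from_bytes(digest[:8], "big") % len(uplinks)]


uplinks = ["spine1", "spine2", "spine3", "spine4"]  # hypothetical spine switches
flows = [("10.0.0.1", "10.0.1.1", 6, 10000 + i, 443) for i in range(1000)]
usage = Counter(pick_uplink(f, uplinks) for f in flows)
print(usage)
```

Per-flow hashing keeps the packets of a flow in order on a single path while still spreading the aggregate load across every uplink, which is the behaviour a blocking-based loop-prevention protocol cannot provide.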
By Parantap Lahiri – VP Solution Engineering, Contrail Systems