Nuage Networks VSP Deep Dive

Ever since Cisco bought Insieme and created Cisco ACI, and VMware bought Nicira and created NSX, I've been intensively deep-diving and blogging about both of these solutions, how they compare to each other and to some of the open source SDN solutions out there, such as OpenDayLight and Open Contrail (check out the Blog Map section for some of my older posts). I even did boot camps and got the highest certifications in both NSX and ACI. SDN is still a rather new technology, and I wanted to make sure I have enough expertise to always explain to a customer which SDN solution is the right one for their organization, and why. Apart from ACI, NSX and the open source solutions, there is another player on the SDN market, and from what I've seen - they mean business! I'm talking about Nuage Networks, acquired by Nokia from Alcatel-Lucent in November 2016. Even though I've known about this solution for a while, my opinion was that their strongest side was marketing, so I didn't spend a lot of time investigating Nuage (it's also pretty difficult to find information about Nuage; there is an apparent lack of experts/blogs/technical info about the product). I finally decided to give them an opportunity: I did a boot camp, a lot of hands-on, and recently I passed the 4A0-N01 Nuage Network Professional – Datacenter (NNP-DC) certification. Let me share with you what Nuage Networks is all about, and give an unbiased opinion about their understanding of SDN and how they compare to the other SDN solutions on the market.

Disclaimer: Some of the materials I used come directly from Nuage technical documentation, which is for some reason not available to the public (and it should be!). If someone from Nokia is reading this, please note that revealing technical information about your product wins you more market and gives your product more visibility. I strongly advise you to make as much Nuage documentation public as possible. If your product is good (and in my opinion - it is), invite bloggers and technical experts to give you feedback; once they feel comfortable with your product, they will gladly share it with potential customers.

Before I get deeper into what Nuage VSP is good for, let's make sure we understand the difference between IaaS, PaaS and SaaS. In order to really get what your company is doing (or should be doing), whether you are a Service Provider (SP) or an Enterprise consuming resources provided by another Service Provider, you need a clear picture of what is handled by whom in each of these architectures. Basically:

  • IaaS (Infrastructure as a Service) - SP Provides Network, Compute and Storage, Customer builds OS and Apps
  • PaaS (Platform as a Service) - SP also provides the OS. Who takes care of the OS upgrades and other housekeeping? Good question… It depends on the PaaS provider; it could go either way.
  • SaaS (Software as a Service) - SP owns everything, including the application

Let's start with the basics. We already know what SDN is all about: separating the Control Plane from the Data Plane, and providing a single Management Plane that exposes Northbound APIs. Nuage follows the same concepts. Nuage created a platform called VSP. VSP stands for Virtualized Services Platform, and it orchestrates the deployment by handling the following planes:

  • Management Plane, represented by the Nuage Virtualized Services Directory (VSD) and the Cloud Management System or CMS (OpenStack, CloudStack etc.)
  • Control Plane, handled by the Nuage Virtualized Services Controller (VSC)
  • Data Plane, handled by the Virtual Routing & Switching (VRS)



VSP is a software suite comprising three key products:

  • VSD (Virtualized Services Directory), which holds the policy and network service templates.
  • VSC (Virtualized Services Controller), which is the SDN controller that communicates with the hypervisors.
  • VRS (Virtual Routing and Switching) agent that resides within the hypervisor on the server hardware.

Let's now take a deep dive into which communication protocols are used between the different VSP components:

  • Communication between the CMS (Cloud Management System, such as OpenStack, CloudStack, vCenter, vCloud, etc.) and the VSD is done via RESTful APIs. These are the Northbound APIs that allow us to configure the Nuage platform, or VSP (a minimal example of talking to this API follows this list).
  • Communication between the VSD and the VSC is via the industry standard XMPP (Extensible Messaging and Presence Protocol), using the Management network. SSL is optional, but recommended.
  • Communication between the VSC and the hypervisors (including the VRS) is via OpenFlow, using the Underlay network. SSL is, again, optional but recommended.
  • SDN is all about virtualization, but luckily - physical servers have not been forgotten. To integrate “bare metal” assets such as non-virtualized servers and appliances, Nuage Networks also provides a comprehensive Gateway solution: software-based VRS gateway (VRS-G) and hardware-based 7850 VSG.
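To make the northbound API a bit more concrete, here is a minimal sketch of a VSD REST session in Python. The address, port, credentials and exact resource paths (/me, /enterprises) are assumptions based on typical VSD releases; verify them against the API documentation for your version.

```python
# Minimal sketch of a VSD northbound API session (paths, port and credentials
# are assumptions -- check the VSD API docs for your release).
import requests

VSD = "https://vsd.example.com:8443/nuage/api/v5_0"   # hypothetical VSD address
AUTH = ("csproot", "csproot")                          # hypothetical credentials
ORG_HEADER = {"X-Nuage-Organization": "csp"}

# 1. Authenticate against /me; the response carries an API key that replaces
#    the password for all subsequent requests.
resp = requests.get(f"{VSD}/me", auth=AUTH, headers=ORG_HEADER, verify=False)
resp.raise_for_status()
api_key = resp.json()[0]["APIKey"]

# 2. Reuse the API key as the password to list enterprises (organizations).
session_auth = (AUTH[0], api_key)
enterprises = requests.get(f"{VSD}/enterprises", auth=session_auth,
                           headers=ORG_HEADER, verify=False).json()
for ent in enterprises:
    print(ent["name"], ent["ID"])
```

This is exactly the kind of call a CMS plug-in makes on your behalf; the point is that anything you can click in the VSD GUI can also be driven through this API.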

I'd recommend getting acquainted with the individual components of the architecture by reading the rest of this post first, and then re-visiting the previous paragraph. It will all make much more sense.

Let's now check out the individual Nuage VSP components, and see what each one does. Once again, I'll try to be methodical (not an intuitive task for my mind) and structure the post so that you can follow:

  1. VSD, or the Virtualized Services Directory at the Network Management Plane
  2. VSC, or the Virtualized Services Controller, at the Network Control Plane
  3. VRS and VSG, or the Virtualized Routing & Switching and the Virtualized Services Gateway, at the network Data Plane
  4. Security Policies: NFV and Service Chaining


1. VSD - Virtualized Services Directory, holds the Policy and Network Templates. VSD uses the XMPP protocol to communicate with the VSC.

VSD is where we do the Service Definition by defining Network Service Templates. The service definition includes domain, zone, subnet and policy templates. A domain template can also include policies (e.g. security, forwarding, QoS, etc.) to be applied at the different levels (vPort, subnet, zone, domain). I will cover all these concepts in just a while. It's an essential component that manages everything. It can be deployed on a physical or a virtual machine; it comes as an OVA file (for ESXi), a QCOW2 file (for KVM), or as an ISO image (recommended for production environments). You can choose between a standalone deployment and a cluster of 3 VMs. To work properly, VSD requires an NTP server and a DNS server in the network.


The VSD also contains a powerful analytics engine (optional, based on Elasticsearch). The VSD supports RESTful APIs for communicating with the cloud provider's management systems. In the case of OpenStack, the integration sits between nova and nova-compute, while vCloud uses the vCenter API to access the ESXi hypervisors.

VSD has two types of users:

  • Administrator/CSP users, who have full visibility into all of the functionality of VSD
  • Enterprise/Organization users. An enterprise user belongs to one, and only one, specific enterprise.

TIP: Keep in mind that if you're interested in LDAP integration, users must still be manually created in VSD, even if they already exist in the LDAP directory.

VSD Service Abstraction is the VSD's way of creating an object tree, where the Domain is the single root and Zones, Subnets and other objects have an exact place in the tree. VSD then translates the Service Abstraction into Service Instances, following the same object tree. Keeping in mind that a domain is mapped to a distributed VPRN instance (dVPRN) while a subnet is mapped to a distributed routed VPLS instance (dRVPLS), we end up with two kinds of service instances:

  • L2 Service Instances (dRVPLS)
  • L3 Service Instances (dVPRN)


Domain:  An enterprise contains one or more domains. A domain is a single “Layer 3” space, which can include one or more subnetworks that can communicate with each other. In standard networking terminology, a domain maps to a VPRN (Virtual Private Routed Network) service instance. Route distinguisher (RD) and route target (RT) values for the VPRN service are generated automatically by default, but can be modified. CSP Root users can create domain templates for all the enterprises. Users in the Enterprise Administrators and Network Designers groups can create domain templates for their own enterprises. Users that belong to other groups cannot create domain templates.
Layer 2 Domain:  A standard domain is a Layer 3 construct, including routing between subnets. A Layer 2 domain, however, is a mechanism to provide a single subnet, or a single L2 broadcast domain within the datacenter environment. It is possible to extend that broadcast domain into the WAN, or legacy VLAN.
Zone:  Zones are defined within a domain. A zone does not map to anything on the network directly, but instead it acts as an object with which policies are associated such that all endpoints in the zone adhere to the same set of policies.
Subnet: Subnets are defined within a zone. A subnet is a specific IP subnet within the domain instance. The subnet is instantiated as a routed virtual private LAN service (R‐VPLS). A subnet is unique and distinct within a domain; that is, subnets within a domain are not allowed to overlap or to contain other subnets in accordance with the standard IP subnet definitions.
vPorts: Intended to provide more granular configuration than at the subnet level, and also to support a split workflow: the vPort is configured and associated with a VM port (or gateway port) before the port exists on the hypervisor or gateway. Ports that connect Bare Metal servers to an overlay are also called vPorts. Whenever a vPort is instantiated, an IP address is assigned to it from the subnet that the vPort belongs to, unique at the Domain level. VSD is responsible for assigning the correct IP address, regardless of whether the VM asks for a specific IP (statically configured in the OS) or gets one from a DHCP pool. The same Virtual IP can be assigned to multiple vPorts for redundancy (it must be different than any of the IPs assigned to the vPorts).
All ports will have a corresponding vPort, either auto-configured or configured via REST API. Configuration attributes may optionally be configured on the vPort.
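To tie the domain/zone/subnet hierarchy together, here is a hedged sketch of how the object tree might be instantiated through the VSD northbound API. The resource and field names are my assumptions (they follow the pattern of the VSD REST API, but check the reference for your release), and the session constants are the same ones used in the earlier authentication sketch.

```python
# Continuation of the VSD session sketch above; resource and attribute names
# are assumptions -- verify against the VSD API reference for your release.
import requests

VSD = "https://vsd.example.com:8443/nuage/api/v5_0"
ORG_HEADER = {"X-Nuage-Organization": "csp"}
session_auth = ("csproot", "<api-key-from-/me>")     # obtained as shown earlier

def post(path, payload):
    r = requests.post(f"{VSD}{path}", json=payload, auth=session_auth,
                      headers=ORG_HEADER, verify=False)
    r.raise_for_status()
    return r.json()[0]

ent_id = "<enterprise-id>"   # e.g. taken from GET /enterprises

# Domain template first, then a domain instantiated from it (the dVPRN).
tmpl = post(f"/enterprises/{ent_id}/domaintemplates", {"name": "demo-template"})
domain = post(f"/enterprises/{ent_id}/domains",
              {"name": "demo-domain", "templateID": tmpl["ID"]})

# A zone is a policy container; the subnet is the actual dRVPLS with addressing.
zone = post(f"/domains/{domain['ID']}/zones", {"name": "web-zone"})
subnet = post(f"/zones/{zone['ID']}/subnets",
              {"name": "web-subnet", "address": "10.10.1.0",
               "netmask": "255.255.255.0", "gateway": "10.10.1.1"})
print("Created subnet", subnet["ID"])
```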

A VM is formed from its profile, which contains the VM metadata. This metadata defines which Domain, Zone, Subnet and vPort apply to every vNIC of the VM. It also defines which Enterprise and User Group it belongs to. Additionally, some metadata may be specified if attaching to a specific vPort is required. When a new VM is created, a VM creation request is sent to the VSC from the VRS agent in an OpenFlow message using the Underlay network. This message contains the VM-related metadata. The VSC forwards the request one level higher in the hierarchy, to the VSD, in an XMPP message using the Management network. The VSD receives the VM creation request, reads its metadata and checks it against the policy definitions. The VSD learns the MAC address assigned to this VM from the metadata, and in a VSD-managed IP address allocation scenario, it assigns an IP address for it from the subnet (usually the next available IP address).
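Purely as an illustration of the metadata described above (not the exact schema Nuage uses - the real attachment happens per vNIC via libvirt metadata or the CMS plug-in), the profile could be pictured like this:

```python
# Illustrative shape of the VM metadata described in the text; field names
# are hypothetical and depend on the integration (libvirt, OpenStack, etc.).
vm_metadata = {
    "enterprise": "ACME",            # which organization the VM belongs to
    "user": "acme-cloud-admin",      # user group used for policy checks
    "interfaces": [
        {
            "mac": "52:54:00:12:34:56",   # VSD learns this from the request
            "domain": "demo-domain",      # L3 domain (dVPRN)
            "zone": "web-zone",           # policy container
            "subnet": "web-subnet",       # dRVPLS the vPort attaches to
            "vport": None,                # optional: attach to a pre-created vPort
            "requested_ip": None,         # optional static IP; VSD still arbitrates
        }
    ],
}
```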

VSD has a somewhat complex architecture. The components of the VSD can be centralized on a single machine or distributed across multiple machines for redundancy and scale. The most important ones to keep in mind at this point are:

  • TNC (Trusted Network Connect), an open architecture for network access control.
  • Policy management engine, which evaluates the policy rules configured on the VSD (security and QoS policies, IP assignments, etc.). It sends policies to the VSC based on network events.
  • VSD mediator, the VSD southbound interface used for communication with the VSC. It receives requests for policy information and updates from the VSC, and pushes policy updates to the VSC. The VSD itself is an XMPP client: it communicates with an XMPP server, or server cluster.
  • Statistics engine, which collects fine-grained network information at the VRS, VSC and VM levels. It can collect various packet-based statistics such as packets in/out, dropped packets in/out, packets dropped by rate limiting, etc. It provides an open interface for Nuage and third-party analytics applications. Keep in mind that by default, statistics collection is disabled on the VSD; a separate VSD node running Elasticsearch needs to be deployed (it can also be deployed as a cluster).
  • REST API, the VSD northbound interface, which exposes all the VSD functionality via API calls. It can be used by Nuage CMS plug-ins for integration with many CMSs.


2. VSC - Virtualized Services Controller, the SDN controller: it controls the network, communicates with the hypervisors and collects VM-related information such as MAC and IP addresses. VSC uses OpenFlow to control the VRS. On each VRS we define which VSC is active and which is standby (you can configure several active VSCs for load balancing). OpenFlow uses TCP port 6633, and it is used to download the actual L2/L3 FIBs to the virtual switch components on the hypervisor.


VSC is only installed as a VM (or as an integrated module on a Nuage NSG, when the NSG is used as a VXLAN gateway), and it comes as an OVA file, a QCOW2 file or a VMDK file. VSC has a control interface connected to the Underlay. It is based on the Nokia Service Router Operating System (SROS), which is somewhat similar to Cisco IOS (not the same commands, but… intuitive, if you come from Cisco).

Now comes a really cool part about why Nuage: the controllers act like a router Control Plane, and routing is established between the VSCs and other routers. This makes it so much easier to implement DCI. VSC needs a routing protocol to exchange routes with the other VSCs; it can be IS-IS, OSPF or static routes. MP-BGP EVPN also needs to be established between all the VSCs.

The VSC has three main communication directions:

  • Northbound: to the VSD via XMPP
  • East/West: federation functions to other VSCs or IP/MPLS Provider Edge nodes via MP-BGP
  • Southbound: to the VRSs via OpenFlow

3. VRS (Data Plane) - Virtual Routing and Switching, the plugin inside the hypervisor. It's based on OVS, and it's responsible for L2/L3 forwarding and encapsulation.

On the VRS you can define multiple VSCs for redundancy and load balancing (one active and one standby), and each of them establishes an OpenFlow session using the Underlay network (not the Management network), on TCP port 6633 (SSL is optional).

VRS includes two main Nuage components:

  • VRS Agent, which talks to the VSC using OpenFlow. It's responsible for programming the L2/L3 FIBs, and it replies to all ARP requests (no flooding). It also reports changes in VMs to the VSC. The forwarding table is pushed to the VRS from the VSC via OpenFlow. It has a view not only of all the IP and MAC addresses of the VMs served by the local hypervisor, but also of those which belong to the same domain (L2 and L3 segments), that is, all possible destinations of traffic for the VMs served by that hypervisor.
  • Open vSwitch (OVS), which provides the switching and routing components and the tunneling used to forward the traffic.

VRS supports a wide range of L2 and L3 encapsulation methods (VXLAN, VLAN, MPLSoGRE) so that it can communicate with a wide range of external network endpoints (other hypervisors, IP- or MPLS-based routers).


Let's get even deeper into the connection between the Control Plane and Data Plane, or VSC and VRS in Nuage language. Nuage Networks uses open source components, such as libvirt, OVS and OpenFlow. Nuage makes use of the libvirt library in the VRS component that runs in Linux-based hypervisor environments (Xen and KVM) to get VM event notifications (new VM, start VM, stop VM, etc.). Libvirt is a package installed on the hypervisor, alongside the Nuage VRS. This enables the use of the user-space tools:

  • Virt-Manager: For GUI
  • Virsh: Commands (CLI)

Before we continue, let's make sure we understand the basic concepts needed to understand the VRS and the VSG (VRS-G included). Basically we need to understand:

  • What OVS (Open vSwitch) is.
  • The difference between the Underlay and the Overlay.
  • What VxLAN is, what VTEPs are, and how it all works.

Open vSwitch (OVS) is a major building block for Nuage SDN. It implements an L2 bridge including MAC learning, and OpenFlow is used to configure the vSwitch. It's used for Linux networking and it's part of the Linux kernel, nowadays used instead of the Linux bridge. OVS can be configured via CLI, OpenFlow or the OVSDB management protocol. OVS doesn't work like VMware VDS or the Cisco Nexus 1000v; it only exists on each individual physical host, and it makes it easier for developers of virtualization/cloud management platforms to offer distributed vSwitch capabilities. In Nuage, OpenFlow is used to program the virtual switch within the hypervisor, with the vSwitch becoming the new edge of the datacenter network. The OVS becomes the access layer of the network. The access layer is where control policies are typically implemented: ACLs, QoS policies, monitoring (NetFlow, sFlow). OVS has all these features, and it also provides an SDN programmatic interface (OpenFlow and OVSDB management).

The three main components of OVS are listed below (a quick way to peek at each of them from the hypervisor shell is sketched right after the list):

  • ovsdb-server is the configuration database which contains details about bridges, interfaces, tunnels, QoS, etc.
  • OVS kernel module handles the data path, including packet header handling, table lookup and tunnel encapsulation and decapsulation. The first frame of a flow goes to ovs-vswitchd to make the forwarding decision; the following frames are then processed by the kernel.
  • ovs-vswitchd matches the first frame for a “flow” action (L2 forwarding, mirroring, tunneling, QoS processing, ACL filtering, etc.) and caches these in the flow table in the kernel module.
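Here is the promised quick look at those three components from a hypervisor shell, wrapped in Python. The commands themselves (ovs-vsctl, ovs-dpctl, ovs-ofctl) are standard OVS tools; the bridge name is an assumption, so adjust it to whatever bridge your VRS actually creates.

```python
# Inspect the three OVS components described above on a hypervisor.
# Assumes a bridge called "br0" exists and the script runs with enough
# privileges to talk to OVS -- both are assumptions, not Nuage specifics.
import subprocess

def run(cmd: str) -> None:
    print(f"$ {cmd}")
    print(subprocess.run(cmd.split(), capture_output=True, text=True).stdout)

run("ovs-vsctl show")            # configuration held by ovsdb-server
run("ovs-dpctl dump-flows")      # flows cached in the kernel datapath module
run("ovs-ofctl dump-flows br0")  # OpenFlow tables handled by ovs-vswitchd
```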


The Open vSwitch is configured by the “control cluster” through a combination of the following methods:

  • SSH and the CLI can be used to manually configure the switch locally
  • The OVSDB management protocol is used to create switch instances, attach interfaces and define QoS and security policies.
  • OpenFlow is used to establish flow states and the forwarding tables for these flows
  • Netlink is the Linux communication API used between kernel and user space

Open vSwitch can also be implemented on hardware switches, for example an SDN white box switch, as OVSDB management protocol is also implemented on some vendors’ switches.

Overlay Network: Virtual abstraction built on top of a Physical Network. There are Network-Centric overlays (VPLS, TRILL, Fabric Path) where hosts are not aware of the Overlay, and Host-Centric (VxLAN, NV-GRE, STT) where hosts help create the virtual tunnels.

VxLAN: You can check out my previous posts (go to the Blog Map) for more details on how the VxLAN Control Plane and encapsulation work. VXLAN has a 24-bit VXLAN identifier, which allows for 16 million different tenant IDs. The VXLAN UDP source port is set on the sending side with a hashing function that allows for load balancing of traffic by ECMP (equal cost multi-path) in the datacenter network; the destination port is 4789. On the data plane, each VTEP-capable device needs a forwarding table with each possible destination MAC address within the same L2 domain and the hypervisor hosting it. The VNI identifies the L2 domain within the DC.
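As a small worked example of the numbers above (the 24-bit VNI and the UDP ports), this sketch builds the 8-byte VXLAN header from RFC 7348 and derives an ECMP-friendly source port from a hash of the inner flow; the hashing function shown is just one plausible choice, not the one any particular vendor uses.

```python
# Worked example: 24-bit VNI (2^24 = ~16M tenants), UDP dst 4789 and a
# hash-derived source port that gives the underlay ECMP entropy.
import struct
import zlib

def vxlan_header(vni: int) -> bytes:
    # RFC 7348: flags byte (0x08 = "VNI present"), 24 reserved bits,
    # 24-bit VNI, 8 reserved bits -- 8 bytes in total.
    return struct.pack("!II", 0x08 << 24, (vni & 0xFFFFFF) << 8)

def source_port(inner_flow: bytes) -> int:
    # Map a hash of the inner headers into the ephemeral range 49152-65535.
    return 49152 + (zlib.crc32(inner_flow) % 16384)

print(vxlan_header(5001).hex())   # 0800000000138900 -> VNI 5001
print("UDP", source_port(b"10.0.0.1->10.0.0.2:tcp:80"), "-> 4789")
```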

More and more server NIC cards support VXLAN offload functionality, which improves the encapsulation/decapsulation performance.

All VTEPs (Virtual Tunnel End Points) in the VxLAN Control Plane need at least IP connectivity between them. A VTEP needs to act as the default gateway for all the subnetworks that its hosted VMs belong to; in order to do this, the VTEP is assigned a MAC address and an IP address within each of those subnetworks. The combination of the IP and MAC addresses corresponding to a given VM is known as an EVPN prefix. When a packet is sent by a VM to its default gateway, because its final destination is an IP address in a different subnetwork, the VTEP will look into its EVPN route table, swap the destination MAC address (currently pointing to the default gateway) with the MAC address of the VM intended to receive the packet, and send the frame to the VTEP hosting the destination VM using the corresponding VXLAN tunnel.

BGP EVPN is an address family that can include both the IP and the MAC address of a given end point. Forwarding tables on each hypervisor contain information about all VMs in all subnets (each subnet corresponds to a different EVPN instance), and VXLAN tunnels exist to reach these subnets on all the hypervisors. Backhaul VPLS brings optimization and enhanced scaling for the number of EVPN MAC addresses and tunnels. With this optimization, each VRS receives complete forwarding information only for the subnets (EVPNs) locally hosted on itself. Each VRS is still aware of every VM in remote subnets, the hypervisor hosting it and its IP address (but not its MAC address). Consequently, when a VM wants to communicate with another VM in a remote EVPN, the VRS (acting as the default gateway) only has to do a route-table lookup to identify which hypervisor is hosting the relevant IP address. This way, it can use the VXLAN tunnel indicated by the backhaul VPLS to forward the packet. There is no need in this case to find the corresponding VPLS and do an additional L2 FDB lookup to determine the destination MAC address, as would happen if the subnet were locally hosted.
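Conceptually (this is not Nuage code, just an illustration of the lookup described above), the optimization means the VRS only needs an IP-to-VTEP mapping for remote subnets:

```python
# Conceptual sketch of the route-table lookup for a remote subnet: the VRS
# maps destination IP -> (hosting VTEP, VNI); the destination MAC is resolved
# by the remote VRS, so no local L2 FDB lookup is needed. Values are made up.
from typing import NamedTuple

class EvpnRoute(NamedTuple):
    vtep: str   # underlay IP of the hypervisor hosting the VM
    vni: int    # VXLAN network identifier of the backhaul/subnet

remote_routes = {
    "10.10.2.5": EvpnRoute(vtep="192.168.100.12", vni=5002),
    "10.10.2.6": EvpnRoute(vtep="192.168.100.13", vni=5002),
}

def forward(dst_ip: str) -> str:
    route = remote_routes.get(dst_ip)
    if route is None:
        return "unknown destination -> handle per policy"
    return f"encapsulate in VXLAN VNI {route.vni} towards VTEP {route.vtep}"

print(forward("10.10.2.5"))
```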


VRS is in the Underlay network, and OVS is in the Overlay network. All hypervisors need at least one interface connected to the Underlay network. You can also have the VTEP assigned to the ToR switch instead of the hypervisor, but the concepts don't change.


VSG - Virtualized Services Gateway allows the interconnection between physical and virtual domains. It basically translates between VLAN and VxLAN (VxLAN towards the Nuage overlay, VLAN towards the legacy infrastructure). There are two Nuage versions (software and hardware), and a version for "white boxes":

  • Software (VRS-G), which offers network ports via the overlay (VxLAN) and access ports to the traditional network (VLAN)
  • Hardware, the 7850 VSG, a 10/40G switch providing VTEP gateway functionality (VTEP in hardware)
  • Hardware VTEP on a white box


VSN is a Virtual Service Node, composed of a VSC and a group of VRSs: the VSC acts as the control plane, and the VRSs on the hypervisors do the forwarding. The VSN provides the network operator with a unified view of all the elements being handled by it, making hypervisors appear as line cards in a chassis when compared to a classic router. It provides a one-stop management and provisioning point for all the hypervisors under the VSN's control.



4. Don’t forget the Security: NFV and Service Chaining

Security policies are defined at the Domain level, and define From/To Zones and/or Subnets. It is important to understand the relative directions of security policies before implementing them. The easiest way to understand the directions is to look at them from the OVS point of view. This means that INGRESS is traffic entering the OVS, and EGRESS is traffic going out of the OVS:

  • Ingress refers to the direction of traffic flow from the VM towards the network (or the OVS component).
  • Egress refers to the direction of traffic from the network (or the OVS component) towards the VM.

Policies have priorities, which define the order in which they are evaluated. They can be imported/exported between Domains, or to/from a file. Before you apply any policies, keep the defaults in mind (a minimal example of a rule follows below):

  • By default all INGRESS traffic is dropped (INGRESS means from the VM to the OVS).
  • By default all EGRESS traffic is accepted (from the OVS to the VM).
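As a minimal example of what such a rule could look like, here is a hypothetical ingress ACL entry expressed as a Python dictionary; the field names are illustrative rather than the exact VSD schema, so treat it as the shape of the policy, not a copy-paste payload.

```python
# Hypothetical ingress ACL entry: since ingress (VM -> OVS) traffic is dropped
# by default, you explicitly allow what the workload needs. Field names are
# illustrative, not the exact VSD schema.
allow_web_to_app = {
    "description": "allow web-zone VMs to reach the app-zone on TCP/8080",
    "locationType": "ZONE",        # source scope: the web zone
    "locationID": "<web-zone-id>",
    "networkType": "ZONE",         # destination scope: the app zone
    "networkID": "<app-zone-id>",
    "protocol": "6",               # TCP
    "destinationPort": "8080",
    "action": "FORWARD",
    "stateful": True,              # one rule also covers the return traffic
}
```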

When defining the Security Policy, it's important to keep in mind Nuage's mode of operation, shown in the diagram below.


At the time of creation, a Policy Group Type is assigned to each Security Policy:

  • Hardware, for host and bridge vPorts hosted in Nuage VSG/VSA gateways
  • Software, for VRS and VRS-G hosted vPorts, including VM, host and bridge vPorts

Important: when you go stateless, you need ACLs for the "returning" traffic. For stateful, you just need one policy in one direction.

The ACL Sandwich feature enables a network admin to define a supra-list that will drop specific traffic that should NEVER reach the VM. The end user who owns the domain instance can then combine ACL rules into ACLs defined at the domain instance level.

Logging can be enabled at the ACL entry level.

Service Chaining
VSP provides so-called Forwarding Policies to control the redirection of packets, which is what later enables Service Chaining. In my opinion, Nuage has the most elegant implementation of Service Chaining of all the SDN products out there. All of it is implemented through flow-based redirection.

Nuage supports physical and virtual L4-7 appliances (or clusters of appliances) as redirection targets, and it gives you the option of creating Advanced Redirection Policies, where you can redirect only the traffic destined to a certain TCP/UDP port.
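In the same illustrative spirit as the ACL example above, a redirection (forwarding policy) entry could look roughly like this; again, the field names are placeholders rather than the exact VSD schema.

```python
# Hypothetical advanced redirection entry: steer only HTTP from the web zone
# through a firewall redirection target. Field names are placeholders.
redirect_http_to_fw = {
    "description": "steer web-zone HTTP through the virtual firewall pair",
    "locationType": "ZONE",
    "locationID": "<web-zone-id>",
    "protocol": "6",                      # TCP
    "destinationPort": "80",
    "action": "REDIRECT",
    "redirectTargetID": "<fw-redirection-target-id>",
    "failAction": "FAIL_TO_BLOCK",        # what to do if the appliance dies
}
```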

How to sell SDN

The most important thing about presenting SDN to a potential customer, and about how you need to focus your presentation, and I cannot stress this enough: your entire speech needs to be adapted to your audience.

1. Networking and Security Department

What you need to know before you start planning the presentation:
Before we get to the point, you need to understand that the networking guys do not want SDN. Within the Networking department you will easily distinguish two types of engineers:
- The ones who hate SDN, hate you for presenting it, and just want to continue doing things their own way.
- The ones who understand that unless they learn SDN, the Systems guys will choose the product, learn it, and take care of networking themselves, making the Networking department obsolete. You should always direct your presentation to this group.

What's the most positive thing SDN brings to the table?

SDN is the concept of a network that is multi-tenant, that has a single point of control for the entire network, and, most importantly, that allows you to "consume" the network using APIs. This means that the Networking department can give your developers, or cloud admins, the tools and teach them to consume the network. This way you avoid the usual delay the Networking department needs to configure networking and security for new apps and services, and, most importantly, the concept of a tenant allows them to use overlapping IPs, VLANs and names without ever being able to compromise the stability of your network.

What will they want to know? Here is the day-to-day of a Network Admin:
- Something stops working.
- On average it takes around 10 microseconds before someone says "Hey, maybe it's a networking issue?"
- Regardless of whether your Network Admin has "more important stuff to do", he ends up having to verify the entire networking environment, because of an "issue in a production environment", and everything gets dumped on top of the network.
- The issue gets resolved. More often than not, Network Admins get no feedback about how it was solved.

Network Admins just want no one to shout at them because something isn't working correctly.




This just means that the only thing the Network Admins will demand from your SDN solution is a set of easy-to-use troubleshooting tools. Keep this in mind when preparing your presentation.


2. Systems/Cloud Department
What's the most positive thing SDN brings to the table?
The Networking department is handling so many "critical production issues" that they hardly have any time to provision the networking for new services. Even when they have time, they have to take so much care just not to break something in the network while configuring the new stuff. In a world where it takes us seconds to bring up a new instance of a VM or a container, the current network model just won't do. The Systems guys need a way to simply provision the networking without writing an essay to the Networking department detailing why their request needs to be prioritized. This is why it will be really easy to make these guys understand (and probably love) any SDN solution you might be presenting.

What will they want to know? This depends on the solution you're trying to position. Keep in mind that these guys will love "graphical" solutions, such as VMware NSX and Nokia Nuage, and since they have limited knowledge of networking, it will be complicated to explain the advantages of a solution that handles the physical + virtual network, such as Cisco ACI or OpenDayLight.


3. Software Developers
Developers have similar "needs" to the Systems guys: they need a way to simply provision and secure the communication flows. If you tell them that the solution you're presenting gives them the possibility to consume the network using API calls - they're on board.

4. Mixed audience
This is probably the most complex audience you can possibly have when talking about SDN, because each of the departments will understand the concept in a different manner. Be sure that you can handle the open discussion; you have to be a true SDN ninja to handle the "lost in translation" paradox that will occur. I strongly advise you to bring both Networking/SDN and Systems experts to a presentation of this type, and make sure that YOUR experts agree on what SDN is before you let them approach the client as a team.


What are Cisco Cloud Center (CliQr) and UCS Director, how to choose/integrate?

Before we get into the details of each technology, and how you should choose which one best fits your environment, I would strongly advise you to sit down and think about what exactly you need, and what your ideal target environment would be. While doing this, here are a few questions you need to ask yourself:

  • What do I want to offer, IaaS, PaaS, SaaS, or a combination of these?
  • Do you want to automate the Application Deployment or the Infrastructure Deployment?
  • Are you really ready for automation? I strongly believe that once you choose your platforms, you should stick to them, because everything can be done in each of these… It's just that some are more suitable for certain tasks/ways of use than others.


UCS Director is used for Infrastructure Automation and Management (yes, management as well!). UCS Director has a huge task library for infrastructure elements such as Cisco Nexus and ACI, UCS, NetApp, EMC, vCenter, VMware vSAN, etc.


The main competitors of UCS Director are:

  • vRealize Suite (Automation, Orchestration) by VMware. I've seen very cool projects done with vRealize, but typically it's optimized for a mostly-VMware environment.
  • Terraform by HashiCorp. Linux geeks tend to love this one, as it is CLI-only and you deploy your infrastructure directly from code.
  • Ansible by Red Hat. You write your own playbooks, which are human-readable. Very flexible.


Why choose UCS Director? It really depends on your environment and what you want to do. In my opinion it's a perfect fit when you want to include the automation of the physical infrastructure in your workflow, and get unified support from Cisco. Out of the box, UCS Director has a bunch of tasks already at your disposal (as you probably guessed, most Cisco products, such as ACI, Nexus, UCS etc., are already included). If you need to add tasks, there is a pretty nice community. Just check this one, a UCSD workflow index (UCSD Technical Content Index):
https://communities.cisco.com/docs/DOC-56419

TIP: If you are really interested in UCS Director, I strongly advise you to build your own lab and test it before you make a purchase. Don't trust the PowerPoint; the stakes are too high. There is a built-in evaluation license in UCS Director, and you can download it as an OVA or a VMDK from Cisco.

Cisco Cloud Center (ex CliQr) is a CMP (Cloud Management Platform). It was a pretty pleasant surprise for me to see that Cisco is finally learning how to do software products. In all fairness, most of the original code comes from the company they acquired (CliQr), but still… they also bought Insieme and turned it into ACI, and… well, you know how the ACI GUI is.



The main competitors of Cisco Cloud Center are:

  • CloudForms by Red Hat. While CloudForms is more flexible, it doesn't come with libraries, so you will need to do most of the coding yourself.
  • vRealize Suite again, since it now supports Public Cloud.
  • RightScale, which purely follows a SaaS model. You cannot deploy RightScale in your own environment; it's already hosted somewhere, so all you do is log in, add your cloud account and start managing it.
  • Others (CloudBolt, Oracle, etc.).
  • Dell Multi-Cloud Manager (don't use this one, sorry @Dell).


These sound similar, should I use UCS Director, Cloud Center, or both?

A short answer would be: UCS Director does infrastructure, Cloud Center does applications. This does not mean that UCS Director couldn't automate an application deployment, or that Cloud Center cannot do infrastructure; it means that both products are better suited to doing what they were designed to do. Now, go back to the first paragraph and answer the questions. At this point you should have a clearer picture of which is the right product for you.


What if you need both, application deployment automation along with infrastructure modifications in accordance with the application's needs? In that case you would use both: UCS Director as the Day 1 product, and Cloud Center for application deployment in a multi-cloud. On top of both of these you would need an orchestrator of orchestrators. This is where you would place your Service Catalogue, which would then use the UCS Director and Cloud Center Northbound APIs to automate your application deployment, doing the application tiers and infrastructure deployments separately.
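To make the "orchestrator of orchestrators" idea concrete, here is a rough sketch of a catalogue item that drives both northbound APIs. The URLs, headers and payloads are placeholders (both products document their own API formats), so treat this purely as the integration pattern rather than a working integration.

```python
# Sketch of a service catalogue item driving both northbound APIs.
# All endpoints, headers and payload fields below are placeholders.
import requests

def deploy_service(tenant: str) -> None:
    # Day 1: ask UCS Director to run the infrastructure workflow
    # (tenant containers, VLANs/VRFs, storage, etc.).
    requests.post(
        "https://ucsd.example.com/api/workflows/execute",   # placeholder URL
        headers={"X-API-Key": "<ucsd-api-key>"},             # placeholder auth
        json={"workflow": "Create-Tenant-Infra", "tenant": tenant},
        verify=False,
    )
    # Day N: ask Cloud Center to deploy the application profile on top of it.
    requests.post(
        "https://cloudcenter.example.com/api/deployments",   # placeholder URL
        auth=("admin", "<cc-api-key>"),
        json={"appProfile": "3-tier-web", "environment": tenant},
        verify=False,
    )

deploy_service("acme-prod")
```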


If you don't want to build your own Service Catalogue web front end, Cisco has a product of this type called PSC (Prime Service Catalog). It's simple, but I'm not really sure how expensive it is… after all, it is Cisco.

How DevOps and Cloud raise the importance of the System Integrator

System Integrators, buckle up, DevOps is coming, and if you play your cards right - your role is about to get crazy important.

Let me start this post by telling a story. It's a story that involves a stubborn customer, 3 big vendors and a cloud. The reason I need to start this way is simple: the same scenario, with different "players", has happened so many times in the last few years that someone should sum up what we've all learned (or haven't, in some cases). I guess this would be a great place for a disclaimer, and I'll quote my favourite disclaimer ever, from South Park: all customers and events in this post, even those based on real people, are entirely fictional.

The story starts with the customer learning that Cloud is cool, and starting to want it. The problem is that there is no manual on Google on how to build a personalised private cloud. That's no problem: why not just promote (rename) your head systems engineer to Cloud Architect and follow his ideas and his experience? "But… he's got no experience," you'll say, and you'd be right. What he does have is a lot of vendors at his disposal with fancy PowerPoint presentations explaining how cool and awesome OpenStack is. Everything is awesome (check out the LEGO Movie soundtrack below, but be warned - it will stay in your head for days), now let's go build a cloud!



Now comes the tricky part. A majority of the bigger customers prefer working directly with the vendors' Professional Services, in order to lead the deployment themselves and hold someone accountable for a potential lack of functionality, or problems with the product. While there is logic to this philosophy, it applies more to the legacy infrastructure, where there is no complex integration between different technologies. So… why doesn't it apply to the Cloud? In the past few years I have seen so many different scenarios where a customer followed this strategy and either rolled back the entire environment, or is still struggling to make a lab work. For now, let's just state the 3 most obvious reasons why you should not use this strategy:

  • "Lost in Translation" bug in the Integration phase: Just like your Network, Systems and Apps engineers don't really understand each when you try putting them in the same room and making them collaborate, the Vendors of different types also wont be able to easily collaborate. Don't think that you'll be able to lead this collaboration, you will most definitely end up with different vendors pointing a finger at each other when asked why the integration is not working.
  • Support: Each Vendor will support their own product, but no one will give you a support of the Integration, which is the most complex part. A Cloud environment is difficult to build, but easy to operate. If something goes wrong - it goes REALLY wrong, and you will need an expert who understands the integration of the components in depth. It's impossible to demand this from your Cloud Architect or a Lead Engineer. Once the environment was initially built, they most probably didn’t spend their afternoons reviewing the Plugins/Drivers/Manual code modifications done in the implementation phase, and they will not be able to troubleshoot anything.
  • Upgrades/Modifications: Now imagine the moment you realize that your OpenStack/SDN/Orchestrator/etc. is obsolete, and you need to Upgrade. How does Upgrading each of the components impact the stability of the entire system? You will basically go back to the 1st problem, each time you need to modify anything on your Cloud.

What should be your strategy when deploying the Cloud?

The answer is rather simple actually. You need a partner, most likely a System Integrator, with a strong partnership with all of the Vendors whose products you wish to include in your Cloud environment. Here are the main reasons to involve a System Integrator:

  • They also had the "Lost in Translation" problem, but most likely a long time ago. At this point their different area specialists know how to talk to each other, and they can even help you teach your own employees how to do the same.
  • All the disputes between the different vendors will be transparent to you, and the System Integrator is more likely to figure out why the integration isn't working, working with the vendor to resolve the issue while giving you full transparency. They can even engineer custom code within the solution for you and support it.


Conclusion

This is not an easy task for the System Integrator, but as soon as everyone starts understanding how the new system should work, it will be so much easier to deploy a stable, fully supported and upgradable cloud environment, without the kind of engineering department that companies like Google, AWS and Facebook have managing their clouds.

Cisco ACI Unknown Unicast: Hardware Proxy vs Flooding Mode

Before we start, let's once again make sure we fully understand what a Bridge Domain is. The bridge domain can be compared to a giant distributed switch. Cisco ACI preserves the Layer 2 forwarding semantics even if the traffic is routed on the fabric. The TTL is not decremented for Layer 2 traffic, and the MAC addresses of the source and destination endpoints are preserved.

When you configure a Bridge Domain in ACI, you need to decide what you want to do with ARP packets, and what you want to do with unknown Layer 2 unicast traffic. You can basically:

  • Enable ARP Flooding, or not.
  • Choose between the two L2 Unknown Unicast modes: Flood and Hardware Proxy.




Hardware Proxy

By default, Layer 2 unknown unicast traffic is sent to the spine proxy. This behaviour is controlled by the hardware proxy option associated with a bridge domain: if the destination is not known, send the packet to the spine proxy; if the spine proxy also does not know the address, discard the packet (default mode).

The advantage of the hardware proxy mode is that no flooding occurs in the fabric. The potential disadvantage is that the fabric has to learn all the endpoint addresses.

With Cisco ACI, however, this is not a concern for virtual and physical servers that are part of the fabric: the database is built for scalability to millions of endpoints. However, if the fabric had to learn all the IP addresses coming from the Internet, it would clearly not scale.


Flooding Mode

Alternatively, you can enable flooding mode: if the destination MAC address is not known, flood in the bridge domain. By default, ARP traffic is not flooded but sent to the destination endpoint. By enabling ARP flooding, ARP traffic is also flooded. A good use case for enabling ARP flooding would be when the Default Gateway resides outside of the ACI Fabric. This non-optimal configuration will require ARP Flooding enabled on the BD.

This mode of operation is equivalent to that of a regular Layer 2 switch, except that in Cisco ACI this traffic is transported in the fabric as a Layer 3 frame with all the benefits of Layer 2 multi-pathing, fast convergence, and so on.

Hardware proxy on one side, and unknown unicast plus ARP flooding on the other, are two opposite modes of operation. With hardware proxy disabled and unknown unicast and ARP flooding also disabled, Layer 2 switching would not work.

This option does not have any impact on what the mapping database actually learns; the mapping database is always populated for Layer 2 entries regardless of this configuration.
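For reference, here is a minimal sketch of how the two knobs discussed in this post could be set on a bridge domain through the APIC REST API; the tenant and BD names are illustrative, and the login flow is simplified (no error handling or certificate checks).

```python
# Minimal sketch: set unknown-unicast and ARP behaviour on an ACI bridge
# domain via the APIC REST API. Tenant/BD names are illustrative.
import requests

APIC = "https://apic.example.com"
session = requests.Session()
session.verify = False

# Login (aaaLogin) to obtain the session cookie used by later calls.
session.post(f"{APIC}/api/aaaLogin.json",
             json={"aaaUser": {"attributes": {"name": "admin", "pwd": "<pwd>"}}})

bd_payload = {
    "fvBD": {
        "attributes": {
            "name": "BD-Web",
            "unkMacUcastAct": "proxy",  # "proxy" = hardware proxy, "flood" = flooding mode
            "arpFlood": "no",           # "yes" when the default GW sits outside the fabric
        }
    }
}
session.post(f"{APIC}/api/mo/uni/tn-Demo.json", json=bd_payload)
```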

What is NFVi or Cisco NFV Infrastructure, and where exactly does it "fit"?



First let's establish the difference between the NFV and the VNF:
  • VNF (Virtualized Network Function) refers to the implementation of a network function using software that is decoupled from the underlying hardware. It simply moves network functions out of dedicated hardware devices and into software. Cisco currently has around 90 VNFs ready to be implemented, mostly for the SP environment.
  • NFV (Network Functions Virtualization) represents a concept, based on running network functions in software, independent of any specific hardware platform.

This all simply means that we need the network functions virtualization (NFV) architecture to support the deterministic placement of virtualized network functions (VNFs).

Network Functions Virtualization is "the new black" in networking and network security, and all of us network bloggers have been talking about it extensively over the past few years. What it basically means is that we are finding a way to virtualize one of the functions of our network or network security elements, such as a load balancer or a firewall. The concept is rather simple, and while the entire industry is wondering why we're not there yet, the network engineers (meaning - the ones with a real understanding of how networking protocols work) are having a really hard time explaining why it can't all work as simply as the OpenStack enthusiasts expect it to.

Server virtualization is not complicated: once you have a hypervisor, you can create numerous virtual machines on a single bare metal server. Networking is much more complicated. In order to implement NFV, you need to have the networking underneath it completely handled and controlled; you need all-to-all connectivity provisioned in your underlay, so you can just "apply" the desired connectivity in accordance with what your VNF needs. This might be simple if we're talking about a couple of switches where we would simply extend a big group of VLANs all over the place, but as soon as we get into a slightly more complicated networking architecture (as in - any serious company's DC network) we add Spanning Tree, routing, the VxLAN control plane (and all the other control planes that use multicast), etc. If we don't have an SDN solution capable of handling both the physical and the virtual network elements, we shouldn't even start thinking about NFV. It would be like trying to breathe in space: you know WHAT you want to do and which organs you need to activate, but there is simply no all-to-all connectivity between the elements, which would be the oxygen in this case. Therefore, SDN is the enabler for NFV, and the two concepts go hand in hand.

What Cisco did is come up with an alternative that allows them to offer an NFV solution using an alternative (open source) SDN solution instead of Cisco ACI. NFVi is a reference architecture which does not depend on the SDN solution at all, and it's primarily made for Service Providers. NFVi is the infrastructure component of Cisco's NFV platform. A key part of Cisco NFVi is the Cisco Virtualized Infrastructure Manager (VIM).

If you have ever played with OpenStack, you know that we are talking about a platform that is pretty complex to deploy and operate. This is where VIM really shows its value. VIM takes care of the installation and the life cycle management of the entire NFVi - storage, network and compute components - and it fully integrates:
  • OpenStack Platform (Red Hat distribution)
  • CEPH (for reliable storage) 
  • All this on Cisco UCS (Unified Computing System)





There are so many different SDN and NFV ecosystems out there that it gets overwhelming for the end users, which is kind of why I wrote this post. NFVi is an open network architecture compatible with any SP end-to-end service creation. There are a few Cisco solutions to keep in mind when thinking about the Service Provider:

  • WAE: WAN Automation Engine, which complements the Cisco NSO (Network Services Orchestrator, enabled by tail-f) and Cisco's distribution of OpenDayLight.
  • VTS (Virtual Topology System), a true controller, designed to be open, which works with other vendors' networking equipment. VTS only requires BGP EVPN in the underlay to be able to build the VXLAN overlay.
  • Mercury, Cisco's internal OpenStack platform specific to SPs, based on Red Hat OpenStack, made to deliver a successful, reliable and stable installation via a GUI every time.

I could write an entire post about Cisco VTS (Virtual Topology System). It's basically an SDN controller for the Service Provider datacenter, a hybrid (physical and virtual overlay) provisioning & management system. In the context of NFVi, the diagram below will tell you what you need to know.




Where exactly does NFVi fit in then? It's quite simple actually. NFV Infrastructure is simply a tested and validated design that is, as Cisco claims, easily extensible and expandable. You could build a similar architecture yourself, or get a System Integrator to do it for you, but if you opt for NFVi you get a Cisco label on the support contract. The following diagram shows the most common use cases of the NFVi platform.




Low Power Wide Area Networks for IoT: SigFox, LoRa, LTE-M, 5G LP-WAN

[In collaboration with guest blogger Marc Espinosa]

The important topic of connectivity protocols was discussed in our previous IoT post; now it is time to dive deeper into the telecommunications protocols underneath. The fact is that the technologies that enable the IoT architecture need to combine low power consumption with transmission over long distances (meaning - lower frequencies).

For example, if we need to cover a field, a campus, an entire building, or turn a city smart, we will need a specific communication protocol. The truth is that ZigBee and 6LoWPAN do create low-power and low-cost WPANs, but since the assets can be distributed over a pretty wide area, we need to include another variable in the equation: the range.

The following networks/technologies are called Low Power Wide Area Networks (LP-WAN). Even though the power consumption of the devices they connect is low, they cover a wide area, making things easier and better to connect from one point to another:




As you can appreciate, there are two new agents in the network ecosystem: LoRaWAN and SigFox. Both are called LPWAN and, in suburban and rural areas respectively, they cover from 150 to 500 times the maximum range that ZigBee offers.
These two players have been competitors in the LPWAN space for several years. The business models and technologies they use are different, but the targets are very similar: mobile networks adopting their technology to deploy IoT solutions.

Even though LoRa and SigFox serve similar markets, the first option is more likely if you need bidirectionality, because of the symmetric link (e.g. if you need command-and-control functionality, like electric grid monitoring).



However, for applications that send only small and infrequent bursts of data (like alarms and meters), I would recommend the second one, an ultra-narrow band technology that can also carry a 2-way transport message (3*).




When we talk about mobile networks we can’t forget talking about NB-IOT: a LPWAN Narrow-Band radio technology standard that has been developed to enable a wide range of devices and services to be connected using cellular telecommunications bands. It has been designed for the IoT, standardised by the 3rd Generation Partnership Project (3GPP), a collaboration between groups of telecommunications associations.

To sum up all this content, let's group it into a table that qualitatively explains the LPWAN takeaways (4*).


To conclude, let’s highlight the key takeaways for the 3 low-power networks:
  • SigFox: extremely low power and bandwidth; only kind of an open standard, since you have to use their own network; easily tradable; limited security, but it does have some, and a lot of deployments.
  • LoRa: driven by a chip company (Semtech), so they want you to buy the maximum number of chips. Not quite as low power as SigFox, but pretty good too, and it has more bandwidth for control functions and decent data streaming. Not an open standard, as mentioned, because you are obliged to use the Semtech chip (in my opinion this is a weakness, because you are forcing your client to buy your product instead of understanding the market needs). There are a lot of suppliers pushing you to use Semtech chips, pretty good security (they do all the basic authentication), and several deployments.
  • NB-IoT: a technology that the mobile operators are driving; very low power and bandwidth, similar to SigFox, but not as widely deployed as the ultra-narrow band network. It is an open standard because it is part of 3GPP; lots of suppliers because it is open; solid security and authentication, and some deployments (Vodafone with Huawei did the first commercial PoC, which took place in Madrid using Vodafone Spain's network on September 19th, 2016). There is another network called LTE-M (Long Term Evolution - Machine) that is pretty similar (an open standard as well) to NB-IoT: not as power efficient as NB-IoT, but with better security. Few deployments so far, but they are going to grow exponentially hand in hand with NB-IoT.

The question is: which of these is the IoT network of the future? Will LoRa and SigFox be able to survive if the 5G standard includes an IoT WAN? IoT is a big market; in our opinion there's a place for everyone, we just need to wait and see what happens.

