How failing VCDX changed my life


I started my professional career in 2003, in a NOC (Network Operations Center), making sure that the Network and Security services of the ISP's (Internet Service Provider) VIP customers ran smoothly. Until 2013, I mostly worked on large Network and Security design implementations for big European and Middle Eastern companies, on Cisco and Juniper equipment, and I got all the technical certifications (including CCIE). I jumped into the Software Defined world and Virtualization as soon as it hit the market, and got my VCIX (VMware Certified Implementation Expert) Certification on the Network Virtualization track (NSX, basically) in 2014, not long after VMware acquired Nicira.

My point is that I did so many technical designs and implementations, mostly in Data Center environments, that when I found out about VCDX (VMware Certified DESIGN Expert), I was sure that I'd get it on the first attempt.

That did not happen. Not only did it not happen, but I faced one of the toughest wake-up calls… ever!

What's VCDX all about?

VCDX is currently the most prestigious VMware Certification, held by fewer than 300 people in the world (check out the official directory). VCDX is NOT a technical certification. Even so, you need to be an SME (Subject Matter Expert) in all the areas that your Design covers.

VCDX is all about the community. The official VMware documentation and support are all right, I guess, but the community is one of the greatest I've ever stumbled upon. I'd actually like to seize this opportunity to thank the people who helped me while I was working on my design, and doing my mocks (a mock is a practice design defence).

About my journey

Phase 1: Delusion

I came into VCDX as a "great architect", and received quite a slap of reality when I understood how much I actually didn't know. Yes, it's all about the WHY, and no matter how many times you hear this, you just don't get it until you start presenting your design to other VCDXs and candidates.

I learned SO MUCH about what good design needs to cover, and why it's so important that you understand how all the business requirements map into technical requirements, which drive 100% of your design decisions. I understood that if you have a Design Decision and you can't explain how you reached it (what the options were, which requirement triggered that choice, what risks it introduces and how you mitigate them) - you will simply fail.

If I can give you one piece of advice, it's this - be patient, and accept the tips other VCDXs are giving you. They're not being mean, they really want you to pass.

Phase 2: Success

During VMworld 2018, I had a mock session where I presented my VCDX design to 6 VCDXs. It was brutal… The design I thought was bulletproof got torn to pieces. Result: I learned so much… It took me some time to recover from this experience, but once I did, I understood what an amazing privilege it was to actually have everyone listen to me for over an hour. This is when my mind just "clicked". I changed my design, and was approved for defence in the next available defence slot.

I had to change my design twice before it was approved for defence. I never got to defend it, but I still consider the journey a success for the following reasons:
  • I learned so much about what great design is. I switched to a completely different set of technologies (public cloud mostly), but everything I learned still applies, 100%, and I use it daily building Cloud architectures.
  • I learned how to be more down-to-earth, and that I'm not a great architect. I still have so much to learn.
  • I understood what "it's all about WHY" means. Technology needs to help achieve business goals. Being technically superior doesn't mean you can make great designs.

Phase 3: Change of plans

I ended up changing jobs a few months before the defence date, which was unfortunate, because I had to focus on such an entirely new set of technologies that I didn't have time to prepare my defence. Some may see this as a pity, after all the time and effort I invested, but to be honest - I don't see it that way. I see it as a success story, as I still get to apply everything I learned… just on a different set of technologies.

Conclusion

I highly recommend going for VCDX. You might not get your number, you might get crushed while getting the design right, but I guarantee one thing - you will learn a lot, and become a much better architect than you currently are.

What is Service Mesh and do I need one?

Let's start with this - what problem does Istio solve?

Or... are we just using it because it's the next cool thing? To be honest - there's a bit of both, from what I'm seeing with most of my customers.

To illustrate the problem Istio solves, let's take an example customer who already has their Kubernetes clusters. It doesn't matter on which cloud/data center the k8s Masters and Workers are. The SRE team is properly skilled, and operating the environment. The Developers see a clear improvement, plus the SRE team clearly understands what they need. Life is good.

Knock, Knock...

D: Oh... it's Marketing! Hello, Marketing, how can we help you?
M: Hey, Developers!!! How are you, bros? So... we have this super awesome new feature we'd like to test only in Southern Spain, and only on iPhones... and maybe just like half of the users if possible. How long would it take to do this?
D: ˆ%#$%!@ say... what??? Hey @SRE team, any chance Kubernetes can handle traffic management this granular?
SRE: Hmmm... How many people are we allowed to add to the team to operate the environment? Does potential business benefit justify contracting new SREs?

Enter - Service Mesh

And let's consider Istio, as my favourite Service Mesh at the moment...

There are 3 Core Features of Istio:

  1. Traffic Management: We can do Canary Testing, where we redirect, say, 10% of traffic to the New version of the app. Or, create a rule that routes certain users to a different version, such as - iPhone users, let me route you over ... here.
  2. Security (Authentication and Authorization): Identity is assigned to each Pod when it's spun up, and we can create rules and policies (ACLs) to say which services it can access.
  3. Logging: Istio also has a dashboard in Grafana.
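
To make the Traffic Management idea concrete, here is a minimal Python sketch of the decision a sidecar proxy makes for each request: a header match for iPhone users, plus a weighted split for everyone else. This is NOT Istio's API (in Istio you would declare this in a Virtual Service rather than write code), and the function name, the version labels "v1"/"v2" and the 10% weight are just assumptions for illustration.

```python
import random


def pick_version(headers, canary_weight=0.10, rng=random.random):
    """Return the app version ("v1" or "v2") a request should be routed to.

    Hypothetical helper: mimics a header match rule plus a weighted split.
    """
    user_agent = headers.get("user-agent", "")
    # Match rule: header-based routing, all iPhone users go to the new version.
    if "iPhone" in user_agent:
        return "v2"
    # Weighted rule: roughly canary_weight of the remaining traffic goes to v2.
    return "v2" if rng() < canary_weight else "v1"


if __name__ == "__main__":
    android = {"user-agent": "Mozilla/5.0 (Linux; Android 9)"}
    sample = [pick_version(android) for _ in range(10_000)]
    print("share of non-iPhone traffic sent to v2:", sample.count("v2") / len(sample))
```

The point of the mesh is that this logic lives in the proxies and is pushed as configuration, so neither the Developers nor the SRE team have to code it into each application.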


Istio is a CONTROL PLANE (adds a pluggable Control Plane), and a Service Mesh is an actual Data Plane. Everything that Istio does is via Envoy Proxy, which is a literal Sidecar that is spun up with EACH Kubernetes Pod.




What are some elements in the Istio architecture diagram above?

Pilot

Delivers config to the Proxies (Envoy). As a User you interact with Pilot through the CLI, through automation, or through CI/CD. Pilot is in charge of:
- Service Discovery
- Intelligent routing
- Resiliency

Envoy Proxy

L7 Load Balancer, Sidecar for all the Containers. It's a literal Sidecar, and Envoy Proxy is deployed along with EACH of the Pods. It takes care of:
- Dynamic Service Discovery
- Load Balancing
- TLS Termination
- Health Checks
- Staged Rollouts

Mixer

Access control, quota checking, policy enforcement. Mixer keeps checking and collecting reports on whether all Proxies are alive and well. It exposes a single API for integrations, so Plugins for Monitoring, API management or Prometheus would plug into Mixer.

Citadel

Strong service-to-service and end-user authentication with built-in identity and credential management.

Istio CA

Handles the certificates, to secure the communications.

Istio uses the following configuration concepts:
- Virtual Service
- Destination Rule
- Gateway
- Service Entry



This entire mechanism seems (and is) pretty complex, but it allows us to do so much more in a microservice architecture. For more details I recommend checking out the official documentation. It's pretty well organized and technically well written.

Conclusion

Kubernetes as such adds a big operational overhead. Istio adds even more overhead, and a lot of complexity on top of your platform. Should you use Istio then? If you have huge Kubernetes clusters, a bunch of Cloud Native Applications designed with microservices, with hundreds... maybe thousands of containers, and you also have a business requirement that justifies adding the overhead - sure, Istio is awesome! If not... maybe look for a simpler solution to your problem.


I use API Gateway. Can I claim I have an API Strategy now?

In the last few years, I've had the opportunity to talk to a number of customers who, when asked what their API strategy is, simply answer something like "We're using NGINX as an API Gateway", or "we got MuleSoft (https://www.mulesoft.com) licenses, still struggling to implement all we need".

Let's start like this: API Gateway is NOT an API Manager. It's… just a Gateway for your APIs. What does an API Gateway do? The API Gateway is your frontend. It manages the API requests. It enforces policies (AAA), and lets you manage your L7 Ingress… but there's so much more to API Management than that.
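
As a rough illustration of what "enforcing AAA policies" means at the gateway, here is a minimal Python sketch. It is not modelled on NGINX, MuleSoft or any other product. The `Gateway` class, the `API_KEYS` table and the paths are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field

# Hypothetical data: API keys mapped to a client name and the paths it may call.
API_KEYS = {
    "key-123": {"client": "mobile-app", "allowed_prefixes": ("/orders",)},
}


@dataclass
class Gateway:
    access_log: list = field(default_factory=list)

    def handle(self, api_key, path):
        """Return an HTTP-style status code for a request hitting the gateway."""
        client = API_KEYS.get(api_key)
        if client is None:
            status = 401  # Authentication failed: unknown key
        elif not path.startswith(client["allowed_prefixes"]):
            status = 403  # Authorization failed: path not allowed for this client
        else:
            status = 200  # A real gateway would forward to the backend here
        # Accounting: record who asked for what, and the outcome.
        self.access_log.append((client["client"] if client else None, path, status))
        return status


if __name__ == "__main__":
    gw = Gateway()
    for key, path in [("nope", "/orders/1"), ("key-123", "/admin"), ("key-123", "/orders/1")]:
        print(key, path, gw.handle(key, path))
```

Notice what is missing: nothing here helps developers discover the API, tells you how customers consume it, or bills anyone. That gap is the difference between a Gateway and API Management.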

How do you create your requirements?

When you design the API Management solution, you need to think about how to design a strategy for your particular business. To be specific, you need to focus on two things:

  • Your Developers
  • Your Customers

Why?
Because if you motivate your Developers to explore ways to improve the APIs, and cross-reference this with Analytics capabilities, you get continuous feedback on how your Customers are consuming your app, and how they'd like to consume it. This means:

  • You want to give the best APIs to your developers, so that they can achieve the best value.
  • You need to establish the API Team, who would be in charge of all your APIs, making sure that Usability and Security are of the highest quality.


What do I need to build an API strategy?

You need to be sure you have all of the following aspects "covered":

  • Developer Portal: Where you can quickly engage your developers and partners.
  • Analytics: To gain deep insight into API usage and performance.
  • Operations Automation: Scale APIs at web scale with operational control.
  • API Development: Tools that help develop, version, deploy and monitor APIs.
  • Security, covering all the aspects of your APIs.
  • Monetization enablement: Setting up pricing rules, based on usage, load and functionality, issuing invoices and collecting payments.
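
To give a feel for the Monetization aspect, here is a small Python sketch of usage-based pricing rules with graduated tiers. The tier boundaries and prices are invented for the example and do not come from any product. An API Management platform would let you configure something equivalent instead of coding it.

```python
from decimal import Decimal

# Hypothetical price list: (upper bound of the tier in calls, price per call).
TIERS = [
    (1_000, Decimal("0")),        # first 1,000 calls are free
    (100_000, Decimal("0.002")),  # calls 1,001 to 100,000
    (None, Decimal("0.001")),     # everything above 100,000
]


def monthly_invoice(calls):
    """Price a month of API usage with graduated (tiered) pricing."""
    total = Decimal("0")
    lower = 0
    for upper, price in TIERS:
        if calls <= lower:
            break
        # Number of calls that fall inside this tier.
        billable = (calls if upper is None else min(calls, upper)) - lower
        total += billable * price
        lower = upper if upper is not None else calls
    return total


if __name__ == "__main__":
    for calls in (500, 50_000, 150_000):
        print(calls, "calls ->", monthly_invoice(calls))
```

Using `Decimal` instead of floats is a deliberate choice here, since rounding errors are not something you want on an invoice.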


What are some API Management products that I should consider?

I personally prefer MuleSoft, probably due to my past experiences, but as sometimes happens - Gartner doesn't fully agree with me. Here's what they've determined for 2019. What do you think?




Kubernetes Proxy: Envoy vs NGINX vs HA Proxy


Having spent quite some time with Linux and Kubernetes admins, I've come to realize that networking isn't one of their strong suits. Being a network guy myself, I feel obliged to share my views on topics as important as this one. So, which proxy should you use in your Kubernetes cluster?

Let's start with some facts:
  • All three of these proxies are highly reliable, L7, proven proxies, with Envoy being the newest kid on the block.
  • All these proxies do an outstanding job of routing L7 traffic reliably and efficiently, with a minimum of fuss.
  • There is no full parity of features, but you can implement any critical missing features in the proxy itself… the power of open source!





To keep the post structure, just a few lines about each of these 3 Proxies:
  • HA Proxy is the default Load Balancer when it comes to Kubernetes. It was initially released in 2001, when the Internet operated very differently than today, ergo… there's an issue of slow adoption of new features. This is very serious when you consider SECURITY, like support for the latest SSL/TLS versions.
  • NGINX is a high-performance web server, Load Balancer, WAF and so many other things… FASTER and more modern than HA Proxy, and if you check out the SDN integrations (Cisco ACI, VMware NSX, Nokia Nuage), these are all based on the open source version of NGINX. NGINX open source has a number of limitations, including limited observability and health checks, so it comes down to what you're looking for. If you want an enterprise product, depending on your company environment - go with NGINX Plus, ACI or NSX (be sure to ask for -T).
  • Envoy Proxy is new… so not very mature, BUT - it's the most modern, and used in production at Apple and Google, among others. Envoy was designed from the ground up for microservices, with features such as hitless reloads, resilience, and advanced load balancing, plus dynamic APIs for configuration. THIS is a big deal, in a world where proxies have been configured using static configuration files (Envoy also supports static config, of course). And let's not forget that Istio Service Mesh, of which I'm a big fan and contributor, uses an extended version of the Envoy proxy.

How I prepared for AWS SA Professional exam

Last week I managed to pass the AWS Solutions Architect Professional certification exam. Here's my certification, in all its glory:



If you've been following my blog, you'll know that I passed the Google Cloud Professional Cloud Architect exam in March. I wrote a few blog posts about how I prepared for it, and you may find it all here.

Even though I've been preparing for the AWS exam for quite a while, the two main reasons I went for GCP professional level exam first are simple:

  • I think Google Cloud is a sleeping giant, and I wanted to be among the first certified experts. 
  • AWS has many more services. For a professional level exam you don't just need to know some of them in depth, you need to know ALL of them in depth, in order to make the right architecture that fits the customer's requirements.


How I prepared

Simple:

  • Linux Academy has amazing hands-on courses for both Associate and Professional level. In my experience - the only ones that really prepare you for this exam.
  • Work experience. This is where it gets tricky… AWS has a wide service catalogue, and your hands-on work environment is unlikely to cover the entire blueprint.


Difference between AWS Associate and Professional level exams

This is something I get asked a lot. Here is the main difference:

  • To pass the Associate level exam, you need to know what each service does. The questions are straightforward: if you know what the service does - you'll eliminate most of the options in your test, and get the right answer.
  • AWS SAP (Solutions Architect Professional) is a real world, business problem oriented exam. It's assumed that you know the entire AWS Service Catalogue in depth, and you are tasked with finding the optimal architecture based on the customer requirements. You will get 77 different business scenarios (this is a LOT of text, so be prepared), and each one has 4-5 possible answers, all of them correct. You just need to figure out which one is the best for that particular scenario.


This basically means that if the question is how to connect your VPC with your on-premises infrastructure in the most cost efficient way, the answer will vary:

  • In Associate level, you will go with IPSec VPN, because Direct Connect is more expensive.
  • In Professional level you'll have to go deeper, and it's likely that once you map the use case to the architecture, Direct Connect could come out as the most cost efficient option.
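
Here is a back-of-the-envelope Python sketch of why the answer can flip. It assumes a simple cost model (a fixed monthly fee plus a per-GB data transfer charge) and uses placeholder prices that are NOT real AWS pricing. The names `LinkOption`, `cheapest` and `break_even_gb` are made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkOption:
    name: str
    fixed_monthly: float  # connection/port fees per month
    per_gb: float         # data transfer out price per GB

    def monthly_cost(self, gb_out):
        return self.fixed_monthly + self.per_gb * gb_out


# Placeholder prices for illustration only, not real AWS pricing.
VPN = LinkOption("IPSec VPN", fixed_monthly=40.0, per_gb=0.09)
DX = LinkOption("Direct Connect", fixed_monthly=400.0, per_gb=0.02)


def cheapest(gb_out):
    """Pick the cheaper option for a given monthly outbound volume."""
    return min((VPN, DX), key=lambda option: option.monthly_cost(gb_out))


def break_even_gb(a, b):
    """Monthly volume at which both options cost the same."""
    return (b.fixed_monthly - a.fixed_monthly) / (a.per_gb - b.per_gb)


if __name__ == "__main__":
    for gb in (500, 20_000):
        print(gb, "GB/month ->", cheapest(gb).name)
    print("break-even (GB/month):", round(break_even_gb(VPN, DX)))
```

With these made-up numbers the VPN wins at low volumes and Direct Connect wins once you pass a little over 5,000 GB a month. The Professional exam expects you to spot that kind of detail in the scenario text.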


AWS vs GCP professional certifications

This is a tricky one… Basically this is how it is:

  • GCP exam is very, very difficult. I feel like it's a Cloud Architect and DevOps merged into one exam, which makes it quite complex and "uncomfortable" at moments. BUT - GCP doesn’t have nearly as many services as AWS does in the Service Catalogue, so I guess the blueprint is narrower, which kind of justifies the complexity of the exam.
  • AWS is difficult and long, requires high concentration for the full 170 minutes, and - probably what I like most - tests you on real world skills. You will potentially get the same possible architectures as answers in many different questions, and I feel it's impossible for someone to pass it even if they knew the questions. You really need an architect's mind. On the positive side - there are no trick questions, so if you're good - you'll pass, it's as simple as that.


What's next? 

I'm going all in for my VMware VCDX (Design Expert) exam now. Did the design, going for the defence. I think I'm at the point in my career to go for something like this, get roasted for thinking I'm a super architect… Bring it on, my ego is about to be destroyed, but I feel like I'll come out of the experience as a true business architect.

On relevance of CCIE in 2019

A question I've been getting a lot from Network Engineers is whether they should go for CCIE. There are two sides to this question:

  • Knowledge and skill
  • Value of CCIE as a Certification



Let me get into more detail.

Value of CCIE as gaining skill and knowledge

Networking as such is changing. A network engineer for the cloud era needs to understand programmability, APIs, SDN with its use cases, and Public Cloud networking (inter and intra public cloud). BUT, if you've ever talked to a network engineer who doesn't come from hardcore Cisco or Juniper networking, and rather comes from systems (VMware or Linux), or someone who's just studied something like OpenFlow and considers hardware to be a "commodity", you'll notice how, due to a lack of basic L1-4 networking concepts, they tend not to understand some limitations in both functionality and performance. There are exceptions, of course, and I want to acknowledge that!!! The point I'm trying to make is that CCIE gives you the best possible base for any kind of programmable, cloud, Kubernetes or whichever networking-related activity you want to pursue in the future.

Value of CCIE as a Certification

This is a completely different topic. If you want to do your CCIE just because you want more money from your employer - don’t. Go learn AWS, learn Python and Ansible, maybe some ACI and NSX but from the "north side" (API). The days when getting a CCIE meant an immediate salary increase of 50% are over… It is now a step in your trip, not the final goal.

Conclusion

Should you go for a CCIE? Yes. If you are serious about networking, you 100% should. You will learn all that other SDx and Cloud stuff much more easily if you understand bits and bytes. Hey, I passed my Google Cloud, AWS, and NSX highest level technical certifications largely thanks to the networking knowledge I gained working in the field as a CCIE... I'm just doing Networking in a different way now. But - it's still networking, L2 and L3, same old MAC, IP and BGP, just consumed in a different way.

Just married: IBM and RedHat. What does this mean for Cisco and VMware Multi-cloud offer?

As per yesterday's announcement, IBM is acquiring Red Hat in a deal valued at $34 billion (more about this here). This is another one in a row of deals I did not expect to happen:

  • Oracle acquired Sun Microsystems
  • Microsoft acquired GitHub
  • Dell acquired VMware


How disruptive can a Purple Hat really be? VMware survived being acquired by Dell quite well... will RedHat have the same luck, or not? What I know for sure is that the RedHat employees are panicking right now...

Sure, $34 billion is a big sum, but it's also a bold move by IBM in its conquest of the Multi-Cloud market. Combined, we're looking at (to name a few):

  • Ansible for the Automation
  • OpenShift, as the best of breed PaaS based on Kubernetes
  • CloudForms as a potential CMP (I wonder how this will work out...)
  • Watson for everything AI/Machine Learning related
  • IBM Cloud as a Public Cloud platform


Is this a winner combo? Or do other Hybrid Cloud promoters, like Cisco and VMware have equally good lock-in-free proposals?



As a Hybrid Cloud and DevOps advocate, and a European CTO, I've had the chance to "casually chat" with many European companies about their Cloud strategy. Two things are evident:

  • The buyer is changing. Multi-Cloud is an APPLICATION strategy, not an infrastructure strategy (read more about this here).
  • Companies don't really know who to trust, as what they're being told by various vendors and providers is not really coherent. This makes it pretty difficult to actually build a Cloud strategy (don't get me started on CEOs who'll just tell you "We've adopted Cloud First", and actually think they have a cloud strategy).


Due to all this:
- IBM and RedHat, as software companies, will be able to get to the Application market.
- Neither of the two can do Infrastructure as well as VMware & Cisco.


How important is this? Very! And here is why.

Cisco has:

  • Cloud Center, a true application oriented micro-services ready CMP, Public Cloud and Automation Tool agnostic, equipped with the right Benchmarking and Brokering tools, that integrates quite well with the infrastructure, and workflow visibility platforms.
  • ACI and Tetration, that enable the implementation of coherent and consistent Network and Security Policy Model across multiple private and public clouds, along with the workload visibility.
  • HyperFlex and CCP, providing enterprise production-ready, lock-in-free Kubernetes solution on a Hyper Converged infrastructure.
  • AppDynamics and Turbonomic, a true DevOps combo for the Day 2 we're all fearing in the Cloud, letting the application architects model their post-installation architecture, and monitor the performance of each element, latency between different elements, and assure the optimal user experience.


VMware has:

  • vRealize Automation, the best of breed Automation and Orchestration Hybrid Cloud ready platform.
  • PKS and VKE, KaaS platforms that provide an enterprise production-ready Kubernetes solution, with a fully prepared Operations component, in both private and public cloud.
  • Wavefront, an application visibility tool running on Containers, designed with Cloud applications and Micro Services in mind, with just insane performance.
  • NSX, including the full SDN stack in both Data Center and Cloud, with probably the best API (both documentation and usage wise).
  • Partnership with AWS, Azure, GCP and IBM, to leverage the most demanded Hybrid Cloud use cases in a "validated design" fashion.



What does this all mean?

Multi-cloud is still a space that, according to Gartner and IDC, over 90% of Companies are looking at. Big companies are making their moves... so just grab your popcorn, and observe. It's going to be a fun ride!
