
The Most Common Mistake Made in VMware vSphere Networking

It seems I have had a number of the same conversations in recent weeks about a very particular topic, and some could argue a design flaw, with the most basic of VMware vSphere deployments. This has to do with an overly simplistic deployment where the ESX host is configured with only a single VMkernel port, typically vmk0. While this "works fine," there are some inherent issues that can occur downstream with such a basic configuration. Let's take a short look at the problem, but I am surprised it's 2022 and some basic fundamentals of vSphere are still misunderstood.

Let's take the simple design shown below and break down what is happening. We also need to assume a basic configuration: two physical NICs as uplinks, default teaming, and no link aggregation (LACP).

The question we now have to ask is: how is traffic flowing for everything on this host? It's pretty simple really. On boot, vmk0 will bind to ONE of the two pNICs based on the assumptions above. This means ALL traffic will flow over a single pNIC, including most of your network-based storage connections. As you add services and storage, the only path to mount them is via vmk0 and thus pNIC1. Are we starting to see the problem? The second pNIC is totally "unused." This is a big deal on a 1G connection, and maybe sometimes even on a 10G or 40G connection. The real issue, IMO, is "wasting" the other pNIC for any host-level services, let alone the potential security implications and risks associated with this.

Single VMK0 traffic path
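
If you want to confirm this state across your own hosts, here is a rough sketch using pyVmomi (the Python SDK for the vSphere API) that lists each host's physical NICs and VMkernel ports. The vCenter address and credentials are placeholders, and the property access is from memory, so treat it as a starting point rather than a finished script.

```python
# Hedged sketch: list each host's physical NICs and VMkernel ports so you can
# spot hosts running everything over a single vmk0. Placeholder credentials.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()  # lab shortcut; validate certs in production
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    hosts = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in hosts.view:
        net = host.config.network
        print(f"{host.name}: {len(net.pnic)} pNICs, {len(net.vnic)} VMkernel ports")
        for vnic in net.vnic:
            # vnic.portgroup is populated for standard vSwitch port groups only
            print(f"  {vnic.device} -> '{vnic.portgroup}' ({vnic.spec.ip.ipAddress})")
finally:
    Disconnect(si)
```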

How most vSphere architects have solved this is simple: add more VMkernel ports for specific traffic and manipulate the Distributed Port Group teaming settings to re-path the communications. In some cases, do this over L2 connections to ensure minimal routing. Here is a simple example of this concept.

Setting specific active path for VMKernel traffic types
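
For those who prefer to script it, below is a hedged pyVmomi sketch of the same idea: overriding the active/standby uplink order on a distributed port group so its VMkernel traffic prefers one pNIC. The port group and uplink names ("vMotion-1", "Uplink1", "Uplink2") and the credentials are just examples of mine; most people will simply set this on the teaming and failover page in the vSphere Client instead.

```python
# Hedged sketch: set Uplink1 active / Uplink2 standby on a distributed port
# group's default teaming policy. Names and credentials are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()
si = SmartConnect(host="vcenter.lab.local", user="administrator@vsphere.local",
                  pwd="changeme", sslContext=ctx)
try:
    content = si.RetrieveContent()
    pgs = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.dvs.DistributedVirtualPortgroup], True)
    pg = next(p for p in pgs.view if p.name == "vMotion-1")  # example port group

    # Build a reconfigure spec that only touches the uplink failover order.
    order = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortOrderPolicy(
        inherited=False, activeUplinkPort=["Uplink1"], standbyUplinkPort=["Uplink2"])
    teaming = vim.dvs.VmwareDistributedVirtualSwitch.UplinkPortTeamingPolicy(
        inherited=False, uplinkPortOrder=order)
    port_config = vim.dvs.VmwareDistributedVirtualSwitch.VmwarePortConfigPolicy(
        uplinkTeamingPolicy=teaming)

    spec = vim.dvs.DistributedVirtualPortgroup.ConfigSpec(
        configVersion=pg.config.configVersion, defaultPortConfig=port_config)
    pg.ReconfigureDVPortgroup_Task(spec)
    print(f"Teaming override submitted for {pg.name}")
finally:
    Disconnect(si)
```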

Lastly, if you want to expand on this even more, here is a table I have used many times in past designs that breaks down the use of multiple VMkernel ports as shown above. Here you can see we force traffic to active and standby uplinks (still assuming there is no LACP), thus utilizing both pNICs to their fullest extent.

| VMkernel | Name | VLAN Type | TCP Stack | Services | UpLink1 | UpLink2 |
|----------|------|-----------|-----------|----------|---------|---------|
| vmk0 | Management | L3 | Default | Mgmt Only | Active | Standby |
| vmk1 | vMotion-1 | L2 | Default | vMotion Only | Standby | Active |
| vmk2 | vMotion-2 | L2 | Default | vMotion Only | Active | Standby |
| vmk3 | ESX-NFS | L2 | Default | none | Standby | Active |
| vmk5 | iSCSI-1 | L2 | Default | none | Active | Unused |
| vmk6 | iSCSI-2 | L2 | Default | none | Unused | Active |

ESX software iSCSI adapter port binding requires a specific teaming setup of Active/Unused.
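
To make the intent of the table concrete, here is a tiny self-contained Python sketch. The data is lifted straight from the table; the two checks (no uplink left without an Active consumer, and iSCSI vmks kept Active/Unused for port binding) are simply my way of expressing the rules, not anything VMware-specific.

```python
# Encode the teaming plan above as data and sanity-check the two rules this
# design depends on. Pure illustration; names match the table.
teaming_plan = {
    "vmk0": {"portgroup": "Management", "uplink1": "active",  "uplink2": "standby"},
    "vmk1": {"portgroup": "vMotion-1",  "uplink1": "standby", "uplink2": "active"},
    "vmk2": {"portgroup": "vMotion-2",  "uplink1": "active",  "uplink2": "standby"},
    "vmk3": {"portgroup": "ESX-NFS",    "uplink1": "standby", "uplink2": "active"},
    "vmk5": {"portgroup": "iSCSI-1",    "uplink1": "active",  "uplink2": "unused"},
    "vmk6": {"portgroup": "iSCSI-2",    "uplink1": "unused",  "uplink2": "active"},
}

# Rule 1: every uplink should be the Active path for at least one VMkernel port,
# otherwise a pNIC sits idle for host services (the original problem).
for uplink in ("uplink1", "uplink2"):
    active = [vmk for vmk, row in teaming_plan.items() if row[uplink] == "active"]
    print(f"{uplink} is Active for: {', '.join(active)}")
    assert active, f"{uplink} carries no active VMkernel traffic"

# Rule 2: software iSCSI port binding wants exactly one Active uplink and the
# other Unused (not Standby) on each iSCSI VMkernel port.
for vmk, row in teaming_plan.items():
    if row["portgroup"].startswith("iSCSI"):
        assert sorted((row["uplink1"], row["uplink2"])) == ["active", "unused"], \
            f"{vmk} breaks the iSCSI port-binding teaming rule"
```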

There are other uses for additional VMkernel ports and other traffic types such as NSX-T, backup, cold migration, replication, etc. However, this basic foundational mistake seems to be all too common. Once you are set up this way, you have a lot more options in your overall design. In another post I may show how this can also affect the Storage Migration (Storage vMotion) process.
