In recent weeks I have been configuring a workload as a virtual machine, and I ran into a scenario that required a specific vSwitch advanced setting. I will describe the basic setup and then the resolution to the issue. Below is a basic diagram of a vSphere virtual switch setup we have all seen before, along with a simple workload in a port group.
The Required vSwitch or DVS Port Configuration
Something to point out is that when running this particular workload in a virtual machine, the standard vSwitch Port Group or DVS Port Group needs to be placed in promiscuous mode. This is because when the virtual machine talks to the physical server, the traffic leaving the virtual machine carries a source MAC address other than the one vSphere assigned to the virtual machine. You will notice we also have the vSwitch connected with redundant uplinks in a standard Active/Active setup, as most implementations are done. Because of this particular workload we see a condition similar to this:
- The virtual machine sends packets (multicast)
- The vSwitch sends it to ESXi-NIC1 to Switch-Left
- Switch-Left sends it to Switch-Right
- Switch-Right sends it back to ESXi-NIC2
- ESXi-NIC2 sends it to virtual machine (since the port group is configured in promiscuous mode)
Ultimately the workload inside the virtual machine cannot resolve the traffic going out and coming back into itself.
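The loop above can be sketched as a small toy model. This is only an illustration of the idea, not how ESXi actually implements its switching logic: the `Port` class, the `delivers` function, the MAC addresses, and the notion of a port being "pinned" to one uplink are all assumptions made for the example. The `reverse_path_check` flag stands in for a check of the kind described in the next section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical model of a VM's vSwitch port (not an ESXi data structure)."""
    vnic_mac: str        # MAC address vSphere assigned to the VM
    pinned_uplink: str   # uplink the teaming policy uses for this port (assumed)
    promiscuous: bool    # port group promiscuous mode


def delivers(port: Port, dst_mac: str, ingress_uplink: str,
             reverse_path_check: bool) -> bool:
    """Return True if a frame arriving on ingress_uplink reaches the VM."""
    # Assumed behaviour of the check: drop frames that arrive on an uplink
    # other than the one this port itself would send on.
    if reverse_path_check and ingress_uplink != port.pinned_uplink:
        return False
    # A promiscuous port accepts every frame; otherwise only frames for its MAC.
    return port.promiscuous or dst_mac == port.vnic_mac


port = Port(vnic_mac="00:50:56:aa:bb:cc", pinned_uplink="ESXi-NIC1",
            promiscuous=True)
multicast_dst = "01:00:5e:00:00:01"  # example multicast MAC

# The VM's own multicast frame left on ESXi-NIC1 and came back on ESXi-NIC2.
looped_without_check = delivers(port, multicast_dst, "ESXi-NIC2", False)
looped_with_check = delivers(port, multicast_dst, "ESXi-NIC2", True)
# Traffic arriving on the port's own uplink is still delivered.
normal_with_check = delivers(port, multicast_dst, "ESXi-NIC1", True)

print(looped_without_check, looped_with_check, normal_with_check)
```

In this model the looped frame reaches the virtual machine only when the check is off, which mirrors the condition described in the list above.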
How To Resolve The Conflict
We know that Promiscuous Mode must be enabled for this workload to function as a virtual machine, and we assume that people will use a separate port group on a Distributed vSwitch or standard vSwitch to maintain security. Given that, there are three options to resolve the conflict above:
- Use a Separate vSwitch with only ONE pNIC (Not ideal)
- Use a separate vSwitch and enable LACP on the two pNICs
- Use a separate vSwitch and set the following advanced setting on each host:
Net.ReversePathFwdCheckPromisc=1
In my testing I have found the third option to be the easiest, and it was not a setting I had used before. Essentially, a value of 1 turns on the reverse path forwarding check for ports in Promiscuous Mode, so the vSwitch drops frames that come back in on one teamed uplink after having gone out the other. It works perfectly. Once it is set you will see the workload function normally inside the virtual machine. The setting was actually suggested by some folks on the VMware Engineering side as something to try, and it worked.
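For anyone who prefers the command line to the host's advanced settings UI, here is a minimal sketch that builds the `esxcli` commands for setting and checking the option. It only prints the commands; it does not connect to anything. The host names are hypothetical placeholders, and the helper function names are my own.

```python
from typing import List

# Advanced option path as esxcli expects it (slash-separated form of
# Net.ReversePathFwdCheckPromisc).
OPTION = "/Net/ReversePathFwdCheckPromisc"


def build_set_command(value: int = 1) -> List[str]:
    """Build the esxcli command that sets the option to 0 or 1."""
    if value not in (0, 1):
        raise ValueError("value must be 0 or 1")
    return ["esxcli", "system", "settings", "advanced", "set",
            "-o", OPTION, "-i", str(value)]


def build_list_command() -> List[str]:
    """Build the esxcli command that shows the option's current value."""
    return ["esxcli", "system", "settings", "advanced", "list", "-o", OPTION]


# Hypothetical host names; replace with the hosts backing the port group.
hosts = ["esxi-01.lab.local", "esxi-02.lab.local"]

for host in hosts:
    print(f"{host}: {' '.join(build_set_command())}")
    print(f"{host}: {' '.join(build_list_command())}")
```

Each printed line is meant to be run in an ESXi shell or SSH session on the named host. As far as I know, VMware's guidance is that if the option is changed while the host is running, promiscuous mode needs to be disabled and re-enabled on the port group before the change takes effect, so check VMware's documentation for your version.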
I actually had a lab setup recently where the first option was easier to use, so any of the three are viable solutions. Note that you cannot just set one pNIC to “inactive” or standby. That will not work, and it was the first thing I tried. I am also in the process of experimenting with this workload with Mark Achtemichuk.
I just thought this was interesting and not something you would run into every day on most virtual machine workloads. I had never found the need for this setting before, but I guess you run into new things every day.