Fixing VMware Player on Linux when using DHCP addresses

The free VMware Player for Linux is a great application for running virtual machines on a desktop or laptop. It offers most of the virtualisation features of VMware's commercial products and of rival applications such as Oracle's VirtualBox or QEMU.

Problem with VMware Player v8+

Unfortunately, starting somewhere around version 8, VMware introduced a very annoying "feature" that makes it almost impossible to use VMware Player effectively when the host machine gets its address via DHCP (which is almost always the case).

Every time the DHCP lease of any of the host machine's network adapters is renewed, all virtual machines see a network disconnect followed by a reconnect, rendering the network unusable for roughly 20 seconds with each renewal. The network at Nikhef has a DHCP lease time of less than 300 seconds, which means that roughly every 5 minutes, all VMs on my laptop would lose their network connectivity.
Not a good thing.

Analysis

The root cause of this behaviour lies in the vmnet-natd daemon, which seems to be always running. Even when all VMs use virtual networks that do not involve NAT, this daemon is still active.
When a DHCP renewal has been processed by the Linux kernel networking stack, the vmnet-natd daemon picks it up, as can be seen in /var/log/messages:
  vmnet-natd: RTM_NEWADDR: index:2, addr:192.168.1.72
The daemon then calls a function in the vmnet module to handle the "change" (even though in 99% of cases the IP address was not altered, but simply renewed):
  kernel: userif-3: sent link down event.
  kernel: userif-3: sent link up event.
It turns out that these messages are printed by the vmnet module. Luckily, the sources for this module are included with every VMware Player installation, in the file
  /usr/lib/vmware/modules/source/vmnet-only.tar
This tar file is used to build the vmnet kernel module for the running kernel.

When we unpack this tar file and search for the "userif" log messages, we find the following snippet in userif.c:
  int
  VNetUserIfSetUplinkState(VNetPort *port, uint8 linkUp)
  {
  ...
     LOG(0, (KERN_NOTICE "userif-%d: sent link %s event.\n",
          userIf->port.id, linkUp ? "up" : "down"));

     return retval;
  }
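The unpack-and-search step described above could be sketched roughly as follows. The helper name find_in_module_source is hypothetical and the scratch directory is an assumption. Note that the source contains the format string "userif-%d", so the search has to use a fragment such as "sent link" instead of the literal "userif-3" seen in the log.

```shell
# Hypothetical helper: unpack a module source tarball into a scratch
# directory and print every line that contains the given fixed string.
# Arguments: <tarball> <search string>
find_in_module_source() {
    tarball=$1
    pattern=$2
    workdir=$(mktemp -d) || return 1
    if tar -xf "$tarball" -C "$workdir"; then
        # -r: recurse, -n: print line numbers, -F: fixed string, not a regex
        (cd "$workdir" && grep -rnF -- "$pattern" .)
        status=$?
    else
        status=1
    fi
    rm -rf "$workdir"    # clean up the scratch copy
    return "$status"
}

# Example call, using the path mentioned in the text:
# find_in_module_source /usr/lib/vmware/modules/source/vmnet-only.tar 'sent link'
```

Because the helper works on a scratch copy, the tarball shipped with the installation is left untouched.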

Solution

By adding three lines of code to userif.c, we disable the explicit sending of the "Link down" state to all virtual machines:
  --- vmnet-only/userif.c	2017-12-21 17:02:28.555820933 +0100
  +++ vmnet-only.jjk/userif.c	2017-12-15 13:22:13.257724953 +0100
  @@ -973,6 +973,9 @@
      userIf = (VNetUserIF *)port->jack.private;
      hubJack = port->jack.peer;
 
  +   /* never send link down events */
  +   if (!linkUp) return 0;
  +
      if (port->jack.state == FALSE || hubJack == NULL) {
         return -EINVAL;
      }
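Applying a patch like the one above might look roughly like the sketch below. It assumes the diff has been saved to a file and that the tarball unpacks into a vmnet-only directory, as the paths in the diff suggest. The helper name patch_vmnet_tarball is hypothetical. The rebuild command in the final comment is an assumption about the local installation and is deliberately not executed by the helper.

```shell
# Hypothetical helper: apply a patch to the vmnet sources inside the
# tarball and repack it in place, keeping a backup of the original.
# Arguments: <tarball> <patchfile>
patch_vmnet_tarball() {
    tarball=$1
    patchfile=$2
    workdir=$(mktemp -d) || return 1
    cp "$tarball" "$tarball.orig" || return 1      # backup of the unpatched sources
    tar -xf "$tarball" -C "$workdir" || return 1
    # -d: change into the source directory before patching
    # -p1: strip the leading "vmnet-only/" component from the paths in the diff
    patch -d "$workdir/vmnet-only" -p1 < "$patchfile" || return 1
    tar -cf "$tarball" -C "$workdir" vmnet-only || return 1
    rm -rf "$workdir"
}

# Example call (run as root; userif.patch is a placeholder file name):
# patch_vmnet_tarball /usr/lib/vmware/modules/source/vmnet-only.tar userif.patch
# Afterwards the kernel modules have to be rebuilt, for example with
# (assumption, check your installation):
# vmware-modconfig --console --install-all
```

Keeping the .orig copy makes it easy to go back to the unpatched module if a later VMware Player update ships different sources.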
You can also download the patch here. To apply this patch, follow these instructions:
To test, start a virtual machine and trigger a DHCP renewal. You should no longer see any network disruption inside the virtual machine. Check the /var/log/messages file for any "userif" messages. You should now only see the "sent link up event" line being printed:
  kernel: userif-3: sent link up event.
If you're still seeing "sent link down event" lines, then the patch was not applied correctly.
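The log check can be scripted. The following is a minimal sketch with a hypothetical helper name, count_link_events. It assumes the kernel messages end up in a plain-text file such as /var/log/messages. On systems that log elsewhere, the file argument needs adjusting.

```shell
# Hypothetical helper: count the "sent link <state> event" lines that the
# vmnet module wrote to a log file.
# Arguments: <logfile> <up|down>
count_link_events() {
    logfile=$1
    state=$2
    # -c: print the number of matching lines, -E: extended regex
    grep -cE "userif-[0-9]+: sent link ${state} event" "$logfile"
}

# Example: note the count, trigger a DHCP renewal, then compare.
# before=$(count_link_events /var/log/messages down)
# ... wait for, or trigger, a DHCP renewal ...
# after=$(count_link_events /var/log/messages down)
# [ "$after" -eq "$before" ] && echo "no new link down events"
```

Comparing the count before and after a renewal matters because entries written before the patch was applied remain in the log.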
Comments to Jan Just Keijser