Return virtual interface generated by cni

2021-05-11

There is currently a bug in my Nomad firecracker-task-driver: the auto-generated interfaces are not cleaned up when a VM exits. The CNI documentation for the firecracker-go-sdk does not specify how to get the name of such an interface, and the reason is that the SDK does not expose it. Grokking the code, I found this gem: https://github.com/firecracker-microvm/firecracker-go-sdk/blob/5a976634b5266be3e717eea03a6c840af6731d8f/network.go

delNetworkFunc := func() error {
	err := cniPlugin.DelNetworkList(ctx, networkConf, runtimeConf)
	if err != nil {
		return errors.Wrapf(err, "failed to delete CNI network list %q", cniConf.NetworkName)
	}
	return nil
}

The interface is deleted on error, but there is no cleanup function that runs on VM stop: https://github.com/firecracker-microvm/firecracker-go-sdk/blob/5a976634b5266be3e717eea03a6c840af6731d8f/machine.go#L591

func (m *Machine) stopVMM() error {
	if m.cmd != nil && m.cmd.Process != nil {
		m.logger.Debug("stopVMM(): sending sigterm to firecracker")
		err := m.cmd.Process.Signal(syscall.SIGTERM)
		if err != nil && !strings.Contains(err.Error(), "os: process already finished") {
			return err
		}
		return nil
	}
	m.logger.Debug("stopVMM(): no firecracker process running, not sending a signal")

	// don't return an error if the process isn't even running
	return nil
}
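To picture the kind of hook that seems to be missing on the stop path, here is a minimal sketch of a cleanup stack. The cleanupStack type and its push and run methods are hypothetical names made up for this post, not part of firecracker-go-sdk, and the closures only stand in for delNetworkFunc and stopVMM.

```go
package main

import "fmt"

// cleanupStack is a hypothetical helper (not an SDK type). It collects
// teardown functions and runs them in reverse registration order.
type cleanupStack struct {
	funcs []func() error
}

// push registers a teardown function to be called when the VM stops.
func (c *cleanupStack) push(f func() error) {
	c.funcs = append(c.funcs, f)
}

// run calls every registered function, last registered first. It keeps
// going after a failure and returns the first error it saw, so one
// failing step does not leave later resources behind.
func (c *cleanupStack) run() error {
	var first error
	for i := len(c.funcs) - 1; i >= 0; i-- {
		if err := c.funcs[i](); err != nil && first == nil {
			first = err
		}
	}
	c.funcs = nil // make a second run a no-op
	return first
}

func main() {
	var stack cleanupStack

	// Stand-in for delNetworkFunc: registered when the network is set up.
	stack.push(func() error {
		fmt.Println("deleting CNI network list")
		return nil
	})

	// Stand-in for the stop path: signal the process, then clean up.
	fmt.Println("sending SIGTERM to firecracker")
	if err := stack.run(); err != nil {
		fmt.Println("cleanup failed:", err)
	}
}
```

The point of the sketch is only the ordering: the network teardown is registered at setup time and executed on every stop, not just on the error path.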

It seems to be a known issue: https://github.com/weaveworks/weave/issues/3406

This script was proposed there, but it relies on the interface name containing wepl: it only matches names starting with vethwepl and treats the digits in the name as a PID.

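#!/bin/bash
# Find host-side veth names of the form vethwepl...@ and delete those whose
# embedded PID no longer belongs to a running process.
ip a | grep 'vethwepl.*\@' -oP | while read -r line ; do
    veth=${line::-1}  # strip the trailing '@'
    if [[ $veth =~ [0-9] ]]; then
      echo "check $veth"
      pid=$(echo "$veth" | tr -dc '0-9')  # keep only the digits
      if ! ps -p "$pid" > /dev/null; then
        echo "deleting $veth"
        ip link delete "$veth" >&2
      else
        echo "$veth still running"
      fi
    else
      echo "$veth veth has no number in it and will not be deleted"
    fi
done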
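The selection logic of that script can be sketched in Go to show why it does not help with randomly named interfaces. This is an illustration under assumptions: pidFromVeth and staleVeths are hypothetical helpers, the liveness check is injected as a callback instead of calling ps, and the PID is parsed strictly from what follows the prefix (the script is looser and keeps every digit in the name).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// pidFromVeth extracts the PID embedded in a host-side interface name such
// as "vethwepl1234". It reports false when the name does not start with
// prefix or when the remainder is not a positive number.
func pidFromVeth(name, prefix string) (int, bool) {
	if !strings.HasPrefix(name, prefix) {
		return 0, false
	}
	pid, err := strconv.Atoi(strings.TrimPrefix(name, prefix))
	if err != nil || pid <= 0 {
		return 0, false
	}
	return pid, true
}

// staleVeths returns the names whose embedded PID is no longer alive
// according to the alive callback. Names without a parsable PID are
// skipped, which is exactly the limitation for randomly named veths.
func staleVeths(names []string, prefix string, alive func(pid int) bool) []string {
	var stale []string
	for _, name := range names {
		pid, ok := pidFromVeth(name, prefix)
		if ok && !alive(pid) {
			stale = append(stale, name)
		}
	}
	return stale
}

func main() {
	// Made-up interface names for illustration only.
	names := []string{"vethwepl100", "vethwepl200", "veth3f9a1c2b"}

	// Fake liveness check: pretend only PID 100 is still running.
	alive := func(pid int) bool { return pid == 100 }

	fmt.Println(staleVeths(names, "vethwepl", alive))
}
```

A name like veth3f9a1c2b carries no PID at all, so nothing in it tells the cleanup which process it belonged to.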

Root cause

The root cause is that hostVethName is always passed as "":

https://github.com/containernetworking/plugins/blob/8de0287741e448a0a398b571030bcfa9243e4504/pkg/ip/link_linux.go#L178

// SetupVeth sets up a pair of virtual ethernet devices.
// Call SetupVeth from inside the container netns.  It will create both veth
// devices and move the host-side veth into the provided hostNS namespace.
// On success, SetupVeth returns (hostVeth, containerVeth, nil)
func SetupVeth(contVethName string, mtu int, hostNS ns.NetNS) (net.Interface, net.Interface, error) {
	return SetupVethWithName(contVethName, "", mtu, hostNS)
}

The functionality to name the host veth interface exists, but SetupVeth hardcodes the name as "", so a random name is used. https://github.com/containernetworking/plugins/blob/8de0287741e448a0a398b571030bcfa9243e4504/pkg/ip/link_linux.go#L135

// SetupVethWithName sets up a pair of virtual ethernet devices.
// Call SetupVethWithName from inside the container netns.  It will create both veth
// devices and move the host-side veth into the provided hostNS namespace.
// hostVethName: If hostVethName is not specified, the host-side veth name will use a random string.
// On success, SetupVethWithName returns (hostVeth, containerVeth, nil)
func SetupVethWithName(contVethName, hostVethName string, mtu int, hostNS ns.NetNS) (net.Interface, net.Interface, error) {
	hostVethName, contVeth, err := makeVeth(contVethName, hostVethName, mtu)
	if err != nil {
		return net.Interface{}, net.Interface{}, err
	}

A proper fix requires more thought, since everything relies on that assumption. As an easy hack, I used the ifName as the host veth name and added a vm suffix, so that the container and host interface names differ.
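The naming part of the hack can be sketched as follows. The helper name hostVethName is hypothetical, and the truncation step is my own assumption rather than something from the plugin code: Linux limits interface names to 15 characters (IFNAMSIZ is 16 bytes including the terminating NUL), so a long ifName has to be shortened before the suffix is appended.

```go
package main

import "fmt"

// maxIfNameLen is the longest interface name Linux accepts:
// IFNAMSIZ is 16 bytes, including the terminating NUL.
const maxIfNameLen = 15

// hostVethName is a hypothetical helper that derives a predictable
// host-side veth name from the container-side ifName by appending "vm".
// Assumption: ifName is plain ASCII, so byte slicing is safe.
func hostVethName(ifName string) string {
	const suffix = "vm"
	if len(ifName)+len(suffix) > maxIfNameLen {
		// Shorten the base name so the result still fits the limit.
		ifName = ifName[:maxIfNameLen-len(suffix)]
	}
	return ifName + suffix
}

func main() {
	fmt.Println(hostVethName("eth0")) // prints "eth0vm"
	fmt.Println(len(hostVethName("averylonginterfacename")) <= maxIfNameLen)
}
```

With a helper like this, the result would be passed as the second argument of SetupVethWithName in place of "", which makes the host interface name known in advance and therefore possible to delete on VM stop. Note that truncation can make two long names collide, so this remains a hack and not a general fix.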