Configuration

Leaf router configuration

In this example, the leaf routers of the network fabric use BGP unnumbered peering, so no routable IP is required on underlay links. Instead, the IPv6 link-local address (assigned automatically to each interface) is used.
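
If you want to double-check which address is used on a given link, the IPv6 link-local address can be displayed from a Linux shell with a generic iproute2 command (shown here for an interface named eth1; this is not the VSR CLI):

# Display the fe80::/10 link-local address used for BGP unnumbered peering on this link.
ip -6 addr show dev eth1 scope link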

Note

This deployment guide focuses on how to configure the HNA, so the spine routers are omitted.

Warning

In production, it is advised to set ebgp-requires-policy to true, and to configure relevant policies.

Customize and apply the following configuration (download link) on leaf1:

/ system license online serial HIDDEN
/ system hostname leaf1
/ system fast-path port pci-b0s4
/ system fast-path port pci-b0s5
/ vrf main interface physical eth1 port pci-b0s4
/ vrf main interface physical eth1 mtu 1550
/ vrf main interface physical eth2 port pci-b0s5
/ vrf main interface physical eth2 mtu 1550
/ vrf main interface loopback loop0 ipv4 address 192.168.200.1/32
/ vrf main routing bgp as 65000
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.200.1
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna1 remote-as 65001
/ vrf main routing bgp neighbor-group group-hna1 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna1 track bfd
/ vrf main routing bgp neighbor-group group-hna1 address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna2 remote-as 65002
/ vrf main routing bgp neighbor-group group-hna2 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna2 track bfd
/ vrf main routing bgp neighbor-group group-hna2 address-family l2vpn-evpn enabled true
/ vrf main routing bgp unnumbered-neighbor eth1 neighbor-group group-hna1
/ vrf main routing bgp unnumbered-neighbor eth2 neighbor-group group-hna2

Note

Take care to at least update the license serial and the PCI ports.
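
The PCI port names follow the pci-b<bus>s<slot> convention (for example, 0000:00:04.0 becomes pci-b0s4, as implemented by the bus_addr_to_name() helper shown later in this guide). If you have shell access to the machine, a standard command such as the following lists the candidate NICs and their PCI addresses:

# List Ethernet NICs with their full PCI addresses (domain:bus:slot.function).
lspci -D | grep -i ethernet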

Do the same for the configuration of leaf2 (download link):

/ system license online serial HIDDEN
/ system hostname leaf2
/ system fast-path port pci-b0s4
/ system fast-path port pci-b0s5
/ vrf main interface physical eth1 port pci-b0s4
/ vrf main interface physical eth1 mtu 1550
/ vrf main interface physical eth2 port pci-b0s5
/ vrf main interface physical eth2 mtu 1550
/ vrf main interface loopback loop0 ipv4 address 192.168.200.2/32
/ vrf main routing bgp as 65000
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.200.2
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna1 remote-as 65001
/ vrf main routing bgp neighbor-group group-hna1 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna1 track bfd
/ vrf main routing bgp neighbor-group group-hna1 address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna2 remote-as 65002
/ vrf main routing bgp neighbor-group group-hna2 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna2 track bfd
/ vrf main routing bgp neighbor-group group-hna2 address-family l2vpn-evpn enabled true
/ vrf main routing bgp unnumbered-neighbor eth1 neighbor-group group-hna1
/ vrf main routing bgp unnumbered-neighbor eth2 neighbor-group group-hna2

HNA configuration

Startup probe

Create a ConfigMap containing a script that will be used as a startup probe by the container: this startup-probe.sh script is executed by the container runtime inside the container to check whether it is ready (download link):

#!/bin/bash
# Succeed only once systemd reports the system as running (or degraded).
ret=$(systemctl is-system-running)
if [ "$ret" = "running" ] || [ "$ret" = "degraded" ]; then
	exit 0
fi
exit 1

Create the ConfigMap as follows (it will be referenced by the deployment file later):

root@node1:~# kubectl create configmap startup-probe --from-file=startup-probe.sh=startup-probe.sh

HNA deployment

The HNA pod takes the role of the HBR (Host Based Router). It provides network connectivity to the CNF Pods through a virtio or a veth interface. It runs on each Kubernetes node, which means the deployment type is a DaemonSet.

As described in the nc-k8s-plugin documentation, the multus-hna-hbr network must be present in the metadata annotations.

In the example below, the multus SR-IOV networks that correspond to the connections to leaf1 and leaf2 are called multus-sriov-1 and multus-sriov-2 respectively. You can get the names defined in your Kubernetes cluster with the following command:

root@node1:~# kubectl get --show-kind network-attachment-definitions

Similarly, the SR-IOV resources are called sriov/sriov1 and sriov/sriov2. You can get the names defined in your Kubernetes cluster with the following command:

root@node1:~# kubectl get -o yaml -n kube-system configMap sriovdp-config

The hna-vhost-user and hna-virtio-user volume mounts are used to share the virtio sockets needed to set up the virtual NICs.

The content of the deployment file deploy-hna.yaml is shown below (download link):

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: hna
spec:
  selector:
    matchLabels:
      role: hna
  template:
    metadata:
      labels:
        role: hna
      annotations:
         k8s.v1.cni.cncf.io/networks: multus-sriov-1,multus-sriov-2,multus-hna-hbr
    spec:
      restartPolicy: Always
      securityContext:
        appArmorProfile:
          type: Unconfined
        sysctls:
        - name: net.ipv4.conf.default.disable_policy
          value: "1"
        - name: net.ipv4.ip_local_port_range
          value: "30000 40000"
        - name: net.ipv4.ip_forward
          value: "1"
        - name: net.ipv6.conf.all.forwarding
          value: "1"
        - name: net.netfilter.nf_conntrack_events
          value: "1"
      containers:
      - image: download.6wind.com/vsr/x86_64-ce-vhost/3.11:3.11.0.ga
        imagePullPolicy: IfNotPresent
        name: hna
        startupProbe:
          exec:
            command: ["bash", "-c", "/bin/startup-probe"]
          initialDelaySeconds: 10
          failureThreshold: 20
          periodSeconds: 10
          timeoutSeconds: 9
        resources:
          limits:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            sriov/sriov1: 1
            sriov/sriov2: 1
            smarter-devices/ppp: 1
            nc-k8s-plugin.6wind.com/vhost-user-all: 1
          requests:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            sriov/sriov1: 1
            sriov/sriov2: 1
            smarter-devices/ppp: 1
            nc-k8s-plugin.6wind.com/vhost-user-all: 1
        env:
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        securityContext:
          capabilities:
            add: ["NET_ADMIN", "NET_RAW", "SYS_ADMIN", "SYS_NICE", "IPC_LOCK", "NET_BROADCAST", "SYSLOG", "SYS_TIME"
                 , "SYS_RAWIO"
                 ]
        volumeMounts:
        - mountPath: /dev/hugepages
          name: hugepage
        - mountPath: /dev/shm
          name: shm
        - mountPath: /tmp
          name: tmp
        - mountPath: /run
          name: run
        - mountPath: /run/lock
          name: run-lock
        - mountPath: /bin/startup-probe
          subPath: startup-probe.sh
          name: startup-probe
        stdin: true
        tty: true
      imagePullSecrets:
      - name: regcred
      volumes:
      - emptyDir:
          medium: HugePages
          sizeLimit: 2Gi
        name: hugepage
      - name: shm
        emptyDir:
          sizeLimit: "2Gi"
          medium: "Memory"
      - emptyDir:
          sizeLimit: "500Mi"
          medium: "Memory"
        name: tmp
      - emptyDir:
          sizeLimit: "200Mi"
          medium: "Memory"
        name: run
      - emptyDir:
          sizeLimit: "200Mi"
          medium: "Memory"
        name: run-lock
      - name: startup-probe
        configMap:
          name: startup-probe
          defaultMode: 0500

Note

In addition to the multus SR-IOV networks and the SR-IOV resources, other parameters of the deployment file, such as the CPUs, may need to be customized for your use case before it is applied.
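
To check which SR-IOV resource names, hugepages and CPUs a node actually advertises before adjusting the resource requests, you can for example inspect its allocatable resources:

root@node1:~# kubectl get node node1 -o jsonpath='{.status.allocatable}'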

Apply the DaemonSet with the following command:

root@node1:~# kubectl apply -f deploy-hna.yaml

Once applied, the pods should be running:

root@node1:~# kubectl get pod -o wide
NAME        READY   STATUS    RESTARTS   AGE   IP             NODE                 NOMINATED NODE   READINESS GATES
hna-hzj7l   1/1     Running   0          31s   10.229.0.119   node1                <none>           <none>
hna-ltlzw   1/1     Running   0          31s   10.229.1.123   node2                <none>           <none>
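
If a pod does not become ready, you can run the same check as the startup probe manually, for example:

root@node1:~# kubectl exec -it hna-hzj7l -- systemctl is-system-running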

HNA configuration template

The HNA Pods are configured automatically by the hna-operator. The configuration is generated from a Jinja2 template.

See also

For detailed instructions, please refer to the HNA Configuration Template section of the nc-k8s-plugin documentation.

This template relies on the Kubernetes database (list of Pods, list of nodes, custom resource definitions, …) to generate a valid CLI configuration that depends on the properties of the CNFs running on the node.

Here are some details about the template used in this document:

  • The list of PCI ports to configure on the Host Network Accelerator pod is retrieved from the Pod annotations (k8s.v1.cni.cncf.io/network-status); see the example after this list.

  • For each hna_net (i.e. a network registered by a running CNF pod), a specific fast path and interface configuration is added, which depends on the interface kind (veth or virtio-user).

  • Depending on the tenant of the CNF associated with the hna_net, the interface is added to the corresponding bridge, named bdg-<tenant>.

  • For each existing tenant, a bridge and a vxlan interface are created.

  • An HNA identifier hna_id is derived from the node name and used to build a unique IP for the HNA Pod (for example, node1 yields hna_id 1, hence the address 192.168.100.1).

  • A BGP configuration is used to peer with the leaf routers.

  • A KPI configuration is used to export metrics to an influxdb Pod on the Kubernetes cluster. This part is optional and can be removed.
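
For example, the annotation read by the template to discover the PCI ports can be inspected on a running HNA pod with a command such as:

root@node1:~# kubectl get pod hna-hzj7l -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'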

The content of the configuration template hna-config-template.nc-cli is shown below (download link):

# hack to ensure at least the license is applied
/ system license online serial HIDDEN
commit
del /
/ system license online serial HIDDEN
# Fast path
/ system fast-path enabled true
/ system fast-path core-mask fast-path max
/ system fast-path advanced power-mode eco
/ system fast-path advanced machine-memory 2048
/ system fast-path max-virtual-ports 16

# Physical ports
    {% set pci_ifaces = [] %}
    {% set hna_net_ifaces = [] %}
    {% for sriov in hna_pod.metadata.annotations["k8s.v1.cni.cncf.io/network-status"] |
       parse_json |
       selectattr('device-info', 'defined') |
       selectattr('device-info.type', 'eq', 'pci') %}
        {% set pci_iface = sriov["device-info"]["pci"]["pci-address"] | pci2name %}
        {% set _ = pci_ifaces.append(pci_iface) %}
/ system fast-path port {{pci_iface}}
/ vrf main interface physical {{pci_iface}} port {{pci_iface}}
/ vrf main interface physical {{pci_iface}} mtu 1600
    {% endfor %}

    {% set tenants = {} %}
# Virtual ports
    {% for hna_net in hna_nets.values() | selectattr('kind', 'ne', 'hbr') %}
        {% set pod = pods[hna_net.pod_name] %}
        {% set pod_role = pod.metadata.labels['role'] %}
        {% if hna_net.kind == "veth" %}
/ system fast-path virtual-port infrastructure infra-{{hna_net.name}}
            {% set hna_net_iface = "veth-" + pod_role %}
/ vrf main interface infrastructure {{hna_net_iface}} port infra-{{hna_net.name}}
        {% elif hna_net.kind == "virtio-user" %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}}
            {% if "profile" in hna_net.userdata %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} profile {{hna_net.userdata.profile}}
            {% endif %}
            {% if hna_net.userdata.socket_mode == "server" %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} socket-mode client
            {% else %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} socket-mode server
            {% endif %}
            {% set hna_net_iface = "vho-" + pod_role %}
/ vrf main interface fpvhost {{hna_net_iface}} port fpvhost-{{hna_net.name}}
        {% endif %}
        {% if 'tenant' in pod.metadata.labels and 'tenant_id' in pod.metadata.labels %}
            {% set tenant = pod.metadata.labels['tenant'] %}
            {% set tenant_id = pod.metadata.labels['tenant_id'] %}
            {% set _ = tenants.update({tenant: tenant_id}) %}
/ vrf main interface bridge bdg-{{tenant}} link-interface {{hna_net_iface}}
        {% endif %}
        {% set _ = hna_net_ifaces.append(hna_net_iface) %}
    {% endfor %}
    {% set hna_id = hna_net_hbr.spec.node_name | replace("node", "") %}
# Vxlan
    {% for tenant, tenant_id in tenants.items() %}
/ vrf main interface vxlan vxlan-{{tenant}} mtu 1500
/ vrf main interface vxlan vxlan-{{tenant}} vni {{tenant_id}}
/ vrf main interface vxlan vxlan-{{tenant}} local 192.168.100.{{hna_id}}
/ vrf main interface vxlan vxlan-{{tenant}} learning false
/ vrf main interface bridge bdg-{{tenant}} link-interface vxlan-{{tenant}} learning false
/ vrf main interface bridge bdg-{{tenant}} mtu 1550
    {% endfor %}

# Loopback
/ vrf main interface loopback loop0 ipv4 address 192.168.100.{{hna_id}}/32

# BGP
/ vrf main routing bgp as {{65000 + (hna_id | int)}}
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.100.{{hna_id}}
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn advertise-all-vni true
/ vrf main routing bgp neighbor-group group capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group remote-as 65000
/ vrf main routing bgp neighbor-group group track bfd
/ vrf main routing bgp neighbor-group group address-family l2vpn-evpn
    {% for pci_iface in pci_ifaces %}
/ vrf main routing bgp unnumbered-neighbor {{pci_iface}} neighbor-group group
/ vrf main routing bgp unnumbered-neighbor {{pci_iface}} ipv6-only true
    {% endfor %}
# KPIs
    {% for pci_iface in pci_ifaces %}
/ vrf main kpi telegraf metrics monitored-interface vrf main name {{pci_iface}}
    {% endfor %}
    {% for hna_net_iface in hna_net_ifaces %}
/ vrf main kpi telegraf metrics monitored-interface vrf main name {{hna_net_iface}}
    {% endfor %}
/ vrf main kpi telegraf metrics metric network-nic-traffic-stats enabled true period 3
/ vrf main kpi telegraf interval 5
/ vrf main kpi telegraf influxdb-output url http://influxdb.monitoring:8086 database telegraf

To make the template available to the hna-operator, store it in a ConfigMap with the following command:

root@node1:~# kubectl create configmap -n hna-operator hna-template --from-file=config.nc-cli=/root/hna-config-template.nc-cli

Note

Take care to at least update the license serial (it appears twice in the template).
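
You can verify the stored template afterwards, for example:

root@node1:~# kubectl get configmap -n hna-operator hna-template -o yaml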

CNF configuration

Bootstrap configuration

Each CNF runs a Virtual Service Router. Its configuration is generated automatically by an initContainer that runs the script below. The script generates a startup configuration inside the container, in /etc/init-config/config.cli, based on environment variables passed by Kubernetes. This CLI file is automatically applied when the Virtual Service Router container starts.

To demonstrate the two interface kinds, the red pods use veth-based interfaces, while the green ones use virtio-based interfaces. The dataplane IP addresses are simply generated from the pod identifier, as illustrated below.
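
As a hypothetical shell illustration of the addressing scheme (not part of the script): the digits of the pod role become the last octet of the dataplane address.

# For example, the role green2 yields the dataplane address 192.168.0.2/24.
role=green2
id=$(echo "$role" | tr -cd '0-9')
echo "192.168.0.${id}/24"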

The cnf-bootstrap.py Python script (download link):

#!/usr/bin/env python3
# Copyright 2025 6WIND S.A.

"""
This script is exported by Kubernetes in the VSR filesystem for the greenX and redX pods. It
is used to generate the startup configuration.
"""

import json
import os
import re
import subprocess
import sys

BUS_ADDR_RE = re.compile(r'''
    ^
    (?P<domain>([\da-f]+)):
    (?P<bus>([\da-f]+)):
    (?P<slot>([\da-f]+))\.
    (?P<func>(\d+))
    $
    ''', re.VERBOSE | re.IGNORECASE)

def bus_addr_to_name(bus_addr):
    """
    Convert a PCI bus address into a port name as used in nc-cli.
    """
    match = BUS_ADDR_RE.match(bus_addr)
    if not match:
        raise ValueError('pci bus address %s does not match regexp' % bus_addr)

    d = match.groupdict()
    domain = int(d['domain'], 16)
    bus = int(d['bus'], 16)
    slot = int(d['slot'], 16)
    func = int(d['func'], 10)

    name = 'pci-'
    if domain != 0:
        name += 'd%d' % domain
    name += 'b%ds%d' % (bus, slot)
    if func != 0:
        name += 'f%d' % func

    return name

def get_env_vm():
    """
    KubeVirt VM case: read the environment from /run/init-env.json (written by
    cloud-init) and find the HNA interface by its 00:09:c0 MAC address prefix.
    """
    with open('/run/init-env.json', encoding='utf-8') as f:
        env = json.load(f)
    env['HNA_IFNAME'] = subprocess.run(
        "ip -json -details link | jq --raw-output "
        "'.[] | select(has(\"linkinfo\") | not) | "
        "select(.address | match(\"00:09:c0\")) | .ifname'",
        shell=True, check=True, capture_output=True, text=True).stdout.strip()
    pci_addr = subprocess.run(
        rf"ethtool -i {env['HNA_IFNAME']} | sed -n 's,^bus-info: \(.*\)$,\1,p'",
        shell=True, check=True, capture_output=True, text=True).stdout.strip()
    env['HNA_PCIADDR'] = bus_addr_to_name(pci_addr)
    return env

def get_env_container():
    """
    Container case: read the environment variables of the container's PID 1
    (directly, or via /proc/1/environ when this script is not PID 1 itself).
    """
    if os.getpid() == 1:
        env = dict(os.environ)
    else:
        with open('/proc/1/environ', encoding='utf-8') as f:
            data = f.read()
        env = dict(var.split('=', 1) for var in data.split('\x00') if var)

    env['K8S_POD_ID'] = int(re.sub('[^0-9]', '', env['K8S_POD_ROLE']))
    env['KS8_POD_IP'] = os.environ.get('K8S_POD_IP')
    env['DEFAULT_ROUTE'] = subprocess.run(
        'ip -j route get 8.8.8.8 | jq -r .[0].gateway', shell=True,
        check=True, capture_output=True, text=True).stdout.strip()
    env['VETH_INFRA_ID'] = subprocess.run(
        "ip -j link | jq --raw-output "
        "'(.[] | select(.ifname | match(\"veth-[0-9a-f]{10}\"))) | .ifalias'",
        shell=True, check=True, capture_output=True, text=True).stdout.strip()
    return env

def get_env():
    if os.path.exists('/run/init-env.json'):
        env = get_env_vm()
    else:
        env = get_env_container()
    env['K8S_POD_ID'] = int(re.sub('[^0-9]', '', env['K8S_POD_ROLE']))
    return env

def gen_green_config(env):
    mac = f"de:ad:de:80:00:{env['K8S_POD_ID']:02x}"
    conf = ""
    if 'HNA_PCIADDR' in env:
        conf += """\
/ vrf main interface physical eth1 ethernet mac-address {mac}
/ vrf main interface physical eth1 port {HNA_PCIADDR}
/ vrf main interface physical eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path port {HNA_PCIADDR}
"""
    else:
        conf += """\
/ vrf main interface fpvirtio eth1 ethernet mac-address {mac}
/ vrf main interface fpvirtio eth1 port fpvirtio-0
/ vrf main interface fpvirtio eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path virtual-port fpvirtio fpvirtio-0
/ system fast-path max-virtual-ports 1
"""
    conf += """\
/ system fast-path advanced machine-memory 2048
/ system fast-path advanced power-mode eco
/ system license online serial HIDDEN
"""

    return conf.format(**env, mac=mac)

def gen_red_config(env):
    mac = f"de:ad:de:80:01:{env['K8S_POD_ID']:02x}"
    return """\
cmd license file import content {license_data} serial {license_serial} | ignore-error
/ vrf main interface infrastructure eth1 ethernet mac-address {mac}
/ vrf main interface infrastructure eth1 port {VETH_INFRA_ID}
/ vrf main interface infrastructure eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path virtual-port infrastructure {VETH_INFRA_ID}
/ system fast-path advanced machine-memory 2048
/ system fast-path advanced power-mode eco
/ system license online serial HIDDEN
""".format(**env, mac=mac)

def gen_config():
    env = get_env()
    if 'green' in env['K8S_POD_ROLE']:
        return gen_green_config(env)
    return gen_red_config(env)

def main():
    config = gen_config()
    if config[-1] != '\n':
        config += '\n'

    os.makedirs('/etc/init-config', exist_ok=True)
    with open('/etc/init-config/config.cli', 'w', encoding='utf-8') as f:
        f.write(config)
    if os.getpid() != 1:
        sys.stdout.write(config)

    return 0

if __name__ == '__main__':
    sys.exit(main())

Note

Take care to at least update the license serial.

To store this in a ConfigMap, run the following command on the Kubernetes control plane:

root@node1:~# kubectl create configmap cnf-bootstrap-config --from-file=cnf-bootstrap.py=/root/cnf-bootstrap.py

CNF Deployment

Now you can deploy the CNF Pods: green1, green2, green3, red1, red2. The green* Pods use a virtio connection, while the red* Pods use a veth connection.

In this document, we use a deployment file for each CNF to ease the placement of Pods on the different nodes: green1, green2, and red1 will have an affinity to node1, while the other ones will have an affinity to node2.

The content of the deployment file deploy-green1.yaml is shown below (download link):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: green1
spec:
  replicas: 1
  selector:
    matchLabels:
      role: green1
  template:
    metadata:
      labels:
        role: green1
        tenant: green
        tenant_id: "100"
      annotations:
         k8s.v1.cni.cncf.io/networks: multus-hna-virtio-user
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - node1
      restartPolicy: Always
      securityContext:
        appArmorProfile:
          type: Unconfined
        sysctls:
        - name: net.ipv4.conf.default.disable_policy
          value: "1"
        - name: net.ipv4.ip_local_port_range
          value: "30000 40000"
        - name: net.ipv4.ip_forward
          value: "1"
        - name: net.ipv6.conf.all.forwarding
          value: "1"
        - name: net.netfilter.nf_conntrack_events
          value: "1"
      initContainers:
      - name: bootstrap
        image: download.6wind.com/vsr/x86_64-ce/3.11:3.11.0.ga
        command: ["/sbin/bootstrap"]
        resources:
          limits:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            nc-k8s-plugin.6wind.com/virtio-user: 1
          requests:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            nc-k8s-plugin.6wind.com/virtio-user: 1
        env:
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: K8S_POD_ROLE
          value: green1
        - name: K8S_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: K8S_POD_CPU_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.cpu
        - name: K8S_POD_MEM_REQUEST
          valueFrom:
            resourceFieldRef:
              resource: requests.memory
        volumeMounts:
        - mountPath: /sbin/bootstrap
          subPath: cnf-bootstrap.py
          name: bootstrap
        - mountPath: /etc/init-config
          name: init-config
      containers:
      - image: download.6wind.com/vsr/x86_64-ce/3.11:3.11.0.ga
        imagePullPolicy: IfNotPresent
        name: green1
        startupProbe:
          exec:
            command: ["bash", "-c", "/bin/startup-probe"]
          initialDelaySeconds: 10
          failureThreshold: 20
          periodSeconds: 10
          timeoutSeconds: 9
        resources:
          limits:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            smarter-devices/ppp: 1
            smarter-devices/vhost-net: 1
            smarter-devices/net_tun: 1
            nc-k8s-plugin.6wind.com/virtio-user: 1
          requests:
            cpu: "2"
            memory: 2048Mi
            hugepages-2Mi: 1024Mi
            smarter-devices/ppp: 1
            smarter-devices/vhost-net: 1
            smarter-devices/net_tun: 1
            nc-k8s-plugin.6wind.com/virtio-user: 1
        env:
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        securityContext:
          capabilities:
            add: ["NET_ADMIN", "NET_RAW", "SYS_ADMIN", "SYS_NICE", "IPC_LOCK", "NET_BROADCAST", "SYSLOG", "SYS_TIME"
                 , "SYS_RAWIO"
                 ]
        volumeMounts:
        - mountPath: /dev/hugepages
          name: hugepage
        - mountPath: /dev/shm
          name: shm
        - mountPath: /tmp
          name: tmp
        - mountPath: /run
          name: run
        - mountPath: /run/lock
          name: run-lock
        - mountPath: /bin/startup-probe
          subPath: startup-probe.sh
          name: startup-probe
        - mountPath: /etc/init-config
          name: init-config
        stdin: true
        tty: true
      imagePullSecrets:
      - name: regcred
      volumes:
      - emptyDir:
          medium: HugePages
          sizeLimit: 2Gi
        name: hugepage
      - name: shm
        emptyDir:
          sizeLimit: "2Gi"
          medium: "Memory"
      - emptyDir:
          sizeLimit: "500Mi"
          medium: "Memory"
        name: tmp
      - emptyDir:
          sizeLimit: "200Mi"
          medium: "Memory"
        name: run
      - emptyDir:
          sizeLimit: "200Mi"
          medium: "Memory"
        name: run-lock
      - name: bootstrap
        configMap:
          name: cnf-bootstrap-config
          defaultMode: 0500
      - name: startup-probe
        configMap:
          name: startup-probe
          defaultMode: 0500
      - name: init-config
        emptyDir:
          sizeLimit: "10Mi"
          medium: "Memory"

To apply the deployment file, run the following command:

root@node1:~# kubectl apply -f deploy-green1.yaml

After some time, the pod should be visible as “Running”:

root@node1:~# kubectl get pod
NAME                      READY   STATUS    RESTARTS   AGE
green1-75667cbd6f-8kn74   1/1     Running   0          39s
hna-hzj7l                 1/1     Running   0          152m
hna-ltlzw                 1/1     Running   0          152m
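
To inspect the configuration generated by the bootstrap initContainer (assuming standard shell tools are available in the container), you can for example run:

root@node1:~# kubectl exec green1-75667cbd6f-8kn74 -- cat /etc/init-config/config.cli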

Log in to the pod with kubectl exec -it POD_NAME -- login (admin/admin is the default login/password), and list the interfaces:

green1-75667cbd6f-8kn74> show interface
Name   State L3vrf   IPv4 Addresses  IPv6 Addresses               Description
====   ===== =====   ==============  ==============               ===========
lo     UP    default 127.0.0.1/8     ::1/128                      loopback_main
eth0   UP    default 10.229.0.126/24 fe80::c437:61ff:feb5:165c/64 infra-eth0
eth1   UP    default 192.168.0.1/24  fe80::dcad:deff:fe80:1/64
fptun0 UP    default                 fe80::6470:74ff:fe75:6e30/64
  • eth0 is the primary CNI

  • eth1 is the virtio interface connected to the HNA

Note

eth1 may take some time to appear, since it requires the fast path to be started.

The content of the other deployment files is very similar (the only changes are the pod name, the node affinity, the tenant, and the hna_net kind). Here are the download links for each of them:
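
As a possible shortcut (assuming only the fields listed above differ), the green variants can be derived from deploy-green1.yaml with simple substitutions; the red deployments also change the tenant labels and the network annotation (veth kind), so edit them by hand:

# green2 also runs on node1, so only the name changes.
root@node1:~# sed 's/green1/green2/g' deploy-green1.yaml > deploy-green2.yaml
# green3 runs on node2, so the node affinity changes as well.
root@node1:~# sed -e 's/green1/green3/g' -e 's/node1/node2/g' deploy-green1.yaml > deploy-green3.yaml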

VNF Deployment with KubeVirt

KubeVirt is an open-source project that lets you run VMs alongside containers in a Kubernetes cluster. You can use KubeVirt to deploy your network function as a VM, and connect it to the HNA using Virtio interfaces:

  • on the VNF side, a Virtio PCI interface will be used,

  • on the HNA side, a vhost-user interface will be used.

Only virtio is supported by the HNA CNI when using a VM; veth interfaces cannot be used. So in our example, only the green pods can be instantiated as VMs.

This section explains how to deploy your Virtual Service Router as a VNF and connect it to the HNA. It requires the installation of a hook sidecar script, whose role is to add the VNF Virtio PCI ports connected to the HNA into the VM configuration, by modifying the libvirt XML domain description.

See also

  • Refer to the KubeVirt Installation section of the 6WIND HNA documentation for details about KubeVirt installation and configuration for HNA.

  • Refer to the kubevirt section of the nc-k8s-plugin documentation to deploy the hook sidecar script.

Load the hook sidecar script retrieved from nc-k8s-plugin documentation into a ConfigMap:

# kubectl create configmap kubevirt-sidecar --from-file=kubevirt_sidecar.py=/path/to/kubevirt_sidecar.py

Then, create a new NetworkAttachmentDefinition with the following content (download link):

apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: multus-hna-virtio-user-kubevirt
  annotations:
    k8s.v1.cni.cncf.io/resourceName: nc-k8s-plugin.6wind.com/virtio-user
spec:
  config: '{
  "cniVersion": "1.0.0",
  "name": "multus-hna-virtio-user-kubevirt",
  "type": "hna-cni",
  "kind": "virtio-user",
  "capabilities": {"CNIDeviceInfoFile": true, "deviceID": true},
  "log-level": "INFO",
  "log-file": "stderr",
  "userdata": {
    "socket_mode" : "server"
  }
}'

This NetworkAttachmentDefinition is similar to the default one provided in nc-k8s-plugin, except that it includes a userdata section specifying the socket mode. This userdata is used by the HNA configuration template.

To run a VM, KubeVirt expects a VirtualMachine resource. The content of this file, deploy-kubevirt-green1.yaml, is shown below (download link):

apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: green1
spec:
  runStrategy: Always
  template:
    metadata:
      annotations:
        hooks.kubevirt.io/hookSidecars:  >
          [
            {
              "args": ["--version", "v1alpha2"],
              "image": "quay.io/kubevirt/sidecar-shim:v1.5.2",
              "configMap": {"name": "kubevirt-sidecar", "key": "kubevirt_sidecar.py", "hookPath": "/usr/bin/onDefineDomain"}
            }
          ]
      labels:
        kubevirt.io/domain: green1
        role: green1
        tenant: green
        tenant_id: "100"
    spec:
      nodeSelector:
        kubernetes.io/hostname: vm-k8s-hypervisor
      domain:
        cpu:
          sockets: 1
          cores: 1
          threads: 2
          dedicatedCpuPlacement: true
        devices:
          disks:
            - name: ctdisk
              disk: {}
          filesystems:
            - name: bootstrap
              virtiofs: {}
          interfaces:
            - name: default
              macAddress: de:ad:de:01:02:03
              masquerade: {}
            - name: multus-1
              sriov: {}
        resources:
          requests:
            memory: 2048Mi
            nc-k8s-plugin.6wind.com/virtio-user: '1'
          limits:
            memory: 2048Mi
            nc-k8s-plugin.6wind.com/virtio-user: '1'
        memory:
          hugepages:
            pageSize: "2Mi"
      networks:
        - name: default
          pod: {}
        - name: multus-1
          multus:
            networkName: multus-hna-virtio-user-kubevirt
            default: false
      volumes:
      - name: ctdisk
        containerDisk:
          image: download.6wind.com/vsr/x86_64/3.11:3.11.0.ga
      - name: bootstrap
        configMap:
          name: cnf-bootstrap-config
      - name: cloudinitdisk
        cloudInitNoCloud:
          userData: |-
            #cloud-config
            bootcmd:
              - "echo '{ \"K8S_POD_ROLE\": \"green1\" }' > /run/init-env.json"
              - "mkdir /run/bootstrap_script"
              - "mount -t virtiofs bootstrap /run/bootstrap_script"
              - "mkdir /etc/init-config"
              - "python3 /run/bootstrap_script/cnf-bootstrap.py"

To apply the deployment file, run the following command:

root@node1:~# kubectl apply -f deploy-kubevirt-green1.yaml

After some time, the pod should be visible as “Running”. Note that KubeVirt creates several containers in the Pod (here, 6):

root@node1:~# kubectl get pod
NAME                         READY   STATUS    RESTARTS   AGE
virt-launcher-green1-7nwdj   6/6     Running   0          20m
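
To list the containers that KubeVirt created in this pod, you can for example run:

root@node1:~# kubectl get pod virt-launcher-green1-7nwdj -o jsonpath='{.spec.containers[*].name}'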

Log in to the VM with virtctl console green1 (admin/admin is the default login/password), and list the interfaces:

green1-vm-kubevirt> show interface
Name   State L3vrf   IPv4 Addresses IPv6 Addresses               Description
====   ===== =====   ============== ==============               ===========
lo     UP    default 127.0.0.1/8    ::1/128                      loopback_main
eth0   UP    default 10.0.2.2/24    fe80::dcad:deff:fe01:203/64
eth1   UP    default 192.168.0.1/24 fe80::dcad:deff:fe80:1/64
fptun0 UP    default                fe80::6470:74ff:fe75:6e30/64
  • eth0 is the primary CNI

  • eth1 is the virtio interface connected to the HNA

Note

eth1 may take some time to appear, since it requires the fast path to be started.