Configuration¶
Leaf routers configuration¶
In this example, the leaf routers of the network fabric use BGP unnumbered peering, so no routable IP is required on underlay links. Instead, the IPv6 link-local address (assigned automatically to each interface) is used.
Note
Since this deployment guide focuses on how to configure the HNA, the spine routers are omitted.
Warning
In production, it is advised to set ebgp-requires-policy
to true, and to configure relevant policies.
Customize and apply the following configuration (download link) on leaf1:
/ system license online serial HIDDEN
/ system hostname leaf1
/ system fast-path port pci-b0s4
/ system fast-path port pci-b0s5
/ vrf main interface physical eth1 port pci-b0s4
/ vrf main interface physical eth1 mtu 1550
/ vrf main interface physical eth2 port pci-b0s5
/ vrf main interface physical eth2 mtu 1550
/ vrf main interface loopback loop0 ipv4 address 192.168.200.1/32
/ vrf main routing bgp as 65000
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.200.1
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna1 remote-as 65001
/ vrf main routing bgp neighbor-group group-hna1 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna1 track bfd
/ vrf main routing bgp neighbor-group group-hna1 address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna2 remote-as 65002
/ vrf main routing bgp neighbor-group group-hna2 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna2 track bfd
/ vrf main routing bgp neighbor-group group-hna2 address-family l2vpn-evpn enabled true
/ vrf main routing bgp unnumbered-neighbor eth1 neighbor-group group-hna1
/ vrf main routing bgp unnumbered-neighbor eth2 neighbor-group group-hna2
Note
Take care to at least update the license serial and the PCI ports.
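If you are unsure which PCI ports to declare, you can retrieve the PCI bus addresses of the NICs from a standard Linux shell on the target host; for example, address 0000:00:04.0 corresponds to the port name pci-b0s4 used above. The interface name eth1 in the second command is only an illustration:
# list Ethernet NICs and their PCI addresses
lspci | grep -i ethernet
# or retrieve the PCI address of a given interface (eth1 is just an example)
ethtool -i eth1 | grep bus-info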
Do the same for the configuration of leaf2 (download link):
/ system license online serial HIDDEN
/ system hostname leaf2
/ system fast-path port pci-b0s4
/ system fast-path port pci-b0s5
/ vrf main interface physical eth1 port pci-b0s4
/ vrf main interface physical eth1 mtu 1550
/ vrf main interface physical eth2 port pci-b0s5
/ vrf main interface physical eth2 mtu 1550
/ vrf main interface loopback loop0 ipv4 address 192.168.200.2/32
/ vrf main routing bgp as 65000
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.200.2
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna1 remote-as 65001
/ vrf main routing bgp neighbor-group group-hna1 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna1 track bfd
/ vrf main routing bgp neighbor-group group-hna1 address-family l2vpn-evpn enabled true
/ vrf main routing bgp neighbor-group group-hna2 remote-as 65002
/ vrf main routing bgp neighbor-group group-hna2 capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group-hna2 track bfd
/ vrf main routing bgp neighbor-group group-hna2 address-family l2vpn-evpn enabled true
/ vrf main routing bgp unnumbered-neighbor eth1 neighbor-group group-hna1
/ vrf main routing bgp unnumbered-neighbor eth2 neighbor-group group-hna2
HNA configuration¶
Startup probe¶
In a ConfigMap, add a script that will be used as a startup probe by
the container: this script, startup-probe.sh, is executed by the
container runtime inside the container to check whether it is ready
(download link):
ret=$(systemctl is-system-running)
# Ready once systemd reports "running" or "degraded"; otherwise fail so the probe is retried.
if [ "$ret" = "running" ] || [ "$ret" = "degraded" ]; then
    exit 0
fi
exit 1
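To check the probe logic before storing it in the ConfigMap, you can run the script manually on any systemd-based host and inspect its exit code (0 means ready):
root@node1:~# bash startup-probe.sh; echo "exit code: $?"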
Apply the ConfigMap like this (it will be used by the deployment
file later):
root@node1:~# kubectl create configmap startup-probe --from-file=startup-probe.sh=startup-probe.sh
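You can verify that the script was stored as expected:
root@node1:~# kubectl get configmap startup-probe -o yaml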
HNA deployment¶
The HNA Pod takes the role of the HBR (Host Based Router). It provides network connectivity to the CNF Pods through a virtio or a veth interface. It runs on each Kubernetes node, so it is deployed as a DaemonSet.
As described in the nc-k8s-plugin documentation, the
multus-hna-hbr network must be present in the metadata annotations.
In the example below, the multus SR-IOV networks that correspond to
the connections to leaf1 and leaf2 are called multus-sriov-1 and
multus-sriov-2 respectively. You can list the names defined in your
Kubernetes cluster with the following command:
root@node1:~# kubectl get --show-kind network-attachment-definitions
Similarly, the SR-IOV resources are called sriov/sriov1 and
sriov/sriov2. You can get the names defined in your Kubernetes cluster
with the following command:
root@node1:~# kubectl get -o yaml -n kube-system configMap sriovdp-config
The hna-vhost-user and hna-virtio-user mount volumes are used to
share the virtio sockets needed to set up the virtual NICs.
The content of the deployment file deploy-hna.yaml is shown below
(download link):
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: hna
spec:
selector:
matchLabels:
role: hna
template:
metadata:
labels:
role: hna
annotations:
k8s.v1.cni.cncf.io/networks: multus-sriov-1,multus-sriov-2,multus-hna-hbr
spec:
restartPolicy: Always
securityContext:
appArmorProfile:
type: Unconfined
sysctls:
- name: net.ipv4.conf.default.disable_policy
value: "1"
- name: net.ipv4.ip_local_port_range
value: "30000 40000"
- name: net.ipv4.ip_forward
value: "1"
- name: net.ipv6.conf.all.forwarding
value: "1"
- name: net.netfilter.nf_conntrack_events
value: "1"
containers:
- image: download.6wind.com/vsr/x86_64-ce-vhost/3.11:3.11.0.ga
imagePullPolicy: IfNotPresent
name: hna
startupProbe:
exec:
command: ["bash", "-c", "/bin/startup-probe"]
initialDelaySeconds: 10
failureThreshold: 20
periodSeconds: 10
timeoutSeconds: 9
resources:
limits:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
sriov/sriov1: 1
sriov/sriov2: 1
smarter-devices/ppp: 1
nc-k8s-plugin.6wind.com/vhost-user-all: 1
requests:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
sriov/sriov1: 1
sriov/sriov2: 1
smarter-devices/ppp: 1
nc-k8s-plugin.6wind.com/vhost-user-all: 1
env:
- name: K8S_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
securityContext:
capabilities:
add: ["NET_ADMIN", "NET_RAW", "SYS_ADMIN", "SYS_NICE", "IPC_LOCK", "NET_BROADCAST", "SYSLOG", "SYS_TIME"
, "SYS_RAWIO"
]
volumeMounts:
- mountPath: /dev/hugepages
name: hugepage
- mountPath: /dev/shm
name: shm
- mountPath: /tmp
name: tmp
- mountPath: /run
name: run
- mountPath: /run/lock
name: run-lock
- mountPath: /bin/startup-probe
subPath: startup-probe.sh
name: startup-probe
stdin: true
tty: true
imagePullSecrets:
- name: regcred
volumes:
- emptyDir:
medium: HugePages
sizeLimit: 2Gi
name: hugepage
- name: shm
emptyDir:
sizeLimit: "2Gi"
medium: "Memory"
- emptyDir:
sizeLimit: "500Mi"
medium: "Memory"
name: tmp
- emptyDir:
sizeLimit: "200Mi"
medium: "Memory"
name: run
- emptyDir:
sizeLimit: "200Mi"
medium: "Memory"
name: run-lock
- name: startup-probe
configMap:
name: startup-probe
defaultMode: 0500
Note
In addition to the multus SR-IOV networks and the SR-IOV resources that must be customized, other parameters of the deployment file (such as the CPU count) can be adapted to your use case before applying it.
Apply the DaemonSet with the following command:
root@node1:~# kubectl apply -f deploy-hna.yaml
Once applied, the pods should be running:
root@node1:~# kubectl get pod -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
hna-hzj7l 1/1 Running 0 31s 10.229.0.119 node1 <none> <none>
hna-ltlzw 1/1 Running 0 31s 10.229.1.123 node2 <none> <none>
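Optionally, verify that the three multus networks declared in the annotations were attached to one of the HNA Pods, since the configuration template of the next section relies on the resulting network-status annotation (adapt the Pod name to your cluster):
root@node1:~# kubectl get pod hna-hzj7l -o jsonpath='{.metadata.annotations.k8s\.v1\.cni\.cncf\.io/network-status}'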
HNA configuration template¶
The HNA Pods are configured automatically by the
hna-operator. The configuration is generated from a Jinja2 template.
See also
For detailed instructions, please refer to the HNA Configuration Template section of the nc-k8s-plugin documentation.
This template relies on the Kubernetes database (list of Pods, list of nodes, custom resource definitions, …) to generate a valid CLI configuration that depends on the properties of the CNFs running on the node.
Here are some details about the template used in this document:
- The list of PCI ports to configure on the Host Network Accelerator Pod is retrieved from the Pod annotations (k8s.v1.cni.cncf.io/network-status).
- For each hna_net (i.e. a network registered by a running CNF Pod), a specific fast path and interface configuration is added, which depends on the interface kind (veth or virtio-user).
- Depending on the tenant of the CNF associated to the hna_net, the interface is added into the proper bridge, whose name is the tenant name prefixed with bdg-. For each existing tenant, a bridge and a vxlan interface are created.
- An HNA identifier hna_id is derived from the node name, and used to build a unique IP address for the HNA Pod.
- A BGP configuration is used to peer with the leaf routers.
- A KPI configuration is used to export metrics to an influxdb Pod on the Kubernetes cluster. This part is optional and can be removed.
The content of the configuration template hna-config-template.nc-cli
is shown below (download link):
# hack to ensure at least the license is applied
/ system license online serial HIDDEN
commit
del /
/ system license online serial HIDDEN
# Fast path
/ system fast-path enabled true
/ system fast-path core-mask fast-path max
/ system fast-path advanced power-mode eco
/ system fast-path advanced machine-memory 2048
/ system fast-path max-virtual-ports 16
# Physical ports
{% set pci_ifaces = [] %}
{% set hna_net_ifaces = [] %}
{% for sriov in hna_pod.metadata.annotations["k8s.v1.cni.cncf.io/network-status"] |
parse_json |
selectattr('device-info', 'defined') |
selectattr('device-info.type', 'eq', 'pci') %}
{% set pci_iface = sriov["device-info"]["pci"]["pci-address"] | pci2name %}
{% set _ = pci_ifaces.append(pci_iface) %}
/ system fast-path port {{pci_iface}}
/ vrf main interface physical {{pci_iface}} port {{pci_iface}}
/ vrf main interface physical {{pci_iface}} mtu 1600
{% endfor %}
{% set tenants = {} %}
# Virtual ports
{% for hna_net in hna_nets.values() | selectattr('kind', 'ne', 'hbr') %}
{% set pod = pods[hna_net.pod_name] %}
{% set pod_role = pod.metadata.labels['role'] %}
{% if hna_net.kind == "veth" %}
/ system fast-path virtual-port infrastructure infra-{{hna_net.name}}
{% set hna_net_iface = "veth-" + pod_role %}
/ vrf main interface infrastructure {{hna_net_iface}} port infra-{{hna_net.name}}
{% elif hna_net.kind == "virtio-user" %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}}
{% if "profile" in hna_net.userdata %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} profile {{hna_net.userdata.profile}}
{% endif %}
{% if hna_net.userdata.socket_mode == "server" %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} socket-mode client
{% else %}
/ system fast-path virtual-port fpvhost fpvhost-{{hna_net.name}} socket-mode server
{% endif %}
{% set hna_net_iface = "vho-" + pod_role %}
/ vrf main interface fpvhost {{hna_net_iface}} port fpvhost-{{hna_net.name}}
{% endif %}
{% if 'tenant' in pod.metadata.labels and 'tenant_id' in pod.metadata.labels %}
{% set tenant = pod.metadata.labels['tenant'] %}
{% set tenant_id = pod.metadata.labels['tenant_id'] %}
{% set _ = tenants.update({tenant: tenant_id}) %}
/ vrf main interface bridge bdg-{{tenant}} link-interface {{hna_net_iface}}
{% endif %}
{% set _ = hna_net_ifaces.append(hna_net_iface) %}
{% endfor %}
{% set hna_id = hna_net_hbr.spec.node_name | replace("node", "") %}
# Vxlan
{% for tenant, tenant_id in tenants.items() %}
/ vrf main interface vxlan vxlan-{{tenant}} mtu 1500
/ vrf main interface vxlan vxlan-{{tenant}} vni {{tenant_id}}
/ vrf main interface vxlan vxlan-{{tenant}} local 192.168.100.{{hna_id}}
/ vrf main interface vxlan vxlan-{{tenant}} learning false
/ vrf main interface bridge bdg-{{tenant}} link-interface vxlan-{{tenant}} learning false
/ vrf main interface bridge bdg-{{tenant}} mtu 1550
{% endfor %}
# Loopback
/ vrf main interface loopback loop0 ipv4 address 192.168.100.{{hna_id}}/32
# BGP
/ vrf main routing bgp as {{65000 + (hna_id | int)}}
/ vrf main routing bgp network-import-check false
/ vrf main routing bgp router-id 192.168.100.{{hna_id}}
/ vrf main routing bgp ebgp-requires-policy false
/ vrf main routing bgp address-family ipv4-unicast redistribute connected
/ vrf main routing bgp address-family l2vpn-evpn advertise-all-vni true
/ vrf main routing bgp neighbor-group group capabilities extended-nexthop true
/ vrf main routing bgp neighbor-group group remote-as 65000
/ vrf main routing bgp neighbor-group group track bfd
/ vrf main routing bgp neighbor-group group address-family l2vpn-evpn
{% for pci_iface in pci_ifaces %}
/ vrf main routing bgp unnumbered-neighbor {{pci_iface}} neighbor-group group
/ vrf main routing bgp unnumbered-neighbor {{pci_iface}} ipv6-only true
{% endfor %}
# KPIs
{% for pci_iface in pci_ifaces %}
/ vrf main kpi telegraf metrics monitored-interface vrf main name {{pci_iface}}
{% endfor %}
{% for hna_net_iface in hna_net_ifaces %}
/ vrf main kpi telegraf metrics monitored-interface vrf main name {{hna_net_iface}}
{% endfor %}
/ vrf main kpi telegraf metrics metric network-nic-traffic-stats enabled true period 3
/ vrf main kpi telegraf interval 5
/ vrf main kpi telegraf influxdb-output url http://influxdb.monitoring:8086 database telegraf
To apply the template, run the following command:
root@node1:~# kubectl create configmap -n hna-operator hna-template --from-file=config.nc-cli=/root/hna-config-template.nc-cli
Note
Take care to at least update the license serial (it appears twice in the template).
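If you need to change the template later (for example to fix the license serial), you can regenerate the ConfigMap in place with the usual dry-run pattern. This only updates the ConfigMap; how quickly the hna-operator picks up the change depends on your setup:
root@node1:~# kubectl create configmap -n hna-operator hna-template --from-file=config.nc-cli=/root/hna-config-template.nc-cli --dry-run=client -o yaml | kubectl apply -f -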
CNFs configuration¶
Bootstrap configuration¶
The CNFs run a Virtual Service Router. The configuration is generated automatically
by an initContainer that runs the script below. This script
generates a startup configuration inside the container, in
/etc/init-config/config.cli, based on environment variables passed by
Kubernetes. This CLI file is automatically applied when the Virtual Service Router container
starts.
To demonstrate the two interface kinds, the red Pods use veth-based
interfaces, while the green ones use virtio-based interfaces. The
dataplane IP addresses are simply generated from the pod identifier.
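As an illustration of this derivation (the bootstrap script below does the equivalent in Python), the digits of the pod role give the pod identifier, which becomes the last byte of the dataplane address; green1 is used here as an example role:
# example with the green1 role: keep only the digits to get the pod identifier
role=green1
pod_id=$(echo "$role" | tr -cd '0-9')            # -> 1
echo "dataplane address: 192.168.0.${pod_id}/24" # -> 192.168.0.1/24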
The cnf-bootstrap.py python script (download link):
#!/usr/bin/env python3
# Copyright 2025 6WIND S.A.
"""
This script is exported by Kubernetes in the VSR filesystem for the greenX and redX pods. It
is used to generate the startup configuration.
"""
import json
import os
import re
import subprocess
import sys
BUS_ADDR_RE = re.compile(r'''
^
(?P<domain>([\da-f]+)):
(?P<bus>([\da-f]+)):
(?P<slot>([\da-f]+))\.
(?P<func>(\d+))
$
''', re.VERBOSE | re.IGNORECASE)
def bus_addr_to_name(bus_addr):
"""
Convert a PCI bus address into a port name as used in nc-cli.
"""
match = BUS_ADDR_RE.match(bus_addr)
if not match:
raise ValueError('pci bus address %s does not match regexp' % bus_addr)
d = match.groupdict()
domain = int(d['domain'], 16)
bus = int(d['bus'], 16)
slot = int(d['slot'], 16)
func = int(d['func'], 10)
name = 'pci-'
if domain != 0:
name += 'd%d' % domain
name += 'b%ds%d' % (bus, slot)
if func != 0:
name += 'f%d' % func
return name
def get_env_vm():
with open('/run/init-env.json', encoding='utf-8') as f:
env = json.load(f)
env['HNA_IFNAME'] = subprocess.run(
"ip -json -details link | jq --raw-output "
"'.[] | select(has(\"linkinfo\") | not) | "
"select(.address | match(\"00:09:c0\")) | .ifname'",
shell=True, check=True, capture_output=True, text=True).stdout.strip()
pci_addr = subprocess.run(
rf"ethtool -i {env['HNA_IFNAME']} | sed -n 's,^bus-info: \(.*\)$,\1,p'",
shell=True, check=True, capture_output=True, text=True).stdout.strip()
env['HNA_PCIADDR'] = bus_addr_to_name(pci_addr)
return env
def get_env_container():
if os.getpid() == 1:
env = dict(os.environ)
else:
with open('/proc/1/environ', encoding='utf-8') as f:
data = f.read()
env = dict((var.split('=') for var in data.split('\x00') if var))
env['K8S_POD_ID'] = int(re.sub('[^0-9]', '', env['K8S_POD_ROLE']))
    env['K8S_POD_IP'] = os.environ.get('K8S_POD_IP')
env['DEFAULT_ROUTE'] = subprocess.run(
'ip -j route get 8.8.8.8 | jq -r .[0].gateway', shell=True,
check=True, capture_output=True, text=True).stdout.strip()
env['VETH_INFRA_ID'] = subprocess.run(
"ip -j link | jq --raw-output "
"'(.[] | select(.ifname | match(\"veth-[0-9a-f]{10}\"))) | .ifalias'",
shell=True, check=True, capture_output=True, text=True).stdout.strip()
return env
def get_env():
if os.path.exists('/run/init-env.json'):
env = get_env_vm()
else:
env = get_env_container()
env['K8S_POD_ID'] = int(re.sub('[^0-9]', '', env['K8S_POD_ROLE']))
return env
def gen_green_config(env):
mac = f"de:ad:de:80:00:{env['K8S_POD_ID']:02x}"
conf = ""
if 'HNA_PCIADDR' in env:
conf += """\
/ vrf main interface physical eth1 ethernet mac-address {mac}
/ vrf main interface physical eth1 port {HNA_PCIADDR}
/ vrf main interface physical eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path port {HNA_PCIADDR}
"""
else:
conf += """\
/ vrf main interface fpvirtio eth1 ethernet mac-address {mac}
/ vrf main interface fpvirtio eth1 port fpvirtio-0
/ vrf main interface fpvirtio eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path virtual-port fpvirtio fpvirtio-0
/ system fast-path max-virtual-ports 1
"""
conf += """\
/ system fast-path advanced machine-memory 2048
/ system fast-path advanced power-mode eco
/ system license online serial HIDDEN
"""
return conf.format(**env, mac=mac)
def gen_red_config(env):
mac = f"de:ad:de:80:01:{env['K8S_POD_ID']:02x}"
return """\
cmd license file import content {license_data} serial {license_serial} | ignore-error
/ vrf main interface infrastructure eth1 ethernet mac-address {mac}
/ vrf main interface infrastructure eth1 port {VETH_INFRA_ID}
/ vrf main interface infrastructure eth1 ipv4 address 192.168.0.{K8S_POD_ID}/24
/ system fast-path virtual-port infrastructure {VETH_INFRA_ID}
/ system fast-path advanced machine-memory 2048
/ system fast-path advanced power-mode eco
/ system license online serial HIDDEN
""".format(**env, mac=mac)
def gen_config():
env = get_env()
if 'green' in env['K8S_POD_ROLE']:
return gen_green_config(env)
return gen_red_config(env)
def main():
config = gen_config()
if config[-1] != '\n':
config += '\n'
os.makedirs('/etc/init-config', exist_ok=True)
with open('/etc/init-config/config.cli', 'w', encoding='utf-8') as f:
f.write(config)
if os.getpid() != 1:
sys.stdout.write(config)
return 0
if __name__ == '__main__':
sys.exit(main())
Note
Take care to at least update the license serial.
To store this in a ConfigMap, run the following command on the Kubernetes
control plane:
root@node1:~# kubectl create configmap cnf-bootstrap-config --from-file=cnf-bootstrap.py=/root/cnf-bootstrap.py
CNF Deployment¶
Now you can deploy the CNF Pods: green1, green2, green3,
red1, red2. The green* Pods use a virtio connection, while
the red* Pods use a veth connection.
In this document, we use a deployment file for each CNF to ease the
placement of Pods on the different nodes: green1, green2, and
red1 will have an affinity to node1, while the other ones will have
an affinity to node2.
The content of the deployment file deploy-green1.yaml is shown below
(download link):
apiVersion: apps/v1
kind: Deployment
metadata:
name: green1
spec:
replicas: 1
selector:
matchLabels:
role: green1
template:
metadata:
labels:
role: green1
tenant: green
tenant_id: "100"
annotations:
k8s.v1.cni.cncf.io/networks: multus-hna-virtio-user
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: kubernetes.io/hostname
operator: In
values:
- node1
restartPolicy: Always
securityContext:
appArmorProfile:
type: Unconfined
sysctls:
- name: net.ipv4.conf.default.disable_policy
value: "1"
- name: net.ipv4.ip_local_port_range
value: "30000 40000"
- name: net.ipv4.ip_forward
value: "1"
- name: net.ipv6.conf.all.forwarding
value: "1"
- name: net.netfilter.nf_conntrack_events
value: "1"
initContainers:
- name: bootstrap
image: download.6wind.com/vsr/x86_64-ce/3.11:3.11.0.ga
command: ["/sbin/bootstrap"]
resources:
limits:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
nc-k8s-plugin.6wind.com/virtio-user: 1
requests:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
nc-k8s-plugin.6wind.com/virtio-user: 1
env:
- name: K8S_NODE_NAME
valueFrom:
fieldRef:
fieldPath: spec.nodeName
- name: K8S_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: K8S_POD_ROLE
value: green1
- name: K8S_POD_IP
valueFrom:
fieldRef:
fieldPath: status.podIP
- name: K8S_POD_CPU_REQUEST
valueFrom:
resourceFieldRef:
resource: requests.cpu
- name: K8S_POD_MEM_REQUEST
valueFrom:
resourceFieldRef:
resource: requests.memory
volumeMounts:
- mountPath: /sbin/bootstrap
subPath: cnf-bootstrap.py
name: bootstrap
- mountPath: /etc/init-config
name: init-config
containers:
- image: download.6wind.com/vsr/x86_64-ce/3.11:3.11.0.ga
imagePullPolicy: IfNotPresent
name: green1
startupProbe:
exec:
command: ["bash", "-c", "/bin/startup-probe"]
initialDelaySeconds: 10
failureThreshold: 20
periodSeconds: 10
timeoutSeconds: 9
resources:
limits:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
smarter-devices/ppp: 1
smarter-devices/vhost-net: 1
smarter-devices/net_tun: 1
nc-k8s-plugin.6wind.com/virtio-user: 1
requests:
cpu: "2"
memory: 2048Mi
hugepages-2Mi: 1024Mi
smarter-devices/ppp: 1
smarter-devices/vhost-net: 1
smarter-devices/net_tun: 1
nc-k8s-plugin.6wind.com/virtio-user: 1
env:
- name: K8S_POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
securityContext:
capabilities:
add: ["NET_ADMIN", "NET_RAW", "SYS_ADMIN", "SYS_NICE", "IPC_LOCK", "NET_BROADCAST", "SYSLOG", "SYS_TIME"
, "SYS_RAWIO"
]
volumeMounts:
- mountPath: /dev/hugepages
name: hugepage
- mountPath: /dev/shm
name: shm
- mountPath: /tmp
name: tmp
- mountPath: /run
name: run
- mountPath: /run/lock
name: run-lock
- mountPath: /bin/startup-probe
subPath: startup-probe.sh
name: startup-probe
- mountPath: /etc/init-config
name: init-config
stdin: true
tty: true
imagePullSecrets:
- name: regcred
volumes:
- emptyDir:
medium: HugePages
sizeLimit: 2Gi
name: hugepage
- name: shm
emptyDir:
sizeLimit: "2Gi"
medium: "Memory"
- emptyDir:
sizeLimit: "500Mi"
medium: "Memory"
name: tmp
- emptyDir:
sizeLimit: "200Mi"
medium: "Memory"
name: run
- emptyDir:
sizeLimit: "200Mi"
medium: "Memory"
name: run-lock
- name: bootstrap
configMap:
name: cnf-bootstrap-config
defaultMode: 0500
- name: startup-probe
configMap:
name: startup-probe
defaultMode: 0500
- name: init-config
emptyDir:
sizeLimit: "10Mi"
medium: "Memory"
To apply the deployment file, run the following command:
root@node1:~# kubectl apply -f deploy-green1.yaml
After some time, the pod should be visible as “Running”:
root@node1:~# kubectl get pod
NAME READY STATUS RESTARTS AGE
green1-75667cbd6f-8kn74 1/1 Running 0 39s
hna-hzj7l 1/1 Running 0 152m
hna-ltlzw 1/1 Running 0 152m
Log in to the pod with kubectl exec -it POD_NAME -- login
(admin/admin is the default login/password), and list the interfaces:
green1-75667cbd6f-8kn74> show interface
Name State L3vrf IPv4 Addresses IPv6 Addresses Description
==== ===== ===== ============== ============== ===========
lo UP default 127.0.0.1/8 ::1/128 loopback_main
eth0 UP default 10.229.0.126/24 fe80::c437:61ff:feb5:165c/64 infra-eth0
eth1 UP default 192.168.0.1/24 fe80::dcad:deff:fe80:1/64
fptun0 UP default fe80::6470:74ff:fe75:6e30/64
- eth0 is the primary CNI
- eth1 is the virtio interface connected to the HNA
Note
eth1 may take some time to appear, since it requires the fast path
to be started.
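You can also display the startup configuration generated by the bootstrap initContainer for this Pod (the Pod name below is an example, adapt it to your cluster):
root@node1:~# kubectl exec green1-75667cbd6f-8kn74 -- cat /etc/init-config/config.cli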
The content of the other deployment files is very similar (the only changes are
the pod name, the node affinity, the tenant, and the hna_net
kind); a quick way to compare them is shown after the list. Here are the download links for each of them:
- deploy-green1.yaml: (download link)
- deploy-green2.yaml: (download link)
- deploy-green3.yaml: (download link)
- deploy-red1.yaml: (download link)
- deploy-red2.yaml: (download link)
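Since the files differ only by these few fields, a quick diff between two of them shows exactly what to adapt:
root@node1:~# diff deploy-green1.yaml deploy-green2.yaml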
VNF Deployment with Kubevirt¶
KubeVirt is an open-source project that lets you run VMs alongside containers in a Kubernetes cluster. You can use KubeVirt to deploy your network function as a VM, and connect it to the HNA using Virtio interfaces:
- on the VNF side, a Virtio PCI interface will be used,
- on the HNA side, a Vhost-user interface will be used.
Only Virtio is supported by the HNA CNI when using a VM; veth interfaces cannot be used. So in our example, only the green pods can be instantiated as VMs.
This section explains how to deploy your Virtual Service Router as a VNF and connect it to the HNA. It requires the installation of a hook sidecar script, whose role is to add the VNF Virtio PCI ports connected to the HNA into the VM configuration, by modifying the libvirt XML domain description.
See also
Refer to the KubeVirt Installation section of the 6WIND HNA documentation for details about KubeVirt installation and configuration for HNA.
Refer to the kubevirt section of the nc-k8s-plugin documentation to deploy the hook sidecar script.
Load the hook sidecar script retrieved from the nc-k8s-plugin documentation into a ConfigMap:
# kubectl create configmap kubevirt-sidecar --from-file=kubevirt_sidecar.py=/path/to/kubevirt_sidecar.py
Then, create a new NetworkAttachmentDefinition with the following
content (download link):
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
name: multus-hna-virtio-user-kubevirt
annotations:
k8s.v1.cni.cncf.io/resourceName: nc-k8s-plugin.6wind.com/virtio-user
spec:
config: '{
"cniVersion": "1.0.0",
"name": "multus-hna-virtio-user-kubevirt",
"type": "hna-cni",
"kind": "virtio-user",
"capabilities": {"CNIDeviceInfoFile": true, "deviceID": true},
"log-level": "INFO",
"log-file": "stderr",
"userdata": {
"socket_mode" : "server"
}
}'
This NetworkAttachmentDefinition is similar to the default one
provided in nc-k8s-plugin, except that it includes a userdata
specifying a socket mode. This user data is used by the HNA
configuration template.
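Apply this NetworkAttachmentDefinition before creating the VM (the file name used here is only an example):
root@node1:~# kubectl apply -f multus-hna-virtio-user-kubevirt.yaml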
To run a VM, KubeVirt expects a VirtualMachine resource. The content
of this file, deploy-kubevirt-green1.yaml, is shown below (download
link):
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
name: green1
spec:
runStrategy: Always
template:
metadata:
annotations:
hooks.kubevirt.io/hookSidecars: >
[
{
"args": ["--version", "v1alpha2"],
"image": "quay.io/kubevirt/sidecar-shim:v1.5.2",
"configMap": {"name": "kubevirt-sidecar", "key": "kubevirt_sidecar.py", "hookPath": "/usr/bin/onDefineDomain"}
}
]
labels:
kubevirt.io/domain: green1
role: green1
tenant: green
tenant_id: "100"
spec:
nodeSelector:
kubernetes.io/hostname: vm-k8s-hypervisor
domain:
cpu:
sockets: 1
cores: 1
threads: 2
dedicatedCpuPlacement: true
devices:
disks:
- name: ctdisk
disk: {}
filesystems:
- name: bootstrap
virtiofs: {}
interfaces:
- name: default
macAddress: de:ad:de:01:02:03
masquerade: {}
- name: multus-1
sriov: {}
resources:
requests:
memory: 2048Mi
nc-k8s-plugin.6wind.com/virtio-user: '1'
limits:
memory: 2048Mi
nc-k8s-plugin.6wind.com/virtio-user: '1'
memory:
hugepages:
pageSize: "2Mi"
networks:
- name: default
pod: {}
- name: multus-1
multus:
networkName: multus-hna-virtio-user-kubevirt
default: false
volumes:
- name: ctdisk
containerDisk:
image: download.6wind.com/vsr/x86_64/3.11:3.11.0.ga
- name: bootstrap
configMap:
name: cnf-bootstrap-config
- name: cloudinitdisk
cloudInitNoCloud:
userData: |-
#cloud-config
bootcmd:
- "echo '{ \"K8S_POD_ROLE\": \"green1\" }' > /run/init-env.json"
- "mkdir /run/bootstrap_script"
- "mount -t virtiofs bootstrap /run/bootstrap_script"
- "mkdir /etc/init-config"
- "python3 /run/bootstrap_script/cnf-bootstrap.py"
To apply the deployment file, run the following command:
root@node1:~# kubectl apply -f deploy-kubevirt-green1.yaml
After some time, the pod should be visible as “Running”. Note that KubeVirt creates several containers in the Pod (here, 6):
root@node1:~# kubectl get pod
NAME READY STATUS RESTARTS AGE
virt-launcher-green1-7nwdj 6/6 Running 0 20m
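Since the workload is a VM, you can also check its state through the KubeVirt resources themselves:
root@node1:~# kubectl get vm green1
root@node1:~# kubectl get vmi green1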
Log in to the VM console with virtctl console green1 (admin/admin is the
default login/password), and list the interfaces:
green1-vm-kubevirt> show interface
Name State L3vrf IPv4 Addresses IPv6 Addresses Description
==== ===== ===== ============== ============== ===========
lo UP default 127.0.0.1/8 ::1/128 loopback_main
eth0 UP default 10.0.2.2/24 fe80::dcad:deff:fe01:203/64
eth1 UP default 192.168.0.1/24 fe80::dcad:deff:fe80:1/64
fptun0 UP default fe80::6470:74ff:fe75:6e30/64
- eth0 is the primary CNI
- eth1 is the virtio interface connected to the HNA
Note
eth1 may take some time to appear, since it requires the fast path
to be started.