1.3.1. ISSU procedure
To minimize network outage during a package upgrade, an ISSU (In Service Software Upgrade) procedure is available. During this procedure, the Virtual Accelerator instance being replaced is kept alive to process packets while the new Virtual Accelerator instance is installed and configured.
Here is a typical running Virtual Accelerator block diagram:
Virtual Accelerator switches traffic between VM taps and physical NICs. It takes its configuration from two sources: Linux - Fast Path Synchronization, which updates the network configuration shared memories, and the ovs-vswitchd daemon, which feeds a dedicated Open vSwitch shared memory.
The ISSU procedure consists of installing the new version of the product packages while Virtual Accelerator continues to forward packets. It is composed of the following steps, which must be performed in order:
ISSU procedure startup
New packages installation
Virtual Accelerator service restart, which leaves two Virtual Accelerator instances running at the same time; the old one still handles data traffic.
Open vSwitch service stop
Virtual Accelerator internal state copy from old to new product instance
Open vSwitch service restart
Old Virtual Accelerator instance is killed (traffic interruption); the new Virtual Accelerator instance takes over the VM taps and physical NICs and starts forwarding traffic.
ISSU procedure finalization
The fast-path.sh script provides several commands to handle the ISSU procedure:
ISSU procedure initiation
fast-path.sh upgrade start
This command marks the beginning of the ISSU procedure. It instructs Virtual Accelerator to keep the dataplane running during the next call to systemctl restart, instead of killing the old instance.
Virtual Accelerator configuration restoration
fast-path.sh upgrade restore-conf
This command copies internal state and configuration, such as Open vSwitch flows or port configuration (Fast Path QoS - Exception Rate Limitation, control plane protection, …), from the old Virtual Accelerator to the new one.
It can return a non-zero exit code if an error occurred during configuration retrieval. If this happens, it is the user's responsibility to restart the Virtual Accelerator service, so that Virtual Accelerator restarts properly without the ISSU procedure.
Virtual Accelerator switching
fast-path.sh upgrade takeover
This command kills the old Virtual Accelerator instance and instructs the new one to take over all VM taps and physical NICs released by the old instance. Traffic is interrupted from the moment the old Virtual Accelerator is killed until the new Virtual Accelerator has taken over all VM taps and physical NICs.
The Open vSwitch flow max idle time must be greater than the run time of this command, so that no flow is dropped during the takeover because its max idle time was reached. This idle time (15s in this example) can be configured with the following command:
ovs-vsctl set Open_vSwitch . other_config:max-idle=15000
It can return a non-zero exit code if an error occurred during the takeover process. If this happens, it is the user's responsibility to restart the Virtual Accelerator service, so that Virtual Accelerator restarts properly without the ISSU procedure.
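One way to choose the max-idle value is to take the expected takeover duration and add a safety margin. The sketch below only does that arithmetic and prints the resulting command without running it; the helper names `max_idle_ms` and `print_max_idle_cmd` and the default 5s margin are illustrative assumptions, not part of the product.

```shell
#!/bin/bash
# Hypothetical helper: convert an expected takeover duration (seconds)
# plus a safety margin (seconds, default 5) into milliseconds.
max_idle_ms() {
    local takeover_s=$1 margin_s=${2:-5}
    echo $(( (takeover_s + margin_s) * 1000 ))
}

# Print (without executing) the ovs-vsctl command for that value.
print_max_idle_cmd() {
    echo "ovs-vsctl set Open_vSwitch . other_config:max-idle=$(max_idle_ms "$@")"
}

# Example: a 10s takeover with a 5s margin gives max-idle=15000.
print_max_idle_cmd 10 5
```

The takeover duration itself has to be measured on a comparable system; the `--log` option of the takeover command may help with that.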
ISSU procedure finalization
fast-path.sh upgrade done
This command finalizes the ISSU procedure. Virtual Accelerator daemon processes are monitored during the whole ISSU process. If a daemon restarted during the ISSU process, Virtual Accelerator may be in an unstable state; in that case, this command returns a non-zero exit code.
It can return a non-zero exit code if the process monitoring scripts triggered during the ISSU procedure. If this happens, it is the user's responsibility to restart the Virtual Accelerator service, so that Virtual Accelerator restarts properly without the ISSU procedure.
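The restore-conf, takeover and done commands share the same failure handling: on a non-zero exit code, restart the service without ISSU. A minimal sketch of a wrapper for this pattern follows. The function names `issu_step` and `slow_restart` and the `FAST_PATH` and `SYSTEMCTL` variables are illustrative assumptions; the unit names are those used in the script example of this section.

```shell
#!/bin/bash
# Commands are held in variables so they can be overridden, e.g. for a dry run.
FAST_PATH=${FAST_PATH:-fast-path.sh}
SYSTEMCTL=${SYSTEMCTL:-systemctl}

# Restart the service without ISSU ("slow" restart).
slow_restart() {
    $SYSTEMCTL restart virtual-accelerator.target
    $SYSTEMCTL restart network.service
}

# Run one "upgrade" sub-command. On a non-zero exit code, fall back to a
# regular restart and report the failure to the caller.
issu_step() {
    local step=$1
    shift
    if ! $FAST_PATH upgrade "$step" "$@"; then
        echo "ISSU $step failure, force slow restart" >&2
        slow_restart
        return 1
    fi
}
```

A caller would then write, for instance, `issu_step restore-conf || exit 1`.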
ISSU script example
Here is a typical ISSU script:
# Start ISSU procedure
fast-path.sh upgrade start

# Upgrade packages
dnf update -y

# Restart service
systemctl restart virtual-accelerator.target
if ! fast-path.sh upgrade status; then
    if systemctl is-active virtual-accelerator.target; then
        # Failure during stop stage, already restarted in slow mode
        echo "ISSU failure, automatic slow restart done"
    else
        # Failure during restart stage, restart in slow mode
        echo "ISSU failure, force slow restart"
        systemctl restart virtual-accelerator.target
        systemctl restart network.service
    fi
    exit 1
fi

# Save ovs flows
ovs_bridges=$(ovs-vsctl -- --real list-br)
ovs_flows=$(/usr/share/openvswitch/scripts/ovs-save save-flows $ovs_bridges)

# Restart the database first, since a large database may take a
# while to load, and we want to minimize forwarding disruption.
systemctl --job-mode=ignore-dependencies restart ovsdb-server

# Stop ovs-vswitchd.
systemctl --job-mode=ignore-dependencies stop ovs-vswitchd

# Start vswitchd by asking it to wait till flow restore is finished.
ovs-vsctl --no-wait set open_vswitch . other_config:flow-restore-wait="true"
systemctl --job-mode=ignore-dependencies start ovs-vswitchd

# Restore configuration
eval "$ovs_flows"

# Sync fastpath internal state from old to new va
if ! fast-path.sh upgrade restore-conf; then
    # Failure during sync stage, restart in slow mode
    echo "ISSU sync failure, force slow restart"
    systemctl restart virtual-accelerator.target
    systemctl restart network.service
    exit 1
fi

# Restore OVS normal operation
ovs-vsctl --if-exists remove open_vswitch . other_config flow-restore-wait="true"

# Kill old instance, and takeover needed resources
if ! fast-path.sh upgrade takeover; then
    # Failure during takeover stage, force restart in slow mode
    echo "ISSU takeover failure, force slow restart"
    systemctl restart virtual-accelerator.target
    systemctl restart network.service
    exit 1
fi

# Finalize procedure and check process monitoring
if ! fast-path.sh upgrade done; then
    # Monitoring triggered a process restart during ISSU, force restart in slow mode
    echo "Daemon failure during ISSU, force slow restart"
    systemctl restart virtual-accelerator.target
    systemctl restart network.service
    exit 1
fi

echo "ISSU upgrade done"
The following paragraphs briefly describe the commands used in the script to carry out the ISSU procedure.
ISSU upgrade procedure startup
# Start ISSU procedure
fast-path.sh upgrade start
This command is used to mark the beginning of the ISSU procedure.
Packages update
# Upgrade packages
dnf update -y
The distribution's packaging tool is used to install the new packages. This assumes that the repository configuration is correct and that this update command actually updates the Virtual Accelerator packages.
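Because the procedure relies on the update actually replacing packages, a script may want to verify this before restarting the service. The sketch below compares the installed package list before and after the update. The helper names and the `PKG_PATTERN` filter are hypothetical; the pattern must be replaced with the real product package names.

```shell
#!/bin/bash
# Hypothetical pattern matching the product packages; adjust to the real names.
PKG_PATTERN=${PKG_PATTERN:-virtual-accelerator}

# List installed packages matching the pattern, sorted for stable comparison.
snapshot_pkgs() {
    rpm -qa | grep -- "$PKG_PATTERN" | sort
}

# Succeed only if the contents of the two snapshot files differ.
pkgs_changed() {
    [[ "$(<"$1")" != "$(<"$2")" ]]
}
```

A typical use would take one snapshot into a file before `dnf update -y` and one after, and skip the rest of the ISSU procedure when `pkgs_changed` fails.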
Virtual Accelerator restart
# Restart service
systemctl restart virtual-accelerator.target
The service is restarted with the regular service restart command, but the new Virtual Accelerator instance runs alongside the old one:
ovs-vswitchd is still connected to the old Virtual Accelerator instance, which continues to forward traffic between VM taps and physical NICs, while the new Virtual Accelerator instance sets itself up and synchronizes its configuration through Linux - Fast Path Synchronization. The new Virtual Accelerator instance runs on the same set of cores as the old instance, but does not consume CPU cycles, since it does not forward any traffic at this time.
ovs-vswitchd stop
# Save ovs flows
ovs_bridges=$(ovs-vsctl -- --real list-br)
ovs_flows=$(/usr/share/openvswitch/scripts/ovs-save save-flows $ovs_bridges)

# Restart the database first, since a large database may take a
# while to load, and we want to minimize forwarding disruption.
systemctl --job-mode=ignore-dependencies restart ovsdb-server

# Stop ovs-vswitchd.
systemctl --job-mode=ignore-dependencies stop ovs-vswitchd
These commands save the Open vSwitch flows of all bridges, restart ovsdb-server, and stop ovs-vswitchd.
Fastpath Open vSwitch flows recovery
fast-path.sh upgrade restore-conf
Now that ovs-vswitchd is not running, the restore-conf command can be issued to recover the current Open vSwitch flows and internal Open vSwitch structures from the old Virtual Accelerator to the new one.
ovs-vswitchd restart
# Start vswitchd by asking it to wait till flow restore is finished.
ovs-vsctl --no-wait set open_vswitch . other_config:flow-restore-wait="true"
systemctl --job-mode=ignore-dependencies start ovs-vswitchd

# Restore configuration
eval "$ovs_flows"
These commands are used to restart the ovs-vswitchd daemon in flow-restore-wait mode. The ovs-vswitchd daemon will now be connected to the new Virtual Accelerator instance.
Virtual Accelerator takeover
fast-path.sh upgrade takeover [--log] [--wait-link [t]]
This command switches between the old and new Virtual Accelerator. The old Virtual Accelerator is killed, and the new one starts processing incoming traffic.
Options:
--wait-link can be used to specify a time to wait for all links to recover their previous state before returning from the command. Time is specified in seconds. If t is omitted, a default timeout of 10s is used.
--log can be used to log additional timing information that can help to debug timing issues during the takeover process.
Returned values:
1 if an internal error occurs during the takeover process.
2 if --wait-link is specified and the timeout occurs before all physical links are properly up.
3 if --wait-link is specified and the timeout occurs before all virtual ports are properly up, all physical ports being properly up.
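A script can use these return values to report what went wrong. The following sketch maps an exit code to a message; the function name `describe_takeover_rc` is illustrative, and treating 0 as success is an assumption based on the usual shell convention, since the list above only documents the error values.

```shell
#!/bin/bash
# Map the exit code of "fast-path.sh upgrade takeover" to a message.
# Codes 1 to 3 come from the documented return values; 0 as success is
# an assumption (usual shell convention).
describe_takeover_rc() {
    case $1 in
        0) echo "takeover succeeded" ;;
        1) echo "internal error during takeover" ;;
        2) echo "timeout: physical links not all up" ;;
        3) echo "timeout: virtual ports not all up" ;;
        *) echo "unexpected exit code $1" ;;
    esac
}
```

It would typically be called right after the takeover, for example `fast-path.sh upgrade takeover --wait-link 20; describe_takeover_rc $?`, where 20 is an arbitrary timeout in seconds.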
ISSU finalization
# Restore normal operation
ovs-vsctl --if-exists remove open_vswitch . other_config flow-restore-wait="true"

# Finalize procedure and check process monitoring
if ! fast-path.sh upgrade done; then
    # Monitoring triggered a process restart during ISSU, force restart in slow mode
    echo "Daemon failure during ISSU, force slow restart"
    systemctl restart virtual-accelerator.target
    systemctl restart network.service
    exit 1
fi
Finally, the normal mode of operation of ovs-vswitchd can be restored and the ISSU procedure can be finalized.