BGP configuration
Several elements must be known before building a BGP configuration.
Basic elements for configuration
To establish a peering, a BGP configuration needs the local AS number, the remote AS number, and the remote IP address.
An AS (autonomous system) is a set of routers under a single administrative authority. There are public (assigned) ASes and private ASes. An AS is identified by a number called the ASN.
Public ASNs are the registered ASN values that are not private; they are assigned by a RIR. Private ASNs are reserved for local administration and fall into 2 ranges: 64512 through 65534, and 4200000000 through 4294967294.
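As a minimal sketch of the private/public distinction, the helper below tests whether an ASN falls into one of the private ranges. The ranges are taken from RFC 6996 (the 16-bit range ends at 65534, since 65535 is reserved); the function name `is_private_asn` is an illustrative choice and not part of any product API.

```python
# Private ASN ranges as defined by RFC 6996.
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """Return True when asn falls inside one of the private ranges."""
    if not 0 <= asn <= 4294967295:
        raise ValueError(f"ASN out of range: {asn}")
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


# The ASNs used in the examples of this page (65501, 65502) are private.
for asn in (65501, 65502, 64511, 4200000000):
    print(asn, "private" if is_private_asn(asn) else "not private")
```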
BGP has been extended to exchange routing information not only for IPv4 routing tables, but also for IPv6 and for other purposes such as flowspec, L3VPN, or L2VPN.
The address-family command identifies the network protocol. It takes a pair of arguments: AFI, SAFI. For instance, IPv4, unicast is enabled by default and stands for the routing information of IPv4.
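To make the AFI, SAFI pair more concrete, the sketch below encodes the Multiprotocol Extensions capability that a BGP speaker sends in its OPEN message for each address family (RFC 4760). The numeric values are the IANA-assigned AFI and SAFI codes; the dictionary keys and the function name `mp_capability` are illustrative and do not correspond to CLI keywords.

```python
import struct

# IANA-assigned numbers for a few common address families.
AFI = {"ipv4": 1, "ipv6": 2, "l2vpn": 25}
SAFI = {"unicast": 1, "multicast": 2, "evpn": 70, "vpn": 128, "flowspec": 133}


def mp_capability(afi: str, safi: str) -> bytes:
    """Encode a Multiprotocol Extensions capability (RFC 4760, code 1).

    Layout: code (1 byte), length (1 byte), AFI (2 bytes),
    reserved (1 byte), SAFI (1 byte).
    """
    return struct.pack("!BBHBB", 1, 4, AFI[afi], 0, SAFI[safi])


print(mp_capability("ipv4", "unicast").hex())
print(mp_capability("ipv6", "unicast").hex())
print(mp_capability("ipv4", "vpn").hex())
```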
The following example shows a sample BGP configuration with both the IPv4 and IPv6 address families set:
vrf main
routing
bgp
router-id 10.125.0.1
as 65501
neighbor 10.125.0.3
remote-as 65502
address-family ipv6
..
..
commit
The same configuration can be made using this NETCONF XML configuration:
vsr running config# show config xml absolute vrf main routing bgp
<config xmlns="urn:6wind:vrouter">
<vrf>
<name>main</name>
<routing xmlns="urn:6wind:vrouter/routing">
<bgp xmlns="urn:6wind:vrouter/bgp">
<router-id>10.125.0.1</router-id>
<as>65501</as>
<neighbor>
<neighbor-address>10.125.0.3</neighbor-address>
<address-family>
<ipv6-unicast>
</ipv6-unicast>
</address-family>
<remote-as>65502</remote-as>
</neighbor>
</bgp>
</routing>
</vrf>
</config>
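When such a configuration is retrieved over NETCONF, it can be inspected with any XML library. The sketch below uses the Python standard library to pull the AS number and the neighbors out of the document shown above. It assumes the XML is already available as a string; the function name `summarize_bgp` and the returned dictionary layout are illustrative choices.

```python
import xml.etree.ElementTree as ET

CONFIG_XML = """\
<config xmlns="urn:6wind:vrouter">
  <vrf>
    <name>main</name>
    <routing xmlns="urn:6wind:vrouter/routing">
      <bgp xmlns="urn:6wind:vrouter/bgp">
        <router-id>10.125.0.1</router-id>
        <as>65501</as>
        <neighbor>
          <neighbor-address>10.125.0.3</neighbor-address>
          <address-family>
            <ipv6-unicast>
            </ipv6-unicast>
          </address-family>
          <remote-as>65502</remote-as>
        </neighbor>
      </bgp>
    </routing>
  </vrf>
</config>
"""

# Every element below <bgp> inherits the BGP namespace.
NS = {"bgp": "urn:6wind:vrouter/bgp"}


def summarize_bgp(xml_text: str) -> dict:
    """Extract router-id, local AS and neighbors from the XML configuration."""
    root = ET.fromstring(xml_text)
    bgp = root.find(".//bgp:bgp", NS)
    neighbors = []
    for nbr in bgp.findall("bgp:neighbor", NS):
        families = nbr.find("bgp:address-family", NS)
        neighbors.append({
            "address": nbr.findtext("bgp:neighbor-address", namespaces=NS),
            "remote_as": int(nbr.findtext("bgp:remote-as", namespaces=NS)),
            # Strip the "{namespace}" prefix from each child tag.
            "address_families": (
                [child.tag.split("}", 1)[1] for child in families]
                if families is not None else []
            ),
        })
    return {
        "router_id": bgp.findtext("bgp:router-id", namespaces=NS),
        "as": int(bgp.findtext("bgp:as", namespaces=NS)),
        "neighbors": neighbors,
    }


print(summarize_bgp(CONFIG_XML))
```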
Address families differ in subtle ways, and configuring the right one lets you benefit from its specifics.
For instance, the IPv6, unicast address-family provides 2 IPv6 next-hops: the link-local one and the global one.
Also, IPv4, vpn is the L3VPN combination for MPLS tunnels. While the routing information exchanged carries inner IPv4 prefixes, the MPLS VPN SAFI implies that the overlay is based on MPLS. The nexthop information stands for the underlay tunnel endpoint. Here, the nexthop may be either IPv4 or IPv6, independently of the inner IPv4 prefix. The nexthop also comes with the MPLS label identifier.
Note
You can also disable BGP, either by deleting the configuration:
vrf main
del routing bgp
..
Alternatively, to disable BGP without losing its configuration, use the following command:
vrf main
routing bgp
enabled false
This method can be used to force the peering sessions with remote BGP speakers to be re-established: disabling and then re-enabling BGP resets every peering.
A single session can also be reset. The example below shows how the session with 10.125.0.3 is flushed.
flush bgp vrf main ipv4 unicast neighbor 10.125.0.3
Note that this command can also selectively flush parts of the routing tables, such as the ADJ-RIB-IN, by appending the soft in keyword at the end of the command.
Another possibility is to disable and then re-enable the whole BGP instance.
vrf main
routing bgp enabled false
commit
routing bgp enabled true
commit
Basic BGP configuration
BGP configuration illustration with 3 BGP peerings
The diagram above depicts 3 devices, each running a BGP instance that peers with the two others. The configuration of the 3 devices is as follows:
rt1
routing bgp
router-id 10.1.1.1
as 65500
neighbor 10.1.1.2 remote-as 65510
neighbor 10.1.1.3 remote-as 65520
address-family ipv4-unicast redistribute connected
..
..
interface
physical eth1_0
ipv4 address 192.168.1.0/24
..
..
physical eth0_0
ipv4 address 10.1.1.1/28
..
..
rt2
routing bgp
router-id 10.1.1.2
as 65510
neighbor 10.1.1.1 remote-as 65500
neighbor 10.1.1.3 remote-as 65520
address-family ipv4-unicast redistribute connected
..
..
interface
physical eth1_0
ipv4 address 192.168.2.0/24
..
..
physical eth0_0
ipv4 address 10.1.1.2/28
..
..
rt3
routing bgp
router-id 10.1.1.3
as 65520
neighbor 10.1.1.1 remote-as 65500
neighbor 10.1.1.2 remote-as 65510
address-family ipv4-unicast redistribute connected
..
..
interface
physical eth1_0
ipv4 address 192.168.3.0/24
..
..
physical eth0_0
ipv4 address 10.1.1.3/28
..
..
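The three BGP stanzas above follow the same pattern: each router lists every other router as a neighbor. As a rough sketch, they can be generated from a small inventory. The inventory layout and the function name `bgp_stanza` are hypothetical, and the sketch assumes, as in this example, that each peer is reached at the address used as its router-id.

```python
# Inventory of the three devices used in this example.
ROUTERS = {
    "rt1": {"router_id": "10.1.1.1", "asn": 65500},
    "rt2": {"router_id": "10.1.1.2", "asn": 65510},
    "rt3": {"router_id": "10.1.1.3", "asn": 65520},
}


def bgp_stanza(name: str, routers: dict) -> list[str]:
    """Return the BGP configuration lines of one router in a full mesh."""
    me = routers[name]
    lines = [
        "routing bgp",
        f"router-id {me['router_id']}",
        f"as {me['asn']}",
    ]
    # One neighbor statement per other router of the mesh.
    for peer_name, peer in routers.items():
        if peer_name != name:
            lines.append(f"neighbor {peer['router_id']} remote-as {peer['asn']}")
    lines.append("address-family ipv4-unicast redistribute connected")
    lines += ["..", ".."]
    return lines


for router in ROUTERS:
    print(router)
    print("\n".join(bgp_stanza(router, ROUTERS)))
```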
After having executed the three configurations, the status of the BGP connections can be obtained. The peerings between the devices can be visualised with the following command:
rt1> show bgp summary
IPv4 Unicast Summary:
BGP router identifier 10.1.1.1, local AS number 65500 vrf-id 0
BGP table version 5
RIB entries 9, using 1368 bytes of memory
Peers 2, using 41 KiB of memory
Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
10.1.1.2        4      65510      17      17        0    0    0 00:09:08            4
10.1.1.3        4      65520      17      17        0    0    0 00:09:11            4
Total number of neighbors 2
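For monitoring purposes, the neighbor rows of this summary can be checked by a script. The sketch below assumes the FRR-style layout shown above, where the last column (State/PfxRcd) holds a number of received prefixes for an established session and a state name otherwise. The sample text is hypothetical: the second row was altered to show a session in the Active state, and `parse_summary` is an illustrative name.

```python
SUMMARY = """\
Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
10.1.1.2        4      65510      17      17        0    0    0 00:09:08            4
10.1.1.3        4      65520      17      17        0    0    0 00:09:11       Active
"""


def parse_summary(text: str) -> dict:
    """Map each neighbor address to its session status."""
    peers = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and any line that is not a neighbor row.
        if len(fields) < 10 or fields[0] == "Neighbor":
            continue
        # The state may span several words, so join the trailing fields.
        last = " ".join(fields[9:])
        if last.isdigit():
            peers[fields[0]] = {"established": True, "prefixes": int(last)}
        else:
            peers[fields[0]] = {"established": False, "state": last}
    return peers


for address, status in parse_summary(SUMMARY).items():
    print(address, status)
```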
When the BGP session is established, the State/PfxRcd column shows the number of prefixes received from the peer (4 in the output above); otherwise it shows the state of the BGP connection. The different BGP session states are described later in the section. The following command gives detailed BGP information about a given neighbor:
rt1> show bgp neighbor 10.1.1.2
BGP neighbor is 10.1.1.2, remote AS 65510, local AS 65500, external link
Hostname: rt1
BGP version 4, remote router ID 10.1.1.2
BGP state = Established, up for 00:14:00
Last read 00:01:00, Last write 00:01:00
Hold time is 180, keepalive interval is 60 seconds
Neighbor capabilities:
4 Byte AS: advertised and received
AddPath:
IPv4 Unicast: RX advertised IPv4 Unicast and received
Route refresh: advertised and received(old & new)
Address Family IPv4 Unicast: advertised and received
Hostname Capability: advertised (name: rt1,domain name: n/a)
received (name: rt2,domain name: n/a)
Graceful Restart Capabilty: advertised and received
Remote Restart timer is 120 seconds
Address families by peer:
none
Graceful restart informations:
End-of-RIB send: IPv4 Unicast
End-of-RIB received: IPv4 Unicast
Message statistics:
Inq depth is 0
Outq depth is 0
Sent Rcvd
Opens 1 1
Notifications: 0 0
Updates: 6 6
Keepalives: 14 14
Route Refresh: 0 0
Capability: 0 0
Total: 21 21
It is also possible to dump the list of BGP entries that rt1 learnt from the other peers, using the following command:
rt1> show bgp ipv4 unicast neighbors
BGP table version is 5, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
* 10.0.2.0/24 10.1.1.2 0 0 65510 ?
* 10.1.1.3 0 0 65520 ?
*> 0.0.0.0 0 32768 ?
* 10.1.1.0/28 10.1.1.2 0 0 65510 ?
* 10.1.1.3 0 0 65520 ?
*> 0.0.0.0 0 32768 ?
*> 192.168.1.0 0.0.0.0 0 32768 ?
* 192.168.2.0 10.1.1.2 0 65520 65510 ?
*> 10.1.1.2 0 0 65510 ?
* 192.168.3.0 10.1.1.3 0 65510 65520 ?
*> 10.1.1.3 0 0 65520 ?
Displayed 5 routes and 11 total paths
Peer-groups
Peer groups help scale deployments where many BGP instances peer with each other. Instead of configuring each peer one by one, it is possible to configure peer groups.
A peer group is defined by a name and accepts the same configuration as a single peer, except for the IP address of that peer.
You can use a peer group to define peers with outgoing peering; these peers inherit the configuration of the peer group. You can also use a peer group to permit incoming peering connections, like a server would do. The configuration below illustrates the usage of a peer group named group:
routing bgp
listen
neighbor-range 10.135.0.0/24 neighbor-group group
..
as 65502
neighbor-group group
address-family
ipv6-unicast
..
..
remote-as 65501
update-source 10.135.0.2
..
neighbor 10.145.0.2
neighbor-group group
..
..
By default, the BGP instance creates at most 100 peering connections. This limit is configurable and can be raised to support more peering connections:
routing bgp
listen limit 5000
..
With this extra configuration, incoming BGP connections that match the listen settings are accepted up to the new limit. The set of accepted incoming connections can also be restricted to a range of IP addresses, as the neighbor-range statement in the example above illustrates.
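The interaction between the address range and the limit can be pictured with a small model. This is only a sketch of the behavior described above, not the actual implementation: the class name `DynamicListener` and its methods are hypothetical.

```python
import ipaddress


class DynamicListener:
    """Toy model of a listen range combined with a peering limit."""

    def __init__(self, neighbor_range: str, limit: int = 100):
        self.network = ipaddress.ip_network(neighbor_range)
        self.limit = limit
        self.peers = set()

    def accept(self, address: str) -> bool:
        """Return True when an incoming connection would be accepted."""
        peer = ipaddress.ip_address(address)
        if peer not in self.network:
            return False  # source address outside the configured range
        if peer not in self.peers and len(self.peers) >= self.limit:
            return False  # limit reached, no new peer is created
        self.peers.add(peer)
        return True


listener = DynamicListener("10.135.0.0/24", limit=5000)
print(listener.accept("10.135.0.7"))   # inside the range
print(listener.accept("10.145.0.2"))   # outside the range
```

In the configuration above, 10.145.0.2 is outside the listen range, which is why it is declared explicitly as a neighbor of the group.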
Note
When handling a huge number of peering connections, for example more than 5000, you may experience timeouts on the BGP connections. This is due to the increased time spent processing incoming BGP packets. This time can be reduced by increasing the number of BGP packets processed per I/O cycle. With a huge number of peering connections, the following configuration is recommended:
routing bgp
packet-rw-quantum read 64
packet-rw-quantum write 200
..
As you can see, the packet-rw-quantum write value has been increased accordingly. It is recommended to keep that value above the packet-rw-quantum read value, to avoid memory exhaustion caused by buffers accumulating in the BGP process.
Note
With such a large number of sessions, it may be desirable for the server to detect the failure of a remote endpoint as quickly as possible. One possibility is to enable the tcp-keepalive parameter. tcp-keepalive packets must be answered by the remote TCP endpoint with an ack packet. After a defined number of consecutive probes for which no ack is seen, the TCP session is considered down. The configuration below declares a failure after 3 unanswered probes sent every 2 seconds.
routing bgp
tcp-keepalive idle 2 keepalive 2 probes 3
..
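As a rough estimate of the resulting detection delay, the sketch below assumes that the three parameters behave like the usual Linux TCP keepalive settings (idle time before the first probe, interval between probes, number of unanswered probes). This mapping is an assumption, and `detection_time` is an illustrative helper.

```python
def detection_time(idle: int, interval: int, probes: int) -> int:
    """Worst-case time, in seconds, to declare a silent peer down.

    The first probe is sent after `idle` seconds without traffic, then one
    probe every `interval` seconds; the session is dropped once `probes`
    probes have gone unanswered.
    """
    return idle + interval * probes


# Values of the configuration above: idle 2, keepalive 2, probes 3.
print(detection_time(idle=2, interval=2, probes=3))  # prints 8
```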
Route-Reflector
A route reflector is used in iBGP networks where the number of BGP peers becomes too large. Instead of a full mesh peering, a 1-N peering topology is used. A single BGP instance (or two, if a backup is needed) acts as the route reflector server: it receives BGP updates from the route reflector clients and reflects them to the other clients. This helps such setups scale. Creating a route reflector server consists in defining an iBGP peering session, either via a peer-group or by defining a peer directly. The option route-reflector-client must be set to true.
as 65501
neighbor-group group
address-family
ipv4-unicast
route-reflector-client true
..
..
remote-as 65501
neighbor 1.1.1.1
address-family
ipv4-unicast
route-reflector-client true
..
..
remote-as 65501
..
No extra configuration is needed on the iBGP clients.
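The scaling benefit can be quantified by counting iBGP sessions. The sketch below compares a full mesh with a route reflector topology; it assumes that the reflectors, when there are several, also peer with each other. The function names are illustrative.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers: int, reflectors: int = 1) -> int:
    """Number of iBGP sessions when clients only peer with the reflectors."""
    clients = routers - reflectors
    # Each client peers with each reflector; reflectors form a small mesh.
    return reflectors * clients + reflectors * (reflectors - 1) // 2


for n in (5, 20, 100):
    print(n, full_mesh_sessions(n),
          route_reflector_sessions(n),
          route_reflector_sessions(n, reflectors=2))
```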
Multipath
Overview
The BGP path selection process involves a series of criteria to determine the best path for a given prefix. Here is a summarized and ordered list of these criteria:
1. Highest Weight: Prefers the path with the highest weight.
2. Highest Local Preference: Routes with higher local preference are favored.
3. Originated Locally: Prefers routes originated by the local router.
4. Shortest AS-path: Chooses the route with the fewest AS hops.
5. Lowest Origin Code: Prefers IGP origin over EGP, and EGP over Incomplete.
6. Lowest MED: Selects the path with the smallest MED, but only compares MED for routes from the same AS.
7. eBGP over iBGP: Prefers external BGP paths over internal.
8. Shortest IGP Path to BGP Next Hop: Chooses the nearest next hop according to IGP metrics.
9. Oldest Path for eBGP: Prefers the oldest path from external BGP peers for stability.
10. Lowest Router ID: Selects the path through the BGP router with the lowest router ID.
11. Minimum Cluster List Length: In route reflection, prefers the path with the shortest cluster list.
12. Lowest Neighbor Address: Finally, the path from the neighbor with the lowest IP address is chosen if all other criteria are equal.
Traditionally, the BGP selection process chooses only one route to be installed into the RIB. This is not optimal, as only one path is used at a time. BGP multipath is a feature that allows multiple paths to be selected simultaneously, so that traffic is distributed across these paths. To be part of the multipath set, paths must be equally preferred with respect to criteria 1 (Weight) through 8 (Shortest IGP Path to BGP Next Hop). They must also either have the same AS-path for iBGP paths or be received from the same AS for eBGP paths. This technique enhances bandwidth usage and provides redundancy by using all eligible routes equally (using ECMP) or unequally (using UCMP). BGP multipath promotes load balancing and improves network reliability by rerouting traffic if a path fails.
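The selection logic described above can be sketched as a sort key followed by an equality test. This is a simplified model, not the actual implementation: it only covers criteria 1 through 8, it compares MED between all paths (the real process only compares MED for routes from the same AS), and the `Path` fields and function names are illustrative.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    next_hop: str
    as_path: tuple
    weight: int = 0
    local_pref: int = 100
    local: bool = False
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0


def preference_key(p: Path) -> tuple:
    """Criteria 1 to 8, arranged so that the smallest tuple is the best path."""
    return (-p.weight, -p.local_pref, not p.local, len(p.as_path),
            ORIGIN_RANK[p.origin], p.med, not p.ebgp, p.igp_metric)


def multipath_set(paths: list, maximum_path: int = 8) -> list:
    """Return the best path followed by the paths eligible for multipath."""
    best = min(paths, key=preference_key)
    selected = [best]
    for p in paths:
        if p is best or preference_key(p) != preference_key(best):
            continue
        if p.ebgp:
            same_source = p.as_path[:1] == best.as_path[:1]  # same neighbor AS
        else:
            same_source = p.as_path == best.as_path          # same AS-path
        if same_source and len(selected) < maximum_path:
            selected.append(p)
    return selected


paths = [
    Path("10.1.1.2", (65510, 65530)),
    Path("10.1.1.3", (65510, 65530)),
    Path("10.1.1.4", (65520, 65530)),         # received from another AS
    Path("10.1.1.5", (65510, 65540, 65530)),  # longer AS-path
]
print([p.next_hop for p in multipath_set(paths)])
```

With these hypothetical paths, only the first two next hops are retained: the third path comes from a different AS and the fourth loses on AS-path length.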
By default, BGP can handle up to 8 paths for a given prefix and install them into the RIB. The limit on the number of paths can be customized for each address family, and separately for iBGP and eBGP sessions. Setting the maximum-path value to 1 effectively disables the multipath feature.
router-id 10.125.0.1
address-family
ipv4-unicast
maximum-path
ebgp 1
ibgp 1
..
..
..
as 65501
Specific parameters can be used to relax the criteria for including paths in the BGP multipath selection. For example, the command bestpath as-path multipath-relax as-set relaxes the requirement for paths to have the same AS-path list (for iBGP) or to be received from the same AS (for eBGP). However, the condition of having the shortest AS-path is still enforced, meaning the paths must have the same number of AS hops as the best path.
bestpath
as-path
multipath-relax as-set
..
..
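The effect of the relaxation on the AS-path test can be summarized by a small predicate. This is a sketch of the rule as described in the text; the function name and its arguments are hypothetical, and AS-paths are modeled as plain tuples of AS numbers.

```python
def as_path_allows_multipath(best: tuple, candidate: tuple,
                             ebgp: bool, relax: bool = False) -> bool:
    """Tell whether a candidate AS-path may join the multipath set."""
    if len(candidate) != len(best):
        return False  # the length must always match the best path
    if relax:
        return True   # multipath-relax: the content may differ
    if ebgp:
        return candidate[:1] == best[:1]  # same neighbor AS
    return candidate == best              # identical AS-path


best = (65510, 65530)
other = (65520, 65530)
print(as_path_allows_multipath(best, other, ebgp=True))
print(as_path_allows_multipath(best, other, ebgp=True, relax=True))
```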
Another command, bestpath as-path ignore true, goes further by removing the requirement for paths to have the same number of ASes in the AS-path (criterion 4 in the list above). With this setting, the AS-path is entirely excluded from the path comparison process:
bestpath
as-path
ignore true
..
..
Refer to the command reference for more details.
BGP Add-Path
The BGP Additional Paths feature, commonly referred to as BGP Add-Path, extends BGP by allowing the advertisement of multiple paths for the same prefix in a BGP session. By default, BGP installs multiple paths to a prefix into the RIB but only advertises the best path according to the above-mentioned criteria. The Add-Path feature enables the advertisement of additional paths using the addpath tx-all-paths command. For example, the command addpath tx-all-paths true configures BGP to advertise all available paths.
neighbor 10.125.0.3
remote-as 65502
address-family
ipv4-unicast
addpath
tx-all-paths true
..
..
..
..
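On the wire, Add-Path works by prepending a 4-byte path identifier to each advertised prefix (RFC 7911), which lets the receiver tell apart several paths for the same prefix. The sketch below encodes an IPv4 NLRI with and without this identifier; the function name `encode_nlri` and the path identifier values are illustrative.

```python
import ipaddress
import struct
from typing import Optional


def encode_nlri(prefix: str, path_id: Optional[int] = None) -> bytes:
    """Encode a prefix as BGP NLRI, with an optional Add-Path identifier."""
    net = ipaddress.ip_network(prefix)
    # Only the bytes covered by the prefix length are transmitted.
    nbytes = (net.prefixlen + 7) // 8
    body = bytes([net.prefixlen]) + net.network_address.packed[:nbytes]
    if path_id is None:
        return body
    return struct.pack("!I", path_id) + body


# Same prefix advertised twice, with two distinct path identifiers.
print(encode_nlri("192.168.2.0/24").hex())
print(encode_nlri("192.168.2.0/24", path_id=1).hex())
print(encode_nlri("192.168.2.0/24", path_id=2).hex())
```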
Refer to the command reference for more details.
Unequal Cost Multipath
BGP Unequal Cost Multipath (UCMP) is an extension to the BGP multipath feature that allows load balancing traffic across multiple paths that have different bandwidth capacities. In standard networking practice, ECMP is typically used when paths have the same cost metric, meaning that the traffic can be distributed evenly among them. However, this is not optimal when the paths have different bandwidths or capacities.
BGP UCMP improves upon this by allowing routers to distribute traffic in proportion to the bandwidth of each available path. This means that a higher-capacity link can carry more traffic, which can improve overall network efficiency and utilization.
The implementation of UCMP involves:
Path Advertisement: Routers advertise paths along with their respective bandwidth capabilities using a special bandwidth extended-community.
Path Selection: When multiple paths to the same destination are available, the router can use the bandwidth information to decide how much traffic to send over each path.
BGP UCMP can be particularly useful in environments where there are significant differences in link bandwidths and where bandwidth optimization and efficient load balancing are critical.
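As an illustration of proportional distribution, the sketch below turns the bandwidth advertised for each next hop into integer weights. The bandwidth figures and the function name `ucmp_weights` are hypothetical, and the way a router actually derives its forwarding weights may differ.

```python
import math
from functools import reduce


def ucmp_weights(bandwidths: dict) -> dict:
    """Reduce per-next-hop bandwidths to the smallest integer weights."""
    divisor = reduce(math.gcd, bandwidths.values())
    return {next_hop: bw // divisor for next_hop, bw in bandwidths.items()}


# Bandwidths in Mbit/s for three hypothetical next hops.
links = {"10.1.1.2": 10000, "10.1.1.3": 10000, "10.1.1.4": 1000}
weights = ucmp_weights(links)
total = sum(weights.values())
for next_hop, weight in weights.items():
    print(f"{next_hop}: weight {weight}, {weight / total:.1%} of the traffic")
```

With ECMP, each of the three links would carry one third of the traffic regardless of its capacity.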
Warning
The current UCMP implementation is based on the IETF draft “draft-ietf-idr-link-bandwidth-07”. As the draft has not yet been finalized, interoperability of the UCMP feature is only guaranteed between Virtual Service Router devices. Routers from other vendors might not recognize the bandwidth extended-community correctly and could default to applying ECMP instead.
Bandwidth advertisement
Bandwidth information can be advertised using the bandwidth extended-community, which can be configured via a route-map. Either num-multipaths or a bandwidth value can be set. Refer to the command reference for details.
/ routing route-map BANDWIDTH seq 10 set extcommunity bandwidth <option>
The route-map is typically applied to the outbound configuration of a neighbor.
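For reference, the sketch below shows how the bandwidth extended-community is laid out according to the IETF link bandwidth draft: type 0x40, sub-type 0x04, a 2-byte AS number, then the bandwidth as a 4-byte IEEE floating point value in bytes per second. This reflects the draft as understood here and should be checked against the exact revision in use; the function names are illustrative.

```python
import struct


def encode_link_bandwidth(asn: int, bytes_per_second: float) -> bytes:
    """Encode a link bandwidth extended community (8 bytes)."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("the AS field of this community is 2 bytes wide")
    return struct.pack("!BBHf", 0x40, 0x04, asn, bytes_per_second)


def decode_link_bandwidth(community: bytes) -> tuple:
    """Return (asn, bytes_per_second) from an encoded community."""
    type_high, type_low, asn, bandwidth = struct.unpack("!BBHf", community)
    if (type_high, type_low) != (0x40, 0x04):
        raise ValueError("not a link bandwidth extended community")
    return asn, bandwidth


# 1 Gbit/s expressed in bytes per second, attached by hypothetical AS 65501.
community = encode_link_bandwidth(65501, 125_000_000.0)
print(community.hex())
print(decode_link_bandwidth(community))
```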
Path selection
By default, Virtual Service Router applies UCMP when all the paths for a given prefix include a bandwidth extended-community value. If this condition is not met, ECMP is performed instead. This behavior is customizable using the following command:
/ vrf <vrf> routing bgp bestpath extcommunity-bandwidth behavior <option>
The options are detailed in the command reference.
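The default behavior described above amounts to a simple test on the set of paths. The sketch below models it, with `None` standing for a path received without a bandwidth extended-community; the function name `balancing_mode` is illustrative and the other behaviors offered by the command are not modeled.

```python
def balancing_mode(path_bandwidths: list) -> str:
    """Return "ucmp" only when every path carries a bandwidth value."""
    if path_bandwidths and all(bw is not None for bw in path_bandwidths):
        return "ucmp"
    return "ecmp"


print(balancing_mode([10000, 1000]))        # every path has a bandwidth
print(balancing_mode([10000, None, 1000]))  # one path lacks the community
```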