BGP configuration options¶
The BGP routing protocol is very rich and offers many options. This section covers the most commonly used and most useful ones.
Aggregation¶
The main goal of aggregation is to reduce the number of network prefixes that are announced into the Internet. In practice, aggregation is required when the prefix length is too long: your peers, or the peers of your peers, may filter overly specific prefixes in order to limit the number of prefixes they carry.
However, route aggregation can introduce network loops or black holes when it is not configured properly.
Note
A BGP router can advertise an aggregated network only if at least one route of the aggregate network is in the BGP table. For example, if we consider four networks 192.168.0.0/24 through 192.168.3.0/24, the BGP router can advertise the aggregate network 192.168.0.0/22 only if at least one of these networks (192.168.0.0/24 through 192.168.3.0/24) is in the BGP table.
If all the sub-networks of an aggregated network go down, this aggregated network will not be advertised.
It is recommended to check that the aggregated network is not stopped by an Access List.
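The advertisement rule described in this note can be sketched in Python. This is only an illustration of the rule, not the router's implementation; the function name `can_advertise_aggregate` and the list-based BGP table are assumptions made for the example.

```python
import ipaddress

def can_advertise_aggregate(aggregate, bgp_table):
    """Return True if at least one more-specific route of `aggregate` is in `bgp_table`."""
    agg = ipaddress.ip_network(aggregate)
    for prefix in bgp_table:
        net = ipaddress.ip_network(prefix)
        # A contributing route is contained in the aggregate and is not the aggregate itself.
        if net != agg and net.subnet_of(agg):
            return True
    return False

# One sub-network of 192.168.0.0/22 is present: the aggregate can be advertised.
print(can_advertise_aggregate("192.168.0.0/22", ["192.168.2.0/24", "10.0.0.0/8"]))
# No sub-network is present: the aggregate is not advertised.
print(can_advertise_aggregate("192.168.0.0/22", ["10.0.0.0/8"]))
```

The first call prints True and the second prints False, which mirrors the behaviour when all the sub-networks go down.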
The aggregation of the IPv4 network prefixes within the BGP tables can be done with the following command:
vrouter running bgp# address-family ipv4-unicast aggregate-address PREFIX/M [summary-only true|false] [as-set true|false]
The aggregate command originates a new prefix. This raises the question of how to summarize the AS-PATHs of the aggregated routes. There are two solutions:
The AS-PATH is dropped, although this could introduce network loops.
The AS-PATH is summarized within an unordered set (AS-SET), although this could create black holes.
No aggregation flags¶
When neither the summary-only flag nor the as-set flag is set, a route with the aggregated PREFIX/M is originated from the BGP router. However, the sub-prefixes are still advertised.
Example
routing bgp
as 65520
address-family
ipv4-unicast
network 192.168.3.0/24
..
network 192.168.2.0/24
..
aggregate-address 192.168.0.0/22
..
..
neighbor 10.1.1.1
remote-as 65510
..
neighbor 10.1.1.6
remote-as 65530
..
After rt1 peers with rt2, and rt2 peers with rt3, rt1 receives the following RIB entries:
rt1> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 10.1.1.2 0 65520 i
*> 192.168.0.0 10.1.1.2 0 65520 65530 i
*> 192.168.1.0 10.1.1.2 0 65520 65530 i
*> 192.168.2.0 10.1.1.2 0 0 65520 i
*> 192.168.3.0 10.1.1.2 0 0 65520 i
Displayed 5 routes and 5 total paths
rt1> show bgp ipv4 unicast prefix 192.168.0.0/22
BGP routing table entry for 192.168.0.0/22
Paths: (1 available, best #1, table Default-IP-Routing-Table)
Advertised to non peer-group peers:
10.1.1.2
65520, (aggregated by 65520 10.1.1.2)
10.1.1.2 from 10.1.1.2 (10.1.1.2)
Origin IGP, localpref 100, valid, external, atomic-aggregate, best
AddPath ID: RX 0, TX 6
Last update: Fri Sep 28 16:11:02 2018
Note
The aggregated prefix has the attribute atomic-aggregate, which means that the AS information is lost for the aggregate prefix (192.168.0.0/22).
To avoid advertising the sub-prefixes, set the summary-only flag, or define a prefix-list or a distribute-list that filters them.
Moreover, this aggregated prefix is received by rt3 too.
rt3> show ipv4-routes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route
B>* 192.168.0.0/22 [20/0] via 10.1.1.5, ntfp2, 00:03:34
B>* 192.168.2.0/24 [20/0] via 10.1.1.5, ntfp2, 00:03:34
B>* 192.168.3.0/24 [20/0] via 10.1.1.5, ntfp2, 00:03:34
Summary-only aggregation flag¶
When the summary-only flag is set and the as-set flag is not set, only the route with the aggregated PREFIX/M is originated from the BGP router. The sub-prefixes are not advertised. Moreover, the AS number and the router ID of the aggregating router are recorded with the route (displayed as "aggregated by" in the route details), which helps traffic engineering.
Example
rt2 running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 summary-only true
If the summary-only flag is set, the router advertises only the aggregate prefix. On the router that advertises the aggregate prefix, the sub-prefixes are marked as suppressed, so the remote peers only see the aggregate prefix.
rt2> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 0.0.0.0 32768 i
s> 192.168.0.0 10.1.1.6 0 0 65530 i
s> 192.168.1.0 10.1.1.6 0 0 65530 i
s> 192.168.2.0 0.0.0.0 0 32768 i
s> 192.168.3.0 0.0.0.0 0 32768 i
Displayed 5 routes and 5 total paths
The sub-prefixes which have been suppressed are labeled s.
On the remote peer, only the route to 192.168.0.0/22 is present in the BGP RIB.
rt1> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 10.1.1.2 0 65520 i
rt3 also receives the aggregated route, in addition to the sub-prefixes that it originates itself.
rt3> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 10.1.1.5 0 65520 i
*> 192.168.0.0 0.0.0.0 0 32768 i
*> 192.168.1.0 0.0.0.0 0 32768 i
Displayed 3 routes and 3 total paths
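The effect of the summary-only flag on what a router advertises can be sketched as follows. This is a simplified model under stated assumptions, not the router's code: the helper name `advertised_prefixes` is hypothetical, and policies such as prefix-lists are ignored.

```python
import ipaddress

def advertised_prefixes(aggregate, bgp_table, summary_only=False):
    """Return the prefixes advertised to peers for a configured aggregate."""
    agg = ipaddress.ip_network(aggregate)
    nets = {ipaddress.ip_network(p) for p in bgp_table}
    # Contributing routes: more-specific prefixes covered by the aggregate.
    contributors = {n for n in nets if n != agg and n.subnet_of(agg)}
    advertised = set(nets)
    if contributors:
        # The aggregate is originated only when a contributing route exists.
        advertised.add(agg)
        if summary_only:
            # summary-only suppresses the sub-prefixes.
            advertised -= contributors
    return [str(n) for n in sorted(advertised)]

table = ["192.168.0.0/24", "192.168.1.0/24", "192.168.2.0/24", "192.168.3.0/24"]
print(advertised_prefixes("192.168.0.0/22", table, summary_only=True))
print(advertised_prefixes("192.168.0.0/22", table, summary_only=False))
```

With summary_only=True, only 192.168.0.0/22 is returned; with summary_only=False, the aggregate is returned together with the four /24 sub-prefixes.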
As-set aggregation flag¶
When the summary-only flag is not set and the as-set flag is set, a route with the aggregated PREFIX/M is originated from the BGP router. Moreover, the information of the previous AS-PATHs is collected into an unordered list called an AS-SET. This AS-SET, which is included in the new AS-PATH originated by the router, can help to avoid some network loops. However, the sub-prefixes are still advertised.
vrouter running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 as-set true
The AS information appears between braces { }. It is an unordered list of the ASes.
In our example, if configured with as-set, rt2 can advertise an aggregate prefix because it knows at least one of its sub-networks.
Checking the rt2 BGP RIB shows the AS-SET displayed between braces.
rt2> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 0.0.0.0 32768 {65530} i
*> 192.168.0.0 10.1.1.6 0 0 65530 i
*> 192.168.1.0 10.1.1.6 0 0 65530 i
s> 192.168.2.0 0.0.0.0 0 32768 i
s> 192.168.3.0 0.0.0.0 0 32768 i
Displayed 5 routes and 5 total paths
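The way an AS-SET is built and how it helps loop detection can be sketched as follows. This is a deliberately simplified model: it places every AS of the contributing routes in the set, whereas a real implementation may keep a common leading sequence. The function names are hypothetical.

```python
def aggregate_as_set(contributing_paths):
    """Collect every AS seen in the contributing routes into an unordered set."""
    return frozenset(asn for path in contributing_paths for asn in path)

def format_advertised_path(local_as, as_set):
    """Render the path as a peer would display it: local AS, then the AS-SET in braces."""
    parts = [str(local_as)]
    if as_set:
        parts.append("{" + ",".join(str(asn) for asn in sorted(as_set)) + "}")
    return " ".join(parts)

def receiver_rejects(receiver_as, local_as, as_set):
    """A receiver drops a route whose path already contains its own AS (loop detection)."""
    return receiver_as == local_as or receiver_as in as_set

# Two sub-prefixes learnt from AS 65530, two originated locally (empty path).
paths = [[65530], [65530], [], []]
as_set = aggregate_as_set(paths)
print(format_advertised_path(65520, as_set))    # 65520 {65530}
print(receiver_rejects(65530, 65520, as_set))   # True
print(receiver_rejects(65510, 65520, as_set))   # False
```

Without the AS-SET, AS 65530 would not find its own number in the path of the aggregate and could accept it, which is how loops can appear.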
Combined summary-only and as-set aggregation flags¶
When both the summary-only and the as-set flags are set, a route with the aggregated PREFIX/M is originated from the BGP router. Moreover, the information of the previous AS-PATHs is collected into an unordered list called an AS-SET. This AS-SET, which is included in the new AS-PATH originated by the router, can help to avoid some network loops. The sub-prefixes are no longer advertised.
rt2 running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 summary-only true as-set true
In the following example, rt1 receives the aggregated prefix with the AS-SET.
rt1> show bgp ipv4 unicast
BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
*> 192.168.0.0/22 10.1.1.2 0 65520 {65530} i
Confederation¶
A confederation is a set of private ASes (sub-ASes) that are grouped together and advertised as a single AS. Inside the confederation, the sub-ASes are connected to each other by eBGP sessions, and each sub-AS runs its own IGP.
The use cases are:
Join independent ASes into a single AS.
Support multi-homed customers of the same ISP.
Avoid the scaling issues of a full mesh of iBGP routers.
Configure a BGP confederation:
running bgp# confederation identifier 65501
Join private ASes that belong to the same confederation:
running bgp# confederation peers 65502 peers 65501
Example
Let’s configure the following confederation:
Where the following configurations are set:
rt1
vrf main
interface physical eth0_0
ipv4 address 10.1.1.9/29
..
interface physical eth1_0
ipv4 address 172.16.255.254/30
..
routing bgp
as 65521
neighbor 10.1.1.11 remote-as 65522
neighbor 10.1.1.11 address-family ipv4-unicast route-map out route-map-name change_nexthop
neighbor 10.1.1.10 remote-as 65521
neighbor 10.1.1.10 address-family ipv4-unicast route-map out route-map-name change_nexthop
neighbor 172.16.255.253 remote-as 65500
confederation identifier 65520
confederation peers 65522
..
..
..
routing
ipv4-access-list 1
permit any
..
ipv4-prefix-list filter
seq 1 address 172.16.0.0/16 policy permit
..
route-map change_nexthop
seq 1 policy permit
seq 1 match ip address prefix-list filter
seq 1 set ip next-hop 10.1.1.9
seq 2 policy permit
seq 2 match ip address access-list 1
..
..
rt2
vrf main
interface physical eth0_0
ipv4 address 10.1.1.10/29
..
interface physical eth1_0
ipv4 address 192.168.2.1/24
..
routing bgp
as 65521
neighbor 10.1.1.9 remote-as 65521
confederation identifier 65520
address-family ipv4-unicast network 192.168.2.0/24
..
..
rt3
vrf main
interface physical eth0_0
ipv4 address 10.1.1.11/29
..
interface physical eth1_0
ipv4 address 10.1.1.1/29
..
interface loopback loop
ipv4 address 192.168.3.1/24
..
routing bgp
as 65522
neighbor 10.1.1.9 remote-as 65521
neighbor 10.1.1.2 remote-as 65522
confederation identifier 65520
confederation peers 65521
address-family ipv4-unicast network 192.168.3.0/24
..
..
rt4
vrf main
interface physical eth0_0
ipv4 address 192.168.4.1/24
..
interface physical eth1_0
ipv4 address 10.1.1.2/29
..
routing bgp
as 65522
neighbor 10.1.1.1 remote-as 65522
confederation identifier 65520
address-family ipv4-unicast network 192.168.4.0/24
..
..
rt5
Note that when rt5 peers with rt1, it peers with AS 65520, which is rt1’s BGP confederation identifier. It does not peer with AS 65521, which is internal to AS 65520:
vrf main
interface physical eth0_0
ipv4 address 172.16.0.1/16
..
interface physical eth1_0
ipv4 address 172.16.255.253/30
..
routing bgp
as 65500
neighbor 172.16.255.254 remote-as 65520
address-family ipv4-unicast network 172.16.0.0/16
..
..
Check this configuration on rt3, which displays the confederation path between parentheses.
rt3> show bgp ipv4 unicast
BGP table version is 2, local router ID is 192.168.3.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete
Network Next Hop Metric LocPrf Weight Path
172.16.0.0 10.1.1.9 0 100 0 (65521) 65500 i
*> 192.168.2.0 10.1.1.10 0 100 0 (65521) i
*> 192.168.3.0 0.0.0.0 0 32768 i
*>i192.168.4.0 10.1.1.2 0 100 0 i
Displayed 4 routes and 4 total paths
rt3> show bgp ipv4 unicast prefix 172.16.0.0/16
BGP routing table entry for 172.16.0.0/16
Paths: (1 available, no best path)
Advertised to non peer-group peers:
10.1.1.9
(65521) 65500
10.1.1.9 from 10.1.1.9 (172.16.255.254)
Origin IGP, metric 0, localpref 100, invalid, confed-external, best
AddPath ID: RX 0, TX 22
Last update: Fri Oct 12 09:34:14 2018
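The handling of the confederation part of the AS path (the part shown between parentheses) can be sketched as follows. The sketch follows the general confederation rules: sub-AS numbers are carried in a separate segment between sub-ASes, and are removed and replaced by the confederation identifier when a route leaves the confederation. The function names and the list-based path model are assumptions made for the example.

```python
def path_to_confed_peer(confed_segment, as_path, local_sub_as):
    """Path sent to a peer in another sub-AS of the same confederation."""
    # The local sub-AS is prepended to the confederation segment only.
    return [local_sub_as] + list(confed_segment), list(as_path)

def path_to_external_peer(confed_segment, as_path, confed_id):
    """Path sent to a peer outside the confederation."""
    # The confederation segment is dropped; the confederation identifier is prepended.
    return [confed_id] + list(as_path)

def render(confed_segment, as_path):
    """Display the path with the confederation segment between parentheses."""
    parts = []
    if confed_segment:
        parts.append("(" + " ".join(str(a) for a in confed_segment) + ")")
    parts.extend(str(a) for a in as_path)
    return " ".join(parts)

# 172.16.0.0/16 learnt by rt1 (sub-AS 65521) from AS 65500, then sent to rt3.
segment, path = path_to_confed_peer([], [65500], local_sub_as=65521)
print(render(segment, path))                                # (65521) 65500

# 192.168.3.0/24 originated in sub-AS 65522, then sent by rt1 outside the confederation.
print(path_to_external_peer([65521, 65522], [], confed_id=65520))   # [65520]
```

The first result matches the path displayed on rt3 for 172.16.0.0; the second shows that a peer outside the confederation only sees AS 65520.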
The FIB can also be dumped:
rt3> show ipv4-routes
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route
C>* 10.1.1.0/29 is directly connected, eth0_0, 00:23:26
C>* 10.1.1.8/29 is directly connected, eth0_0, 00:23:26
B>* 172.16.0.0/16 [200/0] via 10.1.1.9, eth0_0, 00:18:11
B>* 192.168.2.0/24 [200/0] via 10.1.1.10, eth0_0, 00:17:17
C>* 192.168.3.0/24 is directly connected, loopback, 00:23:26
B>* 192.168.4.0/24 [200/0] via 10.1.1.2, eth1_0, 00:17:17
Note
If a route-map had not been added to rt1, 172.16.0.0/16 would not have been installed on rt3, because rt3 has no route to the original next hop 172.16.255.253. BGP relies on an IGP to resolve recursive routes whose next hop is not directly connected. It also shows that the eBGP sessions between the confederation sub-ASes do not change the next hop attribute.
For example, you could run RIP or OSPFv2 on rt1, rt2, rt3 and rt4 as the IGP of the whole AS 65520.
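The next hop check behind this note can be sketched as follows: a BGP route is usable only if its next hop is covered by a route that the router already has. This is a simplified illustration (no longest-match or recursion depth handling), and the function name is hypothetical. The prefixes are the connected routes of rt3 shown in the FIB dump above.

```python
import ipaddress

def next_hop_resolvable(next_hop, known_prefixes):
    """Return True if `next_hop` falls inside at least one known prefix."""
    nh = ipaddress.ip_address(next_hop)
    return any(nh in ipaddress.ip_network(p) for p in known_prefixes)

rt3_connected = ["10.1.1.0/29", "10.1.1.8/29", "192.168.3.0/24"]

# Original next hop set by rt5's session: rt3 cannot resolve it without an IGP.
print(next_hop_resolvable("172.16.255.253", rt3_connected))   # False
# Next hop rewritten by the change_nexthop route-map on rt1.
print(next_hop_resolvable("10.1.1.9", rt3_connected))         # True
```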
Overriding AS¶
When working with both public and private BGP peers, it is often desirable to keep a single BGP instance while still being able to override the default AS value for some peers. This can be done with the local-as option: the AS value presented to the neighbor is the one set as local-as instead of the default AS value.
The following configuration illustrates this. The real AS value (65000 here) is hidden behind 64512. The remote peer only sees the 64512 value.
vrf main
routing bgp
as 65000
neighbor 10.125.0.2 remote-as 64622
neighbor 10.125.0.2 local-as as-number 64512 no-prepend true replace-as true
..
..
..
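The effect of local-as on the AS path seen by the remote peer can be sketched as follows. The replace-as case reflects the behaviour described above (the peer only sees 64512). The case without replace-as, where both the local-as and the real AS are prepended, is an assumption based on the usual behaviour of this option and is not stated in this document. The no-prepend flag concerns received routes and is not modelled. The function name is hypothetical.

```python
def as_path_sent_to_peer(existing_path, real_as, local_as=None, replace_as=False):
    """Return the AS path advertised to a neighbor, outermost AS first."""
    if local_as is None:
        # No override: the real AS is prepended as usual.
        return [real_as] + list(existing_path)
    if replace_as:
        # The real AS is hidden: only the local-as value is prepended.
        return [local_as] + list(existing_path)
    # Assumed behaviour without replace-as: local-as first, then the real AS.
    return [local_as, real_as] + list(existing_path)

print(as_path_sent_to_peer([], real_as=65000, local_as=64512, replace_as=True))   # [64512]
print(as_path_sent_to_peer([], real_as=65000, local_as=64512))                    # [64512, 65000]
print(as_path_sent_to_peer([], real_as=65000))                                    # [65000]
```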
Timers¶
The BGP timers are specific to the neighbors.
Set specific timers:
vrouter running bgp# neighbor 10.125.0.3 timers keepalive-interval 15 hold-time 30
Tip
A good practice is to configure the same values on both sides of the BGP session. Generally, these values should not be changed; however, when processing the BGP table keeps the CPU so busy that the keepalive timer cannot fire on time, the keepalive timer could be increased.
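One reason to configure the same values on both sides is that the hold time actually used on a session is negotiated: per the BGP specification, each speaker uses the smaller of its configured hold time and the one received from its peer, and the value must be 0 or at least 3 seconds. A minimal sketch, with a hypothetical function name:

```python
def negotiated_hold_time(local_hold, peer_hold):
    """Return the hold time (in seconds) used for the session."""
    hold = min(local_hold, peer_hold)
    # A hold time of 0 disables keepalives; 1 and 2 are not allowed.
    if hold != 0 and hold < 3:
        raise ValueError("hold time must be 0 or at least 3 seconds")
    return hold

# The local router is configured with 30 seconds, the peer with 180 seconds.
print(negotiated_hold_time(30, 180))   # 30
# Both sides configured identically: no surprise.
print(negotiated_hold_time(30, 30))    # 30
```

In the first case, the side configured with 180 seconds silently ends up with a 30 second hold time, which is why matching values are easier to reason about.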
Routing Reconfiguration¶
Some configuration changes require the BGP routing tables to be refreshed. This is the case for multipath configuration: enabling multipath requires re-examining the whole routing table to find ECMP entries.
BGP provides 2 mechanisms to permit this refresh:
either by issuing BGP route refresh messages to the remote peers. This message asks the remote peer to send back all its BGP updates for a given (AFI, SAFI) address-family.
or by enabling software reconfiguration inbound. An inbound RIB, the ADJ-RIB-IN, is created for each peer and for a given (AFI, SAFI). All incoming BGP updates are stored unmodified in the ADJ-RIB-IN, which makes it possible to replay the original BGP updates of the remote peer when needed. Software reconfiguration inbound can be configured on each address-family node.
The routing reconfiguration is triggered automatically when some configuration elements change. If software reconfiguration is not configured, the default behaviour is to send a route refresh message to the remote peer.
At any time, the ADJ-RIB-IN can be flushed by using a flush command. This forces the ADJ-RIB-IN to be rebuilt by requesting updates from the remote peer:
flush bgp vrf main all soft in
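The idea behind the ADJ-RIB-IN can be sketched as follows: updates are stored as received, so that a new inbound policy can be applied later without asking the peer to resend anything. This is a conceptual model only; the class, the attribute names and the policy functions are hypothetical.

```python
class AdjRibIn:
    """Per-peer store of the updates exactly as they were received."""

    def __init__(self):
        self._received = {}

    def store(self, prefix, attrs):
        # Keep an unmodified copy of the received attributes.
        self._received[prefix] = dict(attrs)

    def apply_policy(self, policy):
        """Replay the stored updates through `policy` and return the resulting routes."""
        result = {}
        for prefix, attrs in self._received.items():
            # The policy works on a copy, so the stored update stays untouched.
            outcome = policy(prefix, dict(attrs))
            if outcome is not None:
                result[prefix] = outcome
        return result

def accept_all(prefix, attrs):
    return attrs

def prefer_and_filter(prefix, attrs):
    if prefix.startswith("10."):
        return None                 # filtered out by the new policy
    attrs["local_pref"] = 200       # modified by the new policy
    return attrs

rib_in = AdjRibIn()
rib_in.store("192.168.2.0/24", {"local_pref": 100})
rib_in.store("10.0.0.0/8", {"local_pref": 100})

print(rib_in.apply_policy(accept_all))
print(rib_in.apply_policy(prefer_and_filter))
print(rib_in.apply_policy(accept_all))   # the original updates are still available
```

The memory cost of keeping this copy for each peer is the trade-off compared with the route refresh mechanism.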
Route refresh¶
Route refresh is an extension to BGP that is defined in RFC 2918. Using this feature, a BGP router can request a complete retransmission of the peer’s routing information without tearing down and reestablishing the BGP session, saving a route flap. It is used to facilitate routing policy changes, without storing an unmodified copy of the peer’s routes on the local router to save memory. The capability must be supported by both routers of a BGP session. When both routers in the peering session support this extension, each router will respond to requests issued from the peer without operator intervention.
Route Refresh is enabled by default.
When the flush command is used, Route Refresh messages are sent to the peers, and the router receives in return one or more Update packets containing all the routes of each peer’s Adj-RIB-Out.
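For reference, the Route Refresh message defined in RFC 2918 is very small: the standard BGP header (16-byte marker, 2-byte length, 1-byte type set to 5) followed by the AFI (2 bytes), a reserved byte and the SAFI (1 byte). The following sketch builds such a message; the function name is hypothetical and the sketch does not send anything on the network.

```python
import struct

ROUTE_REFRESH_TYPE = 5   # BGP message type for ROUTE-REFRESH

def build_route_refresh(afi, safi):
    """Return the bytes of a ROUTE-REFRESH message for the given (AFI, SAFI)."""
    marker = b"\xff" * 16
    body = struct.pack("!HBB", afi, 0, safi)      # AFI, reserved, SAFI
    length = len(marker) + 2 + 1 + len(body)      # marker + length + type + body
    header = marker + struct.pack("!HB", length, ROUTE_REFRESH_TYPE)
    return header + body

# AFI 1 / SAFI 1 is IPv4 unicast.
message = build_route_refresh(afi=1, safi=1)
print(len(message))          # 23
print(message[16:].hex())    # 001705 00010001
```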
Example
Let’s configure the following peering:
routing bgp
as 65000
neighbor 172.16.255.254 remote-as 65522
address-family ipv4-unicast network 172.16.0.0/16
.. .. ..
Then the peering is established and the RIB is fed with updates from the remote peer. There is no need to configure the multipath feature, since it is enabled by default.
When a refresh is triggered, the local peer marks as stale the entries learnt from the remote peer, then sends a BGP route refresh message to the remote peer. The remote peer sends back the BGP updates, and the local instance refreshes the RIB accordingly.
BGP graceful restart capability¶
Usually, when BGP restarts on a router, all the BGP peers detect that the session went down and then came back up. This “down/up” transition results in a “routing flap” that causes BGP route re-computation, the generation of BGP routing updates, and flapping of the forwarding tables. It could spread across multiple routing domains. Such routing flaps may create transient forwarding blackholes and/or transient forwarding loops. They also consume resources on the control plane of the routers affected by the flap. As such, they are detrimental to the overall network performance.
This feature provides a mechanism for BGP that helps minimize the negative effects on routing caused by a BGP restart. The graceful restart capability (code 64) is exchanged between the BGP speakers through the OPEN messages. Routes advertised by the restarting speaker become stale in the routing tables of the peer speakers. If the restarting speaker does not come back before the restart time expires, the stale routes are deleted. If the restarting speaker re-establishes the BGP session within the restart time, the stale routes are converted back to normal routes. Traffic flowing through the stale routes is not stopped while the BGP speaker is restarting.
Enable BGP graceful restart:
vrouter running bgp# graceful-restart restart-time 60
vrouter running bgp# graceful-restart stalepath-time 120
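The stale route handling described above can be sketched, from the point of view of a peer of the restarting speaker, as follows. This is a simplified model of the description in this section, not the router's implementation: the class and method names are hypothetical, time is passed explicitly in seconds, and the stalepath-time handling is not modelled.

```python
class StaleRouteTable:
    """Routes learnt from one peer, with a stale flag per prefix."""

    def __init__(self, restart_time):
        self.restart_time = restart_time
        self.stale = {}          # prefix -> True when the route is stale
        self.down_at = None      # time at which the peer started restarting

    def learn(self, prefix):
        self.stale[prefix] = False

    def peer_restarting(self, now):
        # Routes are kept for forwarding but marked stale.
        self.down_at = now
        for prefix in self.stale:
            self.stale[prefix] = True

    def expire(self, now):
        # Restart time elapsed without a new session: delete the stale routes.
        if self.down_at is not None and now - self.down_at > self.restart_time:
            self.stale.clear()
            self.down_at = None

    def peer_back(self, now):
        # Session re-established in time: stale routes become normal again.
        self.expire(now)
        for prefix in self.stale:
            self.stale[prefix] = False
        self.down_at = None

table = StaleRouteTable(restart_time=60)
table.learn("192.168.2.0/24")
table.peer_restarting(now=0)
print(table.stale)     # {'192.168.2.0/24': True}
table.peer_back(now=45)
print(table.stale)     # {'192.168.2.0/24': False}

table.peer_restarting(now=100)
table.expire(now=161)
print(table.stale)     # {}
```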