BGP configuration options

The BGP routing protocol is very rich and offers many options. This section covers the most commonly used and most useful ones.

Aggregation

The main goal of aggregation is to reduce the number of network prefixes that are announced into the Internet. Aggregation is effectively required when prefixes are too specific (the mask length is too long): your peers, or the peers of your peers, are likely to filter such prefixes in order to limit the number of prefixes they carry.

However, route aggregation can introduce network loops or black holes when it is not configured properly.

Note

  • A BGP router can advertise an aggregated network only if at least one route of the aggregate network is in the BGP table. For example, if we consider the four networks 192.168.0.0/24 through 192.168.3.0/24, the BGP router can advertise the aggregate network 192.168.0.0/22 only if at least one of these networks (192.168.0.0/24 through 192.168.3.0/24) is in the BGP table.

  • If all the sub-networks of an aggregated network go down, this aggregated network will not be advertised.

  • It is recommended to check that the aggregated network is not stopped by an Access List.
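The advertisement condition in the notes above can be illustrated with a short Python sketch. It is not vRouter code: the helper name can_advertise is hypothetical, and only the standard ipaddress module is used.

```python
import ipaddress


def can_advertise(aggregate, bgp_table):
    """Return True if at least one more-specific prefix of the aggregate
    is present in the BGP table (hypothetical helper, illustration only)."""
    agg = ipaddress.ip_network(aggregate)
    for prefix in bgp_table:
        net = ipaddress.ip_network(prefix)
        # The aggregate itself does not count, only its sub-networks.
        if net != agg and net.subnet_of(agg):
            return True
    return False


# The four /24 networks of the example summarize into a single /22.
subnets = [ipaddress.ip_network(f"192.168.{i}.0/24") for i in range(4)]
summary = list(ipaddress.collapse_addresses(subnets))
print(summary)  # [IPv4Network('192.168.0.0/22')]

print(can_advertise("192.168.0.0/22", ["192.168.2.0/24"]))  # True
print(can_advertise("192.168.0.0/22", ["10.0.0.0/8"]))      # False
```

With an empty table, or a table holding only unrelated prefixes, the function returns False, which corresponds to the aggregate being withdrawn when all its sub-networks go down.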

Figure: BGP aggregation

The aggregation of the IPv4 network prefixes within the BGP tables can be done with the following command:

vrouter running bgp# address-family ipv4-unicast aggregate-address
                    PREFIX/M [summary-only true|false] [as-set true|false]

The aggregate command originates a new prefix. This raises the question of how to summarize the different AS-PATHs of the contributing routes. There are two solutions:

  • The AS-PATH is suppressed, at the risk of introducing network loops.

  • The AS-PATH is summarized within an unordered set (AS-SET), at the risk of creating black holes.

No aggregation flags

When neither the summary-only flag nor the as-set flag is set, a route with the aggregated PREFIX/M is originated from the BGP router. However, the sub-prefixes are still advertised.

Example

routing bgp
   as 65520
   address-family
      ipv4-unicast
         network 192.168.3.0/24
           ..
         network 192.168.2.0/24
           ..
         aggregate-address 192.168.0.0/22
         ..
      ..
   neighbor 10.1.1.1
      remote-as 65510
      ..
   neighbor 10.1.1.6
      remote-as 65530
      ..

Once rt1 peers with rt2, and rt2 peers with rt3, rt1 receives the following RIB entries:

rt1> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   10.1.1.2                               0 65520 i
 *> 192.168.0.0      10.1.1.2                               0 65520 65530 i
 *> 192.168.1.0      10.1.1.2                               0 65520 65530 i
 *> 192.168.2.0      10.1.1.2                 0             0 65520 i
 *> 192.168.3.0      10.1.1.2                 0             0 65520 i

 Displayed  4 routes and 4 total paths
rt1> show bgp ipv4 unicast prefix 192.168.0.0/22
BGP routing table entry for 192.168.0.0/22
Paths: (1 available, best #1, table Default-IP-Routing-Table)
  Advertised to non peer-group peers:
  10.1.1.2
  65520, (aggregated by 65520 10.1.1.2)
    10.1.1.2 from 10.1.1.2 (10.1.1.2)
      Origin IGP, localpref 100, valid, external, atomic-aggregate, best
      AddPath ID: RX 0, TX 6
      Last update: Fri Sep 28 16:11:02 2018

Note

  • The aggregated prefix carries the atomic-aggregate attribute, which means that the AS information of the more specific routes is lost for the aggregate prefix (192.168.0.0/22).

  • To stop advertising the sub-prefixes alongside the aggregated prefix, the summary-only flag can be set. Alternatively, a prefix-list or a distribute-list can be defined.

Moreover this aggregated prefix is received by rt3 too.

rt3> show ipv4-routes
 Codes: K - kernel route, C - connected, S - static, R - RIP,
        O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
        T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
        F - PBR, f - OpenFabric,
        > - selected route, * - FIB route, q - queued route, r - rejected route

 B>* 192.168.0.0/22 [20/0] via 10.1.1.5, ntfp2, 00:03:34
 B>* 192.168.2.0/24 [20/0] via 10.1.1.5, ntfp2, 00:03:34
 B>* 192.168.3.0/24 [20/0] via 10.1.1.5, ntfp2, 00:03:34

Summary-only aggregation flag

When the summary-only flag is set and the as-set flag is not set, only the route with the aggregated PREFIX/M is originated from the BGP router. The sub-prefixes are not advertised. Moreover, the AS number and the router ID of the aggregating router are attached to the aggregate (shown as "aggregated by" in the route details), which helps traffic engineering.

Example

rt2 running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 summary-only true

When the summary-only flag is set, the router advertises only the aggregate prefix. On the router that advertises the aggregate prefix, the sub-prefixes are marked as suppressed, so the remote peers see only the aggregate prefix.

rt2> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   0.0.0.0                            32768 i
 s> 192.168.0.0      10.1.1.6                 0             0 65530 i
 s> 192.168.1.0      10.1.1.6                 0             0 65530 i
 s> 192.168.2.0      0.0.0.0                  0         32768 i
 s> 192.168.3.0      0.0.0.0                  0         32768 i

 Displayed  5 routes and 5 total paths

The sub-prefixes which have been suppressed are labeled s.

On the remote peer, only the route to 192.168.0.0/22 is received by the BGP RIB.

rt1> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   10.1.1.2                               0 65520 i

However, rt3 still receives the aggregated route.

rt3> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   10.1.1.5                               0 65520 i
 *> 192.168.0.0      0.0.0.0                  0         32768 i
 *> 192.168.1.0      0.0.0.0                  0         32768 i

 Displayed  3 routes and 3 total paths

As-set aggregation flag

When the summary-only flag is not set and the as-set flag is set, a route with the aggregated PREFIX/M is originated from the BGP router. Moreover, the information of the previous AS-PATHs is collected into an unordered list called an AS-SET. This AS-SET, which is included in the new AS-PATH originated by the router, helps to avoid network loops. However, the sub-prefixes are still advertised.

vrouter running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 as-set true

The AS information appears between braces { }. It is an unordered list of the ASes.
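To make the AS-SET more concrete, here is a minimal Python sketch of how an aggregate AS-PATH could be derived from the contributing routes. It is a simplification, not the router's implementation: the function names aggregate_as_path and render are hypothetical, and the rule used (keep the leading ASes common to all paths as a sequence, put every other AS in the set) is an assumption that only approximates the full BGP aggregation rules.

```python
def aggregate_as_path(paths):
    """Simplified AS-PATH aggregation (illustrative only).

    Returns (sequence, as_set): the leading ASes common to all contributing
    paths stay an ordered sequence, every other AS goes into the AS-SET.
    """
    if not paths:
        return [], set()
    common = []
    for hops in zip(*paths):  # zip stops at the shortest path
        if len(set(hops)) == 1:
            common.append(hops[0])
        else:
            break
    as_set = set()
    for path in paths:
        as_set.update(path[len(common):])
    return common, as_set


def render(sequence, as_set):
    """Format the result with the AS-SET between braces, e.g. '{65530}'."""
    parts = [str(asn) for asn in sequence]
    if as_set:
        parts.append("{" + ",".join(str(a) for a in sorted(as_set)) + "}")
    return " ".join(parts)


# Contributing routes on rt2: two learnt from AS 65530, two originated locally.
contributing = [[65530], [65530], [], []]
print(render(*aggregate_as_path(contributing)))  # {65530}
```

Because the locally originated routes have an empty AS-PATH, there is no common leading sequence here and the only AS left is placed in the set, which matches the {65530} shown for the aggregate in the example.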

In our example, if configured with as-set, rt2 can advertise an aggregate prefix because it knows at least one of its sub-networks.

Checking the rt2 BGP RIB now shows the AS-SET displayed between braces.

rt2> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   0.0.0.0                            32768 {65530} i
 *> 192.168.0.0      10.1.1.6                 0             0 65530 i
 *> 192.168.1.0      10.1.1.6                 0             0 65530 i
 s> 192.168.2.0      0.0.0.0                  0         32768 i
 s> 192.168.3.0      0.0.0.0                  0         32768 i

Displayed  5 routes and 5 total paths

Combined summary-only and as-set aggregation flags

When both the summary-only and the as-set flags are set, a route with the aggregated PREFIX/M is originated from the BGP router. Moreover, the information of the previous AS-PATHs is collected into an unordered list called an AS-SET. This AS-SET, which is included in the new AS-PATH originated by the router, helps to avoid network loops. The sub-prefixes are no longer advertised.

rt2 running bgp# address-family ipv4-unicast aggregate-address 192.168.0.0/22 summary-only true
                     as-set true

In the following example, rt1 receives the aggregated prefix together with the AS-SET.

rt1> show bgp ipv4 unicast
 BGP table version is 4, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
 *> 192.168.0.0/22   10.1.1.2                            0 65520 {65530} i

Confederation

A confederation is a set of private ASes (sub-ASes) that are joined together and advertised as a single AS. Within a confederated AS, the sub-ASes are connected to each other by eBGP sessions, and they run an IGP themselves.

The use cases are:

  1. Join independent ASes into a single AS.

  2. Support multi-homed customers of the same ISP.

  3. Avoid the scaling issues of a full mesh of iBGP routers.

  • Configure a BGP confederation:

    running bgp# confederation identifier 65501
    
  • Join private ASes that belong to the same confederation:

    running bgp# confederation peers 65502 peers 65501
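The effect of the confederation identifier on the AS-PATH can be sketched in Python as follows. This is an illustration under assumptions, not vRouter code: the names advertise and render are hypothetical, the path is modelled as a list of (kind, ASNs) segments, and the AS numbers are those of the example that follows.

```python
CONFED_ID = 65520  # confederation identifier, the only AS seen from outside


def advertise(path, local_sub_as, peer_kind):
    """Return the AS-PATH sent to a peer (simplified model).

    path is a list of (kind, [asns]) segments, kind being 'confed' for the
    part of the path that lies inside the confederation, or 'seq' otherwise.
    """
    if peer_kind == "internal":  # iBGP inside the same sub-AS: unchanged
        return list(path)
    if peer_kind == "confed":    # eBGP towards another sub-AS
        if path and path[0][0] == "confed":
            return [("confed", [local_sub_as] + path[0][1])] + list(path[1:])
        return [("confed", [local_sub_as])] + list(path)
    if peer_kind == "external":  # peer outside the confederation
        stripped = [seg for seg in path if seg[0] != "confed"]
        if stripped and stripped[0][0] == "seq":
            return [("seq", [CONFED_ID] + stripped[0][1])] + stripped[1:]
        return [("seq", [CONFED_ID])] + stripped
    raise ValueError(f"unknown peer kind: {peer_kind}")


def render(path):
    """Show confederation segments between parentheses."""
    out = []
    for kind, asns in path:
        text = " ".join(str(a) for a in asns)
        out.append(f"({text})" if kind == "confed" else text)
    return " ".join(out)


# A router in sub-AS 65521 relays a route learnt from AS 65500 to sub-AS 65522.
print(render(advertise([("seq", [65500])], 65521, "confed")))  # (65521) 65500
# The same router announces a prefix of its own sub-AS to an outside peer.
print(render(advertise([], 65521, "external")))                # 65520
```

The sub-AS numbers only ever appear between parentheses and are removed before a route leaves the confederation, which is why an outside peer must use the confederation identifier as remote AS.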
    

Example

Let’s configure the following confederation:

Figure: BGP confederation

Where the following configurations are set:

rt1

vrf main
   interface physical eth0_0
      ipv4 address 10.1.1.9/29
      ..
   interface physical eth1_0
      ipv4 address 172.16.255.254/30
      ..
   routing bgp
      as 65521
      neighbor 10.1.1.11 remote-as 65522
      neighbor 10.1.1.11 address-family ipv4-unicast route-map out route-map-name change_nexthop
      neighbor 10.1.1.10 remote-as 65521
      neighbor 10.1.1.10 address-family ipv4-unicast route-map out route-map-name change_nexthop
      neighbor 172.16.255.253 remote-as 65500
      confederation identifier 65520
      confederation peers 65522
      ..
   ..
   ..
routing
   ipv4-access-list 1
      permit any
      ..
   ipv4-prefix-list filter
      seq 1 address 172.16.0.0/16 policy permit
      ..
   route-map change_nexthop
      seq 1 policy permit
      seq 1 match ip address prefix-list filter
      seq 1 set ip next-hop 10.1.1.9
      seq 2 policy permit
      seq 2 match ip address access-list 1
      ..
    ..

rt2

vrf main
   interface physical eth0_0
      ipv4 address 10.1.1.10/29
      ..
   interface physical eth1_0
      ipv4 address 192.168.2.1/24
      ..
   routing bgp
      as 65521
      neighbor 10.1.1.9 remote-as 65521
      confederation identifier 65520
      address-family ipv4-unicast network 192.168.2.0/24
      ..
   ..

rt3

vrf main
   interface physical eth0_0
      ipv4 address 10.1.1.11/29
      ..
   interface physical eth1_0
      ipv4 address 10.1.1.1/29
      ..
   interface loopback loop
      ipv4 address 192.168.3.1/24
      ..
   routing bgp
      as 65522
      neighbor 10.1.1.9 remote-as 65521
      neighbor 10.1.1.2 remote-as 65522
      confederation identifier 65520
      confederation peers 65521
      address-family ipv4-unicast network 192.168.3.0/24
      ..
   ..

rt4

vrf main
   interface physical eth0_0
      ipv4 address 192.168.4.1/24
      ..
   interface physical eth1_0
      ipv4 address 10.1.1.2/29
      ..
   routing bgp
      as 65522
      neighbor 10.1.1.1 remote-as 65522
      confederation identifier 65520
      address-family ipv4-unicast network 192.168.4.0/24
      ..
   ..

rt5

rt5 is outside the confederation. When rt5 peers with rt1, it peers with AS 65520, which is rt1's BGP confederation identifier. It does not peer with AS 65521, which is internal to AS 65520:

vrf main
   interface physical eth0_0
      ipv4 address 172.16.0.1/16
      ..
   interface physical eth1_0
      ipv4 address 172.16.255.253/30
      ..
   routing bgp
      as 65500
      neighbor 172.16.255.254 remote-as 65520
      address-family ipv4-unicast network 172.16.0.0/16
      ..
   ..

  • Check this configuration on rt3, which displays the confederation path between parentheses:

rt3> show bgp ipv4 unicast
 BGP table version is 2, local router ID is 192.168.3.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

 Network          Next Hop            Metric LocPrf Weight Path
 172.16.0.0          10.1.1.9              0    100      0 (65521) 65500 i
 *> 192.168.2.0      10.1.1.10             0    100      0 (65521) i
 *> 192.168.3.0      0.0.0.0               0         32768 i
 *>i192.168.4.0      10.1.1.2              0    100      0 i

 Displayed  3 routes and 3 total paths

rt3> show bgp ipv4 unicast prefix 172.16.0.0/16
 BGP routing table entry for 172.16.0.0/16
 Paths: (1 available, no best path)
   Advertised to non peer-group peers:
   10.1.1.9
   (65521) 65500
     10.1.1.9 from 10.1.1.9 (172.16.255.254)
       Origin IGP, metric 0, localpref 100, invalid, confed-external, best
       AddPath ID: RX 0, TX 22
       Last update: Fri Oct 12 09:34:14 2018

The FIB can also be dumped:

rt3> show ipv4-routes
  Codes: K - kernel route, C - connected, S - static, R - RIP,
         O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
         T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
         F - PBR, f - OpenFabric,
         > - selected route, * - FIB route, q - queued route, r - rejected route


  C>* 10.1.1.0/29 is directly connected, eth0_0, 00:23:26
  C>* 10.1.1.8/29 is directly connected, eth0_0, 00:23:26
  B>* 172.16.0.0/16 [200/0] via 10.1.1.9, eth0_0, 00:18:11
  B>* 192.168.2.0/24 [200/0] via 10.1.1.10, eth0_0, 00:17:17
  C>* 192.168.3.0/24 is directly connected, loopback, 00:23:26
  B>* 192.168.4.0/24 [200/0] via 10.1.1.2, eth1_0, 00:17:17

Note

If the route-map had not been added to rt1, 172.16.0.0/16 would not have been visible as a valid route on rt3, because rt3 has no route to the next hop 172.16.255.253. BGP relies on an IGP to resolve recursive routes whose gateway is not directly connected. This also shows that the eBGP sessions between the confederation sub-ASes do not change the next-hop attribute.

As an alternative, you could run RIP or OSPF v2 on rt1, rt2, rt3 and rt4 as the IGP of the whole AS 65520.

Overriding AS

When working with both public and private BGP peers, it is often desirable to keep a single BGP instance while still being able to override the default AS value. This can be done with the local-as value, which replaces the default AS value for the neighbor on which it is set.

The following configuration illustrates this. The real AS value (65000 here) is hidden behind 64512. The remote peer only sees the value 64512.

vrf main
   routing bgp
      as 65000
      neighbor 10.125.0.2 remote-as 64622
      neighbor 10.125.0.2 local-as as-number 64512 no-prepend true replace-as true
      ..
   ..
 ..

AS-Path prepending

In some situations, it is also desirable to modify the as-path list. For instance, on transit routers, the as-path list may be lengthened in order to influence incoming traffic. Because the BGP best path selection algorithm prefers the routes with the shortest as-path list, a route with a longer as-path list is less likely to be selected by remote routers.

The following route-map configuration can be applied to outgoing prefixes exchanged with BGP peers. The as-path prepending action prepends as-path values to the original as-path list. The configured priority number determines which as-path value is inserted first.

For instance, the route-map below prepends 65500 and 65100 to the as-path list, following the configured priority order 10, 20.

vrf main
   routing bgp as 65500
   ..
routing
 ipv4-prefix-list blocka
     seq 10 address 10.0.0.0/8 policy permit
     ..
 route-map bgp-export-block
     seq 10
         policy permit
         match
             ip
                 address
                     prefix-list blocka
                     ..
                 ..
             ..
         set
             as-path
                 prepend
                     asn 20
                         65100
                         ..
                     asn 10
                         65500
                         ..
                     ..
                 ..
             ip
                 next-hop 184.106.55.69
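The effect of this route-map on path selection can be sketched in Python. This is only an illustration: the function names prepend and shortest are hypothetical, the original path [65200] and the competing path are made-up values, and the comparison models only the as-path length step of best path selection.

```python
def prepend(as_path, entries):
    """Prepend AS numbers to an as-path.

    entries maps a priority to an AS number; the lowest priority ends up
    first in the resulting list (assumption taken from the text above).
    """
    ordered = [asn for _priority, asn in sorted(entries.items())]
    return ordered + list(as_path)


def shortest(paths):
    """Keep only the as-path length criterion of best path selection."""
    return min(paths, key=len)


original = [65200]                      # hypothetical path before prepending
prepended = prepend(original, {20: 65100, 10: 65500})
print(prepended)                        # [65500, 65100, 65200]

alternative = [65300, 65200]            # hypothetical path via another transit
print(shortest([prepended, alternative]))  # [65300, 65200]
```

After prepending, the path through this router is one hop longer than the alternative, so a remote router that reaches this step of the selection prefers the alternative path.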

EBGP policy requirement

When interoperating with eBGP peers, route propagation is risky if no policies are set up on those sessions. RFC 8212 addresses this by requiring that incoming and outgoing filters be applied to eBGP sessions. With this policy, no route is accepted if there is no incoming filter, and no route is announced if there is no outgoing filter. The command below enforces this behavior:

vrouter running bgp# ebgp-requires-policy true
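The resulting decision can be summarized with a small Python sketch. It is a model of the rule described above, not the router's code: the function name route_permitted and its parameters are hypothetical.

```python
def route_permitted(session, direction, has_in_policy, has_out_policy,
                    requires_policy=True):
    """Tell whether routes may flow on a session (illustrative model).

    session is 'ebgp' or 'ibgp', direction is 'in' or 'out'.
    """
    if direction not in ("in", "out"):
        raise ValueError(f"unknown direction: {direction}")
    # The requirement only concerns eBGP sessions, and only when enforced.
    if session != "ebgp" or not requires_policy:
        return True
    return has_in_policy if direction == "in" else has_out_policy


# eBGP peer with an outgoing filter only: routes are announced, none accepted.
print(route_permitted("ebgp", "out", False, True))  # True
print(route_permitted("ebgp", "in", False, True))   # False
# iBGP sessions are not affected.
print(route_permitted("ibgp", "in", False, False))  # True
```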

Timers

BGP timers are configured per neighbor.

  • Set specific timers:

    vrouter running bgp# neighbor 10.125.0.3 timers keepalive-interval 15 hold-time 30
    

Tip

A good practice is to configure the same values on both sides of the TCP connection. Generally, these values should not be changed; however, when processing the BGP table takes so long that the CPU cannot fire the keepalive timer in time, the latter can be increased.
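The reason for configuring the same values on both sides can be shown with a minimal Python sketch of hold time negotiation as specified by the BGP standard (RFC 4271): each side uses the smaller of its own hold time and the one received from the peer, and a hold time must be either 0 or at least 3 seconds. The function name negotiated_hold_time is hypothetical.

```python
def negotiated_hold_time(local, remote):
    """Return the hold time (in seconds) actually used on the session."""
    for value in (local, remote):
        if value != 0 and value < 3:
            raise ValueError("hold time must be 0 or at least 3 seconds")
    # Both peers end up using the smaller of the two proposed values.
    return min(local, remote)


print(negotiated_hold_time(30, 30))   # 30
print(negotiated_hold_time(30, 180))  # 30
```

When the two sides differ, the larger value is silently ignored, so a hold time raised on one router only has no effect on the session.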

Routing Reconfiguration

Some configuration changes require the BGP routing tables to be refreshed. This is the case for multipath configuration: enabling multipath requires analysing the whole routing table to see whether there are ECMP entries.

BGP provides two mechanisms to perform this refresh:

  • either by issuing BGP route refresh messages to remote peers. This message asks the remote peer to send back all BGP updates for a given (AFI, SAFI) address family.

  • or by enabling software reconfiguration inbound. An inbound RIB, the ADJ-RIB-IN, is created for each peer and for a given (AFI, SAFI). All incoming BGP updates are stored unmodified in the ADJ-RIB-IN. This makes it possible to reinject the original BGP updates of the remote peer when needed. Software reconfiguration inbound can be configured on each address-family node.

The routing reconfiguration is triggered automatically when certain configuration elements change. If software reconfiguration is not configured, the default behaviour is to send a route refresh message to the remote peer.

At any time, the ADJ-RIB-IN can be flushed with the flush command. This forces the ADJ-RIB-IN to be rebuilt by requesting updates from the remote peer:

flush bgp vrf main all soft in
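The difference between the two mechanisms can be sketched in Python. This is a toy model, not the router's implementation: the class Peer, its attributes and the policy function prefer_ten are all hypothetical names introduced for the illustration.

```python
class Peer:
    """Toy model of one neighbor for one (AFI, SAFI)."""

    def __init__(self, soft_reconfig_inbound):
        self.soft_reconfig_inbound = soft_reconfig_inbound
        self.adj_rib_in = {}       # prefix -> unmodified received attributes
        self.refresh_requests = 0  # route refresh messages sent to the peer

    def receive_update(self, prefix, attrs):
        # The unmodified copy is only kept with software reconfiguration.
        if self.soft_reconfig_inbound:
            self.adj_rib_in[prefix] = dict(attrs)

    def reapply_policy(self, policy):
        """Re-run the inbound policy after a configuration change."""
        if not self.soft_reconfig_inbound:
            # Nothing stored locally: ask the peer to resend its updates.
            self.refresh_requests += 1
            return None
        accepted = {}
        for prefix, attrs in self.adj_rib_in.items():
            result = policy(prefix, dict(attrs))  # policy works on a copy
            if result is not None:
                accepted[prefix] = result
        return accepted


def prefer_ten(prefix, attrs):
    """Hypothetical inbound policy: accept 10.0.0.0/8 only, local-pref 200."""
    if prefix != "10.0.0.0/8":
        return None
    attrs["local_pref"] = 200
    return attrs


peer = Peer(soft_reconfig_inbound=True)
peer.receive_update("10.0.0.0/8", {"local_pref": 100})
peer.receive_update("172.16.0.0/16", {"local_pref": 100})
print(peer.reapply_policy(prefer_ten))  # {'10.0.0.0/8': {'local_pref': 200}}
print(peer.adj_rib_in["10.0.0.0/8"])    # {'local_pref': 100}
```

The stored copy stays unmodified after the policy is applied, at the cost of extra memory per peer; without it, the same change costs one route refresh exchange with the remote peer instead.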

Route refresh

Route refresh is an extension to BGP that is defined in RFC 2918. Using this feature, a BGP router can request a complete retransmission of the peer’s routing information without tearing down and reestablishing the BGP session, saving a route flap. It is used to facilitate routing policy changes, without storing an unmodified copy of the peer’s routes on the local router to save memory. The capability must be supported by both routers of a BGP session. When both routers in the peering session support this extension, each router will respond to requests issued from the peer without operator intervention.

Route Refresh is enabled by default.

When the flush command is used, Route Refresh messages are sent to the peers, and the router receives in return one or more Update packets carrying all the routes of each peer's Adj-RIB-Out.

Example

Let’s configure the following peering:

routing bgp
   as 65000
   neighbor 172.16.255.254 remote-as 65522
   address-family ipv4-unicast network 172.16.0.0/16
     .. .. ..

Then the peering is established, and the RIB is fed with updates from the remote peer. There is no need to configure the multipath feature, since it is enabled by default.

The local peer marks the entries learnt from the remote peer as stale, then sends a BGP route refresh message to the remote peer. The remote peer sends back the BGP updates, and the local instance refreshes the RIB accordingly.

BGP graceful restart capability

Usually, when BGP on a router restarts, all the BGP peers detect that the session went down and then came up. This "down/up" transition results in a "routing flap" and causes BGP route re-computation, generation of BGP routing updates, and flapping of the forwarding tables. It can spread across multiple routing domains. Such routing flaps may create transient forwarding black holes and/or transient forwarding loops. They also consume resources on the control plane of the routers affected by the flap. As such, they are detrimental to the overall network performance.

This feature provides a mechanism that helps minimize the negative effects on routing caused by a BGP restart. The graceful restart capability (code 64) is exchanged between the BGP speakers in the OPEN messages. Routes advertised by the restarting speaker become stale in the peer speakers' routing tables. If the restarting speaker does not come back up before the restart time expires, the stale routes are deleted. If the restarting speaker re-establishes the BGP session within the restart time, the stale routes are converted back to normal routes. Traffic forwarded through the stale routes is not stopped while the BGP speaker is restarting.

  • Enable BGP graceful restart:

    vrouter running bgp# graceful-restart restart-time 60
    
    vrouter running bgp# graceful-restart stalepath-time 120
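The stale route handling described in this section can be modelled with a short Python sketch. It follows the simplified description given above rather than the full graceful restart procedure, and the class NeighborRoutes and its method names are hypothetical.

```python
class NeighborRoutes:
    """Routes learnt from one neighbor that supports graceful restart."""

    def __init__(self):
        self.routes = {}  # prefix -> "normal" or "stale"

    def learn(self, prefix):
        self.routes[prefix] = "normal"

    def session_down(self):
        # Routes are kept for forwarding but marked stale.
        for prefix in self.routes:
            self.routes[prefix] = "stale"

    def session_reestablished(self):
        # The neighbor came back within the restart time.
        for prefix in self.routes:
            self.routes[prefix] = "normal"

    def restart_timer_expired(self):
        # The neighbor did not come back: stale routes are deleted.
        self.routes = {p: s for p, s in self.routes.items() if s != "stale"}


peer = NeighborRoutes()
peer.learn("172.16.0.0/16")
peer.session_down()
print(peer.routes)  # {'172.16.0.0/16': 'stale'}
peer.session_reestablished()
print(peer.routes)  # {'172.16.0.0/16': 'normal'}
```

If restart_timer_expired() is called while the routes are still stale, the table becomes empty, which corresponds to the case where the restarting speaker does not come back in time.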