BGP configuration

There is a list of elements you need to know when building a BGP configuration.

Basic elements for configuration

To establish a peering, a BGP configuration needs the local AS number, the remote AS number, and the remote IP address.

An AS (autonomous system) is a set of routers placed under a single administrative authority. There are public (assigned) ASes and private ASes. An AS is identified by a number called the ASN.

Public ASNs are all registered ASN values that are not private. They are assigned by a RIR. Private ASNs are drawn from 2 ranges reserved for local administration: 64512 through 65534, and 4200000000 through 4294967294.
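As a quick way to check which category an ASN belongs to, here is a minimal Python sketch. It assumes the private-use ranges reserved by RFC 6996 (64512 to 65534 and 4200000000 to 4294967294); the function name is_private_asn is purely illustrative and is not part of any product API.

```python
# Private-use ASN ranges as reserved by RFC 6996 (inclusive bounds).
PRIVATE_ASN_RANGES = (
    (64512, 65534),            # 16-bit private range
    (4200000000, 4294967294),  # 32-bit private range
)


def is_private_asn(asn: int) -> bool:
    """Return True when asn falls inside one of the private-use ranges."""
    if not 0 <= asn <= 4294967295:
        raise ValueError(f"ASN out of 32-bit range: {asn}")
    return any(low <= asn <= high for low, high in PRIVATE_ASN_RANGES)


# The ASNs used in this section's examples (65501, 65502) are private.
for asn in (65501, 65502, 3356):
    print(asn, "private" if is_private_asn(asn) else "not private")
```

Any ASN outside those two ranges is either publicly assignable or reserved for another purpose.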

BGP has been extended to exchange routing information not only for IPv4 routing tables, but also for other protocols such as IPv6, and for other purposes such as flowspec, L3VPN or L2VPN. The address-family command identifies the network protocol. It is made up of a pair of arguments: the AFI and the SAFI. For instance, the IPv4 unicast address family is enabled by default and carries IPv4 routing information.
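To make the AFI, SAFI pair more concrete, the sketch below maps a few address families to their numeric identifiers from the IANA registries. The dictionary keys and the afi_safi helper are illustrative names chosen for this example; they are not CLI keywords.

```python
from typing import Tuple

# Numeric identifiers from the IANA AFI and SAFI registries.
AFI = {"ipv4": 1, "ipv6": 2, "l2vpn": 25}
SAFI = {"unicast": 1, "evpn": 70, "vpn": 128, "flowspec": 133}


def afi_safi(family: str, sub_family: str) -> Tuple[int, int]:
    """Return the (AFI, SAFI) pair advertised for an address family."""
    return AFI[family], SAFI[sub_family]


print(afi_safi("ipv4", "unicast"))  # (1, 1)
print(afi_safi("ipv6", "unicast"))  # (2, 1)
print(afi_safi("ipv4", "vpn"))      # (1, 128)
```

Two BGP speakers exchange routes for an address family only when both have advertised the corresponding pair as a capability.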

The following example shows a sample BGP configuration with both the IPv4 (enabled by default) and IPv6 address families set:

vrf main
   routing
      bgp
         router-id 10.125.0.1
         as 65501
         neighbor 10.125.0.3
            remote-as 65502
            address-family ipv6-unicast
               ..
            ..
         commit

The same configuration can be expressed as NETCONF XML:

vsr running config# show config xml absolute vrf main routing bgp
<config xmlns="urn:6wind:vrouter">
  <vrf>
    <name>main</name>
    <routing xmlns="urn:6wind:vrouter/routing">
      <bgp xmlns="urn:6wind:vrouter/bgp">
        <router-id>10.125.0.1</router-id>
        <as>65501</as>
        <neighbor>
          <neighbor-address>10.125.0.3</neighbor-address>
          <address-family>
            <ipv6-unicast>
            </ipv6-unicast>
          </address-family>
          <remote-as>65502</remote-as>
        </neighbor>
      </bgp>
    </routing>
  </vrf>
</config>
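When such XML is processed by a script, the namespaces must be taken into account. The following Python sketch parses the document shown above with the standard library and lists each neighbor with its remote AS and enabled address families. The list_neighbors helper is a hypothetical name, and the approach is only one possible way to read the data.

```python
import xml.etree.ElementTree as ET

# XML copied from the "show config xml" output above.
CONFIG_XML = """<config xmlns="urn:6wind:vrouter">
  <vrf>
    <name>main</name>
    <routing xmlns="urn:6wind:vrouter/routing">
      <bgp xmlns="urn:6wind:vrouter/bgp">
        <router-id>10.125.0.1</router-id>
        <as>65501</as>
        <neighbor>
          <neighbor-address>10.125.0.3</neighbor-address>
          <address-family>
            <ipv6-unicast>
            </ipv6-unicast>
          </address-family>
          <remote-as>65502</remote-as>
        </neighbor>
      </bgp>
    </routing>
  </vrf>
</config>"""

# Elements below <bgp> inherit its default namespace.
BGP_NS = {"bgp": "urn:6wind:vrouter/bgp"}


def list_neighbors(xml_text):
    """Return (address, remote AS, address families) for each neighbor."""
    root = ET.fromstring(xml_text)
    result = []
    for neighbor in root.iterfind(".//bgp:neighbor", BGP_NS):
        address = neighbor.findtext("bgp:neighbor-address", namespaces=BGP_NS)
        remote_as = int(neighbor.findtext("bgp:remote-as", namespaces=BGP_NS))
        # Strip the "{namespace}" prefix from each address-family child tag.
        families = [
            child.tag.split("}", 1)[1]
            for child in neighbor.findall("bgp:address-family/*", BGP_NS)
        ]
        result.append((address, remote_as, families))
    return result


print(list_neighbors(CONFIG_XML))
```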

Address families differ from each other in subtle ways, and configuring several of them lets you benefit from the specifics of each.

For instance, the IPv6 unicast address family provides 2 IPv6 next hops: the link-local one and the global one.

Also, IPv4 VPN is the L3VPN combination for MPLS tunnels. While the routing information exchanged deals with inner IPv4 information, the MPLS VPN SAFI implies that the overlay is based on MPLS. The next-hop information stands for the underlay tunnel endpoint. Here, the next hop may be either IPv4 or IPv6, independently of the inner IPv4 prefix. The next-hop information also contains the MPLS label identifier.

Note

You can also disable BGP, either by deleting the configuration:

vrf main
    del routing bgp
    ..

Alternatively, if you want to disable BGP without losing the configuration, use the following command:

vrf main
    routing bgp
        enabled false

This method can be used to force peering with remote BGP speakers: disabling and then re-enabling BGP forces the peering to be re-established. The following command shows how the session with 10.125.0.3 alone is flushed:

flush bgp vrf main ipv4 unicast neighbor 10.125.0.3

Note that this command can also selectively flush parts of the routing tables, such as the Adj-RIB-In, by appending soft in at the end of the command. Another possibility is to disable and then re-enable the whole BGP instance:

vrf main
    routing bgp enabled false
    commit
    routing bgp enabled true
    commit

Basic BGP configuration

BGP configuration illustration with 3 BGP peerings

The above diagram depicts 3 devices, each running a BGP instance that peers with the other two. The configuration of the 3 devices is as follows:

rt1

routing bgp
    router-id 10.1.1.1
    as 65500
    neighbor 10.1.1.2 remote-as 65510
    neighbor 10.1.1.3 remote-as 65520
    address-family ipv4-unicast redistribute connected
    ..
    ..
interface
    physical eth1_0
       ipv4 address 192.168.1.0/24
       ..
    ..
    physical eth0_0
       ipv4 address 10.1.1.1/28
       ..
    ..

rt2

routing bgp
    router-id 10.1.1.2
    as 65510
    neighbor 10.1.1.1 remote-as 65500
    neighbor 10.1.1.3 remote-as 65520
    address-family ipv4-unicast redistribute connected
    ..
    ..
interface
    physical eth1_0
       ipv4 address 192.168.2.0/24
       ..
    ..
    physical eth0_0
       ipv4 address 10.1.1.2/28
       ..
    ..

rt3

routing bgp
    router-id 10.1.1.3
    as 65520
    neighbor 10.1.1.1 remote-as 65500
    neighbor 10.1.1.2 remote-as 65510
    address-family ipv4-unicast redistribute connected
    ..
    ..
interface
    physical eth1_0
       ipv4 address 192.168.3.0/24
       ..
    ..
    physical eth0_0
       ipv4 address 10.1.1.3/28
       ..
    ..

After the three configurations have been applied, the status of the BGP connections can be obtained. The peerings between the devices can be displayed with the following command:

rt1> show bgp summary

IPv4 Unicast Summary:
BGP router identifier 10.1.1.1, local AS number 65500 vrf-id 0
BGP table version 5
RIB entries 9, using 1368 bytes of memory
Peers 2, using 41 KiB of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
10.1.1.2        4      65510      17      17        0    0    0 00:09:08      4
10.1.1.3        4      65520      17      17        0    0    0 00:09:11      4

Total number of neighbors 2

When the BGP connection is established, the State/PfxRcd column shows the number of prefixes received from the neighbor; otherwise it shows the state of the BGP connection. The different BGP session states are covered later in this section. The following command gives detailed BGP information about a given neighbor:

rt1> show bgp neighbor 10.1.1.2
BGP neighbor is 10.1.1.2, remote AS 65510, local AS 65500, external link
Hostname: rt1
  BGP version 4, remote router ID 10.1.1.2
  BGP state = Established, up for 00:14:00
  Last read 00:01:00, Last write 00:01:00
  Hold time is 180, keepalive interval is 60 seconds
  Neighbor capabilities:
    4 Byte AS: advertised and received
    AddPath:
      IPv4 Unicast: RX advertised IPv4 Unicast and received
    Route refresh: advertised and received(old & new)
    Address Family IPv4 Unicast: advertised and received
    Hostname Capability: advertised (name: rt1,domain name: n/a)
          received (name: rt2,domain name: n/a)
    Graceful Restart Capabilty: advertised and received
      Remote Restart timer is 120 seconds
      Address families by peer:
        none
    Graceful restart informations:
      End-of-RIB send: IPv4 Unicast
      End-of-RIB received: IPv4 Unicast
    Message statistics:
      Inq depth is 0
      Outq depth is 0
                           Sent       Rcvd
      Opens                   1          1
      Notifications:          0          0
      Updates:                6          6
      Keepalives:            14         14
      Route Refresh:          0          0
      Capability:             0          0
      Total:                 21         21

It is also possible to dump the list of BGP entries that rt1 learnt from the other peers, by using the following command in configuration mode:

rt1> show bgp ipv4 unicast neighbors
 BGP table version is 5, local router ID is 10.1.1.1, vrf id 0
 Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
                i internal, r RIB-failure, S Stale, R Removed
 Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
 Origin codes:  i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
 *  10.0.2.0/24      10.1.1.2                 0             0 65510 ?
 *                   10.1.1.3                 0             0 65520 ?
 *>                  0.0.0.0                  0         32768 ?
 *  10.1.1.0/28      10.1.1.2                 0             0 65510 ?
 *                   10.1.1.3                 0             0 65520 ?
 *>                  0.0.0.0                  0         32768 ?
 *> 192.168.1.0      0.0.0.0                  0         32768 ?
 *  192.168.2.0      10.1.1.2                               0 65520 65510 ?
 *>                  10.1.1.2                 0             0 65510 ?
 *  192.168.3.0      10.1.1.3                               0 65510 65520 ?
 *>                  10.1.1.3                 0             0 65520 ?

 Displayed  5 routes and 11 total paths

Peer-groups

Peer groups help scale BGP deployments that involve many peers. Instead of configuring each peer one by one, it is possible to configure peer groups.

A peer group is defined by a name and accepts the same configuration as a single peer, except for the IP address of that peer.

You can use a peer group to define peers with outgoing peering, which inherit their configuration from the peer group. You can also use a peer group to accept incoming peering connections, like a server would do. The following configuration illustrates the usage of a peer group named group:

routing bgp
   listen
      neighbor-range 10.135.0.0/24 neighbor-group group
      ..
   as 65502
   neighbor-group group
      address-family
         ipv6-unicast
            ..
         ..
      remote-as 65501
      update-source 10.135.0.2
      ..
   neighbor 10.145.0.2
      neighbor-group group
         ..
       ..

By default, the BGP instance creates at most 100 peering connections. This configuration limit can be raised to support more peering connections:

routing bgp
   listen limit 5000
   ..

With this extra configuration, incoming BGP connections that match the listen settings are accepted up to the new limit. The set of accepted incoming connections can also be restricted to a range of IP addresses, as the neighbor-range statement in the example above illustrates.
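The interaction between the address range and the limit can be pictured with a small Python model. This is only a simplified sketch of the behavior described above, not the actual implementation: the accept_dynamic_peer function and its parameters are hypothetical, while the range and the default limit are taken from the examples in this section.

```python
import ipaddress

# Values taken from the peer group example and the default limit above.
LISTEN_RANGE = ipaddress.ip_network("10.135.0.0/24")
DEFAULT_LIMIT = 100


def accept_dynamic_peer(peer_ip, active_peers,
                        listen_range=LISTEN_RANGE, limit=DEFAULT_LIMIT):
    """Accept an incoming peer only if it is in range and under the limit."""
    in_range = ipaddress.ip_address(peer_ip) in listen_range
    return in_range and active_peers < limit


print(accept_dynamic_peer("10.135.0.7", active_peers=0))    # True
print(accept_dynamic_peer("10.136.0.7", active_peers=0))    # False
print(accept_dynamic_peer("10.135.0.7", active_peers=100))  # False
```

Raising the limit to 5000, as in the listen limit example, would let the third call succeed.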

Note

When handling a huge number of peering connections, for example more than 5000, you may experience timeouts on the BGP connections. This is due to the increased time needed to process the incoming BGP packets. This time can be reduced by increasing the number of BGP packets processed per I/O cycle. When a huge number of peering connections is used, the following configuration is recommended:

routing bgp
   packet-rw-quantum read 64
   packet-rw-quantum write 200
   ..

As you can see, the packet-rw-quantum write value has been increased accordingly. It is recommended to keep that value above the packet-rw-quantum read value, to avoid memory exhaustion caused by buffers accumulating in the BGP process.

Note

With such a large number of sessions, it may be desirable for the server to detect the failure of a remote endpoint as quickly as possible. One possibility is to enable the tcp-keepalive parameter. TCP keepalive probes must be acknowledged by the remote TCP endpoint with an ACK packet. After a defined number of consecutive probes without an ACK, the TCP session is considered down. The following configuration declares a failure after 3 unanswered probes sent every 2 seconds.

routing bgp
   tcp-keepalive idle 2 keepalive 2 probes 3
   ..
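To estimate how long such a setting takes to detect a dead peer, the sketch below adds up the timers. It assumes the usual Linux TCP keepalive semantics (a first probe after the idle time, then one probe per interval until the probe count is exhausted); the function name is illustrative and the result is an approximation, not a guaranteed bound.

```python
def tcp_keepalive_detection_time(idle: int, interval: int, probes: int) -> int:
    """Approximate seconds between the last received data and the session drop."""
    # The connection stays silent for `idle` seconds, then `probes` probes are
    # sent `interval` seconds apart before the session is declared down.
    return idle + interval * probes


# Values from the configuration above: idle 2, keepalive 2, probes 3.
print(tcp_keepalive_detection_time(idle=2, interval=2, probes=3))  # 8
```

With these values a failure is detected in roughly 8 seconds, well below the default BGP hold time of 180 seconds shown in the neighbor output earlier in this section.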

Route-Reflector

A route reflector is used in iBGP networks where the number of BGP peers becomes too large for full-mesh peering. Instead, a 1-to-N peering topology is used: a single BGP instance (or two, when a backup is needed) acts as the route reflector server, receiving BGP updates from the route reflector clients and reflecting them to the other clients. This allows such setups to scale. Creating a route reflector server consists in defining an iBGP peering session, either through a peer group or by defining a peer directly, with the route-reflector-client option set to true.

as 65501
neighbor-group group
   address-family
      ipv4-unicast
         route-reflector-client true
         ..
      ..
   remote-as 65501
neighbor 1.1.1.1
   address-family
      ipv4-unicast
         route-reflector-client true
         ..
      ..
   remote-as 65501
   ..

There is no need to add extra configuration to the iBGP clients.
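The scaling benefit can be quantified by counting iBGP sessions. The sketch below compares a full mesh with a route reflector topology, assuming every client peers with every reflector and the reflectors peer with each other; the function names are illustrative.

```python
def full_mesh_sessions(routers: int) -> int:
    """Number of iBGP sessions when every router peers with every other one."""
    return routers * (routers - 1) // 2


def route_reflector_sessions(routers: int, reflectors: int = 1) -> int:
    """Number of iBGP sessions when clients only peer with the reflectors."""
    clients = routers - reflectors
    # Each client peers with each reflector; reflectors are fully meshed.
    return clients * reflectors + full_mesh_sessions(reflectors)


print(full_mesh_sessions(50))           # 1225
print(route_reflector_sessions(50))     # 49
print(route_reflector_sessions(50, 2))  # 97
```

For 50 routers, a single route reflector reduces the session count from 1225 to 49, and adding a backup reflector still requires only 97 sessions.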

Multipath

Overview

The BGP path selection process involves a series of criteria to determine the best path for a given prefix. Here is a summarized and ordered list of these criteria:

  1. Highest Weight: Prefers the path with the highest weight.

  2. Highest Local Preference: Routes with higher local preference are favored.

  3. Originated Locally: Prefers routes originated by the local router.

  4. Shortest AS-path: Chooses the route with the fewest AS hops.

  5. Lowest Origin Code: Prefers IGP origins over EGP, and EGP over Incomplete.

  6. Lowest MED: Selects the path with the smallest MED, but only compares MED for routes from the same AS.

  7. eBGP over iBGP: Prefers external BGP paths over internal.

  8. Shortest IGP Path to BGP Next Hop: Chooses the nearest next hop according to IGP metrics.

  9. Oldest Path for eBGP: Prefers the oldest path from external BGP peers for stability.

  10. Lowest Router ID: Selects the path through the BGP router with the lowest router ID.

  11. Minimum Cluster List Length: In route reflection, prefers the path with the shortest cluster list.

  12. Lowest Neighbor Address: Finally, the path from the neighbor with the lowest IP address is chosen if all other criteria are equal.
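The first steps of this ordered comparison can be sketched as a sort key in Python. This is a deliberately simplified model covering criteria 1 to 5 only (MED, eBGP versus iBGP, IGP metric and the tie-breakers are left out), and it is not the router's implementation. The Path class, its field names, and the default local preference of 100 are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Tuple

# Lower rank is preferred (criterion 5).
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Path:
    next_hop: str
    weight: int = 0
    local_pref: int = 100          # assumed default for the example
    locally_originated: bool = False
    as_path: Tuple[int, ...] = ()
    origin: str = "incomplete"


def preference_key(path: Path):
    """Tuple that sorts the most preferred path first (criteria 1 to 5)."""
    return (
        -path.weight,                  # 1. highest weight
        -path.local_pref,              # 2. highest local preference
        not path.locally_originated,   # 3. locally originated first
        len(path.as_path),             # 4. shortest AS-path
        ORIGIN_RANK[path.origin],      # 5. lowest origin code
    )


def best_path(paths):
    return min(paths, key=preference_key)


# Candidate paths for 10.1.1.0/28 as seen on rt1 in the table above.
candidates = [
    Path("10.1.1.2", as_path=(65510,)),
    Path("10.1.1.3", as_path=(65520,)),
    Path("0.0.0.0", weight=32768, locally_originated=True),
]
print(best_path(candidates).next_hop)  # 0.0.0.0
```

The locally redistributed route wins on weight alone, which matches the entry marked as best in the rt1 output.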

Traditionally in BGP, the selection process chooses only one route to be installed into the RIB. This approach is not optimal, as it uses only one path at a time. BGP multipath is a feature that allows multiple paths to be selected simultaneously, enabling traffic to be distributed across these paths. The paths included in the multipath set are those that cannot be told apart by criteria 1 (Weight) through 8 (Shortest IGP Path to BGP Next Hop). They must also either have the same AS-path for iBGP paths or be received from the same AS for eBGP paths. This technique improves bandwidth usage and provides redundancy by using all eligible routes equally (ECMP) or unequally (UCMP). BGP multipath promotes load balancing and improves network reliability by rerouting traffic if a path fails.

By default, BGP can install up to 8 paths for a given prefix into the RIB. This limit can be customized for each address family, separately for iBGP and eBGP sessions. Setting the maximum-path value to 1 effectively disables the multipath feature.

router-id 10.125.0.1
address-family
   ipv4-unicast
      maximum-path
         ebgp 1
         ibgp 1
         ..
      ..
   ..
as 65501

Specific parameters can be used to relax the criteria for including paths in the BGP multipath selection. For example, the command bestpath as-path multipath-relax as-set relaxes the requirement for paths to have the same AS-path list (for iBGP) or to originate from the same AS (for eBGP). However, the condition of having the shortest AS-path is still enforced, meaning the paths must have the same number of AS hops as the best path.

bestpath
   as-path
      multipath-relax as-set
      ..
   ..
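The effect of the relaxed rule on eBGP paths can be illustrated with a short Python sketch. It assumes that the candidate already ties with the best path on all other criteria, and it treats the first AS in the AS-path as the neighboring AS. The function name and parameters are hypothetical, and the model is a simplification of the behavior described above.

```python
from typing import Sequence


def ebgp_multipath_eligible(best_as_path: Sequence[int],
                            candidate_as_path: Sequence[int],
                            multipath_relax: bool = False) -> bool:
    """Tell whether a candidate eBGP path may join the best path's multipath set."""
    # The AS-path length must always match the best path.
    if len(best_as_path) != len(candidate_as_path):
        return False
    if multipath_relax:
        return True
    # Default rule: both paths must be received from the same neighboring AS.
    return list(best_as_path[:1]) == list(candidate_as_path[:1])


print(ebgp_multipath_eligible([65510], [65520]))                        # False
print(ebgp_multipath_eligible([65510], [65520], multipath_relax=True))  # True
```

A candidate with a longer AS-path stays excluded in both modes, since only the same-AS requirement is relaxed.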

Another command, bestpath as-path ignore true, goes further by removing the requirement for paths to have the same number of ASes in the AS-path (criterion 4 in the list above). With this setting, the AS-path is entirely excluded from the path comparison process:

bestpath
   as-path
      ignore true
      ..
   ..

Refer to the command reference for more details.

BGP Add-Path

The BGP Additional Paths feature, commonly referred to as BGP Add-Path, enhances the capabilities of BGP by allowing the advertisement of multiple paths for the same prefix in a BGP session. By default, BGP installs multiple paths to a prefix into the RIB but only advertises the best path according to the above-mentioned criteria. The Add-Path feature, however, enables the advertisement of additional paths using the addpath commands. For example, the command addpath tx-all-paths true configures BGP to advertise all available paths.

neighbor 10.125.0.3
   remote-as 65502
   address-family
      ipv4-unicast
         addpath
            tx-all-paths true
            ..
         ..
      ..
   ..

Refer to the command reference for more details.

Unequal Cost Multipath

BGP Unequal Cost Multipath (UCMP) is an extension to the BGP multipath feature that allows load balancing traffic across multiple paths that have different bandwidth capacities. In standard networking practice, ECMP is typically used when paths have the same cost metric, meaning that the traffic can be distributed evenly among them. However, this is not optimal when the paths have different bandwidths or capacities.

BGP UCMP improves upon this by allowing routers to distribute traffic in proportion to the bandwidth of each available path. This means that a higher-capacity link can carry more traffic, which can improve overall network efficiency and utilization.
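As a worked illustration of proportional distribution, the Python sketch below derives the traffic share of each path from its advertised bandwidth. The ucmp_shares function and the bandwidth figures are hypothetical; the way a router actually converts bandwidth values into forwarding weights may differ.

```python
def ucmp_shares(bandwidths):
    """Return the fraction of traffic sent to each next hop."""
    total = sum(bandwidths.values())
    if total <= 0:
        raise ValueError("at least one path must advertise a positive bandwidth")
    return {next_hop: bw / total for next_hop, bw in bandwidths.items()}


# Hypothetical bandwidths (in Mbit/s) advertised for two paths to one prefix.
shares = ucmp_shares({"10.1.1.2": 10_000, "10.1.1.3": 40_000})
print(shares)  # {'10.1.1.2': 0.2, '10.1.1.3': 0.8}
```

With ECMP, each of the two paths would carry half of the traffic regardless of its capacity.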

The implementation of UCMP involves:

  • Path Advertisement: Routers advertise paths along with their respective bandwidth capabilities using a special bandwidth extended-community.

  • Path Selection: When multiple paths to the same destination are available, the router can use the bandwidth information to decide how much traffic to send over each path.

BGP UCMP can be particularly useful in environments where there are significant differences in link bandwidths and where bandwidth optimization and efficient load balancing are critical.

Warning

The current UCMP implementation is based on the IETF standard draft “draft-ietf-idr-link-bandwidth-07”. As the standard has not yet been finalized, interoperability of the UCMP feature is only guaranteed between Virtual Service Router routers. Routers from other vendors might not recognize the bandwidth extended-community correctly and could default to applying ECMP instead.

Bandwidth advertisement

Bandwidth information can be advertised using the bandwidth extended-community, which can be configured via a route-map. Either num-multipaths or a bandwidth value can be set. Refer to the command reference for details.

/ routing route-map BANDWIDTH seq 10 set extcommunity bandwidth <option>

The route-map is typically applied to the outbound configuration of a neighbor.

Path selection

By default, Virtual Service Router applies UCMP when all the paths for a given prefix include a bandwidth extended-community value. If this condition is not met, ECMP is performed instead. This behavior can be customized with the following command:

/ vrf <vrf> routing bgp bestpath extcommunity-bandwidth behavior <option>

The options are detailed in the command reference.