3.6. External Tools
3.6.1. perf
For meaningful results, perf requires the fast path executable to be built with debug information and not stripped.
Use the following command to see which functions the CPU spends the most time in:
# perf top
Samples: 794K of event 'cycles', Event count (approx.): 501635499700
Overhead Shared Object Symbol
39.71% fp-rte [.] fpn_main_loop
17.91% fp-rte [.] ixgbe_recv_pkts_lro_bulk_alloc
8.40% fp-rte [.] fp_ip_input
7.09% fp-rte [.] ixgbe_xmit_pkts
2.85% librte_crypto.so [.] rte_crypto_poll
2.66% fp-rte [.] fpn_crypto_generic_poll
2.31% fp-rte [.] fp_ip_if_send
2.25% fp-rte [.] fp_ether_input
2.20% fp-rte [.] fpn_intercore_drain
2.16% librte_crypto_multibuffer.so [.] 0x00000000000052f2
1.95% fp-rte [.] fp_if_output
1.79% fp-rte [.] fp_process_input_bulk
1.68% fp-rte [.] fpn_crypto_poll
1.60% ld-2.17.so [.] __tls_get_addr
0.90% fp-rte [.] fpn_recv_exception
0.89% fp-rte [.] fp_ether_output
0.66% librte_crypto_multibuffer.so [.] 0x00000000000052eb
0.44% librte_crypto_multibuffer.so [.] 0x0000000000005608
0.34% librte_crypto_multibuffer.so [.] 0x00000000000052cc
0.27% librte_crypto_multibuffer.so [.] __tls_get_addr@plt
0.21% librte_crypto_multibuffer.so [.] 0x0000000000005601
Note
perf can only be used if the fast path runs as a userland process. This is typically the case on Intel and Arm platforms, but not on Octeon.
Refer to the perf man page for specific options.
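When perf output has been saved or piped, the per-function percentages can be rolled up by shared object, which shows how much time falls in the fast path binary compared with the crypto libraries. The following is a minimal sketch, assuming the three-column layout shown above (Overhead, Shared Object, Symbol). The helper name perf_overhead_by_dso is hypothetical and not part of perf, and the process name fp-rte in the usage comment is an assumption taken from the Shared Object column.

```shell
# Hypothetical helper (not part of perf): sum the Overhead column per shared object.
# Assumes rows shaped like "39.71% fp-rte [.] fpn_main_loop", as in the perf top output above.
perf_overhead_by_dso() {
    LC_ALL=C awk '
        $1 ~ /^[0-9.]+%$/ {          # keep only rows that start with a percentage
            sub(/%/, "", $1)         # drop the percent sign so the value is numeric
            total[$2] += $1          # accumulate overhead by shared object
        }
        END {
            for (dso in total)
                printf "%.2f%% %s\n", total[dso], dso
        }
    ' | LC_ALL=C sort -rn            # largest share first
}

# Example use on a recorded profile (assumes the fast path process is named fp-rte):
#   perf record -p "$(pidof fp-rte)" -- sleep 10
#   perf report --stdio --sort dso,symbol | perf_overhead_by_dso
```

perf can also group entries natively with the --sort dso option. The helper is only useful for post-processing output that has already been captured.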
3.6.2. strace
strace displays the system calls made by a given program. Use it to get a first impression of what the program spends its time on. For instance, you can see the netlink messages handled by the cache manager:
# strace -p $(pidof cmgrd)
Process 5350 attached
setsockopt(11, SOL_SOCKET, SO_SNDBUF, [32768], 4) = 0
setsockopt(11, SOL_SOCKET, SO_RCVBUF, [32768], 4) = 0
bind(11, {sa_family=AF_NETLINK, pid=-2076175130, groups=00000000}, 12) = 0
getsockname(11, {sa_family=AF_NETLINK, pid=-2076175130, groups=00000000}, [12]) = 0
sendmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\34\0\0\0\20\0\5\0\204\315jV\346\24@\204\3\1\0\0\10\0\2\0vrf\0", 28}], msg_controllen=0, msg_flags=0}, 0) = 28
recvmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\320\0\0\0\20\0\0\0\204\315jV\346\24@\204\1\2\0\0\10\0\2\0vrf\0\6\0\1\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 208
recvmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"$\0\0\0\2\0\0\0\204\315jV\346\24@\204\0\0\0\0\34\0\0\0\20\0\5\0\204\315jV"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 36
sendmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"\24\0\0\0\33\0\5\3\205\315jV\346\24@\204\1\0\0\0", 20}], msg_controllen=0, msg_flags=0}, 0) = 20
recvmsg(11, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{",\0\0\0\33\0\2\0\205\315jV\346\24@\204\2\1\0\0\10\0\1\0\0\0\0\0\r\0\2\0"..., 16384}], msg_controllen=0, msg_flags=0}, 0) = 44
epoll_wait(4,
^C
Process 5350 detached
<detached ...>
Note
Refer to the strace man page for specific options.
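A long trace is easier to read once it is reduced to a count per system call. The following is a minimal sketch, assuming the trace was captured without the -f option, so that each call starts at the beginning of a line. The helper name count_syscalls and the file path /tmp/cmgrd.trace are hypothetical and chosen only for illustration.

```shell
# Hypothetical helper (not part of strace): count calls per system call name.
# Assumes plain "strace -p" output without -f, so each call starts at column 0.
count_syscalls() {
    LC_ALL=C awk -F'(' '
        /^[a-z_0-9]+\(/ { calls[$1]++ }   # e.g. "sendmsg(11, ..." counts as sendmsg
        END { for (name in calls) printf "%d %s\n", calls[name], name }
    ' | LC_ALL=C sort -rn                 # most frequent call first
}

# Example use (cmgrd is the cache manager process traced above):
#   strace -o /tmp/cmgrd.trace -p "$(pidof cmgrd)"    # stop with Ctrl-C
#   count_syscalls < /tmp/cmgrd.trace
```

strace can produce a similar summary itself with the -c option, which also reports the time spent and the number of errors per call.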