As a Linux system administrator, having a solid grasp of performance monitoring and troubleshooting commands is essential. When issues arise, you need to quickly diagnose problems to restore optimal performance.
This guide covers 12 commands for analyzing resource utilization, pinpointing bottlenecks, and tuning Linux systems. Whether you manage a single server or an enterprise infrastructure, these tools show where CPU, memory, disk, and network capacity is going.
1. top
The top command provides a dynamic, real-time view of overall system performance. It displays critical information like:
- CPU usage
- Memory and swap usage
- Processes by resource utilization
- Load averages
top gives both a high-level summary and per-process detail for drilling down on issues. Sorting by CPU or memory usage (press P or M while it is running) helps identify resource hogs.
top - 13:05:45 up 21 days, 14:22, 3 users, load average: 0.00, 0.01, 0.05
Tasks: 291 total, 1 running, 290 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.3 sy, 0.0 ni, 99.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 2519048 total, 14456 free, 1665348 used, 852244 buff/cache
KiB Swap: 2097148 total, 2097148 free, 0 used. 1925740 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
4402 mysql 20 0 5535840 1.089g 2140 S 6.6 43.9 3631:58 mysqld
8249 root 20 0 105516 3244 2796 R 6.6 0.1 0:00.07 top
1 root 20 0 185380 12484 9280 S 0.0 0.5 0:04.82 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:05.35 ksoftirqd/0
...
With top, you can interactively kill processes, change priorities, and monitor in real time. It's an administrator's best friend for taming a misbehaving system.
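For unattended checks, top can also print a single snapshot in batch mode (top -b -n 1), and the header line is easy to parse. The sketch below is illustrative only: the function names are made up for this example, the CPU count of 2 is an assumed value, and "load above CPU count" is a rule of thumb rather than a hard limit.

```python
import re

def parse_load_averages(header):
    """Return the 1-, 5-, and 15-minute load averages from top's first line."""
    match = re.search(r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header)
    if match is None:
        raise ValueError("no load average found in: %r" % header)
    return tuple(float(value) for value in match.groups())

def is_overloaded(load_1min, cpu_count):
    """Rule of thumb only: load above the CPU count suggests runnable work is queuing."""
    return load_1min > cpu_count

# Header line copied from the sample output above.
header = "top - 13:05:45 up 21 days, 14:22, 3 users, load average: 0.00, 0.01, 0.05"
one, five, fifteen = parse_load_averages(header)

# cpu_count=2 is an assumed value for illustration.
print(one, five, fifteen, is_overloaded(one, cpu_count=2))
```

Comparing the 1-minute figure with the 15-minute figure also hints at whether load is rising or falling.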
2. vmstat
The vmstat tool reports virtual memory statistics along with a compact summary of overall system activity. It covers:
- Processes
- CPU utilization
- Memory
- Swap
- Disk I/O
- System interrupts
This concise summary helps identify any resource bottlenecks or anomalies.
vmstat
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
34 0 0 200889792 73708 591828 0 0 0 5 6 10 2 1 97 0 0
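With vmstat you can snapshot system resource utilization and performance. Run without arguments, it prints a single line of averages since boot; pass an interval (for example, vmstat 1) to watch current activity. It's invaluable when troubleshooting.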
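To show how the columns map to the resources listed above, here is a minimal sketch that turns the sample output into a dictionary and applies two common rules of thumb. The function names and the thresholds are assumptions made for this example, not part of vmstat itself.

```python
def parse_vmstat(output):
    """Map vmstat column names to the integer values of the last sample row."""
    rows = [line.split() for line in output.splitlines() if line.strip()]
    # The column header is the row whose first field is "r" (run queue).
    columns = next(tokens for tokens in rows if tokens[0] == "r")
    return dict(zip(columns, map(int, rows[-1])))

def vmstat_warnings(stats, cpu_count):
    """Assumed rules of thumb; tune the thresholds for your own systems."""
    found = []
    if stats["r"] > cpu_count:
        found.append("run queue longer than CPU count")
    if stats["si"] > 0 or stats["so"] > 0:
        found.append("system is swapping")
    return found

# Sample output copied from above.
sample = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
r b swpd free buff cache si so bi bo in cs us sy id wa st
34 0 0 200889792 73708 591828 0 0 0 5 6 10 2 1 97 0 0
"""
stats = parse_vmstat(sample)
print(stats["r"], stats["id"], vmstat_warnings(stats, cpu_count=2))
```

The r column counts runnable processes, so a value that stays well above the number of CPUs points at CPU contention, while nonzero si and so values mean pages are moving to and from swap.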
3. iostat
The iostat tool reports detailed storage I/O statistics. This includes:
- Disk throughput (reads/writes)
- Request queues
- CPU utilization
- Network filesystem throughput (in older sysstat releases; newer ones leave NFS statistics to nfsiostat)
iostat highlights storage bottlenecks and slow disks. It's perfect for diagnosing laggy I/O-bound systems.
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/10/22 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 0.12 0.04 0.00 99.57
Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 0.51 4.52 12.93 2250424 6448184
Run without arguments, as here, iostat reports averages since boot. Supply an interval (for example, iostat 1) to see what the disks are doing right now.
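One figure worth deriving from these columns is the average size of each I/O request: throughput divided by requests per second. The sketch below works on the sample output above; the function names are hypothetical, and the parser assumes the plain (non-extended) device table layout shown.

```python
def parse_iostat_devices(output):
    """Return {device: {column: value}} from the device table of plain iostat output."""
    devices, columns = {}, None
    for line in output.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("avg-cpu"):
            columns = None  # a blank line or CPU section ends any device table
        elif tokens[0] in ("Device", "Device:"):
            columns = tokens[1:]
        elif columns and len(tokens) == len(columns) + 1:
            devices[tokens[0]] = dict(zip(columns, map(float, tokens[1:])))
    return devices

def avg_request_kb(stats):
    """Average kB moved per I/O request."""
    if stats["tps"] == 0:
        return 0.0
    return (stats["kB_read/s"] + stats["kB_wrtn/s"]) / stats["tps"]

# Sample output copied from above.
sample = """\
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/10/22 _x86_64_ (2 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
0.26 0.00 0.12 0.04 0.00 99.57
Device tps kB_read/s kB_wrtn/s kB_read kB_wrtn
nvme0n1 0.51 4.52 12.93 2250424 6448184
"""
nvme = parse_iostat_devices(sample)["nvme0n1"]
print(round(avg_request_kb(nvme), 1), "kB per request")
```

For the sample device this is (4.52 + 12.93) / 0.51, or roughly 34 kB per request. Very small averages at a high tps tend to indicate random I/O, which is harder on spinning disks than the raw throughput numbers suggest.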
4. sar
The sar tool collects and reports on system activity information. This includes:
- CPU
- Memory
- Network
- Disk I/O
- System load
It's like a detailed record of your entire system's performance and health. Because sar can report on data collected in the background over hours or days, it makes performance trends easy to identify.
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/10/22 _x86_64_ (2 CPU)
12:10:01 CPU %user %nice %system %iowait %steal %idle
12:20:01 all 1.60 0.00 0.30 0.10 0.00 98.00
Average: CPU %user %nice %system %iowait %steal %idle
all 1.55 0.00 0.27 0.06 0.00 98.12
sar provides a detailed system resource utilization report to help uncover performance problems.
5. lsof
The lsof command lists open files and the processes using them. This allows you to:
- See which files a busy or stuck process has open
- Find processes locking files
- Check network connections
It's invaluable for troubleshooting stubborn performance issues or resource conflicts.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mysqld 4402 mysql cwd DIR 8,6 4096 2 /
mysqld 4402 mysql rtd DIR 8,6 4096 2 /
mysqld 4402 mysql txt REG 8,6 158173184 6271619 /usr/sbin/mysqld
mysqld 4402 mysql DEL REG 0,5 525368 /dev/zero
mysqld 4402 mysql mem REG 8,6 1884160 6332994 /dev/urandom
lsof empowers you to see what's happening behind the scenes at a file level, which is critical knowledge when hunting down problems.
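When a process holds hundreds of files, a summary is easier to read than the raw listing. The following sketch counts entries by the TYPE column of the sample output above. It assumes the nine-column default layout shown and command names without spaces; count_by_type is a name invented for this example.

```python
from collections import Counter

def count_by_type(output):
    """Count open files per TYPE column (DIR, REG, ...) in default lsof output."""
    counts = Counter()
    for line in output.splitlines():
        tokens = line.split()
        if len(tokens) < 5 or tokens[0] == "COMMAND":
            continue  # skip blank lines and the header row
        counts[tokens[4]] += 1  # columns: COMMAND PID USER FD TYPE ...
    return counts

# Sample output copied from above.
sample = """\
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
mysqld 4402 mysql cwd DIR 8,6 4096 2 /
mysqld 4402 mysql rtd DIR 8,6 4096 2 /
mysqld 4402 mysql txt REG 8,6 158173184 6271619 /usr/sbin/mysqld
mysqld 4402 mysql DEL REG 0,5 525368 /dev/zero
mysqld 4402 mysql mem REG 8,6 1884160 6332994 /dev/urandom
"""
counts = count_by_type(sample)
print(counts.most_common())
```

A count of socket or regular-file entries that keeps climbing between runs is a common sign of a descriptor leak.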
6. iotop
The iotop tool displays live I/O usage information and is perfect for identifying disk I/O bottlenecks. It shows:
- I/O bandwidth per process
- Total DISK READ and WRITE
With iotop you can immediately see the processes and threads consuming the most disk I/O to troubleshoot slowdowns.
Total DISK READ: 0.00 B/s | Total DISK WRITE: 513.59 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
20410 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/vda1-8]
20411 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [ext4-rsv-conver]
63 be/4 root 0.00 B/s 0.00 B/s 0.00 % 0.00 % [kworker/3:1]
iotop gives you the real-time disk I/O visibility to quickly troubleshoot storage performance problems.
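iotop prints rates in human-readable units, which makes them awkward to compare or graph. As a rough sketch, the summary line can be normalized to bytes per second. The helper names are hypothetical, and the unit table assumes iotop's K, M, and G suffixes are powers of 1024.

```python
# Assumed multipliers: iotop's unit suffixes treated as powers of 1024.
UNIT_BYTES = {"B/s": 1, "K/s": 1024, "M/s": 1024 ** 2, "G/s": 1024 ** 3}

def to_bytes_per_sec(value, unit):
    """Convert a rate such as ('513.59', 'K/s') to bytes per second."""
    return float(value) * UNIT_BYTES[unit]

def parse_iotop_summary(line):
    """Turn 'Label: 1.00 K/s | Label: ...' into {label: bytes_per_second}."""
    rates = {}
    for part in line.split("|"):
        label, _, rate = part.partition(":")
        value, unit = rate.split()
        rates[label.strip()] = to_bytes_per_sec(value, unit)
    return rates

# First summary line copied from the sample output above.
summary = "Total DISK READ: 0.00 B/s | Total DISK WRITE: 513.59 K/s"
rates = parse_iotop_summary(summary)
print(rates)
```

The same conversion can be applied to the per-process DISK READ and DISK WRITE columns to rank processes numerically.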
7. pidstat
The pidstat tool monitors and reports on process CPU, memory, disk I/O, and other activity. It's invaluable for drilling down on application performance.
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/14/22 _x86_64_ (2 CPU)
07:03:08 PM UID PID %usr %system %guest %CPU CPU Command
07:03:08 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0
07:03:08 PM 0 43 0.00 1.89 0.00 1.89 1 rcu_sched
07:03:08 PM 0 6 0.00 1.89 0.00 1.89 1 ksoftirqd/1
pidstat provides detailed process usage statistics to help optimize application performance.
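Because pidstat emits one row per process per interval, ranking the rows is often the first step. Below is a minimal sketch that parses the sample output above and sorts it by %CPU. It assumes the header contains UID and PID columns, as in the sample, and aligns fields from the right so that the timestamp (with or without AM/PM) does not matter; parse_pidstat is an illustrative name.

```python
def parse_pidstat(output):
    """Return one dict per process row, keyed by the column names in the header."""
    rows, columns = [], None
    for line in output.splitlines():
        tokens = line.split()
        if "PID" in tokens:
            columns = tokens[tokens.index("UID"):]  # drop the timestamp fields
        elif columns and len(tokens) >= len(columns):
            # Align from the right so leading timestamp fields are ignored.
            rows.append(dict(zip(columns, tokens[-len(columns):])))
    return rows

# Sample output copied from above.
sample = """\
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/14/22 _x86_64_ (2 CPU)
07:03:08 PM UID PID %usr %system %guest %CPU CPU Command
07:03:08 PM 0 9 0.00 0.94 0.00 0.94 1 rcuos/0
07:03:08 PM 0 43 0.00 1.89 0.00 1.89 1 rcu_sched
07:03:08 PM 0 6 0.00 1.89 0.00 1.89 1 ksoftirqd/1
"""
rows = parse_pidstat(sample)
busiest = sorted(rows, key=lambda row: float(row["%CPU"]), reverse=True)
for row in busiest:
    print(row["PID"], row["Command"], row["%CPU"])
```

Swapping the sort key for %system or %usr separates kernel-heavy processes from those burning CPU in user space.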
8. mpstat
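The mpstat utility reports processor statistics, overall and per CPU. This includes:
- Utilization split into user, system, and I/O wait time
- Time spent servicing hardware and software interrupts
- Steal, guest, and idle time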
It highlights CPU bottlenecks and anomalies that impair system performance.
Linux 5.4.0-1044-aws (ip-10-0-1-115) 02/14/22 _x86_64_ (2 CPU)
02:18:48 PM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
02:18:48 PM all 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 100.00
mpstat provides the CPU-level statistics to help identify and troubleshoot processor bottlenecks.
9. free
The free command displays system memory usage statistics. This includes:
- Total, used, and free memory
- Buffers and cache
- Swap space
It offers a quick overview of memory usage to help catch shortages before they cause system problems.
total used free shared buff/cache available
Mem: 1015072 159304 64936 10748 823832 713316
Swap: 1048572 0 1048572
free gives a quick read on memory shortages that can severely impact performance.
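The free column alone can be misleading, because memory used for buffers and cache can usually be reclaimed. The short sketch below works through the arithmetic on the sample output above, assuming plain free output in kibibytes (not free -h); the function names are made up for this example.

```python
def parse_free(output):
    """Return {'Mem': {...}, 'Swap': {...}} with integer values per column."""
    rows, columns = {}, None
    for line in output.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if tokens[0] == "total":
            columns = tokens  # header row: total used free shared buff/cache available
        elif columns and tokens[0].endswith(":"):
            rows[tokens[0].rstrip(":")] = dict(zip(columns, map(int, tokens[1:])))
    return rows

def percent(part, whole):
    return 100.0 * part / whole if whole else 0.0

# Sample output copied from above.
sample = """\
total used free shared buff/cache available
Mem: 1015072 159304 64936 10748 823832 713316
Swap: 1048572 0 1048572
"""
mem = parse_free(sample)["Mem"]
print("free: %.1f%%" % percent(mem["free"], mem["total"]))
print("available: %.1f%%" % percent(mem["available"], mem["total"]))
```

In the sample, only about 6% of memory is strictly free, yet about 70% is available to new workloads. The available column is the better indicator of real headroom.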
10. netstat
The netstat tool displays network connections, routing tables, interface statistics, and more. This vital insight can help troubleshoot network performance issues.
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 ip-10-0-0-111.ec2.:ssh ip-10-0-1-23.ec2.:65038 ESTABLISHED
tcp 0 0 ip-10-0-0-111.ec2.:ssh ip-10-0-1-23.ec2.:40718 ESTABLISHED
tcp 0 0 ip-10-0-0-111.ec2.:web ip-10-0-1-23.ec2.:57518 TIME_WAIT
netstat provides the network visibility to help track down connectivity and bandwidth issues impacting performance.
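On a busy server the connection list runs to thousands of lines, so a tally by state is usually more telling than the list itself. Here is a small sketch over the sample output above; count_tcp_states is a hypothetical helper, and it assumes the state is the sixth field of each tcp line, as shown.

```python
from collections import Counter

def count_tcp_states(output):
    """Count TCP connections per state (ESTABLISHED, TIME_WAIT, ...)."""
    counts = Counter()
    for line in output.splitlines():
        tokens = line.split()
        # Fields: Proto Recv-Q Send-Q Local Foreign State
        if len(tokens) >= 6 and tokens[0].startswith("tcp"):
            counts[tokens[5]] += 1
    return counts

# Sample output copied from above.
sample = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 ip-10-0-0-111.ec2.:ssh ip-10-0-1-23.ec2.:65038 ESTABLISHED
tcp 0 0 ip-10-0-0-111.ec2.:ssh ip-10-0-1-23.ec2.:40718 ESTABLISHED
tcp 0 0 ip-10-0-0-111.ec2.:web ip-10-0-1-23.ec2.:57518 TIME_WAIT
"""
states = count_tcp_states(sample)
print(dict(states))
```

A large and growing TIME_WAIT or CLOSE_WAIT count is often the first visible symptom of connection churn or an application that is not closing sockets.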
11. tcpdump
The tcpdump tool captures network traffic for analysis. This enables deep inspection of network communications to troubleshoot a wide array of performance issues.
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:12:51.266986 IP ip-10-0-0-111.ec2.49362 > ip-10-0-1-23.ec2.ssh: Flags [P.], seq 1:60, ack 1, win 502, options [nop,nop,TS val 24606695 ecr 24606124], length 59
17:12:51.267164 IP ip-10-0-1-23.ec2.ssh > ip-10-0-0-111.ec2.49362: Flags [.], ack 60, win 501, options [nop,nop,TS val 24606695 ecr 24606124], length 0
17:12:51.718483 IP ip-10-0-1-23.ec2.57520 > ip-10-0-0-111.ec2.web: Flags [.], ack 1, win 229, options [nop,nop,TS val 24606809 ecr 24606695], length 0
tcpdump provides network packet capture and inspection to help analyze bandwidth and connectivity issues.
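For a quick view of who is sending how much, the text output can be summarized per flow. The sketch below is a rough illustration that only understands TCP-over-IPv4 lines in the format shown above; the regular expression and the function name are assumptions for this example, and for serious analysis a capture file opened in a dedicated analyzer is a better fit.

```python
import re
from collections import defaultdict

# Matches lines such as: "<time> IP <src> > <dst>: Flags [P.], ... length 59"
PACKET = re.compile(
    r"^(?P<time>\S+) IP (?P<src>\S+) > (?P<dst>\S+): "
    r"Flags \[(?P<flags>[^\]]*)\].*?\blength (?P<length>\d+)"
)

def bytes_per_flow(output):
    """Sum TCP payload bytes per (source, destination) pair."""
    totals = defaultdict(int)
    for line in output.splitlines():
        match = PACKET.match(line.strip())
        if match:
            totals[(match["src"], match["dst"])] += int(match["length"])
    return dict(totals)

# Packet lines copied from the sample output above.
sample = """\
17:12:51.266986 IP ip-10-0-0-111.ec2.49362 > ip-10-0-1-23.ec2.ssh: Flags [P.], seq 1:60, ack 1, win 502, options [nop,nop,TS val 24606695 ecr 24606124], length 59
17:12:51.267164 IP ip-10-0-1-23.ec2.ssh > ip-10-0-0-111.ec2.49362: Flags [.], ack 60, win 501, options [nop,nop,TS val 24606695 ecr 24606124], length 0
17:12:51.718483 IP ip-10-0-1-23.ec2.57520 > ip-10-0-0-111.ec2.web: Flags [.], ack 1, win 229, options [nop,nop,TS val 24606809 ecr 24606695], length 0
"""
flows = bytes_per_flow(sample)
for (src, dst), total in flows.items():
    print(src, "->", dst, total, "bytes")
```

Packets with a length of 0, like the two ACKs in the sample, carry no payload, so they add to the packet count but not to the byte totals.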
12. perf
The perf tool provides in-depth profiling of Linux performance. This includes analyzing CPU performance characteristics like:
- Instruction executions
- Hardware/software events
- Memory hierarchy behaviors
- Precise CPU cycles
perf helps you dig deeper when system-level statistics hint at an underlying performance issue.
Performance counter stats for CPU(s):
1,477,513,567 cpu-cycles
1,132,064,477 instructions
347,841,394,225 cache-references
19,853,950,617 cache-misses
0.132722635 seconds time elapsed
13.043467000 seconds user
0.001999000 seconds sys
perf lets you delve into low-level detail to uncover stubborn performance problems.
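Raw counter values are hard to interpret on their own, so they are usually combined into ratios. As an illustration, the sketch below derives instructions per cycle (IPC) and the cache miss ratio from the sample counters above. The function name is hypothetical, and the event names are taken from the sample; they can differ between systems.

```python
def parse_perf_counters(output):
    """Return {event_name: count} for the integer counter lines of perf stat output."""
    counters = {}
    for line in output.splitlines():
        tokens = line.split()
        # Counter lines look like "1,477,513,567 cpu-cycles"; timing lines are skipped.
        if len(tokens) >= 2 and tokens[0].replace(",", "").isdigit():
            counters[tokens[1]] = int(tokens[0].replace(",", ""))
    return counters

# Sample output copied from above.
sample = """\
Performance counter stats for CPU(s):
1,477,513,567 cpu-cycles
1,132,064,477 instructions
347,841,394,225 cache-references
19,853,950,617 cache-misses
0.132722635 seconds time elapsed
13.043467000 seconds user
0.001999000 seconds sys
"""
counters = parse_perf_counters(sample)
ipc = counters["instructions"] / counters["cpu-cycles"]
miss_ratio = counters["cache-misses"] / counters["cache-references"]
print("IPC: %.2f" % ipc)
print("cache miss ratio: %.1f%%" % (100 * miss_ratio))
```

A low IPC combined with a high miss ratio generally suggests the CPU is stalling on memory rather than doing useful work, which is a different problem from simply running out of CPU time.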
Conclusion
Mastering these Linux performance commands will sharpen your ability to monitor, analyze, and optimize Linux systems. Each tool gives targeted insight into one area, such as CPU, memory, disk, or network. Used together, they cover the whole system, so you can swiftly identify and resolve performance issues as they arise.