A Software Engineer's Tips and Tricks #3: CPU Utilization Is Not Always What It Seems
Hey there! We're back for our third edition of Tips and Tricks. As we said in our first posts on Drizzle ORM and Template Databases in PostgreSQL, our new Tips and Tricks mini blog series is going to share some helpful insights and cool tech that we've stumbled upon while working on technical stuff.
Today's topic is short and sweet. It'll be on CPU utilization and what that metric indicates. If you enjoy it and want to learn more, I encourage you to check out the "further reading" links.
If you find this post helpful, feel free to share it with your friends or colleagues. If you don’t like the topic, no problem! Just skip it and check out the next one.
Sound good? Let's jump in!
CPU utilization is not about CPU utilization
If you see 100% CPU usage in top, htop, or ps, you might think a process is maxing out your server's resources. But is it really? CPU utilization metrics can be tricky and don't always tell the whole story.
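To see why the number can mislead, it helps to look at how a utilization percentage is typically derived on Linux: from the cumulative tick counters on the aggregate cpu line of /proc/stat. The Python sketch below is only an illustration, not how top or htop are actually implemented. The two sample lines are made up, the function names are hypothetical, and counting iowait as idle is a choice made for this sketch.

```python
# Field order on the aggregate "cpu" line of /proc/stat:
# user nice system idle iowait irq softirq steal guest guest_nice
IDLE, IOWAIT = 3, 4


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:]]


def utilization(before, after):
    """Return the non-idle share of CPU time between two samples, in percent."""
    # Sum only the first eight fields: guest time is already counted in user/nice.
    deltas = [b - a for a, b in zip(after[:8], before[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    # Assumption of this sketch: iowait is treated as idle time.
    idle = deltas[IDLE] + deltas[IOWAIT]
    return 100.0 * (total - idle) / total


# Made-up samples: between them, every tick went to user or system time.
SAMPLE_BEFORE = "cpu  1000 0 500 8000 500 0 0 0 0 0"
SAMPLE_AFTER = "cpu  1900 0 600 8000 500 0 0 0 0 0"

if __name__ == "__main__":
    busy = utilization(parse_cpu_line(SAMPLE_BEFORE), parse_cpu_line(SAMPLE_AFTER))
    print(f"utilization: {busy:.1f}%")
```

Notice what the calculation cannot see: a tick counts as busy whenever a thread was running on the CPU, whether the core was executing instructions or not. On a real Linux machine you could feed it two reads of the first line of /proc/stat taken a second apart.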
Brendan Gregg, a renowned performance engineer at Intel (formerly at Netflix), explains that what looks like high CPU activity is sometimes the CPU stalled while waiting on memory I/O. Your process might not be performing complex computations at all; the cores may simply be waiting for data to arrive from main memory. In other words, a CPU at 100% is not necessarily busy computing.
For anyone intrigued by the challenge of diagnosing and optimizing production systems, Gregg's work is a must-read. Check out his insightful post on CPU utilization. And if you're starting to question the CPU metrics you see, it might be time to explore tools like iotop or perf to get a clearer picture of your system's behavior.
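One number that Gregg's post suggests looking at alongside utilization is instructions per cycle (IPC), which can be computed from the cycles and instructions counters that perf stat reports. Here is a minimal Python sketch of that arithmetic. The counter values are invented for illustration, the function names are hypothetical, and the 1.0 cut-off is the rough rule of thumb from the post rather than a hard limit, since the right threshold depends on the processor.

```python
def instructions_per_cycle(instructions, cycles):
    """Return IPC: how many instructions were retired per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def interpret_ipc(ipc, threshold=1.0):
    """Apply the rough rule of thumb: low IPC hints at memory stalls."""
    # The threshold is an assumption; tune it for the hardware you run on.
    if ipc < threshold:
        return "likely memory stalled"
    return "likely instruction bound"


if __name__ == "__main__":
    # Invented counter values, as if copied from a perf stat run.
    ipc = instructions_per_cycle(instructions=4_000_000_000, cycles=8_000_000_000)
    print(f"IPC = {ipc:.2f}: {interpret_ipc(ipc)}")
```

Two processes can both show 100% in top while one has an IPC well below 1.0 and the other well above it. The first is probably spending much of its time stalled, so a faster CPU would help it less than reducing memory traffic.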
Further reading
- CPU utilization is wrong: https://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html
- Blog: https://www.brendangregg.com/blog/index.html
- Linux performance analysis in 60,000 Milliseconds: https://netflixtechblog.com/linux-performance-analysis-in-60-000-milliseconds-accc10403c55
SIGQUIT
That’s it for today! We hope you enjoyed today's tips and tricks. If you have any feedback or suggestions for future posts, feel free to reach out! You can find us on Twitter (or X) at @gokoyeb, LinkedIn, or the Koyeb Community.