Intel power state measurements

While reading Etsy’s excellent performance tuning blog post I learned about i7z, a nifty little tool for showing you Intel CPU states. It gives details on clock speed, turbo mode, and just how deep an idle power-saving state each core is in. Fun stuff, but I like graphs in Munin! How to do it?

The i7z author was kind enough to give me some suggestions, and before too long I had a hacky Munin plugin to display the data. Image below. Unfortunately it requires running i7z in the background and writing to a logfile, which is not a great match for Munin. I think it’d be better to go back to the source for the data and craft a Munin plugin from the low-level CPU info. That being said, i7z does a fair amount of nice work in interpreting things. It knows about different states in different CPU models, can interpret turbo frequencies, etc. It may be a lot of work to do a good general-purpose Munin plugin.

Another program that gives similar info is Intel’s PowerTop, part of their awesomely successful effort to bring better power management to Linux. They’re responsible for a lot of savings; I’d guess 20-50% power (and heat!) for a server since the Linux 2.2 era. While looking through its code I learned there’s a lot of detailed info in /sys/devices/system/cpu/cpu*, including power states! Lots of files there, only a little documentation. Here’s some of the fun stuff:

cpufreq/time_in_state: histogram of clock speeds over the system’s life

cpuidle/: info on power saving. My CPUs have 4 states; the “name” file in each state directory tells us they are POLL, C1-SNB, C3-SNB, and C6-SNB. The usage and time files in each state directory seem to be counters for how often and how long the CPU has been in that state. Quick monitoring:

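# run from one CPU's cpuidle directory, e.g. /sys/devices/system/cpu/cpu0/cpuidle
while :; do
  for f in state?; do
    # print this state's usage counter, then end the line
    tr '\n' ' ' < "$f/usage"
    echo
  done
  sleep 1; echo
done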
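
The same files can be read from Python, which is closer to what a sysfs-based Munin plugin would need. This is only a minimal sketch, not that plugin: the function names are made up for illustration, it assumes Python 3, and it assumes the units given in the kernel documentation (time_in_state in 10 ms ticks, cpuidle time in microseconds). Because the cpuidle counters are cumulative, subtracting two snapshots gives the share of an interval spent in each state.

```python
import glob
import os


def parse_time_in_state(text):
    """Parse the contents of cpufreq/time_in_state.

    Each line is "<frequency in kHz> <time>". Returns {freq_khz: ticks},
    where a tick is assumed to be 10 ms as the kernel docs describe.
    """
    result = {}
    for line in text.splitlines():
        if line.strip():
            freq, ticks = line.split()
            result[int(freq)] = int(ticks)
    return result


def _read_value(path):
    """Read one small sysfs file and strip the trailing newline."""
    with open(path) as fp:
        return fp.read().strip()


def read_cpuidle(cpu_dir):
    """Snapshot the idle states under <cpu_dir>/cpuidle.

    Returns {state_name: (usage_count, time_us)}. Returns an empty dict
    if the directory has no state* entries.
    """
    states = {}
    pattern = os.path.join(cpu_dir, "cpuidle", "state*")
    for state_dir in sorted(glob.glob(pattern)):
        name = _read_value(os.path.join(state_dir, "name"))
        usage = int(_read_value(os.path.join(state_dir, "usage")))
        time_us = int(_read_value(os.path.join(state_dir, "time")))
        states[name] = (usage, time_us)
    return states


def residency_percent(before, after, interval_s):
    """Percent of interval_s spent in each state between two snapshots.

    Assumes the time counters are in microseconds and that both
    snapshots contain the same state names.
    """
    interval_us = interval_s * 1e6
    return {
        name: 100.0 * (after[name][1] - before[name][1]) / interval_us
        for name in after
    }
```

To try it, call read_cpuidle("/sys/devices/system/cpu/cpu0") twice with a sleep in between and pass both results to residency_percent along with the sleep length in seconds. Working from counter differences gives an average over the whole interval rather than an instantaneous sample.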

For posterity, here’s the hacky plugin I wrote to read the logfile that “i7z -w l” writes. Graph, then code. The graph here is spiky because I’m only sampling every 5 minutes, not taking the 5 minute average. Not sure it’s very interesting info anyway.


#!/usr/bin/env python

"""
i7z - Wildcard plugin to monitor CPU states

by Nelson Minar <nelson@monkey.org>, placed into the public domain

This plugin relies on i7z running in the background and writing
data to a file.

Valid names: c0 c1 c3 c6 c7
"""

# Sample input
# 1344980274.830075945 [1.000000,3.628642,1.000000,95.207611,0.000000]   [1.000000,-0.456208,1.000000,99.314697,0.000000]    [1.000000,-0.737858,1.000000,99.645119,0.000000]    [1.000000,0.956179,0.000000,98.813507,0.000000]
# timestamp [c0, c1, c3, c6, c7] x4 (for 4 CPUs)

import sys, os

# Figure out the mode from the name the program was invoked with
mode = os.path.basename(sys.argv[0])[4:]
# figure out which array index of data we want
modeMap = { "c0": 0, "c1": 1, "c3": 2, "c6": 3, "c7": 4 }
dataSlot = modeMap[mode]

# config
if len(sys.argv) > 1:
    if sys.argv[1] == "config":
        # parenthesized so the statement works under both Python 2 and 3
        print("""
graph_title i7z status mode %(mode)s
graph_category system
graph_vlabel percent
graph_scale no
graph_args --upper-limit 100 -l 0

cpu1-%(mode)s.label CPU1
cpu1-%(mode)s.min 0
cpu1-%(mode)s.max 100

cpu2-%(mode)s.label CPU2
cpu2-%(mode)s.min 0
cpu2-%(mode)s.max 100

cpu3-%(mode)s.label CPU3
cpu3-%(mode)s.min 0
cpu3-%(mode)s.max 100

cpu4-%(mode)s.label CPU4
cpu4-%(mode)s.min 0
cpu4-%(mode)s.max 100
""" % { "mode": mode })
        sys.exit(0)
    else:
        sys.stderr.write("Unknown argument %s\n" % sys.argv[1])
        sys.exit(1)  # unknown argument is an error, so exit non-zero

# fetch data for munin

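with open("/home/nelson/src/munin-i7z/cpu_cstate_log.txt") as fp:
    data = fp.readline()
# first field is the timestamp, then one bracketed list per CPU (4 CPUs assumed)
(ts, cpu1, cpu2, cpu3, cpu4) = data.split()
for i, cpu in enumerate([cpu1, cpu2, cpu3, cpu4]):
    # strip the surrounding brackets and parse the comma-separated percentages
    values = [float(n) for n in cpu[1:-1].split(',')]
    print("cpu%d-%s.value %.2f" % (i+1, mode, values[dataSlot]))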

sys.exit(0)

3 thoughts on “Intel power state measurements”

  1. Hi Nelson! Thanks for the pingback on the blog post.
    I have one thought: Rather than grouping by CPU state, would it be better to group by CPU?
    This way you could see how much time each CPU spends in a given state.
    In the general case your CPUs will behave the same (for the most part), so your above graphs will look very similar.
    Grouping all of the states together with one graph per CPU might show you if a particular CPU is behaving oddly :-)

    • I think you’re right that grouping by CPU may make more sense. Hard to say for sure, it depends on CPU affinity in the kernel and what you’re trying to learn.

      The Munin graph I find most useful right now is CPU frequency scaling; you can quickly spot stuck processes and strange usage patterns. But I’m not running this on heavily loaded servers, so I don’t have a lot of meaningful data.

      • CPU affinity is exactly right. On busy servers you’ll often see IRQ storms on one CPU versus another, or applications being bound to one CPU.

        Great blog post :-) Thank you!
