Sunday, February 1, 2009

Processor Power Management in Linux

At a high level, processor power management with modern processors involves managing three different types of processor states:
  • Processor power states (C-states)
  • Processor performance states (P-states)
  • Throttling states (T-states)

In each case, a lower number means higher performance: C0 is higher performance than C1, P0 is higher performance than P1, and T0 is higher performance than T1. Different processors offer different granularities for each of these capabilities. Some of the newest processors offer per-core C-state support, per-socket P-state support, and per-thread T-states. Remarkably, current Linux kernels have built-in support for each of these capabilities.

C-states:
The Linux idle process automatically makes extensive use of the various C-states. For example, Intel’s “Nehalem” processors support C0, C1, C3, and C6 states, and the idle process uses these states as appropriate.
If you want to set the maximum C-state in Linux, put processor.max_cstate= on the kernel command line (in grub, hit the "e" or "a" key to edit it). Bear in mind that this is the ACPI C-state, not the processor one, so ACPI C3 might be hardware C6, and so on. When in doubt, the powertop utility will show you which is which. Powertop is available from http://www.lesswatts.org/projects/powertop/.
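
As a complement to powertop, the following is a minimal sketch for listing the idle states the kernel itself reports for one CPU. It assumes a kernel that exposes the cpuidle sysfs directory (cpuN/cpuidle/stateM with name and latency files), which this post does not otherwise cover; the function name list_cstates and the SYSFS_CPU override are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Sketch: list the idle states exposed for one CPU through cpuidle sysfs.
# Assumption: the kernel provides cpuN/cpuidle/stateM/{name,latency}.
# SYSFS_CPU is a hypothetical override for trying this on a scratch tree.
list_cstates() {
    cpu_dir="${SYSFS_CPU:-/sys/devices/system/cpu}/cpu${1:-0}/cpuidle"
    for state in "$cpu_dir"/state*; do
        # Skip silently if the glob matched nothing or the file is unreadable.
        [ -r "$state/name" ] || continue
        printf '%s: %s (exit latency %s us)\n' \
            "${state##*/}" "$(cat "$state/name")" "$(cat "$state/latency")"
    done
}
```

Calling list_cstates 0 prints one line per idle state of cpu0. The names shown depend on the idle driver in use, so they may be ACPI names rather than hardware ones.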

P-states:
P-states essentially refer to different frequencies supported by a given processor. As a general rule, higher frequency processors offer more P-states than lower frequency processors. In Linux, the cpufreq module allows control of the P-states:

  • cd /sys/devices/system/cpu
  • ls -L
  • cd cpux/cpufreq
  • cat scaling_available_frequencies
  • echo -n xxxxx > scaling_max_freq
  • echo -n yyyyy > scaling_min_freq

Where:

  • x is the appropriate CPU number from the prior command (though it may only be the first one that actually matters)
  • xxxxx and yyyyy are the desired frequencies from the list of scaling_available_frequencies shown above; set these to the same value to peg the processor to a single frequency/P-state
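
The manual steps above can be wrapped in a small function that pegs every CPU to one frequency. This is a sketch rather than a definitive tool: pin_frequency and the CPU_ROOT override are hypothetical names, the check against scaling_available_frequencies is an added safety step, and the write order (max first, then min) is simply a choice made here.

```shell
#!/bin/sh
# Sketch: peg every CPU to a single frequency by writing the same value to
# scaling_max_freq and scaling_min_freq. Needs root on a real system.
# CPU_ROOT is a hypothetical override for trying this on a scratch tree.
pin_frequency() {
    freq="$1"
    [ -n "$freq" ] || { echo "usage: pin_frequency <kHz>" >&2; return 1; }
    root="${CPU_ROOT:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        # Refuse values that this CPU does not advertise.
        if ! grep -qw -- "$freq" "$dir/scaling_available_frequencies"; then
            echo "skip $dir: $freq not in scaling_available_frequencies" >&2
            continue
        fi
        echo "$freq" > "$dir/scaling_max_freq"
        echo "$freq" > "$dir/scaling_min_freq"
    done
}
```

Pass the function one of the values printed by cat scaling_available_frequencies. CPUs that do not list that value are left untouched and reported on standard error.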

Automatic P-state Control

Linux has different performance governors available to set P-state policies. Among the most interesting is the ondemand governor, which provides automatic adjustment of P-states. With the ondemand governor, there are additional tunable parameters that adjust the governor's behavior; see http://software.intel.com/en-us/articles/enhanced-intel-speedstepr-technology-and-demand-based-switching-on-linux for details.
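
Selecting a governor works through the same cpufreq directory as the frequency limits. The sketch below assumes the standard scaling_governor and scaling_available_governors files, which this post does not list; set_governor and the CPU_ROOT override are hypothetical names used for illustration.

```shell
#!/bin/sh
# Sketch: switch every CPU to the named governor, if the CPU offers it.
# Assumption: cpufreq exposes scaling_governor and
# scaling_available_governors next to the frequency files. Needs root.
# CPU_ROOT is a hypothetical override for trying this on a scratch tree.
set_governor() {
    gov="$1"
    [ -n "$gov" ] || { echo "usage: set_governor <name>" >&2; return 1; }
    root="${CPU_ROOT:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        if grep -qw -- "$gov" "$dir/scaling_available_governors"; then
            echo "$gov" > "$dir/scaling_governor"
        else
            echo "$dir: governor $gov not available" >&2
        fi
    done
}
```

For example, set_governor ondemand selects the ondemand governor on every CPU that lists it. A governor compiled as a module may need to be loaded before it appears in scaling_available_governors.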

T-states:
T-states (throttling states) essentially stop clocks to the processor between instructions to approximate the desired duty cycle. They were originally developed to adjust processor performance in response to thermal conditions, but they can also affect power consumption. For processors supporting T-states, there are usually 8 T-states (T0 through T7), each step corresponding to a further 12.5% reduction in duty cycle.
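
To make the arithmetic concrete, here is a small sketch that prints the throttle percentage for each T-state, assuming the usual eight states in even 12.5% steps. The helper name tstate_percent is hypothetical.

```shell
#!/bin/sh
# Sketch: throttle percentage per T-state, assuming eight states
# (T0 through T7) in even 12.5% steps.
tstate_percent() {
    # $1 is the T-state number (0-7). Work in tenths of a percent so the
    # shell's integer arithmetic is sufficient.
    tenths=$(( $1 * 125 ))
    echo "$(( tenths / 10 )).$(( tenths % 10 ))"
}

for n in 0 1 2 3 4 5 6 7; do
    echo "T$n: $(tstate_percent "$n")% throttled"
done
```

Processors that implement a different number of T-states would need a different step size.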

Only manual T-state control is available today in Linux:

  • cd /proc/acpi/processor/
  • ls -L CPU*
  • cd CPUx
  • echo -n y > throttling
  • cat throttling

Where:

  • x is the appropriate CPU number from the prior command (though it may only be the first one that actually matters)--note the upper case
  • y is a value from 0 to 7, corresponding to T0 [not throttled] through T7 [87.5% throttled]

Here are a few scripts to set T7 on all processors in the system, check the status of the T-states, and then switch all the processors back to T0:

  • for ii in /proc/acpi/processor/CPU*/throttling; do echo -n 7 > "$ii"; done
  • for ii in /proc/acpi/processor/CPU*/throttling; do echo "$ii"; cat "$ii"; done
  • for ii in /proc/acpi/processor/CPU*/throttling; do echo -n 0 > "$ii"; done

For more information, see http://acpi.sourceforge.net/documentation/processor.html.

Statistics in Linux
The PowerTop utility (http://www.lesswatts.org/projects/powertop/) provides information on P-state and C-state usage in a given system. Additional information is available from Linux:

  • C-state transition info:
    cat /proc/acpi/processor/CPU*/power
  • P-state transition info:
    cat /sys/devices/system/cpu/cpu*/cpufreq/stats/total_trans
    cat /sys/devices/system/cpu/cpu*/cpufreq/stats/time_in_state
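
The raw time_in_state counters are easier to read as percentages. The sketch below assumes each line of the file has the form "<frequency in kHz> <time spent at that frequency>"; only the ratios are used, so no time unit is assumed. The function name pstate_residency is hypothetical.

```shell
#!/bin/sh
# Sketch: convert one CPU's time_in_state file into percentages.
# Assumption: each line is "<frequency in kHz> <time at that frequency>".
pstate_residency() {
    awk '{ freq[NR] = $1; spent[NR] = $2; total += $2 }
         END {
             if (total == 0) exit   # nothing recorded yet
             for (i = 1; i <= NR; i++)
                 printf "%s kHz: %.1f%%\n", freq[i], 100 * spent[i] / total
         }' "$1"
}
```

Pass the function the path to one CPU's file, such as /sys/devices/system/cpu/cpu0/cpufreq/stats/time_in_state, to see what share of time that CPU spent at each frequency.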

Enjoy!
--kb
