Runtime PM support for Intel Linux Graphics – Part 1: for the users

Since more than a year ago the Intel Linux Graphics team has been working on the Runtime Power Management (RPM) implementation for our Kernel driver, i915.ko. Since a feature is useless if nobody knows about it, I figured it’s time to document things a little bit in order to spread the knowledge.

Disclaimer

I work for Intel and I am a member of its Linux Kernel Graphics team. On this text I will talk about things that are related to my work at Intel, but this text reflects only my own opinion, not necessarily Intel’s opinion. Everything I talk about here is already publicly documented somewhere else.

What is it?

The basic goal of the i915.ko RPM implementation is to try to reduce the power consumption when all screens are disabled and the graphics device is idle. We basically shut down the whole graphics device, and put it in the PCI D3 state. This implementation uses the Kernel standard runtime PM framework, so the way you control i915.ko RPM should be the same way you control RPM for the other devices.

There are quite a few documents and presentations explaining Kernel RPM, you can learn a lot from them. Use your favorite search engine!

Which platforms support it?

For now, the platforms that support runtime suspend are just SNB and the newer ones, except for IVB. The power saving features make more sense on the HSW/VLV and newer platforms, but SNB support was added as a proof of concept that the RPM code could support multiple  platforms. Also notice that we don’t yet have support for IVB (which is between SNB and HSW), but adding this support would be very simple and probably share most (all?) of the SNB code. In order to be 100% sure of what is supported, grab the Kernel source code, then look at the definition of the HAS_RUNTIME_PM macro inside drivers/gpu/drm/i915/i915_drv.h.

When do we runtime suspend?

First of all, you need a very recent Kernel, because the older ones don’t support RPM for our graphics device. Then you have to enable RPM for the graphics device. Then, all screens have to be disabled and no applications can be using the graphics device.  After all these conditions are met, we start counting time. If no screen is re-enabled and no user-space application uses the GPU after some time, we runtime suspend. If something uses the GPU before the timeout, the timeout is reset after the application finishes using the GPU. In other words: we use the autosuspend delay from the Kernel RPM infrastructure. The current default autosuspend delay is 10 seconds, but it can be changed at runtime.

When I say “all screens have to be disabled”, I mean that, on Kernels older than 3.16, all CRTCs have to be completely unset, without attached framebuffers. But on Kernel 3.16 and newer, RPM can also happen if the screens are in DPMS state, which is a much more common case – just leave your machine idle and it will probably set the DPMS state at some point.

Why ten seconds?

Currently, the default autosuspend delay is set to 10 seconds, which means we will only runtime suspend if nothing uses the GPU for 10 seconds. This is a very conservative value: you won’t get to the runtime suspended state unless your applications are really quiet. Some applications keep waking up the GPU consistently every second, so on these environments you won’t really see RPM.

Even though I arbitrarily decided 10s to be the default value, I admit this value is too conservative, but there are a few reasons why I chose it.

  • With a big timeout, RPM is less likely to start happening on everybody’s machines at the same time, which minimizes the damages caused by possible bugs. When I proposed the 10s timeout, RPM was a new feature, and new features usually bring new bugs.
  • If the timeout value is too low, there’s a possibility that we may do way too many runtime suspend + resume cycles per second, and although I never measured this, it is pretty safe to assume that doing all these suspend + resume cycles will waste more power than just not doing them at all. If you’ve been idle for a whole ten seconds, it’s likely that you’ll probably remain idle for at least a few more seconds.
  • I was expecting someone – maybe even me – would do some very thorough experiments and conclude some ideal value: one that makes sure we don’t suspend + resume so frequently that we would waste power doing it, but that also makes sure we take all the opportunities we can to save power.
  • A conservative value would encourage people to analyze their user-space apps and tune them so they would wake-up the GPU – and maybe even the CPU – only when absolutely necessary, which would benefit not only the cases where we have RPM enabled, but also the cases where we are just using the applications.
  • The value can be changed at runtime! Do you want 1ms? You can have it!

Of course, due to all the complexities involved, I imagine that the ideal runtime suspend delay depends on the set of user-space applications that is running, so in theory the value should be different for every single use case.

Update: after I wrote this text, but before I published it publicly, Daniel Vetter sent a patch proposing a new timeout of 100ms.

How do I play with it?

To be able to change most of these parameters you will need to be root. Since our graphics device is always PCI device 00:02.0, the files will always have the same locations:

  • /sys/devices/pci0000:00/0000:00:02.0/power/control: this is the file that you use to enable and disable graphics RPM. The values are non-trivial: “auto” means “RPM enabled”, while “on” means “RPM disabled”. Write to this file using “echo” to change the parameters, read it to check the current state.
  • /sys/devices/pci0000:00/0000:00:02.0/power/autosuspend_delay_ms: this is the file that controls how often we will runtime suspend. Remember that the number is in milliseconds, so “10” means 10ms, not 10s. Write to this file to change the parameters, read to check the current state.
  • /sys/devices/pci0000:00/0000:00:02.0/power/runtime_status: this is the file you use to check the RPM status. It can print “suspended”, “suspending”, “resuming” and “active”.

Due to some funny interactions between the graphics and audio devices – needed for things like HDMI audio -, you will also need to enable audio runtime PM. I won’t go into details of how audio RPM works because I really don’t know much about it, but I know that to enable it:

echo 1 > /sys/module/snd_hda_intel/parameters/power_save
echo auto > /sys/bus/pci/devices/0000:00:03.0/power/control

Or you can just blacklist the snd_hda_intel module, but that means you will lose its features.

As an alternate to all the instructions above, you can also try to just run:

sudo powertop --auto-tune

This command will enable not only graphics and audio RPM, but try to tune all your system to use less power. While this is good, it can also have some bad effects, such as disabling your USB mouse. If something annoys you, you can then run sudo powertop, go to the tunables tab and undo what you want.

After all this, to force runtime PM you can do this:

  • If your Kernel supports runtime PM for DPMS, you can just run:
    xset dpms force off
    In this case, if you want to enable the screens again, you just need to move the mouse.
  • If your Kernel is older, you can run:
    xrandr
    Then, for each connected output, you run:
    xrandr --output $output --off
    In this case, if you want to enable the screens again, you will have to run, for each output:
    xrandr --output $output --auto

If you boot with drm.debug=0xe, you can look at the output of dmesg and look for “Device suspended” and “Device resumed” messages. This can be done with:

dmesg | grep intel_runtime_

What is the difference between runtime suspend and the usual suspend?

The “suspend” that everybody is used to is something that happens for the whole platform. All devices and applications stop, the machine enters the S3 state, and you just can’t use it while it remains on that state. The runtime suspend feature allows only the devices that are currently unused on the system to be suspended, so all the other devices still work, and user space also keeps working. Then, when something needs access to the device – such as user space -, it gets resumed.

What’s next?

On the next part of our documentation I will explain the development aspect of the RPM feature, including the feature design, its problems and how to debug them.

Credits

I have to mention that even though I did a lot of work on the RPM subsystem of our driver, a big number of other developers also did huge amounts of work here too, so they have to take the credit! Since pretty much everybody on our team helped in one way or another – writing features, enabling specific platforms, giving design feedback, reviewing patches, reporting bugs, adding regressions, etc. – I am not going to say specific names: everybody deserves the credit. Also, a lot of people that are not on the Intel Linux Kernel Graphics team contributed to this work too.