This is the third part of the documentation for the Runtime Power Management (RPM) subsystem of the Intel Linux Kernel Graphics Driver, also known as i915.ko. In this part I will explain the power management interactions between the graphics device and the rest of the system. Part 1 gave an overview of the feature and part 2 explained the implementation details.
I should also point out that although I am a graphics developer and I believe I have good knowledge of the graphics part of the problem, I am not an expert on general power management, so I may say something wrong. If you notice that, please correct me!
I work for Intel and I am a member of its Linux Kernel Graphics team. In this text I will talk about things that are related to my work at Intel, but this text reflects only my own opinion, not necessarily Intel’s opinion. Everything I talk about here is already public information.
The very first thing you need to do is to learn the basic concepts related to power management: Core C states – or just C states – and Package C states – or PC states. Since descriptions of these concepts were already written by people who know more about them than I do, I will just give you some pointers:
- A very nice text, a little focused on Xeon Phi, but whose concepts apply to the other platforms too.
- The PowerTOP User’s guide, which you will want to read in full since I will talk about PowerTOP later.
- Chapter 4 of the Haswell processor datasheet, which is Intel’s official documentation. There’s also the Broadwell datasheet, but I’ll use the Haswell datasheet for the examples on this text.
The small detail that makes a lot of power management comparisons wrong
Now that you know what the PC states are, you understand that the current PC state directly affects the system’s total power consumption. You also understand that if just one little piece of the whole system does not allow a certain PC state, then the whole system will not enter that PC state. And this is the big problem.
Let’s say you have a system with a disk that only allows states up to PC3, and your graphics device allows states up to PC6. That means that you are stuck at PC3: you will not reach PC6. Then you enable a certain graphics power management feature – let’s say, for example, Framebuffer Compression (FBC) – and the graphics device starts allowing PC7 instead of PC6. Since the disk is still there, preventing anything deeper than PC3, your system will stay at PC3 or worse. Then you try to measure power consumption both before and after enabling FBC to see how much it changes. You will conclude that the difference is zero! Why? Because the disk is keeping your machine at PC3. Does that mean the FBC feature does not save any power? No. What this means is that you have to fix your disk power management configuration first.
Then let’s say you fix the disk power management policies so the disk starts allowing PC7. Now, when you disable FBC your machine will only be able to reach PC6, but when you enable FBC it will reach PC7. And now, if you measure the power consumption, you will notice a difference. But even this case can lead you to wrong conclusions: if you’re trying to check how much you can save by properly configuring your disk power management policies, you will reach different conclusions depending on whether FBC is enabled or not! Power management comparisons are really not easy, and you usually cannot look at just one feature: you have to consider the whole platform, its states and its residencies. You always have to know what exactly is preventing you from getting into the deeper states before concluding anything.
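The disk and FBC scenario above can be sketched as a tiny model. This is only an illustration of the reasoning, not how the hardware decides anything: the function names are made up for this example, PC states are treated as plain integers where a bigger number means a deeper state, and the limits (PC3, PC6, PC7) are the illustrative numbers from the text, not measurements.

```python
# Hypothetical model: each component reports the deepest package C-state
# it allows, as an integer (bigger number = deeper state).


def deepest_package_state(limits):
    """Return the deepest PC state the whole package can reach.

    `limits` maps a component name to the deepest PC state it allows.
    The package can only go as deep as its most restrictive component.
    """
    return min(limits.values())


def fbc_gain(disk_limit):
    """How many PC states deeper the package gets when FBC is enabled.

    The graphics limits (PC6 without FBC, PC7 with FBC) are the example
    numbers used in the text, not measured values.
    """
    without_fbc = deepest_package_state({"disk": disk_limit, "graphics": 6})
    with_fbc = deepest_package_state({"disk": disk_limit, "graphics": 7})
    return with_fbc - without_fbc


if __name__ == "__main__":
    # Disk stuck at PC3: enabling FBC makes no visible difference.
    print(fbc_gain(disk_limit=3))
    # Disk fixed to allow PC7: now FBC unlocks one deeper state.
    print(fbc_gain(disk_limit=7))
```

The same feature shows a gain of zero or one depending only on the disk limit, which is exactly why a single before/after measurement can mislead you.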
Another common misconception
Another common misconception that I have to mention here is the relationship between power management and performance. A lot of people assume that enabling power management features sacrifices performance. While this may be true for some cases, it is also false for other cases! Power management features help your system not only draw less power when idle, but also stay at cooler temperatures. This means that when you really need performance, you will be less limited by the thermal management system, since it will take you more time to reach the high temperatures, so you will be able to run at higher clocks for more time.
I am not an expert in this area, so I don’t really want to write too much and risk saying something wrong. But I suggest you read the Thermal Management chapter of the datasheet I referenced earlier and pay a lot of attention while reading the Thermal Considerations and Turbo Boost sections!
After reading the last sections you may be wondering what you need to do to know which device is preventing the system from getting into deeper package C states. Unfortunately, as far as I know, there is really no way to discover that today. But we do have a tool that allows you to see how much time your system is spending in each state and that also gives you some hints on changes you could try to make to reach deeper PC states: PowerTOP.
Since you already read PowerTOP’s user manual – referenced earlier in this text – you already know how to use it. I suggest you play with the tunables tab and try to see how each parameter affects the system. Do you see a change in the PC state residencies after toggling a certain parameter? Is the difference for one parameter only noticeable after you toggle another parameter? Do you have a certain group of parameters that need to be toggled together to allow deeper PC states? Also, since you’re basically changing the default parameters of the devices, you may see undesired side effects, such as a mouse that refuses to come back to life when you move it. Unfortunately, some power saving features are disabled by default exactly because they can cause bad side effects: that’s why you have to change the defaults. On the good side, not everybody is affected by these side effects, so your system may be just fine.
Another nice tool is Turbostat, which is located inside the Kernel source tree, under tools/power/x86/turbostat. When I am trying to discover if a certain change makes a difference in the PC state residencies, I find it much easier to look at Turbostat’s output rather than at PowerTOP’s output, simply because Turbostat just prints all the values on new lines, allowing you to quickly scroll up to see the past. I usually have one terminal on PowerTOP’s tunables tab, and another terminal with Turbostat running, so I just keep toggling the tunables and see if the residencies change.
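If you get tired of reading the columns by eye, a small script can pull the package residencies out of one Turbostat sample. This is a minimal sketch under a few assumptions: that the package residency columns have names like "Pkg%pc6" (the exact names can differ between Turbostat versions), that the header and the data line have the same number of whitespace-separated fields, and that you have already captured those two lines as text. The sample values below are made up for illustration.

```python
import re

# Matches package C-state residency column names such as "Pkg%pc6".
# Assumption: column names follow this pattern; they can differ between
# Turbostat versions, so adjust the pattern to match your own output.
PC_COLUMN = re.compile(r"^Pkg?%pc(\d+)$")


def package_residencies(header_line, data_line):
    """Map PC state number -> residency percentage for one sample."""
    names = header_line.split()
    values = data_line.split()
    result = {}
    for name, value in zip(names, values):
        match = PC_COLUMN.match(name)
        if match:
            result[int(match.group(1))] = float(value)
    return result


def deepest_state_reached(residencies, threshold=0.0):
    """Return the deepest PC state with residency above `threshold`, or None."""
    reached = [state for state, pct in residencies.items() if pct > threshold]
    return max(reached) if reached else None


# Made-up sample in the general shape of a Turbostat summary line.
SAMPLE_HEADER = "Avg_MHz Busy% Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7"
SAMPLE_DATA = "12 0.50 10.20 35.70 0.00 0.00"

if __name__ == "__main__":
    sample = package_residencies(SAMPLE_HEADER, SAMPLE_DATA)
    print(sample)
    print(deepest_state_reached(sample))
```

With something like this you can toggle a tunable, grab a new sample, and compare the deepest state reached before and after.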
Back to the graphics device
Now that you understand PC states, I can tell you that there are many graphics features that affect the deepest possible PC state. I won’t really list every possible feature here, but the main rule is: the more pixels you process per second, the more power you will draw. In the datasheet referenced at the beginning of this text, please read Section 4.2.6: Package C-States and Display Resolutions.
Did you read it? Ok, we can proceed. I know you just read that, but I need to emphasize: the screen resolution is not the only thing that can limit the deepest possible PC state. So, for example, if you have a lot of rendering, or if some feature such as FBC is disabled, your deepest possible PC state may be limited.
Screen on vs screen off
If you really read that datasheet, you may have noticed that, on those specific processors, the deepest PC state you can reach with the screen on is PC7, while the deepest PC state you can reach with the screen off is PC10. When there are no pixels to process, everything gets easier.
From the graphics driver’s point of view, it is much easier to make sure we allow the deepest possible PC state while the screen is off, and you can usually expect the state of i915.ko to be really good in this respect. You can use this information as a way to find out if the graphics device is the limiting factor on your PC states. First, you close all the applications that could be doing off-screen rendering. Then you launch Turbostat and check which is the deepest PC state you can reach. Then you disable the screen (see Part 1 for possible commands), wait about 30 seconds for things to settle down and for Turbostat to print the output, and then you check if you were able to reach deeper PC states. If the answer is yes, then the graphics device is the limiting factor on your system. If not, then there’s probably something else limiting how deep you can go. Of course, i915.ko runtime PM needs to be properly configured and enabled while you’re doing this.
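The decision at the end of that experiment can be written down as a short sketch. It assumes you have already collected the package residencies for both runs as dictionaries mapping a PC state number to a residency percentage; the function names, the 1% threshold, and the sample numbers are all made up for illustration.

```python
def deepest_state(residencies, threshold=1.0):
    """Deepest PC state whose residency exceeds `threshold` percent.

    Returns 0 if no state crosses the threshold. The threshold is an
    arbitrary choice to ignore states that were barely touched.
    """
    reached = [state for state, pct in residencies.items() if pct > threshold]
    return max(reached, default=0)


def graphics_is_limiting(screen_on, screen_off, threshold=1.0):
    """True if turning the screen off let the package reach a deeper state."""
    return deepest_state(screen_off, threshold) > deepest_state(screen_on, threshold)


if __name__ == "__main__":
    # Made-up residencies (PC state -> percent of time), not measurements.
    screen_on = {2: 20.0, 3: 60.0, 6: 0.0, 7: 0.0}
    screen_off = {2: 5.0, 3: 10.0, 6: 15.0, 7: 65.0}
    print(graphics_is_limiting(screen_on, screen_off))

    # Same deepest state in both runs: something else is the bottleneck.
    print(graphics_is_limiting(screen_on, screen_on))
```

Keep in mind this only tells you whether the screen state changed the deepest state reached. It says nothing about which non-graphics device is responsible when the answer is no.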
If you really want to guarantee that no bad applications are interfering with your test, I suggest you close your display manager and all applications, and run tests/pm_rpm --stay from intel-gpu-tools. Then you can run the experiment with Turbostat.
Common bottlenecks I observed
If you go back to Part 1 you will notice I already mentioned the power management bottlenecks that I usually observe on my own systems: the disk and audio power management policies. I usually find that if I only change these policies I can already get to the deepest possible PC states on my development machines. I won’t get great residencies without tuning everything else, but I will at least be able to reach those deepest PC states.
But remember: every system is different, and I have no idea what is attached to your machine, so I can’t really guarantee that the recipe that worked for me will also work for you. I don’t know which wireless card you have, I don’t know which devices are plugged into your system, I don’t know how fast your memory is, so I can’t really provide you with a universal tool for power management. The most I can do is suggest that you run PowerTOP’s auto tune feature.
The good news is that there are people trying to solve both the disk and audio problems. Want to help on the disk side? See this blog post from Matthew Garrett.
Back to graphics Runtime PM
So what does the graphics runtime PM implementation have to do with all these things I just explained? It’s very simple: when we runtime suspend i915.ko, we completely disable the graphics device. So, in theory, it should start allowing the deepest possible PC states. But, as I explained, even though this allows power savings, it does not really guarantee that you will save power, because the other devices might be preventing you from reaching the deepest PC states. But at least now we – the graphics team – probably won’t be the reason why you’re not saving more power, so you will have to blame other people.
My main goal with these posts was to teach you the little things I know about power management. I really hope that you use this acquired knowledge to make our world a greener place. I also have to ask you: please help us improve the state of power management in the Linux ecosystem! Please help test all these features. Please find the main bottlenecks of your system. Please engage with the upstream developers of the many subsystems and help them change the default parameters and policies of the devices so they can save more power. Please initiate power management discussions with the distributions, so they can review their tools and default values too.
And before I forget, here’s a link to a comic that is going to give you even more motivation to help you make the world consume less power. You can be like Superman! http://www.smbc-comics.com/?id=2305