.. SPDX-License-Identifier: BSD-3-Clause
   Copyright(c) 2010-2014 Intel Corporation.

VM Power Management Application
===============================

Applications running in Virtual Environments have an abstract view of
the underlying hardware on the Host. In particular, applications cannot see
the binding of virtual to physical hardware.
When looking at CPU resourcing, the pinning of Virtual CPUs (vCPUs) to
Host Physical CPUs (pCPUs) is not apparent to an application,
and this pinning may change over time.
Furthermore, Operating Systems on virtual machines do not have the ability
to govern their own power policy; the Model Specific Registers (MSRs)
for enabling P-State transitions are not exposed to Operating Systems
running on Virtual Machines (VMs).

The Virtual Machine Power Management solution shows an example of
how a DPDK application can indicate its processing requirements, using
VM-local information only (vCPU/lcore), to a Host-based Monitor.
The Monitor is responsible for accepting requests for frequency changes
for a vCPU, translating the vCPU to a pCPU via libvirt,
and effecting the change in frequency.

The solution comprises two high-level components:

#. Example Host Application

   A Command Line Interface (CLI) for VM->Host communication channel management
   allows adding channels to the Monitor, setting and querying the vCPU to pCPU pinning,
   and inspecting and manually changing the frequency of each CPU.
   The CLI runs on a single lcore while the thread responsible for managing
   VM requests runs on a second lcore.

   VM requests arriving on a channel for frequency changes are passed
   to the librte_power ACPI cpufreq sysfs based library.
   The Host Application relies on both qemu-kvm and libvirt to function.

#. librte_power for Virtual Machines

   Using an alternate implementation of the librte_power API, requests for
   frequency changes are forwarded to the host monitor rather than
   the ACPI cpufreq sysfs interface used on the host.

   The l3fwd-power application will use this implementation when deployed on a VM
   (see :doc:`l3_forward_power_man`).

.. _figure_vm_power_mgr_highlevel:

.. figure:: img/vm_power_mgr_highlevel.*

VM Power Management employs qemu-kvm to provide communications channels
between the host and VMs in the form of Virtio-Serial, which appears as
a paravirtualized serial device on a VM and can be configured to use
various backends on the host. For this example, each Virtio-Serial endpoint
on the host is configured as an AF_UNIX file socket, supporting poll/select
and epoll for event notification.
In this example, each channel endpoint on the host is monitored via
epoll for EPOLLIN events.
Each channel is specified as qemu-kvm arguments or as libvirt XML for each VM,
where each VM can have a number of channels up to a maximum of 64 per VM.
In this example, each DPDK lcore on a VM has exclusive access to a channel.
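
The monitoring approach described above can be sketched as follows.
This is a minimal illustration rather than the sample application's code:
``wait_for_request`` is a hypothetical helper, and the caller is assumed to
supply a descriptor that is already connected to a channel endpoint
(for example, an AF_UNIX socket).

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the sample application):
 * wait up to timeout_ms for data on one channel descriptor using epoll
 * and EPOLLIN, then read it.
 * Returns the number of bytes read, 0 on timeout, or -1 on error.
 */
ssize_t
wait_for_request(int channel_fd, void *buf, size_t len, int timeout_ms)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.fd = channel_fd };
    struct epoll_event ready;
    ssize_t n = -1;
    int nfds;
    int epfd = epoll_create1(0);

    if (epfd < 0)
        return -1;

    /* Register interest in "data available to read" on the channel. */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, channel_fd, &ev) == 0) {
        nfds = epoll_wait(epfd, &ready, 1, timeout_ms);
        if (nfds == 0)
            n = 0; /* timed out, nothing to read */
        else if (nfds > 0 && (ready.events & EPOLLIN))
            n = read(ready.data.fd, buf, len);
    }

    close(epfd);
    return n;
}
```

A long-running monitor would normally create the epoll instance once and
register every channel with it, rather than creating one per call as this
sketch does for brevity.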

To enable frequency changes from within a VM, a request via the librte_power interface
is forwarded via Virtio-Serial to the host. Each request contains the vCPU
and power command (scale up/down/min/max).
The API for host and guest librte_power is consistent across environments,
with the selection of VM or Host implementation determined automatically
at runtime based on the environment.

Upon receiving a request, the host translates the vCPU to a pCPU via
the libvirt API before forwarding the request to the host librte_power.

.. _figure_vm_power_mgr_vm_request_seq:

.. figure:: img/vm_power_mgr_vm_request_seq.*

   VM request to scale frequency
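
The translation step in this sequence can be illustrated with a small,
self-contained sketch. All names here (``power_request``, ``vm_pinning``,
``resolve_pcpu``) are hypothetical, and a static table stands in for the
pinning information that the host would obtain through libvirt.

```c
#include <stddef.h>

#define MAX_VCPUS 64 /* illustrative limit, not taken from the sample */

enum power_cmd { CMD_SCALE_UP, CMD_SCALE_DOWN, CMD_SCALE_MIN, CMD_SCALE_MAX };

/* Hypothetical request layout: the guest only knows its own vCPU number. */
struct power_request {
    unsigned int vcpu;
    enum power_cmd cmd;
};

/* Stand-in for the vCPU to pCPU pinning that the host queries via libvirt. */
struct vm_pinning {
    unsigned int num_vcpus;
    int pcpu_of_vcpu[MAX_VCPUS]; /* -1 means "not pinned" */
};

/* Translate the vCPU in a request to a pCPU; returns -1 if unresolved. */
int
resolve_pcpu(const struct vm_pinning *pin, const struct power_request *req)
{
    if (pin == NULL || req == NULL)
        return -1;
    if (req->vcpu >= pin->num_vcpus || req->vcpu >= MAX_VCPUS)
        return -1;
    return pin->pcpu_of_vcpu[req->vcpu];
}
```

Keeping the translation on the host means the guest never needs to know
which physical core it is running on, which matches the VM-local view
described earlier.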

Performance Considerations
~~~~~~~~~~~~~~~~~~~~~~~~~~

While the Haswell microarchitecture allows for independent power control of each core,
earlier microarchitectures do not offer such fine-grained control.
When deployed on pre-Haswell platforms, greater care must be taken in selecting
which cores are assigned to a VM; for instance, a core will not scale down
until its sibling is similarly scaled.

Enhanced Intel SpeedStep® Technology must be enabled in the platform BIOS
if the power management feature of DPDK is to be used.
Otherwise, the sysfs directory /sys/devices/system/cpu/cpu0/cpufreq will not exist,
and CPU frequency-based power management cannot be used.
Consult the relevant BIOS documentation to determine how these settings
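
Whether the cpufreq directory mentioned above is present can be checked
programmatically. The following is a minimal sketch using only POSIX calls;
the helper names ``dir_exists`` and ``cpufreq_sysfs_present`` are
illustrative and are not part of DPDK.

```c
#include <stddef.h>
#include <sys/stat.h>

/* Returns 1 if path exists and is a directory, 0 otherwise. */
int
dir_exists(const char *path)
{
    struct stat st;

    return path != NULL && stat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Returns 1 if the cpufreq sysfs directory for cpu0 is present. */
int
cpufreq_sysfs_present(void)
{
    return dir_exists("/sys/devices/system/cpu/cpu0/cpufreq");
}
```

A return value of 0 from ``cpufreq_sysfs_present()`` suggests that frequency
scaling is not available to the host, for example because the feature is
disabled in the BIOS.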

Host Operating System
~~~~~~~~~~~~~~~~~~~~~

The Host OS must also have the *acpi_cpufreq* module installed. In some cases
the *intel_pstate* driver may be the default Power Management environment.
To enable *acpi_cpufreq* and disable *intel_pstate*, add the following
to the grub Linux command line:

.. code-block:: console

Upon rebooting, load the *acpi_cpufreq* module:

.. code-block:: console

    modprobe acpi_cpufreq

Hypervisor Channel Configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Virtio-Serial channels are configured via libvirt XML:

.. code-block:: xml

    <name>{vm_name}</name>
    <controller type='virtio-serial' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </controller>
    <channel type='unix'>
      <source mode='bind' path='/tmp/powermonitor/{vm_name}.{channel_num}'/>
      <target type='virtio' name='virtio.serial.port.poweragent.{vm_channel_num}'/>
      <address type='virtio-serial' controller='0' bus='0' port='{N}'/>
    </channel>

A single controller of type *virtio-serial* is created, up to 32 channels
can be associated with a single controller, and multiple controllers can be specified.
The convention is to use the name of the VM in the host path *{vm_name}* and
to increment *{channel_num}* for each channel. Likewise, the port value *{N}*
must be incremented for each channel.

Each channel on the host will appear in *path*. The directory */tmp/powermonitor/*
must first be created and given qemu permissions:

.. code-block:: console

    mkdir /tmp/powermonitor/
    chown qemu:qemu /tmp/powermonitor

Note that files and directories within /tmp are generally removed upon
rebooting the host, so the above steps may need to be carried out after each reboot.

The serial device as it appears on a VM is configured with the *target* element attribute *name*
and must be in the form of *virtio.serial.port.poweragent.{vm_channel_num}*,
where *vm_channel_num* is typically the lcore channel to be used in DPDK VM applications.

Each channel on a VM will be present at */dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}*.
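
The naming conventions above can be expressed as two small path builders.
This is only a sketch of the convention: the function names are hypothetical,
and the directory and prefix are the ones used in this guide's example
configuration.

```c
#include <stdio.h>
#include <string.h>

#define HOST_CHANNEL_DIR  "/tmp/powermonitor"
#define GUEST_PORT_PREFIX "/dev/virtio-ports/virtio.serial.port.poweragent"

/*
 * Build the host-side socket path "/tmp/powermonitor/{vm_name}.{channel_num}".
 * Returns 0 on success, -1 on invalid input or if the buffer is too small.
 */
int
host_channel_path(char *buf, size_t len, const char *vm_name,
                  unsigned int channel_num)
{
    int n;

    if (buf == NULL || vm_name == NULL || strlen(vm_name) == 0)
        return -1;
    n = snprintf(buf, len, "%s/%s.%u", HOST_CHANNEL_DIR, vm_name, channel_num);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Build the guest-side device path for a given channel number.
 * Returns 0 on success, -1 on invalid input or if the buffer is too small.
 */
int
guest_channel_path(char *buf, size_t len, unsigned int vm_channel_num)
{
    int n;

    if (buf == NULL)
        return -1;
    n = snprintf(buf, len, "%s.%u", GUEST_PORT_PREFIX, vm_channel_num);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For a VM named ``vm0`` and channel 2, the host path is
``/tmp/powermonitor/vm0.2`` and the guest device is
``/dev/virtio-ports/virtio.serial.port.poweragent.2``.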

Compiling and Running the Host Application
------------------------------------------

Compiling the Application
-------------------------

To compile the sample application see :doc:`compiling`.

The application is located in the ``vm_power_manager`` sub-directory.

The application does not have any specific command line options other than *EAL*:

.. code-block:: console

    ./build/vm_power_mgr [EAL options]

The application requires exactly two cores to run: one core is dedicated to the CLI,
while the other is dedicated to the channel endpoint monitor. For example, to run
on cores 0 and 1 on a system with 4 memory channels:

.. code-block:: console

    ./build/vm_power_mgr -l 0-1 -n 4

After successful initialization, the user is presented with the VM Power Manager CLI:

.. code-block:: console

Virtual Machines can now be added to the VM Power Manager:

.. code-block:: console

    vm_power> add_vm {vm_name}

When a {vm_name} is specified with the *add_vm* command, a lookup is performed
with libvirt to ensure that the VM exists. {vm_name} is used as a unique identifier
to associate channels with a particular VM and for executing operations on a VM within the CLI.
VMs do not have to be running in order to add them.

A number of commands can be issued via the CLI in relation to VMs:

Remove a Virtual Machine identified by {vm_name} from the VM Power Manager.

.. code-block:: console

Add communication channels for the specified VM. The virtio channels must be enabled
in the VM configuration (qemu/libvirt) and the associated VM must be active.
{list} is a comma-separated list of channel numbers to add; using the keyword 'all'
will attempt to add all channels for the VM:

.. code-block:: console

    add_channels {vm_name} {list}|all
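
To make the ``{list}|all`` argument format concrete, the sketch below parses
such an argument into a 64-bit channel mask. It is an assumption-laden
illustration, not the CLI's actual parser: ``parse_channel_list`` is a
hypothetical name, and ``all`` is simply mapped to every bit being set.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHANNELS 64 /* per-VM maximum stated earlier in this guide */

/*
 * Parse "all" or a comma-separated list such as "0,2,5" into a bit mask
 * where bit N represents channel N.
 * Returns 0 on success, -1 on malformed input (mask content is then undefined).
 */
int
parse_channel_list(const char *list, uint64_t *mask)
{
    const char *p = list;
    char *end;
    unsigned long val;

    if (list == NULL || mask == NULL || *list == '\0')
        return -1;
    if (strcmp(list, "all") == 0) {
        *mask = UINT64_MAX;
        return 0;
    }

    *mask = 0;
    for (;;) {
        if (*p < '0' || *p > '9')
            return -1; /* every entry must start with a digit */
        errno = 0;
        val = strtoul(p, &end, 10);
        if (errno != 0 || val >= MAX_CHANNELS)
            return -1;
        *mask |= UINT64_C(1) << val;
        if (*end == '\0')
            return 0;
        if (*end != ',')
            return -1;
        p = end + 1;
    }
}
```

With this sketch, ``"0,2,5"`` yields the mask ``0x25``, while inputs such as
``"1,,2"`` or ``"64"`` are rejected.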

Enable or disable the communication channels in {list} (comma-separated)
for the specified VM. Alternatively, {list} can be replaced with the keyword 'all'.
Disabled channels will still receive packets on the host; however, the commands
they specify will be ignored. Set status to 'enabled' to begin processing requests again:

.. code-block:: console

    set_channel_status {vm_name} {list}|all enabled|disabled

Print to the CLI the information on the specified VM. The information
lists the number of vCPUs, the pinning to pCPU(s) as a bit mask, and
any communication channels associated with the VM, along with the status of each channel:

.. code-block:: console

Set the binding of a Virtual CPU on the VM with name {vm_name} to the Physical CPU mask:

.. code-block:: console

    set_pcpu_mask {vm_name} {vcpu} {pcpu}

Set the binding of a Virtual CPU on the VM to the Physical CPU:

.. code-block:: console

    set_pcpu {vm_name} {vcpu} {pcpu}
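
The relationship between a single core number and a core mask can be
sketched as below. The assumption, which is not confirmed by this guide,
is that bit N of a 64-bit mask represents physical core N; the helper names
are hypothetical.

```c
#include <stdint.h>

#define MAX_CORES 64 /* assumed width of the mask */

/* Return a mask with only bit `core` set, or 0 if core is out of range. */
uint64_t
core_to_mask(unsigned int core)
{
    return core < MAX_CORES ? UINT64_C(1) << core : 0;
}

/*
 * Store the index of each set bit of mask in cores[] (at most max entries)
 * and return the number of entries stored.
 */
unsigned int
mask_to_cores(uint64_t mask, unsigned int *cores, unsigned int max)
{
    unsigned int bit, n = 0;

    for (bit = 0; bit < MAX_CORES && n < max; bit++) {
        if (mask & (UINT64_C(1) << bit))
            cores[n++] = bit;
    }
    return n;
}
```

Under that assumption, a mask of ``0x5`` selects cores 0 and 2, and pinning
to core 3 alone corresponds to the mask ``0x8``.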

Manual control and inspection can also be carried out in relation to CPU frequency scaling:

Get the current frequency for each core specified in the mask:

.. code-block:: console

    show_cpu_freq_mask {mask}

Set the current frequency for the cores specified in {core_mask} by scaling each up/down/min/max:

.. code-block:: console

    set_cpu_freq_mask {core_mask} up|down|min|max

Get the current frequency for the specified core:

.. code-block:: console

    show_cpu_freq {core_num}

Set the current frequency for the specified core by scaling up/down/min/max:

.. code-block:: console

    set_cpu_freq {core_num} up|down|min|max
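
One way to picture the four scaling commands is as movements of an index
into a table of available frequencies. The sketch below assumes the table is
sorted from highest to lowest frequency, so index 0 is the maximum; both
that ordering and the names used are assumptions made for illustration.

```c
enum scale_cmd { SCALE_UP, SCALE_DOWN, SCALE_MIN, SCALE_MAX };

/*
 * Return the new index into a table of nb_freqs frequencies that is
 * assumed to be sorted highest first. Out-of-range inputs are clamped.
 */
unsigned int
apply_scale(unsigned int cur_idx, unsigned int nb_freqs, enum scale_cmd cmd)
{
    if (nb_freqs == 0)
        return 0;
    if (cur_idx >= nb_freqs)
        cur_idx = nb_freqs - 1;

    switch (cmd) {
    case SCALE_UP:   /* one step towards the highest frequency */
        return cur_idx > 0 ? cur_idx - 1 : 0;
    case SCALE_DOWN: /* one step towards the lowest frequency */
        return cur_idx + 1 < nb_freqs ? cur_idx + 1 : nb_freqs - 1;
    case SCALE_MIN:
        return nb_freqs - 1;
    case SCALE_MAX:
        return 0;
    }
    return cur_idx;
}
```

In this model, ``up`` and ``down`` are relative single steps that saturate at
the ends of the table, while ``min`` and ``max`` jump directly to the extremes.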

Compiling and Running the Guest Applications
--------------------------------------------

For compiling and running l3fwd-power, see :doc:`l3_forward_power_man`.

A guest CLI is also provided for validating the setup.

For both l3fwd-power and the guest CLI, the channels for the VM must be monitored by the
host application using the *add_channels* command on the host.

#. export RTE_SDK=/path/to/rte_sdk
#. cd ${RTE_SDK}/examples/vm_power_manager/guest_cli

The application does not have any specific command line options other than *EAL*:

.. code-block:: console

    ./build/guest_vm_power_mgr [EAL options]

The application uses one channel for each enabled lcore.
For example, to run on cores 0, 1, 2 and 3 on a system with 4 memory channels:

.. code-block:: console

    ./build/guest_vm_power_mgr -l 0-3 -n 4

After successful initialization, the user is presented with the VM Power Manager Guest CLI:

.. code-block:: console

To change the frequency of an lcore, use the set_cpu_freq command,
where {core_num} is the lcore and channel whose frequency is changed by scaling up/down/min/max:

.. code-block:: console

    set_cpu_freq {core_num} up|down|min|max