X-Git-Url: http://git.droids-corp.org/?a=blobdiff_plain;f=doc%2Fguides%2Fsample_app_ug%2Fvm_power_management.rst;h=1b6de8e93636848ece04f105a1e9a53b27e0e602;hb=3cb46d40d35960ca0478704d1e84e8d96b5676cd;hp=5be9f24d597d303f65c56db3209e1385262b9a84;hpb=3f04e13a87be9e7b4a6c7dbd50bc186d37a33953;p=dpdk.git diff --git a/doc/guides/sample_app_ug/vm_power_management.rst b/doc/guides/sample_app_ug/vm_power_management.rst index 5be9f24d59..1b6de8e936 100644 --- a/doc/guides/sample_app_ug/vm_power_management.rst +++ b/doc/guides/sample_app_ug/vm_power_management.rst @@ -1,71 +1,58 @@ .. SPDX-License-Identifier: BSD-3-Clause Copyright(c) 2010-2014 Intel Corporation. -VM Power Management Application -=============================== - -Introduction ------------- - -Applications running in Virtual Environments have an abstract view of -the underlying hardware on the Host, in particular applications cannot see -the binding of virtual to physical hardware. -When looking at CPU resourcing, the pinning of Virtual CPUs(vCPUs) to -Host Physical CPUs(pCPUS) is not apparent to an application -and this pinning may change over time. -Furthermore, Operating Systems on virtual machines do not have the ability -to govern their own power policy; the Machine Specific Registers (MSRs) -for enabling P-State transitions are not exposed to Operating Systems -running on Virtual Machines(VMs). - -The Virtual Machine Power Management solution shows an example of -how a DPDK application can indicate its processing requirements using VM local -only information(vCPU/lcore, etc.) to a Host based Monitor which is responsible -for accepting requests for frequency changes for a vCPU, translating the vCPU -to a pCPU via libvirt and affecting the change in frequency. - -The solution is comprised of two high-level components: - -#. 
Example Host Application - - Using a Command Line Interface(CLI) for VM->Host communication channel management - allows adding channels to the Monitor, setting and querying the vCPU to pCPU pinning, - inspecting and manually changing the frequency for each CPU. - The CLI runs on a single lcore while the thread responsible for managing - VM requests runs on a second lcore. - - VM requests arriving on a channel for frequency changes are passed - to the librte_power ACPI cpufreq sysfs based library. - The Host Application relies on both qemu-kvm and libvirt to function. - - This monitoring application is responsible for: - - - Accepting requests from client applications: Client applications can - request frequency changes for a vCPU, translating - the vCPU to a pCPU via libvirt and affecting the change in frequency. - - - Accepting policies from client applications: Client application can - send a policy to the host application. The - host application will then apply the rules of the policy independent - of the application. For example, the policy can contain time-of-day - information for busy/quiet periods, and the host application can scale - up/down the relevant cores when required. See the details of the guest - application below for more information on setting the policy values. - - - Out-of-band monitoring of workloads via cores hardware event counters: - The host application can manage power for an application in a virtualised - OR non-virtualised environment by looking at the event counters of the - cores and taking action based on the branch hit/miss ratio. See the host - application '--core-list' command line parameter below. - -#. librte_power for Virtual Machines - - Using an alternate implementation for the librte_power API, requests for - frequency changes are forwarded to the host monitor rather than - the APCI cpufreq sysfs interface used on the host. 
- - The l3fwd-power application will use this implementation when deployed on a VM - (see :doc:`l3_forward_power_man`). +Virtual Machine Power Management Application +============================================ + +Applications running in virtual environments have an abstract view of +the underlying hardware on the host. Specifically, applications cannot +see the binding of virtual components to physical hardware. When looking +at CPU resourcing, the pinning of Virtual CPUs (vCPUs) to Physical CPUs +(pCPUs) on the host is not apparent to an application and this pinning +may change over time. In addition, operating systems on Virtual Machines +(VMs) do not have the ability to govern their own power policy. The +Machine Specific Registers (MSRs) for enabling P-state transitions are +not exposed to the operating systems running on the VMs. + +The solution demonstrated in this sample application shows an example of +how a DPDK application can indicate its processing requirements using +VM-local only information (vCPU/lcore, and so on) to a host resident VM +Power Manager. The VM Power Manager is responsible for: + +- **Accepting requests for frequency changes for a vCPU** +- **Translating the vCPU to a pCPU using libvirt** +- **Performing the change in frequency** + +This application demonstrates the following features: + +- **The handling of VM application requests to change frequency.** + VM applications can request frequency changes for a vCPU. The VM + Power Management Application uses libvirt to translate that + virtual CPU (vCPU) request to a physical CPU (pCPU) request and + performs the frequency change. + +- **The acceptance of power management policies from VM applications.** + A VM application can send a policy to the host application. The + policy contains rules that define the power management behaviour + of the VM. The host application then applies the rules of the + policy independent of the VM application. 
For example, the + policy can contain time-of-day information for busy/quiet + periods, and the host application can scale up/down the relevant + cores when required. See :ref:`sending_policy` for information on + setting policy values. + +- **Out-of-band monitoring of workloads using core hardware event counters.** + The host application can manage power for an application by looking + at the event counters of the cores and taking action based on the + branch miss/hit ratio. See :ref:`enabling_out_of_band`. + + **Note**: This functionality also applies in non-virtualised environments. + +In addition to the ``librte_power`` library used on the host, the +application uses a special version of ``librte_power`` on each VM, which +directs frequency changes and policies to the host monitor rather than +the ACPI ``cpufreq`` ``sysfs`` interface used on the host in non-virtualised +environments. .. _figure_vm_power_mgr_highlevel: @@ -73,47 +60,95 @@ The solution is comprised of two high-level components: Highlevel Solution +In the above diagram, the DPDK Applications are shown running in +virtual machines, and the VM Power Monitor application is shown running +in the host. + +**DPDK VM Application** + +- Reuses the ``librte_power`` interface, but uses an implementation that + forwards frequency requests to the host using a ``virtio-serial`` channel +- Each lcore has exclusive access to a single channel +- Sample application reuses ``l3fwd_power`` +- A CLI for changing frequency from within a VM is also included + +**VM Power Monitor** + +- Accepts VM commands over ``virtio-serial`` endpoints, monitored + using ``epoll`` +- Commands include the virtual core to be modified, using ``libvirt`` to get + the physical core mapping +- Uses ``librte_power`` to effect frequency changes using Linux userspace + power governor (``acpi_cpufreq`` OR ``intel_pstate`` driver) +- CLI: For adding VM channels to monitor, inspecting and changing channel + state, manually altering CPU frequency. 
Also allows for the changing + of vCPU to pCPU pinning + +Sample Application Architecture Overview +---------------------------------------- + +The VM power management solution employs ``qemu-kvm`` to provide +communications channels between the host and VMs in the form of a +``virtio-serial`` connection that appears as a para-virtualised serial +device on a VM and can be configured to use various backends on the +host. For this example, each ``virtio-serial`` endpoint +on the host is configured as an ``AF_UNIX`` file socket, supporting poll/select and +``epoll`` for event notification. In this example, each channel endpoint on +the host is monitored for ``EPOLLIN`` events using ``epoll``. Each channel +is specified as ``qemu-kvm`` arguments or as ``libvirt`` XML for each VM, +where each VM can have several channels up to a maximum of 64 per VM. In this +example, each DPDK lcore on a VM has exclusive access to a channel. + +To enable frequency changes from within a VM, the VM forwards a +``librte_power`` request over the ``virtio-serial`` channel to the host. Each +request contains the vCPU and power command (scale up/down/min/max). The +API for the host ``librte_power`` and guest ``librte_power`` is consistent +across environments, with the selection of VM or host implementation +determined automatically at runtime based on the environment. On +receiving a request, the host translates the vCPU to a pCPU using the +libvirt API before forwarding it to the host ``librte_power``. -Overview --------- - -VM Power Management employs qemu-kvm to provide communications channels -between the host and VMs in the form of Virtio-Serial which appears as -a paravirtualized serial device on a VM and can be configured to use -various backends on the host. For this example each Virtio-Serial endpoint -on the host is configured as AF_UNIX file socket, supporting poll/select -and epoll for event notification. 
-In this example each channel endpoint on the host is monitored via -epoll for EPOLLIN events. -Each channel is specified as qemu-kvm arguments or as libvirt XML for each VM, -where each VM can have a number of channels up to a maximum of 64 per VM, -in this example each DPDK lcore on a VM has exclusive access to a channel. - -To enable frequency changes from within a VM, a request via the librte_power interface -is forwarded via Virtio-Serial to the host, each request contains the vCPU -and power command(scale up/down/min/max). -The API for host and guest librte_power is consistent across environments, -with the selection of VM or Host Implementation determined at automatically -at runtime based on the environment. - -Upon receiving a request, the host translates the vCPU to a pCPU via -the libvirt API before forwarding to the host librte_power. .. _figure_vm_power_mgr_vm_request_seq: .. figure:: img/vm_power_mgr_vm_request_seq.* - VM request to scale frequency - +In addition to the ability to send power management requests to the +host, a VM can send a power management policy to the host. In some +cases, using a power management policy is a preferred option because it +can eliminate possible latency issues that can occur when sending power +management requests. Once the VM sends the policy to the host, the VM no +longer needs to worry about power management, because the host now +manages the power for the VM based on the policy. The policy can specify +power behavior that is based on incoming traffic rates or time-of-day +power adjustment (busy/quiet hour power adjustment for example). See +:ref:`sending_policy` for more information. + +One method of power management is to sense how busy a core is when +processing packets and adjust power accordingly. One technique for +doing this is to monitor the ratio of branch misses to branch +hits and scale the core power accordingly. 
This technique is based +on the premise that when a core is not processing packets, the ratio of +branch misses to branch hits is very low, but when the core is +processing packets, it is measurably higher. The implementation of this +capability is as a policy of type ``BRANCH_RATIO``. +See :ref:`sending_policy` for more information on using the +BRANCH_RATIO policy option. + +A JSON interface enables the specification of power management requests +and policies in JSON format. The JSON interfaces provide a more +convenient and more easily interpreted interface for the specification +of requests and policies. See :ref:`power_man_requests` for more information. Performance Considerations ~~~~~~~~~~~~~~~~~~~~~~~~~~ -While Haswell Microarchitecture allows for independent power control for each core, -earlier Microarchtectures do not offer such fine grained control. -When deployed on pre-Haswell platforms greater care must be taken in selecting -which cores are assigned to a VM, for instance a core will not scale down -until its sibling is similarly scaled. +While the Haswell microarchitecture allows for independent power control +for each core, earlier microarchitectures do not offer such fine-grained +control. When deploying on pre-Haswell platforms, greater care must be +taken when selecting which cores are assigned to a VM, for example, a +core does not scale down in frequency until all of its siblings are +similarly scaled down. Configuration ------------- @@ -121,657 +156,552 @@ Configuration BIOS ~~~~ -Enhanced Intel SpeedStep® Technology must be enabled in the platform BIOS -if the power management feature of DPDK is to be used. -Otherwise, the sys file folder /sys/devices/system/cpu/cpu0/cpufreq will not exist, -and the CPU frequency-based power management cannot be used. -Consult the relevant BIOS documentation to determine how these settings -can be accessed. 
+To use the power management features of the DPDK, you must enable +Enhanced Intel SpeedStep® Technology in the platform BIOS. Otherwise, +the ``sys`` file folder ``/sys/devices/system/cpu/cpu0/cpufreq`` does not +exist, and you cannot use CPU frequency-based power management. Refer to the +relevant BIOS documentation to determine how to access these settings. Host Operating System ~~~~~~~~~~~~~~~~~~~~~ -The Host OS must also have the *apci_cpufreq* module installed, in some cases -the *intel_pstate* driver may be the default Power Management environment. -To enable *acpi_cpufreq* and disable *intel_pstate*, add the following -to the grub Linux command line: +The DPDK Power Management library can use either the ``acpi_cpufreq`` or +the ``intel_pstate`` kernel driver for the management of core frequencies. In +many cases, the ``intel_pstate`` driver is the default power management +environment. -.. code-block:: console +Should the ``acpi_cpufreq`` driver be required, the ``intel_pstate`` +module must be disabled, and the ``acpi_cpufreq`` module loaded in its place. - intel_pstate=disable +To disable the ``intel_pstate`` driver, add the following to the ``grub`` +Linux command line: -Upon rebooting, load the *acpi_cpufreq* module: + ``intel_pstate=disable`` -.. code-block:: console +On reboot, load the ``acpi_cpufreq`` module: - modprobe acpi_cpufreq + ``modprobe acpi_cpufreq`` Hypervisor Channel Configuration ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Virtio-Serial channels are configured via libvirt XML: - +Configure ``virtio-serial`` channels using ``libvirt`` XML. +The XML structure is as follows: -.. code-block:: xml +.. code-block:: XML - {vm_name} - -
- - - - -
- + {vm_name} + +
+ + + + +
+ +Where a single controller of type ``virtio-serial`` is created, up to 32 +channels can be associated with a single controller, and multiple +controllers can be specified. The convention is to use the name of the +VM in the host path ``{vm_name}`` and to increment ``{channel_num}`` for each +channel. Likewise, the port value ``{N}`` must be incremented for each +channel. -Where a single controller of type *virtio-serial* is created and up to 32 channels -can be associated with a single controller and multiple controllers can be specified. -The convention is to use the name of the VM in the host path *{vm_name}* and -to increment *{channel_num}* for each channel, likewise the port value *{N}* -must be incremented for each channel. - -Each channel on the host will appear in *path*, the directory */tmp/powermonitor/* -must first be created and given qemu permissions +On the host, for each channel to appear in the path, ensure the creation +of the ``/tmp/powermonitor/`` directory and the assignment of ``qemu`` +permissions: .. code-block:: console - mkdir /tmp/powermonitor/ - chown qemu:qemu /tmp/powermonitor + mkdir /tmp/powermonitor/ + chown qemu:qemu /tmp/powermonitor + +Note that files and directories in ``/tmp`` are generally removed when +rebooting the host and you may need to perform the previous steps after +each reboot. -Note that files and directories within /tmp are generally removed upon -rebooting the host and the above steps may need to be carried out after each reboot. +The serial device as it appears on a VM is configured with the target +element attribute name and must be in the form: +``virtio.serial.port.poweragent.{vm_channel_num}``, where +``vm_channel_num`` is typically the lcore channel to be used in +DPDK VM applications. 
-The serial device as it appears on a VM is configured with the *target* element attribute *name* -and must be in the form of *virtio.serial.port.poweragent.{vm_channel_num}*, -where *vm_channel_num* is typically the lcore channel to be used in DPDK VM applications. +Each channel on a VM is present at: -Each channel on a VM will be present at */dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}* +``/dev/virtio-ports/virtio.serial.port.poweragent.{vm_channel_num}`` Compiling and Running the Host Application ------------------------------------------ -Compiling -~~~~~~~~~ +Compiling the Host Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -For information on compiling DPDK and the sample applications +For information on compiling the DPDK and sample applications, see :doc:`compiling`. -The application is located in the ``vm_power_manager`` sub-directory. +The application is located in the ``vm_power_manager`` subdirectory. To build just the ``vm_power_manager`` application using ``make``: .. code-block:: console - export RTE_SDK=/path/to/rte_sdk - export RTE_TARGET=build - cd ${RTE_SDK}/examples/vm_power_manager/ - make + export RTE_SDK=/path/to/rte_sdk + export RTE_TARGET=build + cd ${RTE_SDK}/examples/vm_power_manager/ + make -The resulting binary will be ${RTE_SDK}/build/examples/vm_power_manager +The resulting binary is ``${RTE_SDK}/build/examples/vm_power_manager``. -To build just the ``vm_power_manager`` application using ``meson/ninja``: +To build just the ``vm_power_manager`` application using ``meson``/``ninja``: .. 
code-block:: console - export RTE_SDK=/path/to/rte_sdk - cd ${RTE_SDK} - meson build - cd build - ninja - meson configure -Dexamples=vm_power_manager - ninja + export RTE_SDK=/path/to/rte_sdk + cd ${RTE_SDK} + meson build + cd build + ninja + meson configure -Dexamples=vm_power_manager + ninja -The resulting binary will be ${RTE_SDK}/build/examples/dpdk-vm_power_manager +The resulting binary is ``${RTE_SDK}/build/examples/dpdk-vm_power_manager``. -Running -~~~~~~~ +Running the Host Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The application does not have any specific command line options other than *EAL*: +The application does not have any specific command line options other +than the EAL options: .. code-block:: console - ./build/vm_power_mgr [EAL options] + ./build/vm_power_mgr [EAL options] -The application requires exactly two cores to run, one core is dedicated to the CLI, -while the other is dedicated to the channel endpoint monitor, for example to run -on cores 0 & 1 on a system with 4 memory channels: +The application requires exactly two cores to run. One core for the CLI +and the other for the channel endpoint monitor. For example, to run on +cores 0 and 1 on a system with four memory channels, issue the command: .. code-block:: console - ./build/vm_power_mgr -l 0-1 -n 4 + ./build/vm_power_mgr -l 0-1 -n 4 -After successful initialization the user is presented with VM Power Manager CLI: +After successful initialization, the VM Power Manager CLI prompt appears: .. code-block:: console - vm_power> + vm_power> -Virtual Machines can now be added to the VM Power Manager: +Now, it is possible to add virtual machines to the VM Power Manager: .. code-block:: console - vm_power> add_vm {vm_name} - -When a {vm_name} is specified with the *add_vm* command a lookup is performed -with libvirt to ensure that the VM exists, {vm_name} is used as an unique identifier -to associate channels with a particular VM and for executing operations on a VM within the CLI. 
-VMs do not have to be running in order to add them. + vm_power> add_vm {vm_name} -A number of commands can be issued via the CLI in relation to VMs: - - Remove a Virtual Machine identified by {vm_name} from the VM Power Manager. - - .. code-block:: console +When a ``{vm_name}`` is specified with the ``add_vm`` command, a lookup is +performed with ``libvirt`` to ensure that the VM exists. ``{vm_name}`` is a +unique identifier to associate channels with a particular VM and for +executing operations on a VM within the CLI. VMs do not have to be +running to add them. - rm_vm {vm_name} +It is possible to issue several commands from the CLI to manage VMs. - Add communication channels for the specified VM, the virtio channels must be enabled - in the VM configuration(qemu/libvirt) and the associated VM must be active. - {list} is a comma-separated list of channel numbers to add, using the keyword 'all' - will attempt to add all channels for the VM: +Remove the virtual machine identified by ``{vm_name}`` from the VM Power +Manager using the command: - .. code-block:: console - - add_channels {vm_name} {list}|all - - Enable or disable the communication channels in {list}(comma-separated) - for the specified VM, alternatively list can be replaced with keyword 'all'. - Disabled channels will still receive packets on the host, however the commands - they specify will be ignored. Set status to 'enabled' to begin processing requests again: - - .. code-block:: console - - set_channel_status {vm_name} {list}|all enabled|disabled - - Print to the CLI the information on the specified VM, the information - lists the number of vCPUS, the pinning to pCPU(s) as a bit mask, along with - any communication channels associated with each VM, along with the status of each channel: - - .. code-block:: console +.. 
code-block:: console - show_vm {vm_name} + rm_vm {vm_name} - Set the binding of Virtual CPU on VM with name {vm_name} to the Physical CPU mask: +Add communication channels for the specified VM using the following +command. The ``virtio`` channels must be enabled in the VM configuration +(``qemu/libvirt``) and the associated VM must be active. ``{list}`` is a +comma-separated list of channel numbers to add. Specifying the keyword +``all`` attempts to add all channels for the VM: - .. code-block:: console +.. code-block:: console - set_pcpu_mask {vm_name} {vcpu} {pcpu} + set_pcpu {vm_name} {vcpu} {pcpu} - Set the binding of Virtual CPU on VM to the Physical CPU: + Enable query of physical core information from a VM: - .. code-block:: console +.. code-block:: console - set_pcpu {vm_name} {vcpu} {pcpu} + set_query {vm_name} enable|disable Manual control and inspection can also be carried in relation CPU frequency scaling: Get the current frequency for each core specified in the mask: - .. code-block:: console +.. code-block:: console - show_cpu_freq_mask {mask} + show_cpu_freq_mask {mask} Set the current frequency for the cores specified in {core_mask} by scaling each up/down/min/max: - .. code-block:: console +.. code-block:: console - set_cpu_freq {core_mask} up|down|min|max + add_channels {vm_name} {list}|all - Get the current frequency for the specified core: +Enable or disable the communication channels in ``{list}`` (comma-separated) +for the specified VM. Alternatively, replace ``list`` with the keyword +``all``. Disabled channels receive packets on the host. However, the commands +they specify are ignored. Set the status to enabled to begin processing +requests again: - .. code-block:: console - - show_cpu_freq {core_num} +.. code-block:: console - Set the current frequency for the specified core by scaling up/down/min/max: + set_channel_status {vm_name} {list}|all enabled|disabled - .. code-block:: console +Print to the CLI information on the specified VM. 
The information lists +the number of vCPUs, the pinning to pCPU(s) as a bit mask, along with +any communication channels associated with each VM, and the status of +each channel: - set_cpu_freq {core_num} up|down|min|max +.. code-block:: console -There are also some command line parameters for enabling the out-of-band -monitoring of branch ratio on cores doing busy polling via PMDs. + show_vm {vm_name} - .. code-block:: console +Set the binding of a virtual CPU on a VM with name ``{vm_name}`` to the +physical CPU mask: - --core-list {list of cores} +.. code-block:: console - When this parameter is used, the list of cores specified will monitor the ratio - between branch hits and branch misses. A tightly polling PMD thread will have a - very low branch ratio, so the core frequency will be scaled down to the minimim - allowed value. When packets are received, the code path will alter, causing the - branch ratio to increase. When the ratio goes above the ratio threshold, the - core frequency will be scaled up to the maximum allowed value. + set_pcpu_mask {vm_name} {vcpu} {pcpu} +Set the binding of the virtual CPU on the VM to the physical CPU: +  .. code-block:: console - --branch-ratio {ratio} - - The branch ratio is a floating point number that specifies the threshold at which - to scale up or down for the given workload. The default branch ratio is 0.01, - and will need to be adjusted for different workloads. - - - -JSON API -~~~~~~~~ - -In addition to the command line interface for host command and a virtio-serial -interface for VM power policies, there is also a JSON interface through which -power commands and policies can be sent. This functionality adds a dependency -on the Jansson library, and the Jansson development package must be installed -on the system before the JSON parsing functionality is included in the app. -This is achieved by: - - .. 
code-block:: javascript - - apt-get install libjansson-dev - -The command and package name may be different depending on your operating -system. It's worth noting that the app will successfully build without this -package present, but a warning is shown during compilation, and the JSON -parsing functionality will not be present in the app. - -Sending a command or policy to the power manager application is achieved by -simply opening a fifo file, writing a JSON string to that fifo, and closing -the file. - -The fifo is at /tmp/powermonitor/fifo - -The jason string can be a policy or instruction, and takes the following -format: - - .. code-block:: javascript + set_pcpu {vm_name} {vcpu} {pcpu} - {"packet_type": { - "pair_1": value, - "pair_2": value - }} +It is also possible to perform manual control and inspection in relation +to CPU frequency scaling. -The 'packet_type' header can contain one of two values, depending on -whether a policy or power command is being sent. The two possible values are -"policy" and "instruction", and the expected name-value pairs is different -depending on which type is being sent. +Get the current frequency for each core specified in the mask: -The pairs are the format of standard JSON name-value pairs. The value type -varies between the different name/value pairs, and may be integers, strings, -arrays, etc. Examples of policies follow later in this document. The allowed -names and value types are as follows: +.. code-block:: console + show_cpu_freq_mask {mask} -:Pair Name: "name" -:Description: Name of the VM or Host. Allows the parser to associate the - policy with the relevant VM or Host OS. -:Type: string -:Values: any valid string -:Required: yes -:Example: +Set the current frequency for the cores specified in ``{core_mask}`` by +scaling each up/down/min/max: - .. code-block:: javascript +.. 
code-block:: console - "name", "ubuntu2" + set_cpu_freq {core_mask} up|down|min|max +Get the current frequency for the specified core: -:Pair Name: "command" -:Description: The type of packet we're sending to the power manager. We can be - creating or destroying a policy, or sending a direct command to adjust - the frequency of a core, similar to the command line interface. -:Type: string -:Values: +.. code-block:: console - :CREATE: used when creating a new policy, - :DESTROY: used when removing a policy, - :POWER: used when sending an immediate command, max, min, etc. -:Required: yes -:Example: + show_cpu_freq {core_num} - .. code-block:: javascript +Set the current frequency for the specified core by scaling up/down/min/max: - "command", "CREATE" +.. code-block:: console + set_cpu_freq {core_num} up|down|min|max -:Pair Name: "policy_type" -:Description: Type of policy to apply. Please see vm_power_manager documentation - for more information on the types of policies that may be used. -:Type: string -:Values: +.. _enabling_out_of_band: - :TIME: Time-of-day policy. Frequencies of the relevant cores are - scaled up/down depending on busy and quiet hours. - :TRAFFIC: This policy takes statistics from the NIC and scales up - and down accordingly. - :WORKLOAD: This policy looks at how heavily loaded the cores are, - and scales up and down accordingly. - :BRANCH_RATIO: This out-of-band policy can look at the ratio between - branch hits and misses on a core, and is useful for detecting - how much packet processing a core is doing. -:Required: only for CREATE/DESTROY command -:Example: +Command Line Options for Enabling Out-of-band Branch Ratio Monitoring +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - .. 
code-block:: javascript +There are a couple of command line parameters for enabling the out-of-band +monitoring of branch ratios on cores doing busy polling using PMDs as +described below: - "policy_type", "TIME" +``--core-branch-ratio {list of cores}:{branch ratio for listed cores}`` + Specify the list of cores to monitor the ratio of branch misses + to branch hits. A tightly-polling PMD thread has a very low + branch ratio, therefore the core frequency scales down to the + minimum allowed value. On receiving packets, the code path changes, + causing the branch ratio to increase. When the ratio goes above + the ratio threshold, the core frequency scales up to the maximum + allowed value. The specified branch-ratio is a floating point number + that identifies the threshold at which to scale up or down for the + elements of the core-list. If not included, the default branch ratio of + 0.01 is used, but it will need adjustment for different workloads. -:Pair Name: "busy_hours" -:Description: The hours of the day in which we scale up the cores for busy - times. -:Type: array of integers -:Values: array with list of hour numbers, (0-23) -:Required: only for TIME policy -:Example: + This parameter can be used multiple times for different sets of cores. + The branch ratio mechanism can also be useful for non-PMD cores and + hyper-threaded environments where C-States are disabled. - .. code-block:: javascript - "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ] +Compiling and Running the Guest Applications +-------------------------------------------- -:Pair Name: "quiet_hours" -:Description: The hours of the day in which we scale down the cores for quiet - times. -:Type: array of integers -:Values: array with list of hour numbers, (0-23) -:Required: only for TIME policy -:Example: +It is possible to use the ``l3fwd-power`` application (for example) with the +``vm_power_manager``. - .. code-block:: javascript +The distribution also provides a guest CLI for validating the setup. 
- "quiet_hours":[ 2, 3, 4, 5, 6 ] +For both ``l3fwd-power`` and the guest CLI, the host application must use +the ``add_channels`` command to monitor the channels for the VM. To do this, +issue the following commands in the host application: -:Pair Name: "avg_packet_thresh" -:Description: Threshold below which the frequency will be set to min for - the TRAFFIC policy. If the traffic rate is above this and below max, the - frequency will be set to medium. -:Type: integer -:Values: The number of packets below which the TRAFFIC policy applies the - minimum frequency, or medium frequency if between avg and max thresholds. -:Required: only for TRAFFIC policy -:Example: - - .. code-block:: javascript +.. code-block:: console - "avg_packet_thresh": 100000 + vm_power> add_vm vmname + vm_power> add_channels vmname all + vm_power> set_channel_status vmname all enabled + vm_power> show_vm vmname -:Pair Name: "max_packet_thresh" -:Description: Threshold above which the frequency will be set to max for - the TRAFFIC policy -:Type: integer -:Values: The number of packets per interval above which the TRAFFIC policy - applies the maximum frequency -:Required: only for TRAFFIC policy -:Example: +Compiling the Guest Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - .. code-block:: javascript +For information on compiling DPDK and the sample applications in general, +see :doc:`compiling`. - "max_packet_thresh": 500000 +For compiling and running the ``l3fwd-power`` sample application, see +:doc:`l3_forward_power_man`. -:Pair Name: "core_list" -:Description: The cores to which to apply the policy. -:Type: array of integers -:Values: array with list of virtual CPUs. -:Required: only policy CREATE/DESTROY -:Example: +The application is in the ``guest_cli`` subdirectory under ``vm_power_manager``. - .. code-block:: javascript +To build just the ``guest_vm_power_manager`` application using ``make``, issue +the following commands: - "core_list":[ 10, 11 ] +.. 
code-block:: console -:Pair Name: "workload" -:Description: When our policy is of type WORKLOAD, we need to specify how - heavy our workload is. -:Type: string -:Values: + export RTE_SDK=/path/to/rte_sdk + export RTE_TARGET=build + cd ${RTE_SDK}/examples/vm_power_manager/guest_cli/ + make - :HIGH: For cores running workloads that require high frequencies - :MEDIUM: For cores running workloads that require medium frequencies - :LOW: For cores running workloads that require low frequencies -:Required: only for WORKLOAD policy types -:Example: +The resulting binary is ``${RTE_SDK}/build/examples/guest_cli``. - .. code-block:: javascript +**Note**: This sample application conditionally links in the Jansson JSON +library. Consequently, if you are using a multilib or cross-compile +environment, you may need to set the ``PKG_CONFIG_LIBDIR`` environmental +variable to point to the relevant ``pkgconfig`` folder so that the correct +library is linked in. - "workload", "MEDIUM" +For example, if you are building for a 32-bit target, you could find the +correct directory using the following find command: -:Pair Name: "mac_list" -:Description: When our policy is of type TRAFFIC, we need to specify the - MAC addresses that the host needs to monitor -:Type: string -:Values: array with a list of mac address strings. -:Required: only for TRAFFIC policy types -:Example: +.. code-block:: console - .. code-block:: javascript + # find /usr -type d -name pkgconfig + /usr/lib/i386-linux-gnu/pkgconfig + /usr/lib/x86_64-linux-gnu/pkgconfig - "mac_list":[ "de:ad:be:ef:01:01", "de:ad:be:ef:01:02" ] +Then use: -:Pair Name: "unit" -:Description: the type of power operation to apply in the command -:Type: string -:Values: +.. 
code-block:: console - :SCALE_MAX: Scale frequency of this core to maximum - :SCALE_MIN: Scale frequency of this core to minimum - :SCALE_UP: Scale up frequency of this core - :SCALE_DOWN: Scale down frequency of this core - :ENABLE_TURBO: Enable Turbo Boost for this core - :DISABLE_TURBO: Disable Turbo Boost for this core -:Required: only for POWER instruction -:Example: + export PKG_CONFIG_LIBDIR=/usr/lib/i386-linux-gnu/pkgconfig - .. code-block:: javascript +You then use the ``make`` command as normal, which should find the 32-bit +version of the library, if it is installed. If not, the application builds +without the JSON interface functionality. - "unit", "SCALE_MAX" +To build just the ``guest_vm_power_manager`` application using ``meson``/``ninja``: -:Pair Name: "resource_id" -:Description: The core to which to apply the power command. -:Type: integer -:Values: valid core id for VM or host OS. -:Required: only POWER instruction -:Example: +.. code-block:: console - .. code-block:: javascript + export RTE_SDK=/path/to/rte_sdk + cd ${RTE_SDK} + meson build + cd build + ninja + meson configure -Dexamples=vm_power_manager/guest_cli + ninja - "resource_id": 10 +The resulting binary is ``${RTE_SDK}/build/examples/guest_cli``. -JSON API Examples -~~~~~~~~~~~~~~~~~ +Running the Guest Application +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -Profile create example: +The standard EAL command line parameters are necessary: - .. code-block:: javascript +.. code-block:: console - {"policy": { - "name": "ubuntu", - "command": "create", - "policy_type": "TIME", - "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ], - "quiet_hours":[ 2, 3, 4, 5, 6 ], - "core_list":[ 11 ] - }} + ./build/guest_vm_power_mgr [EAL options] -- [guest options] -Profile destroy example: +The guest example uses a channel for each lcore enabled. For example, to +run on cores 0, 1, 2 and 3: - .. code-block:: javascript +.. 
code-block:: console - {"profile": { - "name": "ubuntu", - "command": "destroy", - }} + ./build/guest_vm_power_mgr -l 0-3 -Power command example: +.. _sending_policy: - .. code-block:: javascript +Command Line Options Available When Sending a Policy to the Host +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - {"command": { - "name": "ubuntu", - "unit": "SCALE_MAX", - "resource_id": 10 - }} +Optionally, there are several command line options for a user who needs +to send a power policy to the host application: -To send a JSON string to the Power Manager application, simply paste the -example JSON string into a text file and cat it into the fifo: +``--vm-name {name of guest vm}`` + Allows the user to change the virtual machine name + passed down to the host application using the power policy. + The default is ubuntu2. - .. code-block:: console +``--vcpu-list {list vm cores}`` + A comma-separated list of cores in the VM that the user + wants the host application to monitor. + The list of cores in any VM starts at zero, + and the host application maps these to the physical cores + once the policy passes down to the host. + Valid syntax includes individual cores 2,3,4, + a range of cores 2-4, or a combination of both 1,3,5-7. - cat file.json >/tmp/powermonitor/fifo +``--busy-hours {list of busy hours}`` + A comma-separated list of hours in which to set the core + frequency to the maximum. + Valid syntax includes individual hours 2,3,4, + a range of hours 2-4, or a combination of both 1,3,5-7. + Valid hour values are 0 to 23. -The console of the Power Manager application should indicate the command that -was just received via the fifo. +``--quiet-hours {list of quiet hours}`` + A comma-separated list of hours in which to set the core frequency + to minimum. Valid syntax includes individual hours 2,3,4, + a range of hours 2-4, or a combination of both 1,3,5-7. + Valid hour values are 0 to 23. 
-Compiling and Running the Guest Applications --------------------------------------------- +``--policy {policy type}`` + The type of policy. This can be one of the following values: -l3fwd-power is one sample application that can be used with vm_power_manager. + - TRAFFIC - Based on incoming traffic rates on the NIC. + - TIME - Uses a busy/quiet hours policy. + - BRANCH_RATIO - Uses branch ratio counters to determine core busyness. + - WORKLOAD - Sets the frequency to low, medium or high + based on the received policy setting. -A guest CLI is also provided for validating the setup. + **Note**: Not all policy types need all parameters. + For example, BRANCH_RATIO only needs the vcpu-list parameter. -For both l3fwd-power and guest CLI, the channels for the VM must be monitored by the -host application using the *add_channels* command on the host. This typically uses -the following commands in the host application: +After successful initialization, the VM Power Manager Guest CLI prompt +appears: .. code-block:: console - vm_power> add_vm vmname - vm_power> add_channels vmname all - vm_power> set_channel_status vmname all enabled - vm_power> show_vm vmname - - -Compiling -~~~~~~~~~ + vm_power(guest)> -For information on compiling DPDK and the sample applications -see :doc:`compiling`. - -For compiling and running l3fwd-power, see :doc:`l3_forward_power_man`. - -The application is located in the ``guest_cli`` sub-directory under ``vm_power_manager``. - -To build just the ``guest_vm_power_manager`` application using ``make``: +To change the frequency of an lcore, use a ``set_cpu_freq`` command similar +to the following: .. code-block:: console - export RTE_SDK=/path/to/rte_sdk - export RTE_TARGET=build - cd ${RTE_SDK}/examples/vm_power_manager/guest_cli/ - make - -The resulting binary will be ${RTE_SDK}/build/examples/guest_cli + set_cpu_freq {core_num} up|down|min|max -.. 
Note:: - This sample application conditionally links in the Jansson JSON - library, so if you are using a multilib or cross compile environment you - may need to set the ``PKG_CONFIG_LIBDIR`` environmental variable to point to - the relevant pkgconfig folder so that the correct library is linked in. +where, ``{core_num}`` is the lcore and channel to change frequency by +scaling up/down/min/max. - For example, if you are building for a 32-bit target, you could find the - correct directory using the following ``find`` command: - - .. code-block:: console - - # find /usr -type d -name pkgconfig - /usr/lib/i386-linux-gnu/pkgconfig - /usr/lib/x86_64-linux-gnu/pkgconfig - - Then use: - - .. code-block:: console - - export PKG_CONFIG_LIBDIR=/usr/lib/i386-linux-gnu/pkgconfig - - You then use the make command as normal, which should find the 32-bit - version of the library, if it installed. If not, the application will - be built without the JSON interface functionality. - -To build just the ``vm_power_manager`` application using ``meson/ninja``: +To start an application, configure the power policy, and send it to the +host, use a command like the following: .. code-block:: console - export RTE_SDK=/path/to/rte_sdk - cd ${RTE_SDK} - meson build - cd build - ninja - meson configure -Dexamples=vm_power_manager/guest_cli - ninja + ./build/guest_vm_power_mgr -l 0-3 -n 4 -- --vm-name=ubuntu --policy=BRANCH_RATIO --vcpu-list=2-4 -The resulting binary will be ${RTE_SDK}/build/examples/guest_cli +Once the VM Power Manager Guest CLI appears, issuing the 'send_policy now' command +will send the policy to the host: -Running -~~~~~~~ +.. code-block:: console -The standard *EAL* command line parameters are required: + send_policy now -.. code-block:: console +Once the policy is sent to the host, the host application takes over the power monitoring +of the specified cores in the policy. - ./build/guest_vm_power_mgr [EAL options] -- [guest options] +.. 
_power_man_requests: -The guest example uses a channel for each lcore enabled. For example, -to run on cores 0,1,2,3: +JSON Interface for Power Management Requests and Policies +--------------------------------------------------------- -.. code-block:: console +In addition to the command line interface for the host command, and a +``virtio-serial`` interface for VM power policies, there is also a JSON +interface through which power commands and policies can be sent. - ./build/guest_vm_power_mgr -l 0-3 +**Note**: This functionality adds a dependency on the Jansson library. +Install the Jansson development package on the system to avail of the +JSON parsing functionality in the app. Issue the ``apt-get install +libjansson-dev`` command to install the development package. The command +and package name may be different depending on your operating system. It +is worth noting that the app builds successfully if this package is not +present, but a warning displays during compilation, and the JSON parsing +functionality is not present in the app. -Optionally, there is a list of command line parameter should the user wish to send a power -policy down to the host application. These parameters are as follows: +Send a request or policy to the VM Power Manager by simply opening a +fifo file at ``/tmp/powermonitor/fifo``, writing a JSON string to that file, +and closing the file. - .. code-block:: console +The JSON string can be a power management request or a policy, and takes +the following format: - --vm-name {name of guest vm} +.. code-block:: javascript - This parameter allows the user to change the Virtual Machine name passed down to the - host application via the power policy. The default is "ubuntu2" + {"packet_type": { + "pair_1": value, + "pair_2": value + }} - .. code-block:: console +The ``packet_type`` header can contain one of two values, depending on +whether a power management request or policy is being sent. 
The two +possible values are ``instruction`` and ``policy`` and the expected name-value +pairs are different depending on which type is sent. - --vcpu-list {list vm cores} +The pairs are in the format of standard JSON name-value pairs. The value +type varies between the different name-value pairs, and may be integers, +strings, arrays, and so on. See :ref:`json_interface_ex` +for examples of policies and instructions and +:ref:`json_name_value_pair` for the supported names and value types. - A comma-separated list of cores in the VM that the user wants the host application to - monitor. The list of cores in any vm starts at zero, and these are mapped to the - physical cores by the host application once the policy is passed down. - Valid syntax includes individial cores '2,3,4', or a range of cores '2-4', or a - combination of both '1,3,5-7' +.. _json_interface_ex: - .. code-block:: console +JSON Interface Examples +~~~~~~~~~~~~~~~~~~~~~~~ - --busy-hours {list of busy hours} +The following is an example JSON string that creates a time-profile +policy. - A comma-separated list of hours within which to set the core frequency to maximum. - Valid syntax includes individial hours '2,3,4', or a range of hours '2-4', or a - combination of both '1,3,5-7'. Valid hours are 0 to 23. +.. code-block:: JSON - .. code-block:: console + {"policy": { + "name": "ubuntu", + "command": "create", + "policy_type": "TIME", + "busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ], + "quiet_hours":[ 2, 3, 4, 5, 6 ], + "core_list":[ 11 ] + }} - --quiet-hours {list of quiet hours} +The following is an example JSON string that removes the named policy. - A comma-separated list of hours within which to set the core frequency to minimum. - Valid syntax includes individial hours '2,3,4', or a range of hours '2-4', or a - combination of both '1,3,5-7'. Valid hours are 0 to 23. +.. code-block:: JSON - .. 
code-block:: console + {"policy": { + "name": "ubuntu", + "command": "destroy" + }} - --policy {policy type} +The following is an example JSON string for a power management request. - The type of policy. This can be one of the following values: - TRAFFIC - based on incoming traffic rates on the NIC. - TIME - busy/quiet hours policy. - BRANCH_RATIO - uses branch ratio counters to determine core busyness. - Not all parameters are needed for all policy types. For example, BRANCH_RATIO - only needs the vcpu-list parameter, not any of the hours. +.. code-block:: JSON + {"instruction": { + "name": "ubuntu", + "command": "power", + "unit": "SCALE_MAX", + "resource_id": 10 + }} -After successful initialization the user is presented with VM Power Manager Guest CLI: +To query the available frequencies of an lcore, use the query_cpu_freq command. +Where {core_num} is the lcore to query. +Before using this command, please enable responses via the set_query command on the host. .. code-block:: console - vm_power(guest)> + query_cpu_freq {core_num}|all -To change the frequency of a lcore, use the set_cpu_freq command. -Where {core_num} is the lcore and channel to change frequency by scaling up/down/min/max. +To query the capabilities of an lcore, use the query_cpu_caps command. +Where {core_num} is the lcore to query. +Before using this command, please enable responses via the set_query command on the host. .. code-block:: console - set_cpu_freq {core_num} up|down|min|max + query_cpu_caps {core_num}|all To start the application and configure the power policy, and send it to the host: @@ -789,3 +719,226 @@ will send the policy to the host: Once the policy is sent to the host, the host application takes over the power monitoring of the specified cores in the policy. +.. 
_json_name_value_pair: + +JSON Name-value Pairs +~~~~~~~~~~~~~~~~~~~~~ + +The following are the name-value pairs supported by the JSON interface: + +- `avg_packet_thresh`_ +- `busy_hours`_ +- `command`_ +- `core_list`_ +- `mac_list`_ +- `max_packet_thresh`_ +- `name`_ +- `policy_type`_ +- `quiet_hours`_ +- `resource_id`_ +- `unit`_ +- `workload`_ + +avg_packet_thresh +^^^^^^^^^^^^^^^^^ + +Description + The threshold below which the frequency is set to the minimum value + for the TRAFFIC policy. + If the traffic rate is above this value and below the maximum value, + the frequency is set to medium. +Type + integer +Values + The number of packets below which the TRAFFIC policy applies + the minimum frequency, or the medium frequency + if between the average and maximum thresholds. +Required + For the TRAFFIC policy only. +Example + ``"avg_packet_thresh": 100000`` + +busy_hours +^^^^^^^^^^ + +Description + The hours of the day in which we scale up the cores for busy times. +Type + array of integers +Values + An array with a list of hour values (0-23). +Required + For the TIME policy only. +Example + ``"busy_hours":[ 17, 18, 19, 20, 21, 22, 23 ]`` + +command +^^^^^^^ + +Description + The type of packet to send to the VM Power Manager. + It is possible to create or destroy a policy or send a direct command + to adjust the frequency of a core, + as is possible on the command line interface. +Type + string +Values + Possible values are: + - CREATE: Create a new policy. + - DESTROY: Remove an existing policy. + - POWER: Send an immediate command, max, min, and so on. +Required + Yes +Example + ``"command": "CREATE"`` + +core_list +^^^^^^^^^ + +Description + The cores to which to apply a policy. +Type + array of integers +Values + An array with a list of virtual CPUs. +Required + For CREATE/DESTROY policy requests only.
+Example + ``"core_list":[ 10, 11 ]`` + +mac_list +^^^^^^^^ + +Description + When the policy is of type TRAFFIC, + it is necessary to specify the MAC addresses that the host must monitor. +Type + array of strings +Values + An array with a list of MAC address strings. +Required + For TRAFFIC policy types only. +Example + ``"mac_list":[ "de:ad:be:ef:01:01","de:ad:be:ef:01:02" ]`` + +max_packet_thresh +^^^^^^^^^^^^^^^^^ + +Description + In a policy of type TRAFFIC, + the threshold value above which the frequency is set to a maximum. +Type + integer +Values + The number of packets per interval above which + the TRAFFIC policy applies the maximum frequency. +Required + For the TRAFFIC policy only. +Example + ``"max_packet_thresh": 500000`` + +name +^^^^ + +Description + The name of the VM or host. + Allows the parser to associate the policy with the relevant VM or host OS. +Type + string +Values + Any valid string. +Required + Yes +Example + ``"name": "ubuntu2"`` + +policy_type +^^^^^^^^^^^ + +Description + The type of policy to apply. + See the ``--policy`` option description for more information. +Type + string +Values + Possible values are: + + - TIME: Time-of-day policy. + Scale the frequencies of the relevant cores up/down + depending on busy and quiet hours. + - TRAFFIC: Use statistics from the NIC and scale up and down accordingly. + - WORKLOAD: Determine how heavily loaded the cores are + and scale up and down accordingly. + - BRANCH_RATIO: An out-of-band policy that looks at the ratio + between branch hits and misses on a core + and uses that information to determine how much packet processing + a core is doing. + +Required + For ``CREATE`` and ``DESTROY`` policy requests only. +Example + ``"policy_type": "TIME"`` + +quiet_hours +^^^^^^^^^^^ + +Description + The hours of the day to scale down the cores for quiet times. +Type + array of integers +Values + An array with a list of hour numbers with values in the range 0 to 23. +Required + For the TIME policy only. 
+Example + ``"quiet_hours":[ 2, 3, 4, 5, 6 ]`` + +resource_id +^^^^^^^^^^^ + +Description + The core to which to apply a power command. +Type + integer +Values + A valid core ID for the VM or host OS. +Required + For the ``POWER`` instruction only. +Example + ``"resource_id": 10`` + +unit +^^^^ + +Description + The type of power operation to apply in the command. +Type + string +Values + - SCALE_MAX: Scale the frequency of this core to the maximum. + - SCALE_MIN: Scale the frequency of this core to the minimum. + - SCALE_UP: Scale up the frequency of this core. + - SCALE_DOWN: Scale down the frequency of this core. + - ENABLE_TURBO: Enable Intel® Turbo Boost Technology for this core. + - DISABLE_TURBO: Disable Intel® Turbo Boost Technology for this core. +Required + For the ``POWER`` instruction only. +Example + ``"unit": "SCALE_MAX"`` + +workload +^^^^^^^^ + +Description + In a policy of type WORKLOAD, + it is necessary to specify how heavy the workload is. +Type + string +Values + - HIGH: For cores running workloads that require high frequencies. + - MEDIUM: For cores running workloads that require medium frequencies. + - LOW: For cores running workloads that require low frequencies. +Required + For the ``WORKLOAD`` policy only. +Example + ``"workload": "MEDIUM"``
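The JSON interface described above can also be driven from a small script rather than by hand. The following is a minimal sketch, not part of the sample application: the helper names ``build_time_policy`` and ``send_to_fifo`` are hypothetical, and the lowercase ``"create"`` command value is copied from the JSON interface example. It assembles a TIME policy from the documented name-value pairs, checks that the hours fall in the documented 0 to 23 range, and writes the string to the fifo at ``/tmp/powermonitor/fifo``.

```python
import json

# Fifo path as given in the JSON interface description.
FIFO_PATH = "/tmp/powermonitor/fifo"


def build_time_policy(name, busy_hours, quiet_hours, core_list):
    """Return a JSON string for a TIME policy create request (hypothetical helper)."""
    # The guide states that valid hour values are 0 to 23.
    for hour in list(busy_hours) + list(quiet_hours):
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range 0-23: {hour}")
    packet = {
        "policy": {
            "name": name,
            "command": "create",  # value as written in the JSON example
            "policy_type": "TIME",
            "busy_hours": list(busy_hours),
            "quiet_hours": list(quiet_hours),
            "core_list": list(core_list),
        }
    }
    return json.dumps(packet)


def send_to_fifo(payload, path=FIFO_PATH):
    """Open the fifo, write the JSON string, and close it (hypothetical helper)."""
    # Opening a fifo for writing blocks until a reader, here the host
    # application, has it open for reading.
    with open(path, "w") as fifo:
        fifo.write(payload)
```

A call such as ``send_to_fifo(build_time_policy("ubuntu", [17, 18, 19], [2, 3, 4], [11]))`` would then be equivalent to saving the JSON string to a file and writing it to the fifo by hand. Building the string with ``json.dumps`` avoids syntax slips such as trailing commas, which strict JSON parsers typically reject.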