X-Git-Url: http://git.droids-corp.org/?a=blobdiff_plain;f=doc%2Fguides%2Fprog_guide%2Fpower_man.rst;h=c70ae128acf0db171a7a79ee97812061b37e86d3;hb=d1355fcc4607de529359c671c908bfbd2a5ffd0c;hp=eba1cc6bf3af7026c0fc7bbc3075749a7d4ba8e0;hpb=5630257fcc30397e7217139ec55da4f301f59fb7;p=dpdk.git

diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index eba1cc6bf3..c70ae128ac 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -106,9 +106,139 @@ User Cases

 The power management mechanism is used to save power when performing L3 forwarding.

+
+Empty Poll API
+--------------
+
+Abstract
+~~~~~~~~
+
+For packet processing workloads such as DPDK, polling is continuous.
+This means CPU cores always show 100% busy, independent of how much work
+those cores are doing. It is critical to determine accurately how busy
+a core is, because without that information:
+
+  * There is no indication of overload conditions
+  * The user does not know how much real load is on a system, resulting
+    in wasted energy as no power management is utilized
+
+Compared to the original l3fwd-power design, instead of going to sleep
+after detecting an empty poll, the new mechanism just lowers the core frequency.
+As a result, the application does not stop polling the device, which leads
+to improved handling of bursts of traffic.
+
+When the system becomes busy, the empty poll mechanism can also increase the core
+frequency (including turbo) to make a best effort for intensive traffic. This gives
+us more flexible and balanced traffic awareness than the standard l3fwd-power
+application.
+
+
+Proposed Solution
+~~~~~~~~~~~~~~~~~
+The proposed solution focuses on how many times empty polls are executed.
+The fewer the empty polls, the busier the current core is with processing
+workload, and therefore the higher the frequency that is needed.
+A high empty poll count indicates that the current core is not doing any real
+work, so the frequency can be lowered to save power.
+
+In the current implementation, each core has 1 empty-poll counter, which assumes
+1 core is dedicated to 1 queue. This will need to be expanded in the future to
+support multiple queues per core.
+
+Power state definition:
+^^^^^^^^^^^^^^^^^^^^^^^
+
+* LOW: Not currently used, reserved for future use.
+
+* MED: the frequency used to process a modest traffic workload.
+
+* HIGH: the frequency used to process a busy traffic workload.
+
+There are two phases to establish the power management system:
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Training phase. This phase is used to measure the optimal frequency
+  change thresholds for a given system. The thresholds will differ from
+  system to system due to differences in processor micro-architecture,
+  cache and device configurations.
+  In this phase, the user must ensure that no traffic can enter the
+  system so that counts can be measured for empty polls at low, medium
+  and high frequencies. Each frequency is measured for two seconds.
+  Once the training phase is complete, the threshold numbers are
+  displayed, normal mode resumes, and traffic can be allowed into
+  the system. These threshold numbers can be used on the command line
+  when starting the application in normal mode to avoid re-training
+  every time.
+
+* Normal phase. Every 10 ms the run-time counters are compared
+  to the supplied threshold values, and a decision is made
+  whether to move to a different power state (by adjusting the
+  frequency).
+
+API Overview for Empty Poll Power Management
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+* **State Init**: initialize the power management system.
+
+* **State Free**: free the resources held by the power management system.
+
+* **Update Empty Poll Counter**: update the empty poll counter.
+
+* **Update Valid Poll Counter**: update the valid poll counter.
+
+* **Set the Frequency Index**: update the power state/frequency mapping.
+
+* **Detect empty poll state change**: run the empty poll state change detection algorithm, then take action.
+
+User Cases
+----------
+The mechanism can be applied to any device which is based on polling, e.g. NIC, FPGA.
+
+Ethernet PMD Power Management API
+---------------------------------
+
+Abstract
+~~~~~~~~
+
+Existing power management mechanisms require developers
+to change application design or change code to make use of them.
+The PMD power management API provides a convenient alternative
+by utilizing Ethernet PMD RX callbacks,
+and triggering power saving whenever the empty poll count reaches a certain number.
+
+Monitor
+   This power saving scheme will put the CPU into an optimized power state
+   and use the ``rte_power_monitor()`` function
+   to monitor the Ethernet PMD RX descriptor address,
+   and wake the CPU up whenever there's new traffic.
+
+Pause
+   This power saving scheme will avoid busy polling
+   by either entering a power-optimized sleep state
+   with the ``rte_power_pause()`` function,
+   or, if it's not available, using ``rte_pause()``.
+
+Frequency scaling
+   This power saving scheme will use ``librte_power`` library
+   functionality to scale the core frequency up/down
+   depending on traffic volume.
+
+.. note::
+
+   Currently, this power management API is limited to mandatory mapping
+   of 1 queue to 1 core (multiple queues are supported,
+   but they must be polled from different cores).
+
+API Overview for Ethernet PMD Power Management
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+* **Queue Enable**: Enable a specific power scheme for a certain queue/port/core.
+
+* **Queue Disable**: Disable the power scheme for a certain queue/port/core.
+
 References
 ----------

-* l3fwd-power: The sample application in DPDK that performs L3 forwarding with power management.
+* The :doc:`../sample_app_ug/l3_forward_power_man`
+  chapter in the :doc:`../sample_app_ug/index` section.

-* The "L3 Forwarding with Power Management Sample Application" chapter in the *DPDK Sample Application's User Guide*.
+* The :doc:`../sample_app_ug/vm_power_management`
+  chapter in the :doc:`../sample_app_ug/index` section.
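
The normal-phase decision described in the Empty Poll section of this patch (compare the per-interval counters against the trained thresholds, then choose a power state) can be sketched in plain C. This is an illustrative sketch only and not the ``librte_power`` implementation: the names ``next_state``, ``struct trained_thresholds``, ``BUSY_PCT`` and ``IDLE_PCT``, and the 30%/70% cut-offs, are all assumptions made for the example. The patch itself only states that counters are compared to thresholds every 10 ms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Power states named in the patch; LOW is reserved and unused here. */
enum power_state { STATE_LOW, STATE_MED, STATE_HIGH };

/*
 * Hypothetical training result: empty polls counted in one interval
 * on an idle core at each frequency (no traffic entering the system).
 */
struct trained_thresholds {
    uint64_t idle_empty_polls_med;
    uint64_t idle_empty_polls_high;
};

/* Assumed cut-offs, as a percentage of the trained idle count. */
#define BUSY_PCT 30 /* below this, the core is considered busy */
#define IDLE_PCT 70 /* above this, the core is considered mostly idle */

/*
 * Decide the next power state from the empty polls seen in the last
 * interval. Fewer empty polls means more real work, so a higher
 * frequency is chosen; many empty polls allow a lower frequency.
 */
static enum power_state
next_state(enum power_state cur, uint64_t empty_polls,
           const struct trained_thresholds *t)
{
    uint64_t base;
    uint64_t idle_pct;

    assert(t != NULL);

    base = (cur == STATE_HIGH) ? t->idle_empty_polls_high
                               : t->idle_empty_polls_med;
    if (base == 0)
        return cur; /* no training data: leave the state unchanged */

    /* Empty polls as a percentage of the idle baseline for this state. */
    idle_pct = empty_polls * 100 / base;

    if (cur == STATE_MED && idle_pct < BUSY_PCT)
        return STATE_HIGH; /* few empty polls: scale up */
    if (cur == STATE_HIGH && idle_pct > IDLE_PCT)
        return STATE_MED;  /* mostly empty polls: scale down */
    return cur;
}
```

Using two separate cut-offs is a design choice of this sketch: it adds hysteresis so that a core hovering near one threshold does not flip between MED and HIGH on every interval. With assumed trained idle counts of 1000 (MED) and 2000 (HIGH), ``next_state(STATE_MED, 100, &t)`` selects ``STATE_HIGH``, while ``next_state(STATE_HIGH, 1900, &t)`` drops back to ``STATE_MED``.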