X-Git-Url: http://git.droids-corp.org/?a=blobdiff_plain;f=doc%2Fguides%2Fprog_guide%2Fpoll_mode_drv.rst;h=68312898448cb857e45c7a18584022d8c40e677c;hb=34fd4373ce76efd0236e59397c495762c2ec9e64;hp=a1a758b0fa0de36e9c7b1e9d019ec09858634c4d;hpb=ea85e7d711b664558a53a8131e22fdff952e5241;p=dpdk.git

diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst
index a1a758b0fa..6831289844 100644
--- a/doc/guides/prog_guide/poll_mode_drv.rst
+++ b/doc/guides/prog_guide/poll_mode_drv.rst
@@ -1,32 +1,5 @@
-..  BSD LICENSE
-    Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
-    All rights reserved.
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
-
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in
-    the documentation and/or other materials provided with the
-    distribution.
-    * Neither the name of Intel Corporation nor the names of its
-    contributors may be used to endorse or promote products derived
-    from this software without specific prior written permission.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2010-2015 Intel Corporation.
 
 .. _Poll_Mode_Driver:
 
@@ -84,7 +57,7 @@ Whenever needed and appropriate, asynchronous communication should be introduced
 
 Avoiding lock contention is a key issue in a multi-core environment.
 To address this issue, PMDs are designed to work with per-core private resources as much as possible.
-For example, a PMD maintains a separate transmit queue per-core, per-port.
+For example, a PMD maintains a separate transmit queue per-core, per-port, if the PMD is not ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable.
 In the same way, every receive queue of a port is assigned to and polled by a single logical core (lcore).
 
 To comply with Non-Uniform Memory Access (NUMA), memory management is designed to assign to each logical core
@@ -146,8 +119,18 @@ This is also true for the pipe-line model provided all logical cores used are lo
 Multiple logical cores should never share receive or transmit queues for interfaces since this would require
 global locks and hinder performance.
 
-Device Identification and Configuration
----------------------------------------
+If the PMD is ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capable, multiple threads can invoke ``rte_eth_tx_burst()``
+concurrently on the same tx queue without SW lock. This PMD feature found in some NICs and useful in the following use cases:
+
+* Remove explicit spinlock in some applications where lcores are not mapped to Tx queues with 1:1 relation.
+
+* In the eventdev use case, avoid dedicating a separate TX core for transmitting and thus
+  enables more scaling as all workers can send the packets.
+
+See `Hardware Offload`_ for ``RTE_ETH_TX_OFFLOAD_MT_LOCKFREE`` capability probing details.
+
+Device Identification, Ownership and Configuration
+--------------------------------------------------
 
 Device Identification
 ~~~~~~~~~~~~~~~~~~~~~
@@ -161,6 +144,16 @@ Based on their PCI identifier, NIC ports are assigned two other identifiers:
 *   A port name used to designate the port in console messages, for administration or debugging purposes.
     For ease of use, the port name includes the port index.
 
+Port Ownership
+~~~~~~~~~~~~~~
+The Ethernet devices ports can be owned by a single DPDK entity (application, library, PMD, process, etc).
+The ownership mechanism is controlled by ethdev APIs and allows to set/remove/get a port owner by DPDK entities.
+Allowing this should prevent any multiple management of Ethernet port by different entities.
+
+.. note::
+
+    It is the DPDK entity responsibility to set the port owner before using it and to manage the port usage synchronization between different threads or processes.
+
 Device Configuration
 ~~~~~~~~~~~~~~~~~~~~
 
@@ -290,7 +283,8 @@ Hardware Offload
 
 Depending on driver capabilities advertised by
 ``rte_eth_dev_info_get()``, the PMD may support hardware offloading
-feature like checksumming, TCP segmentation or VLAN insertion.
+feature like checksumming, TCP segmentation, VLAN insertion or
+lockfree multithreaded TX burst on the same TX queue.
 
 The support of these offload features implies the addition of dedicated
 status bit(s) and value field(s) into the rte_mbuf data structure, along
@@ -299,6 +293,41 @@ exported by each PMD. The list of flags and their precise meaning is
 described in the mbuf API documentation and in the in :ref:`Mbuf Library
 <Mbuf_Library>`, section "Meta Information".
 
+Per-Port and Per-Queue Offloads
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the DPDK offload API, offloads are divided into per-port and per-queue offloads as follows:
+
+* A per-queue offloading can be enabled on a queue and disabled on another queue at the same time.
+* A pure per-port offload is the one supported by device but not per-queue type.
+* A pure per-port offloading can't be enabled on a queue and disabled on another queue at the same time.
+* A pure per-port offloading must be enabled or disabled on all queues at the same time.
+* Any offloading is per-queue or pure per-port type, but can't be both types at same devices.
+* Port capabilities = per-queue capabilities + pure per-port capabilities.
+* Any supported offloading can be enabled on all queues.
+
+The different offloads capabilities can be queried using ``rte_eth_dev_info_get()``.
+The ``dev_info->[rt]x_queue_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all per-queue offloading capabilities.
+The ``dev_info->[rt]x_offload_capa`` returned from ``rte_eth_dev_info_get()`` includes all pure per-port and per-queue offloading capabilities.
+Supported offloads can be either per-port or per-queue.
+
+Offloads are enabled using the existing ``RTE_ETH_TX_OFFLOAD_*`` or ``RTE_ETH_RX_OFFLOAD_*`` flags.
+Any requested offloading by an application must be within the device capabilities.
+Any offloading is disabled by default if it is not set in the parameter
+``dev_conf->[rt]xmode.offloads`` to ``rte_eth_dev_configure()`` and
+``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``.
+
+If any offloading is enabled in ``rte_eth_dev_configure()`` by an application,
+it is enabled on all queues no matter whether it is per-queue or
+per-port type and no matter whether it is set or cleared in
+``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()``.
+
+If a per-queue offloading hasn't been enabled in ``rte_eth_dev_configure()``,
+it can be enabled or disabled in ``rte_eth_[rt]x_queue_setup()`` for individual queue.
+A newly added offloads in ``[rt]x_conf->offloads`` to ``rte_eth_[rt]x_queue_setup()`` input by application
+is the one which hasn't been enabled in ``rte_eth_dev_configure()`` and is requested to be enabled
+in ``rte_eth_[rt]x_queue_setup()``. It must be per-queue type, otherwise trigger an error log.
+
 Poll Mode Driver API
 --------------------
 
@@ -331,6 +360,35 @@ Ethernet Device API
 
 The Ethernet device API exported by the Ethernet PMDs is described in the *DPDK API Reference*.
 
+.. _ethernet_device_standard_device_arguments:
+
+Ethernet Device Standard Device Arguments
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Standard Ethernet device arguments allow for a set of commonly used arguments/
+parameters which are applicable to all Ethernet devices to be available to for
+specification of specific device and for passing common configuration
+parameters to those ports.
+
+* ``representor`` for a device which supports the creation of representor ports
+  this argument allows user to specify which switch ports to enable port
+  representors for. Multiple representors in one device argument is invalid::
+
+   -a DBDF,representor=vf0
+   -a DBDF,representor=vf[0,4,6,9]
+   -a DBDF,representor=vf[0-31]
+   -a DBDF,representor=vf[0,2-4,7,9-11]
+   -a DBDF,representor=sf0
+   -a DBDF,representor=sf[1,3,5]
+   -a DBDF,representor=sf[0-1023]
+   -a DBDF,representor=sf[0,2-4,7,9-11]
+   -a DBDF,representor=pf1vf0
+   -a DBDF,representor=pf[0-1]sf[0-127]
+   -a DBDF,representor=pf1
+
+Note: PMDs are not required to support the standard device arguments and users
+should consult the relevant PMD documentation to see support devargs.
+
 Extended Statistics API
 ~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -360,8 +418,8 @@ strings split by a single underscore ``_``. The scheme is as follows:
 * detail n
 * unit
 
-Examples of common statistics xstats strings, formatted to comply to the
-above scheme:
+Examples of common statistics xstats strings, formatted to comply to the scheme
+proposed above:
 
 * ``rx_bytes``
 * ``rx_crc_errors``
@@ -375,7 +433,7 @@ associated with the receive side of the NIC. The second component ``packets``
 indicates that the unit of measure is packets.
 
 A more complicated example: ``tx_size_128_to_255_packets``. In this example,
-``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc. are
+``tx`` indicates transmission, ``size`` is the first detail, ``128`` etc are
 more details, and ``packets`` indicates that this is a packet counter.
 
 Some additions in the metadata scheme are as follows:
@@ -414,13 +472,13 @@ The API is built out of a small number of functions, which can be used to
 retrieve the number of statistics and the names, IDs and values of those
 statistics.
 
-* ``rte_eth_xstats_get_names()``: returns the names of the statistics. When given a
+* ``rte_eth_xstats_get_names_by_id()``: returns the names of the statistics. When given a
   ``NULL`` parameter the function returns the number of statistics that are available.
 
 * ``rte_eth_xstats_get_id_by_name()``: Searches for the statistic ID that matches
   ``xstat_name``. If found, the ``id`` integer is set.
 
-* ``rte_eth_xstats_get()``: Fills in an array of ``uint64_t`` values
+* ``rte_eth_xstats_get_by_id()``: Fills in an array of ``uint64_t`` values
   with matching the provided ``ids`` array. If the ``ids`` array is NULL, it
   returns all statistics that are available.
 
@@ -444,7 +502,7 @@ First step is to get all statistics names and list them:
     int len, i;
 
     /* Get number of stats */
-    len = rte_eth_xstats_get_names(port_id, NULL, NULL, 0);
+    len = rte_eth_xstats_get_names_by_id(port_id, NULL, NULL, 0);
     if (len < 0) {
         printf("Cannot get xstats count\n");
         goto err;
@@ -457,7 +515,7 @@ First step is to get all statistics names and list them:
     }
 
     /* Retrieve xstats names, passing NULL for IDs to return all statistics */
-    if (len != rte_eth_xstats_get_names(port_id, xstats_names, NULL, len)) {
+    if (len != rte_eth_xstats_get_names_by_id(port_id, xstats_names, NULL, len)) {
         printf("Cannot get xstat names\n");
         goto err;
     }
@@ -469,7 +527,7 @@ First step is to get all statistics names and list them:
     }
 
     /* Getting xstats values */
-    if (len != rte_eth_xstats_get(port_id, NULL, values, len)) {
+    if (len != rte_eth_xstats_get_by_id(port_id, NULL, values, len)) {
         printf("Cannot get xstat values\n");
         goto err;
     }
@@ -490,7 +548,7 @@ ids of those statistics by looking up the name as follows:
     const char *xstat_name = "rx_errors";
 
     if(!rte_eth_xstats_get_id_by_name(port_id, xstat_name, &id)) {
-        rte_eth_xstats_get(port_id, &id, &value, 1);
+        rte_eth_xstats_get_by_id(port_id, &id, &value, 1);
         printf("%s: %"PRIu64"\n", xstat_name, value);
     }
     else {
@@ -511,7 +569,7 @@ statistics simpler for the application.
     uint64_t value_array[APP_NUM_STATS];
 
     /* Getting multiple xstats values from array of IDs */
-    rte_eth_xstats_get(port_id, ids_array, value_array, APP_NUM_STATS);
+    rte_eth_xstats_get_by_id(port_id, ids_array, value_array, APP_NUM_STATS);
 
     uint32_t i;
     for(i = 0; i < APP_NUM_STATS; i++) {
@@ -524,4 +582,44 @@ This array lookup API for xstats allows the application create multiple
 call. As an end result, the application is able to achieve its goal of
 monitoring a single statistic ("rx_errors" in this case), and if that shows
 packets being dropped, it can easily retrieve a "set" of statistics using the
-IDs array parameter to ``rte_eth_xstats_get`` function.
+IDs array parameter to ``rte_eth_xstats_get_by_id`` function.
+
+NIC Reset API
+~~~~~~~~~~~~~
+
+.. code-block:: c
+
+    int rte_eth_dev_reset(uint16_t port_id);
+
+Sometimes a port has to be reset passively. For example when a PF is
+reset, all its VFs should also be reset by the application to make them
+consistent with the PF. A DPDK application also can call this function
+to trigger a port reset. Normally, a DPDK application would invokes this
+function when an RTE_ETH_EVENT_INTR_RESET event is detected.
+
+It is the duty of the PMD to trigger RTE_ETH_EVENT_INTR_RESET events and
+the application should register a callback function to handle these
+events. When a PMD needs to trigger a reset, it can trigger an
+RTE_ETH_EVENT_INTR_RESET event. On receiving an RTE_ETH_EVENT_INTR_RESET
+event, applications can handle it as follows: Stop working queues, stop
+calling Rx and Tx functions, and then call rte_eth_dev_reset(). For
+thread safety all these operations should be called from the same thread.
+
+For example when PF is reset, the PF sends a message to notify VFs of
+this event and also trigger an interrupt to VFs. Then in the interrupt
+service routine the VFs detects this notification message and calls
+rte_eth_dev_callback_process(dev, RTE_ETH_EVENT_INTR_RESET, NULL).
+This means that a PF reset triggers an RTE_ETH_EVENT_INTR_RESET
+event within VFs. The function rte_eth_dev_callback_process() will
+call the registered callback function. The callback function can trigger
+the application to handle all operations the VF reset requires including
+stopping Rx/Tx queues and calling rte_eth_dev_reset().
+
+The rte_eth_dev_reset() itself is a generic function which only does
+some hardware reset operations through calling dev_unint() and
+dev_init(), and itself does not handle synchronization, which is handled
+by application.
+
+The PMD itself should not call rte_eth_dev_reset(). The PMD can trigger
+the application to handle reset event. It is duty of application to
+handle all synchronization before it calls rte_eth_dev_reset().
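
Reviewer's sketch (not part of the patch): the "Per-Port and Per-Queue Offloads" text added above says that an offload enabled at configure time applies to every queue, and that an offload newly requested at queue setup must be of per-queue type. The plain C model below illustrates that rule only. It does not call DPDK; the function name ``demo_queue_offloads_check`` and the ``DEMO_OFFLOAD_*`` bit values are hypothetical, and returning ``-1`` on rejection is a choice made for this sketch (the patch text only says an error log is triggered).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for illustration; these are NOT the real RTE_ETH_* values. */
#define DEMO_OFFLOAD_CHECKSUM (UINT64_C(1) << 0)
#define DEMO_OFFLOAD_TSO      (UINT64_C(1) << 1)
#define DEMO_OFFLOAD_VLAN     (UINT64_C(1) << 2)

/*
 * Model of the queue-setup rule described in the patch text.
 *
 * port_enabled    : offloads enabled for the whole port at configure time
 * queue_capa      : offloads the device reports as per-queue capable
 * queue_requested : offloads passed for one queue at queue setup
 * effective       : on success, the offloads active on that queue
 *
 * Returns 0 on success, -1 if a newly added offload is not per-queue type.
 */
int
demo_queue_offloads_check(uint64_t port_enabled, uint64_t queue_capa,
                          uint64_t queue_requested, uint64_t *effective)
{
    /* Offloads requested for this queue that were not enabled port-wide. */
    uint64_t newly_added = queue_requested & ~port_enabled;

    /* A newly added offload must be of per-queue type. */
    if ((newly_added & ~queue_capa) != 0)
        return -1;

    /* Port-level offloads stay on, whether or not the queue request set them. */
    *effective = port_enabled | newly_added;
    return 0;
}
```

With checksum enabled port-wide and only TSO reported as per-queue capable, a queue that requests TSO ends up with both offloads, a queue that requests nothing still has checksum, and a queue that requests VLAN insertion is rejected.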