.. SPDX-License-Identifier: BSD-3-Clause
- Copyright(c) 2019 Intel Corporation.
+ Copyright(c) 2019-2020 Intel Corporation.
AF_XDP Poll Mode Driver
==========================
Current implementation only supports single queue, multi-queues feature will
be added later.
-Note that MTU of AF_XDP PMD is limited due to XDP lacks support for
-fragmentation.
-
AF_XDP PMD enables need_wakeup flag by default if it is supported. This
need_wakeup feature is used to support executing application and driver on the
same core efficiently. This feature not only has a large positive performance
* ``iface`` - name of the Kernel interface to attach to (required);
* ``start_queue`` - starting netdev queue id (optional, default 0);
* ``queue_count`` - total netdev queue number (optional, default 1);
+* ``shared_umem`` - PMD will attempt to share UMEM with others (optional,
+ default 0);
+* ``xdp_prog`` - path to custom xdp program (optional, default none);
+* ``busy_budget`` - busy polling budget (optional, default 64);
Prerequisites
-------------
* A Kernel bound interface to attach to;
* For need_wakeup feature, it requires kernel version later than v5.3-rc1;
* For PMD zero copy, it requires kernel version later than v5.4-rc1;
+* For shared_umem, it requires kernel version v5.10 or later and libbpf version
+ v0.2.0 or later.
+* For 32-bit OS, a kernel with version 5.4 or later is required.
+* For busy polling, kernel version v5.11 or later is required.
Set up an af_xdp interface
-----------------------------
.. code-block:: console
--vdev net_af_xdp,iface=ens786f1
+
+Limitations
+-----------
+
+- **MTU**
+
+ The MTU of the AF_XDP PMD is limited due to the XDP requirement of one packet
+ per page. In the PMD we report the maximum MTU for zero copy to be equal
+ to the page size less the frame overhead introduced by AF_XDP (XDP HR = 256)
+ and DPDK (frame headroom = 320). With a 4K page size this works out at 3520.
+ However in practice this value may be even smaller, due to differences between
+ the supported RX buffer sizes of the underlying kernel netdev driver.
+
+ For example, the largest RX buffer size supported by the underlying kernel driver
+ which is less than the page size (4096B) may be 3072B. In this case, the maximum
+ MTU value will be at most 3072, but likely even smaller than this, once relevant
+ headers are accounted for eg. Ethernet and VLAN.
+
+ To determine the actual maximum MTU value of the interface you are using with the
+ AF_XDP PMD, consult the documentation for the kernel driver.
+
+ Note: The AF_XDP PMD will fail to initialise if an MTU which violates the driver's
+ conditions as above is set prior to launching the application.
+
+- **Shared UMEM**
+
+ The sharing of UMEM is only supported for AF_XDP sockets with unique contexts.
+ The context refers to the netdev,qid tuple.
+
+ The following combination will fail:
+
+ .. code-block:: console
+
+ --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
+ --vdev net_af_xdp1,iface=ens786f1,shared_umem=1 \
+
+ Either of the following however is permitted since either the netdev or qid differs
+ between the two vdevs:
+
+ .. code-block:: console
+
+ --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
+ --vdev net_af_xdp1,iface=ens786f1,start_queue=1,shared_umem=1 \
+
+ .. code-block:: console
+
+ --vdev net_af_xdp0,iface=ens786f1,shared_umem=1 \
+ --vdev net_af_xdp1,iface=ens786f2,shared_umem=1 \
+
+- **Preferred Busy Polling**
+
+ The SO_PREFER_BUSY_POLL socket option was introduced in kernel v5.11. It can
+ deliver a performance improvement for sockets with heavy traffic loads and
+ can significantly improve single-core performance in this context.
+
+ The feature is enabled by default in the AF_XDP PMD. To disable it, set the
+ 'busy_budget' vdevarg to zero:
+
+ .. code-block:: console
+
+ --vdev net_af_xdp0,iface=ens786f1,busy_budget=0
+
+ The default 'busy_budget' is 64 and it represents the number of packets the
+ kernel will attempt to process in the netdev's NAPI context. You can change
+ the value for example to 256 like so:
+
+ .. code-block:: console
+
+ --vdev net_af_xdp0,iface=ens786f1,busy_budget=256
+
+ It is also strongly recommended to set the following for optimal performance:
+
+ .. code-block:: console
+
+ echo 2 | sudo tee /sys/class/net/ens786f1/napi_defer_hard_irqs
+ echo 200000 | sudo tee /sys/class/net/ens786f1/gro_flush_timeout
+
+ The above defers interrupts for interface ens786f1 and instead schedules its
+ NAPI context from a watchdog timer instead of from softirqs. More information
+ on this feature can be found at [1].
+
+ [1] https://lwn.net/Articles/837010/
\ No newline at end of file