MLX5 poll mode driver
=====================
-The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support for
-**Mellanox ConnectX-4 EN** and **Mellanox ConnectX-4 Lx EN** families of
-10/25/40/50/100 Gb/s adapters as well as their virtual functions (VF) in
-SR-IOV context.
+The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support
+for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** and **Mellanox
+ConnectX-5** families of 10/25/40/50/100 Gb/s adapters as well as their
+virtual functions (VF) in SR-IOV context.
Information and documentation about these adapters can be found on the
`Mellanox website <http://www.mellanox.com>`__. Help is also provided by the
be enabled manually by setting ``CONFIG_RTE_LIBRTE_MLX5_PMD=y`` and
recompiling DPDK.
-.. warning::
-
- ``CONFIG_RTE_BUILD_COMBINE_LIBS`` with ``CONFIG_RTE_BUILD_SHARED_LIB``
- is not supported and thus the compilation will fail with this configuration.
-
Implementation details
----------------------
Enabling librte_pmd_mlx5 causes DPDK applications to be linked against
libibverbs.
+Features
+--------
+
+- Multiple TX and RX queues.
+- Support for scattered TX and RX frames.
+- IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
+- Several RSS hash keys, one for each flow type.
+- Configurable RETA table.
+- Support for multiple MAC addresses.
+- VLAN filtering.
+- RX VLAN stripping.
+- TX VLAN insertion.
+- RX CRC stripping configuration.
+- Promiscuous mode.
+- Multicast promiscuous mode.
+- Hardware checksum offloads.
+- Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and
+ RTE_ETH_FDIR_REJECT).
+- Flow API.
+- Secondary process TX is supported.
+- KVM and VMware ESX SR-IOV modes are supported.
+- RSS hash result is supported.
+- Hardware TSO.
+
+Limitations
+-----------
+
+- Inner RSS for VXLAN frames is not supported yet.
+- Port statistics through software counters only.
+- Hardware checksum offloads for VXLAN inner header are not supported yet.
+- Secondary process RX is not supported.
+
Configuration
-------------
adds additional run-time checks and debugging messages at the cost of
lower performance.
-- ``CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N`` (default **4**)
-
- Number of scatter/gather elements (SGEs) per work request (WR). Lowering
- this number improves performance but also limits the ability to receive
- scattered packets (packets that do not fit a single mbuf). The default
- value is a safe tradeoff.
-
-- ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)
-
- Amount of data to be inlined during TX operations. Improves latency but
- lowers throughput.
-
- ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**)
Maximum number of cached memory pools (MPs) per TX queue. Each MP from
This value is always 1 for RX queues since they use a single MP.
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
+
+- ``MLX5_PMD_ENABLE_PADDING``
+
+ Enables HW packet padding in PCI bus transactions.
+
+ When packet size is cache aligned and CRC stripping is enabled, 4 fewer
+ bytes are written to the PCI bus. Enabling padding makes such packets
+ aligned again.
+
+ In cases where PCI bandwidth is the bottleneck, padding can improve
+ performance by 10%.
+
+ This is disabled by default since this can also decrease performance for
+ unaligned packet sizes.
+
Run-time configuration
~~~~~~~~~~~~~~~~~~~~~~
- **ethtool** operations on related kernel interfaces also affect the PMD.
+- ``rxq_cqe_comp_en`` parameter [int]
+
+ A nonzero value enables the compression of CQE on RX side. This feature
+ allows to save PCI bandwidth and improve performance at the cost of a
+ slightly higher CPU usage. Enabled by default.
+
+ Supported on:
+
+ - x86_64 with ConnectX4 and ConnectX4 LX
+ - Power8 with ConnectX4 LX
+
+- ``txq_inline`` parameter [int]
+
+ Amount of data to be inlined during TX operations. Improves latency.
+ Can improve PPS performance when PCI back pressure is detected and may be
+ useful for scenarios involving heavy traffic on many queues.
+
+ It is not enabled by default (set to 0) since the additional software
+ logic necessary to handle this mode can lower performance when back
+ pressure is not expected.
+
+- ``txqs_min_inline`` parameter [int]
+
+ Enable inline send only when the number of TX queues is greater or equal
+ to this value.
+
+ This option should be used in combination with ``txq_inline`` above.
+
+- ``txq_mpw_en`` parameter [int]
+
+ A nonzero value enables multi-packet send. This feature allows the TX
+ burst function to pack up to five packets in two descriptors in order to
+ save PCI bandwidth and improve performance at the cost of a slightly
+ higher CPU usage.
+
+ This option cannot be used in conjunction with ``tso`` below. When ``tso``
+ is set, ``txq_mpw_en`` is disabled.
+
+ It is currently only supported on the ConnectX-4 Lx and ConnectX-5
+ families of adapters. Enabled by default.
+
+- ``tso`` parameter [int]
+
+ A nonzero value enables hardware TSO.
+ When hardware TSO is enabled, packets marked with TCP segmentation
+ offload will be divided into segments by the hardware.
+
+ Disabled by default.
+
Prerequisites
-------------
- **libmlx5**
- Low-level user space driver library for Mellanox ConnectX-4 devices,
- it is automatically loaded by libibverbs.
+ Low-level user space driver library for Mellanox ConnectX-4/ConnectX-5
+ devices, it is automatically loaded by libibverbs.
This library basically implements send/receive calls to the hardware
queues.
Unlike most other PMDs, these modules must remain loaded and bound to
their devices:
- - mlx5_core: hardware driver managing Mellanox ConnectX-4 devices and
- related Ethernet kernel network devices.
+ - mlx5_core: hardware driver managing Mellanox ConnectX-4/ConnectX-5
+ devices and related Ethernet kernel network devices.
- mlx5_ib: InifiniBand device driver.
- ib_uverbs: user space driver for Verbs (entry point for libibverbs).
- **Firmware update**
- Mellanox OFED releases include firmware updates for ConnectX-4 adapters.
+ Mellanox OFED releases include firmware updates for ConnectX-4/ConnectX-5
+ adapters.
Because each release provides new features, these updates must be applied to
match the kernel modules and libraries they come with.
Both libraries are BSD and GPL licensed. Linux kernel modules are GPL
licensed.
+Currently supported by DPDK:
+
+- Mellanox OFED version: **4.0-1.0.1.0**
+- firmware version:
+
+ - ConnectX-4: **12.18.1000**
+ - ConnectX-4 Lx: **14.18.1000**
+ - ConnectX-5: **16.18.1000**
+ - ConnectX-5 Ex: **16.18.1000**
+
Getting Mellanox OFED
~~~~~~~~~~~~~~~~~~~~~
this DPDK release was developed and tested against is strongly
recommended. Please check the `prerequisites`_.
+Supported NICs
+--------------
+
+* Mellanox(R) ConnectX(R)-4 10G MCX4111A-XCAT (1x10G)
+* Mellanox(R) ConnectX(R)-4 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4111A-ACAT (1x25G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-4 40G MCX4131A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX413A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX415A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 50G MCX4131A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX413A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX414A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX415A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX415A-CCAT (1x100G)
+* Mellanox(R) ConnectX(R)-4 100G MCX416A-CCAT (2x100G)
+* Mellanox(R) ConnectX(R)-4 Lx 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
+* Mellanox(R) ConnectX(R)-5 Ex EN 100G MCX516A-CDAT (2x100G)
+
+Notes for testpmd
+-----------------
+
+Compared to librte_pmd_mlx4 that implements a single RSS configuration per
+port, librte_pmd_mlx5 supports per-protocol RSS configuration.
+
+Since ``testpmd`` defaults to IP RSS mode and there is currently no
+command-line parameter to enable additional protocols (UDP and TCP as well
+as IP), the following commands must be entered from its CLI to get the same
+behavior as librte_pmd_mlx4:
+
+.. code-block:: console
+
+ > port stop all
+ > port config all rss all
+ > port start all
+
Usage example
-------------
-This section demonstrates how to launch **testpmd** with Mellanox ConnectX-4
-devices managed by librte_pmd_mlx5.
+This section demonstrates how to launch **testpmd** with Mellanox
+ConnectX-4/ConnectX-5 devices managed by librte_pmd_mlx5.
#. Load the kernel modules:
modprobe -a ib_uverbs mlx5_core mlx5_ib
+ Alternatively if MLNX_OFED is fully installed, the following script can
+ be run:
+
+ .. code-block:: console
+
+ /etc/init.d/openibd restart
+
.. note::
User space I/O kernel modules (uio and igb_uio) are not used and do
.. code-block:: console
- testpmd -c 0xff00 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
+ testpmd -l 8-15 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
Example output: