- Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and
RTE_ETH_FDIR_REJECT).
- Flow API.
-- Secondary process TX is supported.
+- Multiple process.
- KVM and VMware ESX SR-IOV modes are supported.
- RSS hash result is supported.
- Hardware TSO.
- Hardware checksum TX offload for VXLAN and GRE.
- RX interrupts.
- Statistics query including Basic, Extended and per queue.
+- Rx HW timestamp.
Limitations
-----------
- Inner RSS for VXLAN frames is not supported yet.
-- Port statistics through software counters only.
- Hardware checksum RX offloads for VXLAN inner header are not supported yet.
-- Secondary process RX is not supported.
+- For secondary process:
+
+ - Forked secondary processes are not supported.
+ - All mempools must be initialized before rte_eth_dev_start().
+
- Flow pattern without any specific vlan will match for vlan packets as well:
When VLAN spec is not specified in the pattern, the matching rule will be created with VLAN as a wild card.
Will match any ipv4 packet (VLAN included).
+- A multi-segment packet must have fewer than 6 segments if the Tx burst function
+ is set to multi-packet send or Enhanced multi-packet send. Otherwise it must have
+ fewer than 50 segments.
+- Count action for RTE flow is **only supported in Mellanox OFED**.
+- Flows with a VXLAN Network Identifier equal to 0 (or that ends up being
+ equal to 0) are not supported.
+- VXLAN TSO and checksum offloads are not supported on VM.
+
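+As a rough illustration of the segment limits listed above, an application
+could check a packet's segment count (typically the mbuf's ``nb_segs`` field)
+before passing it to the Tx burst function. This is only a sketch: the helper
+and macro names are hypothetical and not part of DPDK or the PMD, and the
+caller is assumed to know whether a multi-packet send function is in use.
+
+.. code-block:: c
+
+   #include <stdbool.h>
+   #include <stdint.h>
+
+   /* "Less than 6" and "less than 50" segments, as stated in the list above. */
+   #define EXAMPLE_MAX_SEGS_MPS     5
+   #define EXAMPLE_MAX_SEGS_DEFAULT 49
+
+   /* Hypothetical helper: returns true if nb_segs is within the stated limit. */
+   static bool
+   example_seg_count_ok(uint16_t nb_segs, bool mps_enabled)
+   {
+       uint16_t limit = mps_enabled ? EXAMPLE_MAX_SEGS_MPS
+                                    : EXAMPLE_MAX_SEGS_DEFAULT;
+
+       return nb_segs >= 1 && nb_segs <= limit;
+   }
+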
+Statistics
+----------
+
+MLX5 supports various methods to report statistics:
+
+Port statistics can be queried using ``rte_eth_stats_get()``. The port statistics are maintained in SW only and count the number of packets received or sent successfully by the PMD.
+
+Extended statistics can be queried using ``rte_eth_xstats_get()``. The extended statistics expose a wider set of counters counted by the device. The extended port statistics count the number of packets received or sent successfully by the port. As Mellanox NICs use the :ref:`Bifurcated Linux Driver <linux_gsg_linux_drivers>`, those counters also count packets received or sent by the Linux kernel. The counters with the ``_phy`` suffix count the total events on the physical port and are therefore not valid for VFs.
+
+Finally, per-flow statistics can be queried using ``rte_flow_query`` when attaching a count action to a specific flow. The flow counter counts the number of packets that are received successfully by the port and match the specific flow.
+
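+Since the ``_phy`` counters are not valid for a VF, an application reading
+extended statistics on a VF may want to skip them. The sketch below shows one
+way to test a counter name for that suffix. The helper name is hypothetical and
+not part of DPDK. The names passed to it are assumed to be the strings
+returned by ``rte_eth_xstats_get_names()``.
+
+.. code-block:: c
+
+   #include <stdbool.h>
+   #include <string.h>
+
+   /* Hypothetical helper: true if the counter name ends with "_phy". */
+   static bool
+   example_is_phy_counter(const char *name)
+   {
+       static const char suffix[] = "_phy";
+       size_t name_len = strlen(name);
+       size_t suffix_len = sizeof(suffix) - 1;
+
+       if (name_len < suffix_len)
+           return false;
+       return strcmp(name + name_len - suffix_len, suffix) == 0;
+   }
+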
Configuration
-------------
Toggle compilation of librte_pmd_mlx5 itself.
+- ``CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS`` (default **n**)
+
+ Build PMD with additional code to make it loadable without hard
+ dependencies on **libibverbs** or **libmlx5**, which may not be installed
+ on the target system.
+
+ In this mode, their presence is still required for it to run properly,
+ however their absence won't prevent a DPDK application from starting (with
+ ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
+ missing with ``ldd(1)``.
+
+ It works by moving these dependencies to a purpose-built rdma-core "glue"
+ plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if
+ set, or in a standard location for the dynamic linker (e.g. ``/lib``) if
+ left to the default empty string (``""``).
+
+ This option has no performance impact.
+
- ``CONFIG_RTE_LIBRTE_MLX5_DEBUG`` (default **n**)
Toggle debugging code and stricter compilation flags. Enabling this option
Environment variables
~~~~~~~~~~~~~~~~~~~~~
+- ``MLX5_GLUE_PATH``
+
+ A list of directories in which to search for the rdma-core "glue" plug-in,
+ separated by colons or semi-colons.
+
+ Only matters when compiled with ``CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS``
+ enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+ since ``LD_LIBRARY_PATH`` has no effect in this case.
+
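+ The separator rule for ``MLX5_GLUE_PATH`` can be sketched as follows. This is
+ not the PMD's own parsing code: the structure and function names are
+ hypothetical, and the handling of empty entries is an assumption made for
+ the example (they are returned with a length of zero).
+
+ .. code-block:: c
+
+    #include <stddef.h>
+    #include <string.h>
+
+    /* One directory entry: a pointer into the original string and its length. */
+    struct example_dir {
+        const char *start;
+        size_t len;
+    };
+
+    /* Split path on ':' or ';' and store at most max entries in out. */
+    static size_t
+    example_split_glue_path(const char *path, struct example_dir *out,
+                            size_t max)
+    {
+        size_t n = 0;
+
+        while (n < max) {
+            /* Length of the run up to the next separator or end of string. */
+            size_t len = strcspn(path, ":;");
+
+            out[n].start = path;
+            out[n].len = len;
+            n++;
+            if (path[len] == '\0')
+                break;
+            path += len + 1;
+        }
+        return n;
+    }
+
+ With this sketch, ``"/opt/glue:/usr/lib;/lib"`` yields three entries.
+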
- ``MLX5_PMD_ENABLE_PADDING``
Enables HW packet padding in PCI bus transactions.
This is disabled by default since this can also decrease performance for
unaligned packet sizes.
+- ``MLX5_SHUT_UP_BF``
+
+ Configures HW Tx doorbell register as IO-mapped.
+
+ By default, the HW Tx doorbell is configured as a write-combining register.
+ The register would be flushed to HW usually when the write-combining buffer
+ becomes full, but it depends on CPU design.
+
+ Except for vectorized Tx burst routines, a write memory barrier is enforced
+ after updating the register so that the update can be immediately visible to
+ HW.
+
+ When vectorized Tx burst is called, the barrier is set only if the burst size
+ is not aligned to MLX5_VPMD_TX_MAX_BURST. However, setting this environment
+ variable will bring better latency even though the maximum throughput can
+ slightly decline.
+
Run-time configuration
~~~~~~~~~~~~~~~~~~~~~~
Enhanced MPS supports hybrid mode - mixing inlined packets and pointers
in the same descriptor.
- This option cannot be used in conjunction with ``tso`` below. When ``tso``
- is set, ``txq_mpw_en`` is disabled.
+ This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO``,
+ ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, ``DEV_TX_OFFLOAD_GRE_TNL_TSO`` and ``DEV_TX_OFFLOAD_VLAN_INSERT``.
+ When those offloads are requested, the MPS send function will not be used.
It is currently only supported on the ConnectX-4 Lx and ConnectX-5
families of adapters. Enabled by default.
Effective only when Enhanced MPS is supported. The default value is 256.
-- ``tso`` parameter [int]
+- ``tx_vec_en`` parameter [int]
+
+ A nonzero value enables Tx vector, on ConnectX-5 NICs only, if the number of
+ global Tx queues on the port is less than MLX5_VPMD_MIN_TXQS.
+
+ This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO``,
+ ``DEV_TX_OFFLOAD_VXLAN_TNL_TSO``, ``DEV_TX_OFFLOAD_GRE_TNL_TSO`` and ``DEV_TX_OFFLOAD_VLAN_INSERT``.
+ When those offloads are requested the MPS send function will not be used.
+
+ Enabled by default on ConnectX-5.
+
+- ``rx_vec_en`` parameter [int]
+
+ A nonzero value enables Rx vector if the port is not configured in
+ multi-segment mode; otherwise this parameter is ignored.
- A nonzero value enables hardware TSO.
- When hardware TSO is enabled, packets marked with TCP segmentation
- offload will be divided into segments by the hardware. Disabled by default.
+ Enabled by default.
Prerequisites
-------------
This library basically implements send/receive calls to the hardware
queues.
-- **Kernel modules** (mlnx-ofed-kernel)
+- **Kernel modules**
They provide the kernel-side Verbs API and low level device drivers that
manage actual hardware initialization and resources sharing with user
Both libraries are BSD and GPL licensed. Linux kernel modules are GPL
licensed.
-Currently supported by DPDK:
+Installation
+~~~~~~~~~~~~
-- Mellanox OFED version: **4.1**.
-- firmware version:
+Either RDMA Core library with a recent enough Linux kernel release
+(recommended) or Mellanox OFED, which provides compatibility with older
+releases.
- - ConnectX-4: **12.20.1010** and above.
- - ConnectX-4 Lx: **14.20.1010** and above.
- - ConnectX-5: **16.20.1010** and above.
- - ConnectX-5 Ex: **16.20.1010** and above.
+RDMA Core with Linux Kernel
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Getting Mellanox OFED
-~~~~~~~~~~~~~~~~~~~~~
+- Minimal kernel version: v4.14 or the most recent 4.14-rc (see `Linux installation documentation`_)
+- Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull request #227 from yishaih/tm")
+ (see `RDMA Core installation documentation`_)
+
+.. _`Linux installation documentation`: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/plain/Documentation/admin-guide/README.rst
+.. _`RDMA Core installation documentation`: https://raw.githubusercontent.com/linux-rdma/rdma-core/master/README.md
+
+Mellanox OFED
+^^^^^^^^^^^^^
+
+- Mellanox OFED version: **4.2, 4.3**.
+- firmware version:
+
+ - ConnectX-4: **12.21.1000** and above.
+ - ConnectX-4 Lx: **14.21.1000** and above.
+ - ConnectX-5: **16.21.1000** and above.
+ - ConnectX-5 Ex: **16.21.1000** and above.
While these libraries and kernel modules are available on OpenFabrics
Alliance's `website <https://www.openfabrics.org/>`__ and provided by package
* Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
* Mellanox(R) ConnectX(R)-5 Ex EN 100G MCX516A-CDAT (2x100G)
-Quick Start Guide
------------------
+Quick Start Guide on OFED
+-------------------------
1. Download latest Mellanox OFED. For more info check the `prerequisites`_.
.. code-block:: console
- ./mlnxofedinstall
+ ./mlnxofedinstall --upstream-libs --dpdk
3. Verify the firmware is the correct one: