X-Git-Url: http://git.droids-corp.org/?a=blobdiff_plain;f=doc%2Fguides%2Fnics%2Fmlx5.rst;h=9b0ba291c430e01fadd23430146658689e3d51e5;hb=3f13f8c23a7c;hp=eb8c04207666a3b5d5366a56e0d9d0ef6b876af7;hpb=28014f0754860f06e7504510496d7b7c9b235d88;p=dpdk.git

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index eb8c042076..9b0ba291c4 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -30,10 +30,10 @@
 MLX5 poll mode driver
 =====================

-The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support for
-**Mellanox ConnectX-4 EN** and **Mellanox ConnectX-4 Lx EN** families of
-10/25/40/50/100 Gb/s adapters as well as their virtual functions (VF) in
-SR-IOV context.
+The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support
+for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** and **Mellanox
+ConnectX-5** families of 10/25/40/50/100 Gb/s adapters as well as their
+virtual functions (VF) in SR-IOV context.

 Information and documentation about these adapters can be found on the
 `Mellanox website `__. Help is also provided by the
@@ -48,11 +48,6 @@ There is also a `section dedicated to this poll mode driver
    be enabled manually by setting ``CONFIG_RTE_LIBRTE_MLX5_PMD=y`` and
    recompiling DPDK.

-.. warning::
-
-   ``CONFIG_RTE_BUILD_COMBINE_LIBS`` with ``CONFIG_RTE_BUILD_SHARED_LIB``
-   is not supported and thus the compilation will fail with this configuration.
-
 Implementation details
 ----------------------

@@ -78,19 +73,32 @@ Features

 - Multiple TX and RX queues.
 - Support for scattered TX and RX frames.
-- IPv4, TCPv4 and UDPv4 RSS on any number of queues.
+- IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
 - Several RSS hash keys, one for each flow type.
+- Configurable RETA table.
 - Support for multiple MAC addresses.
 - VLAN filtering.
+- RX VLAN stripping.
+- TX VLAN insertion.
+- RX CRC stripping configuration.
 - Promiscuous mode.
+- Multicast promiscuous mode.
+- Hardware checksum offloads.
+- Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and
+  RTE_ETH_FDIR_REJECT).
+- Flow API.
+- Secondary process TX is supported.
+- KVM and VMware ESX SR-IOV modes are supported.
+- RSS hash result is supported.
+- Hardware TSO.

 Limitations
 -----------

-- IPv6 and inner VXLAN RSS are not supported yet.
+- Inner RSS for VXLAN frames is not supported yet.
 - Port statistics through software counters only.
-- No allmulticast mode.
-- Hardware checksum offloads are not supported yet.
+- Hardware checksum offloads for VXLAN inner header are not supported yet.
+- Secondary process RX is not supported.

 Configuration
 -------------
@@ -110,18 +118,6 @@ These options can be modified in the ``.config`` file.
   adds additional run-time checks and debugging messages at the cost of
   lower performance.

-- ``CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N`` (default **4**)
-
-  Number of scatter/gather elements (SGEs) per work request (WR). Lowering
-  this number improves performance but also limits the ability to receive
-  scattered packets (packets that do not fit a single mbuf). The default
-  value is a safe tradeoff.
-
-- ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**)
-
-  Amount of data to be inlined during TX operations. Improves latency but
-  lowers throughput.
-
 - ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**)

   Maximum number of cached memory pools (MPs) per TX queue. Each MP from
@@ -133,15 +129,19 @@ These options can be modified in the ``.config`` file.
 Environment variables
 ~~~~~~~~~~~~~~~~~~~~~

-- ``MLX5_ENABLE_CQE_COMPRESSION``
+- ``MLX5_PMD_ENABLE_PADDING``
+
+  Enables HW packet padding in PCI bus transactions.
+
+  When packet size is cache aligned and CRC stripping is enabled, 4 fewer
+  bytes are written to the PCI bus. Enabling padding makes such packets
+  aligned again.

-  A nonzero value lets ConnectX-4 return smaller completion entries to
-  improve performance when PCI backpressure is detected. It is most useful
-  for scenarios involving heavy traffic on many queues.
+  In cases where PCI bandwidth is the bottleneck, padding can improve
+  performance by 10%.

-  Since the additional software logic necessary to handle this mode can
-  lower performance when there is no backpressure, it is not enabled by
-  default.
+  This is disabled by default since this can also decrease performance for
+  unaligned packet sizes.

 Run-time configuration
 ~~~~~~~~~~~~~~~~~~~~~~
@@ -152,6 +152,55 @@ Run-time configuration

 - **ethtool** operations on related kernel interfaces also affect the PMD.

+- ``rxq_cqe_comp_en`` parameter [int]
+
+  A nonzero value enables the compression of CQE on RX side. This feature
+  allows to save PCI bandwidth and improve performance at the cost of a
+  slightly higher CPU usage. Enabled by default.
+
+  Supported on:
+
+  - x86_64 with ConnectX4 and ConnectX4 LX
+  - Power8 with ConnectX4 LX
+
+- ``txq_inline`` parameter [int]
+
+  Amount of data to be inlined during TX operations. Improves latency.
+  Can improve PPS performance when PCI back pressure is detected and may be
+  useful for scenarios involving heavy traffic on many queues.
+
+  It is not enabled by default (set to 0) since the additional software
+  logic necessary to handle this mode can lower performance when back
+  pressure is not expected.
+
+- ``txqs_min_inline`` parameter [int]
+
+  Enable inline send only when the number of TX queues is greater or equal
+  to this value.
+
+  This option should be used in combination with ``txq_inline`` above.
+
+- ``txq_mpw_en`` parameter [int]
+
+  A nonzero value enables multi-packet send. This feature allows the TX
+  burst function to pack up to five packets in two descriptors in order to
+  save PCI bandwidth and improve performance at the cost of a slightly
+  higher CPU usage.
+
+  This option cannot be used in conjunction with ``tso`` below. When ``tso``
+  is set, ``txq_mpw_en`` is disabled.
+
+  It is currently only supported on the ConnectX-4 Lx and ConnectX-5
+  families of adapters. Enabled by default.
+
+- ``tso`` parameter [int]
+
+  A nonzero value enables hardware TSO.
+  When hardware TSO is enabled, packets marked with TCP segmentation
+  offload will be divided into segments by the hardware.
+
+  Disabled by default.
+
 Prerequisites
 -------------

@@ -171,8 +220,8 @@ DPDK and must be installed separately:

 - **libmlx5**

-  Low-level user space driver library for Mellanox ConnectX-4 devices,
-  it is automatically loaded by libibverbs.
+  Low-level user space driver library for Mellanox ConnectX-4/ConnectX-5
+  devices, it is automatically loaded by libibverbs.

   This library basically implements send/receive calls to the hardware
   queues.
@@ -186,14 +235,15 @@ DPDK and must be installed separately:
   Unlike most other PMDs, these modules must remain loaded and bound to
   their devices:

-  - mlx5_core: hardware driver managing Mellanox ConnectX-4 devices and
-    related Ethernet kernel network devices.
+  - mlx5_core: hardware driver managing Mellanox ConnectX-4/ConnectX-5
+    devices and related Ethernet kernel network devices.
   - mlx5_ib: InifiniBand device driver.
   - ib_uverbs: user space driver for Verbs (entry point for libibverbs).

 - **Firmware update**

-  Mellanox OFED releases include firmware updates for ConnectX-4 adapters.
+  Mellanox OFED releases include firmware updates for ConnectX-4/ConnectX-5
+  adapters.

   Because each release provides new features, these updates must be applied to
   match the kernel modules and libraries they come with.
@@ -205,10 +255,13 @@ DPDK and must be installed separately:

 Currently supported by DPDK:

-- Mellanox OFED **3.1**.
-- Minimum firmware version:
-  - ConnectX-4: **12.12.0780**.
-  - ConnectX-4 Lx: **14.12.0780**.
+- Mellanox OFED version: **4.0-1.0.1.0**
+- firmware version:
+
+  - ConnectX-4: **12.18.1000**
+  - ConnectX-4 Lx: **14.18.1000**
+  - ConnectX-5: **16.18.1000**
+  - ConnectX-5 Ex: **16.18.1000**

 Getting Mellanox OFED
 ~~~~~~~~~~~~~~~~~~~~~
@@ -230,11 +283,51 @@ required from that distribution.
    this DPDK release was developed and tested against is strongly
    recommended. Please check the `prerequisites`_.

+Supported NICs
+--------------
+
+* Mellanox(R) ConnectX(R)-4 10G MCX4111A-XCAT (1x10G)
+* Mellanox(R) ConnectX(R)-4 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4111A-ACAT (1x25G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-4 40G MCX4131A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX413A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX415A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 50G MCX4131A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX413A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX414A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX415A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX415A-CCAT (1x100G)
+* Mellanox(R) ConnectX(R)-4 100G MCX416A-CCAT (2x100G)
+* Mellanox(R) ConnectX(R)-4 Lx 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
+* Mellanox(R) ConnectX(R)-5 Ex EN 100G MCX516A-CDAT (2x100G)
+
+Notes for testpmd
+-----------------
+
+Compared to librte_pmd_mlx4 that implements a single RSS configuration per
+port, librte_pmd_mlx5 supports per-protocol RSS configuration.
+
+Since ``testpmd`` defaults to IP RSS mode and there is currently no
+command-line parameter to enable additional protocols (UDP and TCP as well
+as IP), the following commands must be entered from its CLI to get the same
+behavior as librte_pmd_mlx4:
+
+.. code-block:: console
+
+   > port stop all
+   > port config all rss all
+   > port start all
+
 Usage example
 -------------

-This section demonstrates how to launch **testpmd** with Mellanox ConnectX-4
-devices managed by librte_pmd_mlx5.
+This section demonstrates how to launch **testpmd** with Mellanox
+ConnectX-4/ConnectX-5 devices managed by librte_pmd_mlx5.

 #. Load the kernel modules:

@@ -242,6 +335,13 @@ devices managed by librte_pmd_mlx5.

       modprobe -a ib_uverbs mlx5_core mlx5_ib

+   Alternatively if MLNX_OFED is fully installed, the following script can
+   be run:
+
+   .. code-block:: console
+
+      /etc/init.d/openibd restart
+
    .. note::

       User space I/O kernel modules (uio and igb_uio) are not used and do
@@ -294,7 +394,7 @@ devices managed by librte_pmd_mlx5.

    .. code-block:: console

-      testpmd -c 0xff00 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
+      testpmd -l 8-15 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i

    Example output: