X-Git-Url: http://git.droids-corp.org/?a=blobdiff_plain;f=doc%2Fguides%2Fnics%2Fmlx4.rst;h=d0e8a8b2ffd4ee385af1e975caced932ba70d3ac;hb=c62b6e667310a58e444dbaae1d08371a6d221333;hp=7c431778bca7cfff95f8f4b871b8e356ccda2285;hpb=b280c98f70719285d3f1b6516813f63fe4173875;p=dpdk.git
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst
index 7c431778bc..d0e8a8b2ff 100644
--- a/doc/guides/nics/mlx4.rst
+++ b/doc/guides/nics/mlx4.rst
@@ -1,32 +1,6 @@
-.. BSD LICENSE
+.. SPDX-License-Identifier: BSD-3-Clause
Copyright 2012 6WIND S.A.
- Copyright 2015 Mellanox
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions
- are met:
-
- * Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in
- the documentation and/or other materials provided with the
- distribution.
- * Neither the name of 6WIND S.A. nor the names of its
- contributors may be used to endorse or promote products derived
- from this software without specific prior written permission.
-
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ Copyright 2015 Mellanox Technologies, Ltd
MLX4 poll mode driver library
=============================
@@ -72,14 +46,9 @@ This capability allows the PMD to coexist with kernel network interfaces
which remain functional, although they stop receiving unicast packets as
long as they share the same MAC address.
-Compiling librte_pmd_mlx4 causes DPDK to be linked against libibverbs.
-
-Features
---------
+The :ref:`flow_isolated_mode` is supported.
-- Multi arch support: x86_64 and POWER8.
-- Link state information is provided.
-- RX interrupts.
+Compiling librte_pmd_mlx4 causes DPDK to be linked against libibverbs.
Configuration
-------------
@@ -93,27 +62,47 @@ These options can be modified in the ``.config`` file.
Toggle compilation of librte_pmd_mlx4 itself.
+- ``CONFIG_RTE_IBVERBS_LINK_DLOPEN`` (default **n**)
+
+ Build PMD with additional code to make it loadable without hard
+ dependencies on **libibverbs** nor **libmlx4**, which may not be installed
+ on the target system.
+
+ In this mode, their presence is still required for it to run properly,
+ however their absence won't prevent a DPDK application from starting (with
+ ``CONFIG_RTE_BUILD_SHARED_LIB`` disabled) and they won't show up as
+ missing with ``ldd(1)``.
+
+ It works by moving these dependencies to a purpose-built rdma-core "glue"
+ plug-in which must either be installed in a directory whose name is based
+ on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a
+ standard location for the dynamic linker (e.g. ``/lib``) if left to the
+ default empty string (``""``).
+
+ This option has no performance impact.
+
+- ``CONFIG_RTE_IBVERBS_LINK_STATIC`` (default **n**)
+
+ Embed static flavor of the dependencies **libibverbs** and **libmlx4**
+ in the PMD shared library or the executable static binary.
+
- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG`` (default **n**)
Toggle debugging code and stricter compilation flags. Enabling this option
adds additional run-time checks and debugging messages at the cost of
lower performance.
-- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG_BROKEN_VERBS`` (default **n**)
-
- Mellanox OFED versions earlier than 4.2 may return false errors from
- Verbs object destruction APIs after the device is plugged out.
- Enabling this option replaces assertion checks that cause the program
- to abort with harmless debugging messages as a workaround.
- Relevant only when CONFIG_RTE_LIBRTE_MLX4_DEBUG is enabled.
+Environment variables
+~~~~~~~~~~~~~~~~~~~~~
-- ``CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE`` (default **8**)
+- ``MLX4_GLUE_PATH``
- Maximum number of cached memory pools (MPs) per TX queue. Each MP from
- which buffers are to be transmitted must be associated to memory regions
- (MRs). This is a slow operation that must be cached.
+ A list of directories in which to search for the rdma-core "glue" plug-in,
+ separated by colons or semi-colons.
- This value is always 1 for RX queues since they use a single MP.
+ Only matters when compiled with ``CONFIG_RTE_IBVERBS_LINK_DLOPEN``
+ enabled and most useful when ``CONFIG_RTE_EAL_PMD_PATH`` is also set,
+ since ``LD_LIBRARY_PATH`` has no effect in this case.
Run-time configuration
~~~~~~~~~~~~~~~~~~~~~~
@@ -130,6 +119,17 @@ Run-time configuration
times for additional ports. All ports are probed by default if left
unspecified.
+- ``mr_ext_memseg_en`` parameter [int]
+
+ A nonzero value enables extending memseg when registering DMA memory. If
+ enabled, the number of entries in the MR (Memory Region) lookup table on the
+ datapath is minimized, which benefits performance. On the other hand, it
+ worsens memory utilization because registered memory is pinned by the kernel
+ driver: even if a page in the extended chunk is freed, it does not become
+ reusable until the entire memory is freed.
+
+ Enabled by default.
+
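As a sketch of how such a run-time parameter is passed: PMD parameters are appended to the device's PCI address on the EAL command line, separated by commas. The PCI address and testpmd options below are borrowed from the usage example in this guide; the ``DEVARG`` variable is a hypothetical helper, and the command is only printed, not executed.

.. code-block:: shell

   # Example PCI address taken from the usage example in this guide.
   PCI=0000:83:00.0
   # Device arguments follow the PCI address, separated by commas.
   DEVARG="$PCI,mr_ext_memseg_en=0"
   # Print the resulting command line; drop "echo" to actually run it.
   echo testpmd -l 8-15 -n 4 -w "$DEVARG" -- --rxq=2 --txq=2 -i

Setting the value to 0 trades a larger MR lookup table for better memory reuse.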
Kernel module parameters
~~~~~~~~~~~~~~~~~~~~~~~~
@@ -153,6 +153,25 @@ below.
following limitation: VLAN filtering is not supported with this mode.
This is the recommended mode in case VLAN filter is not needed.
+Limitations
+-----------
+
+- For secondary process:
+
+ - Forked secondary processes are not supported.
+ - External memory not registered in the EAL memseg list cannot be used for DMA
+ unless it has been registered by ``mlx4_mr_update_ext_mp()`` in the
+ primary process and remapped to the same virtual address in the secondary
+ process. If the external memory is registered by the primary process but has
+ a different virtual address in the secondary process, unexpected errors may occur.
+
+- CRC stripping is supported by default and always reported as "true".
+ The ability to enable/disable CRC stripping requires OFED version
+ 4.3-1.5.0.0 and above or rdma-core version v18 and above.
+
+- TSO (Transmit Segmentation Offload) is supported in OFED version
+ 4.4 and above.
+
Prerequisites
-------------
@@ -160,7 +179,7 @@ This driver relies on external libraries and kernel drivers for resources
allocations and initialization. The following dependencies are not part of
DPDK and must be installed separately:
-- **libibverbs**
+- **libibverbs** (provided by rdma-core package)
User space verbs framework used by librte_pmd_mlx4. This library provides
a generic interface between the kernel and low-level user space drivers
@@ -170,7 +189,7 @@ DPDK and must be installed separately:
resources allocations) to be managed by the kernel and fast operations to
never leave user space.
-- **libmlx4**
+- **libmlx4** (provided by rdma-core package)
Low-level user space driver library for Mellanox ConnectX-3 devices,
it is automatically loaded by libibverbs.
@@ -178,7 +197,7 @@ DPDK and must be installed separately:
This library basically implements send/receive calls to the hardware
queues.
-- **Kernel modules** (mlnx-ofed-kernel)
+- **Kernel modules**
They provide the kernel-side verbs API and low level device drivers that
manage actual hardware initialization and resources sharing with user
@@ -204,24 +223,40 @@ DPDK and must be installed separately:
Both libraries are BSD and GPL licensed. Linux kernel modules are GPL
licensed.
-Currently supported by DPDK:
+Depending on system constraints and user preferences, use either the RDMA core
+library with a recent enough Linux kernel release (recommended) or Mellanox
+OFED, which provides compatibility with older releases.
-- Mellanox OFED **4.1**.
-- Firmware version **2.36.5000** and above.
+Current RDMA core package and Linux kernel (recommended)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Getting Mellanox OFED
-~~~~~~~~~~~~~~~~~~~~~
+- Minimal Linux kernel version: 4.14.
+- Minimal RDMA core version: v15 (see `RDMA core installation documentation`_).
+
+- Starting with rdma-core v21, static libraries can be built::
+
+ cd build
+ CFLAGS=-fPIC cmake -DIN_PLACE=1 -DENABLE_STATIC=1 -GNinja ..
+ ninja
+
+.. _`RDMA core installation documentation`: https://raw.githubusercontent.com/linux-rdma/rdma-core/master/README.md
+
+If rdma-core libraries are built but not installed, DPDK makefile can link them,
+thanks to these environment variables:
-While these libraries and kernel modules are available on OpenFabrics
-Alliance's `website `_ and provided by package
-managers on most distributions, this PMD requires Ethernet extensions that
-may not be supported at the moment (this is a work in progress).
+ - ``EXTRA_CFLAGS=-I/path/to/rdma-core/build/include``
+ - ``EXTRA_LDFLAGS=-L/path/to/rdma-core/build/lib``
+ - ``PKG_CONFIG_PATH=/path/to/rdma-core/build/lib/pkgconfig``
-`Mellanox OFED
-`_
-includes the necessary support and should be used in the meantime. For DPDK,
-only libibverbs, libmlx4, mlnx-ofed-kernel packages and firmware updates are
-required from that distribution.
+.. _Mellanox_OFED_as_a_fallback:
+
+Mellanox OFED as a fallback
+~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- `Mellanox OFED`_ version: **4.4, 4.5, 4.6**.
+- firmware version: **2.42.5000** and above.
+
+.. _`Mellanox OFED`: http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers
.. note::
@@ -229,61 +264,62 @@ required from that distribution.
this DPDK release was developed and tested against is strongly
recommended. Please check the `prerequisites`_.
-Supported NICs
---------------
-
-* Mellanox(R) ConnectX(R)-3 Pro 40G MCX354A-FCC_Ax (2*40G)
-
-Quick Start Guide
------------------
+Installing Mellanox OFED
+^^^^^^^^^^^^^^^^^^^^^^^^
-1. Download latest Mellanox OFED. For more info check the `prerequisites`_.
+1. Download latest Mellanox OFED.
2. Install the required libraries and kernel modules either by installing
only the required set, or by installing the entire Mellanox OFED:
- For bare metal use:
+ For bare metal use::
- .. code-block:: console
+ ./mlnxofedinstall --dpdk --upstream-libs
- ./mlnxofedinstall
+ For SR-IOV hypervisors use::
- For SR-IOV hypervisors use:
+ ./mlnxofedinstall --dpdk --upstream-libs --enable-sriov --hypervisor
- .. code-block:: console
+ For SR-IOV virtual machine use::
- ./mlnxofedinstall --enable-sriov -hypervisor
+ ./mlnxofedinstall --dpdk --upstream-libs --guest
- For SR-IOV virtual machine use:
+3. Verify the firmware is the correct one::
- .. code-block:: console
+ ibv_devinfo
- ./mlnxofedinstall --guest
+4. Set all port links to Ethernet, following the instructions on the screen::
-3. Verify the firmware is the correct one:
+ connectx_port_config
- .. code-block:: console
+5. Continue with :ref:`section 2 of the Quick Start Guide <QSG_2>`.
- ibv_devinfo
+Supported NICs
+--------------
-4. Set all ports links to Ethernet, follow instructions on the screen:
+* Mellanox(R) ConnectX(R)-3 Pro 40G MCX354A-FCC_Ax (2*40G)
- .. code-block:: console
+.. _qsg:
- connectx_port_config
-
- Or in the manual way:
+Quick Start Guide
+-----------------
- .. code-block:: console
+1. Set all port links to Ethernet::
PCI=
echo eth > "/sys/bus/pci/devices/$PCI/mlx4_port0"
echo eth > "/sys/bus/pci/devices/$PCI/mlx4_port1"
-5. In case of bare metal or hypervisor, configure optimized steering mode
- by adding the following line to ``/etc/modprobe.d/mlx4_core.conf``:
+ .. note::
+
+ If using Mellanox OFED, the port link can be permanently set
+ to Ethernet using the connectx_port_config tool it provides.
+ See :ref:`Mellanox_OFED_as_a_fallback`.
+
+.. _QSG_2:
- .. code-block:: console
+2. In case of bare metal or hypervisor, configure optimized steering mode
+ by adding the following line to ``/etc/modprobe.d/mlx4_core.conf``::
options mlx4_core log_num_mgm_entry_size=-7
@@ -292,35 +328,27 @@ Quick Start Guide
If VLAN filtering is used, set log_num_mgm_entry_size=-1.
Performance degradation can occur in this case.
-6. Restart the driver:
-
- .. code-block:: console
+3. Restart the driver::
/etc/init.d/openibd restart
- or:
-
- .. code-block:: console
+ or::
service openibd restart
-7. Compile DPDK and you are ready to go. See instructions on
+4. Compile DPDK and you are ready to go. See instructions on
:ref:`Development Kit Build System `
Performance tuning
------------------
-1. Verify the optimized steering mode is configured:
-
- .. code-block:: console
+1. Verify the optimized steering mode is configured::
cat /sys/module/mlx4_core/parameters/log_num_mgm_entry_size
2. Use CPUs on the NUMA node local to the PCIe adapter
for better performance. For VMs, verify that the right CPU
and NUMA node are pinned according to the above. Run::
-
- .. code-block:: console
+ and NUMA node are pinned according to the above. Run::
lstopo-no-graphics
@@ -332,9 +360,7 @@ Performance tuning
This is in order to forward packets from one to the other without
NUMA performance penalty.
-4. Disable pause frames:
-
- .. code-block:: console
+4. Disable pause frames::
ethtool -A rx off tx off
@@ -348,37 +374,35 @@ Performance tuning
to set the PCI max read request parameter to 1K. This can be
done in the following way:
- To query the read request size use:
-
- .. code-block:: console
+ To query the read request size use::
setpci -s 68.w
- If the output is different than 3XXX, set it by:
-
- .. code-block:: console
+ If the output is different than 3XXX, set it by::
setpci -s 68.w=3XXX
The XXX can be different on different systems. Make sure to configure
according to the setpci output.
+6. To minimize overhead of searching Memory Regions:
+
+ - ``--socket-mem`` is recommended to pin a predictable amount of memory.
+ - Configure a per-lcore cache when creating mempools for packet buffers.
+ - Refrain from dynamically allocating or freeing memory at run time.
+
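For step 5 above, the value to write back can be derived from the queried one by keeping its three low hexadecimal digits and forcing the leading digit to 3. A minimal sketch follows; ``mrrs_1k_value`` is a hypothetical helper name and ``2936`` is a made-up example of ``setpci`` output.

.. code-block:: shell

   # Keep the low 12 bits of the 16-bit register value and set the
   # leading hexadecimal digit to 3 (1K max read request size).
   mrrs_1k_value() {
       printf '%04x\n' $(( (0x$1 & 0x0fff) | 0x3000 ))
   }

   # Hypothetical queried value; prints 3936.
   mrrs_1k_value 2936

The printed value is what would be passed to the ``setpci`` write command in place of ``3XXX``.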
Usage example
-------------
This section demonstrates how to launch **testpmd** with Mellanox ConnectX-3
devices managed by librte_pmd_mlx4.
-#. Load the kernel modules:
-
- .. code-block:: console
+#. Load the kernel modules::
modprobe -a ib_uverbs mlx4_en mlx4_core mlx4_ib
Alternatively if MLNX_OFED is fully installed, the following script can
- be run:
-
- .. code-block:: console
+ be run::
/etc/init.d/openibd restart
@@ -388,24 +412,18 @@ devices managed by librte_pmd_mlx4.
not have to be loaded.
#. Make sure Ethernet interfaces are in working order and linked to kernel
- verbs. Related sysfs entries should be present:
-
- .. code-block:: console
+ verbs. Related sysfs entries should be present::
ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
- Example output:
-
- .. code-block:: console
+ Example output::
eth2
eth3
eth4
eth5
-#. Optionally, retrieve their PCI bus addresses for whitelisting:
-
- .. code-block:: console
+#. Optionally, retrieve their PCI bus addresses for whitelisting::
{
for intf in eth2 eth3 eth4 eth5;
@@ -415,9 +433,7 @@ devices managed by librte_pmd_mlx4.
} |
sed -n 's,.*/\(.*\),-w \1,p'
- Example output:
-
- .. code-block:: console
+ Example output::
-w 0000:83:00.0
-w 0000:83:00.0
@@ -429,21 +445,15 @@ devices managed by librte_pmd_mlx4.
There are only two distinct PCI bus addresses because the Mellanox
ConnectX-3 adapters installed on this system are dual port.
-#. Request huge pages:
-
- .. code-block:: console
+#. Request huge pages::
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
-#. Start testpmd with basic parameters:
-
- .. code-block:: console
+#. Start testpmd with basic parameters::
testpmd -l 8-15 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i
- Example output:
-
- .. code-block:: console
+ Example output::
[...]
EAL: PCI device 0000:83:00.0 on NUMA socket 1