+* Mellanox(R) ConnectX(R)-6 200G MCX654106A-HCAT (4x200G)
+* Mellanox(R) ConnectX(R)-6DX EN 100G MCX623106AN-CDAT (2*100g)
+* Mellanox(R) ConnectX(R)-6DX EN 200G MCX623105AN-VDAT (1*200g)
+
+Quick Start Guide on OFED/EN
+----------------------------
+
+1. Download latest Mellanox OFED/EN. For more info check the `prerequisites`_.
+
+
+2. Install the required libraries and kernel modules either by installing
+ only the required set, or by installing the entire Mellanox OFED/EN::
+
+ ./mlnxofedinstall --upstream-libs --dpdk
+
+3. Verify the firmware is the correct one::
+
+ ibv_devinfo
+
+4. Verify all ports links are set to Ethernet::
+
+ mlxconfig -d <mst device> query | grep LINK_TYPE
+ LINK_TYPE_P1 ETH(2)
+ LINK_TYPE_P2 ETH(2)
+
+ Link types may have to be configured to Ethernet::
+
+ mlxconfig -d <mst device> set LINK_TYPE_P1/2=1/2/3
+
+ * LINK_TYPE_P1=<1|2|3> , 1=Infiniband 2=Ethernet 3=VPI(auto-sense)
+
+ For hypervisors, verify SR-IOV is enabled on the NIC::
+
+ mlxconfig -d <mst device> query | grep SRIOV_EN
+ SRIOV_EN True(1)
+
+ If needed, configure SR-IOV::
+
+ mlxconfig -d <mst device> set SRIOV_EN=1 NUM_OF_VFS=16
+ mlxfwreset -d <mst device> reset
+
+5. Restart the driver::
+
+ /etc/init.d/openibd restart
+
+ or::
+
+ service openibd restart
+
+ If link type was changed, firmware must be reset as well::
+
+ mlxfwreset -d <mst device> reset
+
+ For hypervisors, after reset write the sysfs number of virtual functions
+ needed for the PF.
+
+ To dynamically instantiate a given number of virtual functions (VFs)::
+
+ echo [num_vfs] > /sys/class/infiniband/mlx5_0/device/sriov_numvfs
+
+6. Compile DPDK and you are ready to go. See instructions on
+ :ref:`Development Kit Build System <Development_Kit_Build_System>`
+
+Enable switchdev mode
+---------------------
+
+Switchdev mode is a mode in E-Switch, that binds between representor and VF.
+Representor is a port in DPDK that is connected to a VF in such a way
+that assuming there are no offload flows, each packet that is sent from the VF
+will be received by the corresponding representor. While each packet that is
+sent to a representor will be received by the VF.
+This is very useful in case of SRIOV mode, where the first packet that is sent
+by the VF will be received by the DPDK application which will decide if this
+flow should be offloaded to the E-Switch. After offloading the flow packet
+that the VF that are matching the flow will not be received any more by
+the DPDK application.
+
+1. Enable SRIOV mode::
+
+ mlxconfig -d <mst device> set SRIOV_EN=true
+
+2. Configure the max number of VFs::
+
+ mlxconfig -d <mst device> set NUM_OF_VFS=<num of vfs>
+
+3. Reset the FW::
+
+ mlxfwreset -d <mst device> reset
+
+3. Configure the actual number of VFs::
+
+ echo <num of vfs > /sys/class/net/<net device>/device/sriov_numvfs
+
+4. Unbind the device (can be rebind after the switchdev mode)::
+
+ echo -n "<device pci address" > /sys/bus/pci/drivers/mlx5_core/unbind
+
+5. Enbale switchdev mode::
+
+ echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+
+Performance tuning
+------------------
+
+1. Configure aggressive CQE Zipping for maximum performance::
+
+ mlxconfig -d <mst device> s CQE_COMPRESSION=1
+
+ To set it back to the default CQE Zipping mode use::
+
+ mlxconfig -d <mst device> s CQE_COMPRESSION=0
+
+2. In case of virtualization:
+
+ - Make sure that hypervisor kernel is 3.16 or newer.
+ - Configure boot with ``iommu=pt``.
+ - Use 1G huge pages.
+ - Make sure to allocate a VM on huge pages.
+ - Make sure to set CPU pinning.
+
+3. Use the CPU near local NUMA node to which the PCIe adapter is connected,
+ for better performance. For VMs, verify that the right CPU
+ and NUMA node are pinned according to the above. Run::
+
+ lstopo-no-graphics
+
+ to identify the NUMA node to which the PCIe adapter is connected.
+
+4. If more than one adapter is used, and root complex capabilities allow
+ to put both adapters on the same NUMA node without PCI bandwidth degradation,
+ it is recommended to locate both adapters on the same NUMA node.
+ This in order to forward packets from one to the other without
+ NUMA performance penalty.
+
+5. Disable pause frames::
+
+ ethtool -A <netdev> rx off tx off
+
+6. Verify IO non-posted prefetch is disabled by default. This can be checked
+ via the BIOS configuration. Please contact you server provider for more
+ information about the settings.
+
+.. note::
+
+ On some machines, depends on the machine integrator, it is beneficial
+ to set the PCI max read request parameter to 1K. This can be
+ done in the following way:
+
+ To query the read request size use::
+
+ setpci -s <NIC PCI address> 68.w
+
+ If the output is different than 3XXX, set it by::
+
+ setpci -s <NIC PCI address> 68.w=3XXX
+
+ The XXX can be different on different systems. Make sure to configure
+ according to the setpci output.
+
+7. To minimize overhead of searching Memory Regions:
+
+ - '--socket-mem' is recommended to pin memory by predictable amount.
+ - Configure per-lcore cache when creating Mempools for packet buffer.
+ - Refrain from dynamically allocating/freeing memory in run-time.
+
+.. _mlx5_offloads_support:
+
+Supported hardware offloads
+---------------------------
+
+.. table:: Minimal SW/HW versions for queue offloads
+
+ ============== ===== ===== ========= ===== ========== ==========
+ Offload DPDK Linux rdma-core OFED firmware hardware
+ ============== ===== ===== ========= ===== ========== ==========
+ common base 17.11 4.14 16 4.2-1 12.21.1000 ConnectX-4
+ checksums 17.11 4.14 16 4.2-1 12.21.1000 ConnectX-4
+ Rx timestamp 17.11 4.14 16 4.2-1 12.21.1000 ConnectX-4
+ TSO 17.11 4.14 16 4.2-1 12.21.1000 ConnectX-4
+ LRO 19.08 N/A N/A 4.6-4 16.25.6406 ConnectX-5
+ ============== ===== ===== ========= ===== ========== ==========
+
+.. table:: Minimal SW/HW versions for rte_flow offloads
+
+ +-----------------------+-----------------+-----------------+
+ | Offload | with E-Switch | with NIC |
+ +=======================+=================+=================+
+ | Count | | DPDK 19.05 | | DPDK 19.02 |
+ | | | OFED 4.6 | | OFED 4.6 |
+ | | | rdma-core 24 | | rdma-core 23 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Drop | | DPDK 19.05 | | DPDK 18.11 |
+ | | | OFED 4.6 | | OFED 4.5 |
+ | | | rdma-core 24 | | rdma-core 23 |
+ | | | ConnectX-5 | | ConnectX-4 |
+ +-----------------------+-----------------+-----------------+
+ | Queue / RSS | | | | DPDK 18.11 |
+ | | | N/A | | OFED 4.5 |
+ | | | | | rdma-core 23 |
+ | | | | | ConnectX-4 |
+ +-----------------------+-----------------+-----------------+
+ | Encapsulation | | DPDK 19.05 | | DPDK 19.02 |
+ | (VXLAN / NVGRE / RAW) | | OFED 4.7-1 | | OFED 4.6 |
+ | | | rdma-core 24 | | rdma-core 23 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Encapsulation | | DPDK 19.11 | | DPDK 19.11 |
+ | GENEVE | | OFED 4.7-3 | | OFED 4.7-3 |
+ | | | rdma-core 27 | | rdma-core 27 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | | Header rewrite | | DPDK 19.05 | | DPDK 19.02 |
+ | | (set_ipv4_src / | | OFED 4.7-1 | | OFED 4.7-1 |
+ | | set_ipv4_dst / | | rdma-core 24 | | rdma-core 24 |
+ | | set_ipv6_src / | | ConnectX-5 | | ConnectX-5 |
+ | | set_ipv6_dst / | | | | |
+ | | set_tp_src / | | | | |
+ | | set_tp_dst / | | | | |
+ | | dec_ttl / | | | | |
+ | | set_ttl / | | | | |
+ | | set_mac_src / | | | | |
+ | | set_mac_dst) | | | | |
+ | | | | | | |
+ | | (of_set_vlan_vid) | | DPDK 19.11 | | DPDK 19.11 |
+ | | | OFED 4.7-1 | | OFED 4.7-1 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Jump | | DPDK 19.05 | | DPDK 19.02 |
+ | | | OFED 4.7-1 | | OFED 4.7-1 |
+ | | | rdma-core 24 | | N/A |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Mark / Flag | | DPDK 19.05 | | DPDK 18.11 |
+ | | | OFED 4.6 | | OFED 4.5 |
+ | | | rdma-core 24 | | rdma-core 23 |
+ | | | ConnectX-5 | | ConnectX-4 |
+ +-----------------------+-----------------+-----------------+
+ | Port ID | | DPDK 19.05 | | N/A |
+ | | | OFED 4.7-1 | | N/A |
+ | | | rdma-core 24 | | N/A |
+ | | | ConnectX-5 | | N/A |
+ +-----------------------+-----------------+-----------------+
+ | | VLAN | | DPDK 19.11 | | DPDK 19.11 |
+ | | (of_pop_vlan / | | OFED 4.7-1 | | OFED 4.7-1 |
+ | | of_push_vlan / | | ConnectX-5 | | ConnectX-5 |
+ | | of_set_vlan_pcp / | | |
+ | | of_set_vlan_vid) | | |
+ +-----------------------+-----------------+-----------------+
+ | Hairpin | | | | DPDK 19.11 |
+ | | | N/A | | OFED 4.7-3 |
+ | | | | | rdma-core 26 |
+ | | | | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Meta data | | DPDK 19.11 | | DPDK 19.11 |
+ | | | OFED 4.7-3 | | OFED 4.7-3 |
+ | | | rdma-core 26 | | rdma-core 26 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+
+ | Metering | | DPDK 19.11 | | DPDK 19.11 |
+ | | | OFED 4.7-3 | | OFED 4.7-3 |
+ | | | rdma-core 26 | | rdma-core 26 |
+ | | | ConnectX-5 | | ConnectX-5 |
+ +-----------------------+-----------------+-----------------+