+ This option should be used in combination with ``txq_inline_max`` and
+ ``txq_inline_mpw`` below and does not affect ``txq_inline_min`` settings above.
+
+ If this option is not specified, the default value is 16 for BlueField
+ and 8 for other platforms.
+
+ Data inlining consumes CPU cycles, so this option is intended to enable
+ inline data automatically when there are enough Tx queues. A large number of
+ Tx queues implies that enough CPU cores are available, so PCI bandwidth
+ becomes the more critical resource and the CPU is no longer expected to be
+ the bottleneck.
+
+ Copying data into the WQE improves latency and can improve PPS performance
+ when PCI back pressure is detected. It may be useful for scenarios involving
+ heavy traffic on many queues.
+
+ Because additional software logic is necessary to handle this mode, this
+ option should be used with care, as it may lower performance when back
+ pressure is not expected.
+
+ If inline data are enabled, the maximal size of the Tx queue in descriptors
+ may be affected, because inline data increase the descriptor size and the
+ queue size limits supported by hardware may be exceeded.
+
+- ``txq_inline_min`` parameter [int]
+
+ Minimal amount of data to be inlined into the WQE during Tx operations. NICs
+ may require this minimal data amount to operate correctly. The exact value
+ may depend on NIC operation mode, requested offloads, etc. It is strongly
+ recommended to omit this parameter and use the default values. Applications
+ that do use this parameter should take into consideration that specifying
+ an inconsistent value may prevent the NIC from sending packets.
+
+ If the ``txq_inline_min`` key is present, the driver uses the specified value
+ (the driver may align it in order not to exceed the limits and to provide
+ better descriptor space utilization), and it is guaranteed that the
+ requested number of data bytes is inlined into the WQE in addition to other
+ inline settings. This key may also update the ``txq_inline_max`` value
+ (default or specified explicitly in devargs) to reserve space for inline data.
+
+ If the ``txq_inline_min`` key is not present, the driver may query the value
+ from the NIC via DevX if this feature is available. If DevX is not enabled
+ or not supported, the value 18 (assuming an L2 header including VLAN) is set
+ for ConnectX-4 and ConnectX-4 Lx, and 0 is set by default for ConnectX-5
+ and newer NICs. If a packet is shorter than the ``txq_inline_min`` value, the
+ entire packet is inlined.
+
+ For the ConnectX-4 NIC, the driver does not allow specifying a value below 18
+ (minimal L2 header, including VLAN); an error is raised.
+
+ For the ConnectX-4 Lx NIC, it is allowed to specify values below 18, but
+ it is not recommended and may prevent the NIC from sending packets in
+ some configurations.
+
+ Note that minimal data inlining disengages the eMPW feature (Enhanced
+ Multi-Packet Write), because eMPW does not support partial packet inlining.
+ This is not critical, because minimal data inlining is mostly required
+ by ConnectX-4 and ConnectX-4 Lx, and these NICs do not support the eMPW feature.
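+
+ The defaulting and validation rules described above can be summarized in a
+ short Python sketch. This is only an illustration of the documented behavior,
+ not the driver code: the function names and the NIC name strings are
+ hypothetical, and the alignment the driver may apply to a user value is
+ deliberately left out.
+

```python
from typing import Optional

# Fallback used when DevX is unavailable: 18 bytes covers an L2 header
# including VLAN, as stated in the text above.
LEGACY_MIN_INLINE = 18


def resolve_inline_min(nic: str,
                       user_value: Optional[int] = None,
                       devx_value: Optional[int] = None) -> int:
    """Pick the effective minimal inline size (hypothetical helper)."""
    if user_value is not None:
        # ConnectX-4 rejects values below 18; ConnectX-4 Lx only discourages them.
        if nic == "ConnectX-4" and user_value < LEGACY_MIN_INLINE:
            raise ValueError("txq_inline_min below 18 is not allowed on ConnectX-4")
        return user_value
    if devx_value is not None:
        # Value reported by the NIC via DevX, when that feature is available.
        return devx_value
    if nic in ("ConnectX-4", "ConnectX-4 Lx"):
        return LEGACY_MIN_INLINE
    return 0  # ConnectX-5 and newer NICs


def inlined_bytes(packet_len: int, inline_min: int) -> int:
    """A packet shorter than inline_min is inlined entirely."""
    return min(packet_len, inline_min)
```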
+
+- ``txq_inline_max`` parameter [int]
+
+ Specifies the maximal packet length to be completely inlined into the WQE
+ Ethernet Segment for the ordinary SEND method. If a packet is larger than the
+ specified value, the packet data is not copied by the driver at all and the
+ data buffer is addressed with a pointer. If the packet length is less than or
+ equal to the specified value, all packet data is copied into the WQE. This may
+ significantly improve PCI bandwidth utilization for short packets but
+ requires extra CPU cycles.
+
+ The data inline feature is controlled by the number of Tx queues. If the
+ number of Tx queues is larger than the ``txqs_min_inline`` key parameter, the
+ inline feature is engaged. If there are not enough Tx queues (which means not
+ enough CPU cores, so CPU resources are scarce), data inline is not performed
+ by the driver. Setting ``txqs_min_inline`` to zero always enables data inline.
+
+ The default ``txq_inline_max`` value is 290. The specified value may be
+ adjusted by the driver in order not to exceed the limit (930 bytes) and to
+ fill the WQE space without gaps; the adjustment is reported in the debug
+ log. Also, the default value (290) may be decreased at run time if a large
+ transmit queue size is requested and the hardware does not support enough
+ descriptors; in this case a warning is emitted. If the ``txq_inline_max`` key
+ is specified and the requested inline settings cannot be satisfied, an
+ error is raised.
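+
+ As a rough illustration of how the queue-count threshold and the length
+ limit combine for the ordinary SEND method, consider the Python sketch
+ below. The function names are hypothetical, the interaction with
+ ``txq_inline_min`` and the driver's run-time adjustments are ignored, and
+ the comparison operators simply follow the wording of the text above.
+

```python
def inline_enabled(n_txq: int, txqs_min_inline: int) -> bool:
    """Inlining is engaged when there are more Tx queues than the threshold;
    a threshold of zero always enables it."""
    return txqs_min_inline == 0 or n_txq > txqs_min_inline


def send_mode(packet_len: int, n_txq: int, txqs_min_inline: int,
              txq_inline_max: int = 290) -> str:
    """Return "inline" if the whole packet would be copied into the WQE,
    or "pointer" if the data buffer would be addressed by pointer."""
    if inline_enabled(n_txq, txqs_min_inline) and packet_len <= txq_inline_max:
        return "inline"
    return "pointer"
```

+
+ With 16 Tx queues and a threshold of 8, ``send_mode(64, 16, 8)`` yields
+ ``"inline"`` while ``send_mode(1500, 16, 8)`` yields ``"pointer"``; with only
+ 4 queues even a 64-byte packet is sent by pointer.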
+
+- ``txq_inline_mpw`` parameter [int]
+
+ Specifies the maximal packet length to be completely inlined into the WQE for
+ the Enhanced MPW method. If a packet is larger than the specified value, the
+ packet data is not copied and the data buffer is addressed with a pointer. If
+ the packet length is less than or equal to the specified value, all packet
+ data is copied into the WQE. This may significantly improve PCI bandwidth
+ utilization for short packets but requires extra CPU cycles.
+
+ The data inline feature is controlled by the number of Tx queues. If the
+ number of Tx queues is larger than the ``txqs_min_inline`` key parameter, the
+ inline feature is engaged. If there are not enough Tx queues (which means not
+ enough CPU cores, so CPU resources are scarce), data inline is not performed
+ by the driver. Setting ``txqs_min_inline`` to zero always enables data inline.
+
+ The default ``txq_inline_mpw`` value is 268. The specified value may be
+ adjusted by the driver in order not to exceed the limit (930 bytes) and to
+ fill the WQE space without gaps; the adjustment is reported in the debug
+ log. Because multiple packets may be included in the same WQE with the
+ Enhanced Multi-Packet Write method and the overall WQE size is limited, it is
+ not recommended to specify large values for ``txq_inline_mpw``. Also, the
+ default value (268) may be decreased at run time if a large transmit queue
+ size is requested and the hardware does not support enough descriptors; in
+ this case a warning is emitted. If the ``txq_inline_mpw`` key is specified and
+ the requested inline settings cannot be satisfied, an error is raised.
+
+- ``txqs_max_vec`` parameter [int]
+
+ Enable vectorized Tx only when the number of Tx queues is less than or
+ equal to this value. This parameter is deprecated and ignored; it is kept
+ for compatibility so that specifying it does not prevent the driver from probing.
+
+- ``txq_mpw_hdr_dseg_en`` parameter [int]
+
+ A nonzero value enables including two pointers in the first block of the Tx
+ descriptor. The parameter is deprecated and ignored; it is kept for
+ compatibility.
+
+- ``txq_max_inline_len`` parameter [int]
+
+ Maximum size of a packet to be inlined. If the size of a packet is larger
+ than the configured value, the packet is not inlined even if there is
+ enough space remaining in the descriptor. Instead, the packet is included
+ by pointer. This parameter is deprecated and converted directly to
+ ``txq_inline_mpw``, providing full compatibility. Valid only if the eMPW
+ feature is engaged.
+
+- ``txq_mpw_en`` parameter [int]
+
+ A nonzero value enables Enhanced Multi-Packet Write (eMPW) for ConnectX-5,
+ ConnectX-6, ConnectX-6 DX and BlueField. eMPW allows the Tx burst function to
+ pack multiple packets into a single descriptor session in order to save PCI
+ bandwidth and improve performance at the cost of slightly higher CPU usage.
+ When ``txq_inline_mpw`` is set along with ``txq_mpw_en``, the Tx burst function
+ copies the entire packet data into the Tx descriptor instead of including a
+ pointer to the packet.
+
+ The Enhanced Multi-Packet Write feature is enabled by default if the NIC
+ supports it, and can be disabled by explicitly specifying 0 for the
+ ``txq_mpw_en`` option. Also, if minimal data inlining is requested by a non-zero
+ ``txq_inline_min`` option or reported by the NIC, the eMPW feature is disengaged.
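+
+ The conditions under which eMPW ends up engaged can be condensed into a
+ single predicate. The Python sketch below is only a reading aid: the
+ function and argument names are hypothetical, and ``inline_min`` stands for
+ the effective minimal inline size, whether requested by the user or reported
+ by the NIC.
+

```python
from typing import Optional


def empw_engaged(nic_supports_empw: bool,
                 txq_mpw_en: Optional[int] = None,
                 inline_min: int = 0) -> bool:
    """Return True if Enhanced Multi-Packet Write would be used.

    txq_mpw_en=None models an omitted devarg (enabled by default when
    the NIC supports the feature).
    """
    if not nic_supports_empw:
        return False
    if txq_mpw_en is not None and txq_mpw_en == 0:
        return False  # explicitly disabled by the user
    # Any non-zero minimal inlining disengages eMPW.
    return inline_min == 0
```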
+
+- ``tx_db_nc`` parameter [int]
+
+ The rdma core library can map doorbell register in two ways, depending on the
+ environment variable "MLX5_SHUT_UP_BF":
+
+ - As regular cached memory, if the variable is either missing or set to zero.
+ - As non-cached memory, if the variable is present and set to a value other than "0".
+
+ The type of mapping may slightly affect Tx performance. The optimal choice
+ depends strongly on the host architecture and should be determined
+ experimentally.
+
+ If ``tx_db_nc`` is set to zero, the doorbell is forced to be mapped to regular
+ memory and the PMD performs an extra write memory barrier after writing to
+ the doorbell. This might increase the CPU clocks needed per packet sent, but
+ latency might be improved.
+
+ If ``tx_db_nc`` is set to one, the doorbell is forced to be mapped to
+ non-cached memory and the PMD does not perform the extra write memory barrier
+ after writing to the doorbell. On some architectures this might improve
+ performance.
+
+ If ``tx_db_nc`` is set to two, the doorbell is forced to be mapped to regular
+ memory and the PMD uses heuristics to decide whether the write memory barrier
+ should be performed. For bursts whose size is a multiple of the recommended
+ one (64 packets), it is assumed that the next burst is coming, so the extra
+ memory barrier is not issued (it is expected to be issued in the next burst,
+ at least after descriptor writing). This might increase latency (on some
+ hosts until the next packets are transmitted) and should be used with care.
+
+ If ``tx_db_nc`` is omitted or set to zero, the preset (if any) environment
+ variable "MLX5_SHUT_UP_BF" value is used. If there is no "MLX5_SHUT_UP_BF",
+ the default ``tx_db_nc`` value is zero for ARM64 hosts and one for others.
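+
+ The three explicit ``tx_db_nc`` modes differ in whether the extra write
+ memory barrier follows the doorbell write. The Python sketch below models
+ only that decision as described in the text; the function name is
+ hypothetical, the burst size is assumed to be positive, and the real
+ heuristic in the PMD may take more state into account.
+

```python
RECOMMENDED_BURST = 64  # recommended burst size in packets, per the text


def extra_barrier_after_doorbell(tx_db_nc: int, burst_size: int) -> bool:
    """Return True if the extra write memory barrier would be issued."""
    if tx_db_nc == 0:
        return True   # regular mapping, barrier always issued
    if tx_db_nc == 1:
        return False  # non-cached mapping, no extra barrier
    if tx_db_nc == 2:
        # Heuristic: a burst that is a multiple of the recommended size
        # suggests another burst is coming, so the barrier is skipped.
        return burst_size % RECOMMENDED_BURST != 0
    raise ValueError("tx_db_nc must be 0, 1 or 2")
```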
+
+- ``tx_vec_en`` parameter [int]
+
+ A nonzero value enables Tx vector on ConnectX-5, ConnectX-6, ConnectX-6 DX
+ and BlueField NICs if the number of global Tx queues on the port is less than
+ ``txqs_max_vec``. The parameter is deprecated and ignored.
+
+- ``rx_vec_en`` parameter [int]
+
+ A nonzero value enables Rx vector if the port is not configured in
+ multi-segment mode; otherwise this parameter is ignored.
+
+ Enabled by default.
+
+- ``vf_nl_en`` parameter [int]
+
+ A nonzero value enables Netlink requests from the VF to add/remove MAC
+ addresses and/or enable/disable promiscuous/all-multicast mode on the
+ Netdevice. Otherwise the relevant configuration must be done with Linux
+ iproute2 tools. This is a prerequisite to receive this kind of traffic.
+
+ Enabled by default; valid only on VF devices, ignored otherwise.
+
+- ``l3_vxlan_en`` parameter [int]
+
+ A nonzero value allows L3 VXLAN and VXLAN-GPE flow creation. To enable
+ L3 VXLAN or VXLAN-GPE, users have to configure the firmware and enable this
+ parameter. This is a prerequisite to receive this kind of traffic.
+
+ Disabled by default.
+
+- ``dv_xmeta_en`` parameter [int]
+
+ A nonzero value enables extensive flow metadata support if the device is
+ capable and the driver supports it. This can enable extensive support of
+ the ``MARK`` and ``META`` items of ``rte_flow``. The newly introduced
+ ``SET_TAG`` and ``SET_META`` actions do not depend on ``dv_xmeta_en``.
+
+ The following configurations are possible, depending on the parameter value:
+
+ - 0: the default value, which selects legacy mode. The ``MARK`` and
+ ``META`` related actions and items operate only within the NIC Tx and
+ NIC Rx steering domains; no ``MARK`` and ``META`` information crosses
+ the domain boundaries. The ``MARK`` item is 24 bits wide; the ``META``
+ item is 32 bits wide and matching on it is supported on egress only.
+
+ - 1: engages extensive metadata mode. The ``MARK`` and ``META``
+ related actions and items operate within all supported steering domains,
+ including FDB, and ``MARK`` and ``META`` information may cross the domain
+ boundaries. The ``MARK`` item is 24 bits wide; the ``META`` item width
+ depends on kernel and firmware configurations and might be 0, 16 or
+ 32 bits. Within the NIC Tx domain the ``META`` data width is 32 bits for
+ compatibility; the actual width of data transferred to the FDB domain
+ depends on kernel configuration and may vary. The actual supported
+ width can be retrieved at run time by a series of ``rte_flow_validate()``
+ trials.
+
+ - 2: engages extensive metadata mode. The ``MARK`` and ``META``
+ related actions and items operate within all supported steering domains,
+ including FDB, and ``MARK`` and ``META`` information may cross the domain
+ boundaries. The ``META`` item is 32 bits wide; the ``MARK`` item width
+ depends on kernel and firmware configurations and might be 0, 16 or
+ 24 bits. The actual supported width can be retrieved at run time by a
+ series of ``rte_flow_validate()`` trials.
+
+ +------+-----------+-----------+-------------+-------------+
+ | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
+ +======+===========+===========+=============+=============+
+ | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
+ +------+-----------+-----------+-------------+-------------+
+ | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
+ +------+-----------+-----------+-------------+-------------+
+ | 2    | vary 0-32 | 32 bits   | 32 bits     | yes         |
+ +------+-----------+-----------+-------------+-------------+
+
+ If there is no E-Switch configuration the ``dv_xmeta_en`` parameter is
+ ignored and the device is configured to operate in legacy mode (0).
+
+ Disabled by default (set to 0).
+
+ Direct Verbs/Rules (engaged with ``dv_flow_en`` = 1) support all
+ of the extensive metadata features. Legacy Verbs support the FLAG and
+ MARK metadata actions over the NIC Rx steering domain only.
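+
+ The "series of ``rte_flow_validate()`` trials" mentioned above amounts to
+ probing candidate widths from widest to narrowest. The Python sketch below
+ shows only that search loop. The ``validate`` callable is a hypothetical
+ stand-in for building a flow rule whose mask covers the given number of
+ bits and passing it to ``rte_flow_validate()``; it is not a DPDK API.
+

```python
from typing import Callable, Iterable


def probe_width(validate: Callable[[int], bool],
                candidates: Iterable[int] = (32, 24, 16)) -> int:
    """Return the widest candidate width accepted by validate, or 0 if
    none is accepted (the item is then treated as unsupported)."""
    for width in sorted(candidates, reverse=True):
        if validate(width):
            return width
    return 0
```

+
+ For example, with a stand-in that accepts masks of up to 16 bits,
+ ``probe_width(lambda w: w <= 16)`` settles on 16.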
+
+- ``dv_flow_en`` parameter [int]
+
+ A nonzero value enables the DV flow steering assuming it is supported
+ by the driver (RDMA Core library version is rdma-core-24.0 or higher).
+
+ Enabled by default if supported.
+
+- ``dv_esw_en`` parameter [int]
+
+ A nonzero value enables E-Switch using Direct Rules.
+
+ Enabled by default if supported.
+
+- ``mr_ext_memseg_en`` parameter [int]
+
+ A nonzero value enables extending memseg when registering DMA memory. If
+ enabled, the number of entries in the MR (Memory Region) lookup table on the
+ datapath is minimized, which benefits performance. On the other hand, it
+ worsens memory utilization because registered memory is pinned by the kernel
+ driver. Even if a page in the extended chunk is freed, it does not become
+ reusable until the entire memory is freed.
+
+ Enabled by default.
+
+- ``representor`` parameter [list]
+
+ This parameter can be used to instantiate DPDK Ethernet devices from
+ existing port (or VF) representors configured on the device.
+
+ It is a standard parameter whose format is described in
+ :ref:`ethernet_device_standard_device_arguments`.
+
+ For instance, to probe port representors 0 through 2::
+
+ representor=[0-2]
+
+- ``max_dump_files_num`` parameter [int]
+
+ The maximum number of files per PMD entity that may be created for debug information.
+ The files are created in the /var/log directory or in the current directory.
+
+ Set to 128 by default.
+
+- ``lro_timeout_usec`` parameter [int]
+
+ The maximum allowed duration of an LRO session, in microseconds.
+ The PMD sets the nearest value supported by the hardware that is not bigger
+ than the input ``lro_timeout_usec`` value.
+ If this parameter is not specified, the PMD sets by default
+ the smallest value supported by the hardware.
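+
+ The selection rule can be pictured with a small Python sketch. The function
+ name and the list of supported timeouts are hypothetical (real values are
+ reported by the hardware), and falling back to the smallest supported value
+ when the request is below all of them is an assumption not stated above.
+

```python
from typing import Optional, Sequence


def pick_lro_timeout(supported: Sequence[int],
                     requested: Optional[int] = None) -> int:
    """Choose the LRO session timeout in microseconds."""
    if requested is None:
        # Parameter omitted: use the smallest supported value.
        return min(supported)
    not_bigger = [v for v in supported if v <= requested]
    if not_bigger:
        # Nearest supported value that does not exceed the request.
        return max(not_bigger)
    # Assumption: request is below every supported value.
    return min(supported)
```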
+
+.. _mlx5_firmware_config:
+
+Firmware configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+Firmware features can be configured as key/value pairs.
+
+The command to set a value is::
+
+ mlxconfig -d <device> set <key>=<value>
+
+The command to query a value is::
+
+ mlxconfig -d <device> query | grep <key>
+
+The device name for the command ``mlxconfig`` can be either the PCI address,
+or the mst device name found with::
+
+ mst status
+
+Some firmware configurations are listed below.
+
+- link type::
+
+ LINK_TYPE_P1
+ LINK_TYPE_P2
+ value: 1=Infiniband 2=Ethernet 3=VPI(auto-sense)
+
+- enable SR-IOV::
+
+ SRIOV_EN=1
+
+- maximum number of SR-IOV virtual functions::
+
+ NUM_OF_VFS=<max>
+
+- enable DevX (required by Direct Rules and other features)::
+
+ UCTX_EN=1
+
+- aggressive CQE zipping::
+
+ CQE_COMPRESSION=1
+
+- L3 VXLAN and VXLAN-GPE destination UDP port::
+
+ IP_OVER_VXLAN_EN=1
+ IP_OVER_VXLAN_PORT=<udp dport>
+
+- enable IP-in-IP tunnel flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=0
+
+- enable MPLS flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=1
+
+- enable ICMP/ICMP6 code/type fields matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=2
+
+- enable Geneve flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=0
+
+- enable GTP flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=3