+Run as non-root
+^^^^^^^^^^^^^^^
+
+In order to run as a non-root user,
+some capabilities must be granted to the application::
+
+ setcap cap_sys_admin,cap_net_admin,cap_net_raw,cap_ipc_lock+ep <dpdk-app>
+
+Below are the reasons each capability is needed:
+
+``cap_sys_admin``
+ When using physical addresses (PA mode), with Linux >= 4.0,
+ for access to ``/proc/self/pagemap``.
+
+``cap_net_admin``
+ For device configuration.
+
+``cap_net_raw``
+   For raw Ethernet queue allocation through the kernel driver.
+
+``cap_ipc_lock``
+ For DMA memory pinning.
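+
+The granted capabilities can then be verified with the ``getcap`` tool from the
+libcap package, for instance::
+
+   getcap <dpdk-app>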
+
+Driver options
+^^^^^^^^^^^^^^
+
+- ``rxq_cqe_comp_en`` parameter [int]
+
+  A nonzero value enables the compression of CQE on RX side. This feature
+  saves PCI bandwidth and improves performance. Enabled by default.
+
+ Supported on:
+
+ - x86_64 with ConnectX-4, ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx
+ and BlueField.
+ - POWER9 and ARMv8 with ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx
+ and BlueField.
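+
+  For example, CQE compression can be disabled by passing the option as a
+  device argument (hypothetical PCI address; the EAL allow-list option is
+  ``-a``, or ``-w`` on older DPDK releases)::
+
+    <dpdk-app> -a 0000:03:00.0,rxq_cqe_comp_en=0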
+
+- ``rxq_cqe_pad_en`` parameter [int]
+
+  A nonzero value enables 128B padding of CQE on RX side. The size of CQE
+  is aligned with the size of a cacheline of the core. If the cacheline size
+  is 128B, the CQE size is configured to be 128B even though the device writes
+  only 64B of data on the cacheline. This avoids unnecessary cache
+  invalidation caused by the device's two consecutive writes onto one
+  cacheline. However, on some architectures it is more beneficial to update
+  the entire cacheline by padding the remaining 64B rather than striding,
+  because a read-modify-write could degrade performance significantly. On the
+  other hand, writing the extra data consumes more PCIe bandwidth and could
+  also reduce the maximum throughput. It is recommended to set this parameter
+  empirically. Disabled by default.
+
+ Supported on:
+
+ - CPU having 128B cacheline with ConnectX-5 and BlueField.
+
+- ``rxq_pkt_pad_en`` parameter [int]
+
+  A nonzero value enables padding the Rx packet to the size of a cacheline on
+  the PCI transaction. This feature wastes some PCI bandwidth but could improve
+  performance by avoiding partial cacheline writes, which may cause a costly
+  read-modify-write memory transaction on some architectures. Disabled by
+  default.
+
+ Supported on:
+
+ - x86_64 with ConnectX-4, ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx
+ and BlueField.
+ - POWER8 and ARMv8 with ConnectX-4 Lx, ConnectX-5, ConnectX-6, ConnectX-6 Dx
+ and BlueField.
+
+- ``mprq_en`` parameter [int]
+
+  A nonzero value enables configuring Multi-Packet Rx queues. An Rx queue is
+  configured as a Multi-Packet RQ if the total number of Rx queues is
+  ``rxqs_min_mprq`` or more. Disabled by default.
+
+  Multi-Packet Rx Queue (MPRQ a.k.a. Striding RQ) can further save PCIe
+  bandwidth by posting a single large buffer for multiple packets. Instead of
+  posting one buffer per packet, one large buffer is posted in order to receive
+  multiple packets on that buffer. An MPRQ buffer consists of multiple
+  fixed-size strides and each stride receives one packet. MPRQ can improve
+  throughput for small-packet traffic.
+
+  When MPRQ is enabled, ``max_rx_pkt_len`` can be larger than the size of the
+  user-provided mbuf even if ``DEV_RX_OFFLOAD_SCATTER`` isn't enabled. The PMD
+  will configure a stride size large enough to accommodate ``max_rx_pkt_len``
+  as long as the device allows it. Note that this can waste system memory
+  compared to enabling Rx scatter and multi-segment packets. A combined
+  configuration example is shown under ``rxqs_min_mprq`` below.
+
+- ``mprq_log_stride_num`` parameter [int]
+
+  Log 2 of the number of strides for a Multi-Packet Rx queue. Configuring more
+  strides can reduce PCIe traffic further. If the configured value is not in
+  the range supported by the device, the default value will be used instead
+  and a warning message is emitted. The default value is 4, which is 16 strides
+  per buffer, and is valid only if ``mprq_en`` is set.
+
+  The size of the Rx queue should be bigger than the number of strides.
+
+- ``mprq_log_stride_size`` parameter [int]
+
+  Log 2 of the size of a stride for a Multi-Packet Rx queue. Configuring a
+  smaller stride size can save some memory and reduce the probability of
+  depleting all available strides due to packets not yet released by the
+  application. If the configured value is not in the range supported by the
+  device, the default value will be used instead and a warning message is
+  emitted. The default value is 11, which is 2048 bytes per stride, and is
+  valid only if ``mprq_en`` is set. With ``mprq_log_stride_size`` set, it is
+  possible for a packet to span multiple strides. This mode allows supporting
+  jumbo frames (9K) with MPRQ. A memcpy of some packets (or part of a packet,
+  if Rx scatter is configured) may be required when there is no space left for
+  headroom at the end of a stride, which incurs some performance penalty.
+
+- ``mprq_max_memcpy_len`` parameter [int]
+
+  The maximum length of a packet to memcpy in case of a Multi-Packet Rx queue.
+  An Rx packet is mem-copied to a user-provided mbuf if the size of the Rx
+  packet is less than or equal to this parameter. Otherwise, the PMD will
+  attach the Rx packet to the mbuf by external buffer attachment -
+  ``rte_pktmbuf_attach_extbuf()``. A mempool for external buffers will be
+  allocated and managed by the PMD. If an Rx packet is externally attached,
+  the ``ol_flags`` field of the mbuf will have ``EXT_ATTACHED_MBUF`` set and
+  this flag must be preserved. ``RTE_MBUF_HAS_EXTBUF()`` checks the flag. The
+  default value is 128, valid only if ``mprq_en`` is set.
+
+- ``rxqs_min_mprq`` parameter [int]
+
+  Configure Rx queues as Multi-Packet RQ if the total number of Rx queues is
+  greater than or equal to this value. The default value is 12, valid only if
+  ``mprq_en`` is set.
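+
+  For example, a possible MPRQ configuration combining the parameters described
+  above (hypothetical PCI address and values, which must fit the device
+  capabilities)::
+
+    <dpdk-app> -a 0000:03:00.0,mprq_en=1,rxqs_min_mprq=6,mprq_log_stride_num=5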
+
+- ``txq_inline`` parameter [int]
+
+ Amount of data to be inlined during TX operations. This parameter is
+ deprecated and converted to the new parameter ``txq_inline_max`` providing
+ partial compatibility.
+
+- ``txqs_min_inline`` parameter [int]
+
+  Enable inline data send only when the number of Tx queues is greater than or
+  equal to this value.
+
+  This option should be used in combination with ``txq_inline_max`` and
+  ``txq_inline_mpw`` below and does not affect the ``txq_inline_min`` settings.
+
+  If this option is not specified, the default value is 16 for BlueField
+  and 8 for other platforms.
+
+  Data inlining consumes CPU cycles, so this option is intended to enable
+  inlining automatically when there are enough Tx queues, which means there
+  are enough CPU cores, PCI bandwidth is becoming more critical, and the CPU
+  is no longer expected to be the bottleneck.
+
+  Copying data into the WQE improves latency and can improve PPS performance
+  when PCI back pressure is detected; it may be useful for scenarios involving
+  heavy traffic on many queues.
+
+ Because additional software logic is necessary to handle this mode, this
+ option should be used with care, as it may lower performance when back
+ pressure is not expected.
+
+  If inline data is enabled, it may affect the maximal size of the Tx queue in
+  descriptors, because the inline data increases the descriptor size and the
+  queue size limits supported by the hardware may be exceeded.
+
+- ``txq_inline_min`` parameter [int]
+
+  Minimal amount of data to be inlined into the WQE during Tx operations. NICs
+  may require this minimal data amount to operate correctly. The exact value
+  may depend on the NIC operation mode, requested offloads, etc. It is strongly
+  recommended to omit this parameter and use the default values. In any case,
+  applications using this parameter should take into consideration that
+  specifying an inconsistent value may prevent the NIC from sending packets.
+
+  If the ``txq_inline_min`` key is present, the specified value (possibly
+  aligned by the driver in order not to exceed the limits and to provide better
+  descriptor space utilization) will be used by the driver, and it is
+  guaranteed that the requested amount of data bytes is inlined into the WQE
+  besides other inline settings. This key may also update the
+  ``txq_inline_max`` value (default or specified explicitly in devargs) to
+  reserve space for the inline data.
+
+  If the ``txq_inline_min`` key is not present, the value may be queried by the
+  driver from the NIC via DevX if this feature is available. If DevX is not
+  enabled or not supported, the value 18 (assuming an L2 header including VLAN)
+  is set for ConnectX-4 and ConnectX-4 Lx, and 0 is set by default for
+  ConnectX-5 and newer NICs. If a packet is shorter than the ``txq_inline_min``
+  value, the entire packet is inlined.
+
+  For ConnectX-4 NICs, the driver does not allow specifying a value below 18
+  (minimal L2 header, including VLAN); an error will be raised.
+
+  For ConnectX-4 Lx NICs, values below 18 are allowed, but this is not
+  recommended and may prevent the NIC from sending packets in some
+  configurations.
+
+  Please note that this minimal data inlining disengages the eMPW feature
+  (Enhanced Multi-Packet Write), because the latter does not support partial
+  packet inlining. This is not very critical, since minimal data inlining is
+  mostly required by ConnectX-4 and ConnectX-4 Lx, and these NICs do not
+  support the eMPW feature.
+
+- ``txq_inline_max`` parameter [int]
+
+  Specifies the maximal packet length to be completely inlined into the WQE
+  Ethernet Segment for the ordinary SEND method. If a packet is larger than the
+  specified value, the packet data is not copied by the driver at all and the
+  data buffer is addressed with a pointer. If the packet length is less than or
+  equal, all packet data is copied into the WQE. This may significantly improve
+  PCI bandwidth utilization for short packets but requires extra CPU cycles.
+
+  The data inline feature is controlled by the number of Tx queues: if the
+  number of Tx queues is larger than the ``txqs_min_inline`` key parameter, the
+  inline feature is engaged; if there are not enough Tx queues (which means
+  there are not enough CPU cores and CPU resources are scarce), data inlining
+  is not performed by the driver. Setting ``txqs_min_inline`` to zero always
+  enables data inlining.
+
+  The default ``txq_inline_max`` value is 290. The specified value may be
+  adjusted by the driver in order not to exceed the limit (930 bytes) and to
+  provide better WQE space filling without gaps; the adjustment is reflected in
+  the debug log. Also, the default value (290) may be decreased at run time if
+  a large transmit queue size is requested and the hardware does not support a
+  sufficient number of descriptors; in this case a warning is emitted. If the
+  ``txq_inline_max`` key is specified and the requested inline settings cannot
+  be satisfied, an error is raised.
+
+- ``txq_inline_mpw`` parameter [int]
+
+  Specifies the maximal packet length to be completely inlined into the WQE for
+  the Enhanced MPW method. If a packet is larger than the specified value, the
+  packet data is not copied and the data buffer is addressed with a pointer. If
+  the packet length is less than or equal, all packet data is copied into the
+  WQE. This may significantly improve PCI bandwidth utilization for short
+  packets but requires extra CPU cycles.
+
+  The data inline feature is controlled by the number of Tx queues: if the
+  number of Tx queues is larger than the ``txqs_min_inline`` key parameter, the
+  inline feature is engaged; if there are not enough Tx queues (which means
+  there are not enough CPU cores and CPU resources are scarce), data inlining
+  is not performed by the driver. Setting ``txqs_min_inline`` to zero always
+  enables data inlining.
+
+  The default ``txq_inline_mpw`` value is 268. The specified value may be
+  adjusted by the driver in order not to exceed the limit (930 bytes) and to
+  provide better WQE space filling without gaps; the adjustment is reflected in
+  the debug log. Since multiple packets may be included in the same WQE with
+  the Enhanced Multi-Packet Write method and the overall WQE size is limited,
+  it is not recommended to specify large values for ``txq_inline_mpw``. Also,
+  the default value (268) may be decreased at run time if a large transmit
+  queue size is requested and the hardware does not support a sufficient number
+  of descriptors; in this case a warning is emitted. If the ``txq_inline_mpw``
+  key is specified and the requested inline settings cannot be satisfied, an
+  error is raised.
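+
+  For example, the Tx inlining parameters described above could be tuned as
+  follows (hypothetical PCI address and values)::
+
+    <dpdk-app> -a 0000:03:00.0,txqs_min_inline=4,txq_inline_max=256,txq_inline_mpw=256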
+
+- ``txqs_max_vec`` parameter [int]
+
+  Enable vectorized Tx only when the number of Tx queues is less than or equal
+  to this value. This parameter is deprecated and ignored; it is kept for
+  compatibility so as not to prevent the driver from probing.
+
+- ``txq_mpw_hdr_dseg_en`` parameter [int]
+
+  A nonzero value enables including two pointers in the first block of the Tx
+  descriptor. This parameter is deprecated and ignored; it is kept for
+  compatibility.
+
+- ``txq_max_inline_len`` parameter [int]
+
+  Maximum size of a packet to be inlined. If the size of a packet is larger
+  than the configured value, the packet isn't inlined even though there is
+  enough space remaining in the descriptor. Instead, the packet is referenced
+  with a pointer. This parameter is deprecated and converted directly to
+  ``txq_inline_mpw``, providing full compatibility. Valid only if the eMPW
+  feature is engaged.
+
+- ``txq_mpw_en`` parameter [int]
+
+  A nonzero value enables Enhanced Multi-Packet Write (eMPW) for ConnectX-5,
+  ConnectX-6, ConnectX-6 Dx and BlueField. eMPW allows the Tx burst function to
+  pack multiple packets into a single descriptor session in order to save PCI
+  bandwidth and improve performance at the cost of slightly higher CPU usage.
+  When ``txq_inline_mpw`` is set along with ``txq_mpw_en``, the Tx burst
+  function copies the entire packet data into the Tx descriptor instead of
+  including a pointer to the packet.
+
+  The Enhanced Multi-Packet Write feature is enabled by default if the NIC
+  supports it; it can be disabled by explicitly specifying 0 for the
+  ``txq_mpw_en`` option. Also, if minimal data inlining is requested by a
+  non-zero ``txq_inline_min`` option or reported by the NIC, the eMPW feature
+  is disengaged.
+
+- ``tx_db_nc`` parameter [int]
+
+  The rdma-core library can map the doorbell register in two ways, depending
+  on the environment variable "MLX5_SHUT_UP_BF":
+
+  - As regular cached memory (usually with the write combining attribute), if
+    the variable is either missing or set to zero.
+  - As non-cached memory, if the variable is present and set to a non-zero
+    value.
+
+  The type of mapping may slightly affect Tx performance; the optimal choice
+  strongly depends on the host architecture and should be determined
+  empirically.
+
+  If ``tx_db_nc`` is set to zero, the doorbell is forced to be mapped to
+  regular memory (with write combining) and the PMD performs an extra write
+  memory barrier after writing to the doorbell. This might increase the CPU
+  cycles needed to send a packet, but latency might be improved.
+
+  If ``tx_db_nc`` is set to one, the doorbell is forced to be mapped to
+  non-cached memory and the PMD does not perform the extra write memory barrier
+  after writing to the doorbell; on some architectures this might improve
+  performance.
+
+  If ``tx_db_nc`` is set to two, the doorbell is forced to be mapped to regular
+  memory and the PMD uses heuristics to decide whether a write memory barrier
+  should be performed. For bursts whose size is a multiple of the recommended
+  one (64 packets), it is assumed that the next burst is coming and there is no
+  need to issue the extra memory barrier (it is expected to be issued in the
+  next burst, at least after the descriptors are written). This might increase
+  latency (on some hosts, until the next packets are transmitted) and should be
+  used with care.
+
+ If ``tx_db_nc`` is omitted or set to zero, the preset (if any) environment
+ variable "MLX5_SHUT_UP_BF" value is used. If there is no "MLX5_SHUT_UP_BF",
+ the default ``tx_db_nc`` value is zero for ARM64 hosts and one for others.
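+
+  For example, the non-cached doorbell mapping can be requested either through
+  the devarg or through the rdma-core environment variable (hypothetical PCI
+  address)::
+
+    <dpdk-app> -a 0000:03:00.0,tx_db_nc=1
+    # or, letting the rdma-core environment variable select the mapping:
+    MLX5_SHUT_UP_BF=1 <dpdk-app> -a 0000:03:00.0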
+
+- ``tx_pp`` parameter [int]
+
+  If a nonzero value is specified, the driver creates all the necessary
+  internal objects to provide accurate packet send scheduling on mbuf
+  timestamps. A positive value specifies the scheduling granularity in
+  nanoseconds; packet sending will be accurate up to the specified granularity.
+  The allowed range is from 500 to 1 million nanoseconds. A negative value
+  specifies the granularity by its absolute value and engages a special test
+  mode to check the scheduling rate. By default (if ``tx_pp`` is not
+  specified), the send scheduling on timestamps feature is disabled.
+
+- ``tx_skew`` parameter [int]
+
+  This parameter adjusts the packet send scheduling on timestamps and
+  represents the average delay between the beginning of transmit descriptor
+  processing by the hardware and the appearance of the actual packet data on
+  the wire. The value should be provided in nanoseconds and is valid only if
+  the ``tx_pp`` parameter is specified. The default value is zero.
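+
+  For example, to enable send scheduling with a 500 nanosecond granularity and
+  compensate for a hypothetical 8 microsecond wire delay::
+
+    <dpdk-app> -a 0000:03:00.0,tx_pp=500,tx_skew=8000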
+
+- ``tx_vec_en`` parameter [int]
+
+ A nonzero value enables Tx vector on ConnectX-5, ConnectX-6, ConnectX-6 Dx
+ and BlueField NICs if the number of global Tx queues on the port is less than
+ ``txqs_max_vec``. The parameter is deprecated and ignored.
+
+- ``rx_vec_en`` parameter [int]
+
+  A nonzero value enables the vectorized Rx path if the port is not configured
+  in multi-segment mode; otherwise this parameter is ignored.
+
+ Enabled by default.
+
+- ``vf_nl_en`` parameter [int]
+
+  A nonzero value enables Netlink requests from the VF to add/remove MAC
+  addresses and/or enable/disable promiscuous/all-multicast on the netdevice.
+  Otherwise the relevant configuration must be done with Linux iproute2 tools.
+  This is a prerequisite to receive this kind of traffic.
+
+  Enabled by default; valid only on VF devices and ignored otherwise.
+
+- ``l3_vxlan_en`` parameter [int]
+
+  A nonzero value allows L3 VXLAN and VXLAN-GPE flow creation. To enable
+  L3 VXLAN or VXLAN-GPE, users have to configure the firmware and enable this
+  parameter. This is a prerequisite to receive this kind of traffic.
+
+ Disabled by default.
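+
+  For example, assuming the firmware has already been configured as described
+  in :ref:`mlx5_firmware_config` (``IP_OVER_VXLAN_EN`` and
+  ``IP_OVER_VXLAN_PORT``), the flows can be enabled with (hypothetical PCI
+  address)::
+
+    <dpdk-app> -a 0000:03:00.0,l3_vxlan_en=1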
+
+- ``dv_xmeta_en`` parameter [int]
+
+  A nonzero value enables extensive flow metadata support if the device is
+  capable and the driver supports it. This can enable extensive support of the
+  ``MARK`` and ``META`` items of ``rte_flow``. The newly introduced
+  ``SET_TAG`` and ``SET_META`` actions do not depend on ``dv_xmeta_en``.
+
+  There are several possible configurations, depending on the parameter value:
+
+  - 0, the default value, defines the legacy mode: the ``MARK`` and
+    ``META`` related actions and items operate only within the NIC Tx and
+    NIC Rx steering domains, and no ``MARK`` or ``META`` information crosses
+    the domain boundaries. The ``MARK`` item is 24 bits wide, the ``META``
+    item is 32 bits wide and matching is supported on egress only.
+
+  - 1, this engages extensive metadata mode: the ``MARK`` and ``META``
+    related actions and items operate within all supported steering domains,
+    including FDB, and ``MARK`` and ``META`` information may cross the domain
+    boundaries. The ``MARK`` item is 24 bits wide, the ``META`` item width
+    depends on the kernel and firmware configurations and might be 0, 16 or
+    32 bits. Within the NIC Tx domain the ``META`` data width is 32 bits for
+    compatibility, while the actual width of data transferred to the FDB domain
+    depends on the kernel configuration and may vary. The actual supported
+    width can be retrieved at runtime by a series of ``rte_flow_validate()``
+    trials.
+
+  - 2, this engages extensive metadata mode: the ``MARK`` and ``META``
+    related actions and items operate within all supported steering domains,
+    including FDB, and ``MARK`` and ``META`` information may cross the domain
+    boundaries. The ``META`` item is 32 bits wide, the ``MARK`` item width
+    depends on the kernel and firmware configurations and might be 0, 16 or
+    24 bits. The actual supported width can be retrieved at runtime by a
+    series of ``rte_flow_validate()`` trials.
+
+  +------+-----------+-----------+-------------+-------------+
+  | Mode | ``MARK``  | ``META``  | ``META`` Tx | FDB/Through |
+  +======+===========+===========+=============+=============+
+  | 0    | 24 bits   | 32 bits   | 32 bits     | no          |
+  +------+-----------+-----------+-------------+-------------+
+  | 1    | 24 bits   | vary 0-32 | 32 bits     | yes         |
+  +------+-----------+-----------+-------------+-------------+
+  | 2    | vary 0-24 | 32 bits   | 32 bits     | yes         |
+  +------+-----------+-----------+-------------+-------------+
+
+ If there is no E-Switch configuration the ``dv_xmeta_en`` parameter is
+ ignored and the device is configured to operate in legacy mode (0).
+
+ Disabled by default (set to 0).
+
+  The Direct Verbs/Rules (engaged with ``dv_flow_en`` = 1) supports all
+  of the extensive metadata features. The legacy Verbs supports FLAG and
+  MARK metadata actions over the NIC Rx steering domain only.
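+
+  For example, to request extensive metadata mode 1 together with DV flow
+  steering and E-Switch support (both described below; hypothetical PCI
+  address)::
+
+    <dpdk-app> -a 0000:03:00.0,dv_xmeta_en=1,dv_flow_en=1,dv_esw_en=1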
+
+- ``dv_flow_en`` parameter [int]
+
+ A nonzero value enables the DV flow steering assuming it is supported
+ by the driver (RDMA Core library version is rdma-core-24.0 or higher).
+
+ Enabled by default if supported.
+
+- ``dv_esw_en`` parameter [int]
+
+ A nonzero value enables E-Switch using Direct Rules.
+
+ Enabled by default if supported.
+
+- ``lacp_by_user`` parameter [int]
+
+  A nonzero value enables the control of LACP traffic by the user application.
+  When a bond exists in the driver, by default it should be managed by the
+  kernel and therefore LACP traffic should be steered to the kernel.
+  If this devarg is set to 1, the user can manage the bond directly and
+  LACP traffic is not steered to the kernel.
+
+ Disabled by default (set to 0).
+
+- ``mr_ext_memseg_en`` parameter [int]
+
+  A nonzero value enables extending the memseg when registering DMA memory. If
+  enabled, the number of entries in the MR (Memory Region) lookup table on the
+  datapath is minimized, which benefits performance. On the other hand, it
+  worsens memory utilization because the registered memory is pinned by the
+  kernel driver. Even if a page in the extended chunk is freed, it does not
+  become reusable until the entire memory is freed.
+
+ Enabled by default.
+
+- ``representor`` parameter [list]
+
+ This parameter can be used to instantiate DPDK Ethernet devices from
+ existing port (or VF) representors configured on the device.
+
+ It is a standard parameter whose format is described in
+ :ref:`ethernet_device_standard_device_arguments`.
+
+ For instance, to probe port representors 0 through 2::
+
+ representor=[0-2]
+
+- ``max_dump_files_num`` parameter [int]
+
+  The maximum number of files per PMD entity that may be created for debug
+  information. The files will be created in the /var/log directory or in the
+  current directory.
+
+  Set to 128 by default.
+
+- ``lro_timeout_usec`` parameter [int]
+
+  The maximum allowed duration of an LRO session, in microseconds.
+  The PMD will set the nearest value supported by the HW that is not bigger
+  than the input ``lro_timeout_usec`` value.
+  If this parameter is not specified, by default the PMD will set
+  the smallest value supported by the HW.
+
+- ``hp_buf_log_sz`` parameter [int]
+
+  The total data buffer size of a hairpin queue (logarithmic form), in bytes.
+  The PMD will set the data buffer size to 2 ** ``hp_buf_log_sz``, for both Rx
+  and Tx. The valid range of the value is specified by the firmware and
+  initialization will fail if the value is out of range.
+  The range of the value is currently from 11 to 19, and the supported frame
+  size of a single packet for hairpin is from 512B to 128KB. It might change
+  if a different firmware release is used. Using a small value could reduce
+  memory consumption but might not work with large frames. If the value is
+  too large, memory consumption will be high and some potential performance
+  degradation will be introduced.
+  By default, the PMD will set this value to 16, which means that 9KB jumbo
+  frames will be supported.
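+
+  For example, the default value of 16 results in a 2^16 = 65536 byte (64KB)
+  data buffer per hairpin queue, and each increment doubles it; a hypothetical
+  override could be::
+
+    <dpdk-app> -a 0000:03:00.0,hp_buf_log_sz=17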
+
+- ``reclaim_mem_mode`` parameter [int]
+
+  Caching some resources on flow destroy makes flow recreation more efficient,
+  while some systems may require that all resources be reclaimed after the
+  flow is destroyed.
+  The ``reclaim_mem_mode`` parameter provides the option for the user to
+  configure whether the resource cache is needed or not.
+
+  There are three options to choose from:
+
+  - 0. Flow resources will be cached as usual. Caching the resources helps
+    the flow insertion rate.
+
+  - 1. Only the DPDK PMD level resource reclaim is enabled.
+
+  - 2. Both the DPDK PMD level and the rdma-core low level are configured in
+    reclaim mode.
+
+ By default, the PMD will set this value to 0.
+
+- ``sys_mem_en`` parameter [int]
+
+  A non-zero value makes the PMD memory management allocate memory from the
+  system by default, without an explicit rte memory flag.
+
+ By default, the PMD will set this value to 0.
+
+- ``decap_en`` parameter [int]
+
+ Some devices do not support FCS (frame checksum) scattering for
+ tunnel-decapsulated packets.
+ If set to 0, this option forces the FCS feature and rejects tunnel
+ decapsulation in the flow engine for such devices.
+
+ By default, the PMD will set this value to 1.
+
+.. _mlx5_firmware_config:
+
+Firmware configuration
+~~~~~~~~~~~~~~~~~~~~~~
+
+Firmware features can be configured as key/value pairs.
+
+The command to set a value is::
+
+ mlxconfig -d <device> set <key>=<value>
+
+The command to query a value is::
+
+ mlxconfig -d <device> query | grep <key>
+
+The device name for the command ``mlxconfig`` can be either the PCI address,
+or the mst device name found with::
+
+ mst status
+
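+For example, enabling DevX (the ``UCTX_EN`` key listed below) on a device at a
+hypothetical PCI address could look like::
+
+   mlxconfig -d 0000:03:00.0 set UCTX_EN=1
+
+Note that firmware configuration changes typically take effect only after the
+firmware is reset or the server is rebooted.
+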
+Some firmware configurations are listed below.
+
+- link type::
+
+ LINK_TYPE_P1
+ LINK_TYPE_P2
+ value: 1=Infiniband 2=Ethernet 3=VPI(auto-sense)
+
+- enable SR-IOV::
+
+ SRIOV_EN=1
+
+- maximum number of SR-IOV virtual functions::
+
+ NUM_OF_VFS=<max>
+
+- enable DevX (required by Direct Rules and other features)::
+
+ UCTX_EN=1
+
+- aggressive CQE zipping::
+
+ CQE_COMPRESSION=1
+
+- L3 VXLAN and VXLAN-GPE destination UDP port::
+
+ IP_OVER_VXLAN_EN=1
+ IP_OVER_VXLAN_PORT=<udp dport>
+
+- enable VXLAN-GPE tunnel flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=0
+ or
+ FLEX_PARSER_PROFILE_ENABLE=2
+
+- enable IP-in-IP tunnel flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=0
+
+- enable MPLS flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=1
+
+- enable ICMP/ICMP6 code/type fields matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=2
+
+- enable Geneve flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=0
+ or
+ FLEX_PARSER_PROFILE_ENABLE=1
+
+- enable GTP flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=3
+
+- enable eCPRI flow matching::
+
+ FLEX_PARSER_PROFILE_ENABLE=4
+ PROG_PARSE_GRAPH=1
+