- E-Switch mirroring and modify.
- 21844 flow priorities for ingress or egress flow groups greater than 0 and for any transfer
flow group.
+- Flow metering, including meter policy API.
+- Flow integrity offload API.
+- Connection tracking.
+- Sub-Function representors.
Limitations
-----------
- Flow rules that have a VLAN pop offload command as one of their actions but
lack a match on VLAN as one of their items are not supported.
- - The command is not supported on egress traffic.
+ - The command is not supported on egress traffic in NIC mode.
-- VLAN push offload is not supported on ingress traffic.
+- VLAN push offload is not supported on ingress traffic in NIC mode.
- VLAN set PCP offload is not supported on existing headers.
- Sample flow:
- - Supports ``RTE_FLOW_ACTION_TYPE_SAMPLE`` action only within NIC Rx and E-Switch steering domain.
- - The E-Switch Sample flow must have the eswitch_manager VPORT destination (PF or ECPF) and no additional actions.
- - For ConnectX-5, the ``RTE_FLOW_ACTION_TYPE_SAMPLE`` is typically used as first action in the E-Switch egress flow if with header modify or encapsulation actions.
+ - Supports ``RTE_FLOW_ACTION_TYPE_SAMPLE`` action only within NIC Rx and
+ E-Switch steering domain.
+ - For E-Switch Sampling flow with sample ratio > 1, additional actions are not
+ supported in the sample actions list.
+ - For ConnectX-5, ``RTE_FLOW_ACTION_TYPE_SAMPLE`` is typically used as the
+ first action in the E-Switch egress flow when combined with header modify or
+ encapsulation actions.
+ - For NIC Rx flow, supports ``MARK``, ``COUNT``, ``QUEUE``, ``RSS`` in the
+ sample actions list.
+ - For E-Switch mirroring flow, supports ``RAW ENCAP``, ``Port ID``,
+ ``VXLAN ENCAP``, ``NVGRE ENCAP`` in the sample actions list.
- Modify Field flow:
- Supports the 'set' operation only for ``RTE_FLOW_ACTION_TYPE_MODIFY_FIELD`` action.
- Modification of an arbitrary place in a packet via the special ``RTE_FLOW_FIELD_START`` Field ID is not supported.
+ - Modification of the 802.1Q Tag, VXLAN Network ID or GENEVE Network ID is not supported.
- Encapsulation levels are not supported; only outermost header fields can be modified.
- Offsets must be 32-bit aligned and cannot skip past the boundary of a field.
- Hairpin between two ports supports only manual binding and explicit Tx flow mode. For single-port hairpin, all combinations of auto/manual binding and explicit/implicit Tx flow mode are supported.
- Hairpin in switchdev SR-IOV mode is not yet supported.
+- Meter:
+
+ - All the meter colors with drop action will be counted only by the global drop statistics.
+ - Green color is not supported with drop action.
+ - Yellow detection is not supported.
+ - Red color must be with drop action.
+ - Meter statistics are supported only for drop case.
+ - Meter yellow color detection is not supported.
+ - A meter action created with a pre-defined policy must be the last action in the flow, except in the single case where the policy actions are:
+ - green: NULL or END.
+ - yellow: NULL or END.
+ - red: DROP / END.
+ - The only supported meter policy actions are:
+ - green: QUEUE, RSS, PORT_ID, JUMP, MARK and SET_TAG.
+ - yellow: must be empty.
+ - red: must be DROP.
+ - Meter profile packet mode is supported.
+
+- Integrity:
+
+ - Integrity offload is enabled for the **ConnectX-6** family.
+ - Verification bits provided by the hardware are ``l3_ok``, ``ipv4_csum_ok``, ``l4_ok``, ``l4_csum_ok``.
+ - ``level`` value 0 references outer headers.
+ - Multiple integrity items are not supported in a single flow rule.
+ - Flow rule items supplied by the application must explicitly specify the network headers referred to by the integrity item.
+ For example, if the integrity item mask sets the ``l4_ok`` or ``l4_csum_ok`` bits, a reference to the L4 network header,
+ TCP or UDP, must be in the rule pattern as well::
+
+ flow create 0 ingress pattern integrity level is 0 value mask l3_ok value spec l3_ok / eth / ipv6 / end …
+ or
+ flow create 0 ingress pattern integrity level is 0 value mask l4_ok value spec 0 / eth / ipv4 proto is udp / end …
+
+- Connection tracking:
+
+ - Cannot co-exist with ASO meter or ASO age actions in a single flow rule.
+ - Flow rules insertion rate and memory consumption need more optimization.
+ - 256 ports maximum.
+ - 4M connections maximum.
+
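The meter policy limitations listed above can be read as a small validation
routine. The following Python sketch is illustrative only and is not part of
DPDK or the mlx5 PMD: the function name ``check_meter_policy`` and the
representation of actions as plain strings are assumptions made for this
example, and the "last action" exception for NULL/END policies is not modeled.

.. code-block:: python

   # Illustrative sketch only; not a DPDK API. Actions are plain strings.
   GREEN_ALLOWED = {"QUEUE", "RSS", "PORT_ID", "JUMP", "MARK", "SET_TAG"}


   def check_meter_policy(green, yellow, red):
       """Return a list of violations of the documented policy limitations."""
       errors = []
       # Green may only use the supported policy actions (DROP is not one).
       unsupported = [a for a in green if a not in GREEN_ALLOWED]
       if unsupported:
           errors.append("green: unsupported actions " + ", ".join(unsupported))
       # Yellow detection is not supported, so its action list must be empty.
       if yellow:
           errors.append("yellow: must be empty")
       # Red must carry exactly the DROP action.
       if list(red) != ["DROP"]:
           errors.append("red: must be DROP")
       return errors

For example, ``check_meter_policy(["MARK", "QUEUE"], [], ["DROP"])`` returns an
empty list, while placing any action in the yellow list reports a violation.
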
Statistics
----------
- ``representor`` parameter [list]
This parameter can be used to instantiate DPDK Ethernet devices from
- existing port (or VF) representors configured on the device.
+ existing port (PF, VF or SF) representors configured on the device.
It is a standard parameter whose format is described in
:ref:`ethernet_device_standard_device_arguments`.
- For instance, to probe port representors 0 through 2::
+ For instance, to probe VF port representors 0 through 2::
+
+ <PCI_BDF>,representor=vf[0-2]
+
+ To probe SF port representors 0 through 2::
+
+ <PCI_BDF>,representor=sf[0-2]
- representor=[0-2]
+ To probe VF port representors 0 through 2 on both PFs of a bonding device::
+
+ <Primary_PCI_BDF>,representor=pf[0,1]vf[0-2]
- ``max_dump_files_num`` parameter [int]
Enable switchdev mode
---------------------
-Switchdev mode is a mode in E-Switch, that binds between representor and VF.
-Representor is a port in DPDK that is connected to a VF in such a way
-that assuming there are no offload flows, each packet that is sent from the VF
-will be received by the corresponding representor. While each packet that is
-sent to a representor will be received by the VF.
+Switchdev mode is a mode in E-Switch, that binds between representor and VF or SF.
+Representor is a port in DPDK that is connected to a VF or SF in such a way
+that assuming there are no offload flows, each packet that is sent from the VF or SF
+will be received by the corresponding representor, while each packet that is
+sent to a representor will be received by the VF or SF.
This is very useful in case of SRIOV mode, where the first packet that is sent
-by the VF will be received by the DPDK application which will decide if this
+by the VF or SF will be received by the DPDK application which will decide if this
flow should be offloaded to the E-Switch. After offloading the flow, packets
-that the VF that are matching the flow will not be received any more by
+sent by the VF or SF that match the flow will no longer be received by
the DPDK application.
1. Enable SRIOV mode::
echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+Sub-Function representor
+------------------------
+
+A Sub-Function (SF) is a portion of the PCI device. An SF netdev has its own
+dedicated queues (txq, rxq) and supports E-Switch representation
+offload similar to existing PF and VF representors. An SF shares PCI-level
+resources with other SFs and/or with its parent PCI function.
+
+1. Configure SF feature::
+
+ mlxconfig -d <mst device> set PF_BAR2_SIZE=<0/1/2/3> PF_BAR2_ENABLE=1
+
+ Value of PF_BAR2_SIZE:
+
+ 0: 8 SFs
+ 1: 16 SFs
+ 2: 32 SFs
+ 3: 64 SFs
+
+2. Reset the FW::
+
+ mlxfwreset -d <mst device> reset
+
+3. Enable switchdev mode::
+
+ echo switchdev > /sys/class/net/<net device>/compat/devlink/mode
+
+4. Create SF::
+
+ mlnx-sf -d <PCI_BDF> -a create
+
+5. Probe SF representor::
+
+ testpmd> port attach <PCI_BDF>,representor=sf0,dv_flow_en=1
+
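The relationship between ``PF_BAR2_SIZE`` and the number of SFs in step 1, and
the device arguments used in step 5, can be sketched in Python. This is only an
illustration: the helper names are hypothetical, the doubling formula is
inferred from the four values listed above and may not hold for other values,
and the PCI address is a placeholder.

.. code-block:: python

   # Hypothetical helpers for illustration; not part of DPDK or the MLNX tools.

   def sf_count(pf_bar2_size):
       """Number of SFs for the documented PF_BAR2_SIZE values 0..3."""
       if pf_bar2_size not in (0, 1, 2, 3):
           raise ValueError("only PF_BAR2_SIZE values 0-3 are documented")
       return 8 << pf_bar2_size  # 0 -> 8, 1 -> 16, 2 -> 32, 3 -> 64


   def sf_representor_devargs(pci_bdf, first, last=None):
       """Build '<PCI_BDF>,representor=sfN' or '...=sf[N-M]' with dv_flow_en=1."""
       if last is None or last == first:
           rep = "sf%d" % first
       else:
           rep = "sf[%d-%d]" % (first, last)
       return "%s,representor=%s,dv_flow_en=1" % (pci_bdf, rep)


   print(sf_count(2))
   print(sf_representor_devargs("0000:03:00.0", 0))     # placeholder PCI address
   print(sf_representor_devargs("0000:03:00.0", 0, 2))

The resulting string is what would be passed to ``port attach`` in testpmd, or
to the ``-a`` EAL option when the representors are probed at startup.
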
Performance tuning
------------------
for better performance. For VMs, verify that the right CPU
and NUMA node are pinned according to the above. Run::
- lstopo-no-graphics
+ lstopo-no-graphics --merge
to identify the NUMA node to which the PCIe adapter is connected.
| | of_set_vlan_pcp / | | | | |
| | of_set_vlan_vid) | | | | |
+-----------------------+-----------------+-----------------+
+ | | VLAN | | DPDK 21.05 | | |
+ | | ingress and / | | OFED 5.3 | | N/A |
+ | | of_push_vlan / | | ConnectX-6 Dx | | |
+ +-----------------------+-----------------+-----------------+
+ | | VLAN | | DPDK 21.05 | | |
+ | | egress and / | | OFED 5.3 | | N/A |
+ | | of_pop_vlan / | | ConnectX-6 Dx | | |
+ +-----------------------+-----------------+-----------------+
| Encapsulation | | DPDK 19.05 | | DPDK 19.02 |
| (VXLAN / NVGRE / RAW) | | OFED 4.7-1 | | OFED 4.6 |
| | | rdma-core 24 | | rdma-core 23 |
| | | rdma-core 35 | | rdma-core 35 |
| | | ConnectX-5 | | ConnectX-5 |
+-----------------------+-----------------+-----------------+
+ | Connection tracking | | | | DPDK 21.05 |
+ | | | N/A | | OFED 5.3 |
+ | | | | | rdma-core 35 |
+ | | | | | ConnectX-6 Dx |
+ +-----------------------+-----------------+-----------------+
.. table:: Minimal SW/HW versions for shared action offload
:name: sact
| | | | | rdma-core 33 |
| | | | | ConnectX-5 |
+-----------------------+-----------------+-----------------+
- | Age | | DPDK 20.11 | | DPDK 20.11 |
- | | | OFED 5.2 | | OFED 5.2 |
- | | | rdma-core 32 | | rdma-core 32 |
- | | | ConnectX-6 Dx| | ConnectX-6 Dx |
+ | Age | | DPDK 20.11 | | DPDK 20.11 |
+ | | | OFED 5.2 | | OFED 5.2 |
+ | | | rdma-core 32 | | rdma-core 32 |
+ | | | ConnectX-6 Dx | | ConnectX-6 Dx |
+ +-----------------------+-----------------+-----------------+
+ | Count | | DPDK 21.05 | | DPDK 21.05 |
+ | | | OFED 4.6 | | OFED 4.6 |
+ | | | rdma-core 24 | | rdma-core 23 |
+ | | | ConnectX-5 | | ConnectX-5 |
+-----------------------+-----------------+-----------------+
Notes for metadata
#. Request huge pages::
- echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages/nr_hugepages
+ dpdk-hugepages.py --setup 2G
#. Start testpmd with basic parameters::
- testpmd -l 8-15 -n 4 -a 05:00.0 -a 05:00.1 -a 06:00.0 -a 06:00.1 -- --rxq=2 --txq=2 -i
+ dpdk-testpmd -l 8-15 -n 4 -a 05:00.0 -a 05:00.1 -a 06:00.0 -a 06:00.1 -- --rxq=2 --txq=2 -i
Example output::
.. code-block:: console
- testpmd> flow dump <port> <output_file>
+ To dump all flows:
+ testpmd> flow dump <port> all <output_file>
+ To dump one flow:
+ testpmd> flow dump <port> rule <rule_id> <output_file>
- call rte_flow_dev_dump api:
.. code-block:: console
- rte_flow_dev_dump(port, file, NULL);
+ rte_flow_dev_dump(port, flow, file, NULL);
#. Dump human-readable flows from raw file:
.. code-block:: console
- mlx_steering_dump.py -f <output_file>
+ mlx_steering_dump.py -f <output_file> -flowptr <flow_ptr>