Juraj Linkeš [Wed, 7 Jul 2021 13:25:40 +0000 (15:25 +0200)]
eal/arm: update CPU flags
There are two execution states on armv8 architecture, aarch64 and
aarch32. Add PLATFORM_STR for the latter and update RTE_ARCH_* flags
according to e9b97392640.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Juraj Linkeš [Wed, 7 Jul 2021 13:25:39 +0000 (15:25 +0200)]
net/virtio: fix aarch32 build
NEON vector path of the PMD needs aarch64 support. But it was
enabled for aarch32 build as well because aarch32 build had
cpu_family set to aarch64. So build for aarch32 will fail due
to unsupported intrinsics.
Fix aarch32 build by updating meson file to exclude NEON vector
implementation for aarch32.
Fixes: 749799482a72 ("net/virtio: add to meson build") Cc: stable@dpdk.org Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Ruifeng Wang [Wed, 7 Jul 2021 13:25:38 +0000 (15:25 +0200)]
net/bnxt: fix aarch32 build
NEON vector path of the PMD needs aarch64 support. But it was
enabled for aarch32 build as well because aarch32 build had
cpu_family set to aarch64. So build for aarch32 will fail due
to unsupported intrinsics.
Fix aarch32 build by updating meson file to exclude NEON vector
implementation for aarch32.
Fixes: 398358341419 ("net/bnxt: support NEON") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Ruifeng Wang [Wed, 7 Jul 2021 13:25:37 +0000 (15:25 +0200)]
net/sfc: fix aarch32 build
The sfc PMD was enabled for aarch32 which is 32-bit mode but has
cpu_family set to aarch64.
As sfc support only 64-bit system, it should be disabled for aarch32.
Updated meson file to disable sfc for aarch32 build.
Fixes: 141d2870675a ("net/sfc: support aarch64 architecture") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Ferruh Yigit [Thu, 24 Jun 2021 13:32:16 +0000 (14:32 +0100)]
kni: update link only on change
'rte_kni_update_link()' updates virtual KNI interface link using kernel
sysfs interface.
If the requested link status is same as interface link status, do not
update the link status but return with success.
Nick Connolly [Mon, 26 Apr 2021 10:07:32 +0000 (11:07 +0100)]
build: support drivers symlink on Windows
The symlink-drivers-solibs.sh script was disabled as part of 'install'
for Windows because there is no support for shell scripts. However,
this means that driver related DLLs are not present in the installed
'libdir' directory. Add a python script to perform the install and use
it for Windows if the version of meson supports using an external
program with add_install_script (>= 0.55.0).
On Windows, symbolic links are somewhat problematic since the
SeCreateSymbolicLinkPrivilege is required to be able to create them.
In addition, different cross-compilation environments handle symbolic
links differently, e.g. WSL, Msys2, Cygwin. Rather than trying to
distinguish these scenarios, the python script will perform a file copy
for any Windows specific names.
On Windows, the shared library outputs have different names depending
upon which toolset has been used to build them. The script currently
handles Clang and GCC.
On Linux the functionality is unchanged, but could be replaced with the
python script once the required minimum version of meson is >= 0.55.0.
Cc: stable@dpdk.org Signed-off-by: Nick Connolly <nick.connolly@mayadata.io> Tested-by: Narcisa Vasile <navasile@linux.microsoft.com> Acked-by: Narcisa Vasile <navasile@linux.microsoft.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
On arm platform, the value in "/sys/.../cpuinfo_cur_freq" may not
be exactly the same as what was set when using CPPC cpufreq driver.
For other cpufreq driver, no need to round it currently, or else
this check will fail with turbo enabled. For example, with acpi_cpufreq,
cpuinfo_cur_freq can be 2401000 which is equal to freqs[0].It should
not be rounded to 2400000.
Fixes: 606a234c6d360 ("test/power: round CPU frequency to check") Cc: stable@dpdk.org Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Acked-by: David Hunt <david.hunt@intel.com>
Currently in DPDK only acpi_cpufreq and pstate_cpufreq drivers are
supported, which are both not available on arm64 platforms. Add
support for cppc_cpufreq driver which works on most arm64 platforms.
Signed-off-by: Richael Zhuang <richael.zhuang@arm.com> Acked-by: David Hunt <david.hunt@intel.com>
Dmitry Kozlyuk [Wed, 30 Jun 2021 16:22:35 +0000 (19:22 +0300)]
doc: fix build on Windows with Meson 0.58
The `doc` target used `echo` as its command.
On Windows, `echo` is always a shell built-in, there is no binary.
Starting from meson 0.58, `run_target()` always searches for command
executable and no longer accepts `echo` as such on Windows.
Replace plain `echo` with a Python one-liner.
Fixes: d02a2dab2dfb ("doc: support building HTML guides with meson") Cc: stable@dpdk.org Reported-by: Rob Scheepens <rob.scheepens@nutanix.com> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Acked-by: Luca Boccassi <bluca@debian.org> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Juraj Linkeš [Tue, 6 Jul 2021 09:44:28 +0000 (11:44 +0200)]
build: use platform for generic and native builds
The current meson option 'machine' should only specify the ISA, which is
not sufficient for Arm, where setting ISA implies other settings as well
(and is used in Arm configuration as such).
Use the existing 'platform' meson option to differentiate the type of
the build (native/generic) and set ISA accordingly, unless the user
chooses to override it with a new option, 'cpu_instruction_set'.
The 'machine' option set the ISA in x86 builds and set native/default
'build type' in aarch64 builds. These two new variables, 'platform' and
'cpu_instruction_set', now properly set both ISA and build type for all
architectures in a uniform manner.
The 'machine' option also doesn't describe very well what it sets. The
new option, 'cpu_instruction_set', is much more descriptive. Keep
'machine' for backwards compatibility.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Anoob Joseph [Fri, 9 Jul 2021 06:28:43 +0000 (11:58 +0530)]
crypto/cnxk: add PCI ID for cn9k
Add PCI ID for crypo_cn9k PMD.
To avoid conflicting PCI ID in crypto_octeontx2 and crypto_cn9k PMDs,
disable crypto_cn9k PMD when built with octeontx2 config.
The lack of PCI ID is causing debug build to fail on Ubuntu 18.04
for crypto_cn9k PMD.
Reported-by: Ali Alnubani <alialnu@nvidia.com> Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Joyce Kong [Tue, 6 Jul 2021 06:54:03 +0000 (01:54 -0500)]
net/i40e: fix descriptor scan on Arm
For Arm platforms, reading descs can get re-ordered, then the
status of DD bits will be discontinuous, so add the logic to
only process continuous descs by checking DD bits.
Fixes: 4861cde46116 ("i40e: new poll mode driver") Cc: stable@dpdk.org Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Dapeng Yu [Wed, 16 Jun 2021 01:20:53 +0000 (09:20 +0800)]
net/ice: fix VXLAN flow director creation
In original implementation, error returned when creating VXLAN flow
director with SCTP or TCP as layer 4 protocol of inner segment.
There are several root causes for the error:
1. ice_fdir_input_set_hdrs() set ICE_FLOW_SEG_HDR_UDP into protocol
header flag of inner segment of VXLAN FDIR rule, even if it shall be
ICE_FLOW_SEG_HDR_TCP or ICE_FLOW_SEG_HDR_SCTP
2. ice_fdir_input_set_hdrs() set ICE_FLOW_SEG_HDR_VXLAN into protocol
header flag of segments of VXLAN FDIR rule, it not necessary, and can
be set automatically by ice_flow_set_fld() later
3. flow type: ICE_FLTR_PTYPE_NONF_IPV4_UDP_VXLAN hides the flow type of
inner segment of VXLAN FDIR rule, then further causes function:
ice_fdir_get_gen_prgm_pkt() cannot write correct protocol id into inner
segment of training packet.
This patch fixes those defects described above.
Fixes: 855d23a07b36 ("net/ice: support VXLAN VNI field in flow director") Cc: stable@dpdk.org Signed-off-by: Dapeng Yu <dapengx.yu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Dapeng Yu [Wed, 16 Jun 2021 01:20:52 +0000 (09:20 +0800)]
net/ice/base: fix VXLAN flow director creation
In original implementation, error returned when creating VXLAN flow
director with SCTP or TCP as layer 4 protocol of inner segment.
There are several root causes for the error:
1. ice_fdir_udp4_vxlan_pkt[] is not adapted to the TCP and SCTP protocol.
Its length cannot hold TCP header, only UDP protocol was supported in
original implementation
2. VXLAN VNI offset: 45 is inconsistent with IETF RFC 7348
This patch fixes those defects described above.
Fixes: 608cd0a5e283 ("net/ice/base: support VXLAN VNI field in flow director") Cc: stable@dpdk.org Signed-off-by: Dapeng Yu <dapengx.yu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
net/ice: support QoS bandwidth config after VF reset in DCF
When VF reset happens, the QoS bandwidth configuration will be lost. If
the reset is not caused by DCB change, it is supposed to replay the
bandwidth configuration to VF by DCF. In this patch, when a vsi update
PF event is received from PF after VF reset, and it is confirmed that
DCB is not changed, bandwidth configuration will be replayed.
net/iavf: simplify flow director rules for IP fragment
This patch simplify the pattern of flow rules of FDIR for IP fragment.
Flow rule can be created by the following command:
1. flow create 0 ingress pattern eth /
ipv4 fragment_offset spec 0x2000 fragment_offset mask 0x2000 /
end <actions>
2. flow create 0 ingress pattern eth / ipv6 /
ipv6_frag_ext fragment_offset spec 0x0001 fragment_offset mask 0x0001 /
end <actions>
net/ice: simplify flow director rules for IP fragment
This patch simplify the pattern of flow rules of FDIR for IP fragment.
Flow rule can be created by the following command:
1. flow create 0 ingress pattern eth /
ipv4 fragment_offset spec 0x2000 fragment_offset mask 0x2000 /
end <actions>
2. flow create 0 ingress pattern eth / ipv6 /
ipv6_frag_ext fragment_offset spec 0x0001 fragment_offset mask 0x0001 /
end <actions>
testpmd> port attach 0000:5e:00.0
Attaching a new port...
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: net_ice (8086:159b) device: 0000:5e:00.0 (socket 0)
ice_load_pkg(): failed to open file: /lib/firmware/intel/ice/ddp/ice.pkg
ice_dev_init(): Failed to load the DDP package,Use safe-mode-support=1 to
enter Safe Mode
EAL: Releasing PCI mapped resource for 0000:5e:00.0
EAL: Calling pci_unmap_resource for 0000:5e:00.0 at 0x2200000000
EAL: Calling pci_unmap_resource for 0000:5e:00.0 at 0x2202000000
EAL: Driver cannot attach the device (0000:5e:00.0)
EAL: Failed to attach device on primary process
testpmd: Failed to attach port 0000:5e:00.0
common/mlx5: fix compatibility with OFED port query API
The compilation flag HAVE_MLX5DV_DR_DEVX_PORT depends on presence
of mlx5dv_query_devx_port routine in rdma-core library.
The mlx5dv_query_devx_port routine exists only in OFED versions
of rdma-core library and is being planned to be removed and replaced
with Upstream compatible mlx5dv_query_port.
As mlx5dv_query_devx_port is being removed all the dependencies on
the HAVE_MLX5DV_DR_DEVX_PORT compilation flag are reconsidered.
The new compilation flag HAVE_MLX5DV_DR_CREATE_DEST_IB_PORT is for
backward compatibility with older OFED versions.
Fixes: 6cfe84fbe7b1 ("net/mlx5: fix port action for LAG") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
In order to get E-Switch vport identifiers the mlx5 PMD relies
on two approaches:
[a] use port query API if it is provided by rdma-core library
[b] otherwise, deduce vport ids from the related VF index
The latter is not reliable and may not work with newer kernel
drivers and in some configurations (LAG), causing E-Switch
malfunction. Hence, engaging the port query API is highly
desirable.
Depending on rdma-core version the port query API is:
- very old OFED versions have no query API (approach [b])
- rdma-core OFED < 5.5 provides mlx5dv_query_devx_port,
HAVE_MLX5DV_DR_DEVX_PORT flag is defined (approach [a])
- rdma-core OFED >= 5.5 has mlx5dv_query_port, flag
HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])
- future OFED versions might remove mlx5dv_query_devx_port
and HAVE_MLX5DV_DR_DEVX_PORT will not be defined
- Upstream rdma-core < v35 has no port query API (approach [b])
- Upstream rdma-core >= v35 has mlx5dv_query_port, flag
HAVE_MLX5DV_DR_DEVX_PORT_V35 is defined (approach [a])
In order to support the new mlx5dv_query_port routine, the
conditional compilation flag HAVE_MLX5DV_DR_DEVX_PORT_V35
is introduced by this patch. The flag HAVE_MLX5DV_DR_DEVX_PORT
is kept for compatibility with previous rdma-core versions.
Despite this patch is not a bugfix (it follows the introduced API
variation in underlying library), it resolves the compatibility
issue and is highly desired to be ported to DPDK LTS.
Jiawei Wang [Tue, 6 Jul 2021 08:12:27 +0000 (11:12 +0300)]
net/mlx5: control flow rules with identical pattern
In order to allow\disallow configuring rules with identical
patterns, the new device argument 'allow_duplicate_pattern'
is introduced.
If allow, these rules be inserted successfully and only the
first rule take affect.
If disallow, the first rule will be inserted and other rules
be rejected.
The default is to allow.
Set it to 0 if disallow, for example:
-a <PCI_BDF>,allow_duplicate_pattern=0
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
When creating hierarchy meter, its color rules will increase next
meter's reference count, so when destroy the hierarchy meter, also
need to dereference the next meter's count.
During flushing all meters of a port, need to destroy all hierarchy
meters and their policies first, to dereference the last meter in
hierarchy. Then all meters have no reference and can be destroyed.
When using meter hierarchy with multiple meters, every meter may have
drop counter, so a packet being set red color by one meter should be
counted to that specific meter only.
To support this, add tag action in the color rule so packet going to
next new meter can have its meter id, so as to be counted to the
correct drop counter in drop table.
This makes the meter policy support meter action. So multiple meters
can be chained as a meter hierarchy.
Only termination meter is allowed as the last meter in a hierarchy,
and there're two cases:
1. The last meter has non-RSS policy, can directly create sub-policy
and color rules during each meter's policy creation.
2. The last meter has RSS policy, don't create sub-policy/rules when
creating meter policy. Only when a RTE flow is using the meter hierarchy,
will iterate all meters of the hierarchy and create needed sub-
policies and color rules for them.
Xiaoyu Min [Fri, 2 Jul 2021 08:34:48 +0000 (16:34 +0800)]
net/mlx5: limit inner RSS expansion for MPLS
If user wants to do MPLS inner RSS and only provides pattern
till MPLS without inner items [1], RSS expansion will expand flows
into 13 sub-flows[2] which is too many and it impacts flow insert
rate, stack usage becomes large as well.
This expansion into 13 sub-flows seems not worthy of and it can
be significantly reduced (i.e, 7 sub-flows [3]) by user providing
at least one inner L2/L3 item [4].
[1]:
pattern eth / ipv4 / udp / mpls / end actions rss type tcp udp ip
end level 2 / end
[2]:
eth / ipv4 / udp / mpls
eth / ipv4 / udp / mpls / ipv4
eth / ipv4 / udp / mpls / ipv4 / udp
eth / ipv4 / udp / mpls / ipv4 / tcp
eth / ipv4 / udp / mpls / ipv6
eth / ipv4 / udp / mpls / ipv6 / udp
eth / ipv4 / udp / mpls / ipv6 / tcp
eth / ipv4 / udp / mpls / eth / ipv4
eth / ipv4 / udp / mpls / eth / ipv4 / udp
eth / ipv4 / udp / mpls / eth / ipv4 / tcp
eth / ipv4 / udp / mpls / eth / ipv6
eth / ipv4 / udp / mpls / eth / ipv6 / udp
eth / ipv4 / udp / mpls / eth / ipv6 / tcp
[3]:
eth / ipv4 / udp / mpls / eth
eth / ipv4 / udp / mpls / eth / ipv4 / udp
eth / ipv4 / udp / mpls / eth / ipv4 / tcp
eth / ipv4 / udp / mpls / eth / ipv6
eth / ipv4 / udp / mpls / eth / ipv6 / udp
eth / ipv4 / udp / mpls / eth / ipv6 / tcp
[4]:
pattern eth / ipv4 / udp / mpls / eth / end actions rss type tcp udp ip
level 2 / end
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
net/mlx5: fix offset calculation for modify field action
Offsets are not taken into account during MAC addresses
manipulation for the MODIFY_FIELD action. That leads to
a wrong split between 0-15 and 16-47 bits and corrupted
data being copied to/from MAC addresses. Use both source
and destination offsets to calcucate the proper modify
header action specification.
Fixes: fdd0c046f4 ("net/mlx5: fix modify field action order for MAC") Cc: stable@dpdk.org Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
MLX5 PMD supports L3 and L4 integrity bits.
L4 checksum-ok bit was not translated correctly.
The patch updates the l4_csum_ok integrity bit translation.
If there are many VFs the Netlink message length sent by kernel
in reply to RTM_GETLINK request can be large. We should query
the size of message being received in advance and allocate
the large enough buffer to handle these large messages.
Fixes: ccdcba53a3f4 ("net/mlx5: use Netlink to add/remove MAC addresses") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xiaoyu Min [Thu, 1 Jul 2021 05:54:56 +0000 (13:54 +0800)]
net/mlx5: fix match MPLS over GRE with key
Currently PMD needs previous layer information in order to set
corresponding match field for MPLSoGRE or MPLSoUDP.
GRE_KEY item is missing as supported previous layer when translate
item MPLS, which causes flow[1] cannot match MPLS over GRE traffic.
According to RFC4023, MPLS over GRE tunnel with optional key
field needs to be supported too.
By adding missing GRE_KEY as supported previous layer fix problem.
[1]:
flow create 0 ingress pattern eth / ipv6 / gre k_bit is 1 / gre_key /
mpls label is 966138 / end actions queue index 1 / mark id 0xa / end
Fixes: a7a0365565a4 ("net/mlx5: match GRE key and present bits") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Gregory Etelson [Wed, 30 Jun 2021 07:19:52 +0000 (10:19 +0300)]
net/mlx5: fix pattern expansion in RSS flow rules
Flow rule pattern may be implicitly expanded by the PMD if the rule
has RSS flow action. The expansion adds network headers to the
original pattern. The new pattern lists all network levels that
participate in the rule RSS action.
The patch validates that buffer for expanded pattern has enough bytes
for new flow items.
Haifei Luo [Mon, 31 May 2021 02:22:08 +0000 (05:22 +0300)]
net/mlx5: add more details to flow dump
Currently the flow dump provides few information about actions
- just the pointers. Add implementations to display details for
counter, modify_hdr and encap_decap actions.
For counter, the regular flow operation query is engaged and
the counter content information is provided, including hits
and bytes values.For modify_hdr, encap_and decap actions,
the information stored in the ipool objects is dumped.
There are the formats of information presented in the dump:
Counter: rec_type,id,hits,bytes
Modify_hdr: rec_type,id,actions_number,actions
Encap_decap: rec_type,id,buf
Signed-off-by: Haifei Luo <haifeil@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Currently when creating meter policy, a src port_id match item will
always be added in switch domain. So if one meter is used by another
port, it will not work correctly.
This issue is solved:
1. If policy fate action is port_id, add the src port_id match item,
and the meter cannot be shared by another port.
2. If policy fate action isn't port_id, don't add the src port_id
match, meter can be shared by another port.
This fix enables one meter being shared by different ports. User can
create a meter flow using a port_id match item to make this meter
shared by other port.
Fixes: afb4aa4f122 ("net/mlx5: support meter policy operations") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
The meter policy handlers are managed by user IDs and the driver used l3
table in order to map the user ID to the internal driver handler of the
policy.
The l3 table was wrongly saved in the shared device structure which
manages all the switch domain ports what made the user IDs shared
between different ethdev ports.
Move the policy l3 table to be per port by saving it in the port private
structure.
Fixes: afb4aa4f122 ("net/mlx5: support meter policy operations") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Dmitry Kozlyuk [Wed, 30 Jun 2021 07:01:06 +0000 (10:01 +0300)]
doc: add limitation for ConnectX-4 with L2 in mlx5 guide
ConnectX-4 and ConnectX-4 Lx NICs require all L2 headers of transmitted
packets to be inlined. By default only first 18 bytes are inlined,
which is insufficient if additional encapsulation is used, like Q-in-Q.
Thus, default settings caused such traffic to be dropepd on Tx.
Document a recommendation to increase inlined data size in such cases.
The inline data length for TSO ethernet segment should be
calculated from the TSO header instead of the inline size
configured by txq_inline_min devarg or reported by the NIC.
It is imposed by the nature of TSO offload - inline header
is being duplicated to every output TCP packet.
net/mlx5: fix multi-segment inline for the first segments
Before 19.08 release the Tx burst routines of mlx5 PMD
provided data inline for the first short segments of the
multi-segment packets. In the release 19.08 mlx5 Tx datapath
was refactored and this behavior was broken, affecting the
performance.
For example, the T-Rex traffic generator might use small
leading segments to handle packet headers and performance
degradation was noticed.
If the first segments of the multi-segment packet are short
and the overall length is below the inline threshold it
should be inline into the WQE to fix the performance.
Li Zhang [Wed, 23 Jun 2021 07:24:40 +0000 (10:24 +0300)]
net/mlx5: fix meter policy with RSS action
When creating the meter sub-policy RSS rule,
the RSS descriptor was used before its update.
It also need update tunnel bit in RSS descriptor
after flow translate.
Use it only when it is updated.
Fixes: ec962bad14e ("net/mlx5: fix metering cleanup on stop") Cc: stable@dpdk.org Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Ajit Khaparde [Tue, 25 May 2021 17:35:22 +0000 (10:35 -0700)]
net/bnxt: fix Rx interrupt setting
Don't set rxq interrupt config
Applications can set the rxq interrupt config to 1 or 0 as needed.
If an application is not interested in handling Rx interrupts and
prefers to poll Rx rings, there is no need for the PMD to set this
config option to 1.
Lance Richardson [Wed, 16 Jun 2021 17:55:21 +0000 (13:55 -0400)]
net/bnxt: fix Tx descriptor status implementation
With Tx completion batching, a single transmit completion
can correspond to one or more transmit descriptors, adjust
implementation to account for this.
RTE_ETH_TX_DESC_DONE should be returned for descriptors that
are available for use instead of RTE_ETH_TX_DESC_UNAVAIL.
Lance Richardson [Wed, 16 Jun 2021 17:55:20 +0000 (13:55 -0400)]
net/bnxt: fix ring and context memory allocation
Use requested socket ID when allocating memory for transmit rings,
receive rings, and completion queues. Use device NUMA ID when
allocating context memory, notification queue rings, async
completion queue rings, and VNIC attributes.
Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code") Fixes: 9738793f28ec ("net/bnxt: add VNIC functions and structs") Fixes: f8168ca0e690 ("net/bnxt: support thor controller") Fixes: bd0a14c99f65 ("net/bnxt: use dedicated CPR for async events") Fixes: 683e5cf79249 ("net/bnxt: use common NQ ring") Cc: stable@dpdk.org Signed-off-by: Lance Richardson <lance.richardson@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 9 Jun 2021 03:13:32 +0000 (08:43 +0530)]
net/bnxt: invoke device removal event on recovery failure
When the driver receives RESET_NOTIFY async event from FW or detects
a FW fatal error condition, it tries to recover from the error.
When the driver fails to recover from the error condition, fixed to
send device removal event to the application.
Kalesh AP [Wed, 9 Jun 2021 03:13:31 +0000 (08:43 +0530)]
net/bnxt: fix auto-negociation on Whitney+
Driver should enable autoneg on a port if FW supports it.
Because of a wrong check, driver is not enabling autoneg
on a port after setting forced speed on Whitney+.
Fixes: 7bc8e9a227cc ("net/bnxt: support async link notification") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 9 Jun 2021 03:13:30 +0000 (08:43 +0530)]
net/bnxt: fix typo in log message
In bnxt_rss_hash_update_op, check for valid RSS hashkey length is
made against size HW_HASH_KEY_SIZE(40). But the failure log says
"Invalid hashkey length, should be 16 bytes".
Kalesh AP [Wed, 9 Jun 2021 03:13:29 +0000 (08:43 +0530)]
net/bnxt: cleanup code
This is a cleanup commit and no functional change.
1. use macros instead of hard coded values
2. remove unnecessary comments
Fixes: 5cd0e2889c43 ("net/bnxt: support NIC Partitioning") Fixes: 2ba07b7dbd9d ("net/bnxt: set the hash key size") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 9 Jun 2021 02:45:15 +0000 (08:15 +0530)]
net/bnxt: dump SFP module info
Add support to fetch the SFP EEPROM settings from the firmware.
For SFP+ modules we will display 0xA0 page for status and 0xA2 page
for other information. For QSFP modules we will show the 0xA0 page.
Also identify the module types for QSFP28, QSFP, QSFP+ apart
from the SFP modules and return an error for 10GBase-T PHY.
Shahaji Bhosle [Sun, 30 May 2021 08:59:27 +0000 (14:29 +0530)]
net/bnxt: cleanup ULP parser and mapper
1. Disable accum_stats for Thor
2. Delete the generic port table for default flow
3. The packet mask to calculate the number of packets must be 28 bits.
4. Increase the WC TCAM entries to 512 per application and add 2
shared L2 context TCAM entries to match identifiers for flow
scaling
5. Ignore multiple critical resources in ULP flow database
6. Renamed conditional code update to function opcode.
7. Updated TRUFLOW debug logs to support the above changes.
8. As part of the HA cleanup, the shared session name now allows the user
to designate that the session uses the wc_tcam regions within the
shared session.
9. The CFA action pointer does not exist if there is no support for
VF representor, so no need to display the message for use case where
there is no support for VF representors.
10. Cleanup flow counter software accumulation.
11. When an application exits ungracefully, the HA code now
clears the appropriate shared WC region and sets the HA state.
12. Removal of unnecessary INFO message. The message is an indicator that
the ports are being removed from DPDK, but all cleanup has not
completed. Once the cleanup is completed, the timer will be stopped.
Add context in ULP for timers.
The alarm callback needs to have a valid context pointer when it is
invoked. The context could become invalid if the port goes down and
the callback is invoked before it is cancelled.
Mike Baucom [Sun, 30 May 2021 08:59:22 +0000 (14:29 +0530)]
net/bnxt: process resource lists before session open
Shared sessions require both named and unnamed resources to be requested
during a tf_open_session. ULP uses named resources for global resources
that are pre-allocated and remain through the life of the application.
Unnamed resources are generally per flow resources and allocated on
demand. The sum of both named and unnamed resources must be requested
when initializing the session. The ulp_init now processes both lists
prior to calling tf_open_session for both shared and regular sessions.
Signed-off-by: Mike Baucom <michael.baucom@broadcom.com> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Mike Baucom [Sun, 30 May 2021 08:59:19 +0000 (14:29 +0530)]
net/bnxt: add shared session support to ULP
Shared session permits cooperative sharing of prescribed resources
between applications.
- devargs added for app-id in order to enable sharing session
resources across applications
- shared session management added
- TRUFLOW resource reservations are now app ID and device dependent
1. Remove Ether type, VLAN type and IP proto type from pattern matching,
since the header bits can be used for matching. This reduces the class
template signatures by a factor of 8.
2. Remove the wild card bit in the pattern matching since same template
can be used for both exact and wild card entries.
3. The action record pointers have to use higher range to not collide
with the firmware action record pointers. Hence reduced the number of
action record pointers for Whitney platform.
4. The conditional update opcode provide functionality to reject flows
for instance reject flows that do not adhere to flow signature match.
5. Added check to not populate protocol specifications if the
protocol mask is null or zero.
1. Add templates to support Thor platform.
2. Flow counter manager is not enabled if no flow counters are
configured.
3. Mark database is not enabled if mark action is not supported.
4. Removed application to port default flow.
5. Add allocate and write for the global registry file.
6. Multiple default flow templates are combined to one.
7. Remove default loopback action record, this is required in order to
support multiple platforms.
8. Enable port table support in the generic table.
9. remove global template table in order to support multiple platforms.
10. Add support to get parent VNIC from port table database.
11. VF representor action mark is made optional since not all
configurations need representor support.
12. Add layer 4 ports to computational fields.
13. Update templates to support the above changes.
14. Add support for wildcard.
1. The internal and external exact match table resource types
is combined since the resource handle contains the encoded
type whether it is internal or external exact match entry.
2. When a flow doesn't hit the offloaded rules, the default action is
to send it to the kernel (L2 driver interface). In order to do that,
TRUFLOW must know the kernel interface's (PF's) default vnic id.
This patch fetches the PF's default vnic id from the dpdk core and
stores it in port database. It also stores the mac addr for the
future usage. Renamed compute field for layer 4 port enums.
Added support for port database opcode that can get port details
like mac address which can then be populated in the l2 context entry.
3. Both active and default bit set need to considered to check if a
specific flow type is enabled or not.
4. ulp mapper fetches the dpdk port id from the compute field index
BNXT_ULP_CF_IDX_DEV_PORT_ID which is used to get the interface’s
mac address eventually. However, the compute field array is not
populated with dpdk port id at the index BNXT_ULP_CF_IDX_DEV_PORT_ID.
The problem fixed by populating the compute field array correctly.
5. Some dpdk applications may accumulate the flow counters while some
may not. In cases where the application is accumulating the counters
the PMD need not do the accumulation itself and viceversa to report
the correct flow counters.
6. Pointer to bp is added to open session parms to support
shared session.
The templates are updated to enable the extended exact match
table support. As part of this change, the action record size of
the action has to be calculated dynamically so it can be included
in the match table.
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Venkat Duvvuru [Sun, 30 May 2021 08:59:13 +0000 (14:29 +0530)]
net/bnxt: support GRE flows
This patch does the following to support GRE flows:
1. RTE_FLOW_ITEM_TYPE_ANY & RTE_FLOW_ITEM_TYPE_GRE processing
2. Calculate the absolute function ID from the logical VF ID
passed as part of RTE_FLOW_ACTION_TYPE_VF action.
3. Move bnxt_get_bp API to bnxt_ethdev.c
The computational field is enabled for wild card pattern support.
The template checks the computational field to add a flow as wild
card entry or exact match entry.
1. The flow database opcode is updated to split the alloc push resource
item so it can be controlled using the control table.
2. The class and action match signatures are populated with pattern ids
that are matched against template pattern id to reject any unsupported
class and action combinations.
3. The flow DB opcode should be no op when accessing the
global registry identifiers.
4. The resource function for branch is changed to control so that it
is extended to perform flow database operations and not just branch
operations.
5. The conditional goto processing now supports negative numbers to
support looping of the mapper tables to support flow ranges and
also enable conditional fail goto to support failure path mapper
tables.
6. The field mapper opcode is updated to add all ones to fields
that support exact match.
7. Added key info and identifier list to whitney action templates
The whitney plus templates are updated to use the mapper infrastructure
changes.
8. The partition interface table configuration of the default
egress rule for the representor interface needs to use the
reserved parif interface that is specific to each
platform. The pipeline for the representor interface is broken
since incorrect parif configuration cause the miss path packets to
be dropped.
9. In the mapper table processing, if a failure condition is hit
due to invalid memory type then use the conditional goto failure
configuration instead of jumping to next table. This causes ipv6
exact match entry to be skipped. This patch fixes that issue.
net/bnxt: add conditional opcode and L4 port fields
The conditional field opcode provides capability to perform
changes to the field values specified by template to address
platform specific modifications. For instance, mirror id value
is modified before it is configured in the hardware.
The addition of L4 port compute fields enables support of
generic exact match rule that can support both TCP and UDP
flows with the same template.
The shared handle is set in the mapper params when generic resource
are created, this shall be used by application as a handle to the
shared resource like mirror handle.
The condition execute of the mapper tables have goto field that
defines the offset of the next table to be processed instead of
sequential processing of the tables, this improving the performance.
Also, modify key and mask field opcode processing
Conflict resolution feature allows rejection of flows based on
the previously added flows that conflict. For instance, a five
tuple flow is added and then you add a new flow with only 4 tuple
instead having same layer2 details then it will be rejected.
1. Added interface table specific opcode to process interface table
entry creation and reuse. This allows reuse of the interface table
entry for multiple flows. Changed the regfile apis to store
the data in big endian format.
2. The result blob creation being done in tcam, interface, index
tables are consolidate to a common method.
3. Added result blob processing for generic table write
4. Modified the index table opcode processing to support new opcodes.
5. The driver was setting key size that did not take into account
the word alignment.
6. The hard coded values for critical resource is replaced with
template defined values.
Venkat Duvvuru [Sun, 30 May 2021 08:59:02 +0000 (14:29 +0530)]
net/bnxt: modify VXLAN decap for multichannel mode
The driver is using physical port id as the index into
the tunnel inner flow table. However, this will not work in case
of multichannel mode where multiple physical functions are going
to share the same physical port id.
When tunnel inner flow offload request comes before tunnel
outer flow offload request, the driver caches the tunnel inner flow
details and programs it in the hardware after installing the tunnel
outer flow in the hardware. If more than one tunnel inner flow arrives
before tunnel outer flow is offloaded, the driver rejects any such
tunnel inner flow offload requests.
This patch fixes the above two problems by
1. Using dpdk port id as the index to store tunnel inner info.
2. Caching any number of tunnel inner flow offload requests that come
before offloading tunnel outer flow offload request
Added TCAM table specific opcode to process TCAM entry creation
and reuse. This change removes the TCAM cache mechanism and uses
the generic table mechanism for reuse of TCAM entries.
Mike Baucom [Sun, 30 May 2021 08:59:00 +0000 (14:29 +0530)]
net/bnxt: add conditional processing of templates
Conditional execution and rejection processing added for templates and
tables. This allows the mapper to skip tables and reject templates
based on the content without having to hard code rules.
Added support for mapper flow database opcode to enable
shared resources like mirror action. This allows mapper
to conditionally populate flow database based on template content.
Venkat Duvvuru [Sun, 30 May 2021 08:58:57 +0000 (14:28 +0530)]
net/bnxt: check FW capability to support TRUFLOW
Currently, a devarg (host-based-truflow) is passed while launching
the app to enable TRUFLOW feature. However, this mechanism adds
an extra step in enabling TRUFLOW. This doesn't give a seamless
experience when flow offloads has to work with FW that doesn't/does
support TRUFLOW feature. Also, it's likely that customers may not
want to use devarg to enable flow offloads.
This patch fixes it by checking for TRUFLOW feature support in
device's capabilities and configurations field of the hwrm_ver_get.