dpdk.git
2 years agocommon/mlx5: update log for DevX general command failure
Gregory Etelson [Wed, 8 Jun 2022 11:58:25 +0000 (14:58 +0300)]
common/mlx5: update log for DevX general command failure

Application can fetch syndrome value after FW operation failure
starting from Mellanox OFED-5.6.
The patch updates log data issued after devx_general_cmd error.

Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years agonet/mlx5: support field modification in meter rules
Sean Zhang [Tue, 7 Jun 2022 11:19:00 +0000 (14:19 +0300)]
net/mlx5: support field modification in meter rules

This patch introduces MODIFY_FIELD action support in meter. User can
create meter policy with MODIFY_FIELD action in green/yellow action.

For example:

testpmd> add port meter policy 0 21 g_actions modify_field op set
dst_type ipv4_ecn src_type value src_value 3 width 2 / ...

Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agonet/mlx5: support modifying ECN field
Sean Zhang [Tue, 7 Jun 2022 11:18:59 +0000 (14:18 +0300)]
net/mlx5: support modifying ECN field

This patch is to support modify ECN field in IPv4/IPv6 header.

Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agocommon/mlx5: check ECN modification capability
Sean Zhang [Tue, 7 Jun 2022 11:18:58 +0000 (14:18 +0300)]
common/mlx5: check ECN modification capability

Flag outer_ip_ecn in header modify capabilities properties layout is
added in order to check if the firmware supports modification of ecn
field.

Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agonet/mlx5: support represented port item in flow rules
Sean Zhang [Tue, 7 Jun 2022 11:17:32 +0000 (14:17 +0300)]
net/mlx5: support represented port item in flow rules

Add support for represented_port item in pattern. And if the spec and mask
both are NULL, translate function will not add source vport to matcher.

For example, testpmd starts with PF, VF-rep0 and VF-rep1, below command
will redirect packets from VF0 and VF1 to wire:
testpmd> flow create 0 ingress transfer group 0 pattern eth /
represented_port / end actions represented_port ethdev_id is 0 / end

Signed-off-by: Sean Zhang <xiazhang@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agoeal/linux: allocate worker lcore stacks in hugepages
Don Wallwork [Thu, 23 Jun 2022 11:21:27 +0000 (07:21 -0400)]
eal/linux: allocate worker lcore stacks in hugepages

Add support for using hugepages for worker lcore stack memory. The
intent is to improve performance by reducing stack memory related TLB
misses and also by using memory local to the NUMA node of each lcore.

EAL option '--huge-worker-stack[=stack-size-in-kbytes]' is added to allow
the feature to be enabled at runtime. If the size is not specified,
the system pthread stack size will be used.

Signed-off-by: Don Wallwork <donw@xsightlabs.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2 years agoip_frag: fix build with GCC 12
Huichao Cai [Sat, 18 Jun 2022 14:09:40 +0000 (22:09 +0800)]
ip_frag: fix build with GCC 12

GCC 12 raises warnings on usage of rte_memcpy with IPv4 options handling
in fragments for both the ip_frag library and unit tests.

For example in the library:
In function ‘_mm256_storeu_si256’,
    inlined from ‘rte_mov32’ at
        ../lib/eal/x86/include/rte_memcpy.h:347:2,
    inlined from ‘rte_mov128’ at
        ../lib/eal/x86/include/rte_memcpy.h:369:2,
    inlined from ‘rte_memcpy_generic’
        at ../lib/eal/x86/include/rte_memcpy.h:445:4,
    inlined from ‘rte_memcpy’
        at ../lib/eal/x86/include/rte_memcpy.h:851:10,
    inlined from ‘__create_ipopt_frag_hdr’
        at ../lib/ip_frag/rte_ipv4_fragmentation.c:68:4,
    inlined from ‘rte_ipv4_fragment_packet’
        at ../lib/ip_frag/rte_ipv4_fragmentation.c:242:16:
/usr/lib/gcc/x86_64-redhat-linux/12/include/avxintrin.h:935:8: error:
    array subscript ‘__m256i_u[1]’ is partly outside array bounds of
    ‘uint8_t[60]’ {aka ‘unsigned char[60]’} [-Werror=array-bounds]
  935 |   *__P = __A;
      |   ~~~~~^~~~~
../lib/ip_frag/rte_ipv4_fragmentation.c: In function
    ‘rte_ipv4_fragment_packet’:
../lib/ip_frag/rte_ipv4_fragmentation.c:122:17: note: at offset [52, 60]
    into object ‘ipopt_frag_hdr’ of size 60
  122 |         uint8_t ipopt_frag_hdr[IPV4_HDR_MAX_LEN];
      |                 ^~~~~~~~~~~~~~

To resolve the compilation warning, replace the rte_memcpy with memcpy.

Fixes: b50a14a853aa ("ip_frag: add IPv4 options fragment")

Signed-off-by: Huichao Cai <chcchc88@163.com>
2 years agonet/qede: fix build with GCC 12
Stephen Hemminger [Tue, 7 Jun 2022 17:17:40 +0000 (10:17 -0700)]
net/qede: fix build with GCC 12

The x86 version of rte_memcpy can cause warnings. The driver does
not need to use rte_memcpy for everything. Standard memcpy is
just as fast and safer; the compiler and static analysis tools
treat memcpy specially.

Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agonet/ice/base: fix build with GCC 12
Wenxuan Wu [Thu, 23 Jun 2022 09:01:05 +0000 (17:01 +0800)]
net/ice/base: fix build with GCC 12

GCC 12 with -O2 flag would raise the following warning:
../drivers/net/ice/base/ice_switch.c:7220:61: error: writing 1 byte into a
region of size 0 [-Werror=stringop-overflow=]
 7220 |           buf[recps].content.lkup_indx[i + 1] = entry->fv_idx[i];
      |           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~

This patch changed the type of fv_idx in struct ice_recp_grp_entry to
align with its callers which are also u8 type.

Fixes: 04b8ec1ea807 ("net/ice/base: add protocol structures and defines")
Cc: stable@dpdk.org
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: add basic NEON Rx
Kathleen Capella [Fri, 17 Jun 2022 18:21:34 +0000 (18:21 +0000)]
net/iavf: add basic NEON Rx

This patch adds the basic NEON Rx path to the iavf driver. It does not
include scatter or flex varieties.

Tested on N1SDP platform with Intel XL710 NIC and 40G connection.
Tested with a single core and testpmd rxonly mode. Saw no significant
performance difference between scalar and Arm vPMD paths using this test
in iavf and saw the same results when comparing scalar and Arm vPMD
path in i40e.

Signed-off-by: Kathleen Capella <kathleen.capella@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/i40e: add outer VLAN processing
Robin Zhang [Fri, 10 Jun 2022 16:29:44 +0000 (16:29 +0000)]
net/i40e: add outer VLAN processing

Outer VLAN processing is supported after firmware v8.4, kernel driver
also change the default behavior to support this feature. To align with
kernel driver, add support for outer VLAN processing in DPDK.

But it is forbidden for firmware to change the Inner/Outer VLAN
configuration while there are MAC/VLAN filters in the switch table.
Therefore, we need to clear the MAC table before setting config,
and then restore the MAC table after setting.

This will not impact on an old firmware.

Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Yuying Zhang <yuying.zhang@intel.com>
2 years agonet/ice: add DDP runtime configuration dump
Steve Yang [Fri, 10 Jun 2022 01:14:26 +0000 (01:14 +0000)]
net/ice: add DDP runtime configuration dump

Dump DDP runtime configure into a binary (package) file from ice PF port.

Add command line:
    ddp dump <port_id> <config_path>

Parameters:
    <port_id>       the PF Port ID
    <config_path>   dumped runtime configure file, if not a absolute path,
                    it will be dumped to testpmd running directory.

For example:
testpmd> ddp dump 0 current.pkg

If you want to dump ice VF DDP runtime configure, you need bind other
unused PF port of the NIC first, and then dump the PF's runtime configure
as target output.

Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: fix race condition in Rx timestamp
Simei Su [Wed, 8 Jun 2022 02:46:01 +0000 (10:46 +0800)]
net/ice: fix race condition in Rx timestamp

In multi-cores cases for Rx timestamp offload, to avoid phc time being
frequently overwritten, move related variables from ice_adapter to
ice_rx_queue structure, and each queue will handle timestamp calculation
by itself.

Fixes: 953e74e6b73a ("net/ice: enable Rx timestamp on flex descriptor")
Fixes: 5543827fc6df ("net/ice: improve performance of Rx timestamp offload")
Cc: stable@dpdk.org
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/qede: fix build with GCC 13
Ferruh Yigit [Thu, 16 Jun 2022 17:02:09 +0000 (18:02 +0100)]
net/qede: fix build with GCC 13

Reproduced with "gcc (GCC) 13.0.0 20220616 (experimental)"

Build error:
In file included from ../drivers/net/qede/qede_debug.c:9:
../drivers/net/qede/qede_debug.c: In function ‘qed_grc_dump_addr_range’:
../drivers/net/qede/base/ecore.h:95:17:
warning: overflow in conversion from ‘int’ to ‘u8’
{aka ‘unsigned char’} changes value from ‘(int)vf_id << 8 | 128’
to ‘128’ [-Woverflow]
   95 |                 ((_value & _name##_MASK) << _name##_SHIFT)
      |                 ^
../drivers/net/qede/qede_debug.c:1907:31:
note: in expansion of macro ‘FIELD_VALUE’
 1907 |         fid = FIELD_VALUE(PXP_PRETEND_CONCRETE_FID_VFVALID, 1)
      |               ^~~~~~~~~~~

To prevent overflow converting 'fib' to uint16_t,
while updating it also updated 'vf_id' to 16 bit too.

Fixes: ec55c118792b ("net/qede: add infrastructure for debug data collection")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
Acked-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
2 years agonet/cnxk: add SDP VF device IDs
Radha Mohan Chintakuntla [Thu, 16 Jun 2022 09:24:19 +0000 (14:54 +0530)]
net/cnxk: add SDP VF device IDs

Add SDP VF device ID in the table for probe matching.

Signed-off-by: Radha Mohan Chintakuntla <radhac@marvell.com>
2 years agonet/cnxk: resize CQ for Rx security for errata
Nithin Dabilpuram [Thu, 16 Jun 2022 09:24:18 +0000 (14:54 +0530)]
net/cnxk: resize CQ for Rx security for errata

Resize CQ for Rx security offload in case of HW errata.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agonet/cnxk: fix PFC class disabling
Harman Kalra [Thu, 16 Jun 2022 09:24:17 +0000 (14:54 +0530)]
net/cnxk: fix PFC class disabling

Disabling a specific PFC class on a SQ is resulting in disabling PFC
on the entire port.

Fixes: 9544713564f5 ("net/cnxk: support priority flow control")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agonet/cnxk: remove restriction on VF for PFC config
Sunil Kumar Kori [Thu, 16 Jun 2022 09:24:16 +0000 (14:54 +0530)]
net/cnxk: remove restriction on VF for PFC config

Currently PFC configuration is not allowed on VFs.
Patch enables PFC configuration on VFs

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
2 years agonet/cnxk: add SDP link status
Satananda Burla [Thu, 16 Jun 2022 09:24:15 +0000 (14:54 +0530)]
net/cnxk: add SDP link status

Add SDP link status reporting

Signed-off-by: Satananda Burla <sburla@marvell.com>
2 years agocommon/cnxk: fix mbox structs to avoid unaligned access
Nithin Dabilpuram [Thu, 16 Jun 2022 09:24:14 +0000 (14:54 +0530)]
common/cnxk: fix mbox structs to avoid unaligned access

Fix mbox structs to avoid unaligned access as mbox
memory is from BAR space.

Fixes: 503b82de2cbf ("common/cnxk: add mbox request and response definitions")
Fixes: e746aec161cc ("common/cnxk: fix SQ flush sequence")
Cc: stable@dpdk.org
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agocommon/cnxk: enhance CPT parsing header dump
Nithin Dabilpuram [Thu, 16 Jun 2022 09:24:13 +0000 (14:54 +0530)]
common/cnxk: enhance CPT parsing header dump

Enhance CPT parse header dump to dump fragment info
and swap pointers before printing.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agocommon/cnxk: support same TC value across multiple queues
Harman Kalra [Thu, 16 Jun 2022 09:24:12 +0000 (14:54 +0530)]
common/cnxk: support same TC value across multiple queues

User may want to configure same TC value across multiple queues, but
for that all queues should have a common TL3 where this TC value will
get configured.

Changed the pfc_tc_cq_map/pfc_tc_sq_map array indexing to qid and store
TC values in the array. As multiple queues may have same TC value.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agocommon/cnxk: add PFC support for VF
Sunil Kumar Kori [Thu, 16 Jun 2022 09:24:11 +0000 (14:54 +0530)]
common/cnxk: add PFC support for VF

Current PFC implementation does not support VFs.
This patch enables PFC on VFs too.

Also fix the config of aura.bp to be based on number
of buffers(aura.limit) and corresponding shift
value(aura.shift).

Fixes: cb4bfd6e7bdf ("event/cnxk: support Rx adapter")
Cc: stable@dpdk.org
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
2 years agocommon/cnxk: avoid CPT backpressure due to errata
Nithin Dabilpuram [Thu, 16 Jun 2022 09:24:10 +0000 (14:54 +0530)]
common/cnxk: avoid CPT backpressure due to errata

Avoid enabling CPT backpressure due to errata where
backpressure would block requests from even other
CPT LF's. Also allow CQ size >=16K.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agocommon/cnxk: use computed value for WQE skip
Nithin Dabilpuram [Thu, 16 Jun 2022 09:24:09 +0000 (14:54 +0530)]
common/cnxk: use computed value for WQE skip

Use computed value for WQE skip instead of a hard-coded value.
WQE skip needs to be number of 128B lines to accommodate rte_mbuf.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years agocommon/cnxk: add include for macro definition
Bruce Richardson [Wed, 15 Jun 2022 17:10:13 +0000 (18:10 +0100)]
common/cnxk: add include for macro definition

The header file "roc_io.h" uses the "__plt_always_inline" macro but
don't include "roc_platform.h" to get the definition of it. This
inclusion is not necessary for compilation, but the lack of it can
confuse some indexers - such as those in eclipse, which reports the
lines:

"static __plt_always_inline uint64_t"

as possible definitions of a variable called "uint64_t". This confusion
leads to uint64_t being flagged as an unknown type in all other parts of
the project being indexed, e.g. across all of DPDK code.

Adding in the include of roc_platform.h makes it clear to the indexer
that those lines are  part of a function definition, and that allows
eclipse to correctly recognise uint64_t as a type from stdint.h

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years agocommon/cnxk: add ROC API to free MCAM entry
Satheesh Paul [Wed, 15 Jun 2022 13:57:05 +0000 (19:27 +0530)]
common/cnxk: add ROC API to free MCAM entry

Add ROC API to free the given MCAM entry. If the MCAM
entry has flow counter associated, this API will clear
and free the flow counter.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2 years agocommon/cnxk: support CNF10KB SoC
Harman Kalra [Mon, 13 Jun 2022 11:45:17 +0000 (17:15 +0530)]
common/cnxk: support CNF10KB SoC

Support for CNF10KB SoC by adding its PCI device ID.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agocommon/cnxk: fix CN103XX subsystem device ID
Rahul Bhansali [Mon, 13 Jun 2022 11:29:39 +0000 (16:59 +0530)]
common/cnxk: fix CN103XX subsystem device ID

Fix the subsystem device ID for CN103XX.

Fixes: dd462f68f04a ("common/cnxk: support CN103XX platform")
Cc: stable@dpdk.org
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
2 years agocommon/cnxk: update extra stats for inline device
Rakesh Kudurumalla [Mon, 13 Jun 2022 09:50:04 +0000 (15:20 +0530)]
common/cnxk: update extra stats for inline device

Inline device's NIX RX and RQ stats are updated
on ethdev extra stats

Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
2 years agomempool/cnxk: support optional wait when counting
Ashwin Sekhar T K [Fri, 10 Jun 2022 16:07:14 +0000 (21:37 +0530)]
mempool/cnxk: support optional wait when counting

When counting the batch allocated pointers in cnxk mempool driver,
currently it always waits for in-flight batch operations to finish.
Add a provision to make this waiting optional.

Signed-off-by: Ashwin Sekhar T K <asekhar@marvell.com>
2 years agocommon/cnxk: handle ROC model init failure
Hanumanth Pothula [Fri, 10 Jun 2022 08:14:14 +0000 (13:44 +0530)]
common/cnxk: handle ROC model init failure

Return with error on fail to initialize ROC model.

Fixes: 014a9e222bac ("common/cnxk: add model init and IO handling API")
Cc: stable@dpdk.org
Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
2 years agocommon/cnxk: print NIX inline outbound CPT LF registers
Rahul Bhansali [Fri, 20 May 2022 05:22:30 +0000 (10:52 +0530)]
common/cnxk: print NIX inline outbound CPT LF registers

This add the support to dump NIX inline outbound CPT LF
registers.

Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agocommon/cnxk: fix decrypt packet count register update
Rahul Bhansali [Fri, 20 May 2022 05:22:29 +0000 (10:52 +0530)]
common/cnxk: fix decrypt packet count register update

Corrects the CPT decrypt packet counter register.

Fixes: b1a22e5d4f ("common/cnxk: add CPT diagnostics")
Cc: stable@dpdk.org
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com>
2 years agodoc: add platform option in cnxk native build
Jerin Jacob [Wed, 18 May 2022 15:03:22 +0000 (20:33 +0530)]
doc: add platform option in cnxk native build

Update cnxk platform documentation to use
-Dplatform meson option for native builds.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
2 years agocommon/cnxk: support dumping flow MCAM entry data
Satheesh Paul [Tue, 17 May 2022 04:04:08 +0000 (09:34 +0530)]
common/cnxk: support dumping flow MCAM entry data

When dumping flow data, read hardware MCAM entry corresponding
to the flow and print that data also.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
2 years agonet/cnxk: fix crash in IPsec telemetry
David Marchand [Thu, 19 May 2022 12:21:51 +0000 (14:21 +0200)]
net/cnxk: fix crash in IPsec telemetry

Calling this telemetry callback with no argument caused a crash.

Fixes: 41cc645c214f ("net/cnxk: add inline IPsec telemetry for CN9K")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
2 years agocommon/cnxk: add macros to platform layer
Srikanth Yalavarthi [Mon, 16 May 2022 17:26:56 +0000 (10:26 -0700)]
common/cnxk: add macros to platform layer

Added new platform layer macros for pointer operations,
bitwise operations, spinlock and 32 bit read and write.

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years agocommon/cnxk: fix channel number setting in MCAM entries
Satheesh Paul [Mon, 2 May 2022 08:47:30 +0000 (14:17 +0530)]
common/cnxk: fix channel number setting in MCAM entries

Adding changes to accommodate the following requirements
while masking the channel number.
1. For CN10K device, channel number should not be masked
   for first pass rules with RTE_FLOW_ACTION_TYPE_SECURITY
   action. And channel number should be masked for all
   other flow rules.
2. For CN9K device channel number should not be masked.

Fixes: 4968b362b639 ("common/cnxk: support CPT second pass flow rules")
Cc: stable@dpdk.org
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
2 years agonet/thunderx: populate max and min MTU values
Hanumanth Pothula [Tue, 24 May 2022 08:42:35 +0000 (14:12 +0530)]
net/thunderx: populate max and min MTU values

Populate maximum and minimum MTU values while retrieving
device information.

Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
2 years agonet/thunderx: support attaching from secondary
Harman Kalra [Tue, 24 May 2022 08:42:34 +0000 (14:12 +0530)]
net/thunderx: support attaching from secondary

Adding support for device hotplugging - attach and detach from
secondary

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/octeontx: support allmulticast
Harman Kalra [Tue, 24 May 2022 08:42:33 +0000 (14:12 +0530)]
net/octeontx: support allmulticast

Implement allmulticast operations for octeontx driver:
rte_eth_allmulticast_enable()/rte_eth_allmulticast_disable().

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/octeontx: support xstats
Harman Kalra [Tue, 24 May 2022 08:42:32 +0000 (14:12 +0530)]
net/octeontx: support xstats

Adding support for xstats eth operations.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/thunderx: support setting link attributes
Harman Kalra [Tue, 24 May 2022 08:42:31 +0000 (14:12 +0530)]
net/thunderx: support setting link attributes

Adding support to configure link attributes like speed,
duplex, negotiation.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/thunderx: reset Rx DMAC control register
Hanumanth Pothula [Tue, 24 May 2022 08:42:30 +0000 (14:12 +0530)]
net/thunderx: reset Rx DMAC control register

During initialization, reset RX DMAC control register by
sending mbox message NIC_MBOX_MSG_RESET_XCAST to PF.

Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
2 years agonet/thunderx: support polling of link state change
Hanumanth Pothula [Tue, 24 May 2022 08:42:29 +0000 (14:12 +0530)]
net/thunderx: support polling of link state change

Moving the logic of link polling to VF from PF. Now VF
is supposed to poll for the link status, rather PF alerting
VF about any link change.

Signed-off-by: Hanumanth Pothula <hpothula@marvell.com>
2 years agonet/octeontx: handle port reconfiguration
Harman Kalra [Tue, 24 May 2022 08:42:28 +0000 (14:12 +0530)]
net/octeontx: handle port reconfiguration

Adding support for port reconfiguration as user may require to
do so on a running system.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/octeontx: support setting link attributes
Harman Kalra [Tue, 24 May 2022 08:42:27 +0000 (14:12 +0530)]
net/octeontx: support setting link attributes

Adding support to configure link attributes like speed,
duplex, negotiation.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agonet/octeontx: fix port close
Harman Kalra [Tue, 24 May 2022 08:42:26 +0000 (14:12 +0530)]
net/octeontx: fix port close

Segmentation fault has been observed while closing the ethernet
port. Reason for the segfault is, eth port close also shuts down
event device while other ethernet port is still using the event
device.

Fixes: da6c687471a3 ("net/octeontx: add start and stop support")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
2 years agoci: enable C++ check for Arm and PPC
Stanislaw Kardach [Tue, 21 Jun 2022 12:28:24 +0000 (14:28 +0200)]
ci: enable C++ check for Arm and PPC

The crossbuild-essential-<arch> packages contain all necessary
dependencies to cross-compile binaries for a given architecture
including C and C++ compilers. Therefore use those instead of listing
packages directly. This way C++ compiler is also installed and C++
include checks will be checked in CI for ARM and PowerPC.

Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years agoconfig: fix C++ cross compiler for Arm and PPC
Stanislaw Kardach [Tue, 21 Jun 2022 12:28:23 +0000 (14:28 +0200)]
config: fix C++ cross compiler for Arm and PPC

Through some mixup all cross-files for ARM and PowerPC platforms were
using C Preprocessor (cpp) instead of GCC (g++).
This caused meson to fail detecting the C++ compiler presence and
therefore disabling some targets (i.e. C++ include file checks).

Fixes: e53a5299d219 ("build: support vendor specific ARM cross builds")
Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
2 years agomalloc: fix allocation of almost hugepage size
Fidaullah Noonari [Wed, 25 May 2022 05:18:37 +0000 (10:18 +0500)]
malloc: fix allocation of almost hugepage size

If called to allocate memory of size is between multiple of hugepage
size minus malloc_header_len and hugepage size, rte_malloc fails.

This fix replaces malloc_elem_trailer_len with malloc_elem_overhead in
try_expand_heap() to include malloc_elem_header_len when calculating
n_seg.

Bugzilla ID: 800
Fixes: 07dcbfe0101f ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Fidaullah Noonari <fidaullah.noonari@emumba.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2 years agoeal/unix: make stack dump signal safe
Stephen Hemminger [Thu, 14 Apr 2022 20:19:40 +0000 (13:19 -0700)]
eal/unix: make stack dump signal safe

rte_dump_stack() needs to be usable in situations when a bug is
encountered and from signal handlers (such as SEGV).

Glibc backtrace_symbols() calls malloc which makes it
dangerous in a signal handler that is handling errors that maybe
due to memory corruption. Additionally, rte_log() is unsafe because
syslog() is not signal safe; printf() is also documented as
not being safe.

This version formats message and uses writev for each line in a manner
similar to what glibc version of backtrace_symbols_fd() does. The
FreeBSD version of backtrace_symbols_fd() is not signal safe.

Sample output:

0: ./build/app/dpdk-testpmd (rte_dump_stack+0x2b) [560a6e9c002b]
1: ./build/app/dpdk-testpmd (main+0xad) [560a6decd5ad]
2: /lib/x86_64-linux-gnu/libc.so.6 (__libc_start_main+0xcd) [7fd43d3e27fd]
3: ./build/app/dpdk-testpmd (_start+0x2a) [560a6e83628a]

Bugzilla ID: 929

Acked-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years agovhost/crypto: fix descriptor processing
David Marchand [Wed, 22 Jun 2022 15:30:20 +0000 (17:30 +0200)]
vhost/crypto: fix descriptor processing

copy_data was returning a pointer to an increased (off by one) descriptor.
Subsequent calls to copy_data in the library were then failing.
Fix this by incrementing the descriptor only if there is some left data
to copy.

Fixes: 4414bb67010d ("vhost/crypto: fix build with GCC 12")
Cc: stable@dpdk.org
Reported-by: Jakub Poczatek <jakub.poczatek@intel.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Jakub Poczatek <jakub.poczatek@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
2 years agonet/virtio: unmap PCI device in secondary process
Yuan Wang [Mon, 6 Jun 2022 15:55:43 +0000 (23:55 +0800)]
net/virtio: unmap PCI device in secondary process

In multi-process, the secondary process will remap PCI during
initialization, but the mapping is not removed in the uninit path,
the device is not closed, and the device busy error will be reported
when the device is hotplugged.

This patch unmaps PCI device at secondary process uninitialization
based on virtio_rempa_pci.

Fixes: 36a7a2e7a53f ("net/virtio: move PCI device init in dedicated file")
Cc: stable@dpdk.org
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years agovhost/crypto: fix build with GCC 12
David Marchand [Thu, 16 Jun 2022 14:46:50 +0000 (16:46 +0200)]
vhost/crypto: fix build with GCC 12

GCC 12 raises the following warning:

In file included from ../lib/mempool/rte_mempool.h:46,
                 from ../lib/mbuf/rte_mbuf.h:38,
                 from ../lib/vhost/vhost_crypto.c:7:
../lib/vhost/vhost_crypto.c: In function ‘rte_vhost_crypto_fetch_requests’:
../lib/eal/x86/include/rte_memcpy.h:371:9: warning: array subscript 1 is
     outside array bounds of ‘struct virtio_crypto_op_data_req[1]’
     [-Warray-bounds]
  371 | rte_mov32((uint8_t *)dst + 3 * 32, (const uint8_t *)src + 3 * 32);
      | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../lib/vhost/vhost_crypto.c:1178:42: note: while referencing ‘req’
 1178 |         struct virtio_crypto_op_data_req req;
      |                                          ^~~

Split this function and separate the per descriptor copy.
This makes the code clearer, and the compiler happier.

Note: logs for errors have been moved to callers to avoid duplicates.

Fixes: 3c79609fda7c ("vhost/crypto: handle virtually non-contiguous buffers")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: prepare virtqueue resource creation
Li Zhang [Sat, 18 Jun 2022 09:02:58 +0000 (12:02 +0300)]
vdpa/mlx5: prepare virtqueue resource creation

Split the virtqs virt-queue resource between
the configuration threads.
Also need pre-created virt-queue resource
after virtq destruction.
This accelerates the LM process and reduces its time by 30%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add virtq sub-resources creation
Li Zhang [Sat, 18 Jun 2022 09:02:57 +0000 (12:02 +0300)]
vdpa/mlx5: add virtq sub-resources creation

pre-created virt-queue sub-resource in device probe stage
and then modify virtqueue in device config stage.
Steer table also need to support dummy virt-queue.
This accelerates the LM process and reduces its time by 40%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add device close task
Li Zhang [Sat, 18 Jun 2022 09:02:56 +0000 (12:02 +0300)]
vdpa/mlx5: add device close task

Split the virtqs device close tasks after
stopping virt-queue between the configuration threads.
This accelerates the LM process and
reduces its time by 50%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add virtq live migration log task
Li Zhang [Sat, 18 Jun 2022 09:02:55 +0000 (12:02 +0300)]
vdpa/mlx5: add virtq live migration log task

Split the virtqs LM log between the configuration threads.
This accelerates the LM process and reduces its time by 20%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add virtq creation task
Li Zhang [Sat, 18 Jun 2022 09:02:54 +0000 (12:02 +0300)]
vdpa/mlx5: add virtq creation task

The virtq object and all its sub-resources use a lot of
FW commands and can be accelerated by the MT management.
Split the virtqs creation between the configuration threads.
This accelerates the LM process and reduces its time by 20%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add VM memory registration task
Li Zhang [Sat, 18 Jun 2022 09:02:53 +0000 (12:02 +0300)]
vdpa/mlx5: add VM memory registration task

The driver creates a direct MR object of
the HW for each VM memory region,
which maps the VM physical address to
the actual physical address.

Later, after all the MRs are ready,
the driver creates an indirect MR to group all the direct MRs
into one virtual space from the HW perspective.

Create direct MRs in parallel using the MT mechanism.
After completion, the primary thread creates the indirect MR
needed for the following virtqs configurations.

This optimization accelerrate the LM process and
reduce its time by 5%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add task ring for multi-thread management
Li Zhang [Sat, 18 Jun 2022 09:02:52 +0000 (12:02 +0300)]
vdpa/mlx5: add task ring for multi-thread management

The configuration threads tasks need a container to
support multiple tasks assigned to a thread in parallel.
Use rte_ring container per thread to manage
the thread tasks without locks.
The caller thread from the user context opens a task to
a thread and enqueue it to the thread ring.
The thread polls its ring and dequeue tasks.
That’s why the ring should be in multi-producer
and single consumer mode.
Anatomic counter manages the tasks completion notification.
The threads report errors to the caller by
a dedicated error counter per task.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: add multi-thread management for configuration
Li Zhang [Sat, 18 Jun 2022 09:02:51 +0000 (12:02 +0300)]
vdpa/mlx5: add multi-thread management for configuration

The LM process includes a lot of objects creations and
destructions in the source and the destination servers.
As much as LM time increases, the packet drop of the VM increases.
To improve LM time need to parallel the configurations for mlx5 FW.
Add internal multi-thread management in the driver for it.

A new devarg defines the number of threads and their CPU.
The management is shared between all the devices of the driver.
Since the event_core also affects the datapath events thread,
reduce the priority of the datapath event thread to
allow fast configuration of the devices doing the LM.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: optimize datapath-control synchronization
Li Zhang [Sat, 18 Jun 2022 09:02:50 +0000 (12:02 +0300)]
vdpa/mlx5: optimize datapath-control synchronization

The driver used a single global lock for any synchronization
needed for the datapath and control path.
It is better to group the critical sections with
the other ones that should be synchronized.

Replace the global lock with the following locks:

1.virtq locks(per virtq) synchronize datapath polling and
  parallel configurations on the same virtq.
2.A doorbell lock synchronizes doorbell update,
  which is shared for all the virtqs in the device.
3.A steering lock for the shared steering objects updates.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: pre-create virtq at probing time
Li Zhang [Sat, 18 Jun 2022 09:02:49 +0000 (12:02 +0300)]
vdpa/mlx5: pre-create virtq at probing time

dev_config operation is called in LM progress.
LM time is very critical because all
the VM packets are dropped directly at that time.

Move the virtq creation to probe time and
only modify the configuration later in
the dev_config stage using the new ability
to modify virtq.

This optimization accelerates the LM process and
reduces its time by 70%.

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agocommon/mlx5: extend virtq modifiable fields
Li Zhang [Sat, 18 Jun 2022 09:02:48 +0000 (12:02 +0300)]
common/mlx5: extend virtq modifiable fields

A virtq configuration can be modified after the virtq creation.
Added the following modifiable fields:
1.address fields: desc_addr/used_addr/available_addr
2.hw_available_index
3.hw_used_index
4.virtio_q_type
5.version type
6.queue mkey
7.feature bit mask: tso_ipv4/tso_ipv6/tx_csum/rx_csum
8.event mode: event_mode/event_qpn_or_msix

Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: reuse event queues
Yajun Wu [Sat, 18 Jun 2022 09:02:47 +0000 (12:02 +0300)]
vdpa/mlx5: reuse event queues

To speed up queue creation time, event QP and CQ will create only once.
Each virtq creation will reuse same event QP and CQ.

Because FW will set event QP to error state during virtq destroy,
need modify event QP to RESET state, then modify QP to RTS state as
usual. This can save about 1.5ms for each virtq creation.

After SW QP reset, QP pi/ci all become 0 while CQ pi/ci keep as
previous. Add new variable qp_ci to save SW QP ci. Move QP pi
independently with CQ ci.

Add new function mlx5_vdpa_drain_cq to drain CQ CQE after virtq
release.

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agocommon/mlx5: add DevX API to move queues to reset state
Yajun Wu [Sat, 18 Jun 2022 09:02:46 +0000 (12:02 +0300)]
common/mlx5: add DevX API to move queues to reset state

Support set QP to RESET state.

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: support pre-creation of virtq resource
Yajun Wu [Sat, 18 Jun 2022 09:02:45 +0000 (12:02 +0300)]
vdpa/mlx5: support pre-creation of virtq resource

The motivation of this change is to reduce vDPA device queue creation
time by creating some queue resource in vDPA device probe stage.

In VM live migration scenario, this can reduce 0.8ms for each queue
creation, thus reduce LM network downtime.

To create queue resource(umem/counter) in advance, we need to know
virtio queue depth and max number of queue VM will use.

Introduce two new devargs: queues(max queue pair number) and queue_size
(queue depth). Two args must be both provided, if only one argument
provided, the argument will be ignored and no pre-creation.

The queues and queue_size must also be identical to vhost configuration
driver later receive. Otherwise either the pre-create resource is wasted
or missing or the resource need destroy and recreate(in case queue_size
mismatch).

Pre-create umem/counter will keep alive until vDPA device removal.

Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/mlx5: fix maximum number of virtqs
Li Zhang [Sat, 18 Jun 2022 09:02:44 +0000 (12:02 +0300)]
vdpa/mlx5: fix maximum number of virtqs

The driver wrongly takes the capability value for
the number of virtq pairs instead of just the number of virtqs.

Adjust all the usages of it to be the number of virtqs.

Fixes: c2eb33aaf967 ("vdpa/mlx5: manage virtqs by array")
Cc: stable@dpdk.org
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovhost: fix log message for async dequeue
David Marchand [Fri, 17 Jun 2022 05:40:03 +0000 (07:40 +0200)]
vhost: fix log message for async dequeue

Since the commit 02798b073520 ("vhost: improve virtio-net layer logs"),
vhost logs contain the socket path as a prefix.
Async dequeue path was copied from the sync dequeue path but a log
was incorrect.

Fixes: 84d5204310d7 ("vhost: support async dequeue for split ring")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovhost: fix statistics update in async dequeue
Xuan Ding [Thu, 16 Jun 2022 09:44:32 +0000 (09:44 +0000)]
vhost: fix statistics update in async dequeue

This patch adds missing per-virtqueue statistics in async dequeue path.

Fixes: 84d5204310d7 ("vhost: support async dequeue for split ring")

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Tested-by: Wei Ling <weix.ling@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years agovhost: rename number of available entries
Maxime Coquelin [Thu, 16 Jun 2022 08:20:31 +0000 (10:20 +0200)]
vhost: rename number of available entries

This patchs renames the local variables free_entries to
avail_entries in the dequeue path.

Indeed, this variable represents the number of new packets
available in the Virtio transmit queue, so these entries
are actually used, not free.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years agovdpa/mlx5: workaround VAR offset within page
Yajun Wu [Wed, 15 Jun 2022 10:02:27 +0000 (13:02 +0300)]
vdpa/mlx5: workaround VAR offset within page

vDPA driver first uses kernel driver to allocate doorbell (VAR) area for
each device. Then uses var->mmap_off and var->length to mmap uverbs device
file as doorbell userspace virtual address.

Current kernel driver provides var->mmap_off equal to page start of VAR.
It's fine with x86 4K page server, because VAR physical address is only 4K
aligned thus locate in 4K page start.

But with aarch64 64K page server, the actual VAR physical address has
offset within page (not located in 64K page start).
So the vDPA driver needs to add this within page offset
(caps.doorbell_bar_offset) to get the right VAR virtual address.

Fixes: 62c813706e4 ("vdpa/mlx5: map doorbell")
Cc: stable@dpdk.org
Signed-off-by: Yajun Wu <yajunw@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovhost: support async packed ring dequeue
Cheng Jiang [Mon, 13 Jun 2022 08:21:59 +0000 (08:21 +0000)]
vhost: support async packed ring dequeue

This patch implements packed ring dequeue data path
for asynchronous vhost.

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agovdpa/ifc/base: fix null pointer dereference
Andy Pei [Wed, 15 Jun 2022 06:23:34 +0000 (14:23 +0800)]
vdpa/ifc/base: fix null pointer dereference

Fix null pointer dereference reported in coverity scan.

Coverity issue: 378882
Fixes: 5d75517beffe ("vdpa/ifc/base: access block device registers")

Signed-off-by: Andy Pei <andy.pei@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years agoexamples/vhost: support clear in-flight for async dequeue
Yuan Wang [Thu, 9 Jun 2022 17:34:04 +0000 (01:34 +0800)]
examples/vhost: support clear in-flight for async dequeue

This patch allows vring_state_changed() to clear in-flight
dequeue packets. It also clears the in-flight packets in
a thread-safe way in destroy_device().

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
2 years agovhost: support clear in-flight packets for async dequeue
Yuan Wang [Thu, 9 Jun 2022 17:34:03 +0000 (01:34 +0800)]
vhost: support clear in-flight packets for async dequeue

rte_vhost_clear_queue_thread_unsafe() supports to clear
in-flight packets for async enqueue only. But after
supporting async dequeue, this API should support async dequeue too.

This patch also adds the thread-safe version of this API,
the difference between the two API is that thread safety uses lock.

These APIs maybe used to clean up packets in the async channel
to prevent packet loss when the device state changes or
when the device is destroyed.

Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
2 years agonet/vhost: perform SW checksum in Tx path
Maxime Coquelin [Wed, 8 Jun 2022 12:49:46 +0000 (14:49 +0200)]
net/vhost: perform SW checksum in Tx path

Virtio specification supports guest checksum offloading
for L4, which is enabled with VIRTIO_NET_F_GUEST_CSUM
feature negotiation. However, the Vhost PMD does not
advertise Tx checksum offload capabilities.

Advertising these offload capabilities at the ethdev level
is not enough, because we could still end-up with the
application enabling these offloads while the guest not
negotiating it.

This patch advertises the Tx checksum offload capabilities,
and introduces a compatibility layer to cover the case
VIRTIO_NET_F_GUEST_CSUM has not been negotiated but the
application does configure the Tx checksum offloads. This
function performs the L4 Tx checksum in SW for UDP and TCP.
Compared to Rx SW checksum, the Tx SW checksum function
needs to compute the pseudo-header checksum, as we cannot
know whether it was done before.

This patch does not advertise SCTP checksum offloading
capability for now, but it could be handled later if the
need arises.

Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Cheng Jiang <cheng1.jiang@intel.com>
2 years agonet/vhost: perform SW checksum in Rx path
Maxime Coquelin [Wed, 8 Jun 2022 12:49:45 +0000 (14:49 +0200)]
net/vhost: perform SW checksum in Rx path

Virtio specification supports host checksum offloading
for L4, which is enabled with VIRTIO_NET_F_CSUM feature
negotiation. However, the Vhost PMD does not advertise
Rx checksum offload capabilities, so we can end-up with
the VIRTIO_NET_F_CSUM feature being negotiated, implying
the Vhost library returns packets with checksum being
offloaded while the application did not request for it.

Advertising these offload capabilities at the ethdev level
is not enough, because we could still end-up with the
application not enabling these offloads while the guest
still negotiate them.

This patch advertises the Rx checksum offload capabilities,
and introduces a compatibility layer to cover the case
VIRTIO_NET_F_CSUM has been negotiated but the application
does not configure the Rx checksum offloads. This function
performis the L4 Rx checksum in SW for UDP and TCP. Note
that it is not needed to calculate the pseudo-header
checksum, because the Virtio specification requires that
the driver do it.

This patch does not advertise SCTP checksum offloading
capability for now, but it could be handled later if the
need arises.

Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Cheng Jiang <cheng1.jiang@intel.com>
2 years agonet/vhost: make VLAN stripping flag a boolean
Maxime Coquelin [Wed, 8 Jun 2022 12:49:44 +0000 (14:49 +0200)]
net/vhost: make VLAN stripping flag a boolean

This trivial patch makes the vlan_strip field of the
pmd_internal struct a boolean, since it is handled as
such.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years agonet/vhost: enable compliant offloading mode
Maxime Coquelin [Wed, 8 Jun 2022 12:49:43 +0000 (14:49 +0200)]
net/vhost: enable compliant offloading mode

This patch enables the compliant offloading flags mode by
default, which prevents the Rx path to set Tx offload flags,
which is illegal. A new legacy-ol-flags devarg is introduced
to enable the legacy behaviour.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years agovhost: fix missing enqueue pseudo-header calculation
Maxime Coquelin [Wed, 8 Jun 2022 12:49:42 +0000 (14:49 +0200)]
vhost: fix missing enqueue pseudo-header calculation

The Virtio specification requires that in case of checksum
offloading, the pseudo-header checksum must be set in the
L4 header.

When received from another Vhost-user port, the packet
checksum might already contain the pseudo-header checksum
but we have no way to know it. So we have no other choice
than doing the pseudo-header checksum systematically.

This patch handles this using the rte_net_intel_cksum_prepare()
helper.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years agoapp/testpmd: revert MAC update in checksum forwarding
Maxime Coquelin [Wed, 8 Jun 2022 12:49:41 +0000 (14:49 +0200)]
app/testpmd: revert MAC update in checksum forwarding

This patch reverts
commit 10f4620f02e1 ("app/testpmd: modify mac in csum forwarding"),
as the checksum forwarding is expected to only perform
checksum and not also overwrites the source and destination MAC addresses.

Doing so, we can test checksum offloading with real traffic
without breaking broadcast packets.

Fixes: 10f4620f02e1 ("app/testpmd: modify mac in csum forwarding")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Aman Singh <aman.deep.singh@intel.com>
2 years agonet/ngbe: support YT PHY SGMII to RGMII mode
Jiawen Wu [Wed, 22 Jun 2022 06:56:13 +0000 (14:56 +0800)]
net/ngbe: support YT PHY SGMII to RGMII mode

Add SGMII to RGMII mode for yt8521s and yt8531s PHY.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/ngbe: support autoneg on/off for external PHY SFI mode
Jiawen Wu [Wed, 22 Jun 2022 06:56:12 +0000 (14:56 +0800)]
net/ngbe: support autoneg on/off for external PHY SFI mode

Add support for external PHY to switch autoneg on/off on their SFI mode.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/ngbe: fix YT PHY UTP mode to link up
Jiawen Wu [Wed, 22 Jun 2022 06:56:11 +0000 (14:56 +0800)]
net/ngbe: fix YT PHY UTP mode to link up

Fix to read and write the correct register fields for yt8521s and
yt8531s PHY, since mode check was added.

Fixes: 1c44384fce76 ("net/ngbe: support custom PHY interfaces")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/ngbe: add more packet statistics
Jiawen Wu [Wed, 22 Jun 2022 06:56:10 +0000 (14:56 +0800)]
net/ngbe: add more packet statistics

Add more hardware extended statistics.

Fixes: 8b433d04adc9 ("net/ngbe: support device xstats")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/txgbe: fix register polling
Jiawen Wu [Wed, 22 Jun 2022 06:56:09 +0000 (14:56 +0800)]
net/txgbe: fix register polling

Fix to poll some specific registers, which expect bit value 0.

'w32w' is used in registers where the write command bit is set and
waits for the bit clear to complete the write.

Fixes: 24a4c76aff4d ("net/txgbe: add error types and registers")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/ngbe: support OEM subsystem vendor ID
Jiawen Wu [Wed, 22 Jun 2022 06:56:08 +0000 (14:56 +0800)]
net/ngbe: support OEM subsystem vendor ID

Add support for OEM subsystem vendor ID.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/txgbe: support OEM subsystem vendor ID
Jiawen Wu [Wed, 22 Jun 2022 06:56:07 +0000 (14:56 +0800)]
net/txgbe: support OEM subsystem vendor ID

Add support for OEM subsystem vendor ID.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
2 years agonet/i40e: move testpmd commands
David Marchand [Fri, 17 Jun 2022 05:07:26 +0000 (07:07 +0200)]
net/i40e: move testpmd commands

Move related specific testpmd commands into this driver directory.
While at it, fix checkpatch warnings.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years agonet/bonding: move testpmd commands
David Marchand [Fri, 17 Jun 2022 05:06:52 +0000 (07:06 +0200)]
net/bonding: move testpmd commands

Move related specific testpmd commands into this driver directory.
While at it, fix checkpatch warnings.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years agonet/nfp: fix initialization
Peng Zhang [Wed, 15 Jun 2022 10:14:17 +0000 (12:14 +0200)]
net/nfp: fix initialization

When the testpmd start-up, it will check MTU range,
if MTU > flubfsz, it will lead testpmd start fail.
Because the hw->flbufsz doesn't have the initialized
value, so it will lead the bug.

Fixes: 417be15e5f11 ("net/nfp: make sure MTU is never larger than mbuf size")
Cc: stable@dpdk.org
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
2 years agonet/nfp: modify RSS logic
Jin Liu [Fri, 17 Jun 2022 09:34:44 +0000 (11:34 +0200)]
net/nfp: modify RSS logic

Now NFP NIC support two type of RSS logic, NFP_NET_CFG_CTRL_RSS and
NFP_NET_CFG_CTRL_RSS2, use NFP_NET_CFG_CTRL_RSS2 if NIC capability
support, otherwise use NFP_NET_CFG_CTRL_RSS.

Signed-off-by: Jin Liu <jin.liu@corigine.com>
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
2 years agonet/nfp: move round macros to header file
Jin Liu [Fri, 17 Jun 2022 09:34:42 +0000 (11:34 +0200)]
net/nfp: move round macros to header file

Move macro __round_mask, round_up and round_down from C file to
corresponding head file, will be used by TX function of nfp net
firmware with NFDk.

Signed-off-by: Jin Liu <jin.liu@corigine.com>
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
2 years agonet/nfp: add queue stop and close helper functions
Jin Liu [Fri, 17 Jun 2022 09:34:41 +0000 (11:34 +0200)]
net/nfp: add queue stop and close helper functions

This commit does not introduce new features, just integrate some common
logic into helper functions to reduce the same logic and increase code
reuse, include queue stop and queue close logic, will be used when NFP
net stop and close.

queue stop: reset queue
queue close: reset and release queue

Modify NFP net stop and close function, use helper function to stop
and close queue instead of before logic.

Signed-off-by: Jin Liu <jin.liu@corigine.com>
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
2 years agonet/nfp: add NFDk option and queue function
Jin Liu [Fri, 17 Jun 2022 09:34:40 +0000 (11:34 +0200)]
net/nfp: add NFDk option and queue function

Add ethdev option for firmware with NFDk, implement tx_queue setup
function for firmware with NFDk.

Signed-off-by: Jin Liu <jin.liu@corigine.com>
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
2 years agonet/nfp: adjust structures
Jin Liu [Fri, 17 Jun 2022 09:34:39 +0000 (11:34 +0200)]
net/nfp: adjust structures

Add and modify the nfp PMD struct and macro that will be used by firmware
with NFDk.

Signed-off-by: Jin Liu <jin.liu@corigine.com>
Signed-off-by: Diana Wang <na.wang@corigine.com>
Signed-off-by: Peng Zhang <peng.zhang@corigine.com>
Signed-off-by: Chaoyong He <chaoyong.he@corigine.com>
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>