dpdk.git
3 years agoraw/cnxk_bphy: support reading CGX queue configuration
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:27 +0000 (17:04 +0200)]
raw/cnxk_bphy: support reading CGX queue configuration

Add support for reading queue configuration. Single queue represents
a logical MAC available on RPM/CGX.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agoraw/cnxk_bphy: add BPHY CGX/RPM skeleton driver
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:26 +0000 (17:04 +0200)]
raw/cnxk_bphy: add BPHY CGX/RPM skeleton driver

Add baseband PHY CGX/RPM skeleton driver which merely probes a matching
device. CGX/RPM are Ethernet MACs hardwired to baseband subsystem.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: start and stop BPHY LMAC
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:25 +0000 (17:04 +0200)]
common/cnxk: start and stop BPHY LMAC

Add support for starting or stopping specific lmac.
Start enables Rx/Tx traffic while stop does the opposite.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: set BPHY link state
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:24 +0000 (17:04 +0200)]
common/cnxk: set BPHY link state

Add support for setting link up or down.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: set BPHY link mode
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:23 +0000 (17:04 +0200)]
common/cnxk: set BPHY link mode

Add support for setting link mode.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: enable and disable BPHY PTP mode
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:22 +0000 (17:04 +0200)]
common/cnxk: enable and disable BPHY PTP mode

Add support for enabling or disablig PTP mode.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: enable and disable BPHY internal loopback
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:21 +0000 (17:04 +0200)]
common/cnxk: enable and disable BPHY internal loopback

Add support for enabling or disabling internal loopback.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: get BPHY link information
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:20 +0000 (17:04 +0200)]
common/cnxk: get BPHY link information

Add support for retrieving link information.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: add BPHY communication with atf
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:19 +0000 (17:04 +0200)]
common/cnxk: add BPHY communication with atf

Messages can be exchanged between userspace software and firmware
via set of two dedicated registers, namely scratch1 and scratch0.

scratch1 acts as a command register i.e message is sent to firmware,
while scratch0 holds response to previously sent message.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agocommon/cnxk: add BPHY CGX/RPM initialization and cleanup
Tomasz Duszynski [Mon, 21 Jun 2021 15:04:18 +0000 (17:04 +0200)]
common/cnxk: add BPHY CGX/RPM initialization and cleanup

Add support for low level initialization and cleanup of baseband
PHY CGX/RPM blocks.

Initialization and cleanup are related hence are in the same patch.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Signed-off-by: Jakub Palider <jpalider@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
3 years agobus/auxiliary: introduce auxiliary bus
Xueming Li [Mon, 5 Jul 2021 06:45:12 +0000 (14:45 +0800)]
bus/auxiliary: introduce auxiliary bus

Auxiliary bus [1] provides a way to split function into child-devices
representing sub-domains of functionality. Each auxiliary device
represents a part of its parent functionality.

Auxiliary device is identified by unique device name, sysfs path:
  /sys/bus/auxiliary/devices/<name>

Devargs legacy syntax of auxiliary device:
  -a auxiliary:<name>[,args...]
Devargs generic syntax of auxiliary device:
  -a bus=auxiliary,name=<name>/class=<class>/driver=<driver>[,args...]

[1] kernel auxiliary bus document:
https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
3 years agodevargs: add common key definition
Xueming Li [Mon, 5 Jul 2021 06:45:11 +0000 (14:45 +0800)]
devargs: add common key definition

Add common devargs key definition for "bus", "class" and "driver".

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
3 years agoeal: save error in string copy
Thomas Monjalon [Sun, 13 Jun 2021 08:24:21 +0000 (16:24 +0800)]
eal: save error in string copy

The string copy api rte_strscpy() did not set rte_errno during failures,
instead it just returned negative error number.

Set rte_errrno if the destination buffer is too small.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agoexamples/l2fwd: remove mac-updating option
Chenglian Sun [Tue, 22 Jun 2021 02:49:43 +0000 (10:49 +0800)]
examples/l2fwd: remove mac-updating option

The "mac-updating" option can be removed since the associated mac_updating
variable is set to 1 by default.

Signed-off-by: Chenglian Sun <sunchenglian@loongson.cn>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agoexamples/l2fwd: fix [no-]mac-updating options
Chenglian Sun [Tue, 22 Jun 2021 02:47:05 +0000 (10:47 +0800)]
examples/l2fwd: fix [no-]mac-updating options

For l2fwd, --no-mac-updating and --mac-updating are treated as invalid
arguments. Rework long options parsing to let --no-mac-updating and
--mac-updating options work well.

Fixes: fa19eb20d212 ("examples/l2fwd: add forwarding port mapping option")
Cc: stable@dpdk.org
Signed-off-by: Chenglian Sun <sunchenglian@loongson.cn>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agoexamples/l3fwd: remove useless reloads in LPM main loop
Ruifeng Wang [Thu, 10 Jun 2021 06:57:40 +0000 (06:57 +0000)]
examples/l3fwd: remove useless reloads in LPM main loop

Number of rx queue and number of rx port in lcore config are constants
during the period of l3 forward application running. But compiler has
no this information.

Copied values from lcore config to local variables and used the local
variables for iteration. Compiler can see that the local variables are
not changed, so qconf reloads at each iteration can be eliminated.

The change showed 1.8% performance uplift in single core, single port,
single queue test on N1SDP platform with MLX5 NIC.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agoexamples/l3fwd: remove useless calculations in NEON LPM
Ruifeng Wang [Thu, 10 Jun 2021 06:57:39 +0000 (06:57 +0000)]
examples/l3fwd: remove useless calculations in NEON LPM

Both L2 and L3 headers will be used in forward processing. And these
two headers are in the same cache line. It has the same effect for
prefetching with L2 header address and prefetching with L3 header
address.

Changed to use L2 header address for prefetching. The change showed
no measurable performance improvement, but it definitely removed
unnecessary instructions for address calculation.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
3 years agoapp/test: fix IPv6 header initialization
Lance Richardson [Fri, 26 Mar 2021 16:37:32 +0000 (12:37 -0400)]
app/test: fix IPv6 header initialization

Fix two issues found when writing PMD unit tests for HW ptype and
L4 checksum offload:

   - The version field in the IPv6 header was being set to zero,
     which prevented hardware from recognizing it as IPv6. The
     IP version field is now set to six.
   - The payload_len field was being initialized using host byte
     order, which (among other things) resulted in incorrect L4
     checksum computation. The payload_len field is now set using
     network (big-endian) byte order.

Fixes: 92073ef961ee ("bond: unit tests")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agobus/pci: support IOVA as VA in PowerVM LPARs
David Christensen [Wed, 23 Jun 2021 20:43:55 +0000 (13:43 -0700)]
bus/pci: support IOVA as VA in PowerVM LPARs

Add IOMMU detection logic for PowerVM LPARs.

PowerNV $ cat /proc/cpuinfo
...
timebase     : 512000000
platform     : PowerNV
model        : 8335-GTW

PowerVM LPAR $ cat /proc/cpuinfo
...
timebase     : 512000000
platform     : pSeries
model        : IBM,9009-22A
machine      : CHRP IBM,9009-22A
MMU          : Hash

PowerNV KVM Guest $ cat /proc/cpuinfo
...
timebase     : 512000000
platform     : pSeries
model        : IBM pSeries (emulated by qemu)
machine      : CHRP IBM pSeries (emulated by qemu)
MMU          : Radix

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Thinh Tran <thinhtr@linux.vnet.ibm.com>
3 years agobus/pci: fix IOVA as VA support for PowerNV
David Christensen [Tue, 15 Jun 2021 17:20:27 +0000 (10:20 -0700)]
bus/pci: fix IOVA as VA support for PowerNV

Fix the IOMMU detection logic that looks for the "platform" field of
/proc/cpuinfo on POWER systems.

Fixes: 905215731833 ("bus/pci: support IOVA as VA on PowerNV systems")
Cc: stable@dpdk.org
Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agoeal/arm: remove unused type
Ruifeng Wang [Fri, 11 Jun 2021 14:42:18 +0000 (14:42 +0000)]
eal/arm: remove unused type

Data types Elf32_auxv_t and Elf64_auxv_t are used by OS Linux
auxiliary vector read, and not used by arch specific cpu flag
API implementations. Hence remove them from Arm file.

Reported-by: James Grant <j.grant@qub.ac.uk>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
3 years agodevtools: recommend new logtype helpers
David Marchand [Thu, 1 Jul 2021 14:11:31 +0000 (16:11 +0200)]
devtools: recommend new logtype helpers

Following commit eeded2044af5 ("log: register with standardized names"),
the new helpers should be preferred so that we can maintain a consistent
naming for logtypes.

Signed-off-by: David Marchand <david.marchand@redhat.com>
3 years agocommon/mlx5: fix Netlink port name padding in probing
Viacheslav Ovsiienko [Sat, 19 Jun 2021 13:56:28 +0000 (16:56 +0300)]
common/mlx5: fix Netlink port name padding in probing

On some kernels the string attributes within Netlink
reply messages might be not padded with zeroes (in cases
when string length is aligned with 4-byte boundary).
While device probing, the physical port name was wrongly recognized,
causing a probing failure.

Fixes: 30a86157f6d5 ("net/mlx5: support PF representor")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: convert meta register to big-endian
Alexander Kozyrev [Wed, 16 Jun 2021 14:46:02 +0000 (17:46 +0300)]
net/mlx5: convert meta register to big-endian

Metadata were stored in the CPU order (little-endian format on x86),
while all the packet header fields are stored in the network order.
That caused wrong results whenever we tried to use metadata value
in the modify_field action: bytes were swapped as a result.

Convert the metadata value into big-endian format before storing it
in the Mellanox NIC to achieve consistent behaviour.

Fixes: 641dbe4fb053 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix modify field action order for MAC
Alexander Kozyrev [Wed, 16 Jun 2021 14:42:36 +0000 (17:42 +0300)]
net/mlx5: fix modify field action order for MAC

MAC addresses are split into 2 parts inside Mellanox NIC:
bits 0-15 are separate from bits 16-47. That makes a copy
from another packet field tricky because any other field
is aligned to 32 bits, not 16. This causes unexpected
results when using the MODIFY_FIELD action with MAC addresses.
Track crossing MAC addresses boundary and arrange a proper
order for the MODIFY_FIELD action involving MAC addresses.

Fixes: 641dbe4fb053 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix IPIP multi-tunnel validation
Lior Margalit [Wed, 16 Jun 2021 07:01:18 +0000 (10:01 +0300)]
net/mlx5: fix IPIP multi-tunnel validation

A flow rule must not include multiple tunnel layers.
An attempt to create such a rule, for example:
testpmd> flow create .../ vxlan / eth / ipv4 proto is 4 / end <actions>
results in an unclear error.

In the current implementation there is a check for
multiple IPIP tunnels, but not for combination of IPIP
and a different kind of tunnel, such as VXLAN. The fix
is to enhance the above check to use MLX5_FLOW_LAYER_TUNNEL
that consists of all the tunnel masks. The error message
will be "multiple tunnel not supported".

Fixes: 5e33bebdd8d3 ("net/mlx5: support IP-in-IP tunnel")
Cc: stable@dpdk.org
Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: fix Rx queue timestamp format
Viacheslav Ovsiienko [Mon, 14 Jun 2021 13:52:42 +0000 (16:52 +0300)]
net/mlx5: fix Rx queue timestamp format

The timestamp format was not configured correctly for the
receiving queues created via DevX calls. It caused non-UTC
timestamps in CQEs  for real time configurations.

Fixes: d61381ad46d0 ("net/mlx5: support timestamp format")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: fix switchdev mode recognition
Viacheslav Ovsiienko [Fri, 11 Jun 2021 15:37:19 +0000 (18:37 +0300)]
net/mlx5: fix switchdev mode recognition

The new kernels might add the switch_id attribute to the
Netlink replies and this caused the wrong recognition
of the E-Switch presence. The single uplink device was
erroneously recognized as master and it caused the
extending match for source vport index on all installed
flows, including the default ones, and adding extra hops
in the steering engine, that affected the maximal
throughput packet rate.

The extra check for the new device name format (it supposes
the new kernel) and the device is only one is added. If this
check succeeds the E-Switch presence is considered as wrongly
detected and overridden.

Fixes: 30a86157f6d5 ("net/mlx5: support PF representor")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: fix aging counter deallocation
Matan Azrad [Wed, 9 Jun 2021 12:32:51 +0000 (15:32 +0300)]
net/mlx5: fix aging counter deallocation

When a counter is destroyed and used for aging action, the driver should
remove the counter object from the age-out list if it is there.

The counter memory of the list entry and of the counter shared
information is shared because, currently, shared counter cannot be used
for aging.

When the support for counter action in action handle API was added, the
counter shared information was reused and moved to be used also for
non-shared case. Wrongly, it is used for aging case too.

Remove the usage of shared information in case of aging.

Fixes: f3191849f2c2 ("net/mlx5: support flow count action handle")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Bing Zhao <bingz@nvidia.com>
3 years agonet/mlx5: fix meter policy creation failure handling
Li Zhang [Wed, 9 Jun 2021 02:07:11 +0000 (05:07 +0300)]
net/mlx5: fix meter policy creation failure handling

When an error appears in the policy creation,
the IDs mapping between the user policy ID to
the driver policy ID is skipped.

Wrongly, the driver tried to clean the mapping in
this case what caused an error.

Skip the clearance in this case.

Fixes: afb4aa4f122 ("net/mlx5: support meter policy operations")
Cc: stable@dpdk.org
Signed-off-by: Li Zhang <lizh@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
3 years agonet/mlx5: allow copy from one tag to another
Alexander Kozyrev [Tue, 25 May 2021 17:14:14 +0000 (20:14 +0300)]
net/mlx5: allow copy from one tag to another

The modify field implementation in mlx5 driver has a check to
prevent a copy from a field to the same field. But the level
is not taken into account which prevents a copy from different
tags. Check the level and allow a copy from one tag to another.

Fixes: 641dbe4fb05 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: fix RSS pattern expansion
Gregory Etelson [Thu, 27 May 2021 15:20:24 +0000 (18:20 +0300)]
net/mlx5: fix RSS pattern expansion

Flow rule pattern may be implicitly expanded by the PMD if the rule
has RSS flow action. The expansion adds network headers to the
original pattern. The new pattern lists all network levels that
participate in the rule RSS action.

The patch fixes expanded pattern for cases when original pattern
included meta items like MARK, TAG, META.

Fixes: c7870bfe09dc ("ethdev: move RSS expansion code to mlx5 driver")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx5: remove barrier for memory region cache
Feifei Wang [Tue, 18 May 2021 08:50:58 +0000 (16:50 +0800)]
net/mlx5: remove barrier for memory region cache

'dev_gen' is a variable to trigger all cores to flush their local caches
once the global MR cache has been rebuilt.

This is due to MR cache's R/W lock can maintain synchronization between
threads:

1. dev_gen and global cache updating ordering inside the lock protected
section does not matter. Because other threads cannot take the lock
until global cache has been updated. Thus, in out of order platform,
even if other agents firstly observe updated dev_gen but global does
not update, they also have to wait the lock. As a result, it is
unnecessary to add a wmb between global cache rebuilding and updating
the dev_gen to keep the memory store order.

2. Store-Release of unlock provides the implicit wmb at the level
visible by software. This makes 'rebuilding global cache' and 'updating
dev_gen' be observed before local_cache starts to be updated by other
agents. Thus, wmb after 'updating dev_gen' can be removed.

Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agonet/mlx4: remove barrier for memory region cache
Feifei Wang [Tue, 18 May 2021 08:50:57 +0000 (16:50 +0800)]
net/mlx4: remove barrier for memory region cache

'dev_gen' is a variable to trigger all cores to flush their local caches
once the global MR cache has been rebuilt.

This is due to MR cache's R/W lock can maintain synchronization between
threads:

1. dev_gen and global cache updating ordering inside the lock protected
section does not matter. Because other threads cannot take the lock
until global cache has been updated. Thus, in out of order platform,
even if other agents firstly observe updated dev_gen but global does
not update, they still have to wait the lock. As a result, it is
unnecessary to add a wmb between global cache rebuilding and updating
the dev_gen to keep the memory store order.

2. Store-Release of unlock provides the implicit wmb at the level
visible by software. This makes 'rebuilding global cache' and 'updating
dev_gen' be observed before local_cache starts to be updated by other
agents. Thus, wmb after 'updating dev_gen' can be removed.

Suggested-by: Ruifeng Wang <ruifeng.wang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
3 years agotests/eal: fix memory leak
Owen Hilyard [Wed, 16 Jun 2021 16:24:52 +0000 (12:24 -0400)]
tests/eal: fix memory leak

The directory steam was not closed when the hugepage action was
HUGEPAGE_CHECK_EXISTS. This caused a memory leak in some parts of
the unit tests.

Fixes: 45f1b6e8680a ("app: add new tests on eal flags")
Cc: stable@dpdk.org
Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu>
Reviewed-by: David Marchand <david.marchand@redhat.com>
3 years agotests/cmdline: fix memory leaks
Owen Hilyard [Wed, 23 Jun 2021 18:06:45 +0000 (14:06 -0400)]
tests/cmdline: fix memory leaks

Fixes for a few memory leaks in the cmdline_autotest unit test.

All of the leaks were related to not freeing the commandline struct
after testing had completed.

Fixes: dbb860e03eb1 ("cmdline: tests")
Cc: stable@dpdk.org
Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
3 years agorib: fix max depth IPv6 lookup
Owen Hilyard [Wed, 23 Jun 2021 15:17:29 +0000 (11:17 -0400)]
rib: fix max depth IPv6 lookup

ASAN found a stack buffer overflow in lib/rib/rte_rib6.c:get_dir.
The fix for the stack buffer overflow was to make sure depth
was always < 128, since when depth = 128 it caused the index
into the ip address to be 16, which read off the end of the array.

While trying to solve the buffer overflow, I noticed that a few
changes could be made to remove the for loop entirely.

Fixes: f7e861e21c46 ("rib: support IPv6")
Cc: stable@dpdk.org
Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
3 years agoflow_classify: fix leaking rules on delete
Owen Hilyard [Wed, 23 Jun 2021 17:07:07 +0000 (13:07 -0400)]
flow_classify: fix leaking rules on delete

Rules in a classify table were not freed if the table
had a delete function.

Fixes: be41ac2a330f ("flow_classify: introduce flow classify library")
Cc: stable@dpdk.org
Signed-off-by: Owen Hilyard <ohilyard@iol.unh.edu>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
3 years agokni: fix crash on userspace VA for segmented packets
Ferruh Yigit [Tue, 22 Jun 2021 12:29:56 +0000 (13:29 +0100)]
kni: fix crash on userspace VA for segmented packets

When IOVA=VA, address translation for segmented packets is wrong, it
assumes the address in the mbuf->next is physical address, not VA
address.

Fixing the address translation to work both PA & VA mode.

Fixes: e73831dc6c26 ("kni: support userspace VA")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agokni: fix mbuf allocation for kernel side use
Yunjian Wang [Tue, 22 Jun 2021 12:44:29 +0000 (20:44 +0800)]
kni: fix mbuf allocation for kernel side use

In kni_allocate_mbufs(), we alloc mbuf for alloc_q as this code.
allocq_free = (kni->alloc_q->read - kni->alloc_q->write - 1) \
& (MAX_MBUF_BURST_NUM - 1);
The value of allocq_free maybe zero, for example :
The ring size is 1024. After init, write = read = 0. Then we fill
kni->alloc_q to full. At this time, write = 1023, read = 0.

Then the kernel send 32 packets to userspace. At this time, write
= 1023, read = 32. And then the userspace receive this 32 packets.
Then fill the kni->alloc_q, (32 - 1023 - 1) & 31 = 0, fill nothing.
...
Then the kernel send 32 packets to userspace. At this time, write
= 1023, read = 992. And then the userspace receive this 32 packets.
Then fill the kni->alloc_q, (992 - 1023 - 1) & 31 = 0, fill nothing.

Then the kernel send 32 packets to userspace. The kni->alloc_q only
has 31 mbufs and will drop one packet.

Absolutely, this is a special scene. Normally, it will fill some
mbufs everytime, but may not enough for the kernel to use.

In this patch, we always keep the kni->alloc_q to full for the kernel
to use.

Fixes: 49da4e82cf94 ("kni: allocate no more mbuf than empty slots in queue")
Cc: stable@dpdk.org
Signed-off-by: Cheng Liu <liucheng11@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/virtio: add MAC device config getter and setter
Maxime Coquelin [Thu, 17 Jun 2021 14:17:18 +0000 (16:17 +0200)]
net/virtio: add MAC device config getter and setter

This patch uses the new device config ops to get and set
the MAC address if supported.

If a valid MAC address is passed as devarg of the
Virtio-user PMD, the driver will try to store it in the
device config space. Otherwise the one provided in
the device config space will be used, if available.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: add device config support to vDPA
Maxime Coquelin [Thu, 17 Jun 2021 14:17:17 +0000 (16:17 +0200)]
net/virtio: add device config support to vDPA

This patch introduces two virtio-user callbacks to get
and set device's config, and implements it for vDPA
backends.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agonet/virtio: keep device and frontend features separated
Maxime Coquelin [Thu, 17 Jun 2021 14:17:16 +0000 (16:17 +0200)]
net/virtio: keep device and frontend features separated

This patch is preliminary rework to add support for getting
and setting device's config space.

In order to get or set a device config such as its MAC address,
we need to know whether the device itself support the feature,
or if it is emulated by the frontend.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
3 years agovhost: allocate and free packets in bulk in Tx split
Balazs Nemeth [Tue, 8 Jun 2021 11:41:11 +0000 (13:41 +0200)]
vhost: allocate and free packets in bulk in Tx split

Same idea as commit a287ac28919d ("vhost: allocate and free packets
in bulk in Tx packed"), allocate and free packets in bulk.
Also remove the unused function virtio_dev_pktmbuf_alloc.

Signed-off-by: Balazs Nemeth <bnemeth@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
3 years agonet/virtio: fix kernel set features for multi-queue device
Thierry Herbelot [Fri, 28 May 2021 13:20:38 +0000 (15:20 +0200)]
net/virtio: fix kernel set features for multi-queue device

Restore the original code, where VHOST_SET_FEATURES is applied to
all vhostfds of the device.

Fixes: cc0151b34dee ("net/virtio: add virtio-user features ops")
Cc: stable@dpdk.org
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
3 years agovhost/crypto: check request pointer before dereference
Thierry Herbelot [Mon, 24 May 2021 09:08:21 +0000 (11:08 +0200)]
vhost/crypto: check request pointer before dereference

Use vc_req only after it was checked not to be NULL.

Fixes: 2d962bb736521 ("vhost/crypto: fix possible TOCTOU attack")
Cc: stable@dpdk.org
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
3 years agodevtools: fix file listing in maintainers check
Thomas Monjalon [Tue, 15 Jun 2021 12:49:49 +0000 (14:49 +0200)]
devtools: fix file listing in maintainers check

When having multiple working trees, the main one has a .git directory
while attached trees have a .git file.
Thus the git check should work for both file and directory.

In the case there is no working tree (.git not readable), the command
"find" is used and should be able to list paths with wildcards.
Wildcards work only as shell expansion in the case of file paths,
so the quotes must be removed.

Fixes: 27c2ce563216 ("maintainers: start a Linux-style file")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
3 years agoconfig/arm: check SVE CPU flag
Chengwen Feng [Fri, 21 May 2021 03:33:54 +0000 (11:33 +0800)]
config/arm: check SVE CPU flag

If compiled with SVE feature (e.g. "-march=armv8.2-a+sve'), the binary
could not run on non-SVE platform else it will encounter illegal
instruction [1].

This patch fixes it by adding 'RTE_CPUFLAG_SVE' to compile_time_cpuflags,
so that rte_cpu_is_supported() will print meaningful log under above
situation.

[1] http://mails.dpdk.org/archives/dev/2021-May/209124.html

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
3 years agoeal/windows: cleanup interrupt resources
Dmitry Kozlyuk [Sun, 2 May 2021 02:33:33 +0000 (05:33 +0300)]
eal/windows: cleanup interrupt resources

Interrupt manager in Windows EAL allocates on IOCP and starts
a control thread that runs indefinitely. At DPDK cleanup
this thread was not stopped and IOCP handle was not closed.

Gracefully stop interrupt-handling in rte_eal_cleanup().
The thread already closes IOCP handle before exiting.

Fixes: 5c016fc0205a ("eal/windows: add interrupt thread skeleton")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Jie Zhou <jizh@microsoft.com>
Tested-by: Jie Zhou <jizh@microsoft.com>
3 years agoeal/windows: fix interrupt thread handle leakage
Dmitry Kozlyuk [Sun, 2 May 2021 02:33:32 +0000 (05:33 +0300)]
eal/windows: fix interrupt thread handle leakage

Each time a work was scheduled in the interrupt thread,
usually an alarm, a handle was opened but not closed.

Opening a handle is a system call, which harms alarm precision.
Instead of opening and closing a handle each time, open it
when interrupt thread starts and close it when the thread finishes.

Fixes: 5c016fc0205a ("eal/windows: add interrupt thread skeleton")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Tested-by: Pallavi Kadam <pallavi.kadam@intel.com>
3 years agoeal/windows: fix interrupt thread ID
Dmitry Kozlyuk [Sun, 2 May 2021 02:33:31 +0000 (05:33 +0300)]
eal/windows: fix interrupt thread ID

Interrupt thread ID retained its value after interrupt thread finish.
Other interrupt routines could then operate on the wrong thread.
Clear interrupt thread ID before thread termination.

Fixes: 5c016fc0205a ("eal/windows: add interrupt thread skeleton")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
3 years agoraw/ioat: fix missing ring pointer reset
Kevin Laatz [Thu, 17 Jun 2021 14:18:15 +0000 (14:18 +0000)]
raw/ioat: fix missing ring pointer reset

In the event of a device reconfigure, "hdls_avail" is not being reset. This
can lead to miscalculations in rte_ioat_completed_ops(), causing the
function to report an incorrect amount of completed operations. This patch
fixes the issue by resetting "hdls_avail" during the device configure.

Fixes: 74464005a2af ("raw/ioat: rework SW ring layout")
Cc: stable@dpdk.org
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoraw/ioat: fix memory leak in device configure
Kevin Laatz [Thu, 17 Jun 2021 14:17:52 +0000 (14:17 +0000)]
raw/ioat: fix memory leak in device configure

During device configure, memory is allocated for "hdl_ring_flags". In the
event of another call to the device configure function (reconfigure), a
memory leak would occur. This patch fixes the memory leak by free'ing the
memory before reallocating it.

Fixes: 245efe544d8e ("raw/ioat: report status of completed jobs")
Cc: stable@dpdk.org
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agobuildtools: allow string constant padding
Dmitry Kozlyuk [Thu, 27 May 2021 21:24:21 +0000 (00:24 +0300)]
buildtools: allow string constant padding

Size of string constant symbol may be larger than its length
measured up to NUL terminator. In this case pmdinfogen included padding
bytes after NUL terminator in generated source, yielding incorrect code.

Always trim string data to NUL terminator while reading ELF.
It was already done for COFF because there's no symbol size.

Bugzilla ID: 720
Fixes: f0f93a7adfee ("buildtools: use Python pmdinfogen")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
3 years agovfio: add stdbool include
Christian Ehrhardt [Tue, 1 Jun 2021 08:28:25 +0000 (10:28 +0200)]
vfio: add stdbool include

This became visible by backporting the following for the 19.11 stable tree:
 c13ca4e8 "vfio: fix DMA mapping granularity for IOVA as VA"

The usage of type bool in the vfio code would require "#include
<stdbool.h>", but rte_vfio.h has no direct paths to stdbool.h.
It happens that in eal_vfio_mp_sync.c it comes after "#include
<rte_log.h>".

And rte_log.h since 20.05 includes stdbool since this change:
 241e67bfe "log: add API to check if a logtype can log in a given level"
and thereby mitigates the issue.

It should be safe to include stdbool.h from rte_vfio.h itself
to be present exactly when needed for the struct it defines using that
type.

Fixes: c13ca4e81cac ("vfio: fix DMA mapping granularity for IOVA as VA")
Cc: stable@dpdk.org
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
3 years agodoc: fix default burst size in testpmd
Ajit Khaparde [Fri, 28 May 2021 17:45:29 +0000 (10:45 -0700)]
doc: fix default burst size in testpmd

Default burst size in testpmd has been changed from 16 to 32
for some time now. But the documentation had not been updated.

Fixes: 836853d3d4cf7 ("app/testpmd: increase default burst size to 32")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agodoc: fix typo in SPDX tag
Kevin Traynor [Fri, 11 Jun 2021 16:38:42 +0000 (17:38 +0100)]
doc: fix typo in SPDX tag

A stray character got added. Remove it.

Fixes: cb056611a8ed ("eal: rename lcore master and slave")
Cc: stable@dpdk.org
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
3 years agoraw/ioat: add device reset to configuration script
Kevin Laatz [Fri, 28 May 2021 13:55:59 +0000 (14:55 +0100)]
raw/ioat: add device reset to configuration script

Currently once a device is configured, the user does not have the ability
to reset the device via the script.

This patch adds a device reset option to the script. For example
"$dpdk_idxd_cfg.py 0 --reset" would reset device 0.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoraw/ioat: handle PCI address in configuration script
Kevin Laatz [Fri, 28 May 2021 13:55:58 +0000 (14:55 +0100)]
raw/ioat: handle PCI address in configuration script

Currently the user needs to find the DSA instance number for any DSA device
they would like to configure using this script, which can be cumbersome and
error-prone since the instance numbering may change when changing the
binding of the devices between vfio-pci and idxd.

This patch improves the usability of the script by adding the ability to
specify the DSA device to configure using the device's PCI address instead
of the DSA instance number. For example, "$dpdk_idxd_cfg.py 0" and
"$dpdk_idxd_cfg.py 6a:01.0" are both valid references to the same device
(assuming the numbering).

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
3 years agoraw/ioat: fix missing device name in idxd bus scan
Kevin Laatz [Thu, 27 May 2021 13:36:09 +0000 (14:36 +0100)]
raw/ioat: fix missing device name in idxd bus scan

The device name is not being initialized during the idxd bus scan which
will cause segmentation faults when an appliation tries to access this
information.

This patch adds the required initialization of the device name so that it
can be read without issues.

Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")
Cc: stable@dpdk.org
Reported-by: Sunil Pai G <sunil.pai.g@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
3 years agoacl: fix build with GCC 6.3
Konstantin Ananyev [Fri, 21 May 2021 14:42:07 +0000 (15:42 +0100)]
acl: fix build with GCC 6.3

--buildtype=debug with gcc 6.3 produces the following error:

../lib/librte_acl/acl_run_avx512_common.h: In function
‘resolve_match_idx_avx512x16’:
../lib/librte_acl/acl_run_avx512x16.h:33:18: error:
the last argument must be an 8-bit immediate
                               ^
../lib/librte_acl/acl_run_avx512_common.h:373:9: note:
in expansion of macro ‘_M_I_’
      return _M_I_(slli_epi32)(mi, match_log);
             ^~~~~

Seems like gcc-6.3 complains about the following construct:

static const uint32_t match_log = 5;
    ...
_mm512_slli_epi32(mi, match_log);

It can't substitute constant variable 'match_log' with its actual value.
The fix replaces constant variable with its immediate value.

Bugzilla ID: 717
Fixes: b64c2295f7fc ("acl: add 256-bit AVX512 classify method")
Fixes: 45da22e42ec3 ("acl: add 512-bit AVX512 classify method")
Cc: stable@dpdk.org
Reported-by: Liang Ma <liangma@liangbit.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
3 years agonet/ice/base: fix ptype bitmap for IP fragment
Ting Xu [Thu, 10 Jun 2021 02:45:09 +0000 (10:45 +0800)]
net/ice/base: fix ptype bitmap for IP fragment

IPv4 and IPv6 fragment ptypes are supposed to be separated from IP
other ptypes. New bitmaps for IP fragment ptypes were created, but the
IP fragment ptypes were not deleted from the previous non-frag bitmaps,
which will cause conflicts. This patch removes IP fragment ptypes from
the non-frag bitmaps.

Fixes: 843452817561 ("net/ice/base: support IP fragment RSS and FDIR")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix RSS for L2 packet
Wenjun Wu [Thu, 10 Jun 2021 02:30:19 +0000 (10:30 +0800)]
net/ice: fix RSS for L2 packet

L2 RSS support was deleted by mistake during code
refactoring. This patch adds it again.

Fixes: 38d632cbdc88 ("net/ice: refactor PF RSS")
Cc: stable@dpdk.org
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/iavf: fix scalar Rx
Beilei Xing [Tue, 1 Jun 2021 05:09:51 +0000 (13:09 +0800)]
net/iavf: fix scalar Rx

The new allocated mbuf should be updated to the SW
ring.

Fixes: a2b29a7733ef ("net/avf: enable basic Rx Tx")
Fixes: b8b4c54ef9b0 ("net/iavf: support flexible Rx descriptor in normal path")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
3 years agonet/i40e: fix use after free in FDIR release
Dapeng Yu [Fri, 4 Jun 2021 02:02:01 +0000 (10:02 +0800)]
net/i40e: fix use after free in FDIR release

The original code use a heap pointer after it is freed.

Fixes: 460d1679586e ("drivers/net: delete HW rings while freeing queues")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix FDIR flow type for IPv4 fragment
Ting Xu [Wed, 2 Jun 2021 08:21:04 +0000 (16:21 +0800)]
net/ice: fix FDIR flow type for IPv4 fragment

When creating FDIR rule and parsing the pattern, if IPv4 fragment type is
detected, the flow type is not changed to ICE_FLTR_PTYPE_FRAG_IPV4 from
ICE_FLTR_PTYPE_NONF_IPV4_OTHER. It will cause profile confilict with
other FDIR rules for IPv4 other type.

Fixes: b7e8781de768 ("net/ice: support flow director for IP fragment packet")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice: fix data path in secondary process
Qi Zhang [Wed, 26 May 2021 06:12:56 +0000 (14:12 +0800)]
net/ice: fix data path in secondary process

The rte_eth_devices array is not in share memory, it should not be
referenced by ice_adapter which is shared by primary and secondary.
Any process set ice_adapter->eth_dev will corrupt another process'
context.

The patch removed the field "eth_dev" from ice_adapter.
Now, when the data paths try to access the rte_eth_dev_data instance,
they should replace adapter->eth_dev->data with adapter->pf.dev_data.

Fixes: f9cf4f864150 ("net/ice: support device initialization")
Cc: stable@dpdk.org
Reported-by: Yixue Wang <yixue.wang@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Yixue Wang <yixue.wang@intel.com>
3 years agonet/ice: fix data path selection in secondary process
Qi Zhang [Mon, 24 May 2021 09:07:59 +0000 (17:07 +0800)]
net/ice: fix data path selection in secondary process

The flag use_avx2 and use_avx512 are defined as local variables, they
will not be aware by the secondary process, then wrong data path is
selected. Fix the issue by moving them into struct ice_adapter.

Fixes: ae60d3c9b227 ("net/ice: support Rx AVX2 vector")
Fixes: 2d5f6953d56d ("net/ice: support vector AVX2 in Tx")
Fixes: 7f85d5ebcfe1 ("net/ice: add AVX512 vector path")
Cc: stable@dpdk.org
Reported-by: Yixue Wang <yixue.wang@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Yixue Wang <yixue.wang@intel.com>
3 years agonet/ice/base: remove unncessary code
Qi Zhang [Tue, 1 Jun 2021 13:36:22 +0000 (21:36 +0800)]
net/ice/base: remove unncessary code

Remove unnecessary jumbo frame configure.

Signed-off-by: Fabio Pricoco <fabio.pricoco@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice/base: remove VSI info from previous aggregator
Qi Zhang [Tue, 1 Jun 2021 12:26:06 +0000 (20:26 +0800)]
net/ice/base: remove VSI info from previous aggregator

remove the VSI info from previous aggregator after moving the VSI to a
new aggregator.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice/base: remove firmware log
Qi Zhang [Tue, 1 Jun 2021 12:13:49 +0000 (20:13 +0800)]
net/ice/base: remove firmware log

Remove firmware log related code.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/ice/base: add function for DSCP configure
Qi Zhang [Tue, 1 Jun 2021 12:03:52 +0000 (20:03 +0800)]
net/ice/base: add function for DSCP configure

ice_aq_set_pfc_mode is used to configure DSCP.

Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/iavf: use write combining store for tail updates
Gordon Noonan [Wed, 12 May 2021 10:28:54 +0000 (11:28 +0100)]
net/iavf: use write combining store for tail updates

Performance improvement: use a write combining store
instead of a regular mmio write to update queue tail
registers.

Signed-off-by: Gordon Noonan <gordon.noonan@intel.com>
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agonet/i40e: fix raw packet flow director
Steve Yang [Wed, 19 May 2021 03:27:45 +0000 (03:27 +0000)]
net/i40e: fix raw packet flow director

When user configured the flow rule with raw packet via command
"flow_director_filter", it would reset all previous fdir input set
flags with "i40e_flow_set_fdir_inset()".

Ignore to configure the flow input set with raw packet rule used.

Fixes: ff04964ea6d5 ("net/i40e: fix flow director for common pctypes")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/iavf: fix handling of unsupported promiscuous
Qi Zhang [Wed, 26 May 2021 09:53:05 +0000 (17:53 +0800)]
net/iavf: fix handling of unsupported promiscuous

iavf_execute_vf_cmd returns standard error code but not IAVF_xxx,
The patch fix the wrong error handling in iavf_config_promisc.

Fixes: 1e4d55a7fe71 ("net/iavf: optimize promiscuous device operations")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/ice: fix default RSS key generation
Dapeng Yu [Thu, 27 May 2021 06:42:51 +0000 (14:42 +0800)]
net/ice: fix default RSS key generation

In original implementation, device reconfiguration will generate
a new default RSS key if there is no one from user, it is unexpected
when updating a completely unrelated configuration.

This patch makes default RSS key unchanged, during the lifetime of the
DPDK application even if there are multiple reconfigurations.

Fixes: 50370662b727 ("net/ice: support device and queue ops")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
3 years agoraw/ifpga/base: check address before assigning
Wei Huang [Mon, 31 May 2021 05:22:31 +0000 (01:22 -0400)]
raw/ifpga/base: check address before assigning

In max10_staging_area_init(), variable "start" from fdt_get_reg() may
be invalid, it should be checked before assigning to member variable
"staging_area_base" of structure "intel_max10_device".

Coverity issue: 367480, 367482
Fixes: a05bd1b40bde ("raw/ifpga: add FPGA RSU APIs")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
3 years agonet/iavf: fix RSS key access out of bound
Haiyue Wang [Wed, 19 May 2021 07:59:33 +0000 (15:59 +0800)]
net/iavf: fix RSS key access out of bound

The array rss_key has size 'vf->vf_res->rss_key_size', the array index
should be less than that.

Cc: stable@dpdk.org
Fixes: 69dd4c3d0898 ("net/avf: enable queue and device")

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/bnxt: remove unnecessary comment
Kalesh AP [Mon, 31 May 2021 07:26:44 +0000 (12:56 +0530)]
net/bnxt: remove unnecessary comment

Remove unnecessary comment in the code.

Fixes: 0a6d2a720078 ("net/bnxt: get device infos")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: improve probing log message
Kalesh AP [Mon, 31 May 2021 07:26:43 +0000 (12:56 +0530)]
net/bnxt: improve probing log message

The existing log message is missing a space. Modified it to
a more meaningful log as part of this change.

Before this patch:

bnxt_dev_init(): bnxtfound at mem D67E0000, node addr 0x2101112000M

With this patch:

bnxt_dev_init(): Found bnxt device at mem D67E0000, node addr 0x2101112000M

Fixes: 1bf01f5135f8 ("net/bnxt: prevent device access when device is in reset")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix check for PTP support in FW
Kalesh AP [Mon, 31 May 2021 07:26:42 +0000 (12:56 +0530)]
net/bnxt: fix check for PTP support in FW

On Thor, driver must use HWRM to access the timestamp information.
Driver should not advertise PTP support to application
if PTP information is not available via HWRM commands.

Fixes: 6cbd89f9f3d8 ("net/bnxt: support PTP for Thor")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: use common function to free VNIC resource
Kalesh AP [Mon, 31 May 2021 07:26:41 +0000 (12:56 +0530)]
net/bnxt: use common function to free VNIC resource

Use the function bnxt_vnic_destroy() to destroy VNIC resources
and thereby eliminate few duplicate code.

Fixes: 8d0a244b40b2 ("net/bnxt: cleanup VNIC after flow validate")
Fixes: 49d0709b257f ("net/bnxt: delete and flush L2 filters cleanly")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: set flow error after tunnel redirection free
Kalesh AP [Mon, 31 May 2021 07:26:40 +0000 (12:56 +0530)]
net/bnxt: set flow error after tunnel redirection free

During flow destroy, when bnxt_hwrm_tunnel_redirect_free() fails,
driver is not setting flow error using "rte_flow_error_set".

Fixes: 11e5e19695c7 ("net/bnxt: support redirecting tunnel packets to VF")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix error handling in VNIC prepare
Kalesh AP [Mon, 31 May 2021 07:26:39 +0000 (12:56 +0530)]
net/bnxt: fix error handling in VNIC prepare

Resources should be freed on error conditions. i.e, VNIC and
VNIC context created in HW and memory allocated in
bnxt_vnic_grp_alloc() should be freed.

Added a new function bnxt_vnic_destroy() to do the cleanup.
This lightweight function can be used in flow destroy/flush
path to avoid duplicate code as well.

Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: remove unnecessary code
Kalesh AP [Mon, 31 May 2021 07:26:38 +0000 (12:56 +0530)]
net/bnxt: remove unnecessary code

Also removed a log message which does not convey any
useful information.

Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: set flow error when free filter not available
Kalesh AP [Mon, 31 May 2021 07:26:37 +0000 (12:56 +0530)]
net/bnxt: set flow error when free filter not available

In bnxt_flow_validate(), when bnxt_get_unused_filter() fails due to
no filter resources available, driver is not setting flow error using
"rte_flow_error_set".

Also, fixed the error code.

Fixes: 5ef3b79fdfe6 ("net/bnxt: support flow filter ops")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix error messages in VNIC prepare
Kalesh AP [Mon, 31 May 2021 07:26:36 +0000 (12:56 +0530)]
net/bnxt: fix error messages in VNIC prepare

The bnxt_vnic_prep() can fail due to multiple reasons.
But when bnxt_vnic_prep() fails, PMD is not returning
the actual error/string to the application.

Fix it by moving the "rte_flow_error_set" to bnxt_vnic_prep()
to set the actual error code.

Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
3 years agonet/bnxt: workaround spurious zero stats in Thor
Somnath Kotur [Mon, 31 May 2021 05:53:01 +0000 (11:23 +0530)]
net/bnxt: workaround spurious zero stats in Thor

There is a HW bug that can result in certain stats being reported as
zero.
Workaround this by ignoring stats with a value of zero based on the
previously stored snapshot of the same stat.
This bug mainly manifests in the output of func_qstats as FW aggregrates
each ring's stat value to give the per function stat and if one of
them is zero, the per function stat value ends up being lower than the
previous snapshot which shows up as a zero PPS value in testpmd.
Eliminate invocation of func_qstats and aggregate the per-ring stat
values in the driver itself to derive the func_qstats output post
accounting for the spurious zero stat value.

Bugzilla ID: 641
Fixes: f8168ca0e690 ("net/bnxt: support thor controller")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: detect bad opaque in Rx completion
Somnath Kotur [Thu, 27 May 2021 06:18:46 +0000 (11:48 +0530)]
net/bnxt: detect bad opaque in Rx completion

There is a rare hardware bug that can cause a bad opaque value in the RX
or TPA start completion. When this happens, the hardware may have used the
same buffer twice for 2 Rx packets.  In addition, the driver might also
crash later using the bad opaque as an index into the ring.

The Rx opaque value is predictable and is always monotonically increasing.
The workaround is to keep track of the expected next opaque value and
compare it with the one returned by hardware during RX and TPA start
completions. If they miscompare, log it, discard the completion,
schedule a ring reset and move on to the next one.

Fixes: 0958d8b6435d ("net/bnxt: support LRO")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: add AVX2 RX/Tx
Lance Richardson [Mon, 24 May 2021 18:59:51 +0000 (14:59 -0400)]
net/bnxt: add AVX2 RX/Tx

Implement AVX2 vector PMD.

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: fix Rx burst size constraint
Lance Richardson [Mon, 24 May 2021 18:59:50 +0000 (14:59 -0400)]
net/bnxt: fix Rx burst size constraint

The burst receive function should return all packets currently
present in the receive ring up to the requested burst size,
update vector mode receive functions accordingly.

Fixes: 398358341419 ("net/bnxt: support NEON")
Fixes: bc4a000f2f53 ("net/bnxt: implement SSE vector mode")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: refactor HW ptype mapping table
Lance Richardson [Mon, 24 May 2021 18:59:49 +0000 (14:59 -0400)]
net/bnxt: refactor HW ptype mapping table

Make the definition of the table used to map hardware packet type
information to DPDK packet type more generic.

Add macro definitions for constants used in creating table
indices, use these to eliminate raw constants in code.

Add build-time assertions to validate ptype mapping constants.

Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agonet/bnxt: check access to possible null pointer
Thierry Herbelot [Mon, 24 May 2021 09:00:38 +0000 (11:00 +0200)]
net/bnxt: check access to possible null pointer

Check that pointers are valid before using them.

Fixes: 7bc8e9a227ccb ("net/bnxt: support async link notification")
Cc: stable@dpdk.org
Signed-off-by: Thierry Herbelot <thierry.herbelot@6wind.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
3 years agomalloc: fix size annotation for NUMA-aware realloc
David Marchand [Thu, 10 Jun 2021 12:09:22 +0000 (14:09 +0200)]
malloc: fix size annotation for NUMA-aware realloc

__rte_alloc_size is mapped to compiler alloc_size attribute.

Quoting gcc documentation:
"""
alloc_size
    The alloc_size attribute is used to tell the compiler that the
    function return value points to memory, where the size is given by
    one or two of the functions parameters. GCC uses this information
    to improve the correctness of __builtin_object_size.

    The function parameter(s) denoting the allocated size are specified
    by one or two integer arguments supplied to the attribute.
    The allocated size is either the value of the single function
    argument specified or the product of the two function arguments
    specified. Argument numbering starts at one.
"""

In rte_realloc_socket case, only 'size' matters.

Note: this has been spotted by Maxime trying to use rte_realloc_socket
and compiling with gcc 11.

Fixes: 17b347dab769 ("malloc: add alloc_size attribute to functions")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
3 years agobitmap: fix buffer overrun in bitmap init
Ivan Ilchenko [Wed, 2 Jun 2021 09:49:22 +0000 (12:49 +0300)]
bitmap: fix buffer overrun in bitmap init

Bitmap initialization function is allowed to memset()
caller-provided buffer with number of bytes exceeded
this buffer size. This happens due to wrong comparison
sign between buffer size and number of bytes required
to initialize bitmap.

Fixes: 602c9ca33a4 ("sched: bitmap is now dynamically allocated")
Cc: stable@dpdk.org
Reported-by: Andy Moreton <amoreton@xilinx.com>
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
3 years agonet/i40e: enable PCI bus master after reset
Haiyue Wang [Mon, 24 May 2021 01:23:45 +0000 (09:23 +0800)]
net/i40e: enable PCI bus master after reset

The VF reset can be triggered by the PF reset event, then the PCI bus
master will be cleared, the VF will be not allowed to issue any Memory
or I/O Requests.

So after the reset event is detected, always enable the PCI bus master.
And if failed, the device or system may be in an invalid state, so keep
the VF reset state to mark it as I/O error.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agonet/iavf: enable PCI bus master after reset
Haiyue Wang [Mon, 24 May 2021 01:23:44 +0000 (09:23 +0800)]
net/iavf: enable PCI bus master after reset

The VF reset can be triggered by the PF reset event, then the PCI bus
master will be cleared, the VF will be not allowed to issue any Memory
or I/O Requests.

So after the reset event is detected, always enable the PCI bus master.
And if failed, the device or system may be in an invalid state, so keep
the VF reset state to mark it as I/O error.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
3 years agobus/pci: configure PCI bus master
Haiyue Wang [Mon, 24 May 2021 01:23:43 +0000 (09:23 +0800)]
bus/pci: configure PCI bus master

Add the API to set 'Bus Master Enable' bit to be enabled or disabled in
the PCI command register.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
3 years agotelemetry: remove static limit on callbacks count
David Marchand [Thu, 6 May 2021 08:27:54 +0000 (10:27 +0200)]
telemetry: remove static limit on callbacks count

This code is not performance sensitive and can be switched to dynamic
allocations.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ciara Power <ciara.power@intel.com>
3 years agograph: fix null dereference in stats
Hongbo Zheng [Thu, 6 May 2021 07:16:27 +0000 (15:16 +0800)]
graph: fix null dereference in stats

In function 'stats_mem_init', pointer 'stats' should
be confirmed not null before memset it.

Fixes: af1ae8b6a32c ("graph: implement stats")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>