dpdk.git
2 years ago lpm: fix scalar version header for C++
Stanislaw Kardach [Thu, 9 Jun 2022 12:17:00 +0000 (14:17 +0200)]
lpm: fix scalar version header for C++

rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.

The generated assembly is identical to the code without the fix (checked
both on x86 and RISC-V).
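
As a rough illustration of the union access described above, here is a
minimal C sketch. It does not reproduce the DPDK headers: demo_xmm_t and
demo_rte_xmm_t are hypothetical stand-ins for xmm_t and rte_xmm_t, and
demo_lane() is an invented helper.

```c
#include <stdint.h>

/* Hypothetical stand-in for xmm_t on a target without vector support. */
typedef struct {
	uint32_t w[4];
} demo_xmm_t;

/* Hypothetical stand-in for rte_xmm_t: a union mapping the vector to scalars. */
typedef union {
	demo_xmm_t x;
	uint32_t u32[4];
	uint64_t u64[2];
} demo_rte_xmm_t;

/* Read one 32-bit lane by going through the union member instead of a cast. */
static inline uint32_t
demo_lane(demo_xmm_t v, int i)
{
	demo_rte_xmm_t u = { .x = v };

	return u.u32[i];
}
```

Assigning through the x member keeps the conversion explicit, which is the
form that stricter C++ rules accept according to the message above.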

Fixes: 406937f89ffd ("lpm: add scalar version of lookupx4")

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years ago eal/riscv: fix vector header for C++
Stanislaw Kardach [Thu, 9 Jun 2022 12:16:59 +0000 (14:16 +0200)]
eal/riscv: fix vector header for C++

rte_xmm_t is a union type which wraps around xmm_t and maps its contents
to scalar structures. Since C++ has stricter type conversion rules than
C, the rte_xmm_t::x has to be used instead of C-casting.

Fixes: f22e705ebf12 ("eal/riscv: support RISC-V architecture")

Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years ago config: remove explicit undef of unset values
Bruce Richardson [Thu, 16 Dec 2021 11:14:30 +0000 (11:14 +0000)]
config: remove explicit undef of unset values

Rather than explicitly clearing any setting of undefined values in our
rte_config.h file, it's better to instead just add a comment that the
value is not set. Using a comment allows the user to set the value using
CFLAGS or similar mechanism without the config file clearing the value
again.

The text used "<VALUE> is not set" is modelled after the kernel approach
of doing the same thing.
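
A small sketch of why the comment form matters, assuming a hypothetical
option named DEMO_TRACE (not a real DPDK setting). With only a comment in
the config header, a -DDEMO_TRACE passed through CFLAGS would still be
visible to the code below; an explicit #undef in the header would clear it.

```c
/* Stand-in for a generated config header (hypothetical option name). */
/* DEMO_TRACE is not set */

/* Reports whether the option reached the compiler, e.g. via -DDEMO_TRACE. */
static int
demo_trace_enabled(void)
{
#ifdef DEMO_TRACE
	return 1;
#else
	return 0;
#endif
}
```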

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Signed-off-by: David Marchand <david.marchand@redhat.com>
2 years ago build: add ccache for cross compilation
Jerin Jacob [Wed, 8 Jun 2022 17:13:04 +0000 (22:43 +0530)]
build: add ccache for cross compilation

By default, ccache is not used for cross builds[1].
Update all cross files to use ccache if it is available
on the build machine.

Also update the devtools/test-meson-builds.sh
script to find the correct DPDK_TARGET, due to the
change in cross file syntax.

[1]
https://mesonbuild.com/Machine-files.html

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
2 years ago test: validate test names in non interactive mode
Bruce Richardson [Fri, 10 Jun 2022 14:24:06 +0000 (15:24 +0100)]
test: validate test names in non interactive mode

When passing in test names to run via either the DPDK_TEST environment
variable or via extra argv parameters, the checks run on those commands
can miss valid commands that are registered with the cmdline library in
the initial context used to set it up. This is seen in the fact that the
"dump_*" set of commands are not callable via argv parameters, but can
be called manually.

To fix this, just use the commandline library to validate each command
before executing it, stopping execution when an error is encountered.
This also has the benefit of not having the test binary drop to
interactive mode if all commandline parameters given are invalid.

Bugzilla ID: 1002
Fixes: 9b848774a5dc ("test: use env variable to run tests")
Fixes: ace2f054ed43 ("test: take test names from command line")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years ago cmdline: add function to verify valid commands
Bruce Richardson [Fri, 10 Jun 2022 14:24:05 +0000 (15:24 +0100)]
cmdline: add function to verify valid commands

The cmdline library cmdline_parse() function parses a command and
executes the action automatically too. The cmdline_valid_buffer function
also uses this function to validate commands, meaning that there is no
function to validate a command as ok without executing it.

To fix this omission, we extract the body of cmdline_parse into a new
static inline function with an extra parameter to indicate whether the
action should be performed or not. Then we create two wrappers around
that - a replacement for the existing cmdline_parse function where the
extra parameter is "true" to execute the command, and a new function
"cmdline_parse_check" which passes the parameter as "false" to perform
cmdline validation only.
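
The refactoring pattern described above can be sketched as follows. This is
not the cmdline library code: demo_parse_common(), demo_parse(),
demo_parse_check() and the single accepted command "hello" are all
hypothetical, chosen only to show one shared body with two thin wrappers.

```c
#include <stdbool.h>
#include <string.h>

/* Counts how many times the (hypothetical) command action has run. */
static int demo_action_calls;

/* Shared body: validate the buffer, and run the action only if asked to. */
static inline int
demo_parse_common(const char *buf, bool call_fn)
{
	if (strcmp(buf, "hello") != 0)
		return -1; /* unknown command */
	if (call_fn)
		demo_action_calls++;
	return 0;
}

/* Wrapper that parses and executes. */
static int
demo_parse(const char *buf)
{
	return demo_parse_common(buf, true);
}

/* Wrapper that validates only, without executing. */
static int
demo_parse_check(const char *buf)
{
	return demo_parse_common(buf, false);
}
```

Keeping the body in one static inline function means both wrappers share
the same matching logic, so validation and execution cannot drift apart.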

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Weiyuan Li <weiyuanx.li@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years ago version: 22.07-rc1
Thomas Monjalon [Wed, 8 Jun 2022 19:43:41 +0000 (21:43 +0200)]
version: 22.07-rc1

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2 years ago devtools: unify cross-compilation tests
Thomas Monjalon [Wed, 8 Jun 2022 15:36:40 +0000 (17:36 +0200)]
devtools: unify cross-compilation tests

Reduce the number of Arm builds from 3 to 1:
only generic armv8 with GCC.
The specific PPC builds on Ubuntu are skipped.

The build directories for PPC and RISC-V
are also renamed for consistency:
- build-arm64-generic-gcc
- build-ppc64-power8-gcc
- build-riscv64-generic-gcc

The cross file is always saved in variable "f" for readability.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
2 years ago kni: use dedicated function to set MAC address
Ke Zhang [Wed, 8 Jun 2022 12:11:17 +0000 (15:11 +0300)]
kni: use dedicated function to set MAC address

The warning info:
warning: passing argument 1 of ‘memcpy’ discards ‘const’
qualifier from pointer target type

Variable dev_addr was made const intentionally in v5.17 to prevent using
it directly.  See the following Linux kernel changeset for details:

commit adeef3e32146 ("net: constify netdev->dev_addr")

The helper function used was introduced earlier, in v5.15.

Fixes: ea6b39b5b847 ("kni: remove ethtool support")
Cc: stable@dpdk.org
Signed-off-by: Ke Zhang <ke1x.zhang@intel.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years ago kni: use dedicated function to set random MAC address
Ke Zhang [Wed, 8 Jun 2022 12:11:16 +0000 (15:11 +0300)]
kni: use dedicated function to set random MAC address

eth_hw_addr_random() sets address type correctly.

eth_hw_addr_random() is available since Linux v3.4, so
no compat is required.

Also fix the warning:
warning: passing argument 1 of ‘memcpy’ discards ‘const’
qualifier from pointer target type

Variable dev_addr was made const intentionally in Linux v5.17 to
prevent using it directly.

Fixes: ea6b39b5b847 ("kni: remove ethtool support")
Cc: stable@dpdk.org
Signed-off-by: Ke Zhang <ke1x.zhang@intel.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years ago ethdev: introduce available Rx descriptors threshold
Spike Du [Wed, 8 Jun 2022 16:35:28 +0000 (19:35 +0300)]
ethdev: introduce available Rx descriptors threshold

A new event RTE_ETH_EVENT_RX_AVAIL_THRESH should be generated by HW
when the number of available descriptors in an Rx queue goes below the
threshold.

The threshold is defined as a percentage of an Rx queue size with valid
values from 0 to 99 (inclusive). Zero (default) value disables it.

There is no capability reporting for the feature. Application should
simply try to set required threshold value and handle result.
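
To make the percentage semantics concrete, here is a hedged sketch that
validates a threshold and converts it to a descriptor count. The helper
demo_avail_thresh_descs() is hypothetical, and the rounding-down conversion
is an assumption; how a given device maps the percentage to descriptors is
up to the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper: returns the number of descriptors corresponding to
 * a threshold given as a percentage of the Rx queue size, or -1 if the
 * percentage is outside the valid 0..99 range. A result of 0 for pct == 0
 * corresponds to the feature being disabled.
 */
static int
demo_avail_thresh_descs(uint16_t ring_size, uint8_t pct)
{
	if (pct > 99)
		return -1;
	return (int)((uint32_t)ring_size * pct / 100);
}
```

For a 1024-entry queue and a 50% threshold this gives 512 descriptors.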

Add testpmd commands to control the threshold:
  set port <port_id> rxq <rxq_id> avail_thresh <avail_thresh_num>

Signed-off-by: Spike Du <spiked@nvidia.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
2 years ago kernel/linux: get kernel version from kernel source
Ferdinand Thiessen [Thu, 3 Mar 2022 13:15:43 +0000 (14:15 +0100)]
kernel/linux: get kernel version from kernel source

When building the kernel modules, try to get the kernel version from
the kernel sources first.
This fixes the kernel modules installation directory if the target kernel
version differs from the host kernel version, like for CI build or when
packaging for linux distributions.

Signed-off-by: Ferdinand Thiessen <rpm@fthiessen.de>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
2 years ago net/tap: fix device freeing
Yunjian Wang [Tue, 7 Jun 2022 06:50:57 +0000 (14:50 +0800)]
net/tap: fix device freeing

The error path was calling rte_eth_dev_release_port() function,
which frees eth_dev->data->dev_private, and then tries to free
pmd->intr_handle, which causes the use after free issue.

The free can be moved to before the release function is called.
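
The ordering fix can be sketched with plain malloc/free. The structures
and functions below (demo_dev, demo_priv, demo_release, demo_destroy) are
hypothetical and only mimic the ownership relation described above: the
private data owns the interrupt handle, and the release function frees the
private data.

```c
#include <stdlib.h>

struct demo_priv {
	int *intr_handle; /* owned by the private data */
};

struct demo_dev {
	struct demo_priv *priv;
};

/* Mimics the release function: frees the private data. */
static void
demo_release(struct demo_dev *dev)
{
	free(dev->priv);
	dev->priv = NULL;
}

/* Free the handle first, while priv is still valid, then release. */
static int
demo_destroy(struct demo_dev *dev)
{
	if (dev->priv == NULL)
		return -1;
	free(dev->priv->intr_handle);
	dev->priv->intr_handle = NULL;
	demo_release(dev);
	return 0;
}
```

Swapping the two steps in demo_destroy() would read dev->priv after it was
freed, which is the use after free pattern the fix removes.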

Fixes: d61138d4f0e ("drivers: remove direct access to interrupt handle")
Cc: stable@dpdk.org
Signed-off-by: Xiangjun Meng <mengxiangjun4@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2 years ago net/failsafe: fix device freeing
Yunjian Wang [Tue, 7 Jun 2022 06:50:49 +0000 (14:50 +0800)]
net/failsafe: fix device freeing

The PMD destroy function was calling the release function, which frees
dev->data->dev_private, and then tries to free PRIV(dev)->intr_handle,
which causes the heap use after free issue.

The free can be moved to before the release function is called.

Fixes: d61138d4f0e ("drivers: remove direct access to interrupt handle")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2 years ago app/testpmd: fix multicast address pool leak
Ke Zhang [Fri, 25 Mar 2022 08:35:55 +0000 (08:35 +0000)]
app/testpmd: fix multicast address pool leak

A multicast address pool is allocated for a port when
using mcast_addr testpmd commands.

When closing a port or stopping testpmd, this pool was
not freed, resulting in a leak.
This issue has been caught using ASan.

Free this pool when closing the port.

Error info as follows:
ERROR: LeakSanitizer: detected memory leaks
Direct leak of 192 byte(s)
0 0x7f6a2e0aeffe in __interceptor_realloc
(/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe)
1 0x565361eb340f in mcast_addr_pool_extend
../app/test-pmd/config.c:5162
2 0x565361eb3556 in mcast_addr_pool_append
../app/test-pmd/config.c:5180
3 0x565361eb3aae in mcast_addr_add
../app/test-pmd/config.c:5243

Fixes: 8fff667578a7 ("app/testpmd: new command to add/remove multicast MAC addresses")
Cc: stable@dpdk.org
Signed-off-by: Ke Zhang <ke1x.zhang@intel.com>
Acked-by: Yuying Zhang <yuying.zhang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years ago app/testpmd: fix packet segment allocation
Raja Zidane [Thu, 2 Jun 2022 12:59:47 +0000 (15:59 +0300)]
app/testpmd: fix packet segment allocation

When the --mbuf-size cmdline parameter is specified, the segments to scatter
packets on are allocated sequentially from these extra memory pools:
the mbuf for the first segment is allocated from the first pool, the
second one from the second pool, and so on. If the number of segments is
greater than the number of pools, the mbufs for the remaining segments are
allocated from the last valid pool.
A bug in comparing the segment index with the mbuf index caused a wrong
mapping of one of the segments.

Fix the comparison.
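
The intended segment-to-pool mapping can be sketched as below. This is not
the testpmd code; demo_seg_pool_index() is a hypothetical helper that only
expresses the rule stated above (one pool per segment, with the last pool
reused once the pools run out).

```c
/*
 * Hypothetical helper: pick the pool index for a given segment index.
 * Segments beyond the number of pools fall back to the last valid pool.
 */
static unsigned int
demo_seg_pool_index(unsigned int seg_idx, unsigned int nb_pools)
{
	if (nb_pools == 0)
		return 0; /* no pools configured; caller must handle this */
	return seg_idx < nb_pools ? seg_idx : nb_pools - 1;
}
```

An off-by-one in the comparison (for example using <= instead of <) would
send one segment to a pool index that does not exist, which matches the
kind of wrong mapping described in the message.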

Fixes: 2befc67ff679 ("app/testpmd: add extended Rx queue setup")
Cc: stable@dpdk.org
Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years ago eal: remove unused arch-specific headers for locks
David Marchand [Wed, 8 Jun 2022 11:57:01 +0000 (13:57 +0200)]
eal: remove unused arch-specific headers for locks

MCS lock, PF lock and Ticket lock have no arch specific implementation,
so there is no need for the extra redirection in headers.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago net/ark: support virtual functions
Ed Czeck [Tue, 7 Jun 2022 21:31:49 +0000 (17:31 -0400)]
net/ark: support virtual functions

- Add capabilities field isvf to dev struct
- Disable configuration calls as required by VF

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago net/ark: support new devices
Ed Czeck [Tue, 7 Jun 2022 21:31:48 +0000 (17:31 -0400)]
net/ark: support new devices

Support new devices and update device list in doc

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago net/ark: report additional errors from firmware
Ed Czeck [Tue, 7 Jun 2022 21:31:47 +0000 (17:31 -0400)]
net/ark: report additional errors from firmware

Detect and report completion errors from firmware

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago net/ark: update UDM functions for firmware update
Ed Czeck [Tue, 7 Jun 2022 21:31:46 +0000 (17:31 -0400)]
net/ark: update UDM functions for firmware update

- New firmware version for UDM (Upstream Data Mover)
- Remove device-level start, stop, and reset operations
- Add queue-based start, stop and reset as required by firmware
- Remove performance structs as they are not in the firmware module

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago net/ark: update DDM functions for firmware update
Ed Czeck [Tue, 7 Jun 2022 21:31:45 +0000 (17:31 -0400)]
net/ark: update DDM functions for firmware update

- New firmware version for DDM (Downstream Data Mover)
- Remove device-level start, stop, and reset operations
- Add queue-based start, stop and reset as required by firmware

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago net/ark: update MPU functions for firmware update
Ed Czeck [Tue, 7 Jun 2022 21:31:44 +0000 (17:31 -0400)]
net/ark: update MPU functions for firmware update

- New firmware version for MPU (Mbuf Prefetch Unit)
- Remove device-level global operations
- Remove ark_mpu_reset_stats function

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
2 years ago devtools: add Atomic Rules acronyms for commit checks
Ed Czeck [Tue, 7 Jun 2022 21:31:43 +0000 (17:31 -0400)]
devtools: add Atomic Rules acronyms for commit checks

DDM -> Downstream Data Mover
MPU -> Mbuf Prefetch Unit
UDM -> Upstream Data Mover

Signed-off-by: Ed Czeck <ed.czeck@atomicrules.com>
Acked-by: Ferruh Yigit <ferruh.yigit@xilinx.com>
2 years ago net/ena: update version to 2.7.0
Michal Krawczyk [Tue, 7 Jun 2022 16:43:41 +0000 (18:43 +0200)]
net/ena: update version to 2.7.0

This release contains changes listed below.

  - Fast mbuf free feature support.
  - Device argument to disable the LLQ.
  - Simplification of the MTU verification.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
2 years ago net/ena: add device argument to disable LLQ
Michal Krawczyk [Tue, 7 Jun 2022 16:43:40 +0000 (18:43 +0200)]
net/ena: add device argument to disable LLQ

The PMD attempts to enable the LLQ (Low Latency Queue) whenever it's
possible. The LLQ requires the user to enable the Write Combining for
the supported igb_uio/vfio-pci modules.

The vfio-pci module officially doesn't support the WC. Moreover, in some
Linux distributions, it can be built into the kernel, so any
modifications to the vfio-pci module require a full rebuild of the
kernel. This can make the configuration process much harder, and for
users who do not need high network performance in their setups, it may
be redundant. These users requested to be able to turn off LLQ to avoid
the hassle of such a setup.

It's generally not recommended to disable the LLQ, as doing so won't
improve performance, and on the 6th generation AWS instances the
lack of LLQ can have a huge negative impact on hardware performance.

The device argument which controls the LLQ is called 'enable_llq' and by
default, it's set to 1 (which means that the LLQ is enabled). Setting
it to 0 disables the LLQ.

This commit also adds the explicit initialization of the devarg for the
'use_large_llq_hdr'. The PMD_REGISTER_PARAM_STRING() call for the ENA
was updated with all the available devargs (including
ENA_DEVARG_MISS_TXC_TO, which wasn't added previously).

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2 years ago net/ena: remove redundant MTU verification
Dawid Gorecki [Tue, 7 Jun 2022 16:43:39 +0000 (18:43 +0200)]
net/ena: remove redundant MTU verification

Remove MTU verification from ena_mtu_set() and ena_start(). It is done
by rte_ethdev already, so there is no reason to repeat it inside the ENA
driver.

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2 years ago net/ena: support fast mbuf free
Dawid Gorecki [Tue, 7 Jun 2022 16:43:38 +0000 (18:43 +0200)]
net/ena: support fast mbuf free

Add support for RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE offload. It can be
enabled if all the mbufs for a given queue belong to the same mempool
and their reference count is equal to 1.

Signed-off-by: Dawid Gorecki <dgr@semihalf.com>
Reviewed-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
2 years ago examples/l3fwd: merge l3fwd-acl example
Sean Morrissey [Fri, 22 Apr 2022 09:57:19 +0000 (09:57 +0000)]
examples/l3fwd: merge l3fwd-acl example

l3fwd-acl contains functions that duplicate those in l3fwd.
For this reason, merge the l3fwd-acl code into l3fwd,
with the '--lookup acl' cmdline option to run ACL.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2 years ago examples/l3fwd: add vector stubs for RISC-V
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:14 +0000 (12:46 +0200)]
examples/l3fwd: add vector stubs for RISC-V

Add missing em_mask_key() implementation and fix l3fwd_common.h
inclusion in FIB lookup functions to enable the l3fwd to be run on
RISC-V.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago net/tap: set BPF syscall ID for RISC-V
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:13 +0000 (12:46 +0200)]
net/tap: set BPF syscall ID for RISC-V

Define the missing __NR_bpf syscall id to enable the tap PMD.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago net/memif: set memfd syscall ID for RISC-V
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:12 +0000 (12:46 +0200)]
net/memif: set memfd syscall ID for RISC-V

Define the missing __NR_memfd_create syscall id to enable the memif PMD.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago net/ixgbe: add vector stubs for RISC-V
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:11 +0000 (12:46 +0200)]
net/ixgbe: add vector stubs for RISC-V

Re-use vector processing stubs in ixgbe PMD defined for PPC for RISC-V.
This enables ixgbe PMD usage in scalar mode on this architecture.

The ixgbe PMD driver was validated with Intel X520-DA2 NIC and the
test-pmd application. Packet transfer checked using all UIO drivers
available for non-IOMMU platforms: uio_pci_generic, vfio-pci noiommu and
igb_uio.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago test/cpuflags: add flags for RISC-V
Michal Mazurek [Tue, 7 Jun 2022 10:46:15 +0000 (12:46 +0200)]
test/cpuflags: add flags for RISC-V

Add checks for all flag values defined in the RISC-V misa CSR register.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago ci: add RISC-V cross compilation
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:17 +0000 (12:46 +0200)]
ci: add RISC-V cross compilation

Check cross-compilation using Ubuntu 20.04 x86.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago devtools: add RISC-V in build test
Stanislaw Kardach [Tue, 7 Jun 2022 10:46:16 +0000 (12:46 +0200)]
devtools: add RISC-V in build test

Validate RISC-V compilation when test-meson-builds.sh is called. The
check will be only performed if appropriate toolchain is present on the
system (same as with other architectures).

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago eal/riscv: support RISC-V architecture
Michal Mazurek [Tue, 7 Jun 2022 10:46:10 +0000 (12:46 +0200)]
eal/riscv: support RISC-V architecture

Add all necessary elements for DPDK to compile and run EAL on SiFive
Freedom U740 SoC which is based on SiFive U74-MC (ISA: rv64imafdc)
core complex.

This includes:

- EAL library implementation for rv64imafdc ISA.
- meson build structure for 'riscv' architecture. RTE_ARCH_RISCV define
  is added for architecture identification.
- xmm_t structure operation stubs as there is no vector support in the
  U74 core.

Compilation was tested on Ubuntu and Arch Linux using the riscv64 toolchain.
Clang compilation is currently not supported due to issues with missing
relocation relaxation.

Two rte_rdtsc() schemes are provided: stable low-resolution using rdtime
(default) and unstable high-resolution using rdcycle. User can override
the scheme by defining RTE_RISCV_RDTSC_USE_HPM=1 during compile time of
both DPDK and the application. The reasoning for this is as follows.
The RISC-V ISA mandates that clock read by rdtime has to be of constant
period and synchronized between all hardware threads within 1 tick
(chapter 10.1 in version 20191213 of RISC-V spec).
However this clock may not be of high-enough frequency for dataplane
uses. For example, on HiFive Unmatched (FU740) it is 1MHz.
There is a high-resolution alternative in form of rdcycle which is
clocked at the core clock frequency. The drawbacks are that it may be
disabled during sleep (WFI), its frequency might change due to DVFS and
it is core-local and therefore cannot be used as a wall-clock. It can
however be used for micro-benchmarking user applications, similarly to
Aarch64's PMCCNTR PMU counter.
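
The resolution trade-off can be made concrete with a small conversion
sketch. demo_ticks_to_ns() is a hypothetical helper, not part of DPDK; the
1 MHz figure comes from the text above, while the 1.2 GHz core clock used
in the usage note is only an assumed value for illustration.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a tick count to nanoseconds for a counter
 * running at hz ticks per second. Returns 0 for hz == 0. The
 * multiplication can overflow for very large tick counts; this sketch
 * ignores that.
 */
static uint64_t
demo_ticks_to_ns(uint64_t ticks, uint64_t hz)
{
	if (hz == 0)
		return 0;
	return ticks * UINT64_C(1000000000) / hz;
}
```

With the 1 MHz timebase, one rdtime tick spans 1000 ns, so shorter
intervals cannot be resolved. A cycle counter at an assumed 1.2 GHz would
advance 1200 ticks over the same 1000 ns.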

The platform is currently marked as linux-only because rte_cycles
implementation uses the timebase-frequency device-tree node read through
the proc file system. Such approach was chosen because Linux kernel
depends on the presence of this device-tree node.

The i40e PMD driver is disabled on RISC-V as the rv64gc ISA has no vector
operations.

The compilation of following modules has been disabled by this commit
and will be re-enabled in later commits as fixes are introduced:
net/ixgbe, net/memif, net/tap, example/l3fwd.

Sponsored-by: Frank Zhao <frank.zhao@starfivetech.com>
Sponsored-by: Sam Grove <sam.grove@sifive.com>
Signed-off-by: Michal Mazurek <maz@semihalf.com>
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
2 years ago mempool/cnxk: avoid batch op free for empty pools
Ashwin Sekhar T K [Thu, 28 Apr 2022 09:59:35 +0000 (15:29 +0530)]
mempool/cnxk: avoid batch op free for empty pools

Batch op data is initialized inside mempool alloc. But
in case of empty mempools, the alloc function is not
called and hence the initialization of batch op data is
also not done. So ensure the validity of batch op data
inside mempool free.

Signed-off-by: Ashwin Sekhar T K <asekhar@marvell.com>
2 years ago raw/cnxk_gpio: allow controlling existing GPIO
Tomasz Duszynski [Sat, 4 Jun 2022 14:03:23 +0000 (16:03 +0200)]
raw/cnxk_gpio: allow controlling existing GPIO

Controlling an existing GPIO should normally be frowned upon because
we want to avoid a situation where multiple contenders modify the GPIO
state simultaneously.

Still, there might be situations where this is actually needed,
restarting a killed application being one example.

So relax the current restrictions and respect user needs.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
2 years ago dma/idxd: fix error code for PCI device commands
Kevin Laatz [Fri, 8 Apr 2022 14:16:55 +0000 (15:16 +0100)]
dma/idxd: fix error code for PCI device commands

When sending a command to an idxd device via PCI BAR, the response from
HW is checked to ensure it was successful. The response was incorrectly
being negated before being returned by the function, meaning error codes
cannot be checked against the HW specification.

This patch fixes the return values of the function by removing the
negation.

Fixes: 9449330a8458 ("dma/idxd: create dmadev instances on PCI probe")
Fixes: 452c1916b0db ("dma/idxd: fix truncated error code in status check")
Cc: stable@dpdk.org
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2 years ago doc: improve ordering and remove old titles in prog guide
Harry van Haaren [Fri, 27 May 2022 13:45:01 +0000 (13:45 +0000)]
doc: improve ordering and remove old titles in prog guide

Move the "source_org" page to after overview, where it fits
better to explain the source-code layout of DPDK, before getting
into details of specific libraries such as EAL.

Also remove the older titles from the 3 documents which still had them.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
2 years ago doc: fix formatting and link in BPF library guide
Harry van Haaren [Fri, 27 May 2022 13:45:00 +0000 (13:45 +0000)]
doc: fix formatting and link in BPF library guide

Small improvements to the documentation based on Sphinx HTML doc output.

Fixes: 14b8f0bbe519 ("doc: add BPF library guide")
Fixes: b901d928361c ("bpf: support packet data load instructions")
Cc: stable@dpdk.org
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
2 years ago common/cnxk: allow building for generic arm64
Tomasz Duszynski [Sat, 4 Jun 2022 16:31:57 +0000 (18:31 +0200)]
common/cnxk: allow building for generic arm64

Allow building generic arm64 target using config/arm/arm64_armv8_linux_*
config which works on both cn9k and cn10k by relaxing cache line size
requirements a bit.

While at it move cache line checks to common place.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2 years ago maintainers: update for NXP devices
Nipun Gupta [Fri, 3 Jun 2022 05:50:39 +0000 (11:20 +0530)]
maintainers: update for NXP devices

Update and add maintainers for NXP devices and RAW device API.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2 years ago app/test: count tests skipped at setup
Anoob Joseph [Tue, 24 May 2022 14:16:11 +0000 (19:46 +0530)]
app/test: count tests skipped at setup

If the setup function returns TEST_SKIPPED, the logs would say the test
case is skipped while the summary count would consider it under failed
cases. Address this by counting such test cases under 'skipped'.

Signed-off-by: Anoob Joseph <anoobj@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
2 years ago build: add definitions for use as Meson subproject
Bruce Richardson [Fri, 6 May 2022 14:43:18 +0000 (15:43 +0100)]
build: add definitions for use as Meson subproject

To allow other projects to easily use DPDK as a subproject, add in the
necessary dependency definitions. Slightly different definitions are
necessary for static and shared builds, since for shared builds the
drivers should not be linked in, and the internal meson dependency
objects are more complete.

To use DPDK as a subproject fallback i.e. use installed DPDK if present,
otherwise the shipped one, the following meson statement can be used:

libdpdk = dependency('libdpdk', fallback: ['dpdk', 'dpdk_dep'])

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ben Magistro <koncept1@gmail.com>
Tested-by: Ben Magistro <koncept1@gmail.com>
2 years ago bus/fslmc: fix VFIO setup
Romain Delhomel [Fri, 3 Jun 2022 15:18:30 +0000 (17:18 +0200)]
bus/fslmc: fix VFIO setup

At device probe, the fslmc bus driver calls rte_vfio_get_group_fd() to
get a fd associated to a vfio group. This function first checks if the
group is already opened, else it opens /dev/vfio/%u, and increases the
number of active groups in default_vfio_cfg (which references the
default vfio container).

When adding the first group to a vfio_cfg, the caller is supposed to
pick an IOMMU type and set up DMA mappings for container, as it's done
by pci bus, but it is not done here. Instead, a new container is created
and used.

This prevents the pci bus driver, which uses the default_vfio_cfg
container, from configuring the container because
default_vfio_cfg->active_group > 1.

This patch fixes the issue by always creating a new container (and its
associated vfio_cfg) and binding the group to it.

Fixes: a69f79300262 ("bus/fslmc: support multi VFIO group")
Cc: stable@dpdk.org
Signed-off-by: Romain Delhomel <romain.delhomel@6wind.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2 years ago replace zero-length arrays with flexible ones
Bruce Richardson [Fri, 3 Jun 2022 11:16:23 +0000 (12:16 +0100)]
replace zero-length arrays with flexible ones

This patch replaces instances of zero-sized arrays, i.e. those at the end
of structures with "[0]", with the more standard syntax of "[]".
Replacement was done using a coccinelle script, with some reverts and
cleanup of whitespace afterwards.
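
For reference, a short sketch of the "[]" form, the C99 flexible array
member. struct demo_msg and demo_msg_new() are hypothetical names used only
for illustration and are not taken from the DPDK sources.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* The trailing array uses "[]" rather than the GNU-style "[0]". */
struct demo_msg {
	uint16_t len;
	uint8_t data[];
};

/* Allocate the header plus len payload bytes in a single block. */
static struct demo_msg *
demo_msg_new(const void *src, uint16_t len)
{
	struct demo_msg *m = malloc(sizeof(*m) + len);

	if (m == NULL)
		return NULL;
	m->len = len;
	memcpy(m->data, src, len);
	return m;
}
```

sizeof(struct demo_msg) does not include the payload, so the allocation
adds the payload length explicitly.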

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2 years ago cocci: add script for zero-length arrays in structs
Bruce Richardson [Fri, 3 Jun 2022 11:16:22 +0000 (12:16 +0100)]
cocci: add script for zero-length arrays in structs

Add script to replace [0] with [] when used at the end of a struct.
The script also includes an additional struct member to match against so
as to avoid issues with arrays with only a single zero-length element.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
2 years ago doc: describe OFS in ifpga guide
Wei Huang [Tue, 7 Jun 2022 09:07:24 +0000 (05:07 -0400)]
doc: describe OFS in ifpga guide

OFS (Open FPGA Stack) specification is introduced briefly.

Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2 years ago raw/ifpga: support OFS card probing
Wei Huang [Tue, 7 Jun 2022 09:07:23 +0000 (05:07 -0400)]
raw/ifpga: support OFS card probing

PAC N6000 is the first OFS platform; its device id is added to the ifpga
device support list.

On previous FPGA platforms such as Intel PAC N3000 and N5000, the FME DFL
(Device Feature List) starts from BAR0 by default, and the port DFL location
is indicated in the PORTn_OFFSET register in the FME. In the OFS
implementation, the FME DFL and port DFL locations can be defined
individually in the PCIe VSEC (Vendor Specific Extended Capabilities). In
this patch, the DFL definition is searched for in the VSEC, and the legacy
DFL is used only when the DFL VSEC is not present.

In the original DFL enumeration process, the AFU is expected to be located
in the port DFL, but this is not the case in the OFS implementation. In this
patch, enumeration can search for the AFU in any PF/VF which has no FME and
port.

Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2 years agoraw/ifpga: unregister interrupt on close
Wei Huang [Tue, 7 Jun 2022 09:07:22 +0000 (05:07 -0400)]
raw/ifpga: unregister interrupt on close

The ifpga driver provides an API, rte_pmd_ifpga_cleanup, to free the
software resources used by an ifpga card. The call chain of
rte_pmd_ifpga_cleanup is listed below.
rte_pmd_ifpga_cleanup()
  ifpga_rawdev_cleanup()
     rte_rawdev_pmd_release()
       rte_rawdev_close()
         ifpga_rawdev_close()

The interrupts are unregistered in ifpga_rawdev_destroy instead of
ifpga_rawdev_close, so rte_pmd_ifpga_cleanup cannot free the
interrupt resources as expected.

To fix this issue, interrupt unregistration is moved from
ifpga_rawdev_destroy to ifpga_rawdev_close. The resulting change to the
call chain of ifpga_rawdev_destroy is shown below.
ifpga_rawdev_destroy()
  ifpga_unregister_msix_irq()  // removed
  rte_rawdev_pmd_release()
    rte_rawdev_close()
      ifpga_rawdev_close()

Fixes: e0a1aafe2af9 ("raw/ifpga: introduce IRQ functions")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2 years agoraw/ifpga: remove virtual devices on close
Wei Huang [Tue, 7 Jun 2022 09:07:21 +0000 (05:07 -0400)]
raw/ifpga: remove virtual devices on close

Virtual devices created on an ifpga raw device are not removed
when the ifpga device is closed. To avoid a resource leak,
this patch introduces an ifpga virtual device remove function, so that
virtual devices are destroyed after the ifpga raw device is closed.

Fixes: ef1e8ede3da5 ("raw/ifpga: add Intel FPGA bus rawdev driver")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2 years agoraw/ifpga: remove experimental tag
Wei Huang [Tue, 7 Jun 2022 09:07:20 +0000 (05:07 -0400)]
raw/ifpga: remove experimental tag

These APIs were introduced in DPDK 21.05 and have been tested in several
releases, so the experimental tag can be formally removed.

Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Reviewed-by: Rosen Xu <rosen.xu@intel.com>
2 years agotest/threads: add unit test for get/set priority
Tyler Retzlaff [Tue, 24 May 2022 11:08:37 +0000 (04:08 -0700)]
test/threads: add unit test for get/set priority

Add unit tests to exercise and demonstrate rte_thread_{get,set}_priority().

Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
2 years agoeal: get/set thread priority per thread identifier
Tyler Retzlaff [Tue, 24 May 2022 11:08:36 +0000 (04:08 -0700)]
eal: get/set thread priority per thread identifier

Add functions for setting and getting the priority of a thread.
Priorities on multiple platforms are similarly determined by a priority
value and a priority class/policy.

Currently in DPDK most threads operate at the OS-default priority level
but there are cases when increasing the priority is useful. For
example, high performance applications may require elevated priority
levels.

For these reasons, EAL will expose two priority levels which are named
suggestively "normal" and "realtime_critical" and are computed as
follows:

  On Linux, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * policy SCHED_OTHER
      * priority value:   (sched_get_priority_min(SCHED_OTHER) +
                           sched_get_priority_max(SCHED_OTHER))/2;
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * policy SCHED_RR
      * priority value: sched_get_priority_max(SCHED_RR);

  On Windows, the following mapping is created:
    RTE_THREAD_PRIORITY_NORMAL corresponds to
      * class NORMAL_PRIORITY_CLASS
      * priority THREAD_PRIORITY_NORMAL
    RTE_THREAD_PRIORITY_REALTIME_CRITICAL corresponds to
      * class REALTIME_PRIORITY_CLASS (when running with privileges)
      * class HIGH_PRIORITY_CLASS (when running without privileges)
      * priority THREAD_PRIORITY_TIME_CRITICAL

Note that on Linux the resulting priority value will be 0, in
accordance with the documentation, which mentions that the value should
be 0 for the SCHED_OTHER policy.
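
The Linux mapping above can be sketched with the POSIX scheduling calls
it names. This is only an illustration of the arithmetic, not the EAL
implementation; the two helper names are hypothetical.

```c
#include <assert.h>
#include <sched.h>

/* Midpoint of the SCHED_OTHER range, as in the "normal" mapping. */
static int
normal_priority_value(void)
{
	int min = sched_get_priority_min(SCHED_OTHER);
	int max = sched_get_priority_max(SCHED_OTHER);

	assert(min != -1 && max != -1);	/* -1 signals an error */
	return (min + max) / 2;
}

/* Top of the SCHED_RR range, as in the "realtime_critical" mapping. */
static int
realtime_critical_priority_value(void)
{
	int max = sched_get_priority_max(SCHED_RR);

	assert(max != -1);		/* -1 signals an error */
	return max;
}
```

Applying the SCHED_RR value to a thread typically requires elevated
privileges, so a caller should be prepared for the request to be refused.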

Signed-off-by: Narcisa Vasile <navasile@linux.microsoft.com>
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
2 years agoeal: add seqlock
Mattias Rönnblom [Mon, 23 May 2022 14:23:46 +0000 (16:23 +0200)]
eal: add seqlock

A sequence lock (seqlock) is a synchronization primitive which allows
for data-race free, low-overhead, high-frequency reads, suitable for
data structures shared across many cores and which are updated
relatively infrequently.

A seqlock permits multiple parallel readers. A spinlock is used to
serialize writers. In cases where there is only a single writer, or
writer-writer synchronization is done by some external means, the
"raw" sequence counter type (and accompanying rte_seqcount_*()
functions) may be used instead.

To avoid resource reclamation and other issues, the data protected by
a seqlock is best off being self-contained (i.e., no pointers [except
to constant data]).

One way to think about seqlocks is that they provide means to perform
atomic operations on data objects larger than what the native atomic
machine instructions allow for.

DPDK seqlocks (and the underlying sequence counters) are not
preemption safe on the writer side. A thread preemption affects
performance, not correctness.

A seqlock contains a sequence number, which can be thought of as the
generation of the data it protects.

A reader will
  1. Load the sequence number (sn).
  2. Load, in arbitrary order, the seqlock-protected data.
  3. Load the sn again.
  4. Check if the first and second sn are equal, and even numbered.
     If they are not, discard the loaded data, and restart from 1.

The first three steps need to be ordered using suitable memory fences.

A writer will
  1. Take the spinlock, to serialize writer access.
  2. Load the sn.
  3. Store the original sn + 1 as the new sn.
  4. Perform loads and stores to the seqlock-protected data.
  5. Store the original sn + 2 as the new sn.
  6. Release the spinlock.

Proper memory fencing is required to make sure the first sn store, the
data stores, and the second sn store appear to the reader in the
mentioned order.

The sn loads and stores must be atomic, but the data loads and stores
need not be.
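
The reader and writer steps above can be sketched with C11 atomics as
follows. This is a minimal illustration, not the rte_seqlock
implementation: the type and function names are hypothetical, the
spinlock is a bare atomic_flag, and the protected data is accessed with
plain loads and stores as the description allows (strictly portable C11
code may prefer relaxed atomic accesses for the data).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

struct seqlock {
	atomic_uint sn;		/* odd while a write is in progress */
	atomic_flag wlock;	/* minimal spinlock serializing writers */
};

struct pair { uint64_t a, b; };	/* self-contained protected data */

static void
seqlock_init(struct seqlock *sl)
{
	atomic_init(&sl->sn, 0);
	atomic_flag_clear(&sl->wlock);
}

static void
seqlock_write(struct seqlock *sl, struct pair *dst, struct pair val)
{
	unsigned int sn;

	/* Writer step 1: take the spinlock. */
	while (atomic_flag_test_and_set_explicit(&sl->wlock,
			memory_order_acquire))
		;
	/* Steps 2 and 3: load sn and store sn + 1 (odd). */
	sn = atomic_load_explicit(&sl->sn, memory_order_relaxed);
	assert((sn & 1) == 0);	/* even whenever no writer is active */
	atomic_store_explicit(&sl->sn, sn + 1, memory_order_relaxed);
	/* Keep the first sn store ordered before the data stores. */
	atomic_thread_fence(memory_order_release);
	/* Step 4: update the protected data. */
	*dst = val;
	/* Step 5: store sn + 2 (even), ordered after the data stores. */
	atomic_store_explicit(&sl->sn, sn + 2, memory_order_release);
	/* Step 6: release the spinlock. */
	atomic_flag_clear_explicit(&sl->wlock, memory_order_release);
}

static struct pair
seqlock_read(struct seqlock *sl, const struct pair *src)
{
	struct pair copy;
	unsigned int begin, end;

	do {
		/* Reader step 1: load sn. */
		begin = atomic_load_explicit(&sl->sn, memory_order_acquire);
		/* Step 2: load the protected data. */
		copy = *src;
		/* Keep the data loads ordered before the second sn load. */
		atomic_thread_fence(memory_order_acquire);
		/* Step 3: load sn again. */
		end = atomic_load_explicit(&sl->sn, memory_order_relaxed);
		/* Step 4: retry if sn changed or a write was in progress. */
	} while (begin != end || (begin & 1));
	return copy;
}
```

The reader never writes to shared state, which is what keeps reads cheap
when the data is shared across many cores.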

The original seqlock design and implementation was done by Stephen
Hemminger. This is an independent implementation, using C11 atomics.

For more information on seqlocks, see
https://en.wikipedia.org/wiki/Seqlock

Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
2 years agoeal/ppc: fix compilation for musl
Duncan Bellamy [Sat, 14 May 2022 07:14:35 +0000 (08:14 +0100)]
eal/ppc: fix compilation for musl

musl lacks __ppc_get_timebase() but has __builtin_ppc_get_timebase()

Signed-off-by: Duncan Bellamy <dunk@denkimushi.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
2 years agoexamples/pipeline: fix build
Ali Alnubani [Thu, 2 Jun 2022 12:34:12 +0000 (15:34 +0300)]
examples/pipeline: fix build

This patch fixes the following build failure seen on Ubuntu 16.04
with gcc 5.4.0 because of uninitialized variable:
  [..]
  examples/pipeline/cli.c:2853:9: error: 'session_id' may be used
    uninitialized in this function [-Werror=maybe-uninitialized]
  [..]

Fixes: 172254555f9f ("examples/pipeline: support packet mirroring")

Signed-off-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
2 years agodma/idxd: add generic option for queue config
Kevin Laatz [Fri, 1 Apr 2022 10:35:00 +0000 (11:35 +0100)]
dma/idxd: add generic option for queue config

The device config script currently uses some defaults to configure
devices in a generic way.

With the addition of this option, users have more control over how
queues are configured.

Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
2 years agodma/hisilicon: support vchan status query
Chengwen Feng [Fri, 27 May 2022 03:40:55 +0000 (11:40 +0800)]
dma/hisilicon: support vchan status query

This patch adds support for vchan-status ops.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2 years agodma/hisilicon: enhance CQ scan robustness
Chengwen Feng [Fri, 27 May 2022 03:40:54 +0000 (11:40 +0800)]
dma/hisilicon: enhance CQ scan robustness

The CQ (completion queue) descriptors are updated by hardware, and then
scanned by the driver to retrieve the hardware completion status.

This patch enhances robustness as follows:
1. Replace while (true) with a finite loop to avoid a potential dead loop.
2. Check the csq_head field in the CQ descriptor to avoid status array
overflows.
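
The two measures can be sketched as below. The descriptor layout, queue
depths and function name are hypothetical and only loosely modeled on
the description; the real driver structures differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SQ_DEPTH 8u	/* hypothetical submission queue depth */
#define CQ_DEPTH 8u	/* hypothetical completion queue depth */

struct cq_desc {		/* hypothetical completion descriptor */
	bool valid;		/* set by hardware when the entry is filled */
	uint16_t csq_head;	/* submission index the completion refers to */
	uint8_t status;
};

/*
 * Scan at most CQ_DEPTH entries starting at *cq_head. An entry whose
 * csq_head is out of range is consumed but never used as an array index.
 * Returns the number of status values recorded.
 */
static unsigned int
scan_cq(struct cq_desc *cq, uint16_t *cq_head, uint8_t status[SQ_DEPTH])
{
	unsigned int recorded = 0;

	assert(*cq_head < CQ_DEPTH);
	for (unsigned int i = 0; i < CQ_DEPTH; i++) {	/* bounded loop */
		struct cq_desc *d = &cq[*cq_head];

		if (!d->valid)
			break;
		if (d->csq_head < SQ_DEPTH) {	/* guard the status array */
			status[d->csq_head] = d->status;
			recorded++;
		}
		d->valid = false;
		*cq_head = (uint16_t)((*cq_head + 1) % CQ_DEPTH);
	}
	return recorded;
}
```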

Fixes: 2db4f0b82360 ("dma/hisilicon: add data path")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2 years agotest/dma: check index when no DMA completed
Chengwen Feng [Fri, 27 May 2022 03:40:53 +0000 (11:40 +0800)]
test/dma: check index when no DMA completed

If no DMA request is completed, the ring_idx of the last completed
operation needs to be returned via the last_idx parameter. This patch
adds a test case for it.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Tested-by: Kevin Laatz <kevin.laatz@intel.com>
2 years agodma/hisilicon: fix index returned when no DMA completed
Chengwen Feng [Fri, 27 May 2022 03:40:52 +0000 (11:40 +0800)]
dma/hisilicon: fix index returned when no DMA completed

If no DMA request is completed, the ring_idx of the last completed
operation needs to be returned via the last_idx parameter. This patch
fixes it.
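
The expected behaviour can be sketched with a hypothetical per-channel
counter; the structure and function below are illustrative only and are
not the hisilicon driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct vchan {
	uint16_t next_done;	/* ring_idx of the next operation to complete */
};

/*
 * Report 'newly_done' completions. last_idx always receives the ring_idx
 * of the most recently completed operation, even when newly_done is 0.
 */
static uint16_t
report_completed(struct vchan *vc, uint16_t newly_done, uint16_t *last_idx)
{
	assert(last_idx != NULL);
	vc->next_done = (uint16_t)(vc->next_done + newly_done);
	*last_idx = (uint16_t)(vc->next_done - 1);
	return newly_done;
}
```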

Fixes: 2db4f0b82360 ("dma/hisilicon: add data path")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
2 years agoexamples/dma: add force minimal copy size parameter
Chengwen Feng [Sun, 24 Apr 2022 06:07:41 +0000 (14:07 +0800)]
examples/dma: add force minimal copy size parameter

This patch adds a force minimal copy size parameter
(-m/--force-min-copy-size), so that when copying by CPU or DMA, the real
copy size is the maximum of the mbuf's data_len and this parameter.
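
A minimal sketch of that size selection is shown below. The function
name and the buf_size bound are assumptions made for illustration; they
are not part of the example application.

```c
#include <assert.h>
#include <stdint.h>

/* Larger of the mbuf data_len and the forced minimum copy size. */
static uint32_t
effective_copy_size(uint32_t data_len, uint32_t force_min_copy_size,
		uint32_t buf_size)
{
	uint32_t size = data_len > force_min_copy_size ?
			data_len : force_min_copy_size;

	assert(size <= buf_size);	/* the copy must stay inside the buffer */
	return size;
}
```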

This parameter is designed for comparing the performance of CPU copy
and DMA copy. Users can send small packets at a high rate to drive
the performance test.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2 years agoexamples/dma: fix Tx drop statistics
Chengwen Feng [Sun, 24 Apr 2022 06:07:40 +0000 (14:07 +0800)]
examples/dma: fix Tx drop statistics

The Tx drop statistic was designed to be collected by the
rte_eth_dev_tx_buffer mechanism, but the application uses
rte_eth_tx_burst to send packets, so the Tx drop statistic
was not collected.

This patch removes the rte_eth_dev_tx_buffer mechanism to fix the problem.

Fixes: 632bcd9b5d4f ("examples/ioat: print statistics")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
2 years agoexamples/dma: fix MTU configuration
Huisong Li [Sun, 24 Apr 2022 06:07:39 +0000 (14:07 +0800)]
examples/dma: fix MTU configuration

The MTU in the dma app is configured from the 'max_frame_size' parameter,
which has a default value of 1518. It is not reasonable to use a frame
size directly as the MTU. This patch fixes it.
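
The difference between the two quantities can be sketched as below,
assuming plain untagged Ethernet overhead (14-byte header plus 4-byte
CRC). The constant and function names are hypothetical, and the actual
patch may derive the overhead differently, for example per device.

```c
#include <assert.h>
#include <stdint.h>

#define EX_ETHER_HDR_LEN 14u	/* two MAC addresses plus EtherType */
#define EX_ETHER_CRC_LEN 4u	/* frame check sequence */

/* Convert a maximum frame size to an MTU under the assumption above. */
static uint32_t
frame_size_to_mtu(uint32_t max_frame_size)
{
	assert(max_frame_size > EX_ETHER_HDR_LEN + EX_ETHER_CRC_LEN);
	return max_frame_size - EX_ETHER_HDR_LEN - EX_ETHER_CRC_LEN;
}
```

With the default frame size of 1518 this gives an MTU of 1500.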

Fixes: 1bb4a528c41f ("ethdev: fix max Rx packet length")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
2 years agodmadev: add telemetry
Sean Morrissey [Fri, 1 Apr 2022 15:01:35 +0000 (15:01 +0000)]
dmadev: add telemetry

Telemetry commands are now registered through the dmadev library
for the gathering of DSA stats. The corresponding callback
functions for listing dmadevs and providing info and stats for a
specific dmadev are implemented in the dmadev library.

An example usage can be seen below:

Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 22.03.0-rc2", "pid": 2956551, "max_output_len": 16384}
Connected to application: "dpdk-dma"
--> /
{"/": ["/", "/dmadev/info", "/dmadev/list", "/dmadev/stats", ...]}
--> /dmadev/list
{"/dmadev/list": [0, 1]}
--> /dmadev/info,0
{"/dmadev/info": {"name": "0000:00:01.0", "nb_vchans": 1, "numa_node": 0,
"max_vchans": 1, "max_desc": 4096, "min_desc": 32, "max_sges": 0,
"capabilities": {"mem2mem": 1, "mem2dev": 0, "dev2mem": 0, ...}}}
--> /dmadev/stats,0,0
{"/dmadev/stats": {"submitted": 0, "completed": 0, "errors": 0}}

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Conor Walsh <conor.walsh@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
Tested-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Chengwen Feng <fengchengwen@huawei.com>
2 years agodmadev: clarify visibility of completed jobs
Bruce Richardson [Thu, 12 May 2022 15:11:22 +0000 (16:11 +0100)]
dmadev: clarify visibility of completed jobs

Clarify that once an operation has completed, the output of that
operation is visible to all cores.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
2 years agokni: fix build
Thomas Monjalon [Mon, 6 Jun 2022 10:39:49 +0000 (12:39 +0200)]
kni: fix build

A previous fix had #else instead of #endif.
The error message is:
kernel/linux/kni/kni_net.c: In function ‘kni_net_rx_normal’:
kernel/linux/kni/kni_net.c:448:2: error: #else after #else

Bugzilla ID: 1025
Fixes: c98600d4bed6 ("kni: fix build with Linux 5.18")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
2 years agonet/mlx5: support ESP item on Windows
Raja Zidane [Thu, 2 Jun 2022 13:03:08 +0000 (16:03 +0300)]
net/mlx5: support ESP item on Windows

ESP item is not supported on Windows, yet it is expanded from the
expansion graph when trying to create default flow to RSS all packets.

Support ESP item match (without ability to match on SPI field on Windows).
Split ESP validation per OS.

Signed-off-by: Raja Zidane <rzidane@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years agonet/mlx5: fix entry size in construct data ipool
Michael Baum [Thu, 2 Jun 2022 11:39:16 +0000 (14:39 +0300)]
net/mlx5: fix entry size in construct data ipool

The mlx5_action_construct_data structure memory is managed by ipool
named acts_ipool.

The size of one entry in this ipool is mistakenly defined as the size of
the rte_flow_hw structure.
This size is used when resetting the allocated entry. When the size is
incorrect, the reset touches memory that does not belong to the entry.

This patch defines the correct size.

Fixes: f13fab23922b ("net/mlx5: add flow jump action")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years agocommon/mlx5: remove unused lcore check
Suanming Mou [Tue, 31 May 2022 01:25:48 +0000 (04:25 +0300)]
common/mlx5: remove unused lcore check

Now that non-lcore list operations are supported, a non-lcore index is
converted to MLX5_LIST_NLCORE. In that case, there is no longer any need
to check whether the lcore index is -1.

This commit removes the unused lcore check in list.

Fixes: 7e1cf892711b ("common/mlx5: support list non-lcore operations")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years agonet/iavf: remove dead code
Qi Zhang [Mon, 30 May 2022 11:36:29 +0000 (07:36 -0400)]
net/iavf: remove dead code

Remove the call to an unimplemented function wrapped by
RTE_LIBRTE_IAVF_DEBUG_TX_DESC_RING.

Fixes: 1e728b01120c ("net/iavf: rework Tx path")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Radu Nicolau <radu.nicolau@intel.com>
2 years agonet/iavf: increase reset complete wait count
Qiming Yang [Mon, 30 May 2022 05:34:58 +0000 (13:34 +0800)]
net/iavf: increase reset complete wait count

The kernel iavf driver has a patch that increases the completion
wait time to reduce occurrences of the "Reset never finished" case.
Follow this change in the DPDK iavf driver.
Kernel reference commit:
8e3e4b9da7e6 ("iavf: increase reset complete wait time")

Fixes: 22b123a36d07 ("net/avf: initialize PMD")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: fix outer L4 checksum in scalar Rx
Wenjing Qiao [Fri, 27 May 2022 08:09:55 +0000 (04:09 -0400)]
net/ice: fix outer L4 checksum in scalar Rx

In the scalar datapath, ol_flag shows RTE_MBUF_F_RX_OUTER_L4_CKSUM_UNKNOWN,
which is an error. This patch fixes the bug.

Fixes: 94005e4640a7 ("net/ice: fix build with 16-byte Rx descriptor")
Cc: stable@dpdk.org
Signed-off-by: Wenjing Qiao <wenjing.qiao@intel.com>
Reported-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix initialization with quanta configuration
Wenjun Wu [Fri, 27 May 2022 05:38:51 +0000 (13:38 +0800)]
net/iavf: fix initialization with quanta configuration

When kernel driver does not support quanta size configuration,
it will return error. We do not expect it to occur in default
initialization process.

Fixes: b14e8a57b9fe ("net/iavf: support quanta size configuration")

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/igc: support I226 devices
Qiming Yang [Wed, 25 May 2022 05:57:50 +0000 (13:57 +0800)]
net/igc: support I226 devices

Added I226 Series device ID in igc driver and updated igc guide
document for new devices.

Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix device stop
Radu Nicolau [Mon, 23 May 2022 12:04:36 +0000 (13:04 +0100)]
net/iavf: fix device stop

Move security context destroy from device stop to device close function.
Deleting the context on device stop can prevent the application from
properly cleaning and releasing resources.

Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix device initialization without inline crypto
Radu Nicolau [Wed, 20 Apr 2022 11:03:01 +0000 (12:03 +0100)]
net/iavf: fix device initialization without inline crypto

When the inline crypto feature VF capability flag is set also check if the
feature is enabled, otherwise the initialization will fail even when
the inline crypto is not required.

Fixes: 6bc987ecb860 ("net/iavf: support IPsec inline crypto")
Cc: stable@dpdk.org
Signed-off-by: Radu Nicolau <radu.nicolau@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix race condition with Rx timestamp offload
Wenjun Wu [Mon, 23 May 2022 04:49:00 +0000 (12:49 +0800)]
net/iavf: fix race condition with Rx timestamp offload

In multi-core cases for Rx timestamp offload, if packets arrive
too fast, the aq command to get the phc time will be left pending.

This patch adds a spinlock to fix this issue. To avoid the phc time being
frequently overwritten, the related variables are moved to the
iavf_rx_queue structure, so each queue handles timestamp calculation by
itself.

Fixes: b5cd735132f6 ("net/iavf: enable Rx timestamp on flex descriptor")
Fixes: 33db16136e55 ("net/iavf: improve performance of Rx timestamp offload")

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: support VF RSS flow rule with raw pattern
Ting Xu [Mon, 23 May 2022 02:31:38 +0000 (10:31 +0800)]
net/iavf: support VF RSS flow rule with raw pattern

Enable Protocol Agnostic Flow Offloading for RSS hash in VF. It supports
raw pattern flow rule creation in VF based on Parser Library feature. VF
parses the spec and mask input of raw pattern, and passes it to kernel
driver to create the flow rule. Current rte_flow raw API is utilized.

command example:
RSS hash for ipv4-src-dst:
flow create 0 ingress pattern raw pattern spec
00000000000000000000000008004500001400004000401000000000000000000000
pattern mask
0000000000000000000000000000000000000000000000000000ffffffffffffffff /
end actions rss queues end / end

Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: enable flow rule with raw pattern
Junfeng Guo [Mon, 23 May 2022 02:31:36 +0000 (10:31 +0800)]
net/iavf: enable flow rule with raw pattern

This patch enabled Protocol Agnostic Flow (raw flow) Offloading Flow
Director (FDIR) in AVF, based on the Parser Library feature and the
existing rte_flow `raw` API.

The input spec and mask of raw pattern are first parsed via the
Parser Library, and then passed to the kernel driver to create the
flow rule.

Similar to the ice PMD's implementation, each raw flow requires:
1. A byte string of raw target packet bits.
2. A byte string containing the mask of the target packet.

Here is an example:
FDIR matching ipv4 dst addr with 1.2.3.4 and redirect to queue 3:

flow create 0 ingress pattern raw \
pattern spec \
00000000000000000000000008004500001400004000401000000000000001020304 \
pattern mask \
000000000000000000000000000000000000000000000000000000000000ffffffff \
/ end actions queue index 3 / mark id 3 / end

Note that the mask of some key bits (e.g., 0x0800 to indicate ipv4 proto)
is optional in our case. To avoid redundancy, we just omit the mask
of 0x0800 (with 0xFFFF) in the mask byte string example. The prefix
'0x' for the spec and mask byte (hex) strings is also omitted here.
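
To make the spec/mask pairing concrete, the sketch below decodes such
hex strings and applies the usual rule that a packet matches when it
equals the spec on every bit set in the mask. The helper names are
hypothetical, and this software comparison only illustrates the
semantics; the driver instead hands both buffers to the Parser Library
and the kernel driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value of one hex digit, or -1 if 'c' is not a hex digit. */
static int
hex_val(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Decode a hex string without "0x" prefix; returns byte count or -1. */
static int
hex_to_bytes(const char *hex, uint8_t *out, size_t out_size)
{
	size_t n = strlen(hex);

	if (n % 2 != 0 || n / 2 > out_size)
		return -1;
	for (size_t i = 0; i < n / 2; i++) {
		int hi = hex_val(hex[2 * i]);
		int lo = hex_val(hex[2 * i + 1]);

		if (hi < 0 || lo < 0)
			return -1;
		out[i] = (uint8_t)(hi << 4 | lo);
	}
	return (int)(n / 2);
}

/* 1 if 'pkt' equals 'spec' on every bit set in 'mask', else 0. */
static int
raw_match(const uint8_t *pkt, const uint8_t *spec, const uint8_t *mask,
		size_t len)
{
	assert(pkt != NULL && spec != NULL && mask != NULL);
	for (size_t i = 0; i < len; i++)
		if ((pkt[i] ^ spec[i]) & mask[i])
			return 0;
	return 1;
}
```

With the FDIR example above, only the trailing ffffffff portion of the
mask is set, so only the bytes that hold 01020304 in the spec take part
in the comparison.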

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agocommon/iavf: support raw packet in protocol header
Junfeng Guo [Mon, 23 May 2022 02:31:34 +0000 (10:31 +0800)]
common/iavf: support raw packet in protocol header

The patch extends the existing virtchnl_proto_hdrs structure to allow a VF
to pass a pair of buffers as packet data and mask that describe
a match pattern of a filter rule. Then the kernel PF driver is requested
to parse the pair of buffers and figure out low level hardware metadata
(ptype, profile, field vector.. ) to program the expected FDIR or RSS
rules.

Also update the proto_hdrs template init to align with the virtchnl changes.

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agodoc: update matching versions in i40e guide
Qiming Yang [Fri, 20 May 2022 07:04:41 +0000 (15:04 +0800)]
doc: update matching versions in i40e guide

Add the recommended matching list for i40e PMD in DPDK 21.05,
21.08, 21.11 and 22.03. Also add a known issue that occurs when the FW
is upgraded to version 8.4 or higher.

Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix Rx queue interrupt setting
Ke Zhang [Fri, 20 May 2022 03:00:23 +0000 (03:00 +0000)]
net/iavf: fix Rx queue interrupt setting

For Rx queue interrupt setting, when the VF Rx interrupt is
disabled (INTENA=0), there are two ways to write back
descriptors to host memory:

1) Set the WB_ON_ITR bit to 0 in the Interrupt Dynamic Control Register:
Completed descriptors are posted to host memory according to
the internal descriptor cache policy (in other words when a
full cache line is available for write-back).

An internal descriptor is 16 bytes or 32 bytes in size, and a cache
line is 64 bytes or 128 bytes according to the datasheet:
PCIe Global Config 2 - GLPCI_CNF2 (0x000BE004; RO)
so a full cache line can contain 4 packets, which means the
network card sends 4 packets to the host when a full cache line
is available.

2) Set the WB_ON_ITR bit to 1 in the Interrupt Dynamic Control Register:
Completed descriptors also trigger the ITR. Following ITR
expiration, all leftover completed descriptors are posted to
host memory.

The network card sends a packet to the host even if only one
descriptor is completed.

Change from 1) to 2) to make sure the VF sends the packet to the host
even if only one Rx packet is ready in hardware.
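
The "4 packets per cache line" figure follows from dividing the cache
line size by the descriptor size. The short sketch below works through
the combinations mentioned above; the function name is hypothetical.

```c
#include <assert.h>

/* Number of descriptors that fill one cache line. */
static unsigned int
descs_per_cache_line(unsigned int cache_line_size, unsigned int desc_size)
{
	assert(desc_size != 0 && cache_line_size % desc_size == 0);
	return cache_line_size / desc_size;
}
```

A 64-byte line with 16-byte descriptors, or a 128-byte line with 32-byte
descriptors, gives 4. The other two combinations give 2 and 8, so the
number of descriptors held back in mode 1) depends on the configuration.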

Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt")
Cc: stable@dpdk.org
Signed-off-by: Ke Zhang <ke1x.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix mbuf release in multi-process
Ke Zhang [Thu, 19 May 2022 07:36:04 +0000 (07:36 +0000)]
net/iavf: fix mbuf release in multi-process

In a multi-process environment, the subprocess operates on the
shared memory and changes the function pointer of the main process.
As a result, the main process fails to find the address of the function
when releasing mbufs, which results in a crash.

Fixes: 319c421f3890 ("net/avf: enable SSE Rx Tx")
Cc: stable@dpdk.org
Signed-off-by: Ke Zhang <ke1x.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/iavf: fix queue start exception handling
Qiming Yang [Thu, 19 May 2022 05:01:56 +0000 (05:01 +0000)]
net/iavf: fix queue start exception handling

If any queue start fails during dev_start, all started queues
should be stopped.
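
The rollback pattern can be sketched as follows. All names here are
hypothetical stand-ins, including the fail_at hook used to simulate a
failing queue; they are not the iavf driver functions.

```c
#include <assert.h>
#include <stdbool.h>

#define NB_QUEUES 4		/* hypothetical queue count */

static bool started[NB_QUEUES];
static int fail_at = -1;	/* simulation hook: queue whose start fails */

static int
queue_start(int q)
{
	if (q == fail_at)
		return -1;
	started[q] = true;
	return 0;
}

static void
queue_stop(int q)
{
	assert(q >= 0 && q < NB_QUEUES);
	started[q] = false;
}

/* Start every queue; on failure, stop the ones already started. */
static int
start_all_queues(void)
{
	for (int q = 0; q < NB_QUEUES; q++) {
		if (queue_start(q) != 0) {
			while (--q >= 0)	/* roll back in reverse order */
				queue_stop(q);
			return -1;
		}
	}
	return 0;
}
```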

Fixes: 69dd4c3d0898 ("net/avf: enable queue and device")
Cc: stable@dpdk.org
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/i40e: fix max frame size config at port level
Wenxuan Wu [Wed, 18 May 2022 04:59:14 +0000 (04:59 +0000)]
net/i40e: fix max frame size config at port level

Previously, max frame size could only be set when the link was up, and
the wait time was 1 sec. A 10G_BASET startup time longer than 1s would
result in failure.

Actually, max frame size of media type I40E_MEDIA_TYPE_BASET can be set
regardless of link status.

This patch omits the link status check for 10G_MEDIA_TYPE_BASET.

Fixes: a4ba77367923 ("net/i40e: enable maximum frame size at port level")
Cc: stable@dpdk.org
Signed-off-by: Wenxuan Wu <wenxuanx.wu@intel.com>
Acked-by: Yuying Zhang <yuying.zhang@intel.com>
2 years agonet/iavf: fix crash after VF reset failure
Yiding Zhou [Thu, 12 May 2022 10:48:51 +0000 (18:48 +0800)]
net/iavf: fix crash after VF reset failure

Some pointers will be set to NULL when iavf_dev_reset() fails,
for example vf->vf_res, vf->vsi_res, vf->rss_key, etc.
APIs that access these NULL pointers will trigger a segfault.

This patch adds a closed flag to indicate that the VF is closed,
and rejects API calls in this state to avoid a coredump.
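
The guard can be sketched as below. The structure, the function and the
choice of -EIO as the error code are assumptions made for illustration
and are not taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct vf_state {		/* hypothetical, loosely modeled on the text */
	bool closed;		/* set when the VF is closed */
	int *vf_res;		/* NULL after a failed reset */
};

/* Reject the call while closed instead of dereferencing vf_res. */
static int
vf_get_res(const struct vf_state *vf, int *out)
{
	if (vf->closed)
		return -EIO;
	assert(vf->vf_res != NULL);
	*out = *vf->vf_res;
	return 0;
}
```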

Fixes: e74e1bb6280d ("net/iavf: enable port reset")
Cc: stable@dpdk.org
Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: fix MTU info for DCF
Kevin Liu [Fri, 8 Apr 2022 01:43:08 +0000 (01:43 +0000)]
net/ice: fix MTU info for DCF

In the DCF module, the maximum and minimum
MTU value settings are missing.

This patch adds the settings of the maximum and
minimum MTU to correctly calculate the MTU value.

Fixes: bf89db4409bb ("net/ice: complete device info get in DCF")
Cc: stable@dpdk.org
Signed-off-by: Kevin Liu <kevinx.liu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice/base: fix direction of flow that matches any
Yuying Zhang [Thu, 12 May 2022 07:42:00 +0000 (07:42 +0000)]
net/ice/base: fix direction of flow that matches any

Both Tx and Rx packets were dropped when creating a drop-any rule
for the ingress direction only. The root cause is that the recipe
did not contain direction flag matching.

This patch adds the packet flag which represents the direction of
source interface to solve the issue.

Fixes: 92317961a731 ("net/ice: support drop any and steer all to queue")
Cc: stable@dpdk.org
Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: add warning for unsupported TM configuration
Wenjun Wu [Tue, 17 May 2022 05:09:32 +0000 (13:09 +0800)]
net/ice: add warning for unsupported TM configuration

Priority configuration is enabled in level 3 and level 4.
Weight configuration is enabled in level 4.
This patch adds warning log for unsupported priority
and weight configuration.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: support queue weight configuration
Wenjun Wu [Tue, 17 May 2022 05:09:31 +0000 (13:09 +0800)]
net/ice: support queue weight configuration

This patch adds queue weight configuration support.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: support queue and queue group priority config
Wenjun Wu [Tue, 17 May 2022 05:09:30 +0000 (13:09 +0800)]
net/ice: support queue and queue group priority config

This patch adds queue and queue group priority configuration
support. The highest priority is 0, and the lowest priority
is 7.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice: support queue and queue group bandwidth limit
Ting Xu [Tue, 17 May 2022 05:09:29 +0000 (13:09 +0800)]
net/ice: support queue and queue group bandwidth limit

Enable basic TM API for PF only. Support for adding profiles and queue
nodes. Only max bandwidth is supported in profiles. Profiles can be
assigned to target queues and queue groups. To set up the exact queue
group, we need to reconfigure the topology by deleting and then
recreating queue nodes. Only TC0 is valid.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice/base: support priority configuration of exact node
Wenjun Wu [Tue, 17 May 2022 05:09:28 +0000 (13:09 +0800)]
net/ice/base: support priority configuration of exact node

This patch adds priority configuration support of the exact
node in the scheduler tree.
This function does not need additional calls to the scheduler
lock.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice/base: support queue BW allocation configuration
Wenjun Wu [Tue, 17 May 2022 05:09:27 +0000 (13:09 +0800)]
net/ice/base: support queue BW allocation configuration

This patch adds BW allocation support of queue scheduling node
to support WFQ in queue level.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ice/base: fix getting sched node from ID type
Wenjun Wu [Tue, 17 May 2022 05:09:26 +0000 (13:09 +0800)]
net/ice/base: fix getting sched node from ID type

The function ice_sched_get_node_by_id_type needs to be called
with the scheduler lock held. However, the function
ice_sched_get_node also requests the scheduler lock,
which causes a deadlock.

This patch replaces function ice_sched_get_node with
function ice_sched_find_node_by_teid to solve this problem.
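
The locking issue can be sketched with a lock-taking lookup and a
lock-free lookup that expects the caller to hold the lock. Everything
below is a hypothetical model (a boolean stands in for the non-recursive
scheduler lock); it is not the ice base code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NB_NODES 4

struct sched {
	bool lock_held;		/* stand-in for a non-recursive lock */
	uint32_t teid[NB_NODES];
};

static void
sched_lock(struct sched *s)
{
	assert(!s->lock_held);	/* a real lock would deadlock here */
	s->lock_held = true;
}

static void
sched_unlock(struct sched *s)
{
	s->lock_held = false;
}

/* Lookup that takes no lock; the caller must already hold it. */
static int
find_node_by_teid(const struct sched *s, uint32_t teid)
{
	assert(s->lock_held);
	for (int i = 0; i < NB_NODES; i++)
		if (s->teid[i] == teid)
			return i;
	return -1;
}

/* Lookup that takes the lock itself; unsafe to call with it held. */
static int
get_node(struct sched *s, uint32_t teid)
{
	int idx;

	sched_lock(s);
	idx = find_node_by_teid(s, teid);
	sched_unlock(s);
	return idx;
}

/* Runs with the lock held, so it uses the lock-free lookup. */
static int
get_node_by_id_type(const struct sched *s, uint32_t teid)
{
	assert(s->lock_held);
	return find_node_by_teid(s, teid);
}
```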

Fixes: 93e84b1bfc92 ("net/ice/base: add basic Tx scheduler")
Cc: stable@dpdk.org
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years agonet/ixgbe: add option for link up check on pin SDP3
Jeff Daly [Tue, 10 May 2022 18:57:25 +0000 (14:57 -0400)]
net/ixgbe: add option for link up check on pin SDP3

1ca05831b9b added a check that SDP3 (used as a TX_DISABLE output to the
SFP cage on these cards) is not asserted, to avoid incorrectly reporting
link up when the SFP's laser is turned off.

ff8162cb957 limited this workaround to fiber ports.

This patch:
* Adds devarg 'fiber_sdp3_no_tx_disable', since not all fiber ixgbe
  devices use SDP3 as TX_DISABLE.

Fixes: 1ca05831b9b ("net/ixgbe: fix link status")
Fixes: ff8162cb957 ("net/ixgbe: fix link status")
Cc: stable@dpdk.org
Signed-off-by: Jeff Daly <jeffd@silicom-usa.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>