git.droids-corp.org - dpdk.git/log

net/txgbe: replace forbidden functions

Remove rte_panic(), and use rte_atomic_thread_fence()
instead of rte_smp_[r/w]mb.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/mlx5: fix Tx queue stop state

The Tx queue stop API doesn't call the PMD callback when the state of
the queue is stopped.
The drivers should update the state to be stopped when the queue stop
callback is done successfully or when the port is stopped.
The drivers should update the state to be started when the queue start
callback is done successfully or when the port is started.

The driver wrongly didn't update the state as started when the port
start callback was done which kept the state as stopped.
Following call to a queue stop API was not completed by ethdev layer
because the state is already stopped.

Move the state update from the Tx queue setup to the port start
callback.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix Tx queue reference count check

The Txq refcnt 1 value means that there is no real reference to the
queue and only the control configurations are saved in the struct.

The patch below wrongly didn't consider it and caused a leak in the Txq
object resource.

Revert the specific update in the refcnt.

Fixes: b5c8b3e70cdf ("net/mlx5: use C11 atomics for RxQ/TxQ refcounts")
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: free MR resource on device DMA unmap

mlx5 PMD created the MR (Memory Region) resource on the
mlx5_dma_map call to make the memory available for DMA
operations. On the mlx5_dma_unmap call the MR resource
was not freed but inserted to MR Free list for further
garbage collection.
Actual MR resource destroying happened on device stop
call. That caused the runtime out of memory in case of
application performed multiple DMA map/unmap calls.

The fix immediately frees the MR resource on mlx5_dma_unmap
call not engaging the list. The export for mlx5_mr_free
function from common PMD part is added as well.

Fixes: 989e999d9305 ("net/mlx5: support PCI device DMA map and unmap")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix SQ resources release in error flow

Fix in error flow in which the function
mlx5_txq_release_devx_sq_resources is called twice by setting the
release object to NULL after the first call

The incorrect flow was introduced in the work done on generic
object creation.

Once an error flow inside mlx5_txq_create_devx_sq_resources
occurs the function will call mlx5_txq_release_devx_sq_resources
however the released pointers are not set to NULL after the release
calls and undefined memory is released in the same call in
mlx5_txq_release_devx_resources.

This results in calls to MLX5_FREE with
an already released memory addresses and assert in mlx5_release_dbr:

EAL: Error: Invalid memory
EAL: Error: Invalid memory

PANIC in mlx5_txq_release_devx_sq_resources():
assert "(mlx5_release_dbr(&txq_obj->txq_ctrl->priv->dbrpgs,
mlx5_os_get_umem_id (txq_obj->sq_dbrec_page->umem),
txq_obj->sq_dbrec_offset)) == 0" failed

The fix is setting the released pointers to NULL after the first release
calls.

Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix Rx queue object allocation with MPRQ

The space for extra buffer pointers used by MPRQ routines was not
allocated in Rx queue object creation structure causing memory
corruption.
The fix allocates the extra memory for the pointers in case MPRQ is
engaged.

Fixes: a0a45e8af723 ("net/mlx5: configure Rx queue for buffer split")
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/bnxt: remove useless prefetches

Prefetching only helps performance if it is done several 100
instructions before the actual use. The purpose of the prefetch
is to read ahead, it doesn't help if the next instruction
will block.

The code in the bnxt driver was doing these unnecessary prefetches.

Fixes: 2eb53b134aae ("net/bnxt: add initial Rx code")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

config: enable packet prefetching with Meson

With Make build system, RTE_PMD_PACKET_PREFETCH was enabled
by default. It got lost when transitioning to Meson build
system.

In order to avoid performance changes, this patch enables
packet prefetching in rte_config.h.

Fixes: 9314afb68a53 ("drivers: add infrastructure for meson build")
Cc: stable@dpdk.org
Reported-by: Marvin Liu <yong.liu@intel.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>

gro: fix packet type detection with IPv6 tunnel

For VxLAN packets, GRO will mistakenly reassemble them
if inner L3 is IPv6, inner L4 is TCP or UDP, and outer L3
is IPv4 because the value of IS_IPV4_VXLAN_TCP4/UDP4_PKT
is true for them.

This fix makes sure IS_IPV4_TCP_PKT, IS_IPV4_UDP_PKT,
IS_IPV4_VXLAN_TCP4_PKT and IS_IPV4_VXLAN_UDP4_PKT can make
decision precisely.

Fixes: e2d811063673 ("gro: support VXLAN UDP/IPv4")
Fixes: 1ca5e6740852 ("gro: support UDP/IPv4")
Fixes: 9e0b9d2ec0f4 ("gro: support VxLAN GRO")
Fixes: 0d2cbe59b719 ("lib/gro: support TCP/IPv4")
Cc: stable@dpdk.org
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>

net/mlx5: fix UAR used by ASO queues

The dedicated UAR was allocated for the ASO queues.
The shared UAR created for Tx queues can be used instead.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

vdpa/mlx5: fix UAR allocation

This patch provides the UAR allocation workaround for the
hosts where UAR allocation with Write-Combining memory
mapping type fails.

Fixes: 8395927cdfaf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

regex/mlx5: fix UAR allocation

This patch provides the UAR allocation workaround for the
hosts where UAR allocation with Write-Combining memory
mapping type fails.

Fixes: b34d816363b5 ("regex/mlx5: support rules import")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: share UAR allocation routine

This patch introduces the routine to allocate the UAR (User
Access Region) with various memory mapping types. The origin
patch being fixed provided the UAR allocation workaround
for the mlx5 net PMD only. As it was found the other mlx5
based drivers - vdpa and regex are affected by the issue
as well and must be fixed.

Fixes: a0bfe9d56f74 ("net/mlx5: fix UAR memory mapping type")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

usertools: fix CPU layout script to be PEP8 compliant

The pycodestyle tool flagged the following issues, which are now fixed.

$ pycodestyle cpu_layout.py
cpu_layout.py:18:5: E722 do not use bare 'except'
cpu_layout.py:62:14: E231 missing whitespace after ','

Fixes: deb87e6777c0 ("usertools: use sysfs for CPU layout")
Fixes: c9208f1dc967 ("usertools: fix CPU layout with python 3")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>

raw/ioat: fix queue index calculation

Coverity flags a possible problem where the 8-bit wq_idx value may have
errors when shifted and sign-extended to pointer size. Since this can
only occur if the shift index is larger than any expected value from
hardware, it's unlikely to cause any real problems, but we can eliminate
any possible errors, and the coverity issue, by explicitly typecasting
the uint8_t value to uintptr_t before any shift operations occur.

Coverity issue: 363695
Fixes: a33969462135 ("raw/ioat: fix work-queue config size")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

build: fix MS linker flag with meson 0.54

Meson versions >= 0.54.0 include support for handling /implib
with msvc link. Specifying it explicitly causes failures when
linking against the dll. Tested using Link 14.27.29112.0 and
Clang 11.0.0.

There were a number of changes to the way that import libraries
are handled between 0.47.1 and 0.54.0. Only make the change
for >= 0.54.0, leaving the behaviour unchanged for earlier
versions.

Fixes: 77cca7ccec13 ("build: fix drivers library path on Windows")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Khoa To <khot@microsoft.com>

build: fix install on Windows

Don't run symlink-drivers-solibs.sh as part of 'install' because
Windows doesn't support shell scripts.

Fixes: 82ba4416dd87 ("build: add module definition files for Windows")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>

examples/qos_sched: fix usage string

The short option written for interactive mode is --i in usage of
this qos_sched example. Actually, it is -i.

Fixes: cfd5c971e5e ("examples/qos_sched: add stats")
Cc: stable@dpdk.org
Signed-off-by: Ibtisam Tariq <ibtisam.tariq@emumba.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

table: fix exact match SWX table lookup

Fix for the exact match lookup function.

Fixes: d0a00966618b ("table: add exact match SWX table")
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

doc: describe the SWX pipeline type

Add the new SWX pipeline type to the Programmer's Guide.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

crypto/armv8: replace meson option with pkg-config support

With pkg-config support available within AArch64crypto library,
meson option 'armv8_crypto_dir' can be removed.
PKG_CONFIG_PATH environment variable should be set appropriately
to use the crypto library.

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

eal/arm: fix clang build of native target

When doing Clang build with '-mcpu=native' on N1 platform, build failed
with:
../lib/librte_eal/arm/include/rte_atomic_64.h:76:39:
error: instruction requires: lse
__ATOMIC128_CAS_OP(__cas_128_release, "caspl")

This is because native detection for Neoverse N1 was added in Clang-11.
Prior version of Clang's assembler doesn't know LSE support on hardware.
Fixed this for Clang earlier than version 11 by specifying architecture
for assembler.
Referred to [1] for this fix.

Fixes: 7e2c3e17fe2c ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e0d5896bd356cd577f9710a02d7a474cdf58426b

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

vfio: use static window sizing for sPAPR IOMMU

The SPAPR IOMMU requires that a DMA window size be defined before memory
can be mapped for DMA. Current code dynamically modifies the DMA window
size in response to every new memory allocation which is potentially
dangerous because all existing mappings need to be unmapped/remapped in
order to resize the DMA window, leaving hardware holding IOVA addresses
that are temporarily unmapped. The new SPAPR code statically assigns
the DMA window size on first use, using the largest physical memory
memory address when IOVA=PA and the highest existing memseg virtual
address when IOVA=VA.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

devtools: fix directory filter in forbidden token check

checkpatches.sh current complains on a patch [1] adding
ALLOW_EXPERIMENTAL_API in an example while this check is for app, lib
and drivers directories:

Warning in examples/ethtool/ethtool-app/Makefile:
Using experimental build flag for in-tree compilation

The regexp on entering files concerned by this filter is incorrect.
In the [1] case, the file full name is matched against "app" rather than
"+++ b/app".

1: https://patchwork.dpdk.org/patch/83902/

Fixes: 7413e7f2aeb3 ("devtools: alert on new calls to exit from libs")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>

examples: stop processing meson file if build impossible

Once it has been determined that an example cannot be built, there is
little point in continuing to process the meson.build file for that
example, so we can use subdir_done() to return to the calling file.
This can potentially prevent problems where later statement in the file
may cause an error on systems where the app cannot be built, e.g. on
Windows or FreeBSD.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

examples/l2fwd-keepalive: skip meson build if no librt

When librt is not present on a system, processing the meson.build file
for this example application causes an error. Make the library
non-mandatory and just mark the example as unbuildable if it is
not present.

Fixes: 89f0711f9ddf ("examples: build some samples with meson")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

examples: fix flattening directory layout on install

By installing the examples one-by-one in a loop in the examples
meson.build file we effectively flattened out the structure of the examples
folder and omitted some common and shared subfolders that were never
directly built. Instead, we can remove the loop and just have the whole
"examples" folder installed as-is in a single statement, preserving its
directory structure, and thereby fixing the build of a number of the
examples.

Fixes: 2daf565f91b5 ("examples: install as part of ninja install")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

mbuf: move pool pointer in first half

According to the Technical Board decision
(http://mails.dpdk.org/archives/dev/2020-November/191859.html),
the mempool pointer in the mbuf struct is moved
from the second to the first half.
It may increase performance in some cases
on systems having 64-byte cache line, i.e. mbuf split in two cache lines.

Due to this change, all fields after "pool" are moved up.
Hopefully no vector data path is impacted.

Moving this field gives more space to dynfield1
while dropping the temporary dynfield0.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
0    void *                            buf_addr;         /*   0 +  8 */
1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
5.5  uint64_t             union        hash;             /*  44 +  8 */
6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
7    struct rte_mempool *              pool;             /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
8    struct rte_mbuf *                 next;             /*  64 +  8 */
9    uint64_t             union        tx_offload;       /*  72 +  8 */
10    struct rte_mbuf_ext_shared_info * shinfo;           /*  80 +  8 */
11    uint16_t                          priv_size;        /*  88 +  2 */
      uint16_t                          timesync;         /*  90 +  2 */
11.5  uint32_t                          dynfield1[9];     /*  92 + 36 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>

drivers: disable OCTEON TX2 in 32-bit build

The drivers for OCTEON TX2 are not supported in 32-bit mode.

Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerinj@marvell.com>

devtools: allow custom set of examples in build test

To test the installation process of DPDK using "ninja install"
test-meson-builds.sh builds a subset of the examples using "make". To allow
more flexibility for people testing, allow the set of examples chosen for
this make test to be overridden using variable "DPDK_BUILD_TEST_EXAMPLES"
in the environment.

Since a number of example apps link against drivers directly even for
shared builds, we need to ensure that LD_LIBRARY_PATH points to the main
DPDK lib folder so any dependencies of those drivers can be found e.g. that
the PCI/vdev bus driver .so is found. [All drivers are symlinked from
drivers dir back to lib dir on install, so only one dir rather than two is
needed in the path.]

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>

devtools: fix x86-default build test install env

The x86-default environment was loaded after installing this target.
I did not see any problem with it, yet we should load corresponding
environment before installing a target.

Fixes: bd253daa7717 ("devtools: fix test of ninja install")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>

devtools: reduce build test verbosity

The default verbosity of test-meson-builds.sh is to be quiet.
In order to better apply the verbosity policy, some file descriptors
are open to redirect to stdout or /dev/null accordingly.

The target variable and meson/ninja commands are printed in verbose modes.
The installation commands are printed only in very verbose mode.
The examples build commands are printed only in very verbose mode.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

devtools: fix build test config inheritance from env

The variables DPDK_MESON_OPTIONS, PATH, PKG_CONFIG_PATH,
CPPFLAGS, CFLAGS and LDFLAGS can be customized in the config file
loaded by devtools/load-devel-config at each build.
The configuration can be adjusted per target thanks to the value set
in the DPDK_TARGET variable.

PKG_CONFIG_PATH is specific to each target, so it must be empty
before configuring each build from the file according to DPDK_TARGET.
Inheriting a default PKG_CONFIG_PATH for all targets does not make sense
and is prone to confusion.

DPDK_MESON_OPTIONS might take a global initial value from environment
to customize a build test from the shell. Example:
DPDK_MESON_OPTIONS="b_lto=true"
Some target-specific options can be added in the configuration file:
DPDK_MESON_OPTIONS="$DPDK_MESON_OPTIONS kernel_dir=$MYKERNEL"

Fixes: 272236741258 ("devtools: load target-specific compilation environment")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: David Marchand <david.marchand@redhat.com>

usertools: fix pmdinfo parsing

This script inspects an ELF file (binary or shared library) and its
linked dependencies by following DT_NEEDED tags.
So far a simple librte_pmd prefix was used as a filter to only parse
DPDK drivers dependencies.
While the reason is not clear from the commitlog of the patch that
introduced this filter, it was probably added for performance reasons,
since going through all dependencies can be quite long.
Testing with a DPDK built before the driver name changes:
- running the script takes ~0.3s with the filter,
- running the script takes ~9s without the filter,

Now that we changed the driver library names, it becomes more difficult
to identify only DPDK drivers, but we can just filter on the librte_
prefix to identify DPDK libraries: the script later checks for the
PMD_INFO_STRING string in .rodata and it is enough to differentiate the
DPDK drivers from the other DPDK libraries.

Running the script with this patch takes ~0.5s.

A debug message was logged for each inspected file, it gives no useful
information and is removed.

Fixes: a20b2c01a7a1 ("build: standardize component names and defines")
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Robin Jarry <robin.jarry@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

doc: add instructions for building 32-bit DPDK

For users with 32-bit applications who wish to use DPDK we need to provide
instructions on creating a 32-bit build of DPDK with meson. Therefore add a
section with this information to the GSG.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>

devtools: test 32-bit build

It's reasonably common for patches to have issues when built on 32-bits, so
to prevent this, we can add a 32-bit build (if supported) to the
"test-meson-builds.sh" script. The tricky bit is using a valid
PKG_CONFIG_LIBDIR, so for now we use two common possibilities for where that
should point to in order to get a successful build.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

version: 20.11-rc3

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

app/testpmd: revert max Rx packet length adjustment

The fix of max_rx_pkt_len for allowing VLAN packets in all cases
was breaking configuration of some drivers. Example with virtio:

Ethdev port_id=0 max_rx_pkt_len 11229 > max valid value 9728
Fail to configure port 0

Trying to fix the logic was revealing other issues in some drivers.
That's why it is decided to revert.

The workaround for the original issue would be
to set the MTU explicitly from the application
with rte_eth_dev_set_mtu().
See RFC: https://patches.dpdk.org/patch/83756/

Fixes: f6870a7ed6b3 ("app/testpmd: fix max Rx packet length for VLAN packet")
Cc: stable@dpdk.org
Reported-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

mbuf: clean up comments and prefix

The mbuf header files had some commenting style errors that affected the
API documentation.
Also, the RTE_ prefix was missing on a macro and a definition.

Note: This patch does not touch the offload and attachment flags that are
also missing the RTE_ prefix.

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

cmdline: avoid name clash with Windows system types

cmdline_numtype member names clash with Windows system identifiers.
Add RTE_ prefix to cmdline constants to avoid this and possible
future conflicts.

Suggested-by: Ranjit Menon <ranjit.menon@intel.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Jie Zhou <jizh@microsoft.com>
Tested-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

test/lpm: avoid code duplication in RCU perf tests

Avoid code duplication by combining single and multi threaded tests

Also, enable support for more than 2 writers

Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>

test/lpm: remove unneeded checks in RCU perf tests

Remove redundant error checking for reader threads
since they never return error.

Fixes: eff30b59cc2e ("test/lpm: add RCU performance tests")
Cc: stable@dpdk.org
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>

test/lpm: report errors in RCU perf tests

Return error if Add/Delete fail in multiwriter perf test
Return error if single or multi writer test fails

Fixes: eff30b59cc2e ("test/lpm: add RCU performance tests")
Cc: stable@dpdk.org
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>

test/lpm: fix cycle calculation in RCU perf tests

Fix incorrect calculations for LPM adds, LPM deletes,
and average cycles in RCU QSBR perf tests

Since, rcu qsbr tests run for 'RCU_ITERATIONS' and not
'ITERATIONS', replace 'ITERATIONS' with 'RCU_ITERATIONS'
for calculating adds, deletes, and cycles.

Also, for multi-writer perf test, each writer only writes
half of NUM_LDEPTH_ROUTE_ENTRIES.
For 2 writers, total adds (or deletes) should be
(RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES) instead of
(2 * RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES).

Since, for both the single and multi writer tests, total adds/deletes
is equal to (RCU_ITERATIONS * NUM_LDEPTH_ROUTE_ENTRIES),
this has been replaced with a macro 'TOTAL_WRITES' and furthermore,
'g_writes' has been removed since it is always a fixed value
equal to TOTAL_WRITES.

Fixes: eff30b59cc2e ("test/lpm: add RCU performance tests")
Cc: stable@dpdk.org
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>

eal: fix MCS lock and ticketlock headers install

Add missing arch-specific headers in meson.build.

Fixes: 2173f3333b61 ("mcslock: add MCS queued lock implementation")
Fixes: ca49b92079df ("ticketlock: enable generic ticketlock on all arch")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

version: 20.11-rc2

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

test/telemetry: fix typo at beginning of line

A "+" symbol was incorrectly placed at the beginning of a line,
this is now removed.

Fixes: 52af6ccb2b39 ("telemetry: add utility functions for creating JSON")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>

app/flow-perf: configure rule batches

Currently, flow-perf measures the performance of
rule installation/deletion operations by breaking
down the entire number of operations into windows
of fixed size (i.e., 100000 operations per window).
Then, flow-perf measures the total time per window
and computes an average time across all windows.

This commit allows flow-perf users to configure
the number of rules per window instead of using
a fixed pre-compiled value. To do so, users must
pass --rules-batch=N, where N is the number of
rules per window (or batch).
For consistency reasons, flow_count variable is
now renamed to rules_count. This variable is the
total number of rules to be installed/deleted.

For example, if a user wants to measure how much
time it takes to install 1M rules in a certain NIC,
he/she can input:
--rules-count=1000000
This way flow-perf will break down 1M flow rules into
10 batches of 100k flow rules each (this is the default
batch size) and compute an average across the 10
measurements.
Now, if the user modifies the number of rules per
batch as follows:
--rules-count=1000000 --rules-batch=500000
then flow-perf will break down 1M flow rules into
2 batches of 500k flow rules each and compute the
average across the 2 measurements.

Finally, this commit also adds default variables
to the usage function instead of hardcoded values.

Signed-off-by: Georgios Katsikas <katsikas.gp@gmail.com>
Acked-by: Wisam Jaddo <wisamm@nvidia.com>

doc: remove obsolete deprecation notice for power library

Remove notice announcing an already-implemented change.

In 19.05, rte_power_set_env was changed to return -1 in cases where
the environment was already set up, and for the same release, a
deprecation notice was added.
This patch removes that notice.

The API change was tested by calling rte_power_set_env twice. The first
call succeeded, and the second call failed, as expected.

Fixes: 5a5f3178d4a8 ("power: return error when environment already set")
Cc: stable@dpdk.org
Signed-off-by: David Hunt <david.hunt@intel.com>

doc: fix typo in KNI guide

The typo "withe" should have been "with the". This is now fixed.

Fixes: 89397a01ce4a ("kni: set default carrier state of interface")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>

mbuf: fix dynamic fields and flags with multiprocess

The dynamic flag management is broken if rte_mbuf_dynflag_lookup()
is done in a secondary process because the local pointer to
the memzone is not ever initialized.

Fix it by using the same checks as dynfield_register().
I.e if shared memory zone has not been looked up already,
then discover it.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

license: remove dual prefix

This patch removes the dual keyword from dual license
definitions to avoid confusion. As the *dual* word is
not required to be added SPDX license.

Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>

fix spellings that Lintian complains about

Fixes: 103809d032cd ("app/test-fib: add test application for FIB")
Fixes: 1265b5372d9d ("net/hns3: add some definitions for data structure and macro")
Fixes: a85e378cc606 ("net/ixgbe/base: add debug traces")
Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 4e9c73e96e83 ("net/netvsc: add Hyper-V network device")
Fixes: 86a2265e59d7 ("qede: add SRIOV support")
Fixes: 1db4d2330bc8 ("net/virtio-user: check negotiated features before set")
Cc: stable@dpdk.org
Signed-off-by: Luca Boccassi <luca.boccassi@microsoft.com>

common/mlx5: split PCI relaxed ordering for read and write

The current DevX implementation of the relaxed ordering feature is
enabling relaxed ordering usage only if both relaxed ordering read AND
write are supported. In that case both relaxed ordering read and write
are activated.

This commit will optimize the usage of relaxed ordering by enabling it
when the read OR write features are supported. Each relaxed ordering
type will be activated according to its own capability bit.

This will align the DevX flow with the verbs implementation of
ibv_reg_mr when using the flag IBV_ACCESS_RELAXED_ORDERING

Fixes: 53ac93f71ad1 ("net/mlx5: create relaxed ordering memory regions")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/hinic/base: fix log info for PF command channel

When PF command channel is error, the variables in the log has been
cleared, which is not printed yet.

Fixes: 214164a6bf7f ("net/hinic/base: remove unused function parameters")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>

net/hinic/base: support two or more AEQS for chip

For device initialize, driver only supports four aeqs before,
and now driver can supports two or more aeqs from chip
config file.

Fixes: 611faa5f46cc ("fix various typos found by Lintian")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>

ethdev: fix data type for port id

The ethdev port id is 16 bits now. This patch fixes the data type
of the variable for 'pid', which changing from uint32_t to uint16_t.

RTE_MAX_ETHPORTS is the maximum number of ports, which customized by
the user. To avoid 16-bit unsigned integer overflow, the valid value
of RTE_MAX_ETHPORTS should be set from 0 to UINT16_MAX, and it is
safer to cut one more port from space.

So we use RTE_BUILD_BUG_ON() to ensure that RTE_MAX_ETHPORTS is less
to UINT16_MAX.

Fixes: 5b7ba31148a8 ("ethdev: add port ownership")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

doc: update release notes for iavf

Update release notes with feature of outer IP hash for GTPC and GTPU.

Fixes: 6cd2d6adc783 ("net/iavf: support outer IP hash for GTPC")
Fixes: 262100a34a38 ("net/iavf: support outer IP hash for no inner GTPU")
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/iavf: fix protocol field for RSS hash

Add PROT field into IPv4 and IPv6 protocol headers for rss hash.

Fixes: 91f27b2e39ab ("net/iavf: refactor RSS")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

ethdev: fix using Rx split config before null check

Coverity flags that 'rx_conf' variable is used before
it's checked for NULL. This patch fixes this issue.

Coverity issue: 363570
Fixes: 4ff702b5dfa9 ("ethdev: introduce Rx buffer split")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

app/testpmd: fix protocol size for copy

The rte_flow_item_eth and rte_flow_item_vlan items are refined.
The structs do not exactly represent the packet bits captured on the
wire anymore so set raw_encap/decap commands should only copy real
header instead of the whole struct.

Replace the rte_flow_item_* with the existing corresponding rte_*_hdr.

Fixes: 09315fc83861 ("ethdev: add VLAN attributes to ethernet and VLAN items")
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/mlx5: fix tunnel flow destroy

Flow destructor tired to access flow related resources after the
flow object memory was already released and crashed dpdk process.

The patch moves flow memory release to the end of destructor.

Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Signed-off-by: Gregory Etelson <getelson@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix CQE decompression for Arm and PowerPC

The recent Rx code refactoring moved the incrementing
of the CQ completion index out of the rxq_cq_decompress_v()
function to the rxq_burst_v() function.

The advancing of CQ completion index was removed in SSE
version only causing Neon and Altivec Rx bursts to stall.

Remove the incrementation of CQ completion index for all
the architectures in order to fix the stall.

Fixes: 1ded26239aa0 ("net/mlx5: refactor vectorized Rx")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/bnxt: fix VXLAN decap offload

This patch fixes a couple of scenarios which were overlooked
by the patch which added VXLAN rte_flow offload support.

1. When a PMD application queries for flow counters, it could ask PMD
   to reset the counters when the application is doing the counters
   accumulation. In this case, PMD should not accumulate rather reset
   the counter.

2. Some of the PMD applications may set the protocol field in the IPv4
   spec but don't set the mask. So, consider the mask in the proto
   value calculation.

4. The cached tunnel inner flow is not getting installed in the
   context of tunnel outer flow create because of the wrong
   error code check when tunnel outer flow is installed in the
   hardware.

5. When a dpdk application offloads the same tunnel inner flow on
   all the uplink ports, other than the first one the driver rejects
   the rest of them. However, the first tunnel inner flow request
   might not be of the correct physical port. This is fixed by
   caching the tunnel inner flow entry for all the ports on which
   the flow offload request has arrived on. The tunnel inner flows
   which were cached on the irrelevant ports will eventually get
   aged out as there won't be any traffic on these ports.

Fixes: 675e31d877b6 ("net/bnxt: support VXLAN decap offload")
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: fix PAM4 link negotiation

In some instances link was not coming up if PAM4 signaling is enabled.
Added check to disable autoneg if FW indicates auto speeds are zero.
Use default auto speeds if PAM4 auto speeds is not set.
Added a fix for forced link setting.

Fixes: c23f9ded0391 ("net/bnxt: support 200G PAM4 link")
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

app/testpmd: support shared flow action attribute transfer

This attribute helps PMDs to tell actions supposed to work
on the so-called hardware e-switch level from regular ones.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>

ethdev: introduce transfer attribute to shared action conf

In a flow rule, attribute "transfer" means operation level
at which both traffic is matched and actions are conducted.

Add the very same attribute to shared action configuration.
If a driver needs to prepare HW resources in two different
ways, depending on the operation level, in order to set up
an action, then this new attribute will indicate the level.
Also, when handling a flow rule insertion, the driver will
be able to turn down a shared action if its level is unfit.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Ori Kam <orika@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrey Vesnovaty <andreyv@nvidia.com>

app/testpmd: fix max Rx packet length for VLAN packet

When the max Rx packet length is smaller than the sum of MTU size and
ether overhead size, it should be enlarged, otherwise the VLAN packets
will be dropped.

Fixes: 35b2d13fd6fd ("net: add rte prefix to ether defines")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/hns3: cleanup includes

Some header files have included by others. Also,
some header files have a header file self-contained
error will trigger building warning. As a result,
it is unnecessary and move it into the correct
location.

Beside, here also remove some unused lines.

Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: check quantity limiter support before using it

If hardware does not support QL (quantity limiter), the int_ql_max
is 0, software should confirm ql_value is less than int_ql_max
before write QL register. This patch add check of int_ql_max
value from firmware and delete the unused variable coalesce_mode.

Fixes: 27911a6e62e5 ("net/hns3: add Rx interrupts compatibility")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: fix configurations of port-level scheduling rate

Scheduling rate of port-level in hns3 PF driver configured to
hardware is obtained from firmware, which determines the
bandwidth capability of the port. The rate in firmware is
generally configured with the maximum value for network engine
supporting multiple rates, such as 10G and 25G. It may cause
the following issues:
1) When a 10G optical module is used on the network engine, scheduling
   rate of this port will also be configured to hardware with 25G.
   However, the MAC rate of this port is 10G. In this case, it is
   unreasonable that the port scheduling rate is different from the MAC
   rate.
2) If default speed in firmware is not the maximum value, the 25G port
   may not reach the capability of the port.

Therefore, we fix configurations of port-level scheduling rate
according to updating of MAC link speed.

Fixes: 59fad0f32135 ("net/hns3: support link update operation")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: support VXLAN-GPE TSO and checksum

Kupeng920 support tso and checksum offload for VXLAN_GPE with
the next protocol id 3(i.e., Ethernet).

Kupeng930 support TSO and checksum offload for VXLAN_GPE with
the next protocol id 1,2,3(i.e., IPv4, IPv6 and Ethernet).

This patch add support for this tunnel type.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: fix Tx checksum with fixed header length

Currently, the header length of all the layers are fixed, It would
lead to a csum error when the header length changed.

This patch fixes above problem by using the header length in mbuf
instead of the fixed header length to perform the TX cksum offload.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: fix Tx checksum outer header prepare

Currently, there are two mistakes in Tx checksum outer header prepare.
1) Check whether the packet outer header is IPV4 based on PKT_TX_IPV4
   which is incorrect.
2) For HIP08, the outer UDP cksum could not be offloaded. And driver
   should ensure the outer udp cksum filed set to 0. In current code,
   PKT_TX_UDP_CKSUM is used to determine whether the outer layer of
   the packet is a UDP header. Actually, for tunnel TSO, the flag will
   never be set.

For the first mistake, it is fixed by replacing PKT_TX_IPV4 with
PKT_TX_OUTER_IPV4. And the protocol number in L3 header is used to check
whether the outer L4 header is UDP.

Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations")
Fixes: 6dca716c9e1d ("net/hns3: support TSO")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: limit promiscuous mode for VF

For Kunpeng920, both tx and rx promisc is set when the promisc mode
is enabled. In other words, all the ingress packets and the packets sent
from the PF and other VFs on the same physical port will be copied
to the function which set promisc mode on.

Kunpeng930 support to turn off the tx unicast promisc. A limit promisc
mode is introduced, which means turn off the tx unicast promisc when
promisc is set.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hinic: fix SCTP checksum error

For SCTP checksum offload, pmd driver does not parse payload offset
info, which may cause hardware calculate SCTP checksum failed.

Fixes: 8c8b61234ffd ("net/hinic: refactor checksum functions")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>

net/hinic: fix outer L3 length parse

This patch fixes outer_l3_len parse error when
PKT_TX_OUTER_IP_CKSUM is not set, which does not affect
checksum function, just be consistent with mbuf meta
information description.

The outer_l3_len is calculated wrong because 'vlan_hdr' is calculated
wrong, 'vlan_hdr' fixed and code refactored.

Fixes: 8c8b61234ffd ("net/hinic: refactor checksum functions")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>

doc: fix hyperlink in igc guide

The hyperlink in the IGC documentation showed the whole link in italics
and was not clickable. This is now fixed to have a clickable label.

Fixes: 66fde1b943eb ("net/igc: add skeleton")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

app/testpmd: do not allow dynamic change of core number

When the number of forwarding cores changed in runtime, the issue may
be encountered:
If the nbcore set little than current nbcore, the forwarding thread
will still running on the extra cores. Therefore, trying to stop
forwarding will hang testpmd, since it will wait for the extra cores to
stop.

So do not allow to change nbcore number when forwarding is running.

Fixes: 0c0db76f42ed ("app/testpmd: separate forward config setup from display")
Cc: stable@dpdk.org
Signed-off-by: Zhenghua Zhou <zhenghuax.zhou@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

raw/ifpga: use trusted buffer to free

In rte_fpga_do_pr, calling function read() may taints argument buffer
which turn to an untrusted value as argument of rte_free().

Coverity issue: 279449
Fixes: ef1e8ede3da5 ("raw/ifpga: add Intel FPGA bus rawdev driver")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

raw/ifpga: terminate string filled by readlink with null

readlink() does not terminate string, add a null character at the end
of the string if readlink() succeeds.

Coverity issue: 362820
Fixes: 9c006c45d0c5 ("raw/ifpga: scan PCIe BDF device tree")
Cc: stable@dpdk.org
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/iavf: fix supported RSS type

When a RSS rule with symmetric hash function, the RSS type shouldn't
carry with l3/l4 SRC/DST_ONLY. This patch adds invalid RSS type check
for the case.

Fixes: 91f27b2e39ab ("net/iavf: refactor RSS")
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: delete unsupported ptypes in default hash set

Ptypes for GTPU with inner SCTP are not supported in current DDP pkg.
Thus, delete them in the default hash set config function.
Also clean up the rss vsi when calling the hash set config function.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/mlx5: allow age modes combination

ASO age action mode is not supported in group 0 while counter base age
action mode supports group 0.

Allow using the 2 modes of age action in parallel, so group 0 flows will
use counter base age actions and group > 0 flows will use ASO age
actions.

Currently, counter base age action doesn't support shared action API so
group 0 flows cannot share age actions.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>

net/mlx5: support shared age action

Add support for rte_flow shared action API for ASO age action.

First step here to support validate, create, query and destroy.

The support is only for age ASO mode.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>

net/mlx5: optimize shared RSS action memory

The RSS shared action was saved in flow memory by a pointer.
It means that every flow memory includes 8B only for optional shared
RSS case.

Move the RSS objects to be used by indexed pool which reduces the flow
handle memory to 4B.

So, now, the shared action handler is also just a 4B index.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>

net/mlx5: support flow hit action for aging

A new ASO (Advanced Steering Operation) feature was added in the last
mlx5 adapters to support flow hit detection.

Using this new steering action, the driver can detect flow traffic hit
and to reset this indication any time.

The ASO age action cannot support flows in table 0.

Add support for flow aging action in rte_flow using this new feature.

The counter aging mode will be taken only when the ASO feature is not
supported for the user flow groups.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Signed-off-by: Matan Azrad <matan@nvidia.com>

common/mlx5: add definitions for ASO flow hit

This patch adds different PRM definitions, related to ASO flow hit
feature, in MLX5 PMD code.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: add glue function to create flow hit action

Add glue function to create the flow hit action using DV API,
if rdma-core support exists.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: add read ASO flow hit HCA capability

Read and store the device capability of FLOW_HIT_ASO general object,
using the DevX API.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: use general object type for cap index

PRM defines the general object types using positive numbers.
The same values are used as index for the relevant bit in HCA
capabilities general_obj_types bit mask.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: add DevX API to create ASO flow hit object

Add DevX API to create ASO flow hit object.

Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: support flow tag and packet header miniCQEs

CQE compression allows us to save the PCI bandwidth and improve
the performance by compressing several CQEs together to a miniCQE.
But the miniCQE size is only 8 bytes and this limits the ability
to successfully keep the compression session in case of various
traffic patterns.

The current miniCQE format only keeps the compression session alive
in case of uniform traffic with the Hash RSS as the only difference.
There are requests to keep the compression session in case of tagged
traffic by RTE Flow Mark Id and mixed UDP/TCP and IPv4/IPv6 traffic.
Add 2 new miniCQE formats in order to achieve the best performance
for these traffic patterns: Flow Tag and Packet Header miniCQEs.

The existing rxq_cqe_comp_en devarg is modified to specify the
desired miniCQE format. Specifying 2 selects Flow Tag format
for better compression rate in case of RTE Flow Mark traffic.
Specifying 3 selects Checksum format (existing format for MPRQ).
Specifying 4 selects L3/L4 Header format for better compression
rate in case of mixed TCP/UDP and IPv4/IPv6 traffic.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx: remove separate ABI version for glue libraries

The glue libraries are tightly bound to the mlx drivers of a dpdk
version and are packaged with them.

Keeping a separate ABI version prevents us from installing two versions
of dpdk.
Maintaining this separate version just adds confusion.
Align the glue library ABI version to the global ABI version.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5/linux: replace malloc and free in glue

This commit replaces mlx5_malloc and mlx5_free calls with Linux calls
malloc and free in file mlx5_glue.c.
The current mlx5_malloc calls have no flags, alignment or socket
selection, so they are equivalent to calling malloc.  Rdma-core itself
is using malloc.  When using mlx5_malloc the glue library is dependent
on common_mlx5 library which must be compiled first.  Not doing so and
in case ibverbs_link=dlopen will result in compilation failure:
mlx5_glue.c: undefined reference to `mlx5_malloc'.
To make all of this simpler and remove the common_mlx5 dependency - this
commit does the alloc/free replacements.

Fixes: 66914d19d135 ("common/mlx5: convert control path memory to unified malloc")
Cc: stable@dpdk.org
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: fix DevX SQ object creation

Fix wrong assignment of allow_multi_pkt_send_wqe
in mlx5_devx_cmd_create_sq.
The incorrect assignment was introduced in the initial
mlx5_devx_cmd_create_sq implementation.

sq_attr->flush_in_error_en is
mistakenly assigned to both allow_multi_pkt_send_wqe and
flush_in_error_en, it was detected during Windows PMD development.

The fix is simply assigning the right value in mlx5_devx_cmd_create_sq
to sq_attr->allow_multi_pkt_send_wqe

Fixes: ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: fix glue library name

The MLX5 glue library wasn't following the standard
'librte_<class>_<name>.so' naming.

Fixes: a20b2c01a7a1 ("build: standardize component names and defines")
Signed-off-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx4: fix glue library name

The MLX4 library wasn't being successfully initialized with
-Dibverbs_link=dlopen because it expected a shared object file
with a different name.

Fixes: a20b2c01a7a1 ("build: standardize component names and defines")
Signed-off-by: Ali Alnubani <alialnu@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/bnxt: fix pass by reference

Pass 'eth_da' pointer instead of pass by value to bnxt_rep_port_probe()

Coverity issue: 360841
Fixes: 322bd6e70272 ("net/bnxt: add port representor infrastructure")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

net/bnxt: add a failure log

Check and log an error message if switch domain free API fails

Coverity issue: 362757
Fixes: 322bd6e70272 ("net/bnxt: add port representor infrastructure")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>