git.droids-corp.org - dpdk.git/log

net/mlx5: fix RSS flow item expansion for GRE key

The support of RSS expansion for the flows with IPv6 GRE item was added
to mlx5 PMD. And the GRE KEY item support in expansion was missed
and the flows with GRE and GRE KEY items were expanded in the wrong
way causing the flow creation failure.

This patch adds the RSS expansion support for GRE KEY and mlx5 PMD
performs RSS expansion correctly.

Fixes: 048f0d45e342 ("net/mlx5: support RSS expansion for IPv6 GRE")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: fix mkey attributes initialization

The crypto driver added new fields to the mkey attributes struct:
crypto_en and set_remote_rw.

The entire mkey struct was not initialized, only specific fields in it,
which caused the new added fields not to be initialized resulting in a
mkey creation error.

This is fixed by initializing the entire mkey attributes struct to 0
which will prevent this issue from reoccurring if any fields are added
to the mkey struct in the future.

Fixes: 0111a74e13dd ("common/mlx5: adjust DevX mkey fields for crypto")

Signed-off-by: Shiri Kuzin <shirik@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/ice/base: remove dead code in capabilities parsing

Execution cannot reach this statement: "break;".
Remove the unnecessary if branch.

Coverity issue: 370613
Fixes: 2913bc4155d2 ("net/ice/base: sign external device package programming")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice: remove redundant RSS configuration for GTPU

Originally, the default RSS for GTPU is inner fields. Now, we hope outer
RSS for GTPU to be the default.

Since RSS for IPv4, RSS for IPv6, RSS for UDP and RSS for TCP can cover
the cases of outer RSS for GTPU, this patch deletes redundant default
RSS configurations for GTPU.

Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/iavf: remove dead code in Rx function selection

Execution cannot reach the expression "use_avx2"
inside this statement: "if (!use_sse && !use_avx2 &..."."

The check is useless.

Coverity issue: 370606
Fixes: bb3ef9aaa478 ("net/iavf: fix Rx function selection")

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: fix IP RSS configuration template

To enable IP fragment RSS hash, ICE_FLOW_SEG_HDR_IPV_FRAG is added to the
IP RSS configuration template, together with ICE_FLOW_SEG_HDR_IPV_OTHER.
It will cause error when associating flow profile. And packet id field
for RSS is not correctly added when IP fragment is enabled. To fix this
issue, this patch only selects one of the above two segment header types
based on RSS types.

Fixes: f1ea76eb6394 ("net/ice: support RSS hash for IP fragment")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: fix Tx queue vector setup

If vector mode is not allowed for Tx, no need to perform vector
related setup for Tx queue.

The patch deferred vector setup for Tx queue to the place that
vector mode is confirmed to be allowed.

Fixes: 28f9002ab67f ("net/ice: add Tx AVX512 offload path")

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: fix memory allocation wrapper

This is reported by our internal covscan:

1. dpdk-20.11/drivers/net/ice/base/ice_switch.c:4214: sign_extension:
Suspicious implicit sign extension: "s_rule_size" with type "u16" (16
bits, unsigned) is promoted in "num_unicast * s_rule_size" to type "int"
(32 bits, signed), then sign-extended to type "unsigned long" (64 bits,
unsigned).
If "num_unicast * s_rule_size" is greater than 0x7FFFFFFF, the upper bits
of the result will all be 1.

#  4212|    s_rule_size = ICE_SW_RULE_RX_TX_ETH_HDR_SIZE;
#  4213|    s_rule = (struct ice_aqc_sw_rules_elem *)
#  4214|-> ice_calloc(hw, num_unicast, s_rule_size);
#  4215|    if (!s_rule) {
#  4216|    status = ICE_ERR_NO_MEMORY;

Even if this condition is not likely to happen, in any case, it is more
straightforward to rely on the existing rte_calloc.

Fixes: 5f0978e96220 ("net/ice/base: add OS specific implementation")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

test/event: fix result of unsupported periodic timer

Test case setup should return -ENOTSUP, if it is not supported.

Fixes: 7d761b07fcf6 ("test/event: add unit tests for periodic timer")
Cc: stable@dpdk.org
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

app/eventdev: fix lcore parsing skipping last core

The last lcore declared in the list is also a valid lcore in the list.

Fixes: 32d7dbf269be ("app/eventdev: fix overflow in lcore list parsing")
Cc: stable@dpdk.org
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

event/dpaa2: remove unused macros

Fixes: 653242c3375a ("event/dpaa2: add self test")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

power: fix sanity checks for guest channel read

In function power_guest_channel_read_msg, 'lcore_id' is used before
validity check, which may cause buffer 'global_fds' accessed by index
'lcore_id' overflow.

This patch moves the validity check of 'lcore_id' before the 'lcore_id'
being used for the first time.

Fixes: 9dc843eb273b ("power: extend guest channel API for reading")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Reviewed-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: David Hunt <david.hunt@intel.com>

doc: remove PDF requirements

The documentation is generated in HTML only.
The PDF format is abandoned since DPDK 20.11
while dropping support of the make-based build.

This decision has been mentioned by the Technical Board:
https://mails.dpdk.org/archives/dev/2021-January/195549.html

Fixes: 3cc6ecfdfe85 ("build: remove makefiles")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

test/timer: check memzone allocation

Segmentation fault may occur without checking if memzone
reserves succeed or not.

Fixes: 50247fe03fe0 ("test/timer: exercise new APIs in secondary process")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Erik Gabriel Carrillo <erik.g.carrillo@intel.com>

examples/timer: fix time interval

Timer sample example assumes that the frequency of the timer is about
2Ghz to control the period of calling rte_timer_manage(). But this
assumption is easy to fail. For example. the frequency of tsc on ARM64
is much less than 2Ghz.

This patch uses the frequency of the current timer to calculate the
correct time interval to ensure consistent result on all platforms.

In addition, the rte_rdtsc() is replaced with the more recommended
rte_get_timer_cycles function in this patch.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

ipc: use monotonic clock

Currently, the mp uses gettimeofday() API to get the time, and used as
timeout parameter.

But the time which gets from gettimeofday() API isn't monotonically
increasing. The process may fail if the system time is changed.

This fixes it by using clock_gettime() API with monotonic attribution.

Fixes: 783b6e54971d ("eal: add synchronous multi-process communication")
Fixes: f05e26051c15 ("eal: add IPC asynchronous request")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>

raw/ioat: fix parameter shadow warning

In the function __idxd_completed_ops() we have a parameter shadow warning
due to a local variable having the same name as one of the function
parameters. This issue is fixed by simply renaming the local variable.

This warning was caught when additions were made to the OVS codebase,
which include adding calls the IOAT APIs. The OVS build passes the
-Wshadow flag by default, allowing the warning to be seen when building
OVS with DPDK 21.05-rc2.

Fixes: 245efe544d8e ("raw/ioat: report status of completed jobs")

Reported-by: Sunil Pai G <sunil.pai.g@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

raw/skeleton: add missing check after setting attribute

This patch adds return value check for setting an attribute.

Fixes: 88a81bcecb7b ("raw/skeleton: remove compile-time constant for device id")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

eal: fix memory mapping on 32-bit target

For 32-bit targets, size_t is normally a 32-bit type and
does not have sufficient range to represent 64-bit offsets
that are needed when mapping PCI addresses.
Use uint64_t instead.

Found when attempting to run 32-bit Linux dpdk-testpmd
using VFIO driver:

EAL: pci_map_resource(): cannot map resource(63, 0xc0010000, \
0x200000, 0x20000000000): Invalid argument ((nil))

Fixes: c4b89ecb64ea ("eal: introduce memory management wrappers")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

doc: fix build with Sphinx 4

Sphinx 4.0 became stricter with permalink configuration:
"
html_add_permalinks has been deprecated since v3.5.0.
Please use html_permalinks and html_permalinks_icon instead.
"

The new variable is used while keeping compatibility
with older Sphinx versions.

Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

net: fix header include order for FreeBSD

Spotted by sparse in OVS build:

../../lib/netdev-dpdk.c: note: in included file (through
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_ip.h,
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h, ...):
../../include/sparse/arpa/inet.h:22:2: error: "Must include
<netinet/in.h> before <arpa/inet.h> for FreeBSD support"

This is a check enforced by OVS itself.
See [1] for some context.

1: https://github.com/openvswitch/ovs/commit/b2befd5bb2db

Fixes: 89813a522e68 ("net: provide IP-related API on any OS")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

net: add endianness annotations to ethernet headers

Spotted by sparse in OVS build:

/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27:
error: incorrect type in initializer (different base types)
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27:
expected unsigned short [usertype] ether_type
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:789:27:
got restricted ovs_be16 [usertype]
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25:
error: incorrect type in initializer (different base types)
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25:
expected unsigned short [usertype] vlan_tci
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:829:25:
got restricted ovs_be16 [usertype]
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26:
error: incorrect type in initializer (different base types)
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26:
expected unsigned short [usertype] eth_proto
/home/runner/work/ovs/ovs/dpdk-dir/build/include/rte_flow.h:830:26:
got restricted ovs_be16 [usertype]

This was not caught before as no code in headers was using those fields.
This changed with commit 6f2168b69aee ("ethdev: reuse ethernet header
definition in flow item") and commit a56a262e3408 ("ethdev: reuse VLAN
header definition in flow item").

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>

log: register with standardized names

Let's try to enforce the convention where most drivers use a pmd. logtype
with their class reflected in it, and libraries use a lib. logtype.

Introduce two new macros:
- RTE_LOG_REGISTER_DEFAULT can be used when a single logtype is
  used in a component. It is associated to the default name provided
  by the build system,
- RTE_LOG_REGISTER_SUFFIX can be used when multiple logtypes are used,
  and then the passed name is appended to the default name,

RTE_LOG_REGISTER is left untouched for existing external users
and for components that do not comply with the convention.

There is a new Meson variable log_prefix to adapt the default name
for baseband (pmd.bb.), bus (no pmd.) and mempool (no pmd.) classes.

Note: achieved with below commands + reverted change on net/bonding +
edits on crypto/virtio, compress/mlx5, regex/mlx5

$ git grep -l RTE_LOG_REGISTER drivers/ |
  while read file; do
    pattern=${file##drivers/};
    class=${pattern%%/*};
    pattern=${pattern#$class/};
    drv=${pattern%%/*};
    case "$class" in
      baseband) pattern=pmd.bb.$drv;;
      bus) pattern=bus.$drv;;
      mempool) pattern=mempool.$drv;;
      *) pattern=pmd.$class.$drv;;
    esac
    sed -i -e 's/RTE_LOG_REGISTER($.*$, '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER($.*$, '$pattern'\.$.*$,/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

$ git grep -l RTE_LOG_REGISTER lib/ |
  while read file; do
    pattern=${file##lib/};
    pattern=lib.${pattern%%/*};
    sed -i -e 's/RTE_LOG_REGISTER($.*$, '$pattern',/RTE_LOG_REGISTER_DEFAULT(\1,/' $file;
    sed -i -e 's/RTE_LOG_REGISTER($.*$, '$pattern'\.$.*$,/RTE_LOG_REGISTER_SUFFIX(\1, \2,/' $file;
  done

Signed-off-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

hash: fix tuple adjustment

rte_thash_adjust_tuple() uses random to generate a new subtuple if
fn() callback reports about collision. In some cases random changes
the subtuple in a way that after complementary bits are applied the
original tuple is obtained. This patch replaces random with subtuple
increment.

Fixes: 28ebff11c2dc ("hash: add predictable RSS")
Cc: vladimir.medvedkin@intel.com
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Tested-by: Stanislaw Kardach <kda@semihalf.com>
Reviewed-by: Stanislaw Kardach <kda@semihalf.com>

eal: fix leak in shared lib mode detection

This is reported by our internal covscan:

1. dpdk-20.11/lib/librte_eal/common/eal_common_options.c:508: alloc_fn:
Storage is returned from allocation function "dlopen".
6. dpdk-20.11/lib/librte_eal/common/eal_common_options.c:508:
leaked_storage: Failing to save or free storage allocated by
"dlopen("librte_eal.so.21.0", 5)" leaks it.

#   506|    * shared library is not already loaded i.e. it's
#   statically linked.)
#   507|    */
#   508|-> if (dlopen("librte_eal.so."ABI_VERSION, RTLD_LAZY |
#   RTLD_NOLOAD) != NULL &&
#   509|    *default_solib_dir != '\0' &&
#   510|    stat(default_solib_dir, &sb) == 0 &&

This leak is not an issue per se, but on the other hand, this is easy
to fix and I prefer not having to waive this warning later.

Fixes: 06c7871dde01 ("eal: restrict default plugin path to shared lib mode")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

bus/fslmc: remove unused debug macro

Fixes: ce9efbf5bb09 ("bus/fslmc: support dynamic logging")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>

raw/ioat: skip VA requirement for bus without device

If after a bus scan, there are no devices using a particular bus, then
that bus should not be taken into account when deciding whether DPDK
should be run in VA or PA addressing mode. This becomes an issue when
the DSA bus driver code is used on a system without an IOMMU. The PCI
bus correctly reports that it only works in PA mode, while the DSA bus -
also correctly - reports that it works only in VA mode. The difference
is that there will be no devices found in a scan for the DSA bus, since
the kernel driver can only present those to userspace in the presence of
an IOMMU.

While we could change DSA instance to always report that it does not
care about the addressing mode, this would imply that it could be used
with DPDK in PA mode which is not the case. Therefore, this patch
changes the driver to report DC (don't care) in the case where no
devices are present, and VA otherwise.

NOTE: this addressing mode use of VA-only applies only in the case of
using DSA through the idxd kernel driver. The use of DSA though vfio-pci
is unaffected and works as with other PCI devices.

Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")

Reported-by: Harry van Haaren <harry.van.haaren@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
Tested-by: Conor Walsh <conor.walsh@intel.com>

raw/ioat: fix directory handle leak

When reading the /dev directory as part of the bus scan for DSA devices,
the directory handle from opendir was not freed on function return,
leading to a resource leak.

Coverity issue: 370588
Fixes: b7aaf417f936 ("raw/ioat: add bus driver for device scanning automatically")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

build: fix default drivers list without Python

If no enable_drivers option is passed, the default is to build
the drivers list by calling list-dir-globs.py.

But if no Python interpreter is installed, no error is reported
and all drivers end up being disabled.

Example on a minimal FreeBSD VM:

  dpdk@freebsd:~/dpdk $ meson setup build
  ...
  drivers:
  common/cpt: not in enabled drivers build config
  common/dpaax: not in enabled drivers build config
  common/iavf: not in enabled drivers build config
  common/mvep: not in enabled drivers build config
  common/octeontx: not in enabled drivers build config
  common/octeontx2: not in enabled drivers build config
  bus/dpaa: not in enabled drivers build config
  bus/fslmc: not in enabled drivers build config
  ...

  dpdk@freebsd:~/dpdk $ cd drivers/
  dpdk@freebsd:~/dpdk/drivers $ ~/dpdk/buildtools/list-dir-globs.py */*
  env: python3: No such file or directory

Rely on meson internal interpreter.
Check return code when calling this script.

Fixes: ab9407c3addd ("build: allow using wildcards to disable drivers")
Fixes: 2e33309ebe03 ("config: enable/disable drivers in Arm builds")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

net/hns3: fix debug build

The variable "dev" is not used in hns3_get_tx_prep_needed()
in the case of RTE_LIBRTE_ETHDEV_DEBUG:
drivers/net/hns3/hns3_rxtx.c:4213:45: error: unused parameter ‘dev’

Fixes: d7ec2c076579 ("net/hns3: select Tx prepare based on Tx offload")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>

doc: fix Arm SoCs list

Keep the list of SoCs in a single place and include it so that the
documentation won't get outdated.

Fixes: 8f5ea6a464ac ("config/arm: fix implementer and its SoCs")
Fixes: 1b4c86a721c9 ("config/arm: add Marvell CN10K")
Fixes: 7cf32a22b240 ("config/arm: add Hisilicon kunpeng")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

version: 21.05-rc2

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

test/crypto: copy offset data to OOP destination buffer

Copy over the offset data required for auth in out-of-place op
when auth offset and cipher offset are not aligned.

Fixes: e847fc512817 ("test/crypto: add encrypted digest case for AES-CTR-CMAC")
Cc: stable@dpdk.org
Signed-off-by: Kai Ji <kai.ji@intel.com>

crypto/dpaa2_sec: fix close and uninit functions

The init function was calling the dpseci_open
while dpseci_close was called by the open function.
This is a mismatch un-init shall clean the init configurations and
close shall clear the configure function settings.

This was causing issue with recent changes in test framework, where
the close was being called and causing DPAA2 SEC to fail in configure

Fixes: e5cbdfc53765 ("crypto/dpaa2_sec: add basic operations")
Cc: stable@dpdk.org
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>

crypto/dpaa_sec: affine the thread portal affinity

DPAA requires the I/O shall be done in a HW portal context only.
The portal affinity is currently only being done in session create
and config APIs with the assumption that same thread will be used
for IO. This is causing issue.
This patch add support during I/O to check the HW portal affinity
and affine portal- if not affined already.

Fixes: 9a984458f755 ("crypto/dpaa_sec: rewrite Rx/Tx path")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>

test/crypto: fix auth-cipher compare length in OOP

For out-of-place operations, comparing expected ciphertext with
the operation result should skip cipher_offset bytes, as those
will not be copied from source to the destination buffer, making
the tests fail.

Fixes: 02ed7b3871d6 ("test/crypto: add SNOW3G test cases for auth-cipher")
Cc: stable@dpdk.org
Signed-off-by: Kai Ji <kai.ji@intel.com>
Signed-off-by: Damian Nowak <damianx.nowak@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>

examples/ipsec-secgw: fix handling IPv6 extension headers

Recent patch to support UDP encapsulation introduced problem with
handling inbound IPv6 packets with header extensions.
This patch aims to fix the issue.

Bugzilla ID: 695
Fixes: 9a1cc8f1ed74 ("examples/ipsec-secgw: support UDP encapsulation")

Reported-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>

compress/qat: enable compression on GEN3

This patch enables the compression on QAT GEN3 (on hardware
versions that support it) and changes the error message shown
on older hardware versions that don't support the compression.

It also fixes the crash that happened on IM buffer allocation
failure (not enough memory) during the PMD cleaning phase.

Fixes: a124830a6f00 ("compress/qat: enable dynamic huffman encoding")
Fixes: 352332744c3a ("compress/qat: add dynamic SGL allocation")
Cc: stable@dpdk.org
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>

common/qat: increase IM buffer size for GEN3

This patch increases the intermediate buffer size used for the
compression on QAT GEN3 to accommodate new hardware versions.

Fixes: a124830a6f00 ("compress/qat: enable dynamic huffman encoding")
Cc: stable@dpdk.org
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>

app/bbdev: fix HARQ error messages

The logging should show context by printing the two variables which
compared to each other. 'nb_harq_inputs', not 'nb_hard_outputs';
'nb_harq_outputs', not 'nb_hard_outputs'.

This patch corrected misused variable.

Fixes: d819c08327f3 ("app/bbdev: update for 5GNR")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>

app/bbdev: check memory allocation

Return value of a function 'rte_malloc' is dereferenced without
checking, and may result in segmentation fault.

This patch fixed it.

Fixes: 31a7853d1ed9 ("baseband/turbo_sw: support large size code block")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Nicolas Chautru <nicolas.chautru@intel.com>

examples/vm_power: remove VM channel number check

VM channel number should not be validated against the
host vm_power_manager coremask core indexes, as VM
cores need not to be same as host cores.
So remove this check, to allow all the vm channels
to be added successfully.

Fixes: b49c677a0d24 ("examples/vm_power: respect core mask")
Cc: stable@dpdk.org
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Reviewed-by: David Hunt <david.hunt@intel.com>

eal: fix service core list parsing

This patch adds checking for service core index validity when parsing
service corelist.

Fixes: 7dbd7a6413ef ("service: add -S corelist option")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Harry van Haaren <harry.van.haaren@intel.com>

ipc: check malloc sync reply result

This patch adds checking for mp reply result in handle_sync().

Fixes: 07dcbfe0101f ("malloc: support multiprocess memory hotplug")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

raw/ntb: check memory allocations

This patch adds checking for rte_zmalloc() result when init Intel ntb
device, also fix the same bug when start ntb device.

Fixes: 034c328eb025 ("raw/ntb: support Intel NTB")
Fixes: c39d1e082a4b ("raw/ntb: setup queues")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>

raw/ntb: check SPAD user index

This patch adds checking spad user index validity when set or get attr.

Fixes: 277310027965 ("raw/ntb: introduce NTB raw device driver")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>

examples: fix pkg-config override

Move pkg-config override to beginning in the Makefile to allow
use PKGCONF variable to detect the libdpdk availability.

Fixes: fda34680eb9a ("examples: remove legacy sections of makefiles")
Cc: stable@dpdk.org
Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

regex/octeontx2: remove unused include directory

The variable inc_dir is not defined in this file.

Fixes: 4cd1c5fd9ed4 ("regex/octeontx2: introduce REE driver")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Guy Kaneti <guyk@marvell.com>

lib: restore developer mode checks

Most of the checks on developer_mode have been accidentally dropped.
Restore them.

Fixes: 7d611e35b077 ("lib: simplify main build file")

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

net/mlx5: support connection tracking between two ports

After creating a connection tracking context, it can be used between
two ports. For each port, the flow for one direction traffic will
be created.

The context can only be shared between the owner port and the peer
port that was specified when being created. Only the owner port
could update the context or query it in current implementation.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: validate connection tracking item

The item of ASO connection tracking will be translated into the
register value when matching. The validation of this item has no
dependency on other layers, since the flow including this item
should be jumped from another group. All the layers checking was
already done in the previous groups. Only the state bits conflict
should be checked.

It is assumed that the flow with CT item will always work on the
TCP traffic.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: validate connection tracking action

The validation of a CT action contains two parts. The first is the
CT action configurations parameter. When creating a CT action
context, some members need to be verified.

The second is that when creating a flow, the DR action of CT should
be validated with other actions and items as well. Currently, only
the TCP protocol support connection tracking.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add connection tracking context update

When updating a connection tracking context, two separate parts
could be updated.
First, the direction. This will only update the traffic direction
recorded in the software for flow creation.
Second, the TCP parameters. The hardware context will be updated
via the WQE. This update will be blocked until the hardware status
is updated and ready for the next flow creation.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add translation of connection tracking item

The return register of the DR action will be used for matching.
After the ASO CT checking of a TCP packet, the syndrome is filled in
the register. Only the 8 LSB should be used. A converting from
RTE_FLOW_CONNTRACK_FLAG* to the syndrome should be done after
checking the spec and mask fields.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add translation of connection tracking action

When creating a flow with this action context for CT, it needs to be
translated in 2 levels.

First, retrieve from action context to rte_flow action.
Second, translate it to the corresponding DR action with traffic
direction that was specified when creating or updating via
rte_flow_action_handle* API.

Before using the DR action in a flow, the CT context should be
available to use in the hardware. A synchronization is done before
inserting the flow rule with CT action to check the HW availability
of this CT context.

In order to release the DR actions and reuse the context of a CT,
the reference count should also be handled in the flow rule
destroying.

The CT index will be recorded in the rte_flow by reusing the ASO age
index to save memory, since only one ASO action is supported in one
flow rule currently. The action context type should also be saved
for CT. When destroying a flow rule, if the context type is CT and
the index is valid (non-zero), the release process should be
handled. By default, the handling will fall back to try to release
the ASO age if any.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add ASO connection tracking destroy

When trying to destroy an ASO connection tracking context, the DR
action created on this context should also be destroyed. Before
inserting the related software object into the management free list,
the reference count should be checked.

Right now, the context object will not be freed to the system and
will be reused directly from the free list.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add ASO connection tracking query

After the connection tracking context is created and being used by
the flows, the context will be updated by the HW automatically after
a packet passed the CT validation. E.g., the ACK, SEQ, window and
state of CT can be updated with both direction traffic.

In order to query the updated contents of this context, a WQE should
be posted to the SQ with a return buffer. The data will be filled
into the buffer. And the profile will be filled with specific value.

During the execution of query command, the context may be updated.
The result of the query command may not be the latest one.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: release connection tracking management

When freeing the IB shared context during stopping a device, the
ASO connection tracking management structure should also be cleaned
up.

All the DR actions created should be destroyed. The structures need
to be freed and ASO CT QP should be released. In the meanwhile, the
allocated and registered memory region for query should also be
deregistered and then freed.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: add actions for connection tracking creation

Allocating a CT from the management pools and creating the DR actions
for both directions by default.

If there is no available connection tracking action, a new pool will
be created with a fixed size bulk allocation. Right now, all the
resources are controlled by the linked list.

The ASO connection tracking context associated with these actions
need to be updated via WQE before using for steering.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: support connection tracking modify

After the connection tracking object bulk is allocated, all the
objects' contents are filled with zero by default. Every
new-allocated object must be modified via WQE operation before it is
used.

In order to reduce the latency for the flow creation, an asynchronous
way is used instead of busy waiting for the CQE to be generated.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: add DevX connection tracking objects creation

Adding support for connection tracking ASO creation via Devx command.
Right now only bulk creation is supported.

By default, the objects with zero contents will be created. Before
using a single object, the modification via posting a WQE to the ASO
CT SQ is needed.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: initialize connection tracking management

The definitions of ASO connection tracking objects management
structures are added.

Considering performance, the bulk allocation of ASO CT objects
should be used. The maximal value per bulk and the granularity could
be fetched from HCA capabilities 2. Right now, a fixed number of 64
is used for each bulk for a better management purpose.

The ASO QP for CT is initialized, the SQ will be used for both
modify and query command.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: use meter color register for connection tracking

Based on the capacity, 3 registers could be used. Due to the register
allocation, only the one REG_C_3 for meter color could be reused
right now.

Then in the same flow, no more than one ASO action can be supported.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: check connection tracking offload capability

During startup, the ASO connection tracking offload capability could
be queried via HCA_CAP_QUERY command. If the HW doesn't support ASO
CT, the value would be 0 by default. The following initialization
should be skipped and the creation of the CT object should return
a failure directly.

The following CT creation should also check this capability. With
the old driver, the pre-processing macro should be used in order to
make the compiling pass.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: add connection tracking object

The structures of ASO connection tracking offload object are added
based on the definitions in the PRM. One CT object context will be
loaded into the cache completely in a reversed order of dwords. The
valid bit should be the MSB of the last dword. This is used for the
conntrack context creation and update, as well as for the query.

The capabilities 2 (HCA_CAP_2) layout is also added. The connection
tracking related capabilities could be queried via the HCA_CAP_2.

Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix TCP flags size for modify actions

From RFC the size of the TCP flags is 9, while the defined
current size is 6.

Fixes: 641dbe4fb053 ("net/mlx5: support modify field flow action")
Cc: stable@dpdk.org
Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: support power monitoring

Support the PMD power management API in MLX5 driver.
The monitor policy of this API puts a CPU core to sleep until
a data in some monitored memory address is changed by the NIC.
Implement the get_monitor_addr function to return an address
of a CQE owner bit to monitor the arrival of a new packet.

Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: workaround ASO memory region creation

Due to kernel issue in direct MKEY creation using the DevX API for
physical memory, this patch replaces the ASO MR creation to use Verbs
API.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/bnxt: prevent device access in error state

Driver should prevent any DMA with the device when it
detects an error. When firmware is in fatal state,
stop tx/rx by assigning them to dummy functions.

Fixes: be14720def9c ("net/bnxt: support FW reset")
Fixes: 9d0cbaecc91a ("net/bnxt: support periodic FW health monitoring")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: fix ring count calculation

Fix ring count calculation for Thor. VNIC count does not have a
direct bearing on the number of rings that can be used.

Fixes: fe8dd26f86c7 ("net/bnxt: cap max Rx rings for Thor")

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: fix mismatched type comparison in Rx

Fix comparison between uint16_t and uint32_t types.

Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: check PCI config read

Return value where return value of rte_pci_read_config was not checked.
Fix it.

Coverity issue: 349919
Fixes: 9d0cbaecc91a ("net/bnxt: support periodic FW health monitoring")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>

net/bnxt: fix mismatched type comparison in MAC restore

dev_info.max_mac_addrs is of type uint32_t. But the counter i is
of type uint16_t. This mismatch may cause the loop condition may
always be true. Change the loop counter variable to uint32_t.

Fixes: b02f1573cd07 ("net/bnxt: restore MAC filters during reset recovery")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>

net/bnxt: fix single PF per port check

The check BNXT_SINGLE_PF(bp) returns false for a VF. So there is no
extra check needed for VF along with BNXT_SINGLE_PF(bp).

Also make error messages more explicit.

Fixes: ff947c6ce15f ("net/bnxt: add check for multi host PF per port")
Fixes: f86febfb46da ("net/bnxt: support VF")
Fixes: 3e12fdb78e82 ("net/bnxt: support VLAN pvid")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: fix dynamic VNIC count

Ensure that the current count of in-use VNICs is decremented
when a VNIC is freed. Don't attempt VNIC allocation when the
maximum supported number of VNICs is currently allocated.

Fixes: 49d0709b257f ("net/bnxt: delete and flush L2 filters cleanly")
Fixes: d24610f7bfda ("net/bnxt: allow flow creation when RSS is enabled")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reported-by: Stephen Hemminger <sthemmin@microsoft.com>

net/bnxt: fix Rx timestamp when FIFO pending bit is set

Fix to clear the Rx FIFO while reading the timestamp.
If the Rx FIFO has pending bit set, keep reading to clear it
and return the last valid timestamp instead of unconditionally
returning an error.

Fixes: b11cceb83a34 ("net/bnxt: support timesync")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/bnxt: refactor multi-queue Rx configuration

Eliminate separate codepath/handling for single queue
as the multiqueue code path takes care of it as well.
The only difference being the end_grp_id being 1
now instead of 0 for single queue, but that does not matter
for single queue and does not alter any functionality.

Fixes: 6133f207970c ("net/bnxt: add Rx queue create/destroy")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/iavf: fix VLAN extraction in AVX512 path

The new VIRTCHNL_VF_OFFLOAD_VLAN_V2 capability added support that allows
the PF to set the location of the RX VLAN tag for stripping offloads.

So the VF needs to extract the VLAN tag according to the location flags.

This patch is the fix for AVX512 path, as AVX2 is already fixed.

Fixes: 9c9aa0040344 ("net/iavf: add offload path for Rx AVX512 flex descriptor")

Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Tested-by: Qin Sun <qinx.sun@intel.com>

net/ice: support flow director for IP fragment packet

New FDIR parsing are added to handle the fragmented IPv4/IPv6 packet.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: support RSS hash for IP fragment

New pattern and RSS hash flow parsing are added to handle fragmented
IPv4/IPv6 packet.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: support IP fragment RSS and FDIR

Add support for IP fragment RSS hash and FDIR function. Separate IP
fragment and IP other packet types.

The patch also update the release date in README.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: sign external device package programming

External topology devices (e.g. PHYs) connected to 100G or to SoC that
includes 100G IP might have a firmware engine within the device and
the firmware is usually loaded from NVM connected to the topology
device.
The topology device NVM images can be updated using SW tools but
such solution poses a security risk if there is no validation of
the integrity of an image before programming it to the device NVM.
In order to prevent security risk, the topology device NVM image might
be included as part of 100G NVM image. When the topology device
NVM image is present in the 100G NVM image, it is authenticated
and might be loaded to the topology device at startup or on command
of SW using dedicated AQ.
This patch provides support for this functionality.

Signed-off-by: Stefan Wegrzyn <stefan.wegrzyn@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: support L3 DSCP QoS

The base code support to build configuration TLVs
in DSCP mode has not been implemented before, so
the functions to do so and the flow control to determine
if we are in VLAN or DSCP mode need to be added.

The current value for maximum number of DCB APPs
(ICE_DCBX_MAX_APPS) is not sufficient when supporting
DSCP mode.  Each DSCP->TC mapping will come in as a
single APP value.  So, there can be up to 64 APPs for
DSCP mapping.

Need to keep track of the current DSCP to TC mapping
so that TLVs can be built up to send to the FW.  Add
an u8 array to hold this info.

A u64 is also needed to keep track of the DSCP values
that have had an APP submitted to map its value to a
TC.  Since it would be unwise to allow an APP to be
overwritten by subsequent APPs, reject mappings for a
DSCP value that already has a user mapped value.  This
will allow us to easily track which DSCP values have
been mapped, and when the last one has been deleted.

Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice/base: log if DDP/FW do not support QinQ

Currently if the driver supports QinQ there is no message/information
if the DDP and/or FW don't support QinQ. Add functionality that prints
if the DDP and/or FW don't support QinQ if the driver attempts to
configured DVM. This will make it more obvious to users in the field
that they need to update their DDP and/or FW.

This required a small refactor so some of the existing code could be
shared and used by this new print functionality.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: refactor post DDP download VLAN mode config

Currently it's not clear that only the first PF downloads the package
and configures the VLAN mode. When this is happening all other PFs are
blocked on the global configuration lock. Once the package is
successfully downloaded and the global configuration lock has been
released then all PFs resume initialization. This includes some post
package download VLAN mode configuration. To make this more obvious add
the new function ice_post_pkg_dwnld_vlan_mode_cfg() so any/all post
download VLAN mode configuration code can be put in here.

This also makes it more clear that all PFs will call this new function.

Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

net/ice/base: add IP fragment flags

Add the IPv6 fragment flags and the IPv4 fragment field shift.

Signed-off-by: Ting Xu <ting.xu@intel.com>
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>

common/iavf: fix order of protocol header types

The new virtchnl protocol header types for IPv4 and IPv6 fragment are
not added in order, which will break ABI. Move them to the end of the
list.

Fixes: e6a42fd9158b ("common/iavf: add protocol header for IP fragment")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

vhost: fix offload flags in Rx path

The vhost library currently configures Tx offloading (PKT_TX_*) on any
packet received from a guest virtio device which asks for some offloading.

This is problematic, as Tx offloading is something that the application
must ask for: the application needs to configure devices
to support every used offloads (ip, tcp checksumming, tso..), and the
various l2/l3/l4 lengths must be set following any processing that
happened in the application itself.

On the other hand, the received packets are not marked wrt current
packet l3/l4 checksumming info.

Copy virtio rx processing to fix those offload flags with some
differences:
- accept VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP,
- ignore anything but the VIRTIO_NET_HDR_F_NEEDS_CSUM flag (to comply with
the virtio spec),

Some applications might rely on the current behavior, so it is left
untouched by default.
A new RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS flag is added to enable the
new behavior.

The vhost example has been updated for the new behavior: TSO is applied to
any packet marked LRO.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

net/virtio: refactor Tx offload helper

Purely cosmetic but it is rather odd to have an "offload" helper that
checks if it actually must do something.
We already have the same checks in most callers, so move this branch
in them.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Flavio Leitner <fbl@sysclose.org>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

net/virtio: do not touch Tx offload flags

Tx offload flags are of the application responsibility.
Leave the mbuf alone and use a local storage for implicit tcp checksum
offloading in case of TSO.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>

vdpa/mlx5: improve interrupt management

The driver should notify the guest for each traffic burst detected by CQ
polling.

The CQ polling trigger is defined by `event_mode` device argument,
either by busy polling on all the CQs or by blocked call to HW
completion event using DevX channel.

Also, the polling event modes can move to blocked call when the
traffic rate is low.

The current blocked call uses the EAL interrupt API suffering a lot
of overhead in the API management and serve all the drivers and
libraries using only single thread.

Use blocking FD of the DevX channel in order to do blocked call
directly by the DevX channel FD mechanism.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

vhost: add batch datapath for async packed ring

Add batch datapath for async vhost packed ring to improve the
performance of small packet processing.

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

vhost: support packed ring in async datapath

For now async vhost data path only supports split ring. This patch
enables packed ring in async vhost data path to make async vhost
compatible with virtio 1.1 spec.

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

vhost: refactor async split ring functions

This patch moves some code of async vhost split ring into
inline functions to improve the readability. Also, it
changes the pointer index style of iterator to make the
code more concise.

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>

examples/vhost: fix overflow in argument parsing

Change the way passing args to fix potential overflow in args process.

Coverity issue: 363741
Fixes: 965b06f03582 ("examples/vhost: enhance getopt_long usage")

Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

net/virtio: fix vectorized Rx queue rearm

When Rx queue worked in vectorized mode and rxd <= 512, under traffic of
high PPS rate, testpmd often start and receive packets of rxd without
further growth.

Testpmd started with rxq flush which tried to rx MAX_PKT_BURST(512)
packets and drop. When Rx burst size >= Rx queue size, all descriptors
in used queue consumed without rearm, device can't receive more packets.
The next Rx burst returned at once since no used descriptors found,
rearm logic was skipped, rx vq kept in starving state.

To avoid rx vq starving, this patch always check the available queue,
rearm if needed even no used descriptor reported by device.

Fixes: fc3d66212fed ("virtio: add vector Rx")
Fixes: 2d7c37194ee4 ("net/virtio: add NEON based Rx handler")
Fixes: 52b5a707e6ca ("net/virtio: add Altivec Rx")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

telemetry: fix race on callbacks list

The list_commands() function accessed the callbacks list,
but did not take the lock. This may have caused inconsistencies if
callbacks were being registered at the same time.
This is now fixed to lock before iterating the list,
and unlock afterwards.

Fixes: f38748736eb2 ("telemetry: add default callback commands")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ciara Power <ciara.power@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

telemetry: hide internal define

Remove TELEMETRY_MAX_CALLBACKS symbol from the public
rte_telemetry.h header file.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>

test/distributor: fix burst flush on worker quit

While working on RISC-V port I have encountered a situation where worker
threads get stuck in the rte_distributor_return_pkt() function in the
burst test.
Investigation showed some of the threads enter this function with
flag RTE_DISTRIB_GET_BUF set in the d->retptr64[0]. At the same time the
main thread has already passed rte_distributor_process() so nobody will
clear this flag and hence workers can't return.

What I've noticed is that adding a flush just after the last _process(),
similarly to how quit_workers() function is written in the
test_distributor.c fixes the issue.
Lukasz Wojciechowski reproduced the same issue on x86 using a VM with 32
emulated CPU cores to force some lcores not to be woken up.

Fixes: 7c3287a10535 ("test/distributor: add performance test for burst mode")
Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: David Hunt <david.hunt@intel.com>
Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>

test/distributor: fix worker notification in burst mode

Because a single worker can process more than one packet from the
distributor, the final set of notifications in burst mode should be
sent one-by-one to ensure that each worker has a chance to wake up.

This fix mirrors the change done in the functional test by
commit f72bff0ec272 ("test/distributor: fix quitting workers in burst
mode").

Fixes: c3eabff124e6 ("distributor: add unit tests")
Cc: stable@dpdk.org
Signed-off-by: Stanislaw Kardach <kda@semihalf.com>
Acked-by: David Hunt <david.hunt@intel.com>
Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>