Thomas Monjalon [Thu, 25 Feb 2021 00:07:56 +0000 (01:07 +0100)]
drivers: replace page size definitions with function
The page size is often retrieved from the macro PAGE_SIZE.
If PAGE_SIZE is not defined, it is either using hard coded default,
or getting the system value from the UNIX-only function sysconf().
Such definitions are replaced with the generic function
rte_mem_page_size() defined for each supported OS.
Removing PAGE_SIZE definitions will fix dlb drivers for musl libc,
because #ifdef checks were missing, causing redefinition errors.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Boyer <aboyer@pensando.io>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Timothy McDaniel <timothy.mcdaniel@intel.com>
Thomas Monjalon [Wed, 24 Feb 2021 23:21:58 +0000 (00:21 +0100)]
eal: fix build with musl
In musl libc, cpu_set_t is defined only if _GNU_SOURCE is defined.
In case _GNU_SOURCE is undefined, as in eal_common_errno.c,
it was not possible to include rte_os.h which uses cpu_set_t.
This limitation is removed: if CPU_SETSIZE is not defined,
cpu_set_t related definitions and functions are skipped.
Note: such definitions are unneeded in eal_common_errno.c.
Applications which do not define _GNU_SOURCE may miss cpu_set_t related
features on musl. Such case is detected by RTE_HAS_CPUSET being undefined,
so functions which depend on rte_cpuset_t will be unavailable.
A missing include of fcntl.h is also added.
Bugzilla ID: 35
Fixes:
11b57c698005 ("eal: fix error string function")
Fixes:
176bb37ca6f3 ("eal: introduce internal wrappers for file operations")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Thomas Monjalon [Thu, 25 Feb 2021 08:45:06 +0000 (09:45 +0100)]
build: remove redundant _GNU_SOURCE definitions
The feature macro _GNU_SOURCE is defined globally,
but there was some remaining useless settings.
The internal definition in config/meson.build is kept,
all other internal definitions of _GNU_SOURCE are removed,
except in examples, which can be built as external applications.
Note: external applications do not inherit of _GNU_SOURCE.
Fixes:
5d7b673d5fd6 ("mk: build with _GNU_SOURCE defined by default")
Fixes:
28188cee2aa0 ("build: enable BSD features visibility for FreeBSD")
Fixes:
e6cdc54cc0ef ("net/mlx5: add socket server for external tools")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Thomas Monjalon [Thu, 25 Feb 2021 01:49:19 +0000 (02:49 +0100)]
build: detect execinfo library on Linux
The library execinfo and its header file can be installed on Alpine Linux
where the backtrace feature is not part of musl libc:
apk add libexecinfo-dev
As a consequence, this library should not be restricted to BSD only.
At the same time, the library and header are detected once and added
globally to be linked with any application, internal or external.
Fixes:
9065b1fac65f ("build: fix dependency on execinfo for BSD meson builds")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Thomas Monjalon [Wed, 24 Feb 2021 22:47:56 +0000 (23:47 +0100)]
buildtools: fix build with busybox
If using busybox for mktemp and awk (as in Alpine),
some bugs prevent the script from running:
1/ It seems busybox mktemp requires the pattern to have at least
6 X and no other suffix.
The same has been fixed for other scripts in the past:
commit
3771edc35438 ("buildtools: fix build for some mktemp")
2/ It seems busybox awk does not accept the regex ^.*{
except if the opening curly brace is escaped.
Fixes:
4c82473412e8 ("build: add internal tag check")
Fixes:
68b1f1cda5b4 ("build: check AVX512 rather than binutils version")
Fixes:
3290ac14eb94 ("buildtools: detect discrepancies for experimental symbols")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Thomas Monjalon [Wed, 24 Feb 2021 23:16:22 +0000 (00:16 +0100)]
eal: fix comment of OS-specific header files
The same comment is on top of each rte_os.h file.
It is reworded to remove the mention of "future releases".
Fixes:
428eb983f5f7 ("eal: add OS specific header file")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Viacheslav Ovsiienko [Sun, 7 Mar 2021 11:45:47 +0000 (11:45 +0000)]
net/mlx5: fix Rx metadata leftovers
The Rx metadata might use the metadata register C0 to keep the
values. The same register C0 might be used by kernel for source
vport value handling, kernel uses upper half of the register,
leaving the lower half for application usage.
In the extended metadata mode 1 (dv_xmeta_en devarg is
assigned with value 1) the metadata width is 16 bits only,
the Rx datapath code fetched the entire 32-bit value of the
metadata register and presented one to application. The patch
provides data masking depending on the chosen metadata mode.
Fixes:
6c55b622a956 ("net/mlx5: set dynamic flow metadata in Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Viacheslav Ovsiienko [Sun, 14 Mar 2021 12:13:02 +0000 (12:13 +0000)]
compress/mlx5: support timestamp format
This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.
The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.
Fixes:
8619fcd5161b ("compress/mlx5: support queue pair operations")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Viacheslav Ovsiienko [Sun, 14 Mar 2021 12:13:01 +0000 (12:13 +0000)]
regex/mlx5: support timestamp format
This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.
The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.
Fixes:
92f2c6a30fe0 ("regex/mlx5: add send queue")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Viacheslav Ovsiienko [Sun, 14 Mar 2021 12:13:00 +0000 (12:13 +0000)]
vdpa/mlx5: support timestamp format
This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.
The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.
Fixes:
95276abaaf0a ("vdpa/mlx5: introduce Mellanox vDPA driver")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Viacheslav Ovsiienko [Sun, 14 Mar 2021 12:12:59 +0000 (12:12 +0000)]
net/mlx5: support timestamp format
This patch adds support for the timestamp format settings for
the receive and send queues. If the firmware version x.30.1000
or above is installed and the NIC timestamps are configured
with the real-time format, the default zero values for newly
added fields cause the queue creation to fail.
The patch queries the timestamp formats supported by the hardware
and sets the configuration values in queue context accordingly.
Fixes:
86fc67fc9315 ("net/mlx5: create advanced RxQ object via DevX")
Fixes:
ae18a1ae9692 ("net/mlx5: support Tx hairpin queues")
Fixes:
15c3807e86ab ("common/mlx5: support DevX QP operations")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Viacheslav Ovsiienko [Sun, 14 Mar 2021 12:12:58 +0000 (12:12 +0000)]
common/mlx5: add timestamp format support to DevX
This patch handles the NIC-supported timestamp formats via DevX.
Two different timestamp formats can be provided potentially.
The free-running format provides opaque values captured from
the internal clock counter fed by some independent oscillator.
The free-running frequency is not pre-defined and should be
queried from the NIC. The real-time timestamps are expressed
in nanoseconds, captured from the dedicated UTC counter, that
can be adjusted on the fly and synchronized with some external
reference clock.
Depending on the version and configuration the hardware might
support either FR (free-running) or RT (real-time) timestamps,
per queue basis.
The commit provides the querying information about the supported
timestamp formats and provides the means to configure ones
at queue creation time.
Fixes:
e2b4925ef7c1 ("net/mlx5: support Direct Rules E-Switch")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Thomas Monjalon [Tue, 9 Mar 2021 09:48:36 +0000 (10:48 +0100)]
net/mlx5: reduce log level of alignment message
Having to force an alignment does not impact the user,
so it should not be a warning.
The log level is reduced from warning to debug.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
Thomas Monjalon [Tue, 9 Mar 2021 09:48:35 +0000 (10:48 +0100)]
common/mlx5: remove extra line feed in log messages
The macro DRV_LOG already includes a terminating line feed character
defined in PMD_DRV_LOG_.
The extra line feeds added in some messages are removed.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
Thomas Monjalon [Tue, 9 Mar 2021 09:48:34 +0000 (10:48 +0100)]
net/mlx5: enable debug logs dynamically
Most debug logs are using DRV_LOG(DEBUG,)
but some were using DEBUG().
The macro DEBUG is doing nothing if not compiled with
RTE_LIBRTE_MLX5_DEBUG.
As it is not used in the data path, the macro DEBUG
can be replaced with DRV_LOG.
Then all debug logs can be enabled at runtime with:
--log-level pmd.net.mlx5:debug
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
Thomas Monjalon [Tue, 9 Mar 2021 09:48:33 +0000 (10:48 +0100)]
net/mlx4: enable debug logs dynamically
The macro DEBUG was doing nothing if not compiled with
RTE_LIBRTE_MLX4_DEBUG.
As it is not used in the data path, it can be always enabled at
compilation time. Then it can be enabled at runtime with:
--log-level pmd.net.mlx4:debug
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Matan Azrad <matan@nvidia.com>
Wenjun Wu [Fri, 26 Feb 2021 07:22:00 +0000 (15:22 +0800)]
net/ice: check some functions return
Fix unchecked return values reported by coverity.
Coverity issue: 349907
Fixes:
03a05924dad0 ("net/ice: support device-specific DDP package loading")
Cc: stable@dpdk.org
Signed-off-by: Wenjun Wu <wenjun1.wu@intel.com>
Xueming Li [Thu, 11 Mar 2021 13:13:32 +0000 (13:13 +0000)]
ethdev: add helper function to get representor ID
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).
SR-IOV and SubFunction are created on top of PF. PF index is introduced
because there might be multiple PFs in the bonding configuration and
only bonding device is probed.
In eth representor comparator callback, ethdev representor ID was
compared with devarg. Since controller index and PF index not compared,
callback returned representor from other PF or controller.
This patch adds new API to get representor ID from controller, pf and
vf/sf index. Representor comparer callback get representor ID then
compare with device representor ID.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:31 +0000 (13:13 +0000)]
ethdev: add API to get representor info
The NIC can have multiple PCIe links and can be attached to multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. The full representor identifier consists of
three indices - controller index, PF index, and VF or SF index (if any).
This patch introduces a new API rte_eth_representor_info_get() to
retrieve representor corresponding info mapping:
- caller controller index and pf index.
- supported representor ID ranges.
- type, controller, pf and start vf/sf ID of each range.
The API is useful to calculate representor from devargs to representor
ID.
New ethdev callback representor_info_get() is added to retrieve info
from PMD driver, optional for PMD that doesn't support new devargs
representor syntax.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:30 +0000 (13:13 +0000)]
ethdev: support multi-host in representor
The NIC can have multiple PCIe links and can be attached to the multiple
hosts, for example the same single NIC can be shared for multiple server
units in the rack. On each PCIe link NIC can provide multiple PFs and
VFs/SFs based on these ones. To provide the unambiguous identification
of the PCIe function the controller index is added. The full representor
identifier consists of three indices - controller index, PF index, and
VF or SF index (if any).
This patch introduces controller index to ethdev representor syntax,
examples:
[[c#]pf#]vf#: VF port representor/s, example: pf0vf1
[[c#]pf#]sf#: SF port representor/s, example: c1pf1sf[0-3]
c# is controller(host) ID/range in case of multi-host, optional.
For user application (e.g. OVS), PMD is responsible to interpret and
locate representor device based on controller ID, PF ID and VF/SF ID in
representor syntax.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:29 +0000 (13:13 +0000)]
ethdev: support PF index in representor
With Kernel bonding, multiple underlying PFs are bonded, VFs come
from different PF, need to identify representor of VFs unambiguously by
adding PF index.
This patch introduces optional 'pf' section to representor devargs
syntax, examples:
representor=pf0vf0 - single VF representor
representor=pf[0-1]sf[0-1023] - SF representors from 2 PFs
PF type representor is supported by using standalone 'pf' section:
representor=pf1 - PF representor
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:28 +0000 (13:13 +0000)]
kvargs: support multiple lists
This patch updates kvargs parser to support value of multiple lists or
ranges:
k1=v[1,2]v[3-5]
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Xueming Li [Thu, 11 Mar 2021 13:13:27 +0000 (13:13 +0000)]
ethdev: support sub-function representor
SubFunction is a portion of the PCI device, created on demand, a SF
netdev has its own dedicated queues(txq, rxq). A SF netdev supports
eswitch representation offload similar to existing PF and VF
representors.
To support SF representor, this patch introduces new devargs syntax,
examples:
representor=sf0 - single SubFunction representor
representor=sf[1,3,5] - single list
representor=sf[0-3], - single range
representor=sf[0,2-6,8,10-12] - list with singles and ranges
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:26 +0000 (13:13 +0000)]
ethdev: support new VF representor syntax
Current VF representor syntax:
representor=2 - single representor
representor=[0-3] - single range
To prepare for more representor types, this patch adds compatible VF
representor devargs syntax:
vf#:
representor=vf2 - single representor
representor=vf[1,3,5] - single list
representor=vf[0-3] - single range
representor=vf[0,1,4-7] - list with singles and range
For backwards compatibility, representor "#" is interpreted as "vf#".
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:25 +0000 (13:13 +0000)]
ethdev: refactor representor port list parsing
To the extended representor syntax which need to reuse the value parsing
function for controller and PF section, this patch refactors the port
list parsing.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Xueming Li [Thu, 11 Mar 2021 13:13:24 +0000 (13:13 +0000)]
ethdev: introduce representor type
To support more representor type, this patch introduces representor type
enum. The enum is subject to be extended to support new representor in
patches upcoming.
For each devarg structure, only one type supported.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Hyong Youb Kim <hyonkim@cisco.com>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:32 +0000 (11:58 +0300)]
common/sfc_efx: move function to get family
Move function to get efx family from net driver into common driver.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:31 +0000 (11:58 +0300)]
net/sfc: skip driver probe for incompatible device class
Driver would be probed only for the net device class.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:30 +0000 (11:58 +0300)]
common/sfc_efx: support getting device class
Device class argument would be used to select compatible driver.
Driver probe would be skipped for incompatible device class.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:29 +0000 (11:58 +0300)]
common/sfc_efx/base: support verifying virtio features
Add an API to verify virtio features supported by device.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:28 +0000 (11:58 +0300)]
common/sfc_efx/base: support getting virtio features
Add an API to get virtio features supported by device.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Kumar Srivastava [Tue, 16 Mar 2021 08:58:27 +0000 (11:58 +0300)]
common/sfc_efx/base: add virtio build dependency
Add EFSYS_HAS_UINT64 build dependency on EFSYS_OPT_VIRTIO.
virtio features are represented as bitmask in 64-bit unsigned
integer.
Signed-off-by: Vijay Kumar Srivastava <vsrivast@xilinx.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Srivastava [Tue, 16 Mar 2021 08:58:26 +0000 (11:58 +0300)]
common/sfc_efx/base: support getting virtq doorbell offset
Add an API to query the virtqueue doorbell offset in the BAR for a VI.
For vDPA, the virtio net driver notifies the device directly by writing
doorbell. This API would be invoked from vDPA client driver.
Signed-off-by: Vijay Srivastava <vijays@solarflare.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Vijay Srivastava [Tue, 16 Mar 2021 08:58:25 +0000 (11:58 +0300)]
common/sfc_efx/base: add base virtio support for vDPA
In the vDPA mode, only data path is offloaded in the hardware and
control path still goes through the hypervisor and it configures
virtqueues via vDPA driver so new virtqueue APIs are required.
Implement virtio init/fini and virtqueue create/destroy APIs.
Signed-off-by: Vijay Srivastava <vijays@solarflare.com>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Kalesh AP [Tue, 16 Mar 2021 06:51:36 +0000 (12:21 +0530)]
app/testpmd: check MAC address query
This patch checks return value for rte_eth_dev_info_get() in show_macs().
Coverity issue: 353629
Fixes:
e1d44d0ad623 ("app/testpmd: show MAC addresses added to a port")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Somnath Kotur [Fri, 12 Mar 2021 05:21:09 +0000 (10:51 +0530)]
net/bnxt: fix Rx and Tx timestamps
timesync adjust and write_time APIs needed to account for the Rx and Tx
timestamp counters as well. Fix it since it was not done earlier.
Fixes:
b11cceb83a34 ("net/bnxt: support timesync")
Cc: stable@dpdk.org
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Somnath Kotur [Fri, 12 Mar 2021 05:21:08 +0000 (10:51 +0530)]
net/bnxt: fix xstats get
Fix to return count in xstats get op in all cases.
Driver was returning 0 if the 'xstats' parameter being passed to
xstats_get_op was NULL. This won't work on some applications that
rely on a valid count being passed even in this case so that it can
allocate memory accordingly followed by a reissue of the xstats_get_op
to get the actual stats populated by the driver.
Fixes:
063e59ddd28e ("net/bnxt: fix crash in xstats get")
Cc: stable@dpdk.org
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Lance Richardson [Sat, 6 Mar 2021 15:19:11 +0000 (10:19 -0500)]
net/bnxt: optimize Tx completion handling
Avoid copying mbuf pointers to separate array for bulk
mbuf free when handling transmit completions for vector
mode transmit.
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Thu, 11 Mar 2021 23:30:33 +0000 (15:30 -0800)]
net/bnxt: rename a member to avoid conflict
Address build issues with Clang and without glibc on ppc64le.
Vector can be a keyword and should not be used in code.
Renaming it to avoid conflict.
Reported-by: Piotr Kubaj <pkubaj@freebsd.org>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 10 Mar 2021 07:50:54 +0000 (13:20 +0530)]
net/bnxt: mute some failure logs
In the init path, driver ignores few of the HWRM command failures.
There is no need to log the error message in those cases.
Fixes:
3fb93bc7c349 ("net/bnxt: initialize parent PF information")
Fixes:
4e3f887bec4b ("net/bnxt: support HWRM port PHY qcaps")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 10 Mar 2021 07:50:53 +0000 (13:20 +0530)]
net/bnxt: fix HWRM and FW incompatibility handling
Fix to return an error when the HWRM version that the driver
is compiled against is incompatible with the FW that is actually
running on the card. This is determined based on the req length
indicated by FW against the value supported in the HWRM.
Fixes:
804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Thu, 4 Mar 2021 09:07:28 +0000 (14:37 +0530)]
net/bnxt: fix VF info allocation
1. Renamed bnxt_hwrm_alloc_vf_info()/bnxt_hwrm_free_vf_info to
bnxt_alloc_vf_info()/bnxt_free_vf_info as it does not
issue any HWRM command to fw.
2. Fix missing unlock when memory allocation fails.
Fixes:
b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Thu, 4 Mar 2021 09:07:27 +0000 (14:37 +0530)]
net/bnxt: fix device readiness check
Fix HWRM_VER_GET command to handle DEV_NOT_RDY state.
Driver should fail probe if the device is not ready.
Conversely, the HWRM_VER_GET poll after reset can safely
retry until the existing timeout is exceeded.
Fixes:
804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Thu, 4 Mar 2021 09:07:26 +0000 (14:37 +0530)]
net/bnxt: check flush status during ring free
When host SW issues a HWRM_RING_FREE for Tx/Rx/AGG ring in HW,
the FW flushes the BDs associated with the ring and performs other
cleanup in the HW. The host software should ideally check for an
indication from the FW indicating this step has been completed
successfully to avoid unexpected errors during cleanup.
The FW issues a HWRM_DONE response to the RING_FREE request on
the corresponding CQ ring. Poll the CQs during cleanup and
ensure the HWRM_FREE command is completed not just based on the
value of valid bit but also the HWRM_DONE response for the ring.
If the HWRM_DONE response is not seen, force the cleanup to
complete just based on the valid bit.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Lance Richardson [Tue, 2 Mar 2021 15:16:08 +0000 (10:16 -0500)]
net/bnxt: refactor mbuf pointer reset
Remove code for setting consumed mbuf pointers to NULL from the
vector receive functions as a minor performance optimization.
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Lance Richardson [Tue, 2 Mar 2021 14:29:27 +0000 (09:29 -0500)]
net/bnxt: fix Rx descriptor status
Fix a number of issues in the bnxt receive descriptor status
function, including:
- Provide status of receive descriptor instead of completion
descriptor.
- Remove invalid comparison of raw ring index with masked ring
index.
- Correct misinterpretation of offset parameter as ring index.
- Correct misuse of completion ring index for mbuf ring (the
two rings have different sizes).
Fixes:
0fe613bb87b2 ("net/bnxt: support Rx descriptor status")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:53 +0000 (21:25 +0530)]
net/bnxt: fix PTP support for Thor
On Thor, Rx timestamp is present in the Rx completion record.
Only 32 bits of the timestamp is present in the completion.
The driver needs to periodically poll the current 48 bit
free running timer using the HWRM_PORT_TS_QUERY command.
It can combine the upper 16 bits from the HWRM response
with the lower 32 bits in the Rx completion to produce
the 48 bit timestamp for the Rx packet.
This patch adds an alarm thread to periodically poll the current 48 bit
free running timer using the HWRM_PORT_TS_QUERY command.
This avoids issuing the hwrm command from the Rx handler.
This patch also handles the timer roll over condition.
Fixes:
6cbd89f9f3d8 ("net/bnxt: support PTP for Thor")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:52 +0000 (21:25 +0530)]
net/bnxt: fix FW readiness check during recovery
Moved fw readiness check to a new routine bnxt_check_fw_ready().
During error recovery, driver needs to wait for fw readiness.
For that, it uses bnxt_hwrm_ver_get() function now and that
function does parsing of the VER_GET response as well.
Added a new lightweight function bnxt_hwrm_poll_ver_get() for polling
the firmware readiness which issues VER_GET and checks for success
without processing the command response.
Fixes:
df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:51 +0000 (21:25 +0530)]
net/bnxt: fix firmware fatal error handling
During some fatal firmware error conditions, the PCI config space
register 0x2e which normally contains the subsystem ID will become
0xffff. This register will revert back to the normal value after
the chip has completed core reset. If we detect this condition,
we can poll this config register immediately for the value to revert.
Because we use config read cycles to poll this register, there is no
possibility of Master Abort if we happen to read it during core reset.
This speeds up recovery significantly as we don't have to wait for the
conservative min_time before polling to see if the firmware has come
out of reset. As soon as this register changes value we can proceed
to re-initialize the device.
Fixes:
df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:50 +0000 (21:25 +0530)]
net/bnxt: handle echo request async message
This is a new async message that the firmware can send to check if it
can communicate with the driver. This is an added error detection
scheme that firmware can use if it suspects errors in the PCIe
interface. When the driver receives this async message, it will reply
back echoing some data in the async message. If the firmware is not
getting the reply with the proper data after some retries, error
recovery will kick in.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:49 +0000 (21:25 +0530)]
net/bnxt: log port id in async events
1. Used port id in async event logs.
2. Added a debug log in bnxt_hwrm_func_driver_unregister().
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Wed, 24 Feb 2021 15:55:48 +0000 (21:25 +0530)]
net/bnxt: update to new version of backing store
Update HWRM headers to version 1.10.2.15
which updates the backing store API for additional TQM rings.
Add support for 9th TQM ring using latest firmware interface.
Also make sure that we set only necessary bits in the enables
field in backing store request.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:47 +0000 (21:25 +0530)]
net/bnxt: update HWRM structures
Brought in the latest hsi_struct_def_dpdk.h.
HWRM API is now updated to version 1.10.2.15.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Venkat Duvvuru [Wed, 24 Feb 2021 15:55:46 +0000 (21:25 +0530)]
net/bnxt: fix queues per VNIC
Update queues per VNIC in single queue mode.
bp->rx_num_qs_per_vnic is not initialized in the single queue mode.
As a result of this when an interface is reconfigured to single
queue mode from an existing multiqueue mode, bp->rx_num_qs_per_vnic
is not updated to the value of 1. Hence, the driver will try to
access more than one queue resulting in a crash.
This patch fixes it by initializing bp->rx_num_qs_per_vnic in the
single queue mode as well.
Fixes:
36024b2e7fe5 ("net/bnxt: allow dynamic creation of VNIC")
Cc: stable@dpdk.org
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:45 +0000 (21:25 +0530)]
net/bnxt: remove extra blank line
Removed an unnecessary extra blank line.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:44 +0000 (21:25 +0530)]
net/bnxt: fix VNIC configuration
PMD should not set any flags to receive RoCE traffic while
configuring the vnic. Since the PMD does not support RoCE
some of the flags and code is unused. Clean it up.
Fixes:
b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:43 +0000 (21:25 +0530)]
net/bnxt: remove unused macro
remove HWRM_SEQ_ID_INVALID macro.
Fixes:
804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Wed, 10 Mar 2021 18:31:01 +0000 (10:31 -0800)]
devtools: add acronyms in dictionary for commit checks
Update word list with VNIC and Thor to catch errors in patch title.
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Thu, 11 Mar 2021 10:00:40 +0000 (13:00 +0300)]
net/sfc: update copyright year
Bump copyright year to 2021.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Andrew Rybchenko [Thu, 11 Mar 2021 10:00:39 +0000 (13:00 +0300)]
common/sfc_efx: update copyright year
Bump copyright year to 2021.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Andrew Rybchenko [Thu, 11 Mar 2021 10:48:35 +0000 (13:48 +0300)]
app/testpmd: log reason of port start failure
Provide a bit more diagnostics information when port start fails.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Ivan Malov [Mon, 8 Mar 2021 06:51:04 +0000 (09:51 +0300)]
net: fix comment in IPv6 header
The comment got it wrong. The payload length field
does not include the fixed IPv6 header size.
Fixes:
7eca7f7fd09d ("net: add missing endianness annotations")
Fixes:
af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Igor Russkikh [Wed, 10 Mar 2021 08:16:31 +0000 (09:16 +0100)]
maintainers: update for qede
Removing Shahed, adding me and Devendra
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Acked-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:16 +0000 (07:48 +0000)]
net/af_xdp: prefer busy polling
This commit introduces support for preferred busy polling
to the AF_XDP PMD. This feature aims to improve single-core
performance for AF_XDP sockets under heavy load.
A new vdev arg is introduced called 'busy_budget' whose default
value is 64. busy_budget is the value supplied to the kernel
with the SO_BUSY_POLL_BUDGET socket option and represents the
busy-polling NAPI budget. To set the budget to a different value
eg. 256:
--vdev=net_af_xdp0,iface=eth0,busy_budget=256
Preferred busy polling is enabled by default provided a kernel with
version >= v5.11 is in use. To disable it, set the budget to zero.
The following settings are also strongly recommended to be used in
conjunction with this feature:
echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 200000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
.. where eth0 is the interface being used by the PMD.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:15 +0000 (07:48 +0000)]
net/af_xdp: use recvfrom instead of poll syscall
poll() is more expensive and requires more tuning
when used with the upcoming busy polling functionality.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:14 +0000 (07:48 +0000)]
net/af_xdp: allow bigger batch sizes
Prior to this commit, the maximum batch sizes for zero-copy and
copy-mode rx and copy-mode tx were set to 32. Apart from zero-copy tx,
the user could never rx/tx any more than 32 packets at a time and
without inspecting the code the user wouldn't be aware of this.
This commit removes these upper limits placed on the user and instead
sets an internal batch size equal to the default ring size (2048).
Batches larger than this are still processed, however they are split
into smaller batches similar to how it's done in other drivers. This is
necessary because some arrays used during rx/tx need to be sized at
compile-time.
Allowing a larger batch size allows for fewer batches and thus larger
bulk operations, fewer ring accesses and fewer syscalls which should
yield improved performance.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Thomas Monjalon [Tue, 16 Mar 2021 22:09:38 +0000 (23:09 +0100)]
bus/pci: fix Windows kernel driver categories
In Windows probing, the value RTE_PCI_KDRV_NONE was used
instead of RTE_PCI_KDRV_UNKNOWN.
This value covers the mlx case where the kernel driver is in place,
offering a bifurcated mode to the userspace driver.
When the kernel driver is listed as unknown,
there is no special treatment in DPDK probing, contrary to UIO modes.
The value RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
While adding the new value RTE_PCI_KDRV_NET_UIO
(at the end for ABI compatibility),
the enum of kernel driver categories is annotated.
Fixes:
b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
Fixes:
c76ec01b4591 ("bus/pci: support netuio on Windows")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Thomas Monjalon [Wed, 17 Mar 2021 14:34:02 +0000 (15:34 +0100)]
eal: mark version parts API as experimental
Some functions were introduced in DPDK 21.05 to query the version parts
(prefix, year, month, minor, suffix, release) at runtime.
Per guidelines, these new public functions must be marked with
__rte_experimental and ABI versioned as EXPERIMENTAL.
Fixes:
5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Thomas Monjalon [Wed, 17 Mar 2021 09:18:22 +0000 (10:18 +0100)]
eal: fix version macro
The macro RTE_VERSION was broken since updated with function calls.
It is a build-time version number, and must be built with macros.
For a run-time version number, there is the function rte_version().
Fixes:
5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Hemant Agrawal [Tue, 23 Feb 2021 06:14:12 +0000 (11:44 +0530)]
examples/ptpclient: enable Rx timestamp offload
This patch add support to enable Rx offload for timestamp.
It is required to be enabled for some PMDs e.g. dpaa2.
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
Anatoly Burakov [Wed, 3 Mar 2021 12:35:13 +0000 (12:35 +0000)]
doc: update power management in doxygen API index
The headers rte_power_intrinsics.h and rte_power_pmd_mgmt.h
were missing from the doxygen API index.
Fixes:
cda57d9388c0 ("eal: add power management intrinsics")
Fixes:
682a645438c5 ("power: add ethdev power management")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Khoa To [Mon, 1 Mar 2021 06:52:41 +0000 (22:52 -0800)]
bus/pci: support allow/block lists on Windows
EAL -a and -b options are used to specify which PCI devices are
explicitly allowed or blocked during PCI bus scan. This evaluation
is missing in the Windows implementation of rte_pci_scan.
This patch provides this missing functionality, so that apps can specify
which devices to ignore during PCI bus scan.
Signed-off-by: Khoa To <khot@microsoft.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Nick Connolly [Mon, 1 Mar 2021 09:56:44 +0000 (09:56 +0000)]
bus/pci: set Windows device class and bus
Attaching to an NVMe disk on Windows using SPDK requires the
PCI class ID and device.bus fields. Decode the class ID from the PCI
device info strings if it is present and set device.bus.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Pallavi Kadam [Wed, 10 Feb 2021 20:36:54 +0000 (12:36 -0800)]
bus/pci: skip probing some Windows NDIS devices
Implement rte_pci_map_device() to distinguish between the devices bound
to netuio and NDIS devices.
Only return success for the netuio devices.
Fixes:
c76ec01b4591 ("bus/pci: support netuio on Windows")
Cc: stable@dpdk.org
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tal Shnaiderman [Thu, 18 Feb 2021 11:40:58 +0000 (13:40 +0200)]
eal/windows: fix default thread priority
The hard-coded thread priority for Windows threads in EAL
is REALTIME_PRIORITY_CLASS/THREAD_PRIORITY_TIME_CRITICAL.
This results in issues with DPDK threads causing OS thread starvation
and eventually a bugcheck.
The fix reduce the thread priority to
NORMAL_PRIORITY_CLASS/THREAD_PRIORITY_NORMAL.
Bugzilla ID: 600
Fixes:
53ffd9f080f ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Reported-by: Odi Assli <odia@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Sat, 27 Feb 2021 20:32:01 +0000 (23:32 +0300)]
eal/windows: add missing SPDX license tag
Fixes:
c08bd191b13d ("eal/windows: initialize hugepage info")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Jie Zhou [Mon, 8 Mar 2021 18:05:39 +0000 (10:05 -0800)]
metrics: export telemetry stubs if no libjansson
This patch allows the same set of rte_metrics_tel_* functions to be
exported no matter JANSSON is available or not, by doing following:
1. Leverage dpdk_conf to set configuration flag RTE_HAS_JANSSON
when Jansson dependency is found.
2. In rte_metrics_telemetry.c, leverage RTE_HAS_JANSSON to handle the
case when JANSSON is not available by adding stubs for all the instances.
3. In meson.build, per dpdk/doc/guides/rel_notes/release_20_05.rst,
it is claimed that "Telemetry library is no longer dependent on the
external Jansson library, which allows Telemetry be enabled by default.",
thus make the deps and includes of Telemetry as not conditional anymore.
Signed-off-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Ferruh Yigit [Tue, 9 Feb 2021 15:06:20 +0000 (15:06 +0000)]
log/linux: make default output stderr
In Linux by default DPDK log goes to stdout, as well as syslog.
It is possible for an application to change the library output stream
via 'rte_openlog_stream()' API, to set it to stderr, it can be used as:
rte_openlog_stream(stderr);
But still updating the default log output to 'stderr'.
Bugzilla ID: 8
Fixes:
af75078fece3 ("first public release")
Cc: stable@dpdk.org
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Juraj Linkeš [Thu, 11 Feb 2021 12:59:46 +0000 (13:59 +0100)]
build: support KNI cross-compilation
The KNI linux module is using a custom target for building, which
doesn't take into account any cross compilation arguments. The arguments
in question are ARCH, CROSS_COMPILE (for gcc, clang) and CC, LD (for
clang). Get those from the cross file and pass them to the custom
target.
The user supplied path may not contain the 'build' directory, such as
when using cross-compiled headers, so only append that in the default
case (when no path is supplied in native builds) and use the unmodified
path from the user otherwise. Also modify the install path accordingly.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 16 Feb 2021 15:13:29 +0000 (15:13 +0000)]
eal: fix querying DPDK version at runtime
For using a DPDK application, such as OVS, which is dynamically linked, the
DPDK version in use should always report the actual version, not the
version used at build time. This incorrect behaviour can be seen by
building OVS against one version of DPDK and running it against a later
one. Using "ovs-vsctl list Open_vSwitch" to query basic info, the
dpdk_version returned will be the build version not the currently running
one - which can be verified using the DPDK telemetry library client.
$ sudo ovs-vsctl list Open_vSwitch | grep dpdk_version
dpdk_version : "DPDK 20.11.0-rc4"
$ echo quit | sudo dpdk-telemetry.py
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.02.0-rc2", "pid": 405659, "max_output_len": 16384}
-->
To fix this, we need to convert the rte_version() function, and any other
necessary parts of the rte_version.h, to be actual functions in EAL, not
just inlines/macros. The only complication in doing so is that telemetry
library cannot call rte_version() directly, and instead needs the version
string passed in on init.
Fixes:
af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Fri, 12 Mar 2021 14:56:05 +0000 (14:56 +0000)]
build: exclude meson files from examples installation
The meson.build files in each example directory is simply to support
building the example as part of the main SDK build, and these should not
be installed with the example's source code and makefile. The exclude of
"meson.build" only filters out the top-level examples/meson.build file,
not the file in each subdirectory.
To fix this, we can build up the list of files to exclude based off the
list of all examples. With this change "find examples/ -name meson.build"
returns no hits when run on an installed instance.
Fixes:
e5b95003f1df ("examples: fix flattening directory layout on install")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Huawei Xie [Wed, 10 Mar 2021 17:36:30 +0000 (01:36 +0800)]
bus/pci: support MMIO for ioport
With I/O BAR, we get PIO (port-mapped I/O) address.
With MMIO (memory-mapped I/O) BAR, we get mapped virtual address.
We distinguish PIO and MMIO by their address range like how kernel does,
i.e, address below 64K is PIO.
ioread/write8/16/32 is provided to access PIO/MMIO.
By the way, for virtio on arch other than x86, BAR flag indicates PIO
but is mapped.
Signed-off-by: Huawei Xie <huawei.xhw@alibaba-inc.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Huawei Xie [Wed, 10 Mar 2021 17:36:29 +0000 (01:36 +0800)]
bus/pci: use Linux PCI sysfs to get PIO address
Currently virtio PMD assumes legacy device uses PIO bar.
There are three ways to get PIO (port-mapped I/O) address for virtio
legacy device.
1) under igb_uio
- get PIO address from uio/uio# sysfs attribute, for instance:
/sys/bus/pci/devices/0000:00:09.0/uio/uio0/portio/port0/start
2) under uio_pci_generic
- for X86, get PIO address from /proc/ioport
- for other ARCH, get PIO address from standard PCI sysfs attribute,
for instance: /sys/bus/pci/devices/0000:00:09.0/resource
Actually, "port0/start" in igb_uio and "resource" point to exactly the
same thing, i.e, pci_dev->resource[0] in kernel source code.
This patch refactors these messy things, and uses standard PCI sysfs
attribute "resource".
Signed-off-by: Huawei Xie <huawei.xhw@alibaba-inc.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Satheesh Paul [Tue, 9 Feb 2021 10:01:13 +0000 (15:31 +0530)]
net/octeontx2: fix VLAN filter
This patch fixes incorrect MCAM key preparation when creating
MCAM entry to allow VLAN IDs after vlan filtering is enabled on port.
Fixes:
ba1b3b081edf ("net/octeontx2: support VLAN offloads")
Cc: stable@dpdk.org
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Satheesh Paul [Mon, 8 Feb 2021 13:01:57 +0000 (18:31 +0530)]
net/octeontx2: support flow API dump
Add support to dump hardware internal representation information of
rte flow to file.
Every flow rule added will be dumped in the below format.
MCAM Index:1881
Interface :NIX-RX (0)
Priority :1
NPC RX Action:0X00000000404001
ActionOp:NIX_RX_ACTIONOP_UCAST (1)
PF_FUNC: 0X400
RQ Index:0X004
Match Id:0000
Flow Key Alg:0
NPC RX VTAG Action:0X00000000008100
VTAG0:relptr:0
lid:0X1
type:0
Patterns:
NPC_PARSE_NIBBLE_CHAN:000
NPC_PARSE_NIBBLE_LA_LTYPE:LA_ETHER
NPC_PARSE_NIBBLE_LB_LTYPE:NONE
NPC_PARSE_NIBBLE_LC_LTYPE:LC_IP
NPC_PARSE_NIBBLE_LD_LTYPE:LD_TCP
NPC_PARSE_NIBBLE_LE_LTYPE:NONE
LA_ETHER, hdr offset:0, len:0X6, key offset:0X8,\
Data:0X4AE124FC7FFF, Mask:0XFFFFFFFFFFFF
LA_ETHER, hdr offset:0XC, len:0X2, key offset:0X4, Data:0XCA5A,\
Mask:0XFFFF
LC_IP, hdr offset:0XC, len:0X8, key offset:0X10,\
Data:0X0A01010300000000, Mask:0XFFFFFFFF00000000
LD_TCP, hdr offset:0, len:0X4, key offset:0X18, Data:0X03450000,\
Mask:0XFFFF0000
MCAM Raw Data :
DW0 :
0000CA5A01202000
DW0_Mask:
0000FFFF0FF0F000
DW1 :
00004AE124FC7FFF
DW1_Mask:
0000FFFFFFFFFFFF
DW2 :
0A01010300000000
DW2_Mask:
FFFFFFFF00000000
DW3 :
0000000003450000
DW3_Mask:
00000000FFFF0000
DW4 :
0000000000000000
DW4_Mask:
0000000000000000
DW5 :
0000000000000000
DW5_Mask:
0000000000000000
DW6 :
0000000000000000
DW6_Mask:
0000000000000000
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Jiawei Zhu [Mon, 1 Mar 2021 17:19:50 +0000 (12:19 -0500)]
net/mlx5: fix Rx segmented packets on mbuf starvation
The issue occurred if mbuf starvation happened
in the middle of segmented packet reception.
In such a situation, after release the segments of
packet being received, code did not advance the
consumer index to the next stride. This caused
the receiving of the wrong segmented packet data.
The possible error scenario:
- we assume segs_n is 4 and we are receiving 4
segments of multi-segment packet.
- we fail to allocate mbuf while receiving the 3rd segment,
and this frees the mbufs of the packet chain we have built.
There are the 1st and 2nd segments in the chain.
- the 1st and the 2nd segments of this stride of Rx queue
are filled up (in elts array) with the new allocated
mbufs and their data are random (the 3rd and 4th
segments still contain the valid data of the packet though).
- on the next iteration of stride processing we get
the wrong two segments of the multi-segment packet.
Hence, we should skip these mbufs in the stride and
we should advance the consumer index on loop exit.
Fixes:
15a756b63734 ("net/mlx5: fix possible NULL dereference in Rx path")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Zhu <zhujiawei12@huawei.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xiaoyun Li [Tue, 2 Mar 2021 07:03:20 +0000 (15:03 +0800)]
net/i40e: fix IPv4 fragment offload
IPv4 fragment_offset mask was required to be 0 no matter what the
spec value was. But zero mask means not caring about fragment_offset
field then both non-frag and frag packets should hit the rule.
But the actual fragment rules should be like the following:
Only non-fragment packets can hit Rule 1:
Rule 1: mask=0x3fff, spec=0
Only fragment packets can hit rule 2:
Rule 2: mask=0x3fff, spec=0x8, last=0x2000
This patch allows the above rules.
Fixes:
42044b69c67d ("net/i40e: support input set selection for FDIR")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:31 +0000 (21:34 -0500)]
raw/ifpga: add miscellaneous APIs
Below miscellaneous APIs are used to implement OPAE application.
1. rte_pmd_ifpga_get_pci_bus() get PCI bus ifpga driver registered.
2. rte_pmd_ifpga_partial_reconfigure() do partial reconfiguration.
3. rte_pmd_ifpga_cleanup() free software resources allocated by driver.
4. rte_pmd_ifpga_set_rsu_status() set status of rsu process.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:30 +0000 (21:34 -0500)]
raw/ifpga: add APIs to get FPGA information
There are some information data can be got from FPGA, they are
implemented in below APIs:
1. rte_pmd_ifpga_get_property() get properties of FPGA (include BMC).
2. rte_pmd_ifpga_get_phy_info() get information of PHY connect to FPGA.
3. rte_pmd_ifpga_get_rsu_status() get status of rsu process.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:29 +0000 (21:34 -0500)]
raw/ifpga: add FPGA RSU APIs
RSU (Remote System Update) depends on secure manager which may be
different on various implementations, so a new secure manager device
is implemented for adapting such difference.
There are five APIs added:
1. rte_pmd_ifpga_get_dev_id() get raw device ID of ifpga device from PCI
address like 'Domain:Bus:Dev.Func'.
2. rte_pmd_ifpga_update_flash() update flash with specific image file.
3. rte_pmd_ifpga_stop_update() abort flash update process.
4. rte_pmd_ifpga_reboot_try() check current ifpga status and change it
to reboot status if it is idle.
5. rte_pmd_ifpga_reload() trigger full reconfiguration of ifpga device.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Beilei Xing [Wed, 24 Feb 2021 02:09:00 +0000 (10:09 +0800)]
net/i40evf: fix packet loss for X722
When Tx queue number is more than Rx queue number, and RSS is
enabled, there'll be packet loss with X722.
The root cause is the lookup table is not configured correctly,
since it uses VF's queue pair number but not Rx queue number.
Fixes:
2da3ba746795 ("net/i40e: fix VF runtime queues RSS config")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Hengjian Zhang <hengjianx.zhang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:07 +0000 (10:54 +0800)]
net/ice: clean GTPU flow type for flow director
Currently, FDIR only support GTPU outer fields in PF. Clean the
redundant GTPU inner info in flow type definition and align with
shared code.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:06 +0000 (10:54 +0800)]
net/ice: distinguish input set outer fields
Distinguish input_set_mask to inner and outer part. Use
input_set_mask_o for tunnel outer or non-tunnel input set.
input_set_mask_i is used for tunnel inner fields only.
Adjust indentation of ice_pattern_match_item list in switch, ACL, RSS
and FDIR for easy review.
For switch, ACL and RSS, only use
input_set_mask_o and set the input_set_mask_i all none.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:05 +0000 (10:54 +0800)]
net/ice: refactor input set config
For tunnel or non-tunnel packet, the input set is in outer_input_set
and use seg_tun[0]. seg_tun[1] is only used for tunnel inner fields.
This patch make align with input_set inner/outer with seg_tun[] and
simplify it.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:04 +0000 (10:54 +0800)]
net/ice: refactor flow pattern parser
Distinguish inner/outer input_set. And avoid too many nested
conditionals in each type's parser. input_set_o is used for
tunnel outer fields or non-tunnel fields , input_set_i is only
used for inner fields.
For GTPU, store the outer IP fields in inner part to align with
shared code behavior.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:03 +0000 (10:54 +0800)]
net/ice: refactor flow director filter structure
This patch use input_set_o and input_set_i to distinguish inner/outer
input set. input_set_i is only used for inner field.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:02 +0000 (10:54 +0800)]
net/ice: clean input set macro definition
Currently, the macro of input set use 2 bits, one bit for protocol and
inner/outer, another bit for src/dst field. But this could not
distinguish a rule with inner and outer fields for tunnel packet.
Redefine input set macro to make it clear. Only use these two bits for
protocol and field. Ignore the redundant inner/outer info.
ICE_INSET_TUN_* is used by switch module, should be removed after
switch refactor.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:57 +0000 (15:23 +0800)]
net/ice/base: cleanup filter list on error
When ice_remove_vsi_lkup_fltr is called, by calling
ice_add_to_vsi_fltr_list local copy of vsi filter list
is created. If any issues during creation of vsi filter
list occurs it up for the caller to free already
allocated memory. This patch ensures proper memory
deallocation in these cases.
Fixes:
c7dd15931183 ("net/ice/base: add virtual switch code")
Cc: stable@dpdk.org
Signed-off-by: Robert Malz <robertx.malz@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:56 +0000 (15:23 +0800)]
net/ice/base: fix uninitialized struct
One of the structs being used for ACL counter rules was allocated on
the stack and left uninitialized. Rather than depending on
undefined behavior around the .amount member during rule removal,
just leave a comment and initialize the struct to zero, as this is a
slow path call anyway. This bug could have caused silent failures
during counter removal.
Fixes:
f3202a097f12 ("net/ice/base: add ACL module")
Cc: stable@dpdk.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:55 +0000 (15:23 +0800)]
net/ice/base: update GTPU EH dummy packets for FDIR
Update GTPU EH dummy pkts for FDIR, including EH/DL/UL.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>