git.droids-corp.org - dpdk.git/log

commit | commitdiff | tree

Stephen Hemminger [Sat, 13 Nov 2021 17:22:54 +0000 (09:22 -0800)]

ipc: end multiprocess thread during cleanup

When rte_eal_cleanup is called, all control threads should exit.
For the mp thread, this is best handled by closing the mp_socket
and letting the thread see that.

This also fixes potential problems where the mp_socket gets
another hard error, and the thread runs away repeating itself
by reading the same error.

Fixes: 85d6815fa6d0 ("eal: close multi-process socket during cleanup")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Stephen Hemminger [Sat, 13 Nov 2021 17:22:53 +0000 (09:22 -0800)]

log: close in cleanup stage

When an application calls rte_eal_cleanup on shutdown,
the DPDK log should be closed and cleaned up.

This helps reduce false reports from tools like ASAN
and valgrind that track memory leaks.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

commit | commitdiff | tree

David Marchand [Thu, 3 Feb 2022 09:39:12 +0000 (10:39 +0100)]

test/mbuf: fix mbuf data content check

When allocating a mbuf, its data content is most of the time zero'd but
nothing ensures this. This is especially wrong when building with
RTE_MALLOC_DEBUG, where data is poisoned to 0x6b on free.

This test reserves MBUF_TEST_DATA_LEN2 bytes in the mbuf data segment,
and sets this data to 0xcc.
By calling strlen(), the test may read more than MBUF_TEST_DATA_LEN2
bytes, which was noticed when the memory had been poisoned.

The mbuf data content is checked right after, so we can simply remove
strlen().

Fixes: 7b295dceea07 ("test/mbuf: add unit test cases")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
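
The hazard described in this commit can be sketched in plain C. This is a minimal illustration, not the test's actual code: DATA_LEN and region_is_filled() are hypothetical names. The point is to check exactly the reserved bytes instead of calling strlen() on a region that is not guaranteed to contain a terminator.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for MBUF_TEST_DATA_LEN2. */
#define DATA_LEN 50

/*
 * Return 1 if the first len bytes of buf all equal pattern, 0 otherwise.
 * Unlike strlen(), this never reads past len bytes, so it stays in bounds
 * even when the memory after the region is poisoned and holds no '\0'.
 */
static int
region_is_filled(const unsigned char *buf, size_t len, unsigned char pattern)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] != pattern)
			return 0;
	}
	return 1;
}
```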

commit | commitdiff | tree

Vladimir Medvedkin [Thu, 27 Jan 2022 18:08:53 +0000 (18:08 +0000)]

app/fib: fix division by zero

This patch fixes the division by 0, which occurs if the number of
routes is less than 10.
It can be triggered by passing the -n argument with a value < 10:

./dpdk-test-fib -- -n 9
...
Floating point exception (core dumped)

Fixes: 103809d032cd ("app/test-fib: add test application for FIB")
Cc: stable@dpdk.org
Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
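
A minimal sketch of this kind of guard, assuming the zero divisor comes from dividing the route count by 10 (the commit message suggests this but does not spell it out). progress_interval() and should_report() are hypothetical helpers, not functions from the application.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how often to print progress so that roughly ten
 * reports are produced. With fewer than 10 routes, nb_routes / 10 is 0,
 * and using 0 as a divisor raises SIGFPE on most platforms, so the
 * interval is clamped to at least 1.
 */
static uint32_t
progress_interval(uint32_t nb_routes)
{
	uint32_t step = nb_routes / 10;

	return step == 0 ? 1 : step;
}

/* Return 1 if a progress line should be printed for route index i. */
static int
should_report(uint32_t i, uint32_t nb_routes)
{
	return (i % progress_interval(nb_routes)) == 0;
}
```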

commit | commitdiff | tree

Yunjian Wang [Tue, 14 Dec 2021 13:30:25 +0000 (21:30 +0800)]

mem: check allocation in dynamic hugepage init

The function malloc() can return NULL, so its return value
needs to be checked.

Fixes: 6f63858e55e6 ("mem: prevent preallocated pages from being freed")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

commit | commitdiff | tree

Pavan Nikhilesh [Fri, 5 Nov 2021 08:57:12 +0000 (14:27 +0530)]

eal/arm: inline 128-bit atomic compare exchange with GCC

GCC [1] now assigns even register pairs for CASP; the fix has also been
backported to all stable releases of older GCC versions.
Removing the manual register allocation allows GCC to inline the
functions and pick optimal registers for performing CASP.

1: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=563cc649beaf

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:38 +0000 (15:42 +0000)]

vhost: fix C++ include

The virtio kernel header includes are already noted as being
incompatible with C++. We can ensure that the header is safe for
inclusion in C++ code by not including those headers during C++ builds.
While not ideal, this does ensure that all DPDK headers can be included
in C++ code without errors.

Fixes: f8904d563691 ("vhost: fix header for strict compilation flags")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:37 +0000 (15:42 +0000)]

table: fix C++ include

Since C++ doesn't support automatic casting from void * to other types,
we need to explicitly add the casts to any header files in DPDK.

Fixes: ea7be0a0386e ("lib/librte_table: add hash function headers")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
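
A small sketch of the cast issue, using a hypothetical inline helper (hash_key8() is not a DPDK function). The explicit cast is redundant in C but required in C++, so writing it out keeps the header valid in both languages.

```c
#include <stdint.h>

/*
 * Hypothetical inline helper of the kind found in a public header.
 * In C, "const uint64_t *k = key;" compiles without a cast. C++ rejects
 * the implicit conversion from void *, so the cast is written out.
 */
static inline uint64_t
hash_key8(const void *key, uint64_t seed)
{
	const uint64_t *k = (const uint64_t *)key;

	return *k ^ seed;
}
```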

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:36 +0000 (15:42 +0000)]

ipsec: fix C++ include

C++ does not have automatic casting to/from void pointers, so an
explicit cast is needed if the header is to be included in C++ code.

Fixes: f901d9c82688 ("ipsec: add helpers to group completed crypto-ops")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:35 +0000 (15:42 +0000)]

graph: fix C++ include

C++ does not have automatic casting to/from void pointers, so an
explicit cast is needed if the header is to be included in C++ code.

Fixes: 40d4f51403ec ("graph: implement fastpath routines")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:34 +0000 (15:42 +0000)]

eventdev: fix C++ include

The eventdev headers had issues when used from C++:

* Missing closing "}" for the extern "C" block
* No automatic casting to/from void *

Fixes: a6562f6d6f8e ("eventdev: introduce event timer adapter")
Fixes: 32e326869ed6 ("eventdev: add tracepoints")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
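
The first issue in the list above concerns the usual extern "C" wrapper in C headers. A minimal sketch of the balanced pattern follows; example_api_version() is a hypothetical placeholder for the header's real declarations.

```c
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical declaration standing in for the header's public API. */
static inline int
example_api_version(void)
{
	return 1;
}

#ifdef __cplusplus
} /* closes the extern "C" block; omitting this breaks C++ builds */
#endif
```

A C compiler never sees either brace, which is why a missing closing brace goes unnoticed until the header is included from C++.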

commit | commitdiff | tree

Bruce Richardson [Thu, 10 Feb 2022 15:42:33 +0000 (15:42 +0000)]

eal: fix C++ include

C++ files could not include some headers because:

* "new" is a keyword in C++, so can't be a variable name
* there is no automatic casting to/from void *

Fixes: 184104fc6121 ("ticketlock: introduce fair ticket based locking")
Fixes: 032a7e5499a0 ("trace: implement provider payload")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
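
A sketch of the keyword problem, with a hypothetical structure and function (struct ticket and ticket_swap_next() are illustrative and not taken from the DPDK headers).

```c
#include <stdint.h>

/* Hypothetical ticket counter, used only for illustration. */
struct ticket {
	uint16_t current;
	uint16_t next;
};

/*
 * A parameter or local named "new" is legal C but a syntax error in C++.
 * Renaming it (here to new_val) keeps the header usable from both.
 * Returns the previous value of t->next.
 */
static inline uint16_t
ticket_swap_next(struct ticket *t, uint16_t new_val)
{
	uint16_t old = t->next;

	t->next = new_val;
	return old;
}
```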

commit | commitdiff | tree

Juraj Linkeš [Tue, 25 Jan 2022 10:08:28 +0000 (11:08 +0100)]

config/arm: add values for native armv7

Armv7 native build fails with this error:
../config/meson.build:364:1: ERROR: Problem encountered:
Number of CPU cores not specified.

This is because RTE_MAX_LCORE is not set. We also need to set
RTE_MAX_NUMA_NODES in armv7 native builds.

Fixes: 8ef09fdc506b ("build: add optional NUMA and CPU counts detection")
Cc: stable@dpdk.org
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>

commit | commitdiff | tree

David Marchand [Thu, 10 Feb 2022 08:55:43 +0000 (09:55 +0100)]

stack: fix stubs header export

The stubs header is included as part of rte_stack.h for architectures
other than x86_64 and aarch64 (i.e. x86 32 bits and ppc).

Note: chkincs won't catch this issue since it checks headers from within
the source directory.

Fixes: 7911ba0473e0 ("stack: enable lock-free implementation for aarch64")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>

commit | commitdiff | tree

Weiguo Li [Tue, 25 Jan 2022 11:51:41 +0000 (19:51 +0800)]

regex/mlx5: fix memory allocation check

The wrong field was checked after allocation.

Fixes: e3dbbf718ebc ("regex/mlx5: support configuration")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Elena Agostini [Thu, 10 Feb 2022 01:58:51 +0000 (01:58 +0000)]

gpu/cuda: differentiate V100 32GB GPU IDs

Differentiate between the V100 32GB SXM2 GPU device ID
and the V100 32GB PCIe GPU device ID.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

commit | commitdiff | tree

Elena Agostini [Thu, 27 Jan 2022 03:50:28 +0000 (03:50 +0000)]

gpudev: expose GPU memory to CPU

Enable the possibility to expose a GPU memory area and make it
accessible from the CPU.

GPU memory has to be allocated via rte_gpu_mem_alloc().

This patch allows the gpudev library to map (and unmap),
through the GPU driver, a chunk of GPU memory and to return
a memory pointer usable by the CPU to access the GPU memory area.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>

commit | commitdiff | tree

Christophe Fontaine [Mon, 7 Feb 2022 10:21:29 +0000 (11:21 +0100)]

vhost: remove payload size limitation

The FDs at the end of the VhostUserMessage structure limit the size
of the payload. Move them to another enclosing structure, before
the header & payload of a VhostUserMessage.
Also remove a reference to fds in the VHUMsg structure defined in
drivers/net/virtio/virtio_user/vhost_user.c

Signed-off-by: Christophe Fontaine <cfontain@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

commit | commitdiff | tree

Marvin Liu [Thu, 20 Jan 2022 12:22:18 +0000 (20:22 +0800)]

net/virtio: fix slots number when indirect feature on

Virtio driver only occupies one slot for enqueuing chained mbufs when
indirect feature is on. Required slots calculation should depend on
indirect feature status at the end.

Fixes: 0eaf7fc2fe8e ("net/virtio: separate AVX Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
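
A rough sketch of the calculation this commit describes, under stated assumptions: tx_slots_needed() and its parameters are hypothetical, and the extra slot for a header that cannot be pushed into the first segment is an assumption about the driver, not something the commit message states.

```c
/*
 * Hypothetical slot calculation for one chained packet.
 * nb_segs: number of segments in the chain.
 * can_push: non-zero if the header fits in front of the first segment.
 * use_indirect: non-zero if indirect descriptors are in use.
 */
static unsigned int
tx_slots_needed(unsigned int nb_segs, int can_push, int use_indirect)
{
	unsigned int slots = nb_segs + (can_push ? 0 : 1);

	/* The indirect check is applied last: a whole chain takes one slot. */
	if (use_indirect)
		slots = 1;
	return slots;
}
```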

commit | commitdiff | tree

Yuan Wang [Mon, 17 Jan 2022 16:20:27 +0000 (16:20 +0000)]

vhost: fix guest to host physical address mapping

Async copy fails when looking up hpa in the gpa to hpa mapping table.
This happens because the gpa is matched exactly in the merged
mapping table, and the merge loses the mapping entries.
A new range comparison method is introduced to solve this issue.

Fixes: 6563cf92380a ("vhost: fix async copy on multi-page buffers")
Cc: stable@dpdk.org
Signed-off-by: Yuan Wang <yuanx.wang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
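
The difference between exact matching and range comparison can be sketched as follows. struct gpa_range and gpa_to_hpa() are hypothetical and simplified; they are not the vhost library's own structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mapping entry: one contiguous guest-physical range. */
struct gpa_range {
	uint64_t gpa_start; /* first guest physical address in the range */
	uint64_t hpa_start; /* host physical address backing gpa_start */
	uint64_t size;      /* length of the range in bytes */
};

/*
 * Translate gpa to a host physical address by range comparison.
 * An exact match on gpa_start would miss any address that falls inside
 * a merged entry, so the test is start <= gpa < start + size.
 * Returns 0 when no range contains gpa.
 */
static uint64_t
gpa_to_hpa(const struct gpa_range *tbl, size_t n, uint64_t gpa)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (gpa >= tbl[i].gpa_start &&
		    gpa - tbl[i].gpa_start < tbl[i].size)
			return tbl[i].hpa_start + (gpa - tbl[i].gpa_start);
	}
	return 0;
}
```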

commit | commitdiff | tree

Xuan Ding [Mon, 22 Nov 2021 08:49:48 +0000 (08:49 +0000)]

doc: update recommended IOVA mode for async vhost

DPDK 21.11 adds vfio support for DMA device in vhost. This patch
updates recommended IOVA mode in async datapath.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

commit | commitdiff | tree

Sunil Kumar Kori [Tue, 8 Feb 2022 08:50:49 +0000 (14:20 +0530)]

app/testpmd: add queue based priority flow control command

Patch adds command line options to configure queue based
priority flow control.

- The command syntax is:

set pfc_queue_ctrl <port_id> rx <on|off> <tx_qid> <tx_tc> \
tx <on|off> <rx_qid> <rx_tc> <pause_time>

- Example command to configure queue based priority flow control
on rx and tx side for port 0, Rx queue 0, Tx queue 0 with pause
time 2047

testpmd> set pfc_queue_ctrl 0 rx on 0 0 tx on 0 0 2047

Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Jerin Jacob [Tue, 8 Feb 2022 08:50:48 +0000 (14:20 +0530)]

ethdev: support queue-based priority flow control

Based on device support and use-case need, there are two different ways
to enable PFC. The first case is the port level PFC configuration. In
this case, the rte_eth_dev_priority_flow_ctrl_set() API shall be used to
configure the PFC, and PFC frames will be generated based on the VLAN
TC value.

The second case is the queue level PFC configuration. In this
case, any packet field content can be used to steer the packet to the
specific queue using rte_flow or RSS, and then
rte_eth_dev_priority_flow_ctrl_queue_configure() is used to configure the
TC mapping on each queue.
Based on the congestion detected on the specific queue, the configured TC
shall be used to generate PFC frames.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Sunil Kumar Kori <skori@marvell.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Ivan Malov [Mon, 7 Feb 2022 11:15:59 +0000 (14:15 +0300)]

net/sfc: fix lock releases

Fixes: 155583abe63c ("net/sfc: implement representor queue setup and release")
Fixes: 75f080fdf74a ("net/sfc: implement port representor start and stop")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Kumara Parameshwaran [Thu, 3 Feb 2022 08:24:12 +0000 (13:54 +0530)]

drivers/net: use internal function to get ethdev struct

Make changes in PMDs to use the new function where
rte_eth_dev_get_port_by_name is used to get port_id
to access rte_eth_devices

Signed-off-by: Kumara Parameshwaran <kparameshwar@vmware.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Xiaoyun Li [Mon, 24 Jan 2022 12:28:57 +0000 (20:28 +0800)]

app/testpmd: add SW L4 checksum in multi-segments

Csum forwarding mode only supports software UDP/TCP csum calculation
for single segment packets when hardware offload is not enabled.
This patch enables software UDP/TCP csum calculation over multiple
segments.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Xiaoyun Li [Mon, 24 Jan 2022 12:28:56 +0000 (20:28 +0800)]

net: add UDP/TCP checksum in mbuf segments

Add functions to call rte_raw_cksum_mbuf() to calculate IPv4/6
UDP/TCP checksum in mbuf which can be over multi-segments.

Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Aman Singh <aman.deep.singh@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Sunil Pai G <sunil.pai.g@intel.com>
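
A simplified sketch of a checksum computed across segments. This is not rte_raw_cksum_mbuf() itself: struct seg and cksum_segments() are hypothetical, and the byte-at-a-time loop favours clarity over speed. It shows the main pitfall, which is that an odd-length segment must not shift the 16-bit word pairing of the segments that follow.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified segment; a real mbuf carries much more. */
struct seg {
	const uint8_t *data;
	size_t len;
	const struct seg *next;
};

/*
 * One's-complement checksum over a chain of segments.
 * Bytes are paired into big-endian 16-bit words by their position in the
 * whole packet, not within a segment, so segment boundaries do not
 * change the result.
 */
static uint16_t
cksum_segments(const struct seg *s)
{
	uint32_t sum = 0;
	size_t pos = 0;
	size_t i;

	for (; s != NULL; s = s->next) {
		for (i = 0; i < s->len; i++, pos++) {
			if ((pos & 1) == 0)
				sum += (uint32_t)s->data[i] << 8;
			else
				sum += s->data[i];
			/* fold the carry back so sum stays within 16 bits */
			sum = (sum & 0xffff) + (sum >> 16);
		}
	}
	return (uint16_t)~sum;
}
```

Splitting the same bytes into segments of 3 and 5 bytes gives the same checksum as one 8-byte segment, which is the property a multi-segment implementation has to preserve.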

commit | commitdiff | tree

Elena Agostini [Thu, 16 Dec 2021 23:38:40 +0000 (23:38 +0000)]

net/mlx5: add C++ include guard to public header

The support for linking rte_pmd_mlx5.h functions with
C++ applications was missing.

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

commit | commitdiff | tree

Nipun Gupta [Tue, 11 Jan 2022 05:05:38 +0000 (10:35 +0530)]

app/testpmd: update raw flow to take hex input

This patch allows the key and mask for raw rules to be provided
as hexadecimal values. A new parameter, pattern_mask, is added
to support this.

Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Steve Yang [Thu, 20 Jan 2022 02:59:31 +0000 (02:59 +0000)]

app/testpmd: fix stack overflow for EEPROM display

When the size of the EEPROM exceeds the default thread stack size (8MB),
e.g. 10MB, testpmd crashes due to stack overflow.

Allocate the EEPROM data on the heap.

Fixes: 6b67721dee2a ("app/testpmd: add EEPROM command")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Aman Singh <aman.deep.singh@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
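
A minimal sketch of moving a large buffer from the stack to the heap. read_eeprom() and eeprom_last_byte() are hypothetical stand-ins; the actual testpmd code and the ethdev EEPROM API are not reproduced here.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical reader: fills buf with len bytes of device data.
 * Here it just writes a fixed pattern.
 */
static void
read_eeprom(uint8_t *buf, size_t len)
{
	memset(buf, 0xa5, len);
}

/*
 * Return the last EEPROM byte, or -1 on failure.
 * A stack buffer such as "uint8_t buf[len]" overflows the stack once len
 * exceeds the thread stack size. A heap buffer is limited only by
 * available memory, and a failed allocation is reported through NULL.
 */
static int
eeprom_last_byte(size_t len)
{
	uint8_t *buf;
	int last;

	if (len == 0)
		return -1;
	buf = malloc(len);
	if (buf == NULL)
		return -1;
	read_eeprom(buf, len);
	last = buf[len - 1];
	free(buf);
	return last;
}
```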

commit | commitdiff | tree

Selwin Sebastian [Mon, 31 Jan 2022 05:39:20 +0000 (11:09 +0530)]

net/axgbe: disable CDR workaround for Yellow Carp device

Yellow Carp ethernet devices (V3xxx) do not require
autonegotiation CDR workaround, hence disable the same.

Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com>
Acked-by: Chandubabu Namburu <chandu@amd.com>

commit | commitdiff | tree

Selwin Sebastian [Mon, 31 Jan 2022 05:39:19 +0000 (11:09 +0530)]

net/axgbe: support Yellow Carp device

Yellow Carp ethernet devices (V3xxx) use the existing PCI ID but
the window settings for the indirect PCS access have been
altered. Add the check for Yellow Carp Ethernet devices to
use the new register values.

Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com>
Acked-by: Chandubabu Namburu <chandu@amd.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:50:02 +0000 (11:50 +0300)]

net/sfc: use even spread mode in flow action RSS

If the user provides contiguous ascending queue IDs,
use the even spread mode to avoid wasting resources
which are needed to serve indirection table entries.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
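
The precondition named in this commit, contiguous ascending queue IDs, can be checked as in the sketch below. queues_contiguous_ascending() is a hypothetical helper, not the driver's function.

```c
#include <stddef.h>
#include <stdint.h>

/* Return 1 if queues[] is contiguous and ascending: q, q+1, ..., q+n-1. */
static int
queues_contiguous_ascending(const uint16_t *queues, size_t n)
{
	size_t i;

	if (n == 0)
		return 0;
	for (i = 1; i < n; i++) {
		/* the addition is done in int, so 65535 + 1 cannot wrap to 0 */
		if (queues[i] != queues[i - 1] + 1)
			return 0;
	}
	return 1;
}
```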

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:50:01 +0000 (11:50 +0300)]

common/sfc_efx/base: support even spread RSS mode

Riverhead boards support spreading traffic across the
specified number of queues without using indirections.
This mode is provided by a dedicated RSS context type.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:50:00 +0000 (11:50 +0300)]

net/sfc: use adaptive table entry count in flow action RSS

Currently, every RSS context uses 128 indirection entries in
the hardware. That is not always optimal because the entries
come from a pool shared among all PCI functions of the board,
while the format of action RSS allows passing fewer queue IDs.

With EF100 boards, it is possible to decide how many entries
to allocate for the indirection table of a context. Make use
of that in order to optimise resource usage in RSS scenarios.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:49:59 +0000 (11:49 +0300)]

common/sfc_efx/base: support selecting RSS table entry count

On Riverhead boards, the client can control how many entries
to have in the indirection table of an exclusive RSS context.

Provide the new parameter to clients and indicate its bounds.
Extend the API for writing the table to have the flexibility.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:49:58 +0000 (11:49 +0300)]

common/sfc_efx/base: refactor RSS table entry count name

In the existing code, "n" is hardly a clear name for that.
Use a clearer name to help future maintainers of the code.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:49:57 +0000 (11:49 +0300)]

net/sfc: use non-static queue span limit in flow action RSS

On EF10 boards, the limit on how many queues an RSS context
can address is 64. On EF100 boards, this parameter may vary.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:49:56 +0000 (11:49 +0300)]

common/sfc_efx/base: query RSS queue span limit on Riverhead

On Riverhead boards, clients can query the limit on how many
queues an RSS context may address. Put the capability to use.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

commit | commitdiff | tree

Ivan Malov [Tue, 1 Feb 2022 08:49:55 +0000 (11:49 +0300)]

net/sfc: rework flow action RSS support

Currently, the driver always allocates a dedicated NIC RSS context
for every separate flow rule with action RSS, which is not optimal.

First of all, multiple rules which have the same RSS specification
can share a context since filters in the hardware operate this way.

Secondly, entries in a context's indirection table are not precise
queue IDs but offsets with regard to the base queue ID of a filter.
This way, for example, queue arrays "0, 1, 2" and "3, 4, 5" in two
otherwise identical RSS specifications allow the driver to use the
same context since they both yield the same table of queue offsets.

Rework flow action RSS support in order to use these optimisations.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
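
The queue-offset argument can be sketched as follows. The helper names and MAX_QUEUES are hypothetical, and taking the smallest queue ID as the base is an assumption made for the illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_QUEUES 8 /* hypothetical bound for the example */

/*
 * Convert queue IDs into offsets from the smallest ID (the base queue).
 * Returns the base queue ID. n must be between 1 and MAX_QUEUES.
 */
static uint16_t
queues_to_offsets(const uint16_t *queues, size_t n, uint16_t *offsets)
{
	uint16_t base = queues[0];
	size_t i;

	for (i = 1; i < n; i++) {
		if (queues[i] < base)
			base = queues[i];
	}
	for (i = 0; i < n; i++)
		offsets[i] = (uint16_t)(queues[i] - base);
	return base;
}

/* Two specifications can share a context if their offset tables match. */
static int
same_offset_table(const uint16_t *a, const uint16_t *b, size_t n)
{
	uint16_t oa[MAX_QUEUES], ob[MAX_QUEUES];

	queues_to_offsets(a, n, oa);
	queues_to_offsets(b, n, ob);
	return memcmp(oa, ob, n * sizeof(oa[0])) == 0;
}
```

With the arrays from the commit message, "0, 1, 2" and "3, 4, 5" both reduce to offsets 0, 1, 2 and differ only in the base queue ID.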

commit | commitdiff | tree

Kumara Parameshwaran [Mon, 31 Jan 2022 14:32:34 +0000 (20:02 +0530)]

net/tap: fix to populate FDs in secondary process

When a tap device is hotplugged into the primary process, which in turn
adds the device to all secondary processes, each secondary process
calls tap_mp_attach_queues. However, the fds are not populated in
the primary during probe; they are populated during queue_setup.
Fix this by syncing the queues during rte_eth_dev_start.

Fixes: 4852aa8f6e21 ("drivers/net: enable hotplug on secondary process")
Cc: stable@dpdk.org
Signed-off-by: Kumara Parameshwaran <kparameshwar@vmware.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Kumara Parameshwaran [Mon, 31 Jan 2022 14:32:33 +0000 (20:02 +0530)]

ethdev: add internal function to device struct from name

PMDs need a function to look up an rte_eth_dev structure by name
without directly accessing the global rte_eth_devices array.

Cc: stable@dpdk.org
Signed-off-by: Kumara Parameshwaran <kparameshwar@vmware.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Ciara Loftus [Fri, 28 Jan 2022 09:50:29 +0000 (09:50 +0000)]

net/af_xdp: use libxdp if available

AF_XDP support is deprecated in libbpf since v0.7.0 [1]. The libxdp library
now provides the functionality which once was in libbpf and which the
AF_XDP PMD relies on. This commit updates the AF_XDP meson build to use the
libxdp library if a version >= v1.2.2 is available. If it is not available,
only versions of libbpf prior to v0.7.0 are allowed, as they still contain
the required AF_XDP functionality.

libbpf still remains a dependency even if libxdp is present, as we use
libbpf APIs for program loading.

The minimum required kernel version for libxdp for use with AF_XDP is v5.3.
For the library to be fully-featured, a kernel v5.10 or newer is
recommended. The full compatibility information can be found in the libxdp
README.

v1.2.2 of libxdp includes an important fix required for linking with DPDK
which is why this version or greater is required. Meson uses pkg-config to
verify the version of libxdp on the system, so it is necessary that the
library is discoverable using pkg-config in order for the PMD to use it. To
verify this, you can run: pkg-config --modversion libxdp

[1] https://github.com/libbpf/libbpf/commit/277846bc6c15

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>

commit | commitdiff | tree

Min Hu (Connor) [Fri, 28 Jan 2022 02:35:19 +0000 (10:35 +0800)]

app/testpmd: fix bonding mode set

Starting testpmd and entering the commands below leads to a
segmentation fault:

testpmd> create bonded device 4 0
testpmd> add bonding slave 0 2
testpmd> add bonding slave 1 2
testpmd> port start 2
testpmd> set bonding mode 0 2
testpmd> quit
Stopping port 0...
Stopping ports...
...
Bye...
Segmentation fault

The cause of the bug is that the rte timer is not cancelled on quit.
In 'bond_ethdev_start', resources are allocated according to the
bonding mode. In 'bond_ethdev_stop', resources are freed according to
the mode that is current at that time.

For example, 'bond_ethdev_start' starts the bond_mode_8023ad_ext_periodic_cb
timer for bonding mode 4, and 'bond_ethdev_stop' cancels the timer only
when the current bonding mode is 4. If the bonding mode is changed
and the process then quits, the timer is still running and freed memory
is accessed, which causes the segmentation fault.

Changing the bonding mode changes the resources in use, so resources must
be reallocated for the new mode; that is, the device should be restarted.

Fixes: 2950a769315e ("bond: testpmd support")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Min Hu (Connor) [Fri, 28 Jan 2022 02:25:33 +0000 (10:25 +0800)]

net/bonding: fix reference count on mbufs

In bonding Tx broadcast mode, packets should be sent by every slave,
but only one mbuf exists. The solution is to increment the reference count
on the mbufs, but the current code ignores multi-segment mbufs.

This patch fixes it by adding a reference to every segment in the
multi-segment Tx scenario.

Fixes: 2efb58cbab6e ("bond: new link bonding library")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
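
A simplified sketch of the per-segment reference update. struct seg and pkt_refcnt_add() are hypothetical; real mbufs use their own reference counting helpers, which are not shown here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified segment with a reference count. */
struct seg {
	uint16_t refcnt;
	struct seg *next;
};

/*
 * Add extra references to every segment of a chained packet.
 * Updating only the head would let the trailing segments be freed after
 * the first transmit while other ports still need them.
 */
static void
pkt_refcnt_add(struct seg *head, uint16_t extra)
{
	struct seg *s;

	for (s = head; s != NULL; s = s->next)
		s->refcnt = (uint16_t)(s->refcnt + extra);
}
```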

commit | commitdiff | tree

Min Hu (Connor) [Fri, 28 Jan 2022 02:25:32 +0000 (10:25 +0800)]

net/bonding: fix promiscuous and allmulticast state

Currently, the promiscuous or allmulticast state of the bonding port is not
passed to the new primary slave on an active/standby switch-over. This
causes bugs in some scenarios.

For example, suppose promiscuous mode is off on the bonding port and on the
primary slave (called A), but on for the secondary slave (called B).
After an active/standby switch-over, the promiscuous state of the bonding port
is still off, but the new primary slave is B, whose promiscuous
state is still on.
This is inconsistent with the bonding port, and this patch fixes it.

Fixes: 2efb58cbab6e ("bond: new link bonding library")
Fixes: 68218b87c184 ("net/bonding: prefer allmulti to promiscuous for LACP")
Cc: stable@dpdk.org
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

Yunjian Wang [Fri, 24 Dec 2021 11:26:38 +0000 (19:26 +0800)]

net/ixgbe: check filter init failure

The functions ixgbe_fdir_filter_init() and ixgbe_l2_tn_filter_init()
can return errors, so their return values need to be checked and returned.

Fixes: 080e3c0ee989 ("net/ixgbe: store flow director filter")
Fixes: d0c0c416ef1f ("net/ixgbe: store L2 tunnel filter")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>

commit | commitdiff | tree

Chengwen Feng [Fri, 28 Jan 2022 02:07:08 +0000 (10:07 +0800)]

net/hns3: delete duplicated RSS type

hns3_set_rss_types holds two IPV4_TCP items; this patch deletes the
duplicate item.

Fixes: 806f1d5ab0e3 ("net/hns3: set RSS hash type input configuration")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

Huisong Li [Fri, 28 Jan 2022 02:07:07 +0000 (10:07 +0800)]

net/hns3: fix operating queue when TCAM table is invalid

Resetting queues queries the TCAM table. The table is cleared after a global
or imp reset. Currently, the PF driver first resets the Rx/Tx queues and then
restores the table during the reset recovery process, so the query fails
and triggers a RAS error.

Fixes: fa29fe45a7b4 ("net/hns3: support queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

Huisong Li [Fri, 28 Jan 2022 02:07:06 +0000 (10:07 +0800)]

net/hns3: fix double decrement of secondary count

The "secondary_cnt" indicates the number of secondary processes on an
Ethernet device. But the variable is decremented twice when the device
is detached in a secondary process.

Fixes: ff6dc76e40b8 ("net/hns3: refactor multi-process initialization")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

Huisong Li [Fri, 28 Jan 2022 02:07:05 +0000 (10:07 +0800)]

net/hns3: fix insecure way to query MAC statistics

The HNS3 PF driver queries MAC statistics as follows:
1) get the number of MAC statistics registers and calculate the number
of descriptors.
2) use that descriptor number to send a command to firmware to query all
MAC statistics and copy them to the hns3_mac_stats struct in the driver.

This procedure does not validate the number of registers obtained,
which may cause out-of-bounds memory access.

Fixes: 8839c5e202f3 ("net/hns3: support device stats")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
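
The missing validation can be sketched as below. MAC_STATS_MAX_REGS, struct mac_stats and mac_stats_copy() are hypothetical; the real driver structure and its firmware command interface are not reproduced.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_STATS_MAX_REGS 4 /* hypothetical size of the driver's struct */

struct mac_stats {
	uint64_t regs[MAC_STATS_MAX_REGS];
};

/*
 * Copy reg_num counters reported by firmware into stats.
 * reg_num comes from the device, so it is validated against the size of
 * the destination before copying. Returns 0 on success, -1 if rejected.
 */
static int
mac_stats_copy(struct mac_stats *stats, const uint64_t *src, uint32_t reg_num)
{
	if (reg_num == 0 || reg_num > MAC_STATS_MAX_REGS)
		return -1;
	memset(stats, 0, sizeof(*stats));
	memcpy(stats->regs, src, reg_num * sizeof(src[0]));
	return 0;
}
```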

commit | commitdiff | tree

Lijun Ou [Fri, 28 Jan 2022 02:07:04 +0000 (10:07 +0800)]

net/hns3: fix RSS key with null

Since the patch '1848b117' initializes the variable 'key' in
'struct rte_flow_action_rss' to 'NULL', the PMD uses the
default RSS key when the first RSS rule is created with a NULL RSS key.
Then, if an identical RSS rule is created afterwards, the PMD does not
identify the duplicate rule and return an error message.

To solve this problem, determine whether the current RSS keys
are the same based on whether the key_len of the rss is 0.

Fixes: 1848b117cca1 ("app/testpmd: fix RSS key for flow API RSS rule")
Cc: stable@dpdk.org
Signed-off-by: Lijun Ou <oulijun@huawei.com>

commit | commitdiff | tree

Huisong Li [Fri, 28 Jan 2022 02:07:03 +0000 (10:07 +0800)]

net/hns3: fix max packet size rollback in PF

The HNS3 PF driver uses hns->pf.mps to restore the MTU when a reset
occurs.
If the user fails to configure the MTU, the MPS of the PF may not be
restored to the original value.

Fixes: 25fb790f7868 ("net/hns3: fix HW buffer size on MTU update")
Fixes: 1f5ca0b460cd ("net/hns3: support some device operations")
Fixes: d51867db65c1 ("net/hns3: add initialization")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

John Daley [Fri, 28 Jan 2022 17:58:13 +0000 (09:58 -0800)]

net/enic: support max descriptors allowed by adapter

Newer VIC adapters have the max number of supported RX and TX
descriptors in their configuration. Use these values as the
maximums.

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>

commit | commitdiff | tree

John Daley [Fri, 28 Jan 2022 17:58:12 +0000 (09:58 -0800)]

net/enic: update VIC firmware interface

Update the configuration structure used between the adapter and
driver. The structure is compatible with all Cisco VIC adapters.

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>

commit | commitdiff | tree

John Daley [Fri, 28 Jan 2022 17:58:11 +0000 (09:58 -0800)]

net/enic: support eCPRI matching

eCPRI message can be over Ethernet layer (.1Q supported also) or over
UDP layer. Message header formats are the same in these two variants.

Only up through the first packet header in the PDU can be matched.
RSS on the eCPRI payload is not supported.

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Hyong Youb Kim <hyonkim@cisco.com>

commit | commitdiff | tree

Ferruh Yigit [Wed, 26 Jan 2022 13:10:37 +0000 (13:10 +0000)]

net/bonding: fix MTU set for slaves

ethdev requires a device to be configured before its MTU is set.

In the bonding PMD, while configuring slaves, bonding sets the MTU first and
configures them afterwards, which fails if the slaves have not been
configured in advance.

Fix this by changing the order in slave configuration as required by the
ethdev layer: configure first and set the MTU afterwards.

Bugzilla ID: 864
Fixes: b26bee10ee37 ("ethdev: forbid MTU set before device configure")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>
Acked-by: Min Hu (Connor) <humin29@huawei.com>

commit | commitdiff | tree

Weiguo Li [Tue, 25 Jan 2022 14:23:48 +0000 (22:23 +0800)]

net/dpaa2: fix null pointer dereference

Check for memory allocation failure is added to avoid null
pointer dereference.

Fixes: 4690a6114ff6 ("net/dpaa2: enable error queues optionally")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Weiguo Li [Tue, 25 Jan 2022 12:00:49 +0000 (20:00 +0800)]

net/enic: fix dereference before null check

Move memcpy to 'ah->key' after 'ah' null check
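
The ordering described above can be sketched in a few lines. The type `struct demo_action_handle`, the constant `DEMO_KEY_LEN` and the function `demo_set_key` are hypothetical stand-ins, not the enic driver's code; only the pattern (check the pointer first, then copy into `ah->key`) comes from the commit message.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_KEY_LEN 16 /* hypothetical key size */

/* Hypothetical stand-in for the driver's action handle. */
struct demo_action_handle {
	unsigned char key[DEMO_KEY_LEN];
};

/*
 * Copy the key only after confirming that 'ah' is not NULL.
 * Returns 0 on success, -1 if either pointer is NULL.
 */
static int
demo_set_key(struct demo_action_handle *ah, const unsigned char *key)
{
	if (ah == NULL || key == NULL)
		return -1;
	memcpy(ah->key, key, DEMO_KEY_LEN); /* safe: 'ah' was checked above */
	return 0;
}
```

Placing the memcpy before the check would dereference `ah` while it could still be NULL, which is the defect the commit title names.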

Fixes: bb66d562aefc ("net/enic: share flow actions with same signature")
Cc: stable@dpdk.org
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Reviewed-by: John Daley <johndale@cisco.com>

commit | commitdiff | tree

Stephen Hemminger [Wed, 9 Feb 2022 06:54:03 +0000 (22:54 -0800)]

eal: move Unix filesystem functions into one file

Both Linux and FreeBSD have the same code for creating the runtime
directory and reading sysfs files. Put them in the new lib/eal/unix
subdirectory.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>

commit | commitdiff | tree

Stephen Hemminger [Wed, 9 Feb 2022 06:54:02 +0000 (22:54 -0800)]

support systemd service convention for runtime directory

Systemd.exec supports configuring the runtime directory of a service
via RuntimeDirectory=. This creates the directory with the necessary
permissions, which the actual service may not have if it is running in
a container.

The change to DPDK is to look for the environment variable RUNTIME_DIRECTORY
first and use that in preference to the fallback alternatives.
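
A minimal sketch of that lookup order, assuming an empty value is treated the same as an unset one (that policy is an assumption, as are the function names `demo_choose_dir` and `demo_runtime_dir`). The fallback alternatives are left to the caller because the commit message does not list them.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Prefer the value taken from the environment when it is set and
 * non-empty; otherwise use the caller's fallback (hypothetical policy).
 */
static const char *
demo_choose_dir(const char *env_value, const char *fallback)
{
	if (env_value != NULL && env_value[0] != '\0')
		return env_value;
	return fallback;
}

/* Look up RUNTIME_DIRECTORY first, then fall back. */
static const char *
demo_runtime_dir(const char *fallback)
{
	return demo_choose_dir(getenv("RUNTIME_DIRECTORY"), fallback);
}
```

Splitting the decision from the getenv() call keeps the preference rule easy to exercise without touching the process environment.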

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

commit | commitdiff | tree

Stephen Hemminger [Wed, 9 Feb 2022 06:54:01 +0000 (22:54 -0800)]

eal: remove size for setting runtime directory

The size argument to eal_set_runtime_dir is useless and was
being used incorrectly in strlcpy. It worked only because
all callers passed PATH_MAX, which is the same as the size of the
destination runtime_dir.
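
The underlying rule is that a bounded copy should be limited by the size of the destination, not by a size the caller supplies. A sketch under stated assumptions: `demo_runtime_dir` and `demo_set_runtime_dir` are hypothetical names, and snprintf() stands in for strlcpy() so the listing builds on C libraries that lack strlcpy().

```c
#include <limits.h>
#include <stdio.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback when <limits.h> does not define it */
#endif

static char demo_runtime_dir[PATH_MAX]; /* hypothetical destination buffer */

/*
 * Bound the copy by the destination's own size.
 * Returns 0 on success, -1 if the path would be truncated.
 */
static int
demo_set_runtime_dir(const char *run_dir)
{
	int n = snprintf(demo_runtime_dir, sizeof(demo_runtime_dir),
			"%s", run_dir);

	if (n < 0 || (size_t)n >= sizeof(demo_runtime_dir))
		return -1;
	return 0;
}
```

With no size parameter, a caller cannot pass a bound that disagrees with the real buffer.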

Note: this is an internal API so no user exposed change.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

commit | commitdiff | tree

Srikanth Yalavarthi [Tue, 18 Jan 2022 13:33:40 +0000 (05:33 -0800)]

eal: add internal function to get base address

Added an internal helper to get the OS-specific EAL mapping base address.

This helper can be used by the drivers to program offload / accelerator
devices, where the base address can be used as a reference address by
the accelerator to access the host memory.

An address can also be represented as an offset relative to the base
address using smaller data types.
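
The offset representation can be illustrated with plain integer arithmetic. This is a sketch only: `demo_addr_to_offset` and `demo_offset_to_addr` are hypothetical names, the 32-bit offset width is an assumption, and the base value would come from the new internal helper, which is not called here.

```c
#include <stdint.h>

/*
 * Encode 'addr' as a 32-bit offset from 'base'.
 * Returns 0 and stores the offset on success, -1 if 'addr' lies below
 * 'base' or the distance does not fit in 32 bits.
 */
static int
demo_addr_to_offset(uint64_t base, uint64_t addr, uint32_t *offset)
{
	if (addr < base || addr - base > UINT32_MAX)
		return -1;
	*offset = (uint32_t)(addr - base);
	return 0;
}

/* Recover the full address, as a device that knows 'base' would. */
static uint64_t
demo_offset_to_addr(uint64_t base, uint32_t offset)
{
	return base + offset;
}
```

The range check matters: an address more than 4 GiB above the base cannot be expressed in this narrower type and has to be rejected rather than truncated.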

Signed-off-by: Srikanth Yalavarthi <syalavarthi@marvell.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:36 +0000 (20:13 +0200)]

eal: extend --huge-unlink for hugepage file reuse

Expose Linux EAL ability to reuse existing hugepage files
via --huge-unlink=never switch.
Default behavior is unchanged, it can also be specified
using --huge-unlink=existing for consistency.
Old --huge-unlink switch is kept,
it is an alias for --huge-unlink=always.
Add a test case for the --huge-unlink=never mode.
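
The mapping from option values to modes, as listed above, could be parsed along these lines. The enum and function names are hypothetical and are not taken from the EAL sources; only the accepted strings and the alias rule come from the commit message.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mode names; the real EAL identifiers may differ. */
enum demo_huge_unlink {
	DEMO_UNLINK_EXISTING, /* default behavior, "--huge-unlink=existing" */
	DEMO_UNLINK_ALWAYS,   /* old "--huge-unlink" switch */
	DEMO_UNLINK_NEVER,    /* reuse existing hugepage files */
	DEMO_UNLINK_INVALID
};

/* 'arg' is the text after '=', or NULL for the bare switch. */
static enum demo_huge_unlink
demo_parse_huge_unlink(const char *arg)
{
	if (arg == NULL)
		return DEMO_UNLINK_ALWAYS; /* alias for "always" */
	if (strcmp(arg, "existing") == 0)
		return DEMO_UNLINK_EXISTING;
	if (strcmp(arg, "always") == 0)
		return DEMO_UNLINK_ALWAYS;
	if (strcmp(arg, "never") == 0)
		return DEMO_UNLINK_NEVER;
	return DEMO_UNLINK_INVALID;
}
```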

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:35 +0000 (20:13 +0200)]

eal/linux: allow hugepage file reuse

Linux EAL ensured that mapped hugepages are clean
by always mapping from newly created files:
existing hugepage backing files were always removed.
In this case, the kernel clears the page to prevent data leaks,
because the mapped memory may contain leftover data
from the previous process that was using this memory.
Clearing takes the bulk of the time spent in mmap(2),
increasing EAL initialization time.

Introduce a mode to keep existing files and reuse them
in order to speed up initial memory allocation in EAL.
Hugepages mapped from such files may contain data
left by the previous process that used this memory,
so RTE_MEMSEG_FLAG_DIRTY is set for their segments.
If multiple hugepages are mapped from the same file:
1. When fallocate(2) is used, all memory mapped from this file
   is considered dirty, because it is unknown
   which parts of the file are holes.
2. When ftruncate(3) is used, memory mapped from this file
   is considered dirty unless the file is extended
   to create a new mapping, which implies clean memory.
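
The two rules for files that back several hugepages reduce to a small decision function. The names below are hypothetical, and the sketch covers only the reused-file case described in this message.

```c
#include <stdbool.h>

/* How the backing file is resized (hypothetical names). */
enum demo_resize_method {
	DEMO_FALLOCATE,
	DEMO_FTRUNCATE
};

/*
 * Decide whether memory mapped from a reused multi-page file must be
 * treated as dirty.
 */
static bool
demo_mapping_is_dirty(enum demo_resize_method method, bool file_was_extended)
{
	if (method == DEMO_FALLOCATE)
		return true; /* rule 1: holes in the file are unknown */
	/* rule 2: extending the file implies clean memory */
	return !file_was_extended;
}
```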

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:34 +0000 (20:13 +0200)]

eal: refactor --huge-unlink storage

In preparation to extend --huge-unlink option semantics
refactor how it is stored in the internal configuration.
It makes future changes more isolated.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:33 +0000 (20:13 +0200)]

mem: add dirty malloc element support

EAL malloc layer assumed all free elements content
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:
1. EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before returning into the heap.

Clearing the memory can be as slow as around 14 GiB/s.
To save doing so, memalloc layer is allowed to return dirty memory.
Such segments are marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using the new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:

a) If the joint element contains dirty parts, it is dirty:

    dirty + freed + dirty = dirty  =>  no need to clean
            freed + dirty = dirty      the freed memory

   Dirty parts may be large (e.g. initial allocation),
   so clearing them could create unpredictable slowdown.

b) If the only dirty part of the joint element
   is the freed memory, the joint element can be made clean:

    clean + freed + clean = clean  =>  freed memory
    clean + freed         = clean      must be cleared
            freed + clean = clean
            freed         = clean

   This logic naturally reproduces the old behavior
   and always applies in modes when EAL memalloc layer
   returns only clean segments.

As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.
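
Cases a) and b) can be condensed into one small function. This is an illustrative model, not the allocator's code: `struct demo_join_result` and `demo_join_on_free` are hypothetical, and each argument is true only when a free neighbour exists on that side and is dirty.

```c
#include <stdbool.h>

/* Outcome of freeing one element and joining it with free neighbours. */
struct demo_join_result {
	bool joint_dirty; /* dirty flag of the joint element */
	bool clear_freed; /* whether the freed memory must be zeroed now */
};

static struct demo_join_result
demo_join_on_free(bool prev_free_dirty, bool next_free_dirty)
{
	struct demo_join_result r;

	/* Case a: any dirty free neighbour makes the joint element dirty. */
	r.joint_dirty = prev_free_dirty || next_free_dirty;
	/* Case b: otherwise the freed memory is the only dirty part. */
	r.clear_freed = !r.joint_dirty;
	return r;
}
```

When both arguments are false (neighbours absent or clean), the function reproduces the old behavior of clearing on free.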

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:32 +0000 (20:13 +0200)]

app/test: add allocator performance benchmark

Memory allocator performance is crucial to applications that deal
with large amounts of memory or allocate frequently. DPDK allocator
performance is affected by EAL options, API used and, at least,
allocation size. New autotest is intended to be run with different
EAL options. It measures performance with a range of sizes
for different APIs: rte_malloc, rte_zmalloc, and rte_memzone_reserve.

Work distribution between allocation and deallocation depends on EAL
options. The test prints both times and total time to ease comparison.

Memory can be filled with zeroes at different points of allocation path,
but it always takes a considerable fraction of overall timing. This is why
the test measures filling speed and prints how long clearing takes
for each size as a reference (for rte_memzone_reserve estimations
are printed).

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Reviewed-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Dmitry Kozlyuk [Thu, 3 Feb 2022 18:13:31 +0000 (20:13 +0200)]

doc: add hugepage mapping details

Hugepage mapping is the layer that EAL malloc builds upon.
There were implicit references to its details,
like mentions of segment file descriptors,
but no explicit description of its modes and operation.
Add an overview of mechanics used on each supported OS.
Convert memory management subsections from list items
to level 4 headers: they are big and important enough.

Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:37:01 +0000 (20:37 +0800)]

eventdev: remove useless C++ include guard

This private header contains an incomplete cplusplus guard,
just remove it.

Fixes: d35e61322de52 ("eventdev: move inline APIs into separate structure")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:37:00 +0000 (20:37 +0800)]

eal/windows: remove useless C++ include guard

Remove the incomplete cplusplus guard in internal header.

Fixes: 6e1ed4cbbe99 ("eal/windows: add dirent implementation")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Pallavi Kadam <pallavi.kadam@intel.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:36:59 +0000 (20:36 +0800)]

net/dpaa2: remove useless C++ include guard

Remove the incomplete cplusplus guard in internal headers.

Fixes: 72ec7a678e70 ("net/dpaa2: add soft parser driver")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:36:58 +0000 (20:36 +0800)]

net/cxgbe: remove useless C++ include guard

Remove the incomplete cplusplus guard in internal header.

Fixes: 3bd122eef2cc ("cxgbe/base: add hardware API for Chelsio T5 series adapters")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:36:57 +0000 (20:36 +0800)]

common/mlx5: remove useless C++ include guard

Remove the incomplete cplusplus guard in internal headers.

Fixes: 7525ebd8ebb0 ("common/mlx5: add glue functions on Windows")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Weiguo Li [Mon, 7 Feb 2022 12:36:56 +0000 (20:36 +0800)]

bus/dpaa: fix C++ include guard

Supplement the missing half of braces for the extern "C" block,
or remove the incomplete guard in internal header.
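
For reference, a complete guard has both halves, as in this sketch. The header name and the function `demo_add` are hypothetical; the header and its definition are shown in one listing for brevity.

```c
/* demo_math.h: hypothetical internal header with a complete guard */
#ifndef DEMO_MATH_H
#define DEMO_MATH_H

#ifdef __cplusplus
extern "C" {
#endif

int demo_add(int a, int b);

#ifdef __cplusplus
} /* closes extern "C"; leaving this half out makes the guard incomplete */
#endif

#endif /* DEMO_MATH_H */

/* demo_math.c: the definition */
int
demo_add(int a, int b)
{
	return a + b;
}
```

A C compiler never sees either half, so an unbalanced guard goes unnoticed until the header is included from C++.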

Fixes: 6d6b4f49a155 ("bus/dpaa: add FMAN hardware operations")
Fixes: 919eeaccb2ba ("bus/dpaa: introduce NXP DPAA bus driver skeleton")
Signed-off-by: Weiguo Li <liwg06@foxmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:44 +0000 (21:10 -0800)]

test: enable subset of tests on Windows

Enable a subset of unit tests for Windows CI

- For driver tests, driver owners should enable corresponding tests when
  enabling driver for Windows.
- For dump tests, currently the tests hang on Windows, which requires
  further investigation.
- For telemetry tests, they have POSIX socket specific code which requires
  replacement for Windows. Will investigate and work on a separate patch.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:43 +0000 (21:10 -0800)]

test: replace shell script with Python

- Add python script to check if system supports hugepages
- Remove corresponding .sh script
- Replace calling of .sh with corresponding .py in meson.build

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:42 +0000 (21:10 -0800)]

test: skip unsupported tests on Windows

Skip tests which are not yet supported for Windows:
- The libraries that tests depend on are not enabled on Windows yet
- The tests can compile but with issue still under investigation
    * test_func_reentrancy:
      Windows EAL has no protection against repeated calls.
    * test_lcores:
      Execution enters an infinite loop, requires investigation.
    * test_rcu_qsbr_perf:
      Execution hangs on Windows, requires investigation.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:41 +0000 (21:10 -0800)]

test: resolve name collision on Windows

Add prefix to resolve name collision on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:40 +0000 (21:10 -0800)]

test/alarm: disable bad time cases on Windows

Remove two alarm_autotest test cases which do bogus range check
on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:39 +0000 (21:10 -0800)]

eal: differentiate strerror message on Windows

On Windows, strerror returns just "Unknown error" for errnum greater
than MAX_ERRNO, while Linux and FreeBSD return "Unknown error <num>",
which is the current expectation for errno_autotest. Differentiate
the error string on Windows to remove a "duplicate error code" failure.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:38 +0000 (21:10 -0800)]

test/log: skip regex on Windows

DPDK logs_autotest on Windows failed at the "dynamic log types" tests.
The failures are in 2 test cases for the rte_log_set_level_regexp API,
because regular expressions are not yet supported on Windows in DPDK
and regcomp/regexec are just stubs on Windows (in regex.h).

In app/test/test_logs.c, ifndef these two test cases, and for the
rte_log_set_level_pattern validation case following these two cases,
differentiate the expected log level passed into macro CHECK_LEVELS.

Now logs_autotest completes for all dynamic log types and static log types.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:37 +0000 (21:10 -0800)]

test/interrupts: skip on Windows

Even though test_interrupts.c can compile on Windows, skip interrupt
tests for now since the majority of eal_interrupt on Windows are stubs.
Will remove the skip after interrupts are fully enabled on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:36 +0000 (21:10 -0800)]

test/mem: fix error check

Fix incorrect errno variable used in memory autotest.
Use rte_errno instead.

Fixes: 086d426406bd ("test/mem: fix memory autotests on FreeBSD")
Cc: stable@dpdk.org
Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:35 +0000 (21:10 -0800)]

test: remove POSIX-specific code

- Replace POSIX-specific code with DPDK equivalents or
  conditionally disable it on Windows
- Use NUL on Windows as /dev/null for Unix
- Exclude tests not supported on Windows yet
  * multi-process
  * PMD performance statistics display on signal
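
The null-device substitution in the list above can be sketched with a compile-time switch. The macro `DEMO_NULL_DEVICE` and the function `demo_open_null_sink` are hypothetical names, not the ones used in the test suite.

```c
#include <stdio.h>

#ifdef _WIN32
#define DEMO_NULL_DEVICE "NUL"
#else
#define DEMO_NULL_DEVICE "/dev/null"
#endif

/* Open the platform's null device for writing; returns NULL on failure. */
static FILE *
demo_open_null_sink(void)
{
	return fopen(DEMO_NULL_DEVICE, "w");
}
```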

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com>

commit | commitdiff | tree

Jie Zhou [Wed, 26 Jan 2022 05:10:34 +0000 (21:10 -0800)]

eal/windows: fix error code for not supported API

UT memory_autotest on Windows has 2 failed cases on EAL APIs
eal_memalloc_get_seg_fd and eal_memalloc_get_seg_fd_offset. These 2
APIs are not supported on Windows yet. They should return ENOTSUP so that
in test_memory.c these 2 ENOTSUP cases will not be marked as failures,
same as other ENOTSUP cases.
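
A sketch of how a test can tell "not supported" apart from a real failure. Everything here is hypothetical, including the convention of returning a negative errno value; it only illustrates why the specific error code matters to the caller.

```c
#include <errno.h>

enum demo_result {
	DEMO_PASS,
	DEMO_SKIP,
	DEMO_FAIL
};

/* Hypothetical stub for an API the platform does not implement yet. */
static int
demo_get_seg_fd_stub(void)
{
	return -ENOTSUP;
}

/* Treat -ENOTSUP as "skip" and any other negative value as a failure. */
static enum demo_result
demo_classify(int ret)
{
	if (ret == -ENOTSUP)
		return DEMO_SKIP;
	return ret < 0 ? DEMO_FAIL : DEMO_PASS;
}
```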

Fixes: 2a5d547a4a9b ("eal/windows: implement basic memory management")
Cc: stable@dpdk.org
Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

commit | commitdiff | tree

Zhihong Wang [Tue, 14 Dec 2021 03:30:16 +0000 (11:30 +0800)]

ring: fix overflow in memory size calculation

Parameters count and esize are both unsigned int, and their product can
legally exceed the range of unsigned int and lead to a runtime access
violation.
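
The wrap-around is easy to reproduce with fixed-width types. The function names below are hypothetical, and 32-bit integers stand in for unsigned int; the second function shows one common remedy (widening before the multiplication), which may differ from the actual fix.

```c
#include <stdint.h>

/* The 32-bit product wraps modulo 2^32 when it is too large. */
static uint32_t
demo_bytes_narrow(uint32_t count, uint32_t esize)
{
	return count * esize;
}

/* Widen one operand first so the multiplication is done in 64 bits. */
static uint64_t
demo_bytes_wide(uint32_t count, uint32_t esize)
{
	return (uint64_t)count * esize;
}
```

For example, 0x40000000 elements of 8 bytes need 2^33 bytes, which the narrow version reports as 0.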

Fixes: cc4b218790f6 ("ring: support configurable element size")
Cc: stable@dpdk.org
Signed-off-by: Zhihong Wang <wangzhihong.wzh@bytedance.com>
Reviewed-by: Liang Ma <liangma@liangbit.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

commit | commitdiff | tree

Robert Sanford [Wed, 22 Dec 2021 16:20:18 +0000 (11:20 -0500)]

ring: update ring size doxygen comments

- Add RING_F_EXACT_SZ description to rte_ring_init and
rte_ring_create param comments.
- Fix ring size comments.

Signed-off-by: Robert Sanford <rsanford@akamai.com>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

commit | commitdiff | tree

Yunjian Wang [Mon, 10 Jan 2022 09:23:03 +0000 (17:23 +0800)]

ring: fix error code when creating ring

The error value returned by rte_ring_create_elem() should be a positive
integer. However, if the rte_ring_get_memsize_elem() function fails,
a negative number is returned and is directly used as the return value.
As a result, external callers that check the return value
(for example rte_mempool_create()) will fail that check.
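
The sign convention can be shown with a self-contained model. None of these names are DPDK's: `demo_errno` stands in for the error variable, `demo_get_memsize` for the size helper that returns a negative value on failure, and the power-of-two rule is only an example of a failing input.

```c
#include <errno.h>
#include <stddef.h>

static int demo_errno; /* stand-in for the library's error variable */

/* Returns a size in bytes, or a negative errno-style value on bad input. */
static long
demo_get_memsize(unsigned int count)
{
	if (count == 0 || (count & (count - 1)) != 0) /* not a power of two */
		return -EINVAL;
	return (long)count * 8;
}

/* Returns 'storage' on success, NULL with demo_errno set on failure. */
static void *
demo_create(unsigned int count, void *storage)
{
	long sz = demo_get_memsize(count);

	if (sz < 0) {
		demo_errno = (int)-sz; /* negate: store a positive error value */
		return NULL;
	}
	return storage;
}
```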

Fixes: a182620042aa ("ring: get size in memory")
Cc: stable@dpdk.org
Reported-by: Nan Zhou <zhounan14@huawei.com>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

commit | commitdiff | tree

Andrzej Ostruszka [Tue, 11 Jan 2022 11:37:39 +0000 (12:37 +0100)]

ring: optimize corner case for enqueue/dequeue

When enqueueing/dequeueing to/from the ring we try to optimize by manual
loop unrolling.  The check for this optimization looks like:

if (likely(idx + n < size)) {

where 'idx' points to the first usable element (empty slot for enqueue,
data for dequeue).  The correct comparison here should be '<=' instead
of '<'.

This is not a functional error since we fall back to the loop with
correct checks on indexes.  Just a minor suboptimal behaviour for the
case when we want to enqueue/dequeue exactly the number of elements that
we have in the ring before wrapping to its beginning.
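
The boundary case can be checked directly. The two helpers below are hypothetical and isolate only the comparison; in the ring code the result selects between the unrolled path and the fallback loop.

```c
#include <stdbool.h>
#include <stdint.h>

/* Old check: rejects the case where the last element touched is size - 1. */
static bool
demo_no_wrap_old(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n < size;
}

/* Corrected check: n elements starting at idx end exactly at 'size'. */
static bool
demo_no_wrap_new(uint32_t idx, uint32_t n, uint32_t size)
{
	return idx + n <= size;
}
```

With size 8, idx 5 and n 3, the elements occupy slots 5, 6 and 7 without wrapping, yet only the corrected check takes the fast path.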

Fixes: cc4b218790f6 ("ring: support configurable element size")
Fixes: 286bd05bf70d ("ring: optimisations")
Signed-off-by: Andrzej Ostruszka <amo@semihalf.com>
Reviewed-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>

commit | commitdiff | tree

Pallavi Kadam [Fri, 21 Jan 2022 00:17:49 +0000 (16:17 -0800)]

eal/windows: set worker thread affinity at init

Sometimes the OS tries to switch the core. So, bind the lcore thread
to a fixed core.
Implement affinity call on Windows similar to Linux.

Signed-off-by: Qiao Liu <qiao.liu@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Tested-by: Idan Hackmon <idanhac@nvidia.com>

commit | commitdiff | tree

Morten Brørup [Mon, 24 Jan 2022 14:59:53 +0000 (15:59 +0100)]

mempool: test performance with constant n

"What gets measured gets done."

This patch adds mempool performance tests where the number of objects to
put and get is constant at compile time, which may significantly improve
the performance of these functions. [*]

Also, it is ensured that the array holding the object used for testing
is cache line aligned, for maximum performance.

And finally, the following entries are added to the list of tests:
- Number of kept objects: 512
- Number of objects to get and to put: The number of pointers fitting
  into a cache line, i.e. 8 or 16

[*] Some example performance test (with cache) results:

get_bulk=4 put_bulk=4 keep=128 constant_n=false rate_persec=280480972
get_bulk=4 put_bulk=4 keep=128 constant_n=true  rate_persec=622159462

get_bulk=8 put_bulk=8 keep=128 constant_n=false rate_persec=477967155
get_bulk=8 put_bulk=8 keep=128 constant_n=true  rate_persec=917582643

get_bulk=32 put_bulk=32 keep=32 constant_n=false rate_persec=871248691
get_bulk=32 put_bulk=32 keep=32 constant_n=true rate_persec=1134021836

Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>

commit | commitdiff | tree

Haiyue Wang [Wed, 19 Jan 2022 12:26:14 +0000 (20:26 +0800)]

doc: fix KNI PMD name typo

The KNI PMD name should be "net_kni".

Fixes: 75e2bc54c018 ("net/kni: add KNI PMD")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Markus Theil [Fri, 3 Dec 2021 07:19:07 +0000 (08:19 +0100)]

kni: fix ioctl signature

Fix kni's ioctl signature to correctly match the kernel's
structs. This shaves off the (void*) casts and uses struct file*
instead of struct inode*. With the correct signature, control flow
integrity checkers are no longer confused at this point.

Signed-off-by: Markus Theil <markus.theil@secunet.com>
Tested-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>

commit | commitdiff | tree

Tudor Cornea [Thu, 20 Jan 2022 12:41:34 +0000 (14:41 +0200)]

kni: allow configuring thread granularity

The KNI kthreads seem to be re-scheduled at a granularity of roughly
1 millisecond right now, which seems to be insufficient for performing
tests involving a lot of control plane traffic.

Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
seems that the existing code cannot reschedule at the desired granularity,
due to precision constraints of schedule_timeout_interruptible().

In our use case, we leverage the Linux Kernel for control plane, and
it is not uncommon to have 60K - 100K pps for some signaling protocols.

Since we are not in atomic context, the usleep_range() function seems to be
more appropriate for being able to introduce smaller controlled delays,
in the range of 5-10 microseconds. Upon reading the existing code, it would
seem that this was the original intent. Adding sub-millisecond delays
seems unfeasible with a call to schedule_timeout_interruptible().

KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
schedule_timeout_interruptible(
        usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
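
A simplified user-space model of why a 5 microsecond request ends up as roughly one millisecond: a jiffies-based timeout is rounded up to whole timer ticks. The functions and the tick rates used below are assumptions for illustration, not kernel code, and the model requires 'hz' to divide 1,000,000 evenly.

```c
#include <stdint.h>

#define DEMO_USEC_PER_SEC 1000000u

/* Round a microsecond delay up to whole timer ticks. */
static uint32_t
demo_usecs_to_ticks(uint32_t usecs, uint32_t hz)
{
	uint32_t usec_per_tick = DEMO_USEC_PER_SEC / hz;

	return (usecs + usec_per_tick - 1) / usec_per_tick;
}

/* Shortest delay, in microseconds, that the rounded value can express. */
static uint32_t
demo_effective_delay_us(uint32_t usecs, uint32_t hz)
{
	return demo_usecs_to_ticks(usecs, hz) * (DEMO_USEC_PER_SEC / hz);
}
```

At an assumed tick rate of 1000 Hz, a request for 5 microseconds becomes one tick, that is 1000 microseconds, which is in line with the round-trip times of about 1 ms measured with schedule_timeout_interruptible().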

Below, we attempted a brief comparison between the existing implementation,
which uses schedule_timeout_interruptible() and usleep_range().

We attempt to measure the CPU usage and RTT between two KNI interfaces,
which are created on top of vmxnet3 adapters, connected by a vSwitch.

insmod rte_kni.ko kthread_mode=single carrier=on

schedule_timeout_interruptible(usecs_to_jiffies(5))
kni_single CPU Usage: 2-4 %
[root@localhost ~]# ping 1.1.1.2 -I eth1
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms

usleep_range(5, 10)
kni_single CPU usage: 50%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms

usleep_range(20, 50)
kni_single CPU usage: 24%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms

usleep_range(50, 100)
kni_single CPU usage: 13%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms

usleep_range(100, 200)
kni_single CPU usage: 7%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms

usleep_range(1000, 1100)
kni_single CPU usage: 2%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms

Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
latency and cpu usage to the variant with schedule_timeout_interruptible(),
while usleep_range(100, 200) seems to give a decent tradeoff between
latency and cpu usage, while allowing users to tweak the limits for
improved precision if they have such use cases.

Interestingly, disabling RTE_KNI_PREEMPT_DEFAULT seems to lead to a
softlockup on my kernel.

Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 1226 Comm: kni_single Tainted: G        W  O 3.10 #1
<IRQ>  [<ffffffff814f84de>] dump_stack+0x19/0x1b
[<ffffffff814f7891>] panic+0xcd/0x1e0
[<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
[<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
[<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
[<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
[<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80

This patch also attempts to remove this option.

References:
[1] https://www.kernel.org/doc/Documentation/timers/timers-howto.txt

Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Acked-by: Padraig Connolly <Padraig.J.Connolly@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

commit | commitdiff | tree

Bruce Richardson [Mon, 24 Jan 2022 17:49:59 +0000 (17:49 +0000)]

build: remove deprecated Meson functions

Starting in meson 0.56, the functions meson.source_root() and
meson.build_root() are deprecated and to be replaced by the [more
descriptive] functions: project_source_root()/global_source_root() and
project_build_root()/global_build_root(). Unfortunately, these new
replacement functions were only added in 0.56 release too, so to use
them we would need version checks for old/new functions to remove the
deprecation warnings.

However, the functions "current_build_dir()" and "current_source_dir()"
remain unaffected by all this, so we can bypass the versioning problem,
by saving off these values to "dpdk_source_root" and "dpdk_build_root"
in the top-level meson.build file.

Bugzilla ID: 926
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>

commit | commitdiff | tree

Bruce Richardson [Fri, 21 Jan 2022 16:12:30 +0000 (16:12 +0000)]

build: fix warning about using -Wextra flag

Each build, meson would issue a warning reporting that the
"warning_level" setting should be used in place of adding -Wextra
directly to our build commands. Testing with meson 0.61 shows that the
only difference for gcc and clang builds between warning levels 1 and
2 is the addition of -Wextra, so we can remove the warning by deleting
our explicit set of Wextra and changing the build defaults to
warning_level 2.

Fixes: 524a0d5d66b9 ("build: enable extra warnings with meson")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>

commit | commitdiff | tree

Bruce Richardson [Thu, 20 Jan 2022 18:06:39 +0000 (18:06 +0000)]

build: fix warnings when running external commands

Meson 0.61.1 is giving warnings that the calls to run_command do not
always explicitly specify if the result is to be checked or not, i.e.
there is a missing "check" parameter. This is because the default
behaviour without the parameter is due to change in the future.

We can fix these warnings by explicitly adding into each call whether
the result should be checked by meson or not. This patch therefore
adds in "check: false" to each run_command call where the result is
being checked by the DPDK meson.build code afterwards, and adds in
"check: true" to any calls where the result is currently unchecked.

Bugzilla ID: 921
Cc: stable@dpdk.org
Reported-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Tested-by: Jerin Jacob <jerinj@marvell.com>

commit | commitdiff | tree

Martijn Bakker [Mon, 31 Jan 2022 22:48:21 +0000 (22:48 +0000)]

pflock: fix header file installation

The generic header file was missing
in the list of files to install.

Fixes: 9667d97c2507 ("pflock: add phase-fair reader writer locks")
Cc: stable@dpdk.org
Signed-off-by: Martijn Bakker <gladdyu@gmail.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

commit | commitdiff | tree

Qi Zhang [Tue, 25 Jan 2022 01:26:11 +0000 (09:26 +0800)]

doc: update matching versions in ice guide

Add recommended matching list for ice PMD in DPDK 21.08 and DPDK 21.11.

Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Junfeng Guo <junfeng.guo@intel.com>

DPDK repo used for reviews