Huisong Li [Thu, 5 May 2022 12:27:02 +0000 (20:27 +0800)]
net/hns3: fix MAC and queues HW statistics overflow
The MAC and queues statistics are 32-bit registers in hardware. If
hardware statistics are not obtained for a long time, these statistics
will be overflow.
So PF and VF driver have to periodically obtain and save these
statistics. Since the periodical task and the stats API are in different
threads, we introduce a statistics lock to protect the statistics.
Fixes: 8839c5e202f3 ("net/hns3: support device stats") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Thu, 5 May 2022 12:27:01 +0000 (20:27 +0800)]
net/hns3: fix order of clearing imissed register in PF
Clearing imissed registers in PF hardware depends on the
'drop_stats_mode' in struct hns3_hw. The variable is initialized after
the "hns3_get_configuration". But, in current code, the clearing
operation runs before the function.
So this patch fixes this order. In addition, this patch extracts a
public function to initialize and uninitialize statistics to improve the
maintainability of these codes.
Fixes: 3e9f3042d7c8 ("net/hns3: add imissed packet stats") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
rte_pmd_tun/tap_probe() allocates pmd->intr_handle in eth_dev_tap_create()
and it should not be freed until rte_pmd_tap_remove() is called.
Inspection of tap_rx_intr_vec_set() shows that the call to
tap_tx_intr_vec_uninstall() was calling rte_intr_instance_free() but
tap_tx_intr_vec_install() can then be immediately called, and this then
uses pmd->intr_handle without it being reallocated.
Move rte_intr_instance_free() call from tap_tx_intr_vec_uninstall()
to rte_pmd_tap_remove().
Fixes: d61138d4f0e2 ("drivers: remove direct access to interrupt handle") Cc: stable@dpdk.org Signed-off-by: Quentin Armitage <quentin@armitage.org.uk> Reviewed-by: David Marchand <david.marchand@redhat.com>
Huisong Li [Tue, 3 May 2022 10:02:14 +0000 (18:02 +0800)]
net/bonding: fix slave stop and remove on port close
All slaves will be stopped and removed when closing a bonded port.
But the while loop can not end if both rte_eth_dev_stop and
rte_eth_bond_slave_remove fails, runs infinitely.
This is because the skipped slave port counted in both function failures
but it should be counted only one.
Fixing by not continue to process in the loop after first failure.
Fixes: fb0379bc5db3 ("net/bonding: check stop call status") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Tue, 3 May 2022 10:02:13 +0000 (18:02 +0800)]
net/bonding: fix stopping non-active slaves
When stopping a bonded port, all slaves should be stopped. But only
active slaves are stopped.
So fix by stopping all slave ports and later do "deactivate_slave()" for
active slaves.
Fixes: 0911d4ec0183 ("net/bonding: fix crash when stopping mode 4 port") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
net/iavf: improve performance of Rx timestamp offload
In this patch, We use CPU ticks instead of HW register
to determine whether low 32 bits timestamp has turned
over. It can avoid requesting register value frequently
and improve receiving performance.
Simei Su [Thu, 28 Apr 2022 08:13:45 +0000 (16:13 +0800)]
net/iavf: enable Rx timestamp on flex descriptor
Dump Rx timestamp value into dynamic mbuf field by flex descriptor.
This feature is turned on by dev config "enable-rx-timestamp".
Currently, it's only supported under scalar path.
Signed-off-by: Simei Su <simei.su@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Simei Su [Thu, 28 Apr 2022 08:13:44 +0000 (16:13 +0800)]
common/iavf: support Rx timestamp in virtual channel
Add new ops and structures to support VF to support Rx timestamp
on flex descriptor.
"VIRTCHNL_OP_1588_PTP_GET_CAPS" ops is sent by the VF to request PTP
capabilities and responded by the PF with capabilities enabled for
that VF.
"VIRTCHNL_OP_1588_PTP_GET_TIME" ops is sent by the VF to request
the current time of the PHC. The PF will respond by reading the
device time and reporting it back to the VF.
Signed-off-by: Simei Su <simei.su@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
net/ice: optimize maximum queue number calculation
Remove the limitation that max queue pair number must be 2^n.
With this patch, even on a 8 ports device, the max queue pair
number increased from 128 to 254.
Walter Heymans [Wed, 20 Apr 2022 13:46:39 +0000 (15:46 +0200)]
net/nfp: update how max MTU is read
The 'max_rx_pktlen' value was previously read from hardware, which was
set by the running firmware. This caused confusion due to different
meanings of 'MAX_MTU'. This patch updates the 'max_rx_pktlen' to the
maximum value that the NFP NIC can support. The 'max_mtu' value that is
read from hardware, is assigned to the 'dev_info->max_mtu' variable.
If more layer 2 metadata must be used, the firmware can be updated to
report a smaller 'max_mtu' value.
The constant defined for NFP_FRAME_SIZE_MAX is derived for the maximum
supported buffer size of 10240, minus 136 bytes that is reserved by the
hardware and another 56 bytes reserved for expansion in firmware. This
results in a usable maximum packet length of 10048 bytes.
Signed-off-by: Walter Heymans <walter.heymans@corigine.com> Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Reviewed-by: Chaoyong He <chaoyong.he@corigine.com> Reviewed-by: Richard Donkin <richard.donkin@corigine.com>
Xueming Li [Sun, 8 May 2022 14:25:53 +0000 (17:25 +0300)]
vdpa/mlx5: support device cleanup callback
This patch supports device cleanup callback API which is called when
the device is disconnected from the VM. Cached resources like VM MR and
VQ memory are released.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xueming Li [Sun, 8 May 2022 14:25:52 +0000 (17:25 +0300)]
vdpa/mlx5: cache and reuse hardware resources
During device suspend and resume, resources are not changed normally.
When huge resources were allocated to VM, like huge memory size or lots
of queues, time spent on release and recreate became significant.
To speed up, this patch reuses resources like VM MR and VirtQ memory if
not changed.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xueming Li [Sun, 8 May 2022 14:25:51 +0000 (17:25 +0300)]
vdpa/mlx5: reuse resources in reconfiguration
To speed up device resume, create reuseable resources during device
probe state, release when device is removed. Reused resources includes
TIS,
TD, VAR Doorbell mmap, error handling event channel and interrupt
handler, UAR, Rx event channel, NULL MR, steer domain and table.
Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xueming Li [Sun, 8 May 2022 14:25:49 +0000 (17:25 +0300)]
vdpa/mlx5: fix dead loop when process interrupted
In Ctrl+C handling, sometimes kick handling thread gets endless EGAIN
error and fall into dead lock.
Kick happens frequently in real system due to busy traffic or retry
mechanism. This patch simplifies kick firmware anyway and skip setting
hardware notifier due to potential device error, notifier could be set
in next successful kick request.
Fixes: 62c813706e41 ("vdpa/mlx5: map doorbell") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
David Marchand [Mon, 25 Apr 2022 12:54:30 +0000 (14:54 +0200)]
vhost: validate FDs attached to messages
Some message handlers do not expect any file descriptor attached as
ancillary data.
Provide a common way to enforce this by adding a accepts_fd boolean in
the message handler structure. When a message handler sets accepts_fd to
true, it is responsible for calling validate_msg_fds with a right
expected file descriptor count.
This will avoid leaking some file descriptor by mistake when adding
support for new vhost user message types.
Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xuan Ding [Fri, 8 Apr 2022 10:22:14 +0000 (10:22 +0000)]
examples/vhost: use API to check in-flight packets
In async data path, call rte_vhost_async_get_inflight_thread_unsafe()
API to directly return the number of in-flight packets instead of
maintaining a local variable.
Signed-off-by: Xuan Ding <xuan.ding@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Xuan Ding [Fri, 8 Apr 2022 10:22:13 +0000 (10:22 +0000)]
vhost: add unsafe API to check in-flight packets
In async data path, when vring state changes or device is destroyed,
it is necessary to know the number of in-flight packets in DMA engine.
This patch provides a thread unsafe API to return the number of
in-flight packets for a vhost queue without using any lock.
Signed-off-by: Xuan Ding <xuan.ding@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuan Wang [Fri, 11 Mar 2022 16:35:12 +0000 (00:35 +0800)]
net/vhost: fix access to freed memory
This patch fixes heap-use-after-free reported by ASan.
It is possible for the rte_vhost_dequeue_burst() to access the vq
is freed when numa_realloc() gets called in the device running state.
The control plane will set the vq->access_lock to protected the vq
from the data plane. Unfortunately the lock will fail at the moment
the vq is freed, allowing the rte_vhost_dequeue_burst() to access
the fields of the vq, which will trigger a heap-use-after-free error.
In the case of multiple queues, the vhost pmd can access other queues
that are not ready when the first queue is ready, which makes no sense
and also allows numa_realloc() and rte_vhost_dequeue_burst() access to
vq to happen at the same time. By controlling vq->allow_queuing we can make
the pmd access only the queues that are ready.
Fixes: 1ce3c7fe149 ("net/vhost: emulate device start/stop behavior") Signed-off-by: Yuan Wang <yuanx.wang@intel.com> Tested-by: Wei Ling <weix.ling@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Harold Huang [Wed, 2 Mar 2022 09:41:05 +0000 (17:41 +0800)]
net/virtio: support NAPI when using vhost-net backend
In patch [1], NAPI has been supported in kernel tun driver to accelerate
packet processing received from vhost-net. This will greatly improve the
throughput of the tap device in the vhost-net backend.
Match the closest supported Rx payload buffer size with the mempool
data size and program it for the Rx queue. This removes unnecessary
need for handling additional padding, packing, and alignment, when
posting Rx buffers to hardware.
net/cxgbe: fix Tx queue stuck with mbuf chain coalescing
When trying to coalesce mbufs with chain on Tx side, it is possible
to get stuck during queue wrap around. When coalescing this mbuf
chain fails, the Tx path returns EBUSY and when the same packet
is retried again, it couldn't get coalesced again, and the loop
repeats. Fix by pushing the packet through the normal Tx path.
Also use FW_ETH_TX_PKTS_WR to handle mbufs with chain for FW
to optimize.
Ke Zhang [Mon, 11 Apr 2022 05:40:03 +0000 (05:40 +0000)]
net/bonding: fix RSS key config with extended key length
When creating a bonding device, if the slave device's
RSS key length = standard_rss_key length + extended_hash_key length,
then bonding device will be same as slave,
in function bond_ethdev_configure(), the default_rss_key length is 40,
it is not matched, so it should calculate a new key for bonding device
if the default key could not be used.
Fixes: 6b1a001ec546 ("net/bonding: fix RSS key length") Cc: stable@dpdk.org Signed-off-by: Ke Zhang <ke1x.zhang@intel.com> Acked-by: Min Hu (Connor) <humin29@huawei.com>
Long Li [Thu, 24 Mar 2022 17:46:17 +0000 (10:46 -0700)]
net/netvsc: fix hot adding multiple VF PCI devices
This patch fixes two issues with hot removing/adding a VF PCI device:
1. The original device argument is lost when it's hot added
2. If there are multiple VFs hot adding at the same time, some of the
VFs may not get added successfully because only one single VF status
is stored in the netvsc.
Fix these by storing the original device arguments and maintain a list
of hot add contexts to deal with multiple VF devices.
Fixes: a2a23a794b ("net/netvsc: support VF device hot add/remove") Cc: stable@dpdk.org Signed-off-by: Long Li <longli@microsoft.com>
David Marchand [Thu, 5 May 2022 09:29:52 +0000 (11:29 +0200)]
ci: build some job with ASan
Enable ASan, this can greatly help identify leaks and buffer overflows.
Running unit tests relying on multiprocess is unreliable with ASan
enabled, so skip them.
Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com>
David Marchand [Thu, 5 May 2022 09:29:51 +0000 (11:29 +0200)]
test/mem: disable ASan when accessing unallocated memory
As described in bugzilla, ASan reports accesses to all memory segment as
invalid, since those parts have not been allocated with rte_malloc.
Move __rte_no_asan to rte_common.h and disable ASan on a part of the test.
Bugzilla ID: 880 Fixes: 6cc51b1293ce ("mem: instrument allocator for ASan") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
test/hash: report non HTM numbers for single thread
In hash_readwrite_perf_autotest a single read and write operation is
benchmarked for both HTM and non HTM cases. However the result summary
only shows the HTM value. Therefore add the non HTM value for
completeness.
Fixes: 0eb3726ebcf1 ("test/hash: add test for read/write concurrency") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Raja Zidane [Thu, 7 Apr 2022 11:42:49 +0000 (14:42 +0300)]
examples/l2fwd-crypto: fix stats refresh rate
TIMER_MILLISECOND is defined as the number of cpu cycles per millisecond,
current definition is correct for cores with frequency of 2GHZ, for cores
with different frequency, it caused different periods between refresh,
(i.e. the definition is about 14ms on ARM cores).
The devarg that stated the period between stats print was not used,
instead, it was always defaulted to 10 seconds (on 2GHZ core).
Use DPDK API to get CPU frequency, to define TIMER_MILLISECOND.
Use the refresh period devarg instead of defaulting to 10s always.
Driver is creating a fle pool with a fixed number of
buffers for all queue pairs of a DPSECI object.
These fle buffers are equivalent to the number of descriptors.
In this patch, creating the fle pool for each queue pair
so that user can control the number of descriptors of a
queue pair using API rte_cryptodev_queue_pair_setup().
To perform crypto operations on DPAA platform,
QI interface of HW must be enabled.
Earlier DPAA crypto driver was dependent on
kernel for QI enable. Now with this patch
there is no such dependency on kernel.
crypto/dpaa_sec: fix chained FD length in raw datapath
DPAA sec raw driver is calculating the wrong lengths while
creating the FD for chain.
This patch fixes lengths for chain FD.
Fixes: 78156d38e112 ("crypto/dpaa_sec: support authonly and chain with raw API") Cc: stable@dpdk.org Signed-off-by: Gagandeep Singh <g.singh@nxp.com> Acked-by: Akhil Goyal <gakhil@marvell.com>
Driver allocates a fle buffer for each packet
before enqueue and free the buffer on dequeue. But in case if
there are enqueue failures, then code should free the fle buffers.
test/crypto-perf: extend asymmetric crypto throughput test
Extended support for asymmetric crypto perf throughput test.
Added support for new modulus lengths.
Added new parameter --modex-len.
Supported lengths are 60, 128, 255, 448. Default length is 128.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>
crypto/cnxk: prevent out-of-bound access in capabilities
In a situation where crypto_caps elements are checked only for
RTE_CRYPTO_OP_TYPE_UNDEFINED until valid op defined, there is
possibility for an out of bound access. Add this array by one
element for current capabilities.
Signed-off-by: Gowrishankar Muthukrishnan <gmuthukrishn@marvell.com> Acked-by: Anoob Joseph <anoobj@marvell.com>
Anoob Joseph [Mon, 25 Apr 2022 05:38:25 +0000 (11:08 +0530)]
crypto/cnxk: use set ctx operation for session destroy
Usage of flush and invalidate would involve delays to account
for flush delay. Use set_ctx operation instead. When set_ctx fails,
fall back to flush + invalidate scheme.
Signed-off-by: Anoob Joseph <anoobj@marvell.com> Acked-by: Akhil Goyal <gakhil@marvell.com>
David Marchand [Fri, 6 May 2022 11:57:36 +0000 (13:57 +0200)]
ci: add MinGW cross-compilation in GHA
Add mingw cross compilation in our public CI so that users with their
own github repository have a first level of checks for Windows compilation
before submitting to the mailing list.
This does not replace our better checks in other entities of the CI.
Only the helloworld example is compiled (same as what is tested in
test-meson-builds.sh).
Note: the mingw cross compilation toolchain (version 5.0) in Ubuntu
18.04 was broken (missing a ENOMSG definition).
Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com>
David Marchand [Fri, 6 May 2022 11:57:35 +0000 (13:57 +0200)]
ci: switch to Ubuntu 20.04
Ubuntu 18.04 is now rather old.
Besides, other entities in our CI are also testing this distribution.
Switch to a newer Ubuntu release and benefit from more recent
tool(chain)s: for example, net/cnxk now builds fine and can be
re-enabled.
Note: Ubuntu 18.04 and 20.04 seem to preserve the same paths for the ARM
and PPC cross compilation toolchains, so we can use a single
configuration file (with the hope, future releases of Ubuntu will do the
same).
Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Aaron Conole <aconole@redhat.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Tianhao Chai [Thu, 5 May 2022 04:39:35 +0000 (23:39 -0500)]
eal: fix C++ include for device event and DMA
Currently the "extern C" section ends right before rte_dev_dma_unmap
and other DMA function declarations, causing some C++ compilers to
produce C++ mangled symbols to rte_dev_dma_unmap instead of C symbols.
This leads to build failures later when linking a final executable
against this object.
Anatoly Burakov [Wed, 4 May 2022 14:31:58 +0000 (14:31 +0000)]
malloc: fix ASan handling for unmapped memory
Currently, when we free previously allocated memory, we mark the area as
"freed" for ASan purposes (flag 0xfd). However, sometimes, freeing a
malloc element will cause pages to be unmapped from memory and re-backed
with anonymous memory again. This may cause ASan's "use-after-free"
error down the line, because the allocator will try to write into
memory areas recently marked as "freed".
To fix this, we need to mark the unmapped memory area as "available",
and fixup surrounding malloc element header/trailers to enable later
malloc routines to safely write into new malloc elements' headers or
trailers.
Bugzilla ID: 994 Fixes: 6cc51b1293ce ("mem: instrument allocator for ASan") Cc: stable@dpdk.org Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>