Morten Brørup [Mon, 24 Jan 2022 14:59:53 +0000 (15:59 +0100)]
mempool: test performance with constant n
"What gets measured gets done."
This patch adds mempool performance tests where the number of objects to
put and get is constant at compile time, which may significantly improve
the performance of these functions. [*]
Also, it is ensured that the array holding the object used for testing
is cache line aligned, for maximum performance.
And finally, the following entries are added to the list of tests:
- Number of kept objects: 512
- Number of objects to get and to put: The number of pointers fitting
into a cache line, i.e. 8 or 16
[*] Some example performance test (with cache) results:
Markus Theil [Fri, 3 Dec 2021 07:19:07 +0000 (08:19 +0100)]
kni: fix ioctl signature
Fix kni's ioctl signature to correctly match the kernel's
structs. This shaves off the (void*) casts and uses struct file*
instead of struct inode*. With the correct signature, control flow
integrity checkers are no longer confused at this point.
Signed-off-by: Markus Theil <markus.theil@secunet.com> Tested-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Tudor Cornea [Thu, 20 Jan 2022 12:41:34 +0000 (14:41 +0200)]
kni: allow configuring thread granularity
The Kni kthreads seem to be re-scheduled at a granularity of roughly
1 millisecond right now, which seems to be insufficient for performing
tests involving a lot of control plane traffic.
Even if KNI_KTHREAD_RESCHEDULE_INTERVAL is set to 5 microseconds, it
seems that the existing code cannot reschedule at the desired granularily,
due to precision constraints of schedule_timeout_interruptible().
In our use case, we leverage the Linux Kernel for control plane, and
it is not uncommon to have 60K - 100K pps for some signaling protocols.
Since we are not in atomic context, the usleep_range() function seems to be
more appropriate for being able to introduce smaller controlled delays,
in the range of 5-10 microseconds. Upon reading the existing code, it would
seem that this was the original intent. Adding sub-millisecond delays,
seems unfeasible with a call to schedule_timeout_interruptible().
KNI_KTHREAD_RESCHEDULE_INTERVAL 5 /* us */
schedule_timeout_interruptible(
usecs_to_jiffies(KNI_KTHREAD_RESCHEDULE_INTERVAL));
Below, we attempted a brief comparison between the existing implementation,
which uses schedule_timeout_interruptible() and usleep_range().
We attempt to measure the CPU usage, and RTT between two Kni interfaces,
which are created on top of vmxnet3 adapters, connected by a vSwitch.
insmod rte_kni.ko kthread_mode=single carrier=on
schedule_timeout_interruptible(usecs_to_jiffies(5))
kni_single CPU Usage: 2-4 %
[root@localhost ~]# ping 1.1.1.2 -I eth1
PING 1.1.1.2 (1.1.1.2) from 1.1.1.1 eth1: 56(84) bytes of data.
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.70 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.00 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.99 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.985 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.00 ms
usleep_range(5, 10)
kni_single CPU usage: 50%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.338 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.150 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.123 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.139 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.159 ms
usleep_range(20, 50)
kni_single CPU usage: 24%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.202 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.170 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.171 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.248 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.185 ms
usleep_range(50, 100)
kni_single CPU usage: 13%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.537 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.257 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.231 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.143 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.200 ms
usleep_range(100, 200)
kni_single CPU usage: 7%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=0.716 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=0.167 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=0.459 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=0.455 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=0.252 ms
usleep_range(1000, 1100)
kni_single CPU usage: 2%
64 bytes from 1.1.1.2: icmp_seq=1 ttl=64 time=2.22 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=64 time=1.17 ms
64 bytes from 1.1.1.2: icmp_seq=5 ttl=64 time=1.15 ms
Upon testing, usleep_range(1000, 1100) seems roughly equivalent in
latency and cpu usage to the variant with schedule_timeout_interruptible(),
while usleep_range(100, 200) seems to give a decent tradeoff between
latency and cpu usage, while allowing users to tweak the limits for
improved precision if they have such use cases.
Disabling RTE_KNI_PREEMPT_DEFAULT, interestingly seems to lead to a
softlockup on my kernel.
Kernel panic - not syncing: softlockup: hung tasks
CPU: 0 PID: 1226 Comm: kni_single Tainted: G W O 3.10 #1
<IRQ> [<ffffffff814f84de>] dump_stack+0x19/0x1b
[<ffffffff814f7891>] panic+0xcd/0x1e0
[<ffffffff810993b0>] watchdog_timer_fn+0x160/0x160
[<ffffffff810644b2>] __run_hrtimer.isra.4+0x42/0xd0
[<ffffffff81064b57>] hrtimer_interrupt+0xe7/0x1f0
[<ffffffff8102cd57>] smp_apic_timer_interrupt+0x67/0xa0
[<ffffffff8150321d>] apic_timer_interrupt+0x6d/0x80
Bruce Richardson [Mon, 24 Jan 2022 17:49:59 +0000 (17:49 +0000)]
build: remove deprecated Meson functions
Starting in meson 0.56, the functions meson.source_root() and
meson.build_root() are deprecated and to be replaced by the [more
descriptive] functions: project_source_root()/global_source_root() and
project_build_root()/global_build_root(). Unfortunately, these new
replacement functions were only added in 0.56 release too, so to use
them we would need version checks for old/new functions to remove the
deprecation warnings.
However, the functions "current_build_dir()" and "current_source_dir()"
remain unaffected by all this, so we can bypass the versioning problem,
by saving off these values to "dpdk_source_root" and "dpdk_build_root"
in the top-level meson.build file
Bugzilla ID: 926 Cc: stable@dpdk.org Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Jerin Jacob <jerinj@marvell.com>
Bruce Richardson [Fri, 21 Jan 2022 16:12:30 +0000 (16:12 +0000)]
build: fix warning about using -Wextra flag
Each build, meson would issue a warning reporting that the
"warning_level" setting should be used in place of adding -Wextra
directly to our build commands. Testing with meson 0.61 shows that the
only difference for gcc and clang builds between warning levels 1 and
2 is the addition of -Wextra, so we can remove the warning by deleting
our explicit set of Wextra and changing the build defaults to
warning_level 2.
Fixes: 524a0d5d66b9 ("build: enable extra warnings with meson") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Luca Boccassi <bluca@debian.org>
Bruce Richardson [Thu, 20 Jan 2022 18:06:39 +0000 (18:06 +0000)]
build: fix warnings when running external commands
Meson 0.61.1 is giving warnings that the calls to run_command do not
always explicitly specify if the result is to be checked or not, i.e.
there is a missing "check" parameter. This is because the default
behaviour without the parameter is due to change in the future.
We can fix these warnings by explicitly adding into each call whether
the result should be checked by meson or not. This patch therefore
adds in "check: false" to each run_command call where the result is
being checked by the DPDK meson.build code afterwards, and adds in
"check: true" to any calls where the result is currently unchecked.
Bugzilla ID: 921 Cc: stable@dpdk.org Reported-by: Jerin Jacob <jerinj@marvell.com> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Tested-by: Jerin Jacob <jerinj@marvell.com>
Feifei Wang [Thu, 27 Jan 2022 07:40:01 +0000 (15:40 +0800)]
net/i40e: remove redundant reset operation
For free buffer operation in i40e vector path, it is unnecessary to
store 'NULL' into txep.mbuf. This is because when putting mbuf into Tx
queue, tx_tail is the sentinel. And when doing tx_free, tx_next_dd is
the sentinel. In all processes, mbuf==NULL is not a condition in check.
Thus reset of mbuf is unnecessary and can be omitted.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Xiaoyu Min [Tue, 18 Jan 2022 11:38:50 +0000 (19:38 +0800)]
net/mlx5: reject jump to root table
Currently root table as destination is not supported.
The jump action which finally be translated to underlying root table in
rdma-core should be rejected.
Fixes: f78f747f41d0 ("net/mlx5: allow jump to group lower than current") Cc: stable@dpdk.org Signed-off-by: Xiaoyu Min <jackmin@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Bing Zhao [Mon, 17 Jan 2022 17:49:14 +0000 (19:49 +0200)]
common/mlx5: fix probing failure code
While probing the device with unsupported class, the process should
fail because no appropriate driver was found. After traversing all
the drivers, an error value should be returned for the case.
In the previous implementation, zero value indicating probing success
was wrongly returned.
Raja Zidane [Sun, 16 Jan 2022 15:23:47 +0000 (15:23 +0000)]
net/mlx5: fix mark enabling for Rx
To optimize datapath, the mlx5 pmd checked for mark action on flow
creation, and flagged possible destination rxqs (through queue/RSS
actions), then it enabled the mark action logic only for flagged rxqs.
Mark action didn't work if no queue/rss action was in the same flow,
even when the user use multi-group logic to manage the flows.
So, if mark action is performed in group X and the packet is moved to
group Y > X when the packet is forwarded to Rx queues, SW did not get
the mark ID to the mbuf.
Flag Rx datapath to report mark action for any queue when the driver
detects the first mark action after dev_start operation.
Fixes: 8e61555657b2 ("net/mlx5: fix shared RSS and mark actions combination") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Dmitry Kozlyuk [Fri, 14 Jan 2022 10:52:17 +0000 (12:52 +0200)]
common/mlx5: fix MR lookup for non-contiguous mempool
Memory region (MR) lookup by address inside mempool MRs
was not accounting for the upper bound of an MR.
For mempools covered by multiple MRs this could return
a wrong MR LKey, typically resulting in an unrecoverable
TxQ failure:
mlx5_net: Cannot change Tx QP state to INIT Invalid argument
Corresponding message from /var/log/dpdk_mlx5_port_X_txq_Y_index_Z*:
This is likely to happen with --legacy-mem and IOVA-as-PA,
because EAL intentionally maps pages at non-adjacent PA
to non-adjacent VA in this mode, and MLX5 PMD works with VA.
Maxime Coquelin [Wed, 26 Jan 2022 09:55:07 +0000 (10:55 +0100)]
vhost: improve virtio-net layer logs
This patch standardizes logging done in Virtio-net, so that
the Vhost-user socket path is always prepended to the logs.
It will ease log analysis when multiple Vhost-user ports
are in use.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Maxime Coquelin [Wed, 26 Jan 2022 09:55:06 +0000 (10:55 +0100)]
vhost: improve socket layer logs
This patch adds the Vhost socket path whenever possible in
order to make debugging possible when multiple Vhost
devices are in use. Some vhost-user layer functions are
modified to pass the device path down to the socket layer.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Harold Huang [Thu, 23 Dec 2021 04:42:37 +0000 (12:42 +0800)]
net/virtio-user: fix resource leak on probing failure
When eth_virtio_dev_init is failed, the registered virtio user memory
event cb is not released and the backend created tap device is not
destroyed. It would cause some residual tap device existed in the host
and creating a new vdev could be failed because the new virtio_user_dev
could use the same address pointer and register memory event cb to the
same address is not allowed.
Fixes: ca8326a94365 ("net/virtio_user: fix error management during init") Cc: stable@dpdk.org Signed-off-by: Harold Huang <baymaxhuang@gmail.com> Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Matan Azrad [Mon, 22 Nov 2021 13:12:35 +0000 (15:12 +0200)]
vdpa/mlx5: workaround queue stop with traffic
When the event thread polls traffic and a virtq is stopping, the FW loses
synchronization in the virtq indexes.
It causes LM failure on synchronization between the HOST indexes to
the GUEST indexes.
Unset the event thread before the queue stop in the LM process.
Fixes: 31b9c29c86af ("vdpa/mlx5: support close and config operations") Cc: stable@dpdk.org Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Selwin Sebastian [Tue, 25 Jan 2022 12:17:47 +0000 (17:47 +0530)]
net/axgbe: alter port speed bit range
Newer generation Hardware uses the slightly different
port speed bit widths, so alter the existing port speed
bit range to extend support to the newer generation hardware
while maintaining the backward compatibility with older
generation hardware.
The previously reserved bits are now being used which
then requires the adjustment to the BIT values, e.g.:
Selwin Sebastian [Tue, 25 Jan 2022 12:17:45 +0000 (17:47 +0530)]
net/axgbe: reset PHY Rx when mailbox command timeout
Sometimes mailbox commands timeout when the RX data path becomes
unresponsive. This prevents the submission of new mailbox commands
to DXIO. This patch identifies the timeout and resets the RX data
path so that the next message can be submitted properly.
Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com> Acked-by: Chandubabu Namburu <chandu@amd.com>
Selwin Sebastian [Tue, 25 Jan 2022 12:17:44 +0000 (17:47 +0530)]
net/axgbe: simplify rate change mailbox interface
Simplify and centralize the mailbox command rate change interface by
having a single function perform the writes to the mailbox registers
to issue the request.
Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com> Acked-by: Chandubabu Namburu <chandu@amd.com>
Selwin Sebastian [Tue, 25 Jan 2022 12:17:43 +0000 (17:47 +0530)]
net/axgbe: toggle PLL settings during rate change
For each rate change command submission, the FW has to do a phy
power off sequence internally. For this to happen correctly, the
PLL re-initialization control setting has to be turned off before
sending mailbox commands and re-enabled once the command submission
is complete. Without the PLL control setting, the link up takes
longer time in a fixed phy configuration.
Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com> Acked-by: Chandubabu Namburu <chandu@amd.com>
Selwin Sebastian [Tue, 25 Jan 2022 12:17:42 +0000 (17:47 +0530)]
net/axgbe: attempt always link training in KR mode
Link training is always attempted when in KR mode, but the code is
structured to check if link training has been enabled before attempting
to perform it. Since that check will always be true, simplify the code
to always enable and start link training during KR auto-negotiation.
Signed-off-by: Selwin Sebastian <selwin.sebastian@amd.com> Acked-by: Chandubabu Namburu <chandu@amd.com>
Huisong Li [Sat, 22 Jan 2022 01:51:30 +0000 (09:51 +0800)]
net/hns3: extract common function to initialize MAC address
The code logic to initialize "data->mac_addrs" for PF and VF is similar.
This patch extracts a common API to initialize it to improve code
maintainability.
Signed-off-by: Huisong Li <lihuisong@huawei.com> Acked-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Sat, 22 Jan 2022 01:51:29 +0000 (09:51 +0800)]
net/hns3: fix using enum as boolean
The enum type variables cannot be used as bool variables. This patch
fixes for "with->func" in hns3_action_rss_same().
Fixes: eb158fc756a5 ("net/hns3: fix config when creating RSS rule after flush") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Acked-by: Min Hu (Connor) <humin29@huawei.com>
Heinrich Kuhn [Wed, 19 Jan 2022 11:48:00 +0000 (13:48 +0200)]
net/nfp: free HW ring memzone on queue release
During Rx/Tx queue setup, memory is reserved for the hardware rings.
This memory zone should subsequently be freed in the queue release
logic. This commit also adds a call to the release logic in the
dev_close() callback so that the ring memzone may be freed during port
close too.
Fixes: b812daadad0d ("nfp: add Rx and Tx") Cc: stable@dpdk.org Signed-off-by: Heinrich Kuhn <heinrich.kuhn@corigine.com> Signed-off-by: Simon Horman <simon.horman@corigine.com>
Nobuhiro Miki [Wed, 19 Jan 2022 07:43:16 +0000 (16:43 +0900)]
net/tap: forbid different Rx/Tx queue number
Users can create the desired number of RxQ and TxQ in DPDK. For
example, if the number of RxQ = 2 and the number of TxQ = 5,
a total of 8 file descriptors will be created for a tap device,
including RxQ, TxQ, and one for keepalive. The RxQ and TxQ
with the same ID are paired by dup(2).
In this scenario, Kernel will have 3 RxQ where packets are
incoming but not read. The reason for this is that there are only
2 RxQ that are polled by DPDK, while there are 5 queues in Kernel.
This patch add a checking if DPDK has appropriate numbers of
queues to avoid unexpected packet drop.
Min Hu (Connor) [Mon, 17 Jan 2022 02:43:02 +0000 (10:43 +0800)]
net/hns3: fix vector Rx/Tx when PTP enabled
If hardware supports IEEE 1588 PTP, PTP capability will be set.
Currently, vec and sve burst is unsupported when PTP capability is set.
For sake of Rx/Tx performance, IEEE 1588 PTP is not supported in sve or
vec burst mode. When enabling IEEE 1588 PTP, Rx/Tx burst mode should be
simple or common. Rx/Tx burst mode could be set like this, for example:
-a 0000:35:00.0,rx_func_hint=common,tx_func_hint=common
This patch supports vec and sve burst when PTP is disabled. And only
support simple or common burst When PTP is enabled.
Fixes: 38b539d96eb6 ("net/hns3: support IEEE 1588 PTP") Cc: stable@dpdk.org Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Mon, 17 Jan 2022 02:43:01 +0000 (10:43 +0800)]
net/hns3: fix mailbox wait time
The mailbox wait time can be specified at runtime. But the variable that
controls this time are not initialized when the variable isn't designated
or is specified as an invalid value, which will fail to initialize device
in the case where no device is bound to initialize the device.
Fixes: 2fc3e696a7f1 ("net/hns3: add runtime config for mailbox limit time") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com>
Min Hu (Connor) [Mon, 17 Jan 2022 02:43:00 +0000 (10:43 +0800)]
net/hns3: fix Rx/Tx functions update
When fast path operation is introduced, the Rx/Tx function is done by
object 'rte_eth_fp_ops'. So 'rte_eth_fp_ops' should be updated if
'fast-path functions' need to be changed, such as PMD receive function,
prepare function and so on.
This patch fixed receiving packets bug when fast path operation is
introduced.
Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations") Fixes: 168b7d79dada ("net/hns3: support set link up/down for PF") Cc: stable@dpdk.org Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
The code in memif driver to stub out rx_irq_enable is unnecessary
and causes different error returns than other drivers.
The core ethdev code will return -ENOTSUP if the driver has
a null rx_queue_intr_enable callback.
Ajit Khaparde [Thu, 20 Jan 2022 09:12:28 +0000 (14:42 +0530)]
net/bnxt: fix VF resource allocation strategy
1. VFs need a notification queue to handle async messages.
But the current logic does not reserve a notification queue leading
to initialization failure in some cases.
2. With the current logic, DPDK PF driver reserves only one VNIC
to the VFs leading to initialization failure with more than 1 RXQs.
Added logic to distribute number of NQs and VNICs from the pool
across VFs and PF.
While reserving resources for the VFs, the strategy is to keep
both min & max values the same. This could result in a failure
when there isn't enough resources to satisfy the request.
Hence fixed to instruct the FW to not reserve all minimum
resources requested for the VF. The VF driver can request the FW
for the allocated resources during probe.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF") Cc: stable@dpdk.org Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Thu, 20 Jan 2022 09:12:27 +0000 (14:42 +0530)]
net/bnxt: fix memzone allocation per VNIC
In case of Thor RSS table size is too big. This could result in
memory allocation failure when the supported vnic count is high.
Instead of allocating the memzone for all VNICs in one shot,
allocate for each VNIC individually.
Also, fixed to free the memzone in the uninit path.
Kalesh AP [Thu, 20 Jan 2022 09:12:25 +0000 (14:42 +0530)]
net/bnxt: fix check for autoneg enablement
HWRM_PORT_PHY_QCFG_OUTPUT response indicates the autoneg speed mask
supported by the FW. While enabling autoneg, driver should also check
the FW advertised PAM4 speeds supported in auto mode which is set
in the HWRM_PORT_PHY_QCFG_OUTPUT response.
Yunjian Wang [Tue, 25 Jan 2022 01:39:07 +0000 (09:39 +0800)]
net/ice: fix link up when starting device
Currently, there is a possibility that the link status is not correct
after set link up, the device ID is 159b. It would be fixed by calling
ice_link_update() while the parameter 'wait_to_complete' is true. It's
reasonable to wait for complete right after set link up as it is not
in an link status change interrupt handling scenario.
Fixes: cf911d90e366 ("net/ice: support link update") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Simei Su [Thu, 20 Jan 2022 10:21:52 +0000 (18:21 +0800)]
net/ice: fix mbuf offload flag for Rx timestamp
For received PTP packets, the flag "RTE_MBUF_F_RX_IEEE1588_TMST" has not
been set which leads to received PTP packet not timestamped by hardware
shown in testpmd/ieee1588 fwd.
Fixes: 646dcbe6c701 ("net/ice: support IEEE 1588 PTP") Cc: stable@dpdk.org Signed-off-by: Simei Su <simei.su@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Ivan Malov [Mon, 10 Jan 2022 21:48:45 +0000 (00:48 +0300)]
net/sfc: validate queue span when parsing flow action RSS
The current code silently shrinks the value if it exceeds
the supported maximum. Do not do that. Validate the value.
Fixes: d77d07391d4d ("net/sfc: support flow API RSS action") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Pavan Nikhilesh [Sat, 22 Jan 2022 15:48:12 +0000 (21:18 +0530)]
net/cnxk: add cn9k segregated Rx functions
Split template functions to multiple files based on the range
of offloads. This allows them to be built in parallel reducing
time spent on compiling single files containing all the template
functions.
The files are added to the build system in later patches modifying
the existing scheme of selecting template lookup with a simple
flat array based lookup.
Add cn9k segregated Rx and event dequeue template functions,
these help in parallelizing the build.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Harman Kalra [Fri, 21 Jan 2022 12:04:19 +0000 (17:34 +0530)]
common/cnxk: always use single interrupt ID with NIX
An errata exists whereby, in certain cases NIX may use an
incorrect QINT_IDX for SQ interrupts. As a result, the
interrupt may not be delivered to software, or may not be
associated with the correct SQ.
When NIX uses an incorrect QINT_IDX :
1. NIX_LF_QINT(0..63)_CNT[COUNT] will be incremented for
incorrect QINT.
2. NIX_LF_QINT(0..63)_INT[INTR] will be set for incorrect
QINT.
Harman Kalra [Fri, 21 Jan 2022 12:04:18 +0000 (17:34 +0530)]
common/cnxk: reset stale values on error debug registers
LF's error debug registers like NIX_LF_SQ_OP_ERR_DBG,
NIX_LF_MNQ_ERR_DBG, NIX_LF_SEND_ERR_DBG captures debug
info for an error detected during LMT operation or meta
enqueue or after meta enqueue granted respectively. HW
sets a valid bit when info is captured and SW is expected
to clear this valid bit by writing 1, else these registers
will show stale values of first interrupt when occurred and
will never update with subsequent interrupts.
Fixes: f6d567b03d28 ("common/cnxk: support NIX IRQ") Cc: stable@dpdk.org Signed-off-by: Harman Kalra <hkalra@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
common/cnxk: use for loop in shaper profiles cleanup
In shaper profiles cleanup, Klockwork static analyzer tool reports
infinite loop although existing loop condition is alright.
False positive may be due to tqh_first not checked in loop,
hence switching to FOREACH_SAFE to make Klockwork happy.
common/cnxk: fix shift offset for TL3 length disable
Fix shift offset for length disable flag in NIXX_AF_TL3X_SHAPE
register to be 24 instead of zero similar to other level SHAPE
registers. Also mask unused bits in adjust value.
Kiran Kumar K [Fri, 21 Jan 2022 06:26:39 +0000 (11:56 +0530)]
common/cnxk: support custom pre L2 header parsing as raw
Add ROC API for parsing custom pre_l2 headers as raw data.
Only relative offset is supported and search and limit is
not supported with this raw item type.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Satheesh Paul <psatheesh@marvell.com>
Kiran Kumar K [Fri, 21 Jan 2022 06:26:38 +0000 (11:56 +0530)]
net/cnxk: support pre L2 switch header type
Adding changes to configure switch header type pre_l2 for cnxk.
pre_l2 headers are custom headers placed before the ethernet
header. Along with switch header type, user needs to provide the
offset within the custom header that holds the size of the
custom header and mask for the size within the size offset.
Signed-off-by: Kiran Kumar K <kirankumark@marvell.com> Reviewed-by: Satheesh Paul <psatheesh@marvell.com>
Thomas Monjalon [Tue, 1 Feb 2022 09:46:37 +0000 (10:46 +0100)]
net/dpaa2: fix build with musl
PAGE_SIZE is already defined in musl libc:
drivers/net/dpaa2/dpaa2_recycle.c:35: error: "PAGE_SIZE" redefined
/usr/include/limits.h:97: note:
this is the location of the previous definition
97 | #define PAGE_SIZE PAGESIZE
Fixes: f023d059769f ("net/dpaa2: support recycle loopback port") Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Nipun Gupta <nipun.gupta@nxp.com>
Maxime Gouin [Wed, 5 Jan 2022 10:32:03 +0000 (11:32 +0100)]
net/nfp: remove useless range checks
Reported by code analysis tool C++test (version 10.4):
> /build/dpdk-20.11/drivers/net/nfp/nfpcore/nfp_target.h
> 375 Condition "island < 1" is always evaluated to false
> 415 Condition "island < 1" is always evaluated to false
> 547 Condition "target < 0" is always evaluated to false
All of these conditions have the same error. They call
NFP_CPP_ID_ISLAND_of or NFP_CPP_ID_TARGET_of which return a uint8_t and
put the result in "island" or "target" which are integers. These
variables can only contain values between 0 and 255.
Fixes: c7e9729da6b5 ("net/nfp: support CPP") Cc: stable@dpdk.org Signed-off-by: Maxime Gouin <maxime.gouin@6wind.com> Reviewed-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Preparation of the stride size and the number of strides for
Multi-Packet RQ was updated recently to accommodate the hardware
limitation about minimum WQE size. The wrong assertion was
introduced to ensure this limitation is met. Assert that the
configured WQE size is not less than the minimum supported size.
The maximum packet headers size for TSO is calculated as a sum of
Ethernet, VLAN, IPv6 and TCP headers (plus inner headers).
The rationale behind choosing IPv6 and TCP is their headers
are bigger than IPv4 and UDP respectively, giving us the maximum
possible headers size. But it is not true for L3 headers.
IPv4 header size (20 bytes) is smaller than IPv6 header size
(40 bytes) only in the default case. There are up to 10
optional header fields called Options in case IHL > 5.
This means that the maximum size of the IPv4 header is 60 bytes.
Choosing the wrong maximum packets headers size causes inability
to transmit multi-segment TSO packets with IPv4 Options present.
PMD check that it is possible to inline all the packet headers
and the packet headers size exceeds the expected maximum size.
The maximum packet headers size was set to 192 bytes before,
but its value has been reduced during Tx path refactor activity.
Restore the proper maximum packet headers size for TSO.
During a large refactoring sweep for 21.11, a previous commit
removed the dependency the bnxt driver had on Linux virtual
bus drivers, such as vfio-pci. This breaks port detection.
This patch adds the kmod dependency back as it was.