Ruifeng Wang [Wed, 15 Sep 2021 08:33:39 +0000 (16:33 +0800)]
net/i40e: fix risk in descriptor read in scalar Rx
Rx descriptor is 16B/32B in size. If the DD bit is set, it indicates
that the rest of the descriptor words have valid values. Hence, the
word containing DD bit must be read first before reading the rest of
the descriptor words.
Since the entire descriptor is not read atomically, on relaxed memory
ordered systems like Aarch64, read of the word containing DD field
could be reordered after read of other words.
Read barrier is inserted between read of the word with DD field
and read of other words. The barrier ensures that the fetched data
is correct.
Testpmd single core test showed no performance drop on x86 or N1SDP.
On ThunderX2, 22% performance regression was observed.
Fixes: 7b0cf70135d1 ("net/i40e: support ARM platform") Cc: stable@dpdk.org Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
The routine converting RTE flow modify field action into
field driver's presentation did not specify the field mask
correctly and this resulted into wrong conversion for
the actions with shifted fields.
Matan Azrad [Mon, 8 Nov 2021 12:22:04 +0000 (14:22 +0200)]
common/mlx5: fix build for zero-length headroom array
The structure of the striding RQ(MPRQ) buffer includes an array size
defined by the RTE_PKTMBUF_HEADROOM macro added in [1].
When RTE_PKTMBUF_HEADROOM is set to 0 in the compilation config file
the compilation with debug type failed:
"In file included from ../drivers/common/mlx5/mlx5_common.h:25,
from ../drivers/common/mlx5/linux/mlx5_nl.h:12,
from ../drivers/common/mlx5/linux/mlx5_nl.c:22:
../drivers/common/mlx5/mlx5_common_mr.h:96:10: error: ISO C forbids
zero-size array 'pad' [-Werror=pedantic]"
Actually, the array for the first stride headroom is not needed:
Each stride in the striding RQ buffer includes the headroom of the next
stride, so the headroom of the first stride should be allocated before
the starting point of the buffer posted to the HW(HW buffer).
The striding RQ buffer is used as an attached buffer to mbuf and have
shared information per stride.
The LRO support moved all the strides shared information to the top of
the buffer before the first stride headroom but didn't remove the old
memory of this headroom from the buffer.
Remove the old headroom memory from the striding RQ buffer.
Bing Zhao [Fri, 5 Nov 2021 06:10:57 +0000 (08:10 +0200)]
net/mlx5: fix RETA update without stopping device
The global redirection table is used to create the default flow
rules for the ingress traffic with the lowest priority. It is also
used to create the default RSS rule in the destination table when
there is a tunnel offload.
To update the RETA in-flight, there is no restriction in the ethdev
API. In the previous implementation of mlx5, a port restart was
needed to make the new configuration take effect.
The restart is heavy, e.g., all the queues will be released and
reallocated, users' rules will be flushed. Since the restart is
internal, there is a risk to crash the application when some change
in the ethdev is introduced but no workaround is done in mlx5 PMD.
The users' rules, including the default miss rule for tunnel
offload, should not be impacted by the RETA update. It is improper
to flush all rules when updating RETA.
With this patch, only the default rules will be flushed and
re-created with the new table configuration.
Jiawei Wang [Wed, 3 Nov 2021 13:07:59 +0000 (15:07 +0200)]
net/mlx5: fix tag ID conflict with sample action
For the flows containing sample action, the tag action was added
implicitly to store the unique flow index into metadata register in the
split prefix subflow, and then match on this index in the split suffix
subflow. The metadata register for flow index of sample split subflows
was also used to store application metadata TAG 0 item, this might cause
TAG 0 corruption in the flows with sample actions.
This patch uses the same metadata register C index as used for
ASO action since it's reserved and not used directly by the application,
and adds the checking in validation to make sure not to conflict
with ASO CT in the same flow.
Fixes: b4c0ddbfcc58 ("net/mlx5: split sample flow into two sub-flows") Cc: stable@dpdk.org Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Dmitry Kozlyuk [Tue, 9 Nov 2021 10:32:53 +0000 (12:32 +0200)]
common/mlx5: fix external memory pool registration
Registration of packet mempools with RTE_PKTMBUF_POOL_PINNED_EXT_MEM
was performed incorrectly: after population of such mempool chunks
only contain memory for rte_mbuf structures, while pointers to actual
external memory are not yet filled. MR LKeys could not be obtained
for external memory addresses of such mempools. Rx datapath assumes
all used mempools are registered and does not fallback to dynamic
MR creation in such case, so no packets could be received.
Skip registration of extmem pools on population because it is useless.
If used for Rx, they are registered at port start.
During registration, recognize such pools, inspect their mbufs
and recover the pages they reside in.
While MRs for these pages may already be created by rte_dev_dma_map(),
they are not reused to avoid synchronization on Rx datapath
in case these MRs are changed in the database.
Rongwei Liu [Tue, 2 Nov 2021 07:22:40 +0000 (09:22 +0200)]
net/mlx5: fix meter policy validation
When a user specifies meter policy like "g_actions queue / end
y_actions queue / r_action drop / end", validation logic missed
to set meter policy mode and it took a random value from the stack.
Define ALL policy modes for the mentioned cases.
Fixes: 4b7bf3ffb473 ("net/mlx5: support yellow in meter policy validation") Cc: stable@dpdk.org Signed-off-by: Rongwei Liu <rongweil@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Reviewed-by: Bing Zhao <bingz@nvidia.com>
Bing Zhao [Mon, 18 Oct 2021 14:43:07 +0000 (17:43 +0300)]
net/mlx5: fix RSS consistency check of meter policy
After yellow color actions in the metering policy were supported,
the RSS could be used for both green and yellow colors and only the
queues attribute could be different.
When specifying the attributes of a RSS, some fields can be ignored
and some default values will be used in PMD. For example, there is a
default RSS key in the PMD and it will be used to create the TIR if
nothing is provided by the application.
The default value cases were missed in the current implementation
and it would cause some false positives or crashes.
The comparison function should be adjusted to take all cases into
consideration when RSS is used for both green and yellow colors.
Fixes: 4b7bf3ffb473 ("net/mlx5: support yellow in meter policy validation") Cc: stable@dpdk.org Signed-off-by: Bing Zhao <bingz@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Jim Harris [Fri, 5 Nov 2021 15:53:51 +0000 (15:53 +0000)]
power: remove unused poll counter
Following the previous fix, there is nothing using the ppi counter.
We can remove the related ppi_av array in struct priority_worker.
This allows us to also remove num_dequeue_pkts_prev and pc from
struct priority_worker since they are only used in conjunction
with the ppi_av array.
Suggested-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Jim Harris [Fri, 5 Nov 2021 15:53:51 +0000 (15:53 +0000)]
power: fix build with clang 13
clang-13 rightfully complains that the tot_ppi variable in update_stats
is set but not used, since the final accumulated tot_ppi results isn't
used anywhere.
Fixes: 450f0791312c ("power: add traffic pattern aware power control") Cc: stable@dpdk.org Signed-off-by: Jim Harris <james.r.harris@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Matan Azrad [Tue, 9 Nov 2021 12:36:10 +0000 (14:36 +0200)]
vdpa/mlx5: workaround dirty bitmap MR creation
Due to kernel driver/FW issues in direct MKEY creation using the DevX
API, this patch replaces the dirty bitmap MR creation to use wrapped
mkey instead.
Fixes: 9d39e57f21ac ("vdpa/mlx5: support live migration") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Signed-off-by: Matan Azrad <matan@nvidia.com>
Matan Azrad [Tue, 9 Nov 2021 12:36:09 +0000 (14:36 +0200)]
common/mlx5: create wrapped MR
The mlx5 PMD uses the kernel mlx5 driver to map physical memory to the
HW.
Using the Verbs API ibv_reg_mr, a mkey can be created for that.
In this case, the mkey is signed on the user ID of the kernel driver.
Using the DevX API, a mkey also can be created, but it should point an
umem object (represents the specific buffer mapping) created by the
kernel. In this case, the mkey is signed on the user ID of the process
DevX context.
In FW DevX control commands which get mkey as a parameter, there is
a security check on the user ID and Verbs mkeys are rejected.
Unfortunately, also when using DevX mkey, there is an error in the FW
command on umem validation because the umem is not designed to be used
for any mkey parameters.
As a workaround to the kernel driver/FW issue, it is needed to use a
wrapped MR, which is an indirect mkey(created by the DevX API) pointing to
direct mkey created by the kernel for any DevX command uses an MR.
Add an API to create and destroy this wrapped MR.
Fixes: 5382d28c2110 ("net/mlx5: accelerate DV flow counter transactions") Fixes: 9d39e57f21ac ("vdpa/mlx5: support live migration") Cc: stable@dpdk.org Signed-off-by: Michael Baum <michaelba@nvidia.com> Signed-off-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Tue, 9 Nov 2021 12:36:08 +0000 (14:36 +0200)]
common/mlx5: glue MR registration with IOVA
Add support for rdma-core API to register IOVA MR.
The API gets the process VA, size, and IOVA and returns a memory region
with space pointed by a specific IOVA.
So any access in this MR should come with an address that is relative to
the IOVA specified in the API.
The octeontx2_dma rawdev driver is removed in DPDK-21.11. The new driver
for the same device uses the dmadev. So this patch updates the device
naming and lists it under dma devices section.
Gagandeep Singh [Tue, 9 Nov 2021 04:39:06 +0000 (10:09 +0530)]
dma/dpaa: introduce DPAA DMA driver skeleton
The DPAA DMA driver is an implementation of the dmadev APIs,
that provide means to initiate a DMA transaction from CPU.
The initiated DMA is performed without CPU being involved
in the actual DMA transaction. This is achieved via using
the QDMA controller of DPAA SoC.
David Marchand [Mon, 8 Nov 2021 10:09:18 +0000 (11:09 +0100)]
build: cleanup libpcap dependent components
The RTE_PORT_PCAP variable is used to signal libpcap availability,
though its name seems to refer to pcap support in the port library.
Prefer a generic name and add explicit link dependencies where needed.
Fixes: 7a944656b33f ("test/pcapng: test pcapng library") Fixes: 2eccf6afbea9 ("bpf: add function to convert classic BPF to DPDK BPF") Fixes: cbb44143be74 ("app/dumpcap: add new packet capture application") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org>
David Marchand [Sat, 6 Nov 2021 08:53:04 +0000 (09:53 +0100)]
examples: skip build when missing dependencies
Trying to disable the vhost library, meson will complain it can't build
the vhost* and vdpa examples when passing -Dexamples=all.
-Dexamples=all skips examples if the example itself announces it can't
be built (for external dependencies, internal dependencies and other
reasons).
Since examples/meson.build will evaluate the internal dependencies
in any case, let's move the check there and resolve the issue for
optional internal libraries.
Fixes: 0bf583222297 ("lib: allow disabling optional libraries") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
David Marchand [Wed, 27 Oct 2021 14:05:44 +0000 (16:05 +0200)]
test: add bitmap to fast tests
This test was never added to the list of tests to run in CI.
Its name does not follow the implicit convention of ending with
_autotest.
Let's fix this.
Fixes: 5e9647fd5a1a ("test/bitmap: test scan after half cacheline is cleared") Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Huichao Cai [Mon, 25 Oct 2021 07:55:53 +0000 (15:55 +0800)]
ip_frag: revert fix fragmenting IPv4 fragment
The patch ("ip_frag: fix fragmenting IPv4 fragment") introduces
a bug and needs to be rolled back. This is because the patch
and variables "flag_offset" conflict with each other.
Bugzilla ID: 835 Fixes: 567473433b7e ("ip_frag: fix fragmenting IPv4 fragment") Cc: stable@dpdk.org Signed-off-by: Huichao Cai <chcchc88@163.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Increase default value for config parameter RTE_LIBRTE_IP_FRAG_MAX_FRAG
from 4 to 8. This parameter controls maximum number of fragments per
packet in ip reassembly table. Increasing this value from 4 to 8 will
allow users to cover common case with jumbo packet size of 9KB and
fragments with default frame size (1500B).
As RTE_LIBRTE_IP_FRAG_MAX_FRAG is used in definition of public
structure (struct rte_ip_frag_death_row), this is an ABI change.
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Ivan Malov [Fri, 5 Nov 2021 21:54:09 +0000 (00:54 +0300)]
net/sfc: support decrement IP TTL actions in transfer flows
These actions map to MAE action DECR_IP_TTL. It affects
the outermost header in the current processing state of
the packet, which might have been decapsulated by prior
action DECAP. It also updates IPv4 checksum accordingly.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Malov [Fri, 5 Nov 2021 21:54:08 +0000 (00:54 +0300)]
common/sfc_efx/base: add API to decrement TTL action to set
Affects the outermost header, taking prior action DECAP into
account. Takes care to also update IPv4 checksum accordingly.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Malov [Fri, 5 Nov 2021 21:54:07 +0000 (00:54 +0300)]
common/sfc_efx/base: factor out no-op helper functions
When an action gets added to an action set, a special helper is
used to handle its arguments. There are actions which have no
arguments, and the corresponding helpers are duplicates in
fact. Use a unified no-op helper instead of them.
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Malov [Fri, 5 Nov 2021 21:54:06 +0000 (00:54 +0300)]
common/sfc_efx/base: refine adding count action to set
1) Invalid counter ID is always set by default.
Do not set it again when adding the action.
2) Counter ID validity check is missing in the
action set allocation helper. Introduce it.
Fixes: 238306cf9aff ("common/sfc_efx/base: support counter in action set") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Malov [Fri, 5 Nov 2021 21:54:05 +0000 (00:54 +0300)]
common/sfc_efx/base: refine adding encap action to set
1) Invalid encap. header ID is always set by default.
Do not set it again when adding the action.
2) Encap. header ID validity check is missing in the
action set allocation helper. Introduce it.
Fixes: 3907defa5bf0 ("common/sfc_efx/base: support adding encap action to a set") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Huisong Li [Sat, 6 Nov 2021 01:43:05 +0000 (09:43 +0800)]
net/hns3: mark unchecked return of snprintf
Fixing the return value of the function to clear static warning.
Fixes: 1181500b2fc5 ("net/hns3: adjust MAC address logging") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Sat, 6 Nov 2021 01:42:58 +0000 (09:42 +0800)]
net/hns3: simplify queue DMA address arithmetic
The patch obtains the upper 32 bits of the Rx/Tx queue DMA address in one
step instead of two steps.
Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Dmitry Kozlyuk [Mon, 8 Nov 2021 11:17:15 +0000 (13:17 +0200)]
net/mlx5: fix split buffer Rx
Routine to lookup LKey on Rx was assuming that the mbuf address
always belongs to a single mempool: the one associated with an RxQ
or the MPRQ mempool. This assumption is false for split buffers case.
A wrong LKey was looked up, resulting in completion errors.
Modify lookup routines to lookup LKey in the mbuf->pool
for non-MPRQ cases both on Rx datapath and on queue initialization.
Raja Zidane [Mon, 8 Nov 2021 13:09:21 +0000 (13:09 +0000)]
common/mlx5: fix queue size in DevX queue pair creation
The number of WQEBBs was provided to QP create, and QP size was calculated
by multiplying the number of WQEBBs by 64, which is the send WQE size.
When creating RQ in the QP (i.e., vdpa driver), the queue size was bigger
because the receive WQE size is 16.
Provide queue size to QP create instead of the number of WQEBBs.
Raja Zidane [Mon, 8 Nov 2021 13:09:20 +0000 (13:09 +0000)]
crypto/mlx5: fix queue size configuration
The DevX interface for QP creation expects the number of WQEBBs.
Wrongly, the number of descriptors was provided to the QP creation.
In addition, the QP size must be a power of 2 what was not guaranteed.
Provide the number of WQEBBs to the QP creation API.
Round up the SQ size to a power of 2.
Rename (sq/rq)_size to num_of_(send/receive)_wqes.
Fixes: 6152534e211e ("crypto/mlx5: support queue pairs operations") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com> Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Raja Zidane [Mon, 8 Nov 2021 13:09:19 +0000 (13:09 +0000)]
crypto/mlx5: fix freeing on probing failure
When calling device close, unset dek is called which destroys a hash list.
In case of error during dev probe, close is called when dek hlist is not
initialized.
Ensure non null list destroy.
Fixes: 90646d6c6e22 ("crypto/mlx5: support basic operations") Cc: stable@dpdk.org Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Raja Zidane [Mon, 8 Nov 2021 13:09:18 +0000 (13:09 +0000)]
common/mlx5: fix DevX queue size overflow
The HW QP/SQ/RQ/CQ queue sizes may be bigger than 64KB.
The width of the variable handled the queue size is 16 bits
which cannot contain the maximum queue size.
Replace the size type to be uint32_t.
Elena Agostini [Mon, 8 Nov 2021 18:58:04 +0000 (18:58 +0000)]
gpudev: add communication list
In heterogeneous computing system, processing is not only in the CPU.
Some tasks can be delegated to devices working in parallel.
When mixing network activity with task processing there may be the need
to put in communication the CPU with the device in order to synchronize
operations.
An example could be a receive-and-process application
where CPU is responsible for receiving packets in multiple mbufs
and the GPU is responsible for processing the content of those packets.
The purpose of this list is to provide a buffer in CPU memory visible
from the GPU that can be treated as a circular buffer
to let the CPU provide fondamental info of received packets to the GPU.
A possible use-case is described below.
CPU:
- Trigger some task on the GPU
- in a loop:
- receive a number of packets
- provide packets info to the GPU
GPU:
- Do some pre-processing
- Wait to receive a new set of packet to be processed
Elena Agostini [Mon, 8 Nov 2021 18:58:03 +0000 (18:58 +0000)]
gpudev: add communication flag
In heterogeneous computing system, processing is not only in the CPU.
Some tasks can be delegated to devices working in parallel.
When mixing network activity with task processing there may be the need
to put in communication the CPU with the device in order to synchronize
operations.
The purpose of this flag is to allow the CPU and the GPU to
exchange ACKs. A possible use-case is described below.
CPU:
- Trigger some task on the GPU
- Prepare some data
- Signal to the GPU the data is ready updating the communication flag
GPU:
- Do some pre-processing
- Wait for more data from the CPU polling on the communication flag
- Consume the data prepared by the CPU
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
Elena Agostini [Mon, 8 Nov 2021 18:58:01 +0000 (18:58 +0000)]
gpudev: add memory API
In heterogeneous computing system, processing is not only in the CPU.
Some tasks can be delegated to devices working in parallel.
Such workload distribution can be achieved by sharing some memory.
As a first step, the features are focused on memory management.
A function allows to allocate memory inside the device,
or in the main (CPU) memory while making it visible for the device.
This memory may be used to save packets or for synchronization data.
The next step should focus on GPU processing task control.
Signed-off-by: Elena Agostini <eagostini@nvidia.com> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Thomas Monjalon [Mon, 8 Nov 2021 18:58:00 +0000 (18:58 +0000)]
gpudev: support multi-process
The device data shared between processes are moved in a struct
allocated in a shared memory (a new memzone for all GPUs).
The main struct rte_gpu references the shared memory
via the pointer mpshared.
The API function rte_gpu_attach() is added to attach a device
from the secondary process.
The function rte_gpu_allocate() can be used only by primary process.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Thomas Monjalon [Mon, 8 Nov 2021 18:57:59 +0000 (18:57 +0000)]
gpudev: add child device representing a device context
The computing device may operate in some isolated contexts.
Memory and processing are isolated in a silo represented by
a child device.
The context is provided as an opaque by the caller of
rte_gpu_add_child().
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Thomas Monjalon [Mon, 8 Nov 2021 18:57:58 +0000 (18:57 +0000)]
gpudev: add event notification
Callback functions may be registered for a device event.
Callback management is per-process and not thread-safe.
The events RTE_GPU_EVENT_NEW and RTE_GPU_EVENT_DEL
are notified respectively after creation and before removal
of a device, as part of the library functions.
Some future events may be emitted from drivers.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Anatoly Burakov [Thu, 28 Oct 2021 14:15:19 +0000 (14:15 +0000)]
vfio: set errno on unsupported OS
Currently, when code is running on FreeBSD or Windows, there is no way
to distinguish between a geniune error and a "VFIO is unsupported"
error. Fix the dummy implementations to also set the rte_errno flag.
Fixes: 279b581c897d ("vfio: expose functions") Fixes: c564a2a20093 ("vfio: expose clear group function for internal usages") Fixes: 964b2f3bfb07 ("vfio: export some internal functions") Fixes: ea2dc1066870 ("vfio: add multi container support") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Acked-by: Chenbo Xia <chenbo.xia@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Anatoly Burakov [Thu, 28 Oct 2021 14:15:18 +0000 (14:15 +0000)]
vfio: fix FreeBSD documentation
On FreeBSD, `rte_vfio_is_enabled()` and `rte_vfio_noiommu_is_enabled()`
API calls will not return error, and will instead return 0. This is
intentional, because the caller of this API does not care whether VFIO
is supported at all, and will instead be interested in whether VFIO is
enabled or not. However, the doxygen comments for these functions state
that they will return an error on FreeBSD, which is incorrect.
Fix the doxygen comment to call out the fact that these
functions are only relevant on Linux, but remove the reference to
returning errors.
Anatoly Burakov [Thu, 28 Oct 2021 14:15:17 +0000 (14:15 +0000)]
vfio: fix FreeBSD clear group stub
On FreeBSD, `rte_vfio_clear_group()` was returning 0 even though this
function is not valid for FreeBSD, and is called out to return error in
doxygen comments.
Fix the return value to match documentation.
Fixes: c564a2a20093 ("vfio: expose clear group function for internal usages") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Anatoly Burakov [Thu, 28 Oct 2021 14:15:16 +0000 (14:15 +0000)]
vfio: drop fallback Linux implementation
Currently, VFIO support for Linux is compiled unconditionally, and
supported kernel versions start with 4.4, so VFIO is assumed to always
be enabled. There is no way of disabling VFIO support at compile time
anyway, so just drop the "VFIO not available" fallback code altogether.
Satheesh Paul [Tue, 2 Nov 2021 10:50:41 +0000 (16:20 +0530)]
app/flow-perf: add random priority option
Added support to create flows with priority attribute set
randomly between 0 and a user supplied maximum value. This
is useful to measure performance on NICs which may have to
rearrange flows to honor flow priority.
Removed the lower limit of 100000 flows per batch.
Signed-off-by: Satheesh Paul <psatheesh@marvell.com> Acked-by: Wisam Jaddo <wisamm@nvidia.com>
Raja Zidane [Thu, 28 Oct 2021 13:58:50 +0000 (13:58 +0000)]
common/mlx5: fix MMO configuration in DevX queue pair
The QP extension valid bit was not set in the QP creation for MMO
configuration.
That caused the QP not to be connected to the GGA MMO engines,
and any MMO WQE job got CQE with an error.
Set the QP ext bit when MMO is configured.
Fixes: ddda0006188a ("common/mlx5: add MMO configuration for DevX queue pair") Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Raja Zidane [Thu, 28 Oct 2021 13:58:49 +0000 (13:58 +0000)]
common/mlx5: fix HCA capabilities PRM alignment
0x20 reserved bytes were missed in the HCA cap PRM structure before the
newly added fields for MMO QP capabilities.
That caused reading MMO QP caps incorrectly.
Add the reserved fields in the HCA cap structure.
Juraj Linkeš [Fri, 5 Nov 2021 11:56:30 +0000 (12:56 +0100)]
config/arm: split aarch32 options
Aarch32 config got overlooked when splitting march in a previous patch.
Fixes: 95e0f23022a3 ("config/arm: split -march into arch and features") Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Chengwen Feng [Tue, 2 Nov 2021 12:37:39 +0000 (20:37 +0800)]
dma/hisilicon: add probing
This patch add dmadev instances create during the PCI probe, and
destroy them during the PCI remove. Internal structures and HW
definitions was also included.
Ali Alnubani [Sun, 7 Nov 2021 16:30:54 +0000 (18:30 +0200)]
sched: fix debug build
Compare pkt_len to 0 instead of NULL to avoid the following build
failure with debug mode enabled:
../lib/sched/rte_pie.h: In function 'rte_pie_enqueue_empty':
../lib/sched/rte_pie.h:125:21: error: comparison between pointer
and integer [-Werror]
RTE_ASSERT(pkt_len != NULL);
Bugzilla ID: 878 Fixes: 44c730b0e379 ("sched: add PIE based congestion management") Signed-off-by: Ali Alnubani <alialnu@nvidia.com>
David Marchand [Fri, 5 Nov 2021 13:29:51 +0000 (14:29 +0100)]
app: fix external dependency linking
ext_deps was not used in app/meson.build
so testpmd dependency on jansson was ignored.
testpmd currently can be linked because metrics library is pulling
the dependency on libjansson.
Michael Baum [Wed, 3 Nov 2021 18:35:13 +0000 (20:35 +0200)]
common/mlx5: fix post doorbell barrier
The rdma-core library can map doorbell register in two ways, depending
on the environment variable "MLX5_SHUT_UP_BF":
- as regular cached memory, the variable is either missing or set to
zero. This type of mapping may cause the significant doorbell
register writing latency and requires an explicit memory write
barrier to mitigate this issue and prevent write combining.
- as non-cached memory, the variable is present and set to not "0"
value. This type of mapping may cause performance impact under
heavy loading conditions but the explicit write memory barrier is
not required and it may improve core performance.
The UAR creation function maps a doorbell in one of the above ways
according to the system. In run time, it always adds an explicit memory
barrier after writing to.
In cases where the doorbell was mapped as non-cached memory, the
explicit memory barrier is unnecessary and may impair performance.
The commit [1] solved this problem for a Tx queue. In run time, it
checks the mapping type and provides the memory barrier after writing to
a Tx doorbell register if it is needed. The mapping type is extracted
directly from the uar_mmap_offset field in the queue properties.
This patch shares this code between the drivers and extends the above
solution for each of them.
[1] commit 8409a28573d3
("net/mlx5: control transmit doorbell register mapping")
Michael Baum [Wed, 3 Nov 2021 18:35:12 +0000 (20:35 +0200)]
net/mlx5: remove duplicated reference of Tx doorbell
The Tx doorbell has different virtual addresses per process.
The secondary process takes the UAR physical page ID of the primary and
mmap it to its own virtual address.
The primary doorbell references were saved in two shared memory
locations: the TxQ structure and a dedicated doorbell array.
Remove the doorbell reference from the TxQ structure and move the
primary processes to take the UAR information from the primary doorbell
array.
Michael Baum [Wed, 3 Nov 2021 18:35:11 +0000 (20:35 +0200)]
common/mlx5: fix doorbell mapping configuration
UAR mapping type can be affected by the devarg tx_db_nc, which can cause
setting the environment variable MLX5_SHUT_UP_BF.
So, the MLX5_SHUT_UP_BF value and the UAR mapping parameter affect the
UAR cache mode.
Wrongly, the devarg was considered for the MLX5_SHUT_UP_BF but not for
the UAR mapping parameter in all the drivers except the net.
Take the tx_db_nc devarg into account for all the drivers.
Depending on kernel capabilities and rdma-core version the mapping of
UAR (User Access Region) of desired memory caching type (non-cached or
write combining) might fail. The PMD implements the flexible strategy
of UAR mapping, alternating the type of caching to succeed.
During this process the failure diagnostics messages are emitted.
These messages are merely diagnostics ones and the logging level should
be adjusted to DEBUG.