git.droids-corp.org - dpdk.git/log

net/mlx5: remove unused calculation in RSS expansion

The RSS flow expansion get a memory buffer to fill the new patterns of
the expanded flows.
This memory management saves the next address to write into the buffer
in a dedicated variable.

The calculation for the next address was wrongly also done when all the
patterns were ready.

Remove it.

Fixes: 4ed05fcd441b ("ethdev: add flow API to expand RSS flows")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix nested flow creation

If xmedata mode 1 enabled and create a flow with RSS and mark action,
there was an error that rdma-core failed to create RQT due to wrong
queue definition. This was due to mixed flow creation in thread specific
flow workspace.

This patch introduces nested flow workspace(context data), each flow
uses dedicate flow workspace, pop and restore workspace when nested flow
creation done, the original flow with continue with original flow
workspace. The total number of thread specific flow workspace should be
2 due to only one nested flow creation scenario so far.

Fixes: 8bb81f2649b1 ("net/mlx5: use thread specific flow workspace")
Fixes: 3ac3d8234b82 ("net/mlx5: fix index when creating flow")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix Unix socket path

mlx_steering_dump_parser.py tool failed to dump flow due to socket file
name changed.

Change socket file name back to make it consistent.

Fixes: e4b7b8d082db ("common/mlx5: fix PCI driver name")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix detection of counter offset support

Currently, the counter offset support is discovered by creating the
rule with invalid offset counter and jump action in root table. If
the rule creation fails with EINVAL errno, that mean counter offset
is not supported in root table.

However, jump action may not be supported in some rdma-core version.
In this case, the discover code will not work properly.

This commits changes the jump action to generic drop action. That
makes the discover code to be more compatible.

Fixes: 994829e695c0 ("net/mlx5: remove single counter container")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix hairpin unbind

In the implementation of mlx5_hairpin_unbind, a copy-paste error was
inside. If a single peer Rx port needed to be unbound, it would be
bound again by mistake.

All the hardware resources were released when stopping the device and
no mess of the configuration was introduced. But when trying to unbind
the ports again, the issue would appear.

The typo of the function call is fixed. If there is no hairpin queue
bound between two ports, the unbinding process should be considered
successful.

Fixes: 37cd4501e873 ("net/mlx5: support two ports hairpin mode")
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: validate MPLSoGRE with GRE key

Currently PMD only accept flow which item_mpls directly follow item_gre,
means to match the GRE header without GRE optional field key in MPLSoGRE
encapsulation.

However, for the MPLSoGRE, the GRE header could have the optional field
(i.e, key) according to the RFC. So PMD need to accept this.

Add MLX5_FLOW_LAYER_GRE_KEY into allowed prev_layer to fix

Fixes: a7a0365565a4 ("net/mlx5: match GRE key and present bits")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyu Min <jackmin@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix check of eCPRI previous layer

Based on the specification, eCPRI can only follow ETH (VLAN) layer
or UDP layer. When creating a flow with eCPRI item, this should be
checked and invalid layout of the layers should be rejected.

Fixes: c7eca23657b7 ("net/mlx5: add flow validation of eCPRI header")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix Rx descriptors info for MPRQ

The number of descriptors configured is returned to a user
via the rxq_info_get API. This number is incorrect for MPRQ.
For SPRQ this number matches the number of mbufs allocated.
For MPRQ we have fewer external MPRQ buffers that can hold
multiple packets in strides of this big buffer. Take that
into account and return the number of MPRQ buffers multiplied
by the number of strides in this case.

Fixes: 26f1bae837eb ("net/mlx5: add Rx/Tx burst mode info")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: improve vectorized MPRQ descriptors locality

There is a performance penalty for the replenish scheme
used in vectorized Rx burst for both MPRQ and SPRQ.
Mbuf elements are being filled at the end of the mbufs
array and being replenished at the beginning. That leads
to an increase in cache misses and the performance drop.
The more Rx descriptors are used the worse the situation.

Change the allocation scheme for vectorized MPRQ Rx burst:
allocate new mbufs only when consumed mbufs are almost
depleted (always have one burst gap between allocated and
consumed indices). Keeping a small number of mbufs allocated
improves cache locality and improves performance a lot.

Unfortunately, this approach cannot be applied to SPRQ Rx
burst routine. In MPRQ Rx burst we simply copy packets from
external MPRQ buffers or attach these buffers to mbufs.
In SPRQ Rx burst we allow the NIC to fill mbufs for us.
Hence keeping a small number of allocated mbufs will limit
NIC ability to fill as many buffers as possible. This fact
offsets the advantage of better cache locality.

Fixes: 0f20acbf5eda ("net/mlx5: implement vectorized MPRQ burst")
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/bnxt: fix doorbell barrier location

Simplify some doorbell functions now that rte_cio_wmb() has been
eliminated and rte_io_wmb() is equivalent for Arm.

Fix a performance degradation on x86 platforms caused by a
previous Arm performance fix by moving the compiler barrier
closer to the I/O write.

Fixes: f0f5d844d138 ("eal: remove deprecated coherent IO memory barriers")
Fixes: bfc1d45875e2 ("net/bnxt: fix performance for Arm")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/pcap: fix registration of timestamp dynamic field

In pcap pmd, the timestamp mbuf dynamic field is mandatory. When the
pcap pmd is created in a secondary process (this is the case for pdump),
it cannot be registered because this is not allowed from a secondary
process.

To ensure that the field is properly registered, do it from probe()
instead of configure(). Indeed, probe() is first invoked on the primary
process when a device is created in a secondary, this enables
registering dynfield from secondary process.

Bugzilla ID: 571
Fixes: d23d73d088c1 ("net/pcap: switch Rx timestamp to dynamic mbuf field")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

app/testpmd: fix MTU after device configure

In 'rte_eth_dev_configure()', if 'DEV_RX_OFFLOAD_JUMBO_FRAME' is not set
the max frame size is limited to 'RTE_ETHER_MAX_LEN' (1518).
This is mistake because for the PMDs that has frame size bigger than
"RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN" (18 bytes), the MTU becomes
less than 1500, causing a valid frame with 1500 bytes payload to be
dropped.

Since 'rte_eth_dev_set_mtu()' works as expected, it is called after
'rte_eth_dev_configure()' to fix the MTU.
It may look redundant to set MTU after 'rte_eth_dev_configure()', both
with default values, but it is not, the resulting MTU config can be
different in the device based on frame overhead of the PMD.

And instead of setting the MTU to default value, it is first get via
'rte_eth_dev_get_mtu()' and set again, this is to cover cases MTU
changed from testpmd command line.

'rte_eth_dev_set_mtu()', '-ENOTSUP' error is ignored to prevent
irrelevant warning messages for the virtual PMDs.

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Tested-by: Igor Romanov <igor.romanov@oktetlabs.ru>

net/tap: allow all-zero checksum for UDP over IPv4

Unlike TCP, UDP checksums are optional and may be zero to indicate "not
set" [RFC 768] (except for IPv6, where this prohibited [RFC 8200]). Add
this special case to the checksum offload emulation in net/tap.

Signed-off-by: Michael Pfeiffer <michael.pfeiffer@tu-ilmenau.de>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

common/sfc_efx/base: fix macro to extract from 256-bit type

Fixes: eda1cc20c3bc ("common/sfc_efx/base: add 256-bit type")
Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

vhost: fix fd leak in kick setup

This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_vring_kick().

Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>

vhost: fix fd leak in dirty logging setup

This patch fixes a file descriptor leak which happens
in the error path of vhost_user_set_log_base().

Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org
Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>

vhost: fix error path when setting memory tables

If an error is encountered before the memory regions are
parsed, the file descriptors for these shared buffers are
leaked.

This patch fixes this by closing the message file descriptors
on error, taking care of avoiding double closing of the file
descriptors. guest_pages is also freed, even though it was not
leaked as its pointer was not overridden on subsequent function
calls.

Fixes: 8f972312b8f4 ("vhost: support vhost-user")
Cc: stable@dpdk.org
Reported-by: Xuan Ding <xuan.ding@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>

examples/vhost: fix ioat dependency

Fix vhost-switch compiling issue when ioat dependency is missing.
Change 'RTE_x86' check into 'RTE_RAW_IOAT' check in meson build file.
Use 'RTE_RAW_IOAT' to control conditional compiling in source file.
Clean some codes.

Fixes: abec60e7115d ("examples/vhost: support vhost async data path")
Fixes: 3a04ecb21420 ("examples/vhost: add async vhost args parsing")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: David Marchand <david.marchand@redhat.com>

examples/vhost: fix string split error handling

Add checking return value of string split function to fix the
coverity issue.

Coverity issue: 363739
Fixes: 3a04ecb21420 ("examples/vhost: add async vhost args parsing")
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

examples/vhost: check argument length

Add args length check before copying to fix the coverity issue.

Coverity issue: 363741
Fixes: 3a04ecb21420 ("examples/vhost: add async vhost args parsing")
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

net/iavf: fix RSS when VF port closed

Check the VF RSS offload flag and ignore relative operation when
iavf hash uninit to avoid reset/close error.

Fixes: 7be10c3004be ("net/iavf: add RSS configuration for VF")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ice: support flow mark ID in AVX512

Support flow director mark ID parsing from flexible Rx descriptor
in avx512 path.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Tested-by: Qin Sun <qinx.sun@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/ixgbe: remove redundant MAC flag check

The flag of RTE_ETHTYPE_FLAGS_MAC has been checked twice, so remove the
first error message "Not supported by ethertype filter" which is not so
specific, and keep the error message "mac compare is unsupported" which
aligns to the definition of RTE_ETHTYPE_FLAGS_MAC.

Fixes: eb3539fc8550 ("net/ixgbe: parse ethertype filter")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/bnxt: fix assignment instead of comparison

Use comparison operator instead of incorrectly using the assignment
operator.

Coverity issue: 363566, 36357
Fixes: 42c40f8902f7 ("net/bnxt: consolidate template table processing")
Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

net/txgbe: fix build by simplifying xstats return

When DPDK is compiled with gcc 7.5 with the optimization level set to 1
gcc sees the 'offset' variable in txgbe_ethdev.c as possibly being
uninitialised.
The 'txgbe_get_offset_by_id()' return value, "-(int)(id + 1)", seems
confusing gcc that it assumes '0' can be returned in the failure case.
To correct this the return statement for error case in
'txgbe_get_offset_by_id()' is simplified to return '-1'.

Fixes: 91fe49c87d76 ("net/txgbe: support device xstats")
Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/hinic/base: add message check for command channel

In the command channel, a message may has several fragments,
and the several fragments should have same message id. To
prevent problems, this check is added.

Fixes: 1e4593db1d58 ("net/hinic/base: fix log info for PF command channel")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>

net/iavf: fix performance drop after port reset

Needs to reset rxq->rxrearm_start to 0 when reset_rx_queue(),
otherwise, the random value of rxrearm_start will cause performance drop
due to L3 contested accesses.

Fixes: 69dd4c3d0898 ("net/avf: enable queue and device")
Cc: stable@dpdk.org
Signed-off-by: Leyi Rong <leyi.rong@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

doc: update firmware/driver mapping table for i40e

Update i40e PMD firmware/driver mapping table.

Signed-off-by: Shougang Wang <shougangx.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>

net/ice: fix perfect match for ACL rule

A rule with an imperfect match (wildcarding) will be routed through
ACL. A perfect match should be rejected by ACL.

Fixes: 40d466fa9f76 ("net/ice: support ACL filter in DCF")
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/af_xdp: fix pointer storage size

'uint64_t' is used to hold the pointer, for 32-bits build this
assumption is wrong and giving following build error:

rte_eth_af_xdp.c: In function ‘xdp_umem_configure’:
rte_eth_af_xdp.c:970:15:
    error: cast to pointer from integer of different size
           [-Werror=int-to-pointer-cast]
  970 |   base_addr = (void *)get_base_addr(mb_pool, &align);
      |               ^

Replacing the 'uint64_t' return type of the 'get_base_addr()' to the
'uintptr_t'.
Although not sure if the overall logic supports the 32-bits, using
'uintptr_t' should be safe both for 64/32 bits.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>

app/testpmd: support age shared action context

When an age action becomes aged-out the next call for
rte_flow_get_aged_flows API should return the action context supplied
by the action configuration structure.

In case the age action is created by the shared action API, the shared
action context of the Testpmd application was not set.

In addition, the application handler of the contexts returned by the
rte_flow_get_aged_flows API didn't consider the fact that the action
could be set by the shared action API and considered it as regular flow
context.

This caused a crash in Testpmd when the context is parsed.

This patch set context type in the flow and shared action context and
uses it to parse the aged-out contexts correctly.

Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Dekel Peled <dekelp@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/mlx5: fix shared RSS action release

As shared RSS action will be shared by multiple flows, the action
is created as global standalone action and managed only by the
relevant shared action management functions.

Currently, hrxqs will be created by shared RSS action or general
queue action. For hrxqs created by shared RSS action, they should
also only be released with shared RSS action. It's not correct to
release the shared RSS action hrxqs as general queue actions do
in flow destroy.

This commit adds a new fate action type for shared RSS action to
handle the shared RSS action hrxq release correctly.

Fixes: e1592b6c4dea ("net/mlx5: make Rx queue thread safe")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix missing meter packet

For transfer flow with meter, packet was passed without applying flow
action. The group level was multiplied by 10 for group level 65531.

This patch fixes this issue by correcting suffix table group level
calculation.

Fixes: 3e8f3e51fd93 ("net/mlx5: fix meter table definitions")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/iavf: check RSS rule queue region size

When a rule is set to do RSS to redirect flows to a group of queues, the
queue region size should not be larger than the max RSS queue region
supported by HW. This patch added the step to check the queue region
size, and report failure if the size does not meet the requirement.

Fixes: e436cd43835b ("net/iavf: negotiate large VF and request more queues")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/iavf: fix releasing mbufs

In the function _iavf_rx_queue_release_mbufs_vec to release rx mbufs,
rxq->rxrearm_nb is given the value of rx descriptor number at last.
However, since the process to release and allocate mbufs lacks the
initialization of rxrearm_nb, if we try to release mbufs next time, it
will return without releasing directly. In this patch, rxrearm_nb is
initialized to be zero in rx queue reset.

Fixes: 319c421f3890 ("net/avf: enable SSE Rx Tx")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>

net/i40e: fix build for log format specifier

Build error:
i40e_common.c: In function "i40e_parse_discover_capabilities":
../drivers/net/i40e/base/../i40e_logs.h:43:50:
    error: format "%llX" expects argument of type "long long unsigned
           int", but argument 7 has type "u64" {aka "long unsigned int"
           [-Werror=format=]
   43 |  rte_log(RTE_LOG_ ## level, i40e_logtype_driver, "%s(): " fmt, \
      |                                                  ^~~~~~~~
.../i40e_osdep.h:87:3: note: in expansion of macro "PMD_DRV_LOG_RAW"
   87 |   PMD_DRV_LOG_RAW(DEBUG, "i40e %02x.%x " s,       \
      |   ^~~~~~~~~~~~~~~
.../base/i40e_common.c:4100:4: note: in expansion of macro "i40e_debug"
4100 |    i40e_debug(hw, I40E_DEBUG_INIT,
      |    ^~~~~~~~~~

There are multiple build error because of same reason, all fixed.

Using 'PRIX64' to fix the build error.

Fixes: 889bc9f0cd3a ("i40e/base: unify the capability function")
Fixes: 9b1041574cd4 ("i40e/base: enhance polling of NVM semaphore")
Fixes: 8db9e2a1b232 ("i40e: base driver")
Fixes: 3b7271f3958a ("i40e/base: catch NVM write semaphore timeout and retry")
Cc: stable@dpdk.org
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Luca Boccassi <bluca@debian.org>

net/ice: fix some RSS hash fields

Previous code in ice_rss_hash_set has mismatched hash fields and
headers for UDP and TCP.

Fixes: 16187528a923 ("net/ice/base: refactor RSS configure API")
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/i40e: add C++ include guard

Add extern "C" in rte_pmd_i40e.h when be compiled with CPP.

Fixes: 17e906a1ea9c ("net/i40e: support link status notification")
Cc: stable@dpdk.org
Signed-off-by: Prateek Agarwal <prateekag@cse.iitb.ac.in>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/iavf: check cache pointer before dereference

The return value of rte_mempool_default_cache should be
checked as it can be NULL.

Fixes: 9ab9514c150e ("net/iavf: enable AVX512 for Tx")
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

net/ice: check cache pointer before dereference

The return value of rte_mempool_default_cache should be
checked as it can be NULL.

Fixes: a4e480de268e ("net/ice: optimize Tx by using AVX512")
Reported-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>

net/i40e: fix flow director flex configuration

The configuration of FDIR flex mask and flex pit should not be set
during flow validate. It should be set when flow create.

Fixes: 6ced3dd72f5f ("net/i40e: support flexible payload parsing for FDIR")
Cc: stable@dpdk.org
Signed-off-by: Chenxu Di <chenxux.di@intel.com>
Tested-by: Jun W Zhou <junx.w.zhou@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>

net/ice: fix crash when device reset

When device resets, it should not implement ACL initialization
which is only executed in DCF. This patch disable ACL init and
uninit when in DPDK PF only mode.

Fixes: 40d466fa9f76 ("net/ice: support ACL filter in DCF")
Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/iavf: fix RXDID configure

When configure Rx queue by virtchnnl, the RXDID (Rx Descriptor ID)
should be configured only if the Rx queue has been set up.

Fixes: 12b435bf8f2f ("net/iavf: support flex desc metadata extraction")
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Tested-by: Yingya Han <yingyax.han@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

doc: add large iavf support to release notes

Update release note for large VF, supporting up to 256 queue pairs per
VF.

Fixes: e436cd43835b ("net/iavf: negotiate large VF and request more queues")
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>

net/hns3: use unsigned types for bit operator

According to bit operator reliability style, variables in
the right expression participating int bit operation
must be an unsigned type.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: remove some blank lines

According to the rule of the static check tools
that arrange blank lines properly to keep the
code compact, here remove some unnecessary blank
line to fix the above rule warning.

Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: fix queue state after reset

FLR operation will reset the queue enabling state and
the driver needs to restore the state after reset.
If the driver does not restore the state, it will result
in unpredictable behavior with reset when user start or
stop queue by calling the relevant function if.

This patch fix it by add a queue enabling state restore
function to the reset handler.

Fixes: fa29fe45a7b4 ("net/hns3: support queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: check PCI config space write

Here adds a check for the return value when calling
rte_pci_write_config.

Coverity issue: 363714
Fixes: cea37e513329 ("net/hns3: fix FLR reset")
Cc: stable@dpdk.org
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: adjust code style for struct initialization

According to the rule of the used static check tool,
each member is initialized on a separate lines when
struct and union members are initialized, here is
tempting to adjust some code lines in order to remove
the warning.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/hns3: use correct logging format specifiers

In current driver print log function, some print format
symbols does not match with the actual variable types.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>

net/sfc: use more robust string comparison

When it comes to comparing HW switch ID strings,
use strncmp to avoid reading past the buffer.

Fixes: e86b48aa46d4 ("net/sfc: add HW switch ID helpers")
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

common/sfc_efx/base: improve MCDI version/boot clarity

Improve the clarity of the code.

Fixes: 833cfcd590e2 ("common/sfc_efx/base: add API for querying board info")
Fixes: 312191e86eb0 ("common/sfc_efx/base: refactor version / boot info get helper")
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

net/txgbe: remove direct use of compiler attribute

Remove direct use of compiler attribute.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

vhost: fix virtqueue initialization

This patches fixes virtqueue initialization issue causing
segfault or file descriptor being closed unexpectedly.

The wrong index was passed to init_vring_queue() by
alloc_vring_queue() when a hole in the virtqueue array was
met.

Fixes: 8acd7c213353 ("vhost: fix virtqueues metadata allocation")
Cc: stable@dpdk.org
Reported-by: Yu Jiang <yux.jiang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Tested-by: Yu Jiang <yux.jiang@intel.com>

vhost: fix async inflight packet counter

Async inflight packet counter should take failed packets into account.
Failed packets will be deducted in the error handling logic.

Fixes: 6b3c81db8bb7 ("vhost: simplify async copy completion")
Fixes: cd6760da1076 ("vhost: introduce async enqueue for split ring")
Cc: stable@dpdk.org
Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

examples/vhost_crypto: add new line character in usage

Add new line character(\n) in the usage of vhost_crypto example for
better readability

Fixes: 709521f4c2cd ("examples/vhost_crypto: support multi-core")
Cc: stable@dpdk.org
Signed-off-by: Ibtisam Tariq <ibtisam.tariq@emumba.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

net/mlx5: fix Tx queue completion on stop

The Tx queue completion production index was not reset
on Tx queue stop and there were completions remaining
from the previous queue run. This caused the wrong
completion queue operating and overall Tx queue malfunction
on queue restart.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix Rx queue completion index consistency

The Rx queue completion consumer index got temporary
wrong value pointing to the midst of the compressed CQE
session. If application crashed at the moment the next
queue restart caused handling wrong CQEs pointed by index
and losing consuming index synchronization, that made
reliable queue restart impossible.

Fixes: 88c0733535d6 ("net/mlx5: extend Rx completion with error handling")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix hash list entry assert

The entry variable assert in the mlx5_hlist_register() function is not
correct. Remove the invalid entry variable.

Fixes: e69a59227db0 ("net/mlx5: support concurrent access for hash list")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix use of local array for global error

Recent patch uses a local string array as input for function
rte_flow_error_set().
This stack memory may be later used by other code sections,
overwriting the desired error string.

This patch implements an error string for the specific case
requested, of ICMP item not supported in Verbs flow engine.

Fixes: d51475d1bfa5 ("net/mlx5: support item type error message in flow Verbs")
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix group value of sample suffix flow

mlx5 PMD split the sampling flow into prefix flow and suffix
flow. On the sample action translation function, the scaled
group value of suffix flow be attached into sample object and
saved into sample resource.

mlx5 PMD fetched the group value from the sample resource to
create the suffix flow. On the mlx5_flow_group_to_table
function the group value of suffix flow was scaled with table
factor again and translated into HW table. That caused the
incorrect group value of sample suffix flow.

The fix introduces a 'skip_scale' flag and sets it to 1 for the
sample suffix flow creation. On the mlx5_flow_group_to_table
function skips the scale with table factor to use the correct
group value.

Fixes: 4ec6360de37d ("net/mlx5: implement tunnel offload")
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix switch port id when representor in bonding

In the bonding configurations the port switch id for representors was
composed of pf index in bonding as the 1 MSB and the representor's index
as the remaining 15 LSBs. The special corner case for the host PF
representor on BF setups with representor id 0xFFFF was missed as well.

The new switch port id consists of 4 MSBs for the pf bonding index and
the remaining 12 LSBs for the representor index. The switch port id
ranges for each type of representors are as follows:

Uplink representor(AKA master): 0xFFFF
Host PF representor: 0x<pf_bond>FFF
VF representor: 0x<pf_bond>[0-FFE]

Fixes: bee57a0a3565 ("net/mlx5: update switch port id in bonding configuration")
Cc: stable@dpdk.org
Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix aging queue doorbell ringing

Recent patch introduced a new SQ for ASO flow hit management.
This SQ uses two WQEBB's for each WQE.
The SQ producer index is 16 bits wide.

The enqueue loop posts new WQEs to the ASO SQ, using WQE index for
the SQ management.
This 16 bits index multiplied by 2 was wrongly used also for SQ
doorbell ringing.
The multiplication caused the SW index overlapping to be out of sync
with the hardware index, causing it to get stuck.

This patch separates the WQE index management from the doorbell index
management.
So, for each WQE index incrementation by 1, the doorbell index is
incremented by 2.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Signed-off-by: Dekel Peled <dekelp@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix eCPRI common header endianness

The input header of a RTE flow item is with network byte order. In
the host with little endian, the bit field order are the same as the
byte order.
When checking the eCPRI message type, the wrong field will be selected.
Fixing to use correct field.

Fixes: daa38a8924a0 ("net/mlx5: add flow translation of eCPRI header")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

raw/ifpga/base: check adapter pointer before dereference

In opae_adapter_destroy(), pointer "adapter" is not validated before
passing it to opae_adapter_shm_free() and opae_adapter_mutex_close()
which dereference it.

Coverity issue: 363752
Fixes: e41856b515ce ("raw/ifpga/base: enhance driver reliability in multi-process")
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>

raw/ifpga/base: unlock mutex on Nios init failure

In fme_nios_spi_init(), a mutex is locked for protecting nios
initialization process, the mutex is only unlocked when process
is successful, it should also be unlocked when process fail.

Coverity issue: 363751
Fixes: e41856b515ce ("raw/ifpga/base: enhance driver reliability in multi-process")
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>

common/sfc_efx/base: avoid reading past buffer

Existing field ID validity check does not validate the field
descriptor availability. Make it more rigorous to avoid
reading past the buffer containing field descriptors.

Coverity issue: 363742
Fixes: 370ed675a952 ("common/sfc_efx/base: support setting PPORT in match spec")
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>

net/ring: fix typo in log message

Add a missing space.

Fixes: 869bf6d222bb ("net/ring: fix coding style")
Cc: stable@dpdk.org
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/txgbe: replace forbidden functions

Remove rte_panic(), and use rte_atomic_thread_fence()
instead of rte_smp_[r/w]mb.

Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>

net/mlx5: fix Tx queue stop state

The Tx queue stop API doesn't call the PMD callback when the state of
the queue is stopped.
The drivers should update the state to be stopped when the queue stop
callback is done successfully or when the port is stopped.
The drivers should update the state to be started when the queue start
callback is done successfully or when the port is started.

The driver wrongly didn't update the state as started when the port
start callback was done which kept the state as stopped.
Following call to a queue stop API was not completed by ethdev layer
because the state is already stopped.

Move the state update from the Tx queue setup to the port start
callback.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix Tx queue reference count check

The Txq refcnt 1 value means that there is no real reference to the
queue and only the control configurations are saved in the struct.

The patch below wrongly didn't consider it and caused a leak in the Txq
object resource.

Revert the specific update in the refcnt.

Fixes: b5c8b3e70cdf ("net/mlx5: use C11 atomics for RxQ/TxQ refcounts")
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

common/mlx5: free MR resource on device DMA unmap

mlx5 PMD created the MR (Memory Region) resource on the
mlx5_dma_map call to make the memory available for DMA
operations. On the mlx5_dma_unmap call the MR resource
was not freed but inserted to MR Free list for further
garbage collection.
Actual MR resource destroying happened on device stop
call. That caused the runtime out of memory in case of
application performed multiple DMA map/unmap calls.

The fix immediately frees the MR resource on mlx5_dma_unmap
call not engaging the list. The export for mlx5_mr_free
function from common PMD part is added as well.

Fixes: 989e999d9305 ("net/mlx5: support PCI device DMA map and unmap")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Wang <jiaweiw@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

net/mlx5: fix SQ resources release in error flow

Fix in error flow in which the function
mlx5_txq_release_devx_sq_resources is called twice by setting the
release object to NULL after the first call

The incorrect flow was introduced in the work done on generic
object creation.

Once an error flow inside mlx5_txq_create_devx_sq_resources
occurs the function will call mlx5_txq_release_devx_sq_resources
however the released pointers are not set to NULL after the release
calls and undefined memory is released in the same call in
mlx5_txq_release_devx_resources.

This results in calls to MLX5_FREE with
an already released memory addresses and assert in mlx5_release_dbr:

EAL: Error: Invalid memory
EAL: Error: Invalid memory

PANIC in mlx5_txq_release_devx_sq_resources():
assert "(mlx5_release_dbr(&txq_obj->txq_ctrl->priv->dbrpgs,
mlx5_os_get_umem_id (txq_obj->sq_dbrec_page->umem),
txq_obj->sq_dbrec_offset)) == 0" failed

The fix is setting the released pointers to NULL after the first release
calls.

Fixes: 86d259cec852 ("net/mlx5: separate Tx queue object creations")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/mlx5: fix Rx queue object allocation with MPRQ

The space for extra buffer pointers used by MPRQ routines was not
allocated in Rx queue object creation structure causing memory
corruption.
The fix allocates the extra memory for the pointers in case MPRQ is
engaged.

Fixes: a0a45e8af723 ("net/mlx5: configure Rx queue for buffer split")
Signed-off-by: Ophir Munk <ophirmu@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

net/bnxt: remove useless prefetches

Prefetching only helps performance if it is done several 100
instructions before the actual use. The purpose of the prefetch
is to read ahead, it doesn't help if the next instruction
will block.

The code in the bnxt driver was doing these unnecessary prefetches.

Fixes: 2eb53b134aae ("net/bnxt: add initial Rx code")
Cc: stable@dpdk.org
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

config: enable packet prefetching with Meson

With Make build system, RTE_PMD_PACKET_PREFETCH was enabled
by default. It got lost when transitioning to Meson build
system.

In order to avoid performance changes, this patch enables
packet prefetching in rte_config.h.

Fixes: 9314afb68a53 ("drivers: add infrastructure for meson build")
Cc: stable@dpdk.org
Reported-by: Marvin Liu <yong.liu@intel.com>
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>

gro: fix packet type detection with IPv6 tunnel

For VxLAN packets, GRO will mistakenly reassemble them
if inner L3 is IPv6, inner L4 is TCP or UDP, and outer L3
is IPv4 because the value of IS_IPV4_VXLAN_TCP4/UDP4_PKT
is true for them.

This fix makes sure IS_IPV4_TCP_PKT, IS_IPV4_UDP_PKT,
IS_IPV4_VXLAN_TCP4_PKT and IS_IPV4_VXLAN_UDP4_PKT can make
decision precisely.

Fixes: e2d811063673 ("gro: support VXLAN UDP/IPv4")
Fixes: 1ca5e6740852 ("gro: support UDP/IPv4")
Fixes: 9e0b9d2ec0f4 ("gro: support VxLAN GRO")
Fixes: 0d2cbe59b719 ("lib/gro: support TCP/IPv4")
Cc: stable@dpdk.org
Signed-off-by: Yi Yang <yangyi01@inspur.com>
Acked-by: Jiayu Hu <jiayu.hu@intel.com>

net/mlx5: fix UAR used by ASO queues

The dedicated UAR was allocated for the ASO queues.
The shared UAR created for Tx queues can be used instead.

Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging")
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

vdpa/mlx5: fix UAR allocation

This patch provides the UAR allocation workaround for the
hosts where UAR allocation with Write-Combining memory
mapping type fails.

Fixes: 8395927cdfaf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

regex/mlx5: fix UAR allocation

This patch provides the UAR allocation workaround for the
hosts where UAR allocation with Write-Combining memory
mapping type fails.

Fixes: b34d816363b5 ("regex/mlx5: support rules import")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

common/mlx5: share UAR allocation routine

This patch introduces the routine to allocate the UAR (User
Access Region) with various memory mapping types. The origin
patch being fixed provided the UAR allocation workaround
for the mlx5 net PMD only. As it was found the other mlx5
based drivers - vdpa and regex are affected by the issue
as well and must be fixed.

Fixes: a0bfe9d56f74 ("net/mlx5: fix UAR memory mapping type")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>

usertools: fix CPU layout script to be PEP8 compliant

The pycodestyle tool flagged the following issues, which are now fixed.

$ pycodestyle cpu_layout.py
cpu_layout.py:18:5: E722 do not use bare 'except'
cpu_layout.py:62:14: E231 missing whitespace after ','

Fixes: deb87e6777c0 ("usertools: use sysfs for CPU layout")
Fixes: c9208f1dc967 ("usertools: fix CPU layout with python 3")
Cc: stable@dpdk.org
Signed-off-by: Ciara Power <ciara.power@intel.com>

raw/ioat: fix queue index calculation

Coverity flags a possible problem where the 8-bit wq_idx value may have
errors when shifted and sign-extended to pointer size. Since this can
only occur if the shift index is larger than any expected value from
hardware, it's unlikely to cause any real problems, but we can eliminate
any possible errors, and the coverity issue, by explicitly typecasting
the uint8_t value to uintptr_t before any shift operations occur.

Coverity issue: 363695
Fixes: a33969462135 ("raw/ioat: fix work-queue config size")
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

build: fix MS linker flag with meson 0.54

Meson versions >= 0.54.0 include support for handling /implib
with msvc link. Specifying it explicitly causes failures when
linking against the dll. Tested using Link 14.27.29112.0 and
Clang 11.0.0.

There were a number of changes to the way that import libraries
are handled between 0.47.1 and 0.54.0. Only make the change
for >= 0.54.0, leaving the behaviour unchanged for earlier
versions.

Fixes: 77cca7ccec13 ("build: fix drivers library path on Windows")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Khoa To <khot@microsoft.com>

build: fix install on Windows

Don't run symlink-drivers-solibs.sh as part of 'install' because
Windows doesn't support shell scripts.

Fixes: 82ba4416dd87 ("build: add module definition files for Windows")
Cc: stable@dpdk.org
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Tested-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>

examples/qos_sched: fix usage string

The short option written for interactive mode is --i in usage of
this qos_sched example. Actually, it is -i.

Fixes: cfd5c971e5e ("examples/qos_sched: add stats")
Cc: stable@dpdk.org
Signed-off-by: Ibtisam Tariq <ibtisam.tariq@emumba.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>

table: fix exact match SWX table lookup

Fix for the exact match lookup function.

Fixes: d0a00966618b ("table: add exact match SWX table")
Signed-off-by: Churchill Khangar <churchill.khangar@intel.com>
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

doc: describe the SWX pipeline type

Add the new SWX pipeline type to the Programmer's Guide.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>

crypto/armv8: replace meson option with pkg-config support

With pkg-config support available within AArch64crypto library,
meson option 'armv8_crypto_dir' can be removed.
PKG_CONFIG_PATH environment variable should be set appropriately
to use the crypto library.

Suggested-by: Thomas Monjalon <thomas@monjalon.net>
Signed-off-by: Dharmik Thakkar <dharmik.thakkar@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>

eal/arm: fix clang build of native target

When doing Clang build with '-mcpu=native' on N1 platform, build failed
with:
../lib/librte_eal/arm/include/rte_atomic_64.h:76:39:
error: instruction requires: lse
__ATOMIC128_CAS_OP(__cas_128_release, "caspl")

This is because native detection for Neoverse N1 was added in Clang-11.
Prior version of Clang's assembler doesn't know LSE support on hardware.
Fixed this for Clang earlier than version 11 by specifying architecture
for assembler.
Referred to [1] for this fix.

Fixes: 7e2c3e17fe2c ("eal/arm64: add 128-bit atomic compare exchange")
Cc: stable@dpdk.org
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e0d5896bd356cd577f9710a02d7a474cdf58426b

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>

vfio: use static window sizing for sPAPR IOMMU

The SPAPR IOMMU requires that a DMA window size be defined before memory
can be mapped for DMA. Current code dynamically modifies the DMA window
size in response to every new memory allocation which is potentially
dangerous because all existing mappings need to be unmapped/remapped in
order to resize the DMA window, leaving hardware holding IOVA addresses
that are temporarily unmapped. The new SPAPR code statically assigns
the DMA window size on first use, using the largest physical memory
memory address when IOVA=PA and the highest existing memseg virtual
address when IOVA=VA.

Signed-off-by: David Christensen <drc@linux.vnet.ibm.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>

devtools: fix directory filter in forbidden token check

checkpatches.sh current complains on a patch [1] adding
ALLOW_EXPERIMENTAL_API in an example while this check is for app, lib
and drivers directories:

Warning in examples/ethtool/ethtool-app/Makefile:
Using experimental build flag for in-tree compilation

The regexp on entering files concerned by this filter is incorrect.
In the [1] case, the file full name is matched against "app" rather than
"+++ b/app".

1: https://patchwork.dpdk.org/patch/83902/

Fixes: 7413e7f2aeb3 ("devtools: alert on new calls to exit from libs")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>

examples: stop processing meson file if build impossible

Once it has been determined that an example cannot be built, there is
little point in continuing to process the meson.build file for that
example, so we can use subdir_done() to return to the calling file.
This can potentially prevent problems where later statement in the file
may cause an error on systems where the app cannot be built, e.g. on
Windows or FreeBSD.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

examples/l2fwd-keepalive: skip meson build if no librt

When librt is not present on a system, processing the meson.build file
for this example application causes an error. Make the library
non-mandatory and just mark the example as unbuildable if it is
not present.

Fixes: 89f0711f9ddf ("examples: build some samples with meson")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

examples: fix flattening directory layout on install

By installing the examples one-by-one in a loop in the examples
meson.build file we effectively flattened out the structure of the examples
folder and omitted some common and shared subfolders that were never
directly built. Instead, we can remove the loop and just have the whole
"examples" folder installed as-is in a single statement, preserving its
directory structure, and thereby fixing the build of a number of the
examples.

Fixes: 2daf565f91b5 ("examples: install as part of ninja install")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

mbuf: move pool pointer in first half

According to the Technical Board decision
(http://mails.dpdk.org/archives/dev/2020-November/191859.html),
the mempool pointer in the mbuf struct is moved
from the second to the first half.
It may increase performance in some cases
on systems having 64-byte cache line, i.e. mbuf split in two cache lines.

Due to this change, all fields after "pool" are moved up.
Hopefully no vector data path is impacted.

Moving this field gives more space to dynfield1
while dropping the temporary dynfield0.

This is how the mbuf layout looks like (pahole-style):

word  type                              name                byte  size
0    void *                            buf_addr;         /*   0 +  8 */
1    rte_iova_t                        buf_iova          /*   8 +  8 */
      /* --- RTE_MARKER64               rearm_data;                   */
2    uint16_t                          data_off;         /*  16 +  2 */
      uint16_t                          refcnt;           /*  18 +  2 */
      uint16_t                          nb_segs;          /*  20 +  2 */
      uint16_t                          port;             /*  22 +  2 */
3    uint64_t                          ol_flags;         /*  24 +  8 */
      /* --- RTE_MARKER                 rx_descriptor_fields1;        */
4    uint32_t             union        packet_type;      /*  32 +  4 */
      uint32_t                          pkt_len;          /*  36 +  4 */
5    uint16_t                          data_len;         /*  40 +  2 */
      uint16_t                          vlan_tci;         /*  42 +  2 */
5.5  uint64_t             union        hash;             /*  44 +  8 */
6.5  uint16_t                          vlan_tci_outer;   /*  52 +  2 */
      uint16_t                          buf_len;          /*  54 +  2 */
7    struct rte_mempool *              pool;             /*  56 +  8 */
      /* --- RTE_MARKER                 cacheline1;                   */
8    struct rte_mbuf *                 next;             /*  64 +  8 */
9    uint64_t             union        tx_offload;       /*  72 +  8 */
10    struct rte_mbuf_ext_shared_info * shinfo;           /*  80 +  8 */
11    uint16_t                          priv_size;        /*  88 +  2 */
      uint16_t                          timesync;         /*  90 +  2 */
11.5  uint32_t                          dynfield1[9];     /*  92 + 36 */
16    /* --- END                                             128      */

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>

drivers: disable OCTEON TX2 in 32-bit build

The drivers for OCTEON TX2 are not supported in 32-bit mode.

Suggested-by: Jerin Jacob <jerinj@marvell.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Jerin Jacob <jerinj@marvell.com>

devtools: allow custom set of examples in build test

To test the installation process of DPDK using "ninja install"
test-meson-builds.sh builds a subset of the examples using "make". To allow
more flexibility for people testing, allow the set of examples chosen for
this make test to be overridden using variable "DPDK_BUILD_TEST_EXAMPLES"
in the environment.

Since a number of example apps link against drivers directly even for
shared builds, we need to ensure that LD_LIBRARY_PATH points to the main
DPDK lib folder so any dependencies of those drivers can be found e.g. that
the PCI/vdev bus driver .so is found. [All drivers are symlinked from
drivers dir back to lib dir on install, so only one dir rather than two is
needed in the path.]

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>

devtools: fix x86-default build test install env

The x86-default environment was loaded after installing this target.
I did not see any problem with it, yet we should load corresponding
environment before installing a target.

Fixes: bd253daa7717 ("devtools: fix test of ninja install")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>

devtools: reduce build test verbosity

The default verbosity of test-meson-builds.sh is to be quiet.
In order to better apply the verbosity policy, some file descriptors
are open to redirect to stdout or /dev/null accordingly.

The target variable and meson/ninja commands are printed in verbose modes.
The installation commands are printed only in very verbose mode.
The examples build commands are printed only in very verbose mode.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>