Qi Zhang [Tue, 15 Dec 2020 04:21:21 +0000 (12:21 +0800)]
net/ice/base: support checking double VLAN mode
If a driver wants to configure double VLAN mode (DVM) it needs to
first check if the DDP supports DVM. To do this the driver needs to read
the package metadata section via the upload section AQ (0x04C1).
If the DDP doesn't support configuring double VLAN mode (DVM), then
there is nothing to do regarding configuring the VLAN mode of the
device.
The set_svm() or set_dvm() ops should only be called if the current
configuration supports configuring the VLAN mode of the device.
Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Brett Creeley <brett.creeley@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 15 Dec 2020 04:13:57 +0000 (12:13 +0800)]
net/ice/base: fix tunnel destroy
The TCAM information in the AQ command buffer was not correct when destroying
the tunnel entries. The TCAM count was always one even when multiple entries
were destroyed, and the offset of the TCAM memory was also incorrect.
This patch fixes the issue.
Qi Zhang [Tue, 15 Dec 2020 04:00:55 +0000 (12:00 +0800)]
net/ice/base: add interface to support configuring VLAN mode
The VLAN mode of the device has to be configured while the global
configuration lock is held during the DDP download, specifically after
the DDP has been downloaded. To support this, a VLAN mode
interface was added. By default the device will stay in single VLAN
mode (SVM), which is the current implementation. However, this can be
changed by implementing the .set_dvm op.
Qi Zhang [Tue, 15 Dec 2020 02:44:39 +0000 (10:44 +0800)]
net/ice/base: implement inactive NVM version get
Similar to ice_get_inactive_orom_ver, add a function to read the NVM
version data from the inactive section of flash. The primary motivation
of this function is to allow the driver to report the version of
a pending update that has not yet been activated.
To do this, refactor ice_get_nvm_ver_info to allow it to take a bank
parameter. Read from the copy of the Shadow RAM in the NVM bank, rather
than reading from the RAM copy that is loaded by the device. This
ensures we get the accurate value when reading the inactive section.
Note that the start of the Shadow RAM copy does not directly follow the
CSS header, but is aligned to the next 64-byte boundary. The
correct word offset must therefore be rounded up to a multiple of 32 words.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
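The alignment rule in the inactive NVM version commit above can be sketched as follows. This is a minimal illustration, assuming 16-bit NVM words (so a 64-byte boundary is 32 words); the helper name and the macro are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Assumption: NVM words are 16 bits, so 64 bytes == 32 words. */
#define SR_COPY_ALIGN_WORDS 32u

/*
 * Hypothetical helper: given the CSS header length in words, return the
 * word offset of the Shadow RAM copy, rounded up to the next 64-byte
 * (32-word) boundary.
 */
static uint32_t sr_copy_word_offset(uint32_t css_hdr_len_words)
{
	return ((css_hdr_len_words + SR_COPY_ALIGN_WORDS - 1) /
		SR_COPY_ALIGN_WORDS) * SR_COPY_ALIGN_WORDS;
}
```

A header length that is already a multiple of 32 words is returned unchanged; any other length moves up to the next multiple.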
Qi Zhang [Tue, 15 Dec 2020 02:39:28 +0000 (10:39 +0800)]
net/ice/base: read option ROM combo version from CIVD
The driver currently reads the combo image version data from within the
Boot Configuration TLV block of the PFA area of the NVM. This allows
access to the active Option ROM version data, assuming that it has been
properly copied into this section.
There is no equivalent method for reading the Option ROM version data
from a pending Option ROM update, as it will not yet have been copied
into the PFA boot configuration block. Instead, replace this
implementation with one which scans for the CIVD data section of the
Option ROM image data.
This CIVD data is stored in a packed structured format within the Option
ROM. It is always aligned to a 512 byte boundary, and starts with
a special '$CIV' 4-byte signature. Data integrity is checked using
a simple modulo 256 sum of the structure bytes.
Implement a new ice_get_orom_civd_data function which allows reading
from the selected flash bank (active or inactive), and scans for valid
CIVD data. Use this instead of the boot configuration TLV in order to
report the combo version data of precisely what is in the Option ROM
data.
To allow access to reading the inactive Option ROM bank, introduce a new
ice_get_inactive_orom_ver function. Use of a new function is done in
order to avoid leaking the bank selection abstraction outside of
ice_nvm.c
With this new function, the driver can now read and display the version
of the to-be-activated Option ROM when an update has been initiated but
not yet finalized.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
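The CIVD scan described in the commit above can be sketched roughly as below. This is not the driver's implementation: the function name, the caller-supplied structure length, and the rule that a valid block sums to zero modulo 256 are assumptions made for illustration. Only the 512-byte alignment, the '$CIV' signature, and the modulo 256 sum come from the commit message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CIVD_ALIGN 512u /* alignment stated in the commit message */

/*
 * Hypothetical helper: return the byte offset of the first valid CIVD
 * block in an Option ROM image, or -1 if none is found. The structure
 * length (civd_len) and the "sum must be zero" rule are assumptions.
 */
static long find_civd(const uint8_t *orom, size_t orom_len, size_t civd_len)
{
	size_t off, i;

	if (orom == NULL || civd_len < 4)
		return -1;

	for (off = 0; off + civd_len <= orom_len; off += CIVD_ALIGN) {
		uint8_t sum = 0;

		/* Every candidate must start with the 4-byte signature. */
		if (memcmp(orom + off, "$CIV", 4) != 0)
			continue;

		/* uint8_t arithmetic wraps, giving a modulo 256 sum. */
		for (i = 0; i < civd_len; i++)
			sum = (uint8_t)(sum + orom[off + i]);

		if (sum == 0)
			return (long)off;
	}

	return -1;
}
```

Scanning only at multiples of 512 keeps the search cheap, and the checksum filters out stray byte sequences that happen to match the signature.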
Qi Zhang [Tue, 15 Dec 2020 02:29:16 +0000 (10:29 +0800)]
net/ice/base: allow flash read with arbitrary size
Refactor ice_read_flash_module so that it takes a size and a length
value, rather than always reading in 2-byte increments. The
ice_read_nvm_module and ice_read_orom_module wrapper functions will
still read a u16 with the byte-swapping enabled.
This will be used in a future change to implement reading of the CIVD
data from the Option ROM module.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Modify ice_get_nvm_srev and ice_get_orom_srev to take the
ice_flash_bank enumeration that specifies whether to read from the
active or the inactive flash module. Rename and refactor the
ice_read_active_nvm_module and ice_read_active_orom_module functions to
take the bank enum value as well.
With this change, ice_get_nvm_srev and ice_get_orom_srev will be usable
in a future change to implement reading the version data for a pending
flash image.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 15 Dec 2020 01:45:41 +0000 (09:45 +0800)]
net/ice/base: refactor interface for flash read
The ice_read_flash_module interface was introduced for reading from the
various NVM modules.
Its purpose is two-fold. First, it enables reading data from the CSS
header, which allows access to the image security revisions. Second, it
allows reading from either the 1st or the 2nd NVM bank. This interface
was necessary because the device has two copies of each module. Only one
bank is active at a time, but it could be different for each module. The
driver had to determine which bank was active and then use that to
calculate the offset into the flash to read.
Future plans include allowing access to read not just from the active
flash bank, but also the inactive bank. This will be useful for enabling
display of the version information for a pending flash update.
The current abstraction in ice_read_flash_module is to specify the exact
bank to read. This requires callers to know whether to read from the 1st
or 2nd flash bank. This is the wrong abstraction level, since in most
cases the decision point from a caller's perspective is whether to read
from the active bank or the inactive bank.
Add a new ice_bank_select enumeration, used to indicate whether a flow
wants to read from the active, or inactive flash bank. Refactor
ice_read_flash_module to take this new enumeration instead of a raw
flash bank.
Have ice_read_flash_module select which bank to read from based on the
cached data we load during NVM initialization. With this change, it will
become easier to implement reading version data from the inactive flash
banks in a future change.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
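The change of abstraction described in the commit above (callers say "active" or "inactive", and the read path resolves that to a physical bank) might look like the sketch below. All enum and function names here are hypothetical stand-ins, not the identifiers used in ice_nvm.c.

```c
/* Hypothetical physical bank identifiers. */
enum flash_bank {
	BANK_1ST = 1,
	BANK_2ND = 2
};

/* Hypothetical caller-facing selector. */
enum bank_select {
	SELECT_ACTIVE,
	SELECT_INACTIVE
};

/*
 * Resolve the caller's intent to a physical bank, using the active bank
 * that was cached for this module during NVM initialization.
 */
static enum flash_bank resolve_bank(enum flash_bank cached_active,
				    enum bank_select sel)
{
	if (sel == SELECT_ACTIVE)
		return cached_active;

	/* The inactive bank is simply the other one of the pair. */
	return cached_active == BANK_1ST ? BANK_2ND : BANK_1ST;
}
```

With this shape, callers never need to know which physical bank is currently live for a given module.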
Dapeng Yu [Wed, 23 Dec 2020 05:30:18 +0000 (13:30 +0800)]
net/ice: check Rx queue number on RSS init
When RSS is initialized, the number of Rx queues is used as the denominator
when setting default values in the RSS lookup table. If it is zero, a
division by zero occurs. Add a check on the value to avoid the error.
Fixes: 50370662b727 ("net/ice: support device and queue ops") Cc: stable@dpdk.org Signed-off-by: Dapeng Yu <dapengx.yu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
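A minimal sketch of the guard described in the RSS init commit above, assuming a simple round-robin fill of the lookup table. The function name and table layout are hypothetical and do not reproduce the ice PMD code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: spread lookup table entries round-robin across
 * the configured Rx queues. A zero queue count is rejected up front,
 * because it would otherwise be used as the divisor below.
 */
static int rss_lut_fill(uint8_t *lut, size_t lut_size, uint16_t nb_rx_queues)
{
	size_t i;

	if (lut == NULL || nb_rx_queues == 0)
		return -EINVAL;

	for (i = 0; i < lut_size; i++)
		lut[i] = (uint8_t)(i % nb_rx_queues);

	return 0;
}
```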
Xuan Ding [Wed, 23 Dec 2020 12:52:28 +0000 (12:52 +0000)]
net/iavf: improve default RSS
Add support to actively configure the RSS through port config.
Any kernel PF enabled default RSS will be disabled during
initialization.
Besides, default RSS will be configured based on
rte_eth_rss_conf->rss_hf.
Currently supported default rss_type: ipv4[6], ipv4[6]_udp, ipv4[6]_tcp,
ipv4[6]_sctp.
Signed-off-by: Xuan Ding <xuan.ding@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Junfeng Guo [Mon, 14 Dec 2020 06:49:09 +0000 (14:49 +0800)]
common/iavf: support eCPRI protocol header fields
Add eCPRI header and its field selectors, including MSG_TYPE, PCID
and RTCID. Since the offset of PCID is same as RTCID, we just add one
MACRO for these two fields. For MSG Type 0, ecpriRtcid/ecpriPcid field
within the eCPRI header will be extracted to Field Vector for FDIR and
RSS.
SPEC for eCPRI:
http://www.cpri.info/downloads/eCPRI_v_2.0_2019_05_10c.pdf
Dapeng Yu [Tue, 15 Dec 2020 10:10:31 +0000 (18:10 +0800)]
net/ixgbe: fix flex bytes flow director rule
When a flexbytes flow director rule is created, the FDIRCTRL.FLEX_OFFSET
register is set, and it keeps its effect even after the flow director
flexbytes rule is destroyed, causing packets to be transferred to the
wrong place.
This is because, according to the datasheet, setting FDIRCTRL is only
permitted during the Flow Director initialization flow or when clearing the
Flow Director table; otherwise the device may behave unexpectedly.
To work around this limitation, simulate the Flow Director
initialization flow or the clearing of the Flow Director table by setting
FDIRCMD.CLEARHT to 0x1B and then clearing it back to 0x0B.
Fixes: f35fec63dde1 ("net/ixgbe: enable flex bytes for generic flow API") Cc: stable@dpdk.org Signed-off-by: Dapeng Yu <dapengx.yu@intel.com> Tested-by: Jun W Zhou <junx.w.zhou@intel.com> Acked-by: Jeff Guo <jia.guo@intel.com>
Souvik Dey [Tue, 15 Dec 2020 13:28:15 +0000 (08:28 -0500)]
net/i40e: fix VLAN stripping in VF
When a VF adds a VLAN, the Linux PF driver enables VLAN stripping by default.
This might cause issues if the app configured DEV_RX_OFFLOAD_VLAN_STRIP.
This behavior of the Linux driver causes confusion with a DPDK app
using i40e_pmd. So it is better to reconfigure the vlan_offload, which
checks for the DEV_RX_OFFLOAD_VLAN_STRIP flag in the dev_conf and enables or
disables VLAN stripping in the PF.
The application cannot use rte_eth_dev_set_vlan_offload() to set
VLAN_STRIP, as this only works the first time, when the
original and current configurations mismatch; all subsequent calls
are ignored.
Fixes: 4861cde46116 ("i40e: new poll mode driver") Cc: stable@dpdk.org Signed-off-by: Souvik Dey <sodey@rbbn.com> Acked-by: Jeff Guo <jia.guo@intel.com>
Igor Ryzhov [Tue, 17 Nov 2020 08:56:39 +0000 (11:56 +0300)]
net/i40e: fix stats counters
When low and high registers are read separately, this opens the door to
a race condition:
- low register is read
- NIC updates the registers
- high register is read
Because of this, we may end up with an incorrect counter value.
Let's read the registers in one shot, as it is done in Linux kernel
since the introduction of the i40e driver.
Fixes: 4861cde46116 ("i40e: new poll mode driver") Cc: stable@dpdk.org Signed-off-by: Igor Ryzhov <iryzhov@nfware.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
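The race described in the stats commit above can be illustrated with plain integers. This is only a sketch: the function names are hypothetical, and whether a single 64-bit access is actually performed atomically depends on the platform and bus, which is assumed here.

```c
#include <stdint.h>

/*
 * Combine a low half sampled before the NIC updated the counter with a
 * high half sampled after it. The result can match neither the old nor
 * the new counter value.
 */
static uint64_t combine_split(uint32_t lo_before, uint32_t hi_after)
{
	return ((uint64_t)hi_after << 32) | lo_before;
}

/*
 * Hypothetical one-shot read: both halves come from the same access, so
 * no update can land between them (assuming the platform performs the
 * 64-bit access as a single read).
 */
static uint64_t read_one_shot(const volatile uint64_t *reg)
{
	return *reg;
}
```

For example, if the counter moves from 0x00000000FFFFFFFF to 0x0000000100000000 between the two split reads, the combined value is 0x00000001FFFFFFFF, which is off by almost 2^32.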
Liron Himi [Wed, 16 Dec 2020 21:36:52 +0000 (23:36 +0200)]
build: update meson for Marvell Armada drivers
With pkg-config support available within musdk library
(from musdk-release-SDK-10.3.5.0-PR2 version),
meson option 'lib_musdk_dir' can be removed.
PKG_CONFIG_PATH environment variable should be set appropriately
to use the musdk library.
Docs are updated with the new musdk version and meson instructions.
net/bonding: fix PCI address comparison on non-PCI ports
The bonding PMD will iterate over all available ETH ports and for each,
compare a chunk of bytes at an offset that would correspond to the PCI
address in an rte_pci_device.
This is incorrect and unsafe. Also, the rte_device using this PCI
address has already been found, so there is no need to compare the PCI
address of all eth devices again.
Refactoring the code to fix this, the initial check to find the PCI bus
is out of scope.
Fixes: c848b518bbc7 ("net/bonding: support bifurcated driver in eal") Cc: stable@dpdk.org Signed-off-by: Gaetan Rivet <grive@u256.net> Acked-by: Min Hu (Connor) <humin29@huawei.com>
The buffer split Rx offload is not compatible with Multi-Packet
Receiving Queue (MPRQ) Rx offload, hence, the buffer split
offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT and other related
values should be advertised only if there is no MPRQ engaged.
The wrong index is used to find mbufs belonging to an application in
the rxq_free_elts_sprq() function in the case of vectorized MPRQ.
elts_ci points to the last allocated mbuf in this case, not rq_ci.
Use this field to avoid double free of mbuf and segmentation fault.
Gregory Etelson [Tue, 8 Dec 2020 08:17:05 +0000 (10:17 +0200)]
net/mlx5: fix Direct Verbs flow descriptor allocation
Initialize flow descriptor tunnel member during flow creation.
Prevent access to stale data and pointers when flow descriptor is
reallocated after release.
Fix flow index validation.
Fixes: e7bfa3596a0a ("net/mlx5: separate the flow handle resource") Fixes: 8bb81f2649b1 ("net/mlx5: use thread specific flow workspace") Cc: stable@dpdk.org Signed-off-by: Gregory Etelson <getelson@nvidia.com> Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Murphy Yang [Tue, 15 Dec 2020 08:10:52 +0000 (08:10 +0000)]
net/ice: fix outer checksum flags
When tunneled packets are received, the testpmd output log shows that the
'ol_flags' value is always 'PKT_RX_OUTER_L4_CKSUM_UNKNOWN', but the expected
value is 'PKT_RX_OUTER_L4_CKSUM_GOOD' or 'PKT_RX_OUTER_L4_CKSUM_BAD'.
Add the 'PKT_RX_OUTER_L4_CKSUM_GOOD' and 'PKT_RX_OUTER_L4_CKSUM_BAD' to
'flags' for normal path, 'l3_l4_flags_shuf' for AVX2 and AVX512 vector
path and 'cksum_flags' for SSE vector path to ensure that the 'ol_flags'
can match correct flags.
Fixes: dbf3c0e77a22 ("net/ice: handle Rx flex descriptor") Fixes: 4ab7dbb0a0f6 ("net/ice: switch to Rx flexible descriptor in AVX path") Fixes: ece1f8a8f1c8 ("net/ice: switch to flexible descriptor in SSE path") Cc: stable@dpdk.org Signed-off-by: Murphy Yang <murphyx.yang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Ting Xu [Mon, 14 Dec 2020 06:04:10 +0000 (14:04 +0800)]
net/iavf: fix memory leak in large VF
This patch fixes the issue that the memory allocated for structure
virtchnl_del_ena_dis_queues is not released at the end of the functions
iavf_enable_queues_lv, iavf_disable_queues_lv and iavf_switch_queue_lv.
Fixes: 9cf9c02bf6ee ("net/iavf: add enable/disable queues for large VF") Cc: stable@dpdk.org Signed-off-by: Ting Xu <ting.xu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Ivan Malov [Fri, 11 Dec 2020 15:34:21 +0000 (18:34 +0300)]
common/sfc_efx/base: check for MAE privilege
VFs can't control MAE, so it's important to override the general
MAE capability bit by taking MAE privilege into account. Reorder
the code slightly to have the privileges queried before datapath
capabilities are discovered and add required MAE privilege check.
Fixes: eb4e80085fae ("common/sfc_efx/base: indicate support for MAE") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Ivan Malov [Fri, 11 Dec 2020 15:34:20 +0000 (18:34 +0300)]
common/sfc_efx/base: update MCDI headers for MAE privilege
VFs and unprivileged PFs should not be able to control MAE.
Add MAE privilege to MCDI headers in order to reflect that.
Fixes: 84d3fb7d7e1e ("common/sfc_efx/base: add MAE definitions to MCDI") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Somnath Kotur [Thu, 3 Dec 2020 06:38:47 +0000 (12:08 +0530)]
net/bnxt: fix PF resource query
This cmd should be called by every driver after 'hwrm_func_cfg'
to get the actual number of resources allocated by the HWRM.
The values returned in the cmd are the max values for that PF.
Also, now that the max values for the PF are computed in probe itself,
no need to invoke FUNC_QCAPs or any other cmd in dev_configure_op()
as that would just override the actual max values obtained above.
The current max_rings computation does not take into account the case
when max_nq_rings is <= num_async_cpr. This results in a wrong value
like 0, when max_nq_rings is 1. Fix this by subtracting num_async_cpr
only when max_cp_rings > num_async_cpr.
Apart from this, the entire logic is currently spread across a few
macros, making it hard to read and debug this code. Move this code
into an inline function.
max_msix is not used in the max_rings calculation.
Apparently the max_msix field returned in HWRM_RESC_QCAPS is only valid
for Thor and newer chips. On Wh+ it will be equal to min_compl_rings.
Also, when a function reset is performed on an application quit, FW
will not reset the VF resource pool as per design.
This can lead to a strange condition wherein the max_msix field
on Wh+ keeps changing on each application re-load, thereby
throwing off the max_rings computation.
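The guarded subtraction described in the PF resource query commit above can be sketched as an inline function. The function name is hypothetical, and this covers only the one step quoted in the message (subtract num_async_cpr only when max_cp_rings is larger), not the full max_rings calculation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: reserve completion rings for async events only
 * when there are more rings than async rings, so the result never
 * collapses to zero when both counts are equal.
 */
static inline uint16_t usable_cp_rings(uint16_t max_cp_rings,
				       uint16_t num_async_cpr)
{
	if (max_cp_rings > num_async_cpr)
		return (uint16_t)(max_cp_rings - num_async_cpr);

	return max_cp_rings;
}
```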
Ajit Khaparde [Tue, 1 Dec 2020 19:15:23 +0000 (11:15 -0800)]
net/bnxt: remove references to Thor
Refactor code to remove references to Thor.
Instead use P5 as in phase 5 of development cycle since it is applicable
to boards other than Thor as well.
Kalesh AP [Tue, 17 Nov 2020 07:10:24 +0000 (12:40 +0530)]
net/bnxt: release HWRM lock in error
In __bnxt_hwrm_func_qcaps, when memory allocation fails, the
driver does not release the HWRM lock. This patch fixes it
by calling hwrm_unlock in that error case.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Samik Gupta [Fri, 6 Nov 2020 21:41:21 +0000 (16:41 -0500)]
net/bnxt: fix VNIC config on Rx queue stop
Reconfigure a vnic's default ring if the current default ring is stopped
by the application. It picks the lowest numbered ring that is currently
active to be the new default, and issues the hwrm_vnic_cfg command to
update the configuration. Applies to adapters that are not Thor-based.
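The selection rule in the commit above (pick the lowest numbered ring that is still active) can be sketched as below. The function name, the boolean state array, and the sentinel are hypothetical; issuing the hwrm_vnic_cfg command afterwards is left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_ACTIVE_RING (-1) /* hypothetical sentinel */

/*
 * Hypothetical helper: return the index of the lowest numbered active
 * Rx ring, or NO_ACTIVE_RING if every ring is stopped.
 */
static int lowest_active_ring(const bool *ring_active, uint16_t nb_rings)
{
	uint16_t i;

	for (i = 0; i < nb_rings; i++) {
		if (ring_active[i])
			return i;
	}

	return NO_ACTIVE_RING;
}
```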
Samik Gupta [Thu, 12 Nov 2020 21:28:25 +0000 (13:28 -0800)]
net/bnxt: fix Rx rings in RSS redirection table
This commit introduces a limit on the number of RX rings included in
the RSS redirection table to a value no larger than the size supported
by Thor as defined by BNXT_RSS_TBL_SIZE_THOR.
Beilei Xing [Fri, 20 Nov 2020 08:49:47 +0000 (16:49 +0800)]
net/i40e: fix global register recovery
PMD configures the global register I40E_GLINT_CTL during
device initialization to work around the Rx write back
issue. But when a device is rebound from DPDK to the kernel,
the global register is not restored to its original
state, which causes a performance drop in the kernel driver.
This patch fixes this issue.
Fixes: be6c228d4da3 ("i40e: support Rx interrupt") Fixes: 4ab831449a1c ("net/i40e: fix interrupt conflict with multi-driver") Cc: stable@dpdk.org Signed-off-by: Beilei Xing <beilei.xing@intel.com> Acked-by: Jeff Guo <jia.guo@intel.com>
Murphy Yang [Thu, 3 Dec 2020 07:50:30 +0000 (07:50 +0000)]
net/i40e: fix L4 checksum flag
When a tunneled packet is received and its inner L4 checksum is correct,
the testpmd output log shows the 'ol_flags' value as
'PKT_RX_L4_CKSUM_UNKNOWN', but the expected value is 'PKT_RX_L4_CKSUM_GOOD'.
If the inner L4 checksum is correct, add the 'PKT_RX_L4_CKSUM_GOOD'
flag to 'l3_l4e_flags' for SSE and 'l3_l4_flags_shuf' for AVX2 to
ensure that 'ol_flags' carries the correct flags.
Fixes: 9966a00a0688 ("net/i40e: enable bad checksum flags in vector Rx") Fixes: dafadd73762e ("net/i40e: add AVX2 Rx function") Cc: stable@dpdk.org Signed-off-by: Murphy Yang <murphyx.yang@intel.com> Acked-by: Jeff Guo <jia.guo@intel.com>
Murphy Yang [Mon, 23 Nov 2020 07:05:23 +0000 (07:05 +0000)]
net/ice: fix outer UDP Tx checksum offload
If hardware outer UDP Tx checksum offload is enabled, it does not take
effect when an 'IPv6/UDP/VXLAN' packet is sent with a wrong outer UDP checksum.
To make it take effect, set the 'L4T_CS' flag only when 'L4TUNT'
equals one and 'EIPT' is not zero. When the 'L4T_CS' flag is set, the hardware
can calculate the outer tunneling UDP checksum.
Fixes: bd70c451532c ("net/ice: support Tx checksum offload for tunnel") Cc: stable@dpdk.org Signed-off-by: Murphy Yang <murphyx.yang@intel.com> Tested-by: Wei Xie <weix.xie@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Feifei Wang [Wed, 18 Nov 2020 10:48:56 +0000 (04:48 -0600)]
net/ixgbe: set VLAN strip flag for NEON Rx
For the NEON vector path of the IXGBE PMD, introduce the flag
PKT_RX_VLAN_STRIPPED to indicate that the VLAN has been stripped from a
VLAN tagged packet.
This is because the old flag PKT_RX_VLAN_PKT only indicates that
the packet is VLAN tagged, but cannot show whether the VLAN is in
m->vlan_tci or still in the packet. So add the new flag to show that the
VLAN has been stripped by the hardware and its TCI is saved in
m->vlan_tci when VLAN stripping is enabled in the Rx configuration of
the IXGBE PMD.
Signed-off-by: Feifei Wang <feifei.wang2@arm.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com>
Min Hu (Connor) [Thu, 10 Dec 2020 12:48:43 +0000 (20:48 +0800)]
net/hns3: fix FEC state query
As FEC is not supported below 10 Gbps, reading
CMD(HNS3_OPC_CONFIG_FEC_MODE) from the
firmware fails on a 10 Gbps device.
This patch prevents reading this CMD when the speed is below 10 Gbps,
as the read is meaningless.
Fixes: 9bf2ea8dbc65 ("net/hns3: support FEC") Cc: stable@dpdk.org Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com>
Testing has shown that the packet forwarding rate for packet sizes
that are not a multiple of the cache line size is reduced when the
DMA size is padded to a multiple of the cache line size. Improve
performance for these packet sizes by disabling EOP padding.
Yunjian Wang [Tue, 1 Dec 2020 00:59:34 +0000 (08:59 +0800)]
net/bnxt: fix memory leak when mapping fails
Memory is allocated for 'buf' when sending a message to HWRM,
but it is not freed when mapping the address to an IO address
fails. This leads to a memory leak.
Fixes: 19e6af01bb36 ("net/bnxt: support get/set EEPROM") Cc: stable@dpdk.org Signed-off-by: Yunjian Wang <wangyunjian@huawei.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Suanming Mou [Thu, 3 Dec 2020 02:18:52 +0000 (04:18 +0200)]
net/mlx5: optimize hash list entry memory
Currently, the hash list saves the hash key in the hash entry, and the
key is mostly used only to get the bucket index.
Saving the entire 64-bit key in the entry is not a good option if
the key is only used to get the bucket index: 64 bits cost more
memory per entry, while the signature data in the key mostly uses only
32 bits. In addition, in the unregister function, the key in the entry
causes an extra bucket index calculation.
This commit saves the bucket index to the entry instead of the hash key.
For the hash list like table, tag and mreg_copy which save the signature
data in the key, the signature data is moved to the resource data struct
itself.
Suanming Mou [Thu, 3 Dec 2020 02:18:51 +0000 (04:18 +0200)]
net/mlx5: optimize hash list synchronization
Since every hash table operation is related to one dedicated
bucket, the hash table lock and gen_cnt can be allocated per bucket.
Currently, the hash table uses one global lock to protect all the
buckets. That global lock prevents the buckets from being operated on at
the same time, which hurts hash table performance. In addition, a gen_cnt
updated for the entire hash table causes needless list re-searches.
This commit moves the lock and gen_cnt into each bucket, which allows
entries in different buckets to be operated on more efficiently.
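The per-bucket layout described above can be sketched as follows. This is an illustration only: it uses a POSIX mutex in place of whatever lock type the mlx5 PMD uses, and the structure names, bucket count, and modulo bucket selection are all assumptions.

```c
#include <pthread.h>
#include <stdint.h>

#define NB_BUCKETS 8u /* hypothetical table size */

/* Each bucket carries its own lock and generation counter. */
struct bucket {
	pthread_mutex_t lock;
	uint32_t gen_cnt;    /* bumped on every change to this bucket */
	uint32_t nb_entries; /* stand-in for the bucket's entry list */
};

struct hlist_table {
	struct bucket buckets[NB_BUCKETS];
};

static void table_init(struct hlist_table *t)
{
	uint32_t i;

	for (i = 0; i < NB_BUCKETS; i++) {
		pthread_mutex_init(&t->buckets[i].lock, NULL);
		t->buckets[i].gen_cnt = 0;
		t->buckets[i].nb_entries = 0;
	}
}

/*
 * Register a key: only the target bucket is locked and only its
 * gen_cnt changes, so other buckets stay untouched. The bucket index
 * is returned so a caller could store it in the entry instead of the
 * full key.
 */
static uint32_t table_insert(struct hlist_table *t, uint64_t key)
{
	uint32_t idx = (uint32_t)(key % NB_BUCKETS);
	struct bucket *b = &t->buckets[idx];

	pthread_mutex_lock(&b->lock);
	b->nb_entries++;
	b->gen_cnt++;
	pthread_mutex_unlock(&b->lock);

	return idx;
}
```

Because the generation counter is per bucket, a lookup in one bucket is not invalidated by an insertion into another.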
Dekel Peled [Tue, 24 Nov 2020 13:45:35 +0000 (15:45 +0200)]
net/mlx5: fix shared age action validation
A previous patch added support for the shared age action.
This feature is supported on group 1 and higher, and validation was
added accordingly.
On the FDB table, group 0 is skipped to improve performance.
As a result, the mentioned validation is not relevant for transfer rules.
This patch adds the required check to ensure proper validation.
The rdma-core library uses callbacks to allocate and free memory
from DPDK. The memory allocation callback used a complicated
and incorrect way to get the NUMA socket ID from the context.
The context was wrong, which might result in a wrong socket ID
and in allocating memory from the wrong node.
The callbacks are assigned once, when the InfiniBand device context
is created, allowing early access to shared DPDK memory for all
Verbs internal objects that need it.
Add the following APIs for axgbe:
- axgbe_enable_rx_vlan_stripping: to enable vlan header stripping
- axgbe_disable_rx_vlan_stripping: to disable vlan header stripping
- axgbe_enable_rx_vlan_filtering: to enable vlan filter mode
- axgbe_disable_rx_vlan_filtering: to disable vlan filter mode
- axgbe_update_vlan_hash_table: crc calculation and hash table update
based on vlan values post filter enable
- axgbe_vlan_filter_set: setting of active vlan out of max 4K values
before doing hash update of same
- axgbe_vlan_tpid_set: setting of default tpid values
- axgbe_vlan_offload_set: a top layer function to call strip/filter etc
based on mask values
Ivan Malov [Tue, 1 Dec 2020 07:30:10 +0000 (10:30 +0300)]
common/sfc_efx/base: support alternative MAE match fields
If MAE slice is configured without conntrack support, outer
rules must match on IP SRC/DST. This isn't reported clearly
by the FW because IPv4 and IPv6 have separate SRC/DST pairs.
The FW reports status ALWAYS for all these four fields, and
having an all-zeros mask for either field prevents the spec
from being certified by the existing spec validation method.
Extend the spec validation to take the "alternative" fields
into account so that legitimate specs don't get turned down.
Fixes: ed15d7f8e064 ("common/sfc_efx/base: validate and compare outer match specs") Cc: stable@dpdk.org Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Currently, the queue stats mapping has the following problems:
1) Many PMD drivers don't support queue stats mapping. But there is no
failure message after executing the command "set stat_qmap rx 0 2 2".
2) Once queue mapping is set, unrelated and unmapped queues are also
displayed.
3) The configuration result does not take effect or can not be queried
in real time.
4) The mapping arrays, "tx_queue_stats_mappings_array" &
"rx_queue_stats_mappings_array" are global and their sizes are based
on fixed max port and queue size assumptions.
5) These record structures, 'map_port_queue_stats_mapping_registers()'
and its sub functions are redundant for majority of drivers.
6) The display of the queue stats and queue stats mapping is mixed
together.
Since xstats is used to obtain queue statistics, we have made the
following simplifications and adjustments:
1) If the PMD requires and supports queue stats mapping, configure it in
the driver in real time by calling the ethdev API after executing the
command "set stat_qmap rx/tx ...". If not, the command is not accepted.
2) Based on the above adjustments, these record structures,
'map_port_queue_stats_mapping_registers()' and its sub functions can
be removed. "tx-queue-stats-mapping" & "rx-queue-stats-mapping"
parameters, and 'parse_queue_stats_mapping_config()' can be removed
too.
3) Remove display of queue stats mapping in 'fwd_stats_display()' &
'nic_stats_display()', and obtain queue stats by xstats. Since the
record structures are removed, 'nic_stats_mapping_display()' can be
deleted.
Fixes: 4dccdc789bf4 ("app/testpmd: simplify handling of stats mappings error") Fixes: 013af9b6b64f ("app/testpmd: various updates") Fixes: ed30d9b691b2 ("app/testpmd: add stats per queue") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
RongQing Li [Wed, 25 Nov 2020 11:01:32 +0000 (19:01 +0800)]
net/af_xdp: optimize Rx mbuf allocation
While receiving packets, the maximum burst of mbufs is allocated,
and if the hardware does not receive that many packets, the
redundant mbufs are freed, which hurts performance.
So optimize Rx performance by allocating the number of mbufs based on
the result of xsk_ring_cons__peek, which avoids redundant allocation
and freeing of mbufs when receiving packets.
The Rx cached_cons must be rolled back if mbuf allocation fails.
Signed-off-by: RongQing Li <lirongqing@baidu.com> Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
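The flow in the af_xdp commit above (peek first, allocate only what was peeked, roll back the cached consumer index on allocation failure) can be modeled with a toy ring. Everything below is a hypothetical stand-in: the structure, the helper names, and the use of a free-mbuf count in place of a real bulk allocation are assumptions, and none of it is the libbpf or PMD code.

```c
#include <stdint.h>

/* Toy consumer ring holding only the two cached indexes. */
struct toy_ring {
	uint32_t cached_prod;
	uint32_t cached_cons;
};

/* Reserve up to max entries and advance the cached consumer index. */
static uint32_t toy_ring_peek(struct toy_ring *r, uint32_t max)
{
	uint32_t avail = r->cached_prod - r->cached_cons;
	uint32_t n = avail < max ? avail : max;

	r->cached_cons += n;
	return n;
}

/*
 * Receive a burst: size the mbuf allocation from the peek result. If
 * the (simulated) bulk allocation cannot be satisfied, undo the peek so
 * the same descriptors are seen again on the next call.
 */
static uint32_t toy_rx_burst(struct toy_ring *r, uint32_t max_burst,
			     uint32_t free_mbufs)
{
	uint32_t n = toy_ring_peek(r, max_burst);

	if (n == 0)
		return 0;

	if (free_mbufs < n) {
		r->cached_cons -= n; /* roll back the peek */
		return 0;
	}

	return n;
}
```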
net/bonding: fix port id validity check on parsing
If the port_id is equal to RTE_MAX_ETHPORTS, it should be considered
invalid. Additionally, UNUSED ports are also not valid port ids to be
used afterward.
To simplify following the ethdev API rules, use the exposed function
checking whether a port id is valid.
Fixes: 2efb58cbab6e ("bond: new link bonding library") Cc: stable@dpdk.org Signed-off-by: Gaetan Rivet <grive@u256.net> Acked-by: Min Hu (Connor) <humin29@huawei.com>
Joyce Kong [Mon, 21 Dec 2020 07:38:48 +0000 (15:38 +0800)]
rcu: use EAL memory barrier API
Use rte_atomic_thread_fence wrapper which has been provided for
__atomic_thread_fence builtins to support optimized code for
__ATOMIC_SEQ_CST memory order on x86 platforms.
Signed-off-by: Joyce Kong <joyce.kong@arm.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
mbuf: add C++ include guard for dynamic fields header
The header was missing the extern "C" directive which causes name
mangling of functions by C++ compilers, leading to linker errors
complaining of undefined references to these functions.
Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags") Cc: stable@dpdk.org Signed-off-by: Ashish Sadanandan <ashish.sadanandan@gmail.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
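The kind of guard the mbuf commit above refers to usually has the following shape. The header name, include guard macro, and both functions are hypothetical examples, not the contents of the actual dynamic fields header.

```c
#ifndef EXAMPLE_DYN_H
#define EXAMPLE_DYN_H

/* Give the declarations C linkage when included from C++ code. */
#ifdef __cplusplus
extern "C" {
#endif

/*
 * Hypothetical declaration: without the guard, a C++ compiler would
 * mangle this name and the linker would not find the C definition.
 */
int example_dynfield_lookup(const char *name);

/* Hypothetical inline helper, also covered by the guard. */
static inline int example_offset_is_valid(int offset)
{
	return offset >= 0;
}

#ifdef __cplusplus
}
#endif

#endif /* EXAMPLE_DYN_H */
```

A C compiler never sees the extern "C" lines, so the header stays valid C while C++ callers link against the unmangled symbols.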
Ferruh Yigit [Thu, 19 Nov 2020 11:58:57 +0000 (11:58 +0000)]
net/af_xdp: remove useless assignment
Assignment of function parameter 'umem' removed.
Fixes: f0ce7af0e182 ("net/af_xdp: remove resources when port is closed") Cc: stable@dpdk.org Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com> Reviewed-by: David Marchand <david.marchand@redhat.com>