Somnath Kotur [Mon, 26 Oct 2020 03:56:06 +0000 (20:56 -0700)]
net/bnxt: fix flow query count
Fix infinite loop in flow query count.
`nxt_resource_idx` could be zero in some cases which is invalid and
should be part of the while loop condition. Also synchronize access to
the flow db using the fdb_lock
Update ULP resource counts for Stingray device.
- FW needs some resources for normal operation. Account those
in the resource manager.
- Update the SR ULP requested resource counts to reflect
those available after AFM resources are accounted for.
- Add build option to select either 2 or 4 slot EM entries.
The default is 4 slot entries.
Signed-off-by: Peter Spreadborough <peter.spreadborough@broadcom.com> Signed-off-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Farah Smith [Mon, 26 Oct 2020 03:56:04 +0000 (20:56 -0700)]
net/bnxt: add table scope to PF mapping
Add table scope to PF Mapping for SR and Wh+ devices.
Legacy devices require PF set of base addresses for EEM operation.
A table scope id is a logical construct and is mapped to the PF
associated with the communications channel used.
In the case of a VF, the parent PF is used.
Signed-off-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Jay Ding [Mon, 26 Oct 2020 03:56:03 +0000 (20:56 -0700)]
net/bnxt: support two table scopes
Adding support for two table scopes. One for Exact Match tables
and other for External Exact Match tables.
New API to map a PARIF to an EEM table scope (set of Rx and Tx EEM
base addresses). It uses HWRM_TF_GLOBAL_CFG_SET HWRM to configure.
PARIF is handler to a partition of the physical port.
Adjustments to tf_global_cfg_set() to reduce overhead and nominal
name clarification.
Signed-off-by: Jay Ding <jay.ding@broadcom.com> Signed-off-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
- Moved P4 chip specific code under the P4 directory
- Added P45 skeleton code for SR to build on
- Add SR support in TRUFLOW core layer.
The TRUFLOW core or the tf-core is a shim layer which communicates with
the CFA block in the hardware.
Signed-off-by: Peter Spreadborough <peter.spreadborough@broadcom.com> Signed-off-by: Jay Ding <jay.ding@broadcom.com> Reviewed-by: Farah Smith <farah.smith@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Qi Zhang [Tue, 20 Oct 2020 03:48:57 +0000 (11:48 +0800)]
net/ice/base: specify global RSS LUT id in get/set RSS LUT
There is no way to specify a global RSS lookup table (LUT) ID with the
current API and 0 is the only global LUT ID that can be supported since
it's hard coded.
Upcoming support to specify a global LUT ID will require this
flexibility. To fix this, update the API for ice_aq_get_rss_lut() and
ice_aq_set_rss_lut() to take the new structure
ice_aq_get_set_rss_params, which includes a global_lut_id member. A new
structure was introduced instead of adding another parameter to the
previously mentioned functions for 2 reasons:
1. Reduce the number of parameters passed to the functions.
2. Reduce the amount of change required if the arguments ever need to be
updated in the future.
Also, reduce duplicate code that was checking for an invalid vsi_handle
and lut parameter by moving the checks to the lower level
__ice_aq_get_set_rss_lut().
Qi Zhang [Tue, 20 Oct 2020 02:43:08 +0000 (10:43 +0800)]
net/ice/base: refactor RSS configure API
Use struct ice_rss_hash_cfg as parameter for
ice_add_rss_cfg, ice_add_rss_cfg_sync and
ice_rem_rss_cfg, ice_rem_rss_cfg_sync.
Introduce enmu ice_rss_cfg_hdr_type to allow user specify the more
flexible RSS configure.
ICE_RSS_OUTER_HEADERS - take outer layer as RSS inputset
ICE_RSS_INNER_HEADERS - take inner layer as RSS inputset
ICE_RSS_INNER_HEADERS_W_OUTER_IPV4
- take inner layer as RSS inputset for packet with outer IPV4
ICE_RSS_INNER_HEADERS_W_OUTER_IPV6
- take inner layer as RSS inputset for packet with outer IPV6
ICE_RSS_ANY_HEADERS - try with outer first then inner
(same as the behaviour without this change)
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 20 Oct 2020 01:26:08 +0000 (09:26 +0800)]
net/ice/base: remove duplicated AQ command flag setting
When sending the indirect Read/Write SFF EEPROM AQ command. The flag is
already added later in the code flow for all indirect AQ commands, i.e.
commands that provide an additional data buffer.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 20 Oct 2020 01:00:40 +0000 (09:00 +0800)]
net/ice/base: recognize 860 as iSCSI port in CEE mode
iSCSI can use both TCP ports 860 and 3260. However, in our current
implementation, the ice_aqc_opc_get_cee_dcb_cfg (0x0A07) AQ command
doesn't provide a way to communicate the protocol port number to the
AQ's caller. Thus, we assume that 3260 is the iSCSI port number at the
AQ's caller layer.
In this patch, we will rely on the dcbx-willing mode, desired QOS and
remote QOS configuration to determine which port number that iSCSI will
use.
Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 20 Oct 2020 00:52:03 +0000 (08:52 +0800)]
net/ice/base: implement shared rate limiter
Implemented shared bandwidth rate limit functionality to account for
dedicated bandwidth and minimum bandwidth. It requires non default
profile be programmed for CIR, EIR/PIR, and SRL.
Currently QSFP/SFP modules up to power class 4 are supported.
100G modules require higher power in many cases.
Also, low power mode requires support of power classes 7 and even 8.
This change extends "Get Link Status" AQ command (0x0607) to
support class 5+ modules.
The patch also add couple other missing bits for link status.
Signed-off-by: Shay Amir <shay.amir@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
net/ice/base: use package info from ice segment metadata
There are two package versions in the package binary. Today, these two
version numbers are the same. However, in the future that may change.
Update code to use the package info from the ice segment metadata
section, which is the package information that is actually downloaded to
the firmware during the download package process.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
net/ice/base: allocate and free RSS global lookup table
Currently there is no API to allocate and free a RSS global LUT.
Incoming changes to support VFs having >16 queues will require using
RSS global LUT resources. The functions included will allow a PF to
configure a RSS global LUT for VFs that request >16 queues.
The main NVM module and the Option ROM module contain a security
revision in their CSS header. This security revision is used to
determine whether or not the signed module should be loaded at bootup.
If the module security revision is lower than the associated minimum
security revision, it will not be loaded.
The CSS header does not have a module id associated with it, and thus
requires flat NVM reads in order to access it. To do this, take
advantage of the cached bank information. Introduce a new
"ice_read_flash_module" function that takes the module and bank to read.
Implement both ice_read_active_nvm_module and
ice_read_active_orom_module. These functions will use the cached values
to determine the active bank and calculate the appropriate offset.
Using these new access functions, extract the security revision for both
the main NVM bank and the Option ROM into the associated info structure.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Added NVM Write Admin Command (0x703) ARQ response flags - as
returned in "Response flags" field.
Three flags are supported: POR, PERST and EMPR. All indicate the
type of reset required to get the NVM bank update effective.
Signed-off-by: Shay Amir <shay.amir@intel.com> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Acked-by: Qiming Yang <qiming.yang@intel.com>
Haiyue Wang [Thu, 29 Oct 2020 01:13:22 +0000 (09:13 +0800)]
net/ice: fix DCF crash on Rx
The initialization of selecting the handler for scalar Rx path FlexiMD
fields extraction into mbuf is missed, it will cause segmentation fault
(core dumped).
Also add the missed support to handle RXDID 16, which has RSS hash value
on Qword 1.
Thomas Monjalon [Tue, 27 Oct 2020 13:20:22 +0000 (14:20 +0100)]
ethdev: move non-offload capabilities
The definitions of RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP
and RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP were inserted
before the last comment of Tx offloads.
It is moved in a better place,
with comments moved to be before the definition.
A group comment is added to better describe device capabilities.
Fixes: cac923cfea47 ("ethdev: support runtime queue setup") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Beilei Xing [Tue, 27 Oct 2020 06:21:47 +0000 (14:21 +0800)]
net/i40e: fix flow director for eth + VLAN pattern
Currently, can't create more than one following flow for
ETH + VLAN pattern.
1. flow create 0 ingress pattern eth / vlan vid is 350 / end
actions queue index 2 / end
2. flow create 0 ingress pattern eth / vlan vid is 351 / end
actions queue index 3 / end
The root cause is the vlan_tci is not set correctly, it will
cause the keys of both of the two flows are the same.
Fixes: 42044b69c67d ("net/i40e: support input set selection for FDIR") Cc: stable@dpdk.org Signed-off-by: Beilei Xing <beilei.xing@intel.com> Acked-by: Jeff Guo <jia.guo@intel.com>
Add rte_eth_dev_info->rx_seg_capa parameters:
- receiving to multiple pools is supported
- buffer offsets are supported
- no offset alignment requirement
- reports the maximal number of segments
- reports the buffer split offload flag
Only the regular rx_burst routine is updated to support split,
because the vectorized ones does not support scatter and MPRQ
does not support split at all.
The split feature for receiving packets was added to the mlx5
PMD, now Rx queue can receive the data to the buffers belonging
to the different pools and the memory of all the involved pool
must be registered for DMA operations in order to allow hardware
to store the data.
The scatter-gather elements should be configured
accordingly to support the buffer split feature.
The application provides the desired settings for
the segments at the beginning of the packets and
PMD pads the buffer chain (if needed) with attributes
of last specified segment to accommodate the packet
of maximal length.
There are some limitations are implied. The MPRQ
feature should be disengaged if split is requested,
due to MPRQ neither supports pushing data to the
dedicated pools nor follows the flexible buffer sizes.
The vectorized rx_burst routines does not support
the scattering (these ones are extremely simplified
and work over the single segment only) and can't
handle split as well.
The routine to provide Rx queue setup with specifying
extended receiving buffer description is added.
It allows application to specify desired segment
lengths, data position offsets in the buffer
and dedicated memory pool for each segment.
Gregory Etelson [Sun, 25 Oct 2020 14:08:09 +0000 (16:08 +0200)]
net/mlx5: implement tunnel offload
Tunnel Offload API provides hardware independent, unified model
to offload tunneled traffic. Key model elements are:
- apply matches to both outer and inner packet headers
during entire offload procedure;
- restore outer header of partially offloaded packet;
- model is implemented as a set of helper functions.
Implementation details:
* tunnel_offload PMD parameter must be set to 1 to enable the feature.
* application cannot use MARK and META flow actions with tunnel.
* offload JUMP action is restricted to steering tunnel rule only.
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:10 +0000 (13:24 +0300)]
net/mlx5: support shared action for RSS
Implement shared action create/destroy/update/query. The current
implementation support is limited to shared RSS action only. The shared
RSS action create operation prepares hash RX queue objects for all
supported permutations of the hash. The shared RSS action update
operation relies on functionality to modify hash RX queue introduced in
one of the previous commits in this patch series.
Implement RSS shared action and handle shared RSS on flow apply and
release. The lookup for hash RX queue object for RSS action is limited
to the set of objects stored in the shared action itself and when
handling shared RSS action. The lookup for hash RX queue object inside
shared action is performed by hash only.
Current implementation limited to DV flow driver operations i.e. verbs
flow driver operations doesn't support shared action.
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:09 +0000 (13:24 +0300)]
net/mlx5: translate shared action for RSS action
Handle shared action on flow validation/creation/destruction.
mlx5 PMD translates shared action into a regular one before handling
flow validation/creation. The shared action translation applied to
utilize the same execution path for both shared and regular actions.
The current implementation supports shared action translation for shared
RSS action only.
RSS action validation split to validate shared RSS action on its
creation in addition to action validation in flow validation/creation
path.
Implement rte_flow shared action API for mlx5 PMD, mostly forwarding
calls to flow driver operations (see struct mlx5_flow_driver_ops).
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:08 +0000 (13:24 +0300)]
net/mlx5: modify hash Rx queue objects
Implement modification for hashed table of Rx queue object (see
mlx5_hrxq_modify()). This implementation relies on the capability to
modify TIR object via DevX API, i.e. current implementation doesn't
support verbs HW object operations. The functionality to modify hashed
table of Rx queue object is prerequisite to implement
rete_flow_shared_action_update() for shared RSS action in mlx5 PMD.
Andrey Vesnovaty [Fri, 23 Oct 2020 10:24:07 +0000 (13:24 +0300)]
common/mlx5: modify advanced Rx object via DevX
Implement TIR modification (see mlx5_devx_cmd_modify_tir()) using DevX
API. TIR is the object containing the hashed table of Rx queue. The
functionality to configure/modify this HW-related object is prerequisite
to implement rete_flow_shared_action_update() for shared RSS action in
mlx5 PMD. HW-related structures for TIR modification add in mlx5_prm.h.
Jesse Brandeburg [Fri, 23 Oct 2020 20:22:00 +0000 (13:22 -0700)]
net/ice: update writeback policy to reduce latency
Just like iavf, setting the value to 2us allows for generally good
streaming packet performance while keeping latency down, and
generally keeps the performance of the PF and VF interfaces similar.
The previous value of 0x10 was making latency on a single packet
receive be as much as 16us.
Fixes: 65dfc889d86b ("net/ice: support Rx queue interruption") Cc: stable@dpdk.org Reported-by: Brian Johnson <brian.johnson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Jesse Brandeburg [Fri, 23 Oct 2020 20:21:59 +0000 (13:21 -0700)]
net/iavf: fix performance with writeback policy
The iavf driver was trying to use writeback on ITR, but was
never setting an ITR, so it didn't work. This caused performance
to be limited due to too much PCIe traffic and partial writes
during most benchmarking workloads.
Set the ITR during queue setup, which can be checked at runtime
by reading register 0x2800. Setting the value to 2us allows
for generally good streaming packet performance while keeping
latency down.
Fixes: d6bde6b5eae9 ("net/avf: enable Rx interrupt") Cc: stable@dpdk.org Reported-by: Brian Johnson <brian.johnson@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Wei Huang [Fri, 23 Oct 2020 08:59:59 +0000 (04:59 -0400)]
raw/ifpga/base: enhance driver reliability in multi-process
Current hardware protection is based on pthread mutex which
work just for situation of multi-thread in one process. In
multi-process environment, hardware state machine would be
corrupted by concurrent access, that means original pthread
mutex mechanism need be enhanced.
The major modifications in this patch are list below:
1. Create a mutex for adapter in shared memory named
"mutex.IFPGA:domain:bus:dev.func" when device is probed.
2. Create a shared memory named "IFPGA:domain:bus:dev.func" during opae
adapter is initializing. There is a reference count in shared memory.
Shared memory will be destroyed once reference count turned to zero.
3. Two mutexs are created in shared memory and initialized with flag
PTHREAD_PROCESS_SHARED. One for SPI and the other for I2C. They will
be passed to SPI and I2C driver subsequently.
4. DTB data in flash will be cached in shared memory. Then MAX10 driver
can read DTB from shared memory instead of flash. This avoid
confliction of concurrent flash access between hardware and software.
Wei Huang [Fri, 23 Oct 2020 08:59:58 +0000 (04:59 -0400)]
raw/ifpga/base: free resources when destroying device
Add two functions to complete the resource free work, one is
'ifpga_adapter_destroy()', the other is 'ifpga_bus_uinit()'.
Then call 'opae_adapter_destroy()' and 'opae_adapter_data_free()'
in 'ifpga_rawdev_close()' to free resources.
Also 'opae_adapter_free()' is removed from 'ifpga_rawdev_destroy()',
because opae adapter is pointed by dev_private member in raw_dev,
it will be freed in 'rte_rawdev_pmd_release()'.
Interrupt handler copied to the local 'intr_handle' variable by value
before passing it to IRQ functions.
This leads IRQ functions update the local variable instead of
'ifpga_irq_handle'.
Instead, using 'intr_handle' local variable as pointer to
'ifpga_irq_handle' as intended.