examples/ipsec-secgw: update eth header during route lookup
Update ethernet header during route lookup instead of doing
way later while performing Tx burst. Advantages to doing
is at route lookup is that no additional IP version checks
based on packet data are needed and packet data is already
in cache as route lookup is already consuming that data.
This is also useful for inline protocol offload cases
of v4inv6 or v6inv4 outbound tunnel operations as
packet data will not have any info about what is the tunnel
protocol.
examples/ipsec-secgw: get security context from lcore conf
Store security context pointer in lcore Rx queue config and
get it from there in fast path for better performance.
Currently rte_eth_dev_get_sec_ctx() which is meant to be control
path API is called per packet basis. For every call to that
API, ethdev port status is checked.
examples/ipsec-secgw: allow larger burst size for vectors
Allow larger burst size of vector event mode instead of restricting
to 32. Also restructure traffic type struct to have num pkts first
so that it is always in first cacheline. Also cache align
traffic type struct. Since MAX_PKT_BURST is not used by
vector event mode worker, define another macro for its burst
size so that poll mode perf is not effected.
examples/ipsec-secgw: use HW parsed packet type in poll mode
Use HW parsed packet type when ethdev supports necessary protocols.
If packet type is not supported, then register ethdev callbacks
for parse packet in SW. This is better for performance as it
effects fast path.
examples/ipsec-secgw: disable Tx checksum for inline
Enable Tx IPv4 checksum offload only when Tx inline crypto, lookaside
crypto/protocol or cpu crypto is needed.
For Tx Inline protocol offload, checksum computation
is implicitly taken care by HW.
Rahul Bhansali [Thu, 19 May 2022 13:28:30 +0000 (18:58 +0530)]
config/arm: disable SVE ACLE for CN10K
This disable the sve_acle flag for cn10k.
For native build, -Dplatform=cn10k will require to
get sve_acle flag parameter in the build.
Performance impact:-
With l3fwd example, lpm lookup performance increased
by ~21% if Neon is used instead of SVE. Hence, disabled
sve_acle flag for cn10k.
Rahul Bhansali [Thu, 19 May 2022 13:28:29 +0000 (18:58 +0530)]
config/arm: add SVE ACLE control flag
An additional check of control flag sve_acle for
RTE_HAS_SVE_ACLE macro to be part of the build.
If any SoC config doesn't have sve_acle flag parameter
then default it will be considered as true.
Signed-off-by: Rahul Bhansali <rbhansali@marvell.com> Reviewed-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech> Acked-by: Ruifeng Wang <ruifeng.wang@arm.com>
Thomas Monjalon [Wed, 25 May 2022 09:53:07 +0000 (11:53 +0200)]
eal/ppc: undefine AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector".
The DPDK header file rte_altivec.h takes care of undefining "vector",
so the applications and dependencies are free to define the name "vector".
This is a compatibility breakage for applications which were using
the keyword "vector" for its AltiVec meaning.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Tested-by: Ali Alnubani <alialnu@nvidia.com> Acked-by: Tyler Retzlaff <roretzla@linux.microsoft.com> Acked-by: Ray Kinsella <mdr@ashroe.eu>
Add support for hash functions that compute a signature for an array
of bytes read from a packet header or meta-data. Useful for flow
affinity-based load balancing.
Previously, on lookup hit, the hit key had its timer automatically
rearmed with the same timeout in order to prevent its expiration. Now,
a broader set of actions is available on lookup hit, which has to be
managed explicitly: the key can have its timer rearmed with the same
or with a different timeout, or the key timer can be left unmodified.
The latter option allows the key to expire naturally when the timer
eventually runs out, unless the key is hit again and its timer rearmed
at that point. Needed by the TCP connection tracking state machine.
Add support for packet recirculation. The current packet is flagged
for recirculation using the new "recirculate" instruction; on TX, this
flag causes the packet to execute the full pipeline again as if it was
a new packet, except the packet meta-data is preserved. The new
"recircid" instruction can be used to read the pass number in case the
packet goes several times through the pipeline.
The packet mirroring is configured through slots and sessions, with
the number of slots and sessions set at init.
The new "mirror" instruction assigns one of the existing sessions to a
specific slot, which results in scheduling a mirror operation for the
current packet to be executed later at the time the packet is either
transmitted or dropped.
Several copies of the same input packet can be mirrored to different
output ports by using multiple slots.
Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com> Signed-off-by: Kamalakannan R <kamalakannan.r@intel.com>
Add support for arguments to the default action of regular and learner
tables at initialization time. Until now, only default actions with no
arguments were accepted in the .spec file.
pipeline: fix emit instruction for invalid headers
Fix the emit instruction for the pathological case of all headers to
be emitted being invalid. In this case, the for loop was essentially
skipped and the last emitted header (or an invalid memory location)
getting corrupted by setting its size to 0 through the assignment to
ho->n_bytes right after the for loop.
Shell used in documentation generation could not run on Windows.
Rewrite scripts in Python.
New scripts use proper path separators and handle paths with spaces.
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
API documentation index had spaces between link caption and URL,
which may be unsupported by some Markdown implementations.
That is, "[caption](URL)" is valid but "[caption] (URL)" is not.
The problematic behavior is observed with Doxygen on Windows.
Remove the spaces.
Unfortunately, Markdown syntax is not formally specified.
Fixes: 9bf486e606b0 ("doc: generate HTML for API with doxygen") Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
CSS for API documentation was customized by a shell script
modifying the file that Doxygen produces.
This way CSS code is kept in a script and an extra build step is added.
Move custom style to a plain CSS file.
Use Doxygen capability to attach this extra stylesheet.
If a /32 route is entered in the RIB the code to traverse
will not see end of the tree. This is due to trying
to do a negative shift which is an undefined in C.
Fix by checking for max depth as is already done in rib6.
Fixes: 5a5793a5ffa2 ("rib: add RIB library") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Huichao Cai [Fri, 15 Apr 2022 03:26:50 +0000 (11:26 +0800)]
ip_frag: add IPv4 options fragment
According to RFC791,the options may appear or not in datagrams.
They must be implemented by all IP modules (host and gateways).
What is optional is their transmission in any particular datagram,
not their implementation. So we have to deal with it during the
fragmenting process.
Add some test data for the IPv4 header optional field fragmenting.
Signed-off-by: Huichao Cai <chcchc88@163.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Nipun Gupta [Thu, 5 May 2022 09:05:18 +0000 (14:35 +0530)]
dma/dpaa2: introduce driver skeleton
The DPAA2 DMA driver is an implementation of the dmadev APIs,
that provide means to initiate a DMA transaction from CPU.
Earlier this was part of RAW driver, but with DMA drivers
added as separate flavor of drivers, this driver is being
moved to DMA drivers.
app/acl: support different formats for IPv6 address
Within ACL rule IPv6 address can be represented in different ways:
either as 4x4B fields, or as 2x8B fields.
Till now, only first format was supported.
Extend test-acl to support both formats, mainly for testing and
demonstrating purposes.
To control desired behavior '--ipv6' command-line option is extended
to accept an optional argument:
To be more precise:
'--ipv6' - use 4x4B fields format (default behavior)
'--ipv6=4B' - use 4x4B fields format (default behavior)
'--ipv6=8B' - use 2x8B fields format
Also replaced home brewed IPv4/IPv6 address parsing with inet_pton() calls.
Signed-off-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
In theory ACL library allows fields with 8B long.
Though in practice they usually not used, not tested,
and as was revealed by Ido, this functionality is not working properly.
There are few places inside ACL build code-path that need to be addressed.
Thomas Monjalon [Tue, 3 May 2022 12:03:21 +0000 (14:03 +0200)]
avoid AltiVec keyword vector
The AltiVec header file is defining "vector", except in C++ build.
The keyword "vector" may conflict easily.
As a rule, it is better to use the alternative keyword "__vector",
so we will be able to #undef vector after including AltiVec header.
Later it may become possible to #undef vector in rte_altivec.h
with a compatibility breakage.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
Current pmd_perf_autotest() in continuous mode tries
to enqueue MAX_TRAFFIC_BURST completely before starting
the test. Some drivers cannot accept complete
MAX_TRAFFIC_BURST even though rx+tx desc count can fit it.
This patch changes behaviour to stop enqueuing after few
retries.
Fixes: 002ade70e933 ("app/test: measure cycles per packet in Rx/Tx") Cc: stable@dpdk.org Signed-off-by: Rakesh Kudurumalla <rkudurumalla@marvell.com>
Elena Agostini [Fri, 29 Apr 2022 14:14:23 +0000 (14:14 +0000)]
gpu/cuda: unmap GPU memory while freeing
Enable GPU_REGISTERED flag in gpu/cuda driver in the memory list.
If a GPU memory address CPU mapped is freed before being
unmapped, CUDA driver unmaps it before freeing the memory.
Signed-off-by: Elena Agostini <eagostini@nvidia.com>
David Marchand [Fri, 20 May 2022 18:10:50 +0000 (19:10 +0100)]
eal/freebsd: fix use of newer cpuset macros
FreeBSD has updated its CPU macros to align more with the definitions
used on Linux[1]. Unfortunately, while this makes compatibility better
in future, it means we need to have both legacy and newer definition
support. Use a meson check to determine which set of macros are used.
Forcing inlining in test_ring_enqueue and test_ring_dequeue can cause
the compiled code to grow extensively when compiled with no optimization
(-O0 or -Og). This is default in the meson's debug configuration. This
can collide with compiler bugs and cause issues during linking of unit
tests where the api_type or esize are non-const variables causing
inlining cascade. In perf tests this is not the case in perf-tests as
esize and api_type are const values.
One such case was discovered when porting DPDK to RISC-V. GCC 11.2 (and
no fix still in 12.1) is generating a short relative jump instruction
(J <offset>) for goto and for loops. When loop body grows extensively in
ring test, the target offset goes beyond supported offfset of +/- 1MB
from PC. This is an obvious bug in the GCC as RISC-V has a
two-instruction construct to jump to any absolute address (AUIPC+JALR).
However there is no reason to force inlining as the test code works
perfectly fine without it.
GCC has a bug report for a similar case (with conditionals):
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062
Fixes: a9fe152363e2 ("test/ring: add custom element size functional tests") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com> Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com> Acked-by: Konstantin Ananyev <konstantin.v.ananyev@yandex.ru>
The lpm_process_event_pkt() can either process a packet using an
architecture specific (defined for X86/SSE, ARM/Neon and PPC64/Altivec)
path or a scalar one. The choice is however done using an ifdef
pre-processor macro. Because of that the scalar version was apparently
not widely exercised/compiled.
Due to some copy/paste errors, the scalar logic in
lpm_process_event_pkt() retained a "continue" statement where it should
utilize rfc1812_process() and return the port/BAD_PORT.
Fixes: 99fc91d18082 ("examples/l3fwd: add event lpm main loop") Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
Luc Pelletier [Fri, 25 Feb 2022 16:38:05 +0000 (11:38 -0500)]
eal/x86: fix unaligned access for small memcpy
Calls to rte_memcpy for 1 < n < 16 could result in unaligned
loads/stores, which is undefined behaviour according to the C
standard, and strict aliasing violations.
The code was changed to use a packed structure that allows aliasing
(using the __may_alias__ attribute) to perform the load/store
operations. This results in code that has the same performance as the
original code and that is also C standards-compliant.
Fixes: af75078fece3 ("first public release") Cc: stable@dpdk.org Signed-off-by: Luc Pelletier <lucp.at.work@gmail.com> Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Tested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Tyler Retzlaff [Thu, 12 May 2022 13:14:29 +0000 (06:14 -0700)]
eal: get/set thread affinity per thread identifier
Implement functions for getting/setting thread affinity.
Threads can be pinned to specific cores by setting their
affinity attribute.
Windows error codes are translated to errno-style error codes.
The possible return values are chosen so that we have as
much semantic compatibility between platforms as possible.
note: convert_cpuset_to_affinity has a limitation that all cpus of
the set belong to the same processor group.
Pavan Nikhilesh [Fri, 13 May 2022 18:27:10 +0000 (23:57 +0530)]
hash: unify CRC32 selection for x86 and Arm
Merge CRC32 hash calculation public API implementation for x86 and Arm.
Select the best available CRC32 algorithm when unsupported algorithm
on a given CPU architecture is requested by an application.
Previously, if an application directly includes `rte_crc_arm64.h`
without including `rte_hash_crc.h` it will fail to compile.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Shijith Thotton [Mon, 16 May 2022 17:35:48 +0000 (23:05 +0530)]
eventdev: add weight and affinity to queue attributes
Extended eventdev queue QoS attributes to support weight and affinity.
If queues are of the same priority, events from the queue with highest
weight will be scheduled first. Affinity indicates the number of times,
the subsequent schedule calls from an event port will use the same event
queue. Schedule call selects another queue if current queue goes empty
or schedule count reaches affinity count.
To avoid ABI break, weight and affinity attributes are not yet added to
queue config structure and rely on PMD for managing it. New eventdev op
queue_attr_get can be used to get it from the PMD.
Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Shijith Thotton [Mon, 16 May 2022 17:35:47 +0000 (23:05 +0530)]
eventdev: support setting queue attributes at runtime
Added a new eventdev API rte_event_queue_attr_set(), to set event queue
attributes at runtime from the values set during initialization using
rte_event_queue_setup(). PMD's supporting this feature should expose the
capability RTE_EVENT_DEV_CAP_RUNTIME_QUEUE_ATTR.
Signed-off-by: Shijith Thotton <sthotton@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 17:58:39 +0000 (23:28 +0530)]
eventdev: quiesce an event port
Add function to quiesce any core specific resources consumed by
the event port.
When the application decides to migrate the event port to another lcore
or teardown the current lcore it may to call `rte_event_port_quiesce`
to make sure that all the data associated with the event port are released
from the lcore, this might also include any prefetched events.
While releasing the event port from the lcore, this function calls the
user-provided flush callback once per event.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:19 +0000 (21:37 +0530)]
examples/ipsec-secgw: cleanup worker state before exit
Event ports are configured to implicitly release the scheduler contexts
currently held in the next call to rte_event_dequeue_burst().
A worker core might still hold a scheduling context during exit as the
next call to rte_event_dequeue_burst() is never made.
This might lead to deadlock based on the worker exit timing and when
there are very less number of flows.
Add a cleanup function to release any scheduling contexts held by the
worker by using RTE_EVENT_OP_RELEASE.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:18 +0000 (21:37 +0530)]
examples/l2fwd-event: clean up worker state before exit
Event ports are configured to implicitly release the scheduler contexts
currently held in the next call to rte_event_dequeue_burst().
A worker core might still hold a scheduling context during exit, as the
next call to rte_event_dequeue_burst() is never made.
This might lead to deadlock based on the worker exit timing and when
there are very less number of flows.
Add clean up function to release any scheduling contexts held by the
worker by using RTE_EVENT_OP_RELEASE.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:17 +0000 (21:37 +0530)]
examples/l3fwd: clean up worker state before exit
Event ports are configured to implicitly release the scheduler contexts
currently held in the next call to rte_event_dequeue_burst().
A worker core might still hold a scheduling context during exit, as the
next call to rte_event_dequeue_burst() is never made.
This might lead to deadlock based on the worker exit timing and when
there are very less number of flows.
Add clean up function to release any scheduling contexts held by the
worker by using RTE_EVENT_OP_RELEASE.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:16 +0000 (21:37 +0530)]
examples/eventdev: clean up worker state before exit
Event ports are configured to implicitly release the scheduler contexts
currently held in the next call to rte_event_dequeue_burst().
A worker core might still hold a scheduling context during exit, as the
next call to rte_event_dequeue_burst() is never made.
This might lead to deadlock based on the worker exit timing and when
there are very less number of flows.
Add clean up function to release any scheduling contexts held by the
worker by using RTE_EVENT_OP_RELEASE.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:15 +0000 (21:37 +0530)]
app/eventdev: clean up worker state before exit
Event ports are configured to implicitly release the scheduler contexts
currently held in the next call to rte_event_dequeue_burst().
A worker core might still hold a scheduling context during exit, as the
next call to rte_event_dequeue_burst() is never made.
This might lead to deadlock based on the worker exit timing and when
there are very less number of flows.
Add clean up function to release any scheduling contexts held by the
worker by using RTE_EVENT_OP_RELEASE.
Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
Pavan Nikhilesh [Fri, 13 May 2022 16:07:14 +0000 (21:37 +0530)]
app/eventdev: simplify signal handling and teardown
Remove rte_*_dev calls from signal handler callback as signal handlers
are supposed to be light weight.
Split ethdev teardown into Rx and Tx sections, wait for
workers to finish processing after disabling Rx to allow workers
to complete processing currently held packets.
Verified SW event device on ARM64 using the following command:
Updated to allow overriding the default CQ depth of 32. Since there are
only 2048 DLB history list entries, increasing the CQ depth decreases
the number of available LDB ports to 2048/max_cq_depth. Resource query
will take this into account and return the correct maximum number of
LDB ports.
Volodymyr Fialko [Fri, 25 Mar 2022 10:59:39 +0000 (11:59 +0100)]
event/cnxk: fix base pointer for SSO head wait
Function roc_sso_hws_head_wait() expects a base as input pointer, and it
will itself get tag_op from the base. By passing tag_op instead of base
pointer to this function will add SSOW_LF_GWS_TAG register offset twice,
which will lead to accessing wrong register.
Fixes: 1f5b3d55c041 ("event/cnxk: store and reuse workslot status") Cc: stable@dpdk.org Signed-off-by: Volodymyr Fialko <vfialko@marvell.com> Acked-by: Jerin Jacob <jerinj@marvell.com>
David Marchand [Thu, 24 Mar 2022 15:28:30 +0000 (16:28 +0100)]
eventdev/eth_rx: fix telemetry Rx stats reset
Caught by covscan:
1. dpdk-21.11/lib/eventdev/rte_event_eth_rx_adapter.c:3279:
logical_vs_bitwise: "~(*__ctype_b_loc()[(int)*params] & 2048 /*
(unsigned short)_ISdigit */)" is always 1/true regardless of the values
of its operand. This occurs as the logical second operand of "||".
2. dpdk-21.11/lib/eventdev/rte_event_eth_rx_adapter.c:3279: remediation:
Did you intend to use "!" rather than "~"?
While isdigit return value should be compared as an int to 0,
prefer ! since all of this file uses this convention.
Fixes: 814d01709328 ("eventdev/eth_rx: support telemetry") Cc: stable@dpdk.org Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Raja Zidane [Thu, 12 May 2022 09:17:11 +0000 (12:17 +0300)]
net/mlx5: support ESP SPI match and RSS hash
In packets with ESP header, the inner IP will be encrypted, and
its fields cannot be used for RSS hashing. So, ESP packets
can be hashed only by the outer IP layer.
So, when using RSS on ESP packets, hashing may not be efficient,
because the fields used by the hash functions are only the outer IPs,
causing all traffic belonging to all tunnels between a given
pair of GWs to land on one core.
Adding the SPI hash field can extend the spreading of IPsec packets.
Signed-off-by: Raja Zidane <rzidane@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
When a meter with RSS action being used, there might be several
sub-flows using different sub-policies in the flow splitting stage.
If there's no green action, there's an error that will always use the
same sub-policy for all sub-flows, some resources will be
overwritten and cannot be released, leading assert during port close.
This patch fixes this issue by checking both green and yellow queue
index during getting a blank sub-policy, to avoid the incorrect
resource overwrite.
Fixes: b38a12272b3a ("net/mlx5: split meter color policy handling") Cc: stable@dpdk.org Signed-off-by: Shun Hao <shunh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Invoking bnxt_link_update_op() with wait_for_completion set would
result in the driver waiting for 10s in case the port link is down to
complete port initialization (dev_start_op()).
Change it by not waiting for the completion when invoking it in
dev_start_op()
In bnxt_free_all_filters(), all the filters attached to a vnic are removed.
But each of these filters hold a backreference ptr to the vnic and they
need to be reset to NULL now. Otherwise, during a normal testpmd quit, as
part of dev_close_op(), first bnxt_free_all_filters() is invoked in
dev_stop, followed by bnxt_free_filter_mem() from bnxt_uninit_resources(),
which finds a filter with a vnic back reference ptr and now
bnxt_hwrm_clean_up_l2_filter() also tries to remove the filter from the
vnic's filter list which was already done as part of
bnxt_free_all_filters().
Kalesh AP [Wed, 27 Apr 2022 14:58:19 +0000 (20:28 +0530)]
net/bnxt: recheck FW readiness if in reset process
If Firmware is still in reset process and returns the error
HWRM_ERR_CODE_HOT_RESET_PROGRESS, retry VER_GET command.
We have to do it in bnxt_handle_if_change_status().
Fixes: 0b533591238f ("net/bnxt: inform firmware about IF state changes") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 27 Apr 2022 14:58:18 +0000 (20:28 +0530)]
net/bnxt: fix link status when port is stopped
Driver forces link down during port stop. But device is not obliged
link down in certain scenarios, even when forced. In that case,
subsequent link queries returns link as up.
Fixed to return link status as down when port is stopped.
Driver is already doing that for VF/NPAR/MH functions.
Kalesh AP [Wed, 27 Apr 2022 14:58:17 +0000 (20:28 +0530)]
net/bnxt: force PHY update on certain configurations
Device is not obliged link down in certain scenarios, even
when forced. When FW does not allow any user other than the BMC
to shutdown the port, bnxt_get_hwrm_link_config() call always
returns link up. Force phy update always in that case,
else user configuration for speed/autoneg would not get applied
correctly.
Fixes: 7bc8e9a227cc ("net/bnxt: support async link notification") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 27 Apr 2022 14:58:16 +0000 (20:28 +0530)]
net/bnxt: fix speed autonegotiation
The "active_fec_signal_mode" in HWRM_PORT_PHY_QCFG response
does not return correct value till the link is up. Driver cannot
rely on active_fec_signal_mode while setting autoneg speed.
While setting autoneg speed, driver is currently checking only
"auto_link_speed_mask". Fixed to check "auto_pam4_link_speed_mask"
as well. Also, while setting auto mode and setting speed mask,
driver will have to set both NRZ and PAM4 mask.
Kalesh AP [Wed, 27 Apr 2022 14:58:15 +0000 (20:28 +0530)]
net/bnxt: avoid unnecessary endianness conversion
The "active_fec_signal_mode" in HWRM_PORT_PHY_QCFG response is uint8_t.
So no need of endianness conversion while parsing response.
Also, signal_mode is the first 4bits of "active_fec_signal_mode".
net/bnxt: handle queue stop during RSS flow create
The programming of the RSS table was not taking into account if
any of the queues in the set were stopped prior to the flow
creation, hence leading to a vnic RSS config cmd failure thrown by
the FW.
Fix by programming only the active queues in the RSS action queue
set.
Kalesh AP [Wed, 27 Apr 2022 14:58:13 +0000 (20:28 +0530)]
net/bnxt: check duplicate queue IDs
Currently driver does not have a check for duplicate queue ids.
User must either specify all Rx queues created or no queues in the
flow create command. Repeating the queue index is invalid.
Also, moved the check for invalid queue to the beginning of the function.
When an Rx queue is stopped and restarted, as part of that workflow,
for cards that have ring groups, we free and reallocate the ring group.
This new ring group is not communicated to the VNIC though via
HWRM_VNIC_CFG cmd.
Fix to issue HWRM_VNIC_CFG cmd on all adapters now in this scenario.
Fixes: ed0ae3502fc9 ("net/bnxt: update ring group after ring stop start") Cc: stable@dpdk.org Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Kalesh AP [Wed, 27 Apr 2022 14:58:11 +0000 (20:28 +0530)]
net/bnxt: fix RSS action
Specifying a subset of Rx queues created by the application in
the "flow create" command is invalid.
User must either specify all Rx queues created or no queues.
Also removed a wrong comment as RSS action will not be supported
if user or application specifies MARK or COUNT action.
'Count' action was never really implemented in the legacy/AFM model.
But there was some place holder code, remove it so that the user
will see a failure when a flow with 'count' action is being
created.
Kalesh AP [Wed, 27 Apr 2022 14:58:09 +0000 (20:28 +0530)]
net/bnxt: fix tunnel stateless offloads
The HW only supports tunnel header parsing globally for supported tunnel
types. When a function uses one default VNIC to receive both the tunnel
and non-tunnel packets, applying the same stateless offload operation to
both tunnel and non-tunnel packets can cause problems in certain scenarios.
To workaround these problems, the firmware advertises no tunnel header
parsing capabilities to the driver using the HWRM_FUNC_QCAPS.
The driver must check this flag setting and accordingly not advertise
tunnel packet stateless offload capabilities to the stack.
If the device supports VXLAN, GRE, IPIP and GENEVE tunnel parsing,
then reports RX_OFFLOAD_OUTER_IPV4_CKSUM, RX_OFFLOAD_OUTER_UDP_CKSUM
and TX_OFFLOAD_OUTER_IPV4_CKSUM in the Rx/Tx offload capabilities of
the device.
Also, advertise tunnel TSO capabilities based on FW support.
Fixes: 0a6d2a720078 ("net/bnxt: get device infos") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 27 Apr 2022 14:58:08 +0000 (20:28 +0530)]
net/bnxt: fix Rx configuration
We are currently not handling RX/RSS modes correctly.
After launching testpmd with multiple RXQs, if the user tries to set
the number of RXQs to 1, driver is not updating the "hash_type"
and "hash_mode" values of the VNICs. As a result, driver issues
bnxt_vnic_rss_configure() unnecessarily and the FW command fails.
Fixed bnxt_mq_rx_configure() to update VNIC RSS fields unconditionally.
Kalesh AP [Wed, 27 Apr 2022 14:58:06 +0000 (20:28 +0530)]
net/bnxt: fix device capability reporting
1. Added two functions bnxt_get_tx_port_offloads() and
bnxt_get_rx_port_offloads() to report the device
tx/rx offload capabilities to the application.
2. This avoids few duplicate code in the driver and make
VF-rep capability the same as VF.
3. This will help in selectively reporting offload capabilities
based on FW support.
Fixes: 0a6d2a720078 ("net/bnxt: get device infos") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>