Kalesh AP [Thu, 4 Mar 2021 09:07:28 +0000 (14:37 +0530)]
net/bnxt: fix VF info allocation
1. Rename bnxt_hwrm_alloc_vf_info()/bnxt_hwrm_free_vf_info() to
bnxt_alloc_vf_info()/bnxt_free_vf_info(), as they do not
issue any HWRM command to the firmware.
2. Fix missing unlock when memory allocation fails.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Thu, 4 Mar 2021 09:07:27 +0000 (14:37 +0530)]
net/bnxt: fix device readiness check
Fix HWRM_VER_GET command to handle DEV_NOT_RDY state.
Driver should fail probe if the device is not ready.
Conversely, the HWRM_VER_GET poll after reset can safely
retry until the existing timeout is exceeded.
Fixes: 804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Thu, 4 Mar 2021 09:07:26 +0000 (14:37 +0530)]
net/bnxt: check flush status during ring free
When host SW issues a HWRM_RING_FREE for Tx/Rx/AGG ring in HW,
the FW flushes the BDs associated with the ring and performs other
cleanup in the HW. The host software should ideally check for an
indication from the FW indicating this step has been completed
successfully to avoid unexpected errors during cleanup.
The FW issues a HWRM_DONE response to the RING_FREE request on
the corresponding CQ ring. Poll the CQs during cleanup and
ensure the HWRM_FREE command is completed not just based on the
value of valid bit but also the HWRM_DONE response for the ring.
If the HWRM_DONE response is not seen, force the cleanup to
complete just based on the valid bit.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Lance Richardson [Tue, 2 Mar 2021 15:16:08 +0000 (10:16 -0500)]
net/bnxt: refactor mbuf pointer reset
Remove code for setting consumed mbuf pointers to NULL from the
vector receive functions as a minor performance optimization.
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Lance Richardson [Tue, 2 Mar 2021 14:29:27 +0000 (09:29 -0500)]
net/bnxt: fix Rx descriptor status
Fix a number of issues in the bnxt receive descriptor status
function, including:
- Provide status of receive descriptor instead of completion
descriptor.
- Remove invalid comparison of raw ring index with masked ring
index.
- Correct misinterpretation of offset parameter as ring index.
- Correct misuse of completion ring index for mbuf ring (the
two rings have different sizes).
Fixes: 0fe613bb87b2 ("net/bnxt: support Rx descriptor status")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Andy Gospodarek <gospo@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:53 +0000 (21:25 +0530)]
net/bnxt: fix PTP support for Thor
On Thor, Rx timestamp is present in the Rx completion record.
Only 32 bits of the timestamp are present in the completion.
The driver needs to periodically poll the current 48 bit
free running timer using the HWRM_PORT_TS_QUERY command.
It can combine the upper 16 bits from the HWRM response
with the lower 32 bits in the Rx completion to produce
the 48 bit timestamp for the Rx packet.
This patch adds an alarm thread to periodically poll the current 48 bit
free running timer using the HWRM_PORT_TS_QUERY command.
This avoids issuing the hwrm command from the Rx handler.
This patch also handles the timer roll over condition.
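The combination step described above can be sketched as follows. This is a minimal illustration, not the driver's code: the function and macro names are hypothetical, and it assumes the packet was stamped after the last poll and less than 2^32 ticks later.

```c
#include <assert.h>
#include <stdint.h>

#define TS_MASK_48    ((1ULL << 48) - 1)
#define TS_UPPER16_48 0x0000FFFF00000000ULL

/*
 * Hypothetical helper: rebuild a 48-bit Rx timestamp from the last polled
 * 48-bit timer value and the 32 bits carried in the Rx completion.
 */
static uint64_t
rx_timestamp_48(uint64_t last_polled, uint32_t cmpl_ts)
{
	uint64_t ts;

	assert(last_polled <= TS_MASK_48);

	/* Upper 16 bits from the polled timer, lower 32 from the completion. */
	ts = (last_polled & TS_UPPER16_48) | cmpl_ts;

	/* Lower half went backwards: the 32-bit part rolled over since the poll. */
	if (cmpl_ts < (uint32_t)last_polled)
		ts += 1ULL << 32;

	/* Wrap at 48 bits, like the free-running timer. */
	return ts & TS_MASK_48;
}
```

The roll-over check only works if the poll interval is much shorter than the time the lower 32 bits take to wrap, which is why the polling has to be periodic.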
Fixes: 6cbd89f9f3d8 ("net/bnxt: support PTP for Thor")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:52 +0000 (21:25 +0530)]
net/bnxt: fix FW readiness check during recovery
Moved fw readiness check to a new routine bnxt_check_fw_ready().
During error recovery, the driver needs to wait for firmware readiness.
For that, it currently uses bnxt_hwrm_ver_get(), which also
parses the VER_GET response.
Added a new lightweight function bnxt_hwrm_poll_ver_get() for polling
the firmware readiness which issues VER_GET and checks for success
without processing the command response.
Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:51 +0000 (21:25 +0530)]
net/bnxt: fix firmware fatal error handling
During some fatal firmware error conditions, the PCI config space
register 0x2e which normally contains the subsystem ID will become
0xffff. This register will revert back to the normal value after
the chip has completed core reset. If we detect this condition,
we can poll this config register immediately for the value to revert.
Because we use config read cycles to poll this register, there is no
possibility of Master Abort if we happen to read it during core reset.
This speeds up recovery significantly as we don't have to wait for the
conservative min_time before polling to see if the firmware has come
out of reset. As soon as this register changes value we can proceed
to re-initialize the device.
Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:50 +0000 (21:25 +0530)]
net/bnxt: handle echo request async message
This is a new async message that the firmware can send to check if it
can communicate with the driver. This is an added error detection
scheme that firmware can use if it suspects errors in the PCIe
interface. When the driver receives this async message, it will reply
back echoing some data in the async message. If the firmware is not
getting the reply with the proper data after some retries, error
recovery will kick in.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:49 +0000 (21:25 +0530)]
net/bnxt: log port id in async events
1. Used port id in async event logs.
2. Added a debug log in bnxt_hwrm_func_driver_unregister().
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Wed, 24 Feb 2021 15:55:48 +0000 (21:25 +0530)]
net/bnxt: update to new version of backing store
Update HWRM headers to version 1.10.2.15
which updates the backing store API for additional TQM rings.
Add support for 9th TQM ring using latest firmware interface.
Also make sure that we set only necessary bits in the enables
field in backing store request.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:47 +0000 (21:25 +0530)]
net/bnxt: update HWRM structures
Brought in the latest hsi_struct_def_dpdk.h.
HWRM API is now updated to version 1.10.2.15.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Venkat Duvvuru [Wed, 24 Feb 2021 15:55:46 +0000 (21:25 +0530)]
net/bnxt: fix queues per VNIC
Update queues per VNIC in single queue mode.
bp->rx_num_qs_per_vnic is not initialized in the single queue mode.
As a result, when an interface is reconfigured to single
queue mode from an existing multiqueue mode, bp->rx_num_qs_per_vnic
is not updated to the value of 1. Hence, the driver will try to
access more than one queue resulting in a crash.
This patch fixes it by initializing bp->rx_num_qs_per_vnic in the
single queue mode as well.
Fixes: 36024b2e7fe5 ("net/bnxt: allow dynamic creation of VNIC")
Cc: stable@dpdk.org
Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:45 +0000 (21:25 +0530)]
net/bnxt: remove extra blank line
Removed an unnecessary extra blank line.
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:44 +0000 (21:25 +0530)]
net/bnxt: fix VNIC configuration
PMD should not set any flags to receive RoCE traffic while
configuring the VNIC. Since the PMD does not support RoCE,
some of the flags and code are unused. Clean them up.
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kalesh AP [Wed, 24 Feb 2021 15:55:43 +0000 (21:25 +0530)]
net/bnxt: remove unused macro
Remove the unused HWRM_SEQ_ID_INVALID macro.
Fixes: 804e746c7b73 ("net/bnxt: add hardware resource manager init code")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Wed, 10 Mar 2021 18:31:01 +0000 (10:31 -0800)]
devtools: add acronyms in dictionary for commit checks
Update word list with VNIC and Thor to catch errors in patch title.
Suggested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Andrew Rybchenko [Thu, 11 Mar 2021 10:00:40 +0000 (13:00 +0300)]
net/sfc: update copyright year
Bump copyright year to 2021.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Andrew Rybchenko [Thu, 11 Mar 2021 10:00:39 +0000 (13:00 +0300)]
common/sfc_efx: update copyright year
Bump copyright year to 2021.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Andrew Rybchenko [Thu, 11 Mar 2021 10:48:35 +0000 (13:48 +0300)]
app/testpmd: log reason of port start failure
Provide a bit more diagnostic information when port start fails.
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Ivan Malov [Mon, 8 Mar 2021 06:51:04 +0000 (09:51 +0300)]
net: fix comment in IPv6 header
The comment got it wrong. The payload length field
does not include the fixed IPv6 header size.
Fixes: 7eca7f7fd09d ("net: add missing endianness annotations")
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Igor Russkikh [Wed, 10 Mar 2021 08:16:31 +0000 (09:16 +0100)]
maintainers: update for qede
Removing Shahed, adding me and Devendra
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Acked-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:16 +0000 (07:48 +0000)]
net/af_xdp: prefer busy polling
This commit introduces support for preferred busy polling
to the AF_XDP PMD. This feature aims to improve single-core
performance for AF_XDP sockets under heavy load.
A new vdev arg is introduced called 'busy_budget' whose default
value is 64. busy_budget is the value supplied to the kernel
with the SO_BUSY_POLL_BUDGET socket option and represents the
busy-polling NAPI budget. To set the budget to a different value,
e.g. 256:
--vdev=net_af_xdp0,iface=eth0,busy_budget=256
Preferred busy polling is enabled by default provided a kernel with
version >= v5.11 is in use. To disable it, set the budget to zero.
The following settings are also strongly recommended to be used in
conjunction with this feature:
echo 2 | sudo tee /sys/class/net/eth0/napi_defer_hard_irqs
echo 200000 | sudo tee /sys/class/net/eth0/gro_flush_timeout
.. where eth0 is the interface being used by the PMD.
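At the socket level, preferred busy polling comes down to a few setsockopt() calls. The sketch below is not the PMD's code: the helper name and the busy_timeout_us parameter are assumptions, and the fallback option values are the generic Linux UAPI ones, which differ on a few architectures.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Fallbacks for older headers (generic asm-generic/socket.h values). */
#ifndef SO_BUSY_POLL
#define SO_BUSY_POLL 46
#endif
#ifndef SO_PREFER_BUSY_POLL
#define SO_PREFER_BUSY_POLL 69
#endif
#ifndef SO_BUSY_POLL_BUDGET
#define SO_BUSY_POLL_BUDGET 70
#endif

/*
 * Hypothetical helper: enable preferred busy polling on socket fd.
 * busy_timeout_us is an assumed parameter (SO_BUSY_POLL, microseconds).
 * Returns 0 on success or -errno for the first option that fails.
 */
static int
enable_preferred_busy_poll(int fd, int busy_timeout_us, int busy_budget)
{
	int prefer = 1;

	assert(busy_budget > 0);

	if (setsockopt(fd, SOL_SOCKET, SO_PREFER_BUSY_POLL,
		       &prefer, sizeof(prefer)) < 0)
		return -errno;
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL,
		       &busy_timeout_us, sizeof(busy_timeout_us)) < 0)
		return -errno;
	if (setsockopt(fd, SOL_SOCKET, SO_BUSY_POLL_BUDGET,
		       &busy_budget, sizeof(busy_budget)) < 0)
		return -errno;
	return 0;
}
```

On kernels older than v5.11 the first call is expected to fail, which a caller can treat as "feature unavailable" rather than a fatal error.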
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:15 +0000 (07:48 +0000)]
net/af_xdp: use recvfrom instead of poll syscall
poll() is more expensive and requires more tuning
when used with the upcoming busy polling functionality.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ciara Loftus [Wed, 10 Mar 2021 07:48:14 +0000 (07:48 +0000)]
net/af_xdp: allow bigger batch sizes
Prior to this commit, the maximum batch sizes for zero-copy and
copy-mode rx and copy-mode tx were set to 32. Apart from zero-copy tx,
the user could never rx/tx any more than 32 packets at a time and
without inspecting the code the user wouldn't be aware of this.
This commit removes these upper limits placed on the user and instead
sets an internal batch size equal to the default ring size (2048).
Batches larger than this are still processed, however they are split
into smaller batches similar to how it's done in other drivers. This is
necessary because some arrays used during rx/tx need to be sized at
compile-time.
Allowing a larger batch size allows for fewer batches and thus larger
bulk operations, fewer ring accesses and fewer syscalls which should
yield improved performance.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Thomas Monjalon [Tue, 16 Mar 2021 22:09:38 +0000 (23:09 +0100)]
bus/pci: fix Windows kernel driver categories
In Windows probing, the value RTE_PCI_KDRV_NONE was used
instead of RTE_PCI_KDRV_UNKNOWN.
This value covers the mlx case where the kernel driver is in place,
offering a bifurcated mode to the userspace driver.
When the kernel driver is listed as unknown,
there is no special treatment in DPDK probing, contrary to UIO modes.
The value RTE_PCI_KDRV_NIC_UIO (FreeBSD) was re-used
instead of having a new RTE_PCI_KDRV_NET_UIO for Windows NetUIO.
While adding the new value RTE_PCI_KDRV_NET_UIO
(at the end for ABI compatibility),
the enum of kernel driver categories is annotated.
Fixes: b762221ac24f ("bus/pci: support Windows with bifurcated drivers")
Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Thomas Monjalon [Wed, 17 Mar 2021 14:34:02 +0000 (15:34 +0100)]
eal: mark version parts API as experimental
Some functions were introduced in DPDK 21.05 to query the version parts
(prefix, year, month, minor, suffix, release) at runtime.
Per guidelines, these new public functions must be marked with
__rte_experimental and ABI versioned as EXPERIMENTAL.
Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Thomas Monjalon [Wed, 17 Mar 2021 09:18:22 +0000 (10:18 +0100)]
eal: fix version macro
The macro RTE_VERSION has been broken since it was updated to use function calls.
It is a build-time version number, and must be built with macros.
For a run-time version number, there is the function rte_version().
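Why a build-time version number must be made of macros can be shown with a small sketch. All DEMO_ names and the version values are hypothetical stand-ins for a library's version header.

```c
#include <assert.h>

/* Hypothetical version parts. */
#define DEMO_VER_YEAR    21
#define DEMO_VER_MONTH   5
#define DEMO_VER_MINOR   0
#define DEMO_VER_RELEASE 1

#define DEMO_VERSION_NUM(a, b, c, d) ((a) << 24 | (b) << 16 | (c) << 8 | (d))
#define DEMO_VERSION DEMO_VERSION_NUM(DEMO_VER_YEAR, DEMO_VER_MONTH, \
				      DEMO_VER_MINOR, DEMO_VER_RELEASE)

/* Compiles only because DEMO_VERSION is an integer constant expression. */
static_assert(DEMO_VERSION == 0x15050001, "packed as year/month/minor/release");

/* The preprocessor cannot call functions, so this needs pure macros. */
#if DEMO_VERSION >= DEMO_VERSION_NUM(21, 5, 0, 0)
#define DEMO_HAS_NEW_API 1
#else
#define DEMO_HAS_NEW_API 0
#endif

static int
demo_has_new_api(void)
{
	return DEMO_HAS_NEW_API;
}
```

If DEMO_VERSION expanded to a function call instead, both the static_assert and the #if comparison would fail to compile.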
Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: David Marchand <david.marchand@redhat.com>
Hemant Agrawal [Tue, 23 Feb 2021 06:14:12 +0000 (11:44 +0530)]
examples/ptpclient: enable Rx timestamp offload
This patch adds support to enable Rx offload for timestamp.
It is required to be enabled for some PMDs e.g. dpaa2.
Signed-off-by: Gagandeep Singh <g.singh@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Kirill Rybalchenko <kirill.rybalchenko@intel.com>
Anatoly Burakov [Wed, 3 Mar 2021 12:35:13 +0000 (12:35 +0000)]
doc: update power management in doxygen API index
The headers rte_power_intrinsics.h and rte_power_pmd_mgmt.h
were missing from the doxygen API index.
Fixes: cda57d9388c0 ("eal: add power management intrinsics")
Fixes: 682a645438c5 ("power: add ethdev power management")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Khoa To [Mon, 1 Mar 2021 06:52:41 +0000 (22:52 -0800)]
bus/pci: support allow/block lists on Windows
EAL -a and -b options are used to specify which PCI devices are
explicitly allowed or blocked during PCI bus scan. This evaluation
is missing in the Windows implementation of rte_pci_scan.
This patch provides this missing functionality, so that apps can specify
which devices to ignore during PCI bus scan.
Signed-off-by: Khoa To <khot@microsoft.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Nick Connolly [Mon, 1 Mar 2021 09:56:44 +0000 (09:56 +0000)]
bus/pci: set Windows device class and bus
Attaching to an NVMe disk on Windows using SPDK requires the
PCI class ID and device.bus fields. Decode the class ID from the PCI
device info strings if it is present and set device.bus.
Signed-off-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Pallavi Kadam [Wed, 10 Feb 2021 20:36:54 +0000 (12:36 -0800)]
bus/pci: skip probing some Windows NDIS devices
Implement rte_pci_map_device() to distinguish between the devices bound
to netuio and NDIS devices.
Only return success for the netuio devices.
Fixes: c76ec01b4591 ("bus/pci: support netuio on Windows")
Cc: stable@dpdk.org
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tested-by: Narcisa Vasile <navasile@linux.microsoft.com>
Tal Shnaiderman [Thu, 18 Feb 2021 11:40:58 +0000 (13:40 +0200)]
eal/windows: fix default thread priority
The hard-coded thread priority for Windows threads in EAL
is REALTIME_PRIORITY_CLASS/THREAD_PRIORITY_TIME_CRITICAL.
This results in issues with DPDK threads causing OS thread starvation
and eventually a bugcheck.
The fix reduces the thread priority to
NORMAL_PRIORITY_CLASS/THREAD_PRIORITY_NORMAL.
Bugzilla ID: 600
Fixes: 53ffd9f080f ("eal/windows: add minimum viable code")
Cc: stable@dpdk.org
Reported-by: Odi Assli <odia@nvidia.com>
Signed-off-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Dmitry Kozlyuk [Sat, 27 Feb 2021 20:32:01 +0000 (23:32 +0300)]
eal/windows: add missing SPDX license tag
Fixes: c08bd191b13d ("eal/windows: initialize hugepage info")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Nick Connolly <nick.connolly@mayadata.io>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
Jie Zhou [Mon, 8 Mar 2021 18:05:39 +0000 (10:05 -0800)]
metrics: export telemetry stubs if no libjansson
This patch allows the same set of rte_metrics_tel_* functions to be
exported whether or not Jansson is available, by doing the following:
1. Leverage dpdk_conf to set configuration flag RTE_HAS_JANSSON
when Jansson dependency is found.
2. In rte_metrics_telemetry.c, leverage RTE_HAS_JANSSON to handle the
case when JANSSON is not available by adding stubs for all the instances.
3. In meson.build, per dpdk/doc/guides/rel_notes/release_20_05.rst,
it is claimed that "Telemetry library is no longer dependent on the
external Jansson library, which allows Telemetry be enabled by default.",
thus make the deps and includes of Telemetry as not conditional anymore.
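The stub pattern from step 2 can be sketched as below. DEMO_HAS_JANSSON stands in for the RTE_HAS_JANSSON flag, the function name is hypothetical, and the -ENOTSUP return value of the stub is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same prototype is exported in both configurations. */
int demo_metrics_tel_encode(const char *name, char **out);

#ifdef DEMO_HAS_JANSSON
#include <jansson.h>

/* Real implementation: encode {"name": <name>} as a JSON string. */
int
demo_metrics_tel_encode(const char *name, char **out)
{
	json_t *root;

	assert(name != NULL && out != NULL);

	root = json_pack("{s:s}", "name", name);
	if (root == NULL)
		return -ENOMEM;
	*out = json_dumps(root, 0);
	json_decref(root);
	return *out != NULL ? 0 : -ENOMEM;
}
#else
/* Stub: the symbol still exists, but reports the feature as unsupported. */
int
demo_metrics_tel_encode(const char *name, char **out)
{
	(void)name;
	(void)out;
	return -ENOTSUP;
}
#endif
```

Because the symbol is always present, an export list can stay identical across builds, and callers find out at run time whether the feature is available.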
Signed-off-by: Jie Zhou <jizh@microsoft.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Ferruh Yigit [Tue, 9 Feb 2021 15:06:20 +0000 (15:06 +0000)]
log/linux: make default output stderr
In Linux by default DPDK log goes to stdout, as well as syslog.
It is possible for an application to change the library output stream
via the 'rte_openlog_stream()' API. To set it to stderr, it can be used as:
rte_openlog_stream(stderr);
Nevertheless, change the default log output to 'stderr'.
Bugzilla ID: 8
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Reported-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Juraj Linkeš [Thu, 11 Feb 2021 12:59:46 +0000 (13:59 +0100)]
build: support KNI cross-compilation
The KNI linux module is using a custom target for building, which
doesn't take into account any cross compilation arguments. The arguments
in question are ARCH, CROSS_COMPILE (for gcc, clang) and CC, LD (for
clang). Get those from the cross file and pass them to the custom
target.
The user supplied path may not contain the 'build' directory, such as
when using cross-compiled headers, so only append that in the default
case (when no path is supplied in native builds) and use the unmodified
path from the user otherwise. Also modify the install path accordingly.
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Tue, 16 Feb 2021 15:13:29 +0000 (15:13 +0000)]
eal: fix querying DPDK version at runtime
When using a dynamically linked DPDK application, such as OVS, the
reported DPDK version should always be the version actually in use, not the
version used at build time. This incorrect behaviour can be seen by
building OVS against one version of DPDK and running it against a later
one. Using "ovs-vsctl list Open_vSwitch" to query basic info, the
dpdk_version returned will be the build version not the currently running
one - which can be verified using the DPDK telemetry library client.
$ sudo ovs-vsctl list Open_vSwitch | grep dpdk_version
dpdk_version : "DPDK 20.11.0-rc4"
$ echo quit | sudo dpdk-telemetry.py
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.02.0-rc2", "pid": 405659, "max_output_len": 16384}
-->
To fix this, we need to convert the rte_version() function, and any other
necessary parts of the rte_version.h, to be actual functions in EAL, not
just inlines/macros. The only complication in doing so is that telemetry
library cannot call rte_version() directly, and instead needs the version
string passed in on init.
Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Bruce Richardson [Fri, 12 Mar 2021 14:56:05 +0000 (14:56 +0000)]
build: exclude meson files from examples installation
The meson.build file in each example directory is there simply to support
building the example as part of the main SDK build, and these should not
be installed with the example's source code and makefile. The exclude of
"meson.build" only filters out the top-level examples/meson.build file,
not the file in each subdirectory.
To fix this, we can build up the list of files to exclude based off the
list of all examples. With this change "find examples/ -name meson.build"
returns no hits when run on an installed instance.
Fixes: e5b95003f1df ("examples: fix flattening directory layout on install")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Huawei Xie [Wed, 10 Mar 2021 17:36:30 +0000 (01:36 +0800)]
bus/pci: support MMIO for ioport
With I/O BAR, we get PIO (port-mapped I/O) address.
With MMIO (memory-mapped I/O) BAR, we get mapped virtual address.
We distinguish PIO and MMIO by their address range, as the kernel does,
i.e., an address below 64K is PIO.
ioread/write8/16/32 is provided to access PIO/MMIO.
Note that for virtio on architectures other than x86, the BAR flag
indicates PIO but the BAR is memory-mapped.
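The address-range test can be sketched as follows. The names are hypothetical, and only the MMIO path is shown, since port I/O needs architecture-specific instructions (such as inb on x86).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PIO_LIMIT 0x10000u /* 64K: addresses below this are treated as ports */

/* True when the address lies in the port I/O range. */
static bool
addr_is_pio(uintptr_t addr)
{
	return addr < PIO_LIMIT;
}

/* 8-bit read for the MMIO case only: a plain volatile load. */
static uint8_t
mmio_read8(const volatile void *addr)
{
	assert(!addr_is_pio((uintptr_t)addr));
	return *(const volatile uint8_t *)addr;
}
```

A full accessor would branch on addr_is_pio() and issue a port read for the low range, so callers need not know which kind of BAR they were given.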
Signed-off-by: Huawei Xie <huawei.xhw@alibaba-inc.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Huawei Xie [Wed, 10 Mar 2021 17:36:29 +0000 (01:36 +0800)]
bus/pci: use Linux PCI sysfs to get PIO address
Currently virtio PMD assumes legacy device uses PIO bar.
There are three ways to get PIO (port-mapped I/O) address for virtio
legacy device.
1) under igb_uio
- get PIO address from uio/uio# sysfs attribute, for instance:
/sys/bus/pci/devices/0000:00:09.0/uio/uio0/portio/port0/start
2) under uio_pci_generic
- for X86, get PIO address from /proc/ioports
- for other ARCH, get PIO address from standard PCI sysfs attribute,
for instance: /sys/bus/pci/devices/0000:00:09.0/resource
Actually, "port0/start" in igb_uio and "resource" point to exactly the
same thing, i.e., pci_dev->resource[0] in kernel source code.
This patch removes these special cases and uses the standard PCI sysfs
attribute "resource" in all of them.
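Each line of the sysfs "resource" file holds three hexadecimal values: start, end and flags. A minimal parsing sketch, with hypothetical struct and function names, could look like this; the flag values are those of the kernel's include/linux/ioport.h.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define IORESOURCE_IO  0x00000100
#define IORESOURCE_MEM 0x00000200

struct pci_res {
	uint64_t start;
	uint64_t end;
	uint64_t flags;
};

/* Parse one "start end flags" line; returns 0 on success, -1 otherwise. */
static int
parse_resource_line(const char *line, struct pci_res *res)
{
	assert(line != NULL && res != NULL);

	/* The %x conversions accept the leading "0x" used by sysfs. */
	if (sscanf(line, "%" SCNx64 " %" SCNx64 " %" SCNx64,
		   &res->start, &res->end, &res->flags) != 3)
		return -1;
	return 0;
}
```

For a legacy virtio device, the first line describes BAR 0; if IORESOURCE_IO is set in its flags, the start value is the PIO address.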
Signed-off-by: Huawei Xie <huawei.xhw@alibaba-inc.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Satheesh Paul [Tue, 9 Feb 2021 10:01:13 +0000 (15:31 +0530)]
net/octeontx2: fix VLAN filter
This patch fixes incorrect MCAM key preparation when creating
MCAM entry to allow VLAN IDs after VLAN filtering is enabled on port.
Fixes: ba1b3b081edf ("net/octeontx2: support VLAN offloads")
Cc: stable@dpdk.org
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Satheesh Paul [Mon, 8 Feb 2021 13:01:57 +0000 (18:31 +0530)]
net/octeontx2: support flow API dump
Add support to dump hardware internal representation information of
rte flow to file.
Every flow rule added will be dumped in the below format.
MCAM Index:1881
Interface :NIX-RX (0)
Priority :1
NPC RX Action:0X00000000404001
ActionOp:NIX_RX_ACTIONOP_UCAST (1)
PF_FUNC: 0X400
RQ Index:0X004
Match Id:0000
Flow Key Alg:0
NPC RX VTAG Action:0X00000000008100
VTAG0:relptr:0
lid:0X1
type:0
Patterns:
NPC_PARSE_NIBBLE_CHAN:000
NPC_PARSE_NIBBLE_LA_LTYPE:LA_ETHER
NPC_PARSE_NIBBLE_LB_LTYPE:NONE
NPC_PARSE_NIBBLE_LC_LTYPE:LC_IP
NPC_PARSE_NIBBLE_LD_LTYPE:LD_TCP
NPC_PARSE_NIBBLE_LE_LTYPE:NONE
LA_ETHER, hdr offset:0, len:0X6, key offset:0X8,\
Data:0X4AE124FC7FFF, Mask:0XFFFFFFFFFFFF
LA_ETHER, hdr offset:0XC, len:0X2, key offset:0X4, Data:0XCA5A,\
Mask:0XFFFF
LC_IP, hdr offset:0XC, len:0X8, key offset:0X10,\
Data:0X0A01010300000000, Mask:0XFFFFFFFF00000000
LD_TCP, hdr offset:0, len:0X4, key offset:0X18, Data:0X03450000,\
Mask:0XFFFF0000
MCAM Raw Data :
DW0 :
0000CA5A01202000
DW0_Mask:
0000FFFF0FF0F000
DW1 :
00004AE124FC7FFF
DW1_Mask:
0000FFFFFFFFFFFF
DW2 :
0A01010300000000
DW2_Mask:
FFFFFFFF00000000
DW3 :
0000000003450000
DW3_Mask:
00000000FFFF0000
DW4 :
0000000000000000
DW4_Mask:
0000000000000000
DW5 :
0000000000000000
DW5_Mask:
0000000000000000
DW6 :
0000000000000000
DW6_Mask:
0000000000000000
Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
Jiawei Zhu [Mon, 1 Mar 2021 17:19:50 +0000 (12:19 -0500)]
net/mlx5: fix Rx segmented packets on mbuf starvation
The issue occurred if mbuf starvation happened
in the middle of segmented packet reception.
In such a situation, after releasing the segments of the
packet being received, the code did not advance the
consumer index to the next stride. This caused
the wrong segmented packet data to be received.
The possible error scenario:
- we assume segs_n is 4 and we are receiving 4
segments of multi-segment packet.
- we fail to allocate mbuf while receiving the 3rd segment,
and this frees the mbufs of the packet chain we have built.
There are the 1st and 2nd segments in the chain.
- the 1st and the 2nd segments of this stride of Rx queue
are filled up (in elts array) with the new allocated
mbufs and their data are random (the 3rd and 4th
segments still contain the valid data of the packet though).
- on the next iteration of stride processing we get
the wrong two segments of the multi-segment packet.
Hence, we should skip these mbufs in the stride and
we should advance the consumer index on loop exit.
Fixes: 15a756b63734 ("net/mlx5: fix possible NULL dereference in Rx path")
Cc: stable@dpdk.org
Signed-off-by: Jiawei Zhu <zhujiawei12@huawei.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xiaoyun Li [Tue, 2 Mar 2021 07:03:20 +0000 (15:03 +0800)]
net/i40e: fix IPv4 fragment offload
IPv4 fragment_offset mask was required to be 0 no matter what the
spec value was. But a zero mask means not caring about the fragment_offset
field, so both non-frag and frag packets would hit the rule.
But the actual fragment rules should be like the following:
Only non-fragment packets can hit Rule 1:
Rule 1: mask=0x3fff, spec=0
Only fragment packets can hit rule 2:
Rule 2: mask=0x3fff, spec=0x8, last=0x2000
This patch allows the above rules.
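The difference between a zero mask and Rule 1 can be checked with a small sketch. The helper is hypothetical and models only a plain spec/mask comparison on the fragment_offset field in host byte order, not the spec/last range of Rule 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IPV4_MF_FLAG     0x2000 /* "more fragments" flag */
#define IPV4_OFFSET_MASK 0x1fff /* 13-bit fragment offset */
#define IPV4_FRAG_MASK   0x3fff

static_assert(IPV4_FRAG_MASK == (IPV4_MF_FLAG | IPV4_OFFSET_MASK),
	      "mask covers the MF flag and the offset");

/* True when the masked field equals the masked spec. */
static bool
field_matches(uint16_t frag_off, uint16_t spec, uint16_t mask)
{
	return (frag_off & mask) == (spec & mask);
}
```

With mask=0x3fff and spec=0, a packet with only the DF flag set (0x4000) matches, while a first fragment (0x2000) or a later fragment with a non-zero offset does not; with mask=0 every packet matches.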
Fixes: 42044b69c67d ("net/i40e: support input set selection for FDIR")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Li <xiaoyun.li@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:31 +0000 (21:34 -0500)]
raw/ifpga: add miscellaneous APIs
The miscellaneous APIs below are used to implement an OPAE application.
1. rte_pmd_ifpga_get_pci_bus() get PCI bus ifpga driver registered.
2. rte_pmd_ifpga_partial_reconfigure() do partial reconfiguration.
3. rte_pmd_ifpga_cleanup() free software resources allocated by driver.
4. rte_pmd_ifpga_set_rsu_status() set status of rsu process.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:30 +0000 (21:34 -0500)]
raw/ifpga: add APIs to get FPGA information
Some information can be retrieved from the FPGA through the
APIs below:
1. rte_pmd_ifpga_get_property() get properties of FPGA (include BMC).
2. rte_pmd_ifpga_get_phy_info() get information of PHY connect to FPGA.
3. rte_pmd_ifpga_get_rsu_status() get status of rsu process.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Wei Huang [Wed, 3 Mar 2021 02:34:29 +0000 (21:34 -0500)]
raw/ifpga: add FPGA RSU APIs
RSU (Remote System Update) depends on secure manager which may be
different on various implementations, so a new secure manager device
is implemented for adapting such difference.
There are five APIs added:
1. rte_pmd_ifpga_get_dev_id() get raw device ID of ifpga device from PCI
address like 'Domain:Bus:Dev.Func'.
2. rte_pmd_ifpga_update_flash() update flash with specific image file.
3. rte_pmd_ifpga_stop_update() abort flash update process.
4. rte_pmd_ifpga_reboot_try() check current ifpga status and change it
to reboot status if it is idle.
5. rte_pmd_ifpga_reload() trigger full reconfiguration of ifpga device.
Signed-off-by: Wei Huang <wei.huang@intel.com>
Acked-by: Tianfei Zhang <tianfei.zhang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Beilei Xing [Wed, 24 Feb 2021 02:09:00 +0000 (10:09 +0800)]
net/i40evf: fix packet loss for X722
When Tx queue number is more than Rx queue number, and RSS is
enabled, there'll be packet loss with X722.
The root cause is that the lookup table is not configured correctly,
since it uses the VF's queue pair number instead of the Rx queue number.
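Filling the RSS lookup table from the Rx queue count can be sketched as follows. The function name and the 8-bit entry width are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: spread LUT entries round-robin over the configured
 * Rx queues, so no entry points at a queue that does not receive.
 */
static void
fill_rss_lut(uint8_t *lut, size_t lut_size, uint16_t nb_rx_queues)
{
	size_t i;

	assert(lut != NULL && nb_rx_queues > 0);

	for (i = 0; i < lut_size; i++)
		lut[i] = (uint8_t)(i % nb_rx_queues);
}
```

If the modulus were the queue pair count (for example 8 with only 4 Rx queues), half of the entries would steer packets to queues that are never polled, which shows up as packet loss.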
Fixes: 2da3ba746795 ("net/i40e: fix VF runtime queues RSS config")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Signed-off-by: Hengjian Zhang <hengjianx.zhang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:07 +0000 (10:54 +0800)]
net/ice: clean GTPU flow type for flow director
Currently, FDIR only supports GTPU outer fields in PF. Clean the
redundant GTPU inner info in flow type definition and align with
shared code.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:06 +0000 (10:54 +0800)]
net/ice: distinguish input set outer fields
Split input_set_mask into inner and outer parts. Use
input_set_mask_o for tunnel outer or non-tunnel input set.
input_set_mask_i is used for tunnel inner fields only.
Adjust indentation of ice_pattern_match_item list in switch, ACL, RSS
and FDIR for easy review.
For switch, ACL and RSS, only use
input_set_mask_o and set input_set_mask_i to none.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:05 +0000 (10:54 +0800)]
net/ice: refactor input set config
For tunnel or non-tunnel packets, the input set is in outer_input_set
and uses seg_tun[0]. seg_tun[1] is only used for tunnel inner fields.
This patch aligns the inner/outer input_set with seg_tun[] and
simplifies the code.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:04 +0000 (10:54 +0800)]
net/ice: refactor flow pattern parser
Distinguish inner/outer input_set and avoid too many nested
conditionals in each type's parser. input_set_o is used for
tunnel outer fields or non-tunnel fields, input_set_i is only
used for inner fields.
For GTPU, store the outer IP fields in inner part to align with
shared code behavior.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:03 +0000 (10:54 +0800)]
net/ice: refactor flow director filter structure
This patch uses input_set_o and input_set_i to distinguish inner/outer
input set. input_set_i is only used for inner fields.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Zhirun Yan [Tue, 2 Mar 2021 02:54:02 +0000 (10:54 +0800)]
net/ice: clean input set macro definition
Currently, the input set macros use 2 bits: one bit for protocol and
inner/outer, another bit for the src/dst field. But this cannot
distinguish a rule with inner and outer fields for a tunnel packet.
Redefine the input set macros to make this clear. Only use these two
bits for protocol and field. Ignore the redundant inner/outer info.
ICE_INSET_TUN_* is used by the switch module and should be removed after
the switch refactor.
Signed-off-by: Zhirun Yan <zhirun.yan@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:57 +0000 (15:23 +0800)]
net/ice/base: cleanup filter list on error
When ice_remove_vsi_lkup_fltr is called, a local copy of the VSI
filter list is created by calling ice_add_to_vsi_fltr_list. If any
issue occurs during creation of the VSI filter list, it is up to
the caller to free the already allocated memory. This patch ensures
proper memory deallocation in these cases.
Fixes:
c7dd15931183 ("net/ice/base: add virtual switch code")
Cc: stable@dpdk.org
Signed-off-by: Robert Malz <robertx.malz@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:56 +0000 (15:23 +0800)]
net/ice/base: fix uninitialized struct
One of the structs being used for ACL counter rules was allocated on
the stack and left uninitialized. Rather than depending on
undefined behavior around the .amount member during rule removal,
just leave a comment and initialize the struct to zero, as this is a
slow path call anyway. This bug could have caused silent failures
during counter removal.
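A minimal sketch of the zero-initialization pattern described above. The
struct acl_counters type, its fields, and make_removal_request are
hypothetical stand-ins and are not copied from the ice base code.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the counter descriptor; field names are
 * illustrative only. */
struct acl_counters {
	uint8_t type;
	uint8_t amount;
	uint16_t first_cntr;
};

/* Build a removal request on the stack. Zeroing first guarantees that
 * members the caller does not set (such as amount) have a defined value. */
static struct acl_counters
make_removal_request(uint16_t first_cntr)
{
	struct acl_counters cntrs;

	/* Slow path, so the cost of clearing the struct is negligible. */
	memset(&cntrs, 0, sizeof(cntrs));
	cntrs.first_cntr = first_cntr;
	return cntrs;
}
```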
Fixes:
f3202a097f12 ("net/ice/base: add ACL module")
Cc: stable@dpdk.org
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:55 +0000 (15:23 +0800)]
net/ice/base: update GTPU EH dummy packets for FDIR
Update GTPU EH dummy pkts for FDIR, including EH/DL/UL.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:54 +0000 (15:23 +0800)]
net/ice/base: update boost TCAM for DVM
Add code to update boost TCAM entries to enable DVM. This requires
enabling DVM entries and disabling SVM entries.
Signed-off-by: Dan Nowlin <dan.nowlin@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:53 +0000 (15:23 +0800)]
net/ice/base: mark ptype 2 as reserved
The entry for PTYPE 2 in the ice_ptype_lkup table incorrectly states
that this is an L2 packet with no payload. According to the datasheet,
this PTYPE is actually unused and reserved.
Modify the lookup entry to indicate this is an unused entry that is
reserved.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:52 +0000 (15:23 +0800)]
net/ice/base: fix payload indicator on ptype
The entry for PTYPE 90 indicates that the payload is layer 3. This does
not match the specification in the datasheet which indicates the packet
is a MAC, IPv6, UDP packet, with a payload in layer 4.
Fix the lookup table to match the data sheet.
Fixes:
64e9587d5629 ("net/ice/base: add structures for Rx/Tx queues")
Cc: stable@dpdk.org
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:51 +0000 (15:23 +0800)]
net/ice/base: support GTPU IP inner IPv6 for flow director
Support IPV4_GTPU with inner IPV6/UDP/TCP for FDIR.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:50 +0000 (15:23 +0800)]
net/ice/base: support switch filter (GTP tunnel+IP flow)
Enable support for advanced switch filter to satisfy match criteria
such as: GTP tunnel + Inner IPv4[6]
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:49 +0000 (15:23 +0800)]
net/ice/base: enable more GTPU inner L3 fields for FDIR
Add support for FDIR filter by GTPU inner L3 fields
(i.e., tos, ttl, proto).
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:48 +0000 (15:23 +0800)]
net/ice/base: expose link configuration error
Store the link_cfg_err byte in order to determine whether an unsupported
power configuration is preventing link establishment.
Signed-off-by: Jeb Cramer <jeb.j.cramer@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:47 +0000 (15:23 +0800)]
net/ice/base: enable GTPU inner L3/L4 for flow director
For FDIR, GTPU with inner L3/L4 layers should only support inner
L3/L4 addrs/ports, instead of outer fields. Thus, we use TUN offsets
for GTPU IP/EH to insert inner L3/L4 addrs/ports fields.
Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:46 +0000 (15:23 +0800)]
net/ice/base: indicate double reset solution restriction
Add capability which indicates double reset solution restriction.
Added "Post-update EMPR enabled" field to "Response Flags" field
(byte 19 in the response structure).
Signed-off-by: Amir Shay <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:45 +0000 (15:23 +0800)]
net/ice/base: support external device secure programming
External topology devices (e.g. PHYs) connected to a controller or to an
SoC might have a firmware engine within the device, and the firmware is
usually loaded from NVM connected to the topology device.
In some cases, those firmware packages might need to be regularly
updated in a secure way to prevent a malicious user from burning
malicious firmware into the topology device. In other cases, the
topology device firmware might be burned independently, as burning the
NVM attached to the device might cause the device to stop functioning
but could be fixed without permanent damage.
SoC topologies also enable a mezzanine card with an ID EEPROM
within it. This ID EEPROM might also need an update.
This patch provides these abilities.
Signed-off-by: Amir Shay <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Qi Zhang [Tue, 2 Mar 2021 07:23:44 +0000 (15:23 +0800)]
net/ice/base: support firmware log
Currently we do not provide a full end-to-end solution for system level
debug and diagnostics. The purpose of this change is to fill design and
implementation gaps to provide a full end-to-end (HW-FW-SW) diagnostic
solution. In addition to functional improvements, it will provide
feasible, user-friendly debug information.
Signed-off-by: Amir Shay <shay.amir@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
Dapeng Yu [Fri, 19 Feb 2021 10:03:23 +0000 (18:03 +0800)]
net/e1000: remove MTU setting limitation
Currently, if the requested MTU is bigger than the mbuf size and
scattered receive is not enabled, setting the MTU to that value fails.
This patch allows setting such an MTU when the device is stopped,
because scattered_rx will be re-configured during the next port start
and the driver may enable scattered receive according to the new MTU
value.
After this patch, the driver may automatically select a different
receive function after the MTU is set, according to the MTU value
selected.
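The relaxed check can be sketched as follows. The function
mtu_change_allowed and its parameters are hypothetical and only model
the rule stated in the message; they are not the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: returns true when the requested frame size may be
 * accepted. A frame larger than one mbuf needs scattered Rx, which is
 * re-configured at port start, so it is allowed while the port is
 * stopped. */
static bool
mtu_change_allowed(uint32_t frame_size, uint32_t mbuf_data_size,
		   bool scattered_rx, bool dev_started)
{
	if (frame_size <= mbuf_data_size)
		return true;	/* fits in a single mbuf */
	if (scattered_rx)
		return true;	/* multi-segment receive already active */
	return !dev_started;	/* defer the decision to the next port start */
}
```

Only the combination "started, no scattered Rx, frame larger than one
mbuf" is still rejected in this sketch.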
Fixes:
59d0ecdbf0e1 ("ethdev: MTU accessors")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Dapeng Yu [Fri, 19 Feb 2021 10:01:07 +0000 (18:01 +0800)]
net/igc: remove MTU setting limitation
Currently, if the requested MTU is bigger than the mbuf size and
scattered receive is not enabled, setting the MTU to that value fails.
This patch allows setting such an MTU when the device is stopped,
because scattered_rx will be re-configured during the next port start
and the driver may enable scattered receive according to the new MTU
value.
After this patch, the driver may automatically select a different
receive function after the MTU is set, according to the MTU value
selected.
Fixes:
a5aeb2b9e225 ("net/igc: support Rx and Tx")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Alvin Zhang [Fri, 19 Feb 2021 05:13:46 +0000 (13:13 +0800)]
net/ice: fix VLAN filter with PF
The macro flag DEV_RX_OFFLOAD_VLAN_FILTER is used to enable/disable
Rx VLAN filter, but not Tx VLAN filter. Therefore, Tx VLAN filter
should not be enabled/disabled in function ice_vsi_config_vlan_filter
called after checking DEV_RX_OFFLOAD_VLAN_FILTER flag.
In addition, the kernel driver doesn't enable/disable the Tx VLAN
filter in the similar function ice_cfg_vlan_pruning.
This patch removes the Tx VLAN filter setting from function
ice_vsi_config_vlan_filter.
Fixes:
e0dcf94a0d7f ("net/ice: support VLAN ops")
Cc: stable@dpdk.org
Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Tested-by: Zhimin Huang <zhiminx.huang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Alexander Kozyrev [Wed, 3 Mar 2021 22:19:04 +0000 (22:19 +0000)]
ethdev: document generic modify flow action
Field IDs for the MODIFY_FIELD action lack doxygen comments
and are not visible in the online DPDK documentation because of that.
Provide a meaningful description for every Field ID for the
rte_flow_field_id enumeration.
Fixes:
73b68f4c54a0 ("ethdev: introduce generic modify flow action")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Wed, 30 Sep 2020 11:02:40 +0000 (12:02 +0100)]
net/ring: support secondary process
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Ajit Khaparde [Fri, 5 Mar 2021 19:42:32 +0000 (11:42 -0800)]
app/testpmd: support forced ethernet speed
Add support for forced ethernet speed setting.
Currently testpmd tries to configure the Ethernet port in autoneg mode.
It is not possible to set the Ethernet port to a specific speed while
starting testpmd. In some cases capability to configure a forced speed
for the Ethernet port during initialization may be necessary. This patch
tries to add this support.
The patch assumes full duplex setting and does not attempt to change that.
So speeds like 10M, 100M are not configurable using this method.
The command line to configure a forced speed of 10G:
dpdk-testpmd -c 0xff -- -i --eth-link-speed 10000
The command line to configure a forced speed of 50G:
dpdk-testpmd -c 0xff -- -i --eth-link-speed 50000
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Hemant Agrawal [Fri, 5 Mar 2021 05:36:14 +0000 (11:06 +0530)]
doc: update release notes for dpaax
This patch updates the release notes for recently submitted changes.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Jiawen Wu [Fri, 5 Mar 2021 02:14:38 +0000 (10:14 +0800)]
net/txgbe: fix adding crypto SA
According to the register definition, the IPsec Rx IPv4 address should
be written in reg(0).
Fixes:
07cafb2adbc5 ("net/txgbe: add security session create operation")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Jiawen Wu [Fri, 5 Mar 2021 02:14:37 +0000 (10:14 +0800)]
net/txgbe: update packet type
Update the packet type lookup table according to the HW design.
Fix the bug that the inner L3 and L4 types cannot be parsed when
QinQ is inserted in a tunnel packet.
Fixes:
9e30b88f60b2 ("net/txgbe: support packet type")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Jiawen Wu [Fri, 5 Mar 2021 02:14:36 +0000 (10:14 +0800)]
net/txgbe: fix Rx missed packet counter
Add the Rx dropped packet counter into stats->imissed, to ensure the
stats are correct.
Fixes:
c9bb590d4295 ("net/txgbe: support device statistics")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Jiawen Wu [Fri, 5 Mar 2021 02:14:35 +0000 (10:14 +0800)]
net/txgbe: remove unused functions
Remove unused functions for EEPROM read and write.
Fixes:
35c90ecccfd4 ("net/txgbe: add EEPROM functions")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Vadim Podovinnikov [Wed, 17 Feb 2021 16:26:55 +0000 (16:26 +0000)]
net/bonding: fix LACP system address check
In a bond (LACP) we have several NICs (ports). When we negotiate with
the peer about which port we prefer, we send information about which
system we prefer in the partner system name field. The peer also sends
us the partner system name it prefers.
When we receive a message from the peer, we must compare its preferred
system name with our system name, not with our port MAC address.
In my test I had several problems with that:
1. If the master port (MAC address same as system address) shuts down (I
have two ports), I lose the connection.
2. If the secondary port (MAC address not same as system address)
receives a message before the master port, my connection is not
established.
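The corrected comparison can be sketched as below. The function
partner_prefers_us and the MAC_ADDR_LEN constant are hypothetical names
for illustration; the bonding driver's own structures are not shown.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MAC_ADDR_LEN 6

/* Hypothetical check: the peer's "partner system" field must match our
 * bonding system address, not the MAC address of the port that happened
 * to receive the LACPDU. */
static bool
partner_prefers_us(const uint8_t peer_partner_system[MAC_ADDR_LEN],
		   const uint8_t our_system_addr[MAC_ADDR_LEN])
{
	return memcmp(peer_partner_system, our_system_addr,
		      MAC_ADDR_LEN) == 0;
}
```

Comparing against the system address gives the same answer on every
member port, which is what both failure cases listed above depend on.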
Fixes:
56cbc0817399 ("net/bonding: fix LACP negotiation")
Cc: stable@dpdk.org
Signed-off-by: Vadim Podovinnikov <podovinnikov@protei.ru>
Acked-by: Min Hu (Connor) <humin29@huawei.com>
Chengchang Tang [Thu, 4 Mar 2021 07:44:54 +0000 (15:44 +0800)]
net/hns3: fix imprecise statistics
Currently, the hns3 statistics may be inaccurate due to the
following two problems:
1. Queue-level statistics are read from the firmware, and only one Rx or
Tx queue can be read at a time. This results in a large time interval
between reading the statistics of multiple queues in a stress scenario,
such as 1280 queues used by a PF or 256 functions used at the same time.
Especially when the 256 functions are used at the same time, the
interval between every two firmware commands in a function can be
huge, because the scheduling mechanism of the firmware is similar to
RR.
2. The current statistics are read by type. The HW statistics are read
first, and then the software statistics are read. For the preceding
reasons, HW reading may be time-consuming, which causes a
synchronization problem between SW and HW statistics of the same
queue.
In this patch, queue-level statistics are read directly from the BAR
instead of the firmware, and all the statistics of a queue, including HW
and SW, are read at the same time to reduce inconsistency.
Fixes:
8839c5e202f3 ("net/hns3: support device stats")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Hongbo Zheng [Thu, 4 Mar 2021 07:44:53 +0000 (15:44 +0800)]
net/hns3: process MAC interrupt
TNL is the abbreviation of tunnel, which means port
here. MAC TNL interrupt indicates the MAC status
report of the network port, which will be generated
when the MAC status changes.
This patch enables MAC TNL interrupt reporting, and
queries and prints the corresponding MAC status when
the interrupt is received, then clears the MAC interrupt
status. Because this interrupt uses the same interrupt
as RAS, the interrupt log is adjusted.
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Huisong Li [Thu, 4 Mar 2021 07:44:52 +0000 (15:44 +0800)]
net/hns3: fix mbuf leakage
The mbufs of the Rx queue are allocated in the "hns3_do_start" function.
But these mbufs are not released when "hns3_dev_start" fails.
Fixes:
c4ae39b2cfc5 ("net/hns3: fix Rx interrupt after reset")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Huisong Li [Thu, 4 Mar 2021 07:44:51 +0000 (15:44 +0800)]
net/hns3: remove unused parameter markers
All input parameters in the "hns3_dev_xstats_get_by_id" API are used,
so the rte_unused flag of some variables should be deleted.
Fixes:
3213d584b698 ("net/hns3: fix xstats with id and names")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Chengchang Tang [Thu, 4 Mar 2021 07:44:50 +0000 (15:44 +0800)]
net/hns3: fix HW buffer size on MTU update
After the MTU is changed, the buffer used to store packets in HW should
be reallocated. The buffer size is allocated based on the maximum frame
size in the PF struct. However, the value of the maximum frame size is
not updated in time when the MTU is changed. This leads to packet
loss due to insufficient buffer space.
This patch updates the maximum frame size before reallocating the HW
buffer. A rollback operation is also added to avoid the side effects
of buffer reallocation failures.
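The update-then-rollback ordering can be sketched as follows. The
struct pf_state type, alloc_hw_buffer, set_max_frame_size and the
hw_limit parameter are all hypothetical; they only model the sequence
described above, not the hns3 implementation.

```c
#include <stdint.h>

/* Hypothetical PF state; only the fields relevant to the sketch. */
struct pf_state {
	uint32_t max_frame_size;
	uint32_t buf_size;	/* frame size the HW buffer is sized for */
};

/* Hypothetical allocator: fails when the frame exceeds what HW can hold. */
static int
alloc_hw_buffer(struct pf_state *pf, uint32_t hw_limit)
{
	if (pf->max_frame_size > hw_limit)
		return -1;
	pf->buf_size = pf->max_frame_size;
	return 0;
}

static int
set_max_frame_size(struct pf_state *pf, uint32_t new_size, uint32_t hw_limit)
{
	uint32_t old_size = pf->max_frame_size;
	int ret;

	/* Update first so the allocator sizes buffers for the new MTU. */
	pf->max_frame_size = new_size;
	ret = alloc_hw_buffer(pf, hw_limit);
	if (ret != 0) {
		/* Roll back and restore buffers for the previous size. */
		pf->max_frame_size = old_size;
		(void)alloc_hw_buffer(pf, hw_limit);
	}
	return ret;
}
```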
Fixes:
1f5ca0b460cd ("net/hns3: support some device operations")
Fixes:
d51867db65c1 ("net/hns3: add initialization")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Chengwen Feng [Thu, 4 Mar 2021 07:44:49 +0000 (15:44 +0800)]
net/hns3: support Rx descriptor advanced layout
Currently, the driver gets the packet type by parsing the
L3_ID/L4_ID/OL3_ID/OL4_ID from the Rx descriptor and then looking up
multiple tables, which is time consuming.
Now Kunpeng930 supports an advanced RXD layout, which:
1. Combines OL3_ID/OL4_ID into an 8-bit PTYPE field, so the driver gets
the packet type by looking up only one table. Note: L3_ID/L4_ID become
reserved fields.
2. Places the 1588 timestamp in the Rx descriptor instead of querying it
from firmware.
3. Sets L3E/L4E/OL3E/OL4E to zero when L3L4P is zero, so the driver
can optimize the good checksum calculations (when L3E/L4E is zero,
mark PKT_RX_IP_CKSUM_GOOD/PKT_RX_L4_CKSUM_GOOD).
For compatibility, the firmware reports the capability of the
RXD advanced layout, and the driver identifies and enables it by default.
This patch only provides the basic function: identify and enable the RXD
advanced layout, and look up the ptype table if supported.
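A minimal sketch of the single-table lookup and the simplified checksum
marking. The names lookup_ptype, cksum_flags, PTYPE_TABLE_SIZE and the
CKSUM_* bits are hypothetical placeholders; the real driver uses the
mbuf PKT_RX_* flags and its own descriptor accessors.

```c
#include <stdint.h>

#define PTYPE_TABLE_SIZE 256	/* one entry per 8-bit PTYPE value */

/* Hypothetical flag bits standing in for the mbuf checksum flags. */
#define CKSUM_IP_GOOD (1u << 0)
#define CKSUM_L4_GOOD (1u << 1)

/* One lookup replaces the separate L3/L4/OL3/OL4 table walks. */
static uint32_t
lookup_ptype(const uint32_t table[PTYPE_TABLE_SIZE], uint8_t ptype)
{
	return table[ptype];
}

/* The error bits are zero whenever L3L4P is zero, so a zero bit maps
 * directly to "good" without checking L3L4P separately. */
static uint32_t
cksum_flags(uint8_t l3e, uint8_t l4e)
{
	uint32_t flags = 0;

	if (l3e == 0)
		flags |= CKSUM_IP_GOOD;
	if (l4e == 0)
		flags |= CKSUM_L4_GOOD;
	return flags;
}
```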
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Huisong Li [Thu, 4 Mar 2021 07:44:48 +0000 (15:44 +0800)]
net/hns3: support PF device with copper PHYs
The normal operation of devices with copper PHYs depends on the
initialization and configuration of the PHY chip. The task of
driving the PHY chip is implemented in some firmware versions.
If the firmware supports the PHY driver, it reports a capability
flag to the driver during probing. The driver determines whether
to support a PF device with copper PHYs based on the capability bit.
If supported, the driver sets a flag indicating that the firmware
takes over the PHY, and then the firmware initializes the PHY.
This patch supports the query of link status and link info, and
existing basic features for PF devices with copper PHYs.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Huisong Li [Thu, 4 Mar 2021 07:44:47 +0000 (15:44 +0800)]
net/hns3: fix device capabilities for copper media type
The configuration operation for the PHY is implemented by firmware, and
a capability flag is reported to the driver, which means the firmware
supports the PHY driver. However, the current implementation only
supports obtaining the capability bit; some basic functions of
copper ports in the driver, such as the query of link status and link
info, are not supported.
Therefore, the driver must set the copper capability
bit to zero when the firmware supports the configuration of the PHY.
Fixes:
438752358158 ("net/hns3: get device capability from firmware")
Fixes:
95e50325864c ("net/hns3: support copper media type")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Huisong Li [Thu, 4 Mar 2021 07:44:46 +0000 (15:44 +0800)]
net/hns3: encapsulate port shaping interface
When the rate of a port changes, the rate limit of the port needs to
be updated. So it is necessary to encapsulate an interface that
configures the rate limit based on the rate.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Min Hu (Connor) [Thu, 4 Mar 2021 07:44:45 +0000 (15:44 +0800)]
net/hns3: add imissed packet stats
This patch implements Rx imissed stats by querying cmdq.
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Min Hu (Connor) [Thu, 4 Mar 2021 07:44:44 +0000 (15:44 +0800)]
net/hns3: add bytes stats
In the current HNS3 PMD, Rx/Tx bytes from packet stats are not
implemented.
This patch implements Rx/Tx bytes using soft counters.
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Chengwen Feng [Thu, 4 Mar 2021 07:44:43 +0000 (15:44 +0800)]
net/hns3: implement Tx mbuf free on demand
This patch adds support for the tx_done_cleanup op, which backs the
rte_eth_tx_done_cleanup API to free consumed mbufs on the Tx ring.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Chengchang Tang [Thu, 4 Mar 2021 07:44:42 +0000 (15:44 +0800)]
net/hns3: add more registers to dump
This patch dumps more registers in the dump_reg API to help
locate faults.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Chengchang Tang [Thu, 4 Mar 2021 07:44:41 +0000 (15:44 +0800)]
net/hns3: support module EEPROM dump
This patch adds support for dumping module EEPROM.
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Matan Azrad [Thu, 25 Feb 2021 10:45:01 +0000 (10:45 +0000)]
net/mlx5: fix imissed statistics
The imissed port statistic counts packets that were dropped by the
device Rx queues.
In mlx5, the imissed counter summarizes 2 counters:
- packets dropped by the SW queue handling counted by SW.
- packets dropped by the HW queues due to "out of buffer" events
detected when no SW buffer is available for the incoming
packets.
There is a HW counter object that should be created per device, and all
the Rx queues should be assigned to this counter at configuration time.
This part was missed when the Rx queues were created by DevX, which
left the "out of buffer" counter at zero forever in this case.
Add 2 options to assign the DevX Rx queues to a queue counter:
- Create a queue counter per device by DevX and assign all the
queues to it.
- Query the kernel counter and assign all the queues to it.
Use the first option by default and, if it fails, fall back to the
second option.
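The counter composition and the option priority can be sketched as
below. The names imissed_total, struct counter_attempt and
select_counter_set are hypothetical; the sketch takes the outcome of
both options as inputs for simplicity, whereas a driver would only try
the second option after the first one fails.

```c
#include <stdint.h>

/* imissed combines the two drop sources described above. */
static uint64_t
imissed_total(uint64_t sw_dropped, uint64_t hw_out_of_buffer)
{
	return sw_dropped + hw_out_of_buffer;
}

/* Outcome of one hypothetical way to obtain a counter set ID;
 * ret is 0 on success. */
struct counter_attempt {
	int ret;
	uint32_t id;
};

/* Prefer the DevX-created counter, fall back to the kernel counter. */
static int
select_counter_set(struct counter_attempt devx,
		   struct counter_attempt kernel, uint32_t *id)
{
	if (devx.ret == 0) {
		*id = devx.id;
		return 0;
	}
	if (kernel.ret == 0) {
		*id = kernel.id;
		return 0;
	}
	return -1;	/* no counter available: "out of buffer" stays uncounted */
}
```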
Fixes:
e79c9be91515 ("net/mlx5: support Rx hairpin queues")
Fixes:
dc9ceff73c99 ("net/mlx5: create advanced RxQ via DevX")
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Matan Azrad [Thu, 25 Feb 2021 10:45:00 +0000 (10:45 +0000)]
common/mlx5: add DevX commands for queue counters
A queue counter set is a HW object that can be assigned to any RQ\QP
and it counts HW events on the assigned QPs\RQs.
Add DevX API to allocate and query a queue counter set object.
The only counter event used is "out of buffer", where the queue
drops packets when no SW buffer is available to receive them.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Matan Azrad [Thu, 25 Feb 2021 10:44:59 +0000 (10:44 +0000)]
common/mlx5: add DevX command to query WQ
Add a DevX command to query attributes of Rx queues created by VERBS.
Currently only the counter_set_id attribute is supported.
This counter ID is managed by the kernel driver and is assigned to
any queue created by the kernel.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Matan Azrad [Thu, 25 Feb 2021 10:44:58 +0000 (10:44 +0000)]
common/mlx5/linux: add glue function to query WQ
When an Rx queue is created by the VERBS API ibv_create_wq, there is a
dedicated rdma-core API to query information about this WQ (Work Queue).
VERBS WQ querying is needed for PMD cases which combine VERBS objects
with DevX objects.
The next feature to use this glue function is the HW queue counters.
Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>