Ivan Ilchenko [Fri, 23 Jul 2021 13:15:14 +0000 (16:15 +0300)]
net/sfc: add xstats for Rx/Tx doorbells
Rx/Tx doorbells statistics are collected in software and
available per queue. These stats are useful for performance
investigation.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:13 +0000 (16:15 +0300)]
net/sfc: prepare to add more xstats
Move getting MAC stats code that involves locking to separate functions
to simplify addition of new xstats.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:12 +0000 (16:15 +0300)]
net/sfc: simplify getting xstats count
There is no point to recalculate number of available xstats on
each request. The number is calculated once on device start
and may be returned on subsequent calls.
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:11 +0000 (16:15 +0300)]
net/sfc: fix MAC stats update for stopped device
Return the latest stats snapshot in stopped state
instead of returning an error.
Fixes:
1caab2f1e68 ("net/sfc: add basic statistics")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:10 +0000 (16:15 +0300)]
net/sfc: fix xstats query by unsorted list of IDs
Device may support only some MAC stats. Add mapping from ids to subset
of supported MAC stats for each port.
Fixes:
73280c1e4ff ("net/sfc: support xstats retrieval by ID")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:09 +0000 (16:15 +0300)]
net/sfc: fix xstats query by ID according to ethdev
Fix xstats by ID callbacks according to ethdev usage.
Handle combinations of input arguments that are required by ethdev
and sanity check and reject other combinations on callback entry.
Fixes:
73280c1e4ff ("net/sfc: support xstats retrieval by ID")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:06 +0000 (16:15 +0300)]
net/sfc: fix reading adapter state without locking
Update MAC stats function reads adapter state with MAC stats locking
but without adapter locking. Add adapter locking before calling this
function and remove MAC stats locking since there's no point to have
it together with adapter locking. The second place MAC stats locking
is used is MAC stats reset function. It's called with adapter being
already locked so there's no point to use MAC stats locking anymore.
Fixes:
1caab2f1e68 ("net/sfc: add basic statistics")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Ivan Ilchenko [Fri, 23 Jul 2021 13:15:05 +0000 (16:15 +0300)]
net/sfc: fix MAC stats lock in xstats query by ID
Add MAC stats lock in xstats_get_by_id() callback before reading
number of supported MAC stats.
Fixes:
73280c1e4ff ("net/sfc: support xstats retrieval by ID")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Cheng Jiang [Fri, 23 Jul 2021 08:09:37 +0000 (08:09 +0000)]
examples/vhost: handle memory hotplug for async vhost
When the guest memory is hotplugged, the vhost application which
enables DMA acceleration must stop DMA transfers before the vhost
re-maps the guest memory.
To accomplish that, we need to do these changes in the vhost sample:
1. add inflight packets count.
2. add vring_state_changed() callback.
3. add inflight packets clear process in destroy_device() and
vring_state_changed().
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Fri, 23 Jul 2021 08:09:36 +0000 (08:09 +0000)]
vhost: handle memory hotplug for async vhost
When the guest memory is hotplugged, the vhost application which
enables DMA acceleration must stop DMA transfers before the vhost
re-maps the guest memory.
This patch is to notify the vhost application of stopping DMA
transfers.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Cheng Jiang [Fri, 23 Jul 2021 08:09:35 +0000 (08:09 +0000)]
vhost: add unsafe async API to clear packets
Applications need to stop DMA transfers and finish all the inflight
packets when in VM memory hot-plug case and async vhost is used. This
patch is to provide an unsafe API to clear inflight packets which
are submitted to DMA engine in vhost async data path. Update the
program guide and release notes for virtqueue inflight packets clear
API in vhost lib.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Cheng Jiang [Fri, 23 Jul 2021 08:09:34 +0000 (08:09 +0000)]
vhost: fix async callbacks return type
The async vhost callback ops should return negative value when there
are something wrong in the callback, so the return type should be
changed into int32_t. The issue in vhost example is also fixed.
Fixes:
cd6760da1076 ("vhost: introduce async enqueue for split ring")
Fixes:
819a71685826 ("vhost: fix async callback return type")
Fixes:
6b3c81db8bb7 ("vhost: simplify async copy completion")
Fixes:
abec60e7115d ("examples/vhost: support vhost async data path")
Fixes:
6e9a9d2a02ae ("examples/vhost: fix ioat dependency")
Fixes:
873e8dad6f49 ("vhost: support packed ring in async datapath")
Cc: stable@dpdk.org
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Hemant Agrawal [Mon, 19 Jul 2021 13:59:17 +0000 (19:29 +0530)]
doc: remove SDK info from DPAA2 drivers guides
The prerequisite info is already present in the platform guide.
No need to repeat it in individual dev guides.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Vanshika Shukla [Mon, 19 Jul 2021 13:59:16 +0000 (19:29 +0530)]
net/dpaa2: add some parameter validations
This patch adds validation of the port id for
rte_pmd_dpaa2_set_custom_hash API to check if the
port is a valid DPAA2 port. Also handles some
edge cases in the rte_pmd_dpaa2_mux_flow_create API.
Signed-off-by: Vanshika Shukla <vanshika.shukla@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Nipun Gupta [Mon, 19 Jul 2021 13:59:15 +0000 (19:29 +0530)]
net/dpaa2: add per-thread initialization API
DPAA2 hardware require a hardware portal context.
If a thread doing DPAA2 i/o do not have portal, it will
allocate it on run-time. This may cause a delay in the
datapath at run-time. To avoid it, it is better to allocate
a hw context portal at the start of thread expected to do
i/o with DPAA2 hardware.
This patch makes necessary changes for the same and creates
a pmd API to allocate a hw context portal for a thread.
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Rohit Raj [Mon, 19 Jul 2021 13:59:14 +0000 (19:29 +0530)]
net/dpaa: add check for parsing default Rx queue
Add check for the PCD queue from the kernel interface
for default and error queues.
Signed-off-by: Rohit Raj <rohit.raj@nxp.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Hemant Agrawal [Mon, 19 Jul 2021 13:59:13 +0000 (19:29 +0530)]
bus/dpaa: reduce thread ID syscall usage
Reuse DPDK rte_gettid instead of syscall.
It will help to reduce the dpaa portal allocation time.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Nipun Gupta [Mon, 19 Jul 2021 13:59:12 +0000 (19:29 +0530)]
net/dpaa: fix headroom in VSP case
This patch fixes providing the correct headroom size when
VSP is enabled.
Fixes:
e4abd4ff183c ("net/dpaa: support virtual storage profile")
Cc: stable@dpdk.org
Signed-off-by: Nipun Gupta <nipun.gupta@nxp.com>
Hemant Agrawal [Mon, 19 Jul 2021 13:59:11 +0000 (19:29 +0530)]
bus/dpaa: fix freeing in FMAN interface destructor
if was allocated with rte_malloc, free shall be equivalent.
Fixes:
4762b3d419c3 ("bus/dpaa: delay fman device list to bus probe")
Cc: stable@dpdk.org
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:43:37 +0000 (12:43 +0200)]
maintainers: update for ena
Remove Guy Tzalik as the driver's maintainer and add Shai Brandes who
will now be another maintainer of the ENA DPDK driver.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:54 +0000 (12:24 +0200)]
net/ena: update version to 2.4.0
This version update contains:
* Rx interrupts feature,
* Support for the RSS hash function reconfiguration,
* Small rework of the works,
* Reset trigger on Tx path fix.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:53 +0000 (12:24 +0200)]
net/ena: rework RSS configuration
Allow user to specify his own hash key and hash ctrl if the
device is supporting that. HW interprets the key in reverse byte order,
so the PMD reorders the key before passing it to the ena_com layer.
Default key is being set in random matter each time the device is being
initialized.
Moreover, make minor adjustments for reta size setting in terms
of returning error values.
RSS code was moved to ena_rss.c file to improve readability.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Reviewed-by: Amit Bernstein <amitbern@amazon.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:52 +0000 (12:24 +0200)]
net/ena: support Rx interrupt
In order to support asynchronous Rx in the applications, the driver has
to configure the event file descriptors and configure the HW.
This patch configures appropriate data structures for the rte_ethdev
layer, adds .rx_queue_intr_enable and .rx_queue_intr_disable API
handlers, and configures IO queues to work in the interrupt mode, if it
was requested by the application.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Artur Rojek <ar@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:51 +0000 (12:24 +0200)]
net/ena: trigger reset on Tx prepare failure
If the prepare function failed, then it means the descriptors are in the
invalid state.
This condition now triggers the reset, which should be further handled
by the application.
To notify the application about prepare function failure, the error log
was added. In general, it should never fail in normal conditions, as the
Tx function checks for the available space in the Tx ring before the
preparation even starts.
Fixes:
2081d5e2e92d ("net/ena: add reset routine")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:50 +0000 (12:24 +0200)]
net/ena: use common debug options
ENA defined its own logger flags for Tx and Rx, but they weren't
technically used anywhere. Those data path loggers weren't used anywhere
after the definition.
This commit uses the generic RTE_ETHDEV_DEBUG_RX and RTE_ETHDEV_DEBUG_TX
flags to define PMD_TX_LOG and PMD_RX_LOG which are now being used on
the data path. The PMD_TX_FREE_LOG was removed, as it has no usage in
the current version of the driver.
RTE_ETH_DEBUG_[TR]X now wraps extra checks for the driver state in the
IO path - this saves extra conditionals on the hot path.
ena_com logger is no longer optional (previously it had to be explicitly
enabled by defining this flag: RTE_LIBRTE_ENA_COM_DEBUG). Having this
logger optional makes tracing of ena_com errors much harder.
Due to ena_com design, it's impossible to separate IO path logs
from the management path logs, so for now they will be always enabled.
Default levels for the affected loggers were modified. Hot path loggers
are initialized with the default level of DEBUG instead of NOTICE, as
they have to be explicitly enabled. ena_com logging level was reduced
from NOTICE to WARNING - as it's no longer optional, the driver should
report just a warnings in the ena_com layer.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Michal Krawczyk [Fri, 23 Jul 2021 10:24:49 +0000 (12:24 +0200)]
net/ena: adjust logs
ENA logs were not consistent regarding the new line character. Few of
them were relying on the new line character added by the PMD_*_LOG
macros, but most were adding the new line character by themselves. It
was causing ENA logs to add extra empty line after almost each log.
To unify this behavior, the missing new line characters were added to
the driver logs, and they were removed from the logging macros. After
this patch, every ENA log message should add '\n' at the end.
Moreover, the logging messages were adjusted in terms of wording
(removed unnecessary abbreviations), capitalizing of the words (start
sentences with capital letters, and use 'Tx/Rx' instead of 'tx/TX' etc.
Some of the logs were rephrased to make them more clear for the reader.
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Shai Brandes <shaibran@amazon.com>
Jiawen Wu [Wed, 14 Jul 2021 06:05:48 +0000 (14:05 +0800)]
net/txgbe: fix VLAN filter setting for VF
Fix the function call error on VLAN filter table address setting for VF.
Fixes:
aa1ae7941e71 ("net/txgbe: support VF VLAN")
Cc: stable@dpdk.org
Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
Guoyang Zhou [Fri, 16 Jul 2021 09:54:30 +0000 (17:54 +0800)]
net/hinic: fix MTU consistency with firmware
The configuration of MTU is inconsistent in the driver and
firmware when the port is stopped, started and reconfigured.
Before, HINIC_MAX_JUMBO_FRAME_SIZE include VLAN tag, but when
frame and pktlen are converted to each other do not include
VLAN tag. And port_mtu_set function will use HINIC_MAX_JUMBO_FRAME_SIZE
to calculate eth_overhead, so MTU will be inconsistent in the driver and
firmware.
Fixes:
e542ab51ab27 ("net/hinic: fix jumbo frame flag condition for MTU set")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>
Guoyang Zhou [Fri, 16 Jul 2021 09:54:29 +0000 (17:54 +0800)]
net/hinic/base: fix LRO
The Rx queue must config as ceq disables, and must set MSI-X
state disabled. Otherwise when LRO is enables, there will be
problems with packet aggregation because of firmware.
Fixes:
9d02f40d6503 ("net/hinic: fix LRO")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>
Guoyang Zhou [Fri, 16 Jul 2021 09:54:28 +0000 (17:54 +0800)]
net/hinic: increase protection of the VLAN
If the VLAN id 0 is deleted for hinic, all packets without
VLAN will be discarded when the VLAN filter is turned on.
Fixes:
50ce3e7aec8f ("ethdev: fix VLAN offloads set if no relative capabilities")
Cc: stable@dpdk.org
Signed-off-by: Guoyang Zhou <zhouguoyang@huawei.com>
Huisong Li [Sat, 17 Jul 2021 01:04:19 +0000 (09:04 +0800)]
net/hns3: disable PFC if not configured
If "dcb_capability_en" in "data->dev_conf" delivered from the dev_configure
does not have the ETH_DCB_PFC_SUPPORT flag, the user wants to disable PFC,
and only enable ETS. Therefore, this patch supports the function of
disabling PFC by the field. In addition, this patch updates
"current_fc_status" of the driver based on the flow control mode requested
by user so as to enable the flow control mode in multi-TC scenarios.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Sat, 17 Jul 2021 02:02:56 +0000 (10:02 +0800)]
net/hns3: fix Tx prepare after stop
In some special scenarios, such as TSO scenarios, the user layer may need
to call the tx_pkt_prepare(), and then call tx_pkt_burst() to send packets.
If the return value of tx_pkt_parepare() isn't equal to the numbers of
packets requested to send, warning message may be printed at the user
layer. Currently, tx_pkt_prepare() is assigned to dummy function when
dev_stop() is called in hns3 PMD. At this moment, if user layer continues
to send packets, the warning message will always be printed. So this patch
modifies the address to NULL.
Fixes:
2790c6464725 ("net/hns3: support device reset")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Chengwen Feng [Sat, 17 Jul 2021 02:02:55 +0000 (10:02 +0800)]
net/hns3: fix flow rule list in multi-process
Currently, hns3 driver saves rte_flow list into the
rte_eth_dev.process_private field, it may cause following problem:
The FDIR/RSS rules cannot be managed in a unified manner because
the management structure is not visible between processes.
This patch fixes it by moving rte_flow list to struct hns3_hw which is
visible between processes.
Fixes:
fcba820d9b9e ("net/hns3: support flow director")
Fixes:
c37ca66f2b27 ("net/hns3: support RSS")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Sat, 17 Jul 2021 02:02:54 +0000 (10:02 +0800)]
net/hns3: move speed auto-negotiation warning
PF driver prints a warning on device that does not support auto-negotiation
when user does not configure "link_speeds" (default 0), which means
auto-negotiation. Currently, this warning information is printed in
dev_configure stage and a success is returned. Perhaps the user may call
dev_configure multiple times before dev_start for some reason or purpose.
In this case, this message may be printed multiple times. So this patch
moves it to dev_start stage.
Fixes:
cfc9fe48c4d4 ("net/hns3: move link speeds check to configure")
Cc: stable@dpdk.org
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Chengchang Tang [Sat, 17 Jul 2021 02:02:53 +0000 (10:02 +0800)]
net/hns3: remove duplicate compile-time check
This patch delete duplicate compile-time check.
Fixes:
cb12e988f35f ("net/hns3: add compile-time verification on Rx vector")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Hongbo Zheng [Sat, 17 Jul 2021 02:02:52 +0000 (10:02 +0800)]
net/hns3: fix timing of clearing interrupt source
Currently, the PF/VF does not clear the interrupt source immediately
after receiving the interrupt. As a result, if the second interrupt
task is triggered when processing the first interrupt task, clearing
the interrupt source before exiting will clear the interrupt sources
of the two tasks at the same time. As a result, no interrupt is
triggered for the second task.
Clearing interrupt source immediately after checking event cause
ensures that:
1. Even if two interrupt tasks are triggered at the same time, they can
be processed.
2. If the second task is triggered during the processing of the first
task and the interrupt source is not cleared, the interrupt is reported
after vector0 is enabled.
Fixes:
a5475d61fa34 ("net/hns3: support VF")
Fixes:
3988ab0eee52 ("net/hns3: add abnormal interrupt process")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Chengwen Feng [Sat, 17 Jul 2021 02:02:51 +0000 (10:02 +0800)]
net/hns3: fix filter parsing comment
This patch fixed incorrect comment of hns3_parse_fdir_filter().
Fixes:
fcba820d9b9e ("net/hns3: support flow director")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Chengwen Feng [Sat, 17 Jul 2021 02:02:50 +0000 (10:02 +0800)]
net/hns3: remove unnecessary zero assignments
The output parameter 'cap' was cleared at the function entry, the
latter zero assignment 'cap' fields was unnecessary, so delete them.
Fixes:
c09c7847d892 ("net/hns3: support traffic management")
Cc: stable@dpdk.org
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Chengchang Tang [Sat, 17 Jul 2021 02:02:49 +0000 (10:02 +0800)]
net/hns3: fix residual MAC address entry
Currently, even if we fail to remove the origin MAC address from the HW,
the set_default_mac will go on, and add the new MAC address to the HW.
Eventually cause the original MAC address entry to remain in the HW, and
users may receive unexpected packets.
This patch make set_default_mac return directly to failure if deleting
the original MAC address fails, simplifying the behavior of the driver
and solving the problem of residual MAC address entry.
Fixes:
7d7f9f80bbfb ("net/hns3: support MAC address related operations")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Tudor Cornea [Wed, 14 Jul 2021 09:28:11 +0000 (12:28 +0300)]
net/af_packet: run on kernel without qdisc bypass support
Some older kernels do not support the PACKET_QDISC_BYPASS socket
option. Such an example is the CentOS 7 kernel (3.10).
If we only check for the definition of PACKET_QDISC_BYPASS, it might mean
that we will not be able to compile the PMD driver on a newer platform,
and run in on a machine with an older kernel.
Setting the socket option only if it is specifically requested from
the EAL arguments, allows us to have a way to run the PMD compiled
against newer kernel headers, on platforms having older kernels.
Signed-off-by: Tudor Cornea <tudor.cornea@keysight.com>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Dapeng Yu [Thu, 15 Jul 2021 05:38:14 +0000 (13:38 +0800)]
net/softnic: fix memory leak in arguments parsing
In function pmd_parse_args(), firmware path is duplicated from device
arguments as character string, but is never freed, which cause memory
leak.
This patch changes the type of firmware member of struct pmd_params to
character array, to make memory resource release unnecessary, and
changes the type of name member to character array, to keep the
consistency of character string handling in struct pmd_params.
Fixes:
7e68bc20f8c8 ("net/softnic: restructure")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Jasvinder Singh <jasvinder.singh@intel.com>
Tomasz Duszynski [Thu, 15 Jul 2021 13:53:30 +0000 (08:53 -0500)]
raw/cnxk_bphy: support setting FEC
Add support for setting FEC for a given LMAC.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Tomasz Duszynski [Thu, 15 Jul 2021 13:53:29 +0000 (08:53 -0500)]
raw/cnxk_bphy: support reading FEC
Allow one to retrieve supported FEC setting for specific LMAC.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Tomasz Duszynski [Thu, 15 Jul 2021 13:53:28 +0000 (08:53 -0500)]
common/cnxk: support setting BPHY CGX/RPM FEC
Add support for setting FEC for a given LMAC.
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Tomasz Duszynski [Thu, 15 Jul 2021 13:53:27 +0000 (08:53 -0500)]
common/cnxk: support reading BPHY CGX/RPM FEC
Before setting FEC for specific LMAC one needs to know which type is
actually supported because it generally differs between modes
LMAC operates in (SGMII, SFI, etc.).
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Jie Zhou [Wed, 7 Jul 2021 20:25:38 +0000 (13:25 -0700)]
eal/windows: check callback parameter of alarm functions
EAL functions rte_eal_alarm_set() and rte_eal_alarm_cancel()
did not for invalid parameters in Windows implementation,
which is caught by the unit test alarm_autotest.
Enforce parameter check to fail fast for invalid parameters.
Fixes:
f4cbdbc7fbd2 ("eal/windows: implement alarm API")
Cc: stable@dpdk.org
Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Andrew Rybchenko [Thu, 22 Jul 2021 07:49:05 +0000 (10:49 +0300)]
net/sfc: fix build with clang 3.4.2
Old clang requires libatomic as well as gcc. Avoid compiler name and
version based checks. Add custom test for 16-byte atomic operations
to find out if libatomic is required to build.
Bugzilla ID: 760
Fixes:
96fd2bd69b58 ("net/sfc: support flow action count in transfer rules")
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: David Marchand <david.marchand@redhat.com>
Anatoly Burakov [Wed, 21 Jul 2021 14:26:25 +0000 (14:26 +0000)]
power: fix multi-queue scale mode
Currently in scale mode, multi-queue initialization will attempt to
initialize and de-initialize the per-lcore power library structures
multiple times. Fix it to only do this whenever we either enabling
first queue or disabling last queue.
Fixes:
5dff9a72b0ef ("power: support callbacks for multiple Rx queues")
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: David Hunt <david.hunt@intel.com>
Akhil Goyal [Thu, 22 Jul 2021 08:37:39 +0000 (14:07 +0530)]
maintainers: update for crypto API
Claim ownership for crypto API layer.
Have been reviewing patches from quite some time.
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
Shijith Thotton [Thu, 22 Jul 2021 09:06:51 +0000 (14:36 +0530)]
crypto/octeontx: enable build on non-Linux OS
Enabled build of Octeontx crypto PMD on non linux OS.
Other Octeontx PMDs are enabled already.
This is to avoid ABI test failure on an OS once we add dependency
between a driver which is built to another which is not.
Fixes:
8dc6c2f12ecf ("crypto/octeontx: add crypto adapter framework")
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Shijith Thotton <sthotton@marvell.com>
Acked-by: Akhil Goyal <gakhil@marvell.com>
Liang Ma [Tue, 20 Jul 2021 13:36:45 +0000 (14:36 +0100)]
build: check for broken AVX512 compiler support
GCC 6.3.0 has a known bug which related to _mm512_extracti64x4_epi64.
Please reference https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82887
Some DPDK PMD AVX512 version heavily use _mm512_extracti64x4_epi6,
which cause building failure with debug buildtype.
Therefore, it's helpful to check if compiler work with
_mm512_extracti64x4_epi6.
This patch check the compiler compile result against the test code
snippet. If the checking is failed then disable AVX512.
Bugzilla ID: 717
Fixes:
e6a6a138919f ("net/i40e: add AVX512 vector path")
Fixes:
808a17b3c1e6 ("net/ice: add Rx AVX512 offload path")
Fixes:
4b64ccb328c9 ("net/iavf: fix VLAN extraction in AVX512 path")
Cc: stable@dpdk.org
Reported-by: Liang Ma <liangma@liangbit.com>
Signed-off-by: Liang Ma <liangma@bytedance.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Kalesh AP [Tue, 20 Jul 2021 16:21:58 +0000 (21:51 +0530)]
net/bnxt: fix null dereference in interrupt handler
Coverity reports that pointer "cpr->cp_ring_struct" may be
dereferenced with null value. This patch fixes this.
Coverity issue: 372063
Fixes:
5ed30db87fa8 ("net/bnxt: fix missing barriers in completion handling")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Kalesh AP [Sun, 18 Jul 2021 05:30:59 +0000 (11:00 +0530)]
net/bnxt: remove workaround for default VNIC
On older Wh+ firmware versions, HWRM_FUNC_QCFG returns zero
for the parent default vnic. Commit "
3fb93bc7c349" added a
temporary Wh+-specific workaround in the PMD.
This has been fixed in latest firmware and hence removing
the workaround.
Fixes:
3fb93bc7c349 ("net/bnxt: initialize parent PF information")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Ting Xu [Sun, 18 Jul 2021 14:50:17 +0000 (22:50 +0800)]
net/ice: fix L3 RSS with IPv6 fragment
Since the header type of IPv6 fragment is wrong, the L3 dst/src RSS hash
fields cannot work properly. This patch changed the header type from any
to outer.
Fixes:
f1ea76eb6394 ("net/ice: support RSS hash for IP fragment")
Cc: stable@dpdk.org
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Ting Xu [Thu, 15 Jul 2021 02:16:42 +0000 (10:16 +0800)]
net/ice: clear QoS bandwidth on DCF close
When closing DCF, the bandwidth limit configured for VFs by DCF is not
cleared correctly. The configuration will still take effect when DCF starts
again, if VFs are not re-allocated. This patch cleared VFs bandwidth limit
when DCF closes, and DCF needs to re-configure bandwidth for VFs when it
starts next time.
Fixes:
3a6bfc37eaf4 ("net/ice: support QoS config VF bandwidth in DCF")
Signed-off-by: Ting Xu <ting.xu@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Liang Ma [Sun, 18 Jul 2021 10:29:16 +0000 (11:29 +0100)]
net/mlx5: export PMD-specific API file
The file rte_pmd_mlx5.h should be exported by Meson.
Fixes:
efa79e68c8cd ("net/mlx5: support fine grain dynamic flag")
Fixes:
23f627e0ed28 ("net/mlx5: add flow sync API")
Cc: stable@dpdk.org
Signed-off-by: Liang Ma <liangma@bytedance.com>
Lior Margalit [Tue, 20 Jul 2021 15:17:18 +0000 (18:17 +0300)]
net/mlx5: reject inner ethernet matching in GTP
The user is able to create a flow rule pattern with ETH after GTP
although it is not supported by the flex-parser configuration.
Failed the rule validation in such case with proper error message.
Fixes:
23c1d42c7138 ("net/mlx5: split flow validation to dedicated function")
Cc: stable@dpdk.org
Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Lior Margalit [Sun, 18 Jul 2021 11:15:04 +0000 (14:15 +0300)]
net/mlx5: fix RSS expansion for GTP
The flow did not expand correctly when it included a GTP item.
Added GTP node to the expansion graph as possible next node
after IPv4/IPv6 UDP node.
Fixes:
592f05b29a25 ("net/mlx5: add RSS flow action")
Cc: stable@dpdk.org
Signed-off-by: Lior Margalit <lmargalit@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Xueming Li [Wed, 7 Jul 2021 11:53:26 +0000 (19:53 +0800)]
net/mlx5: fix SF representor probing in isolate mode
Representor failed to probe in isolated mode due to callback of
retrieving representor info missing. This patch adds it back.
Fixes:
cb95feefdd03 ("net/mlx5: support sub-function representor")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Viacheslav Ovsiienko [Wed, 21 Jul 2021 08:31:40 +0000 (11:31 +0300)]
net/mlx5: fix RoCE LAG bond device probing
The RoCE LAG bond device requires neither E-Switch nor SR-IOV
configurations. It means the RoCE LAG bond device might be
presented as a single port Infiniband device.
The mlx5 PMD wrongly recognized standalone RoCE LAG bond device
as E-Switch configuration, this triggered the calls of E-Switch
ports related API and the latter failed (over the new OFED kernel
driver, starting since 5.4.1), causing the overall device probe
failure.
If there is a single port Infiniband bond device found the
E-Switch related flags must be cleared indicating standalone
configuration.
Also, it is not true anymore the bond device can exist
over E-Switch configurations only (as it was claimed for VF LAG
bond devices). The related checks are not relevant anymore
and removed.
Fixes:
790164ce1d2d ("net/mlx5: check kernel support for VF LAG bonding")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Alexander Kozyrev [Fri, 16 Jul 2021 08:43:05 +0000 (11:43 +0300)]
net/mlx5: reject copy to mark via modify action
The Mark action is a two-stage process in the Mellanox driver.
First, a hardware register is filled with the required value,
then this value is registered in the software resource table.
The MODIFY_FIELD action can instruct a Mellanox NIC to copy
some value from an arbitrary packet header field into the
hardware register, associated with the Mark item. But there
is no way NIC can modify the software resource table as well.
Due to these driver limitations the copying of arbitrary value
to the MARK can not be supported and should be rejected in the
MODIFY_FIELD action.
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Alexander Kozyrev [Tue, 20 Jul 2021 07:51:38 +0000 (10:51 +0300)]
net/mlx5: fix meta register conversion for extensive mode
Register C is used in the extensive metadata mode number 1 and its
width can vary from 0 to 32 bits depending on the kernel usage of it.
There are several issues associated with this mode (dv_xmeta_en=1):
1. The metadata setting assumes that the width is always 16 bits,
which is the most common case in this mode. Use the proper mask.
2. The same is true for the modify_field Flow API. 16-bits width
is hardcoded for dv_xmeta_en=1. Switch to the register C mask width.
3. Metadata is stored in the most significant bits in CQE in this
mode because the registers copy code was not updated during the
metadata conversion to the big-endian format. Update this code to
avoid shifting the metadata in the datapath.
Fixes:
b57e414b48 ("net/mlx5: convert meta register to big-endian")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Suanming Mou [Thu, 22 Jul 2021 06:59:40 +0000 (09:59 +0300)]
net/mlx5: fix indexed pools allocation on Windows
Currently, the flow indexed pools are allocated per port,
the allocation was missing in Windows code.
Allocate indexed pool for the Windows case too.
Fixes:
b4edeaf3efd5 ("net/mlx5: replace flow list with indexed pool")
Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Tal Shnaiderman <talshn@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Tested-by: Odi Assli <odia@nvidia.com>
Dmitry Kozlyuk [Wed, 21 Jul 2021 12:51:12 +0000 (15:51 +0300)]
net/mlx5: fix indirect action modify rollback
mlx5_ind_table_obj_modify() first references queues from the new list,
then applies the new list to HW. In case of apply failure the function
dereferenced queues from the old list, while it should be the new list.
Fixes:
fa7ad49e96b5 ("net/mlx5: fix shared RSS action update")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Dmitry Kozlyuk [Tue, 20 Jul 2021 07:53:35 +0000 (10:53 +0300)]
net/mlx5: fix Rx/Tx queue checks
When device configuration was interrupted by a signal,
mlx5_rxq/txq_release() could access yet unitinialized array
and crash the application. Add checks whether queue array
is initialized.
Fixes:
a1366b1a2be3 ("net/mlx5: add reference counter on DPDK Rx queues")
Fixes:
6e78005a9b30 ("net/mlx5: add reference counter on DPDK Tx queues")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Dong Zhou [Thu, 22 Jul 2021 07:48:39 +0000 (10:48 +0300)]
net/mlx5: check VLAN push/pop support
For ConnectX-6 in FDB domain, pop and push VLAN
on both ingress and egress directions are supported.
For ConnectX-6 in NIC domain, and ConnectX-5 in both FWD and NIC domain,
pop VLAN is only supported on ingress direction,
push VLAN is only supported on egress direction.
Signed-off-by: Dong Zhou <dongzhou@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Mon, 12 Jul 2021 07:06:44 +0000 (10:06 +0300)]
regex/mlx5: fix redundancy in device removal
In the removal function, PMD releases all driver resources and
cancels the regexdev registry.
However, regexdev registration is accidentally canceled twice.
Remove one of them.
Fixes:
b34d816363b5 ("regex/mlx5: support rules import")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Michael Baum [Mon, 12 Jul 2021 07:06:43 +0000 (10:06 +0300)]
regex/mlx5: fix leak on device removal
In the removal function, PMD releases all driver resources allocated
in the probe function.
The MR btree memory is allocated in the probe function, but it is not
freed in remove function what caused a memory leak.
Release it.
Fixes:
cda883bbb655 ("regex/mlx5: add dynamic memory registration to datapath")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Michael Baum [Mon, 12 Jul 2021 07:06:42 +0000 (10:06 +0300)]
regex/mlx5: fix memory region unregistration
The issue can cause illegal physical address access while a huge-page A
is released and huge-page B is allocated on the same virtual address.
The old MR can be matched using the virtual address of huge-page B but
the HW will access the physical address of huge-page A which is no more
part of the DPDK process.
Register a driver callback for memory event in order to free out all the
MRs of memory that is going to be freed from the DPDK process.
Fixes:
cda883bbb655 ("regex/mlx5: add dynamic memory registration to datapath")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Ori Kam <orika@nvidia.com>
Michael Baum [Thu, 1 Jul 2021 06:39:16 +0000 (09:39 +0300)]
net/mlx5: fix overflow in mempool argument
The mlx5_mprq_alloc_mp function makes shifting to the numeric constant
1, for sending it as a parameter to rte_mempool_create function.
The rte_mempool_create function expects to get void pointer (uintptr_t,
might be 64-bit) and instead gets a 32-bit variable, because the
numeric constant size is a 32-bit.
In case the shift is greater than 32 the variable might lose its value
even though the function might get 64-bit argument.
Change the size of the numeric constant 1 to uintptr_t.
Fixes:
3a22f3877c9d ("net/mlx5: replace external mbuf shared memory")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Thu, 1 Jul 2021 06:39:15 +0000 (09:39 +0300)]
vdpa/mlx5: fix overflow in queue attribute
The mlx5_vdpa_event_qp_create function makes shifting to the numeric
constant 1, then multiplies it by another constant and finally assigns
it into a uint64_t variable.
The numeric constant type is an int with a 32-bit sign. if after
shifting , its MSB (bit of sign) will change, the uint64 variable will
get into it a different value than what the function intended it to get.
Set the numeric constant 1 to be uint64_t in the first place.
Fixes:
8395927cdfaf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Thu, 1 Jul 2021 06:39:14 +0000 (09:39 +0300)]
compress/mlx5: fix overflow in queue size
The mlx5_compress_qp_setup function makes shifting to the numeric
constant 1, then sends it as a parameter to rte_calloc function.
The rte_calloc function expects to get size_t (might be 64 bit) and
instead gets a 32-bit variable, because the numeric constant size is a
32-bit.
In case the shift is greater than 32 bit and it 64-system, the variable
will lose its value even though the function can get 64-bit argument.
Change the size of the numeric constant 1 to size_t.
Fixes:
8619fcd5161b ("compress/mlx5: support queue pair operations")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Thu, 1 Jul 2021 06:39:13 +0000 (09:39 +0300)]
regex/mlx5: fix size of setup constants
The constant representing the size of the metadata is defined as an
unsigned int variable with 32-bit.
Similarly the constant representing the maximal output is also defined
as an unsigned int variable with 32-bit.
There is potentially overflowing expression when those constants are
evaluated using 32-bit arithmetic, and then used in a context that
expects an expression of type size_t that might be 64-bit.
Change the size of the above constants to size_t.
Fixes:
30d604bb1504 ("regex/mlx5: fix type of setup constants")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:21 +0000 (11:54 +0300)]
net/mlx5: support meter for trTCM profiles
The support of RFC2698 and RFC4115 are added in mlx5 PMD. Only the
ASO metering supports these two profiles.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:20 +0000 (11:54 +0300)]
net/mlx5: check consistency of meter policy and profile
In the previous implementation, only green color policy was
supported in mlx5 PMD. Since yellow color policy is supported now,
the consistency of meter policy and profile should be checked.
1. If the profile supports yellow but the policy doesn't, an error
should be returned when creating the meter. Or else, there is
no explicit steering action for the packets marked with yellow.
2. If the policy supports yellow but the profile doesn't, it will
be considered as a valid case. Even if no packet will be
handled with the yellow steering action, it is just like that
only the green policy presents.
Usually the green color is supported by default, but when it is
disabled intentionally with setting the CBS to a small value like
zero in the profile, the similar checking on green policy and
profile should also be done.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:19 +0000 (11:54 +0300)]
net/mlx5: support yellow in meter policy validation
In the previous implementation, the policy for yellow color was not
supported. The action validation for yellow was skipped.
Since the yellow color policy needs to be supported, the validation
should also be done for the yellow color. In the meanwhile, due to
the fact that color policies of one meter should be used for the
same flow(s), the domains supported of both colors should be the
same. If both of the colors have RSS as the termination actions,
except the queues, all other parameters of RSS should be the same.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:18 +0000 (11:54 +0300)]
net/mlx5: split meter color policy handling
If the fate action is either RSS or Queue of a meter policy, the
action will only be created in the flow splitting stage. With queue
as the fate action, only one sub-policy is needed. And RSS will
have more than one sub-policies if there is an expansion.
Since the RSS parameters are the same for both green and yellow
colors except the queues, the expansion result will be unique.
Even if only one color has the RSS action, the checking and possible
expansion will be done then. For each sub-policy, the action rules
need to be created separately on its own policy table.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:17 +0000 (11:54 +0300)]
net/mlx5: support yellow meter policy rules
When creating a meter policy, both / either of the action rules for
green and yellow colors may be provided. After validation, usually
the actions are created before the meter is using by a flow rule.
If there is action specified for the yellow color, the action rules
should be created together with green color in the same time. The
action of green / yellow color can be empty, then the default
behavior is the jump action of the rule, just the same as that of
the default policy.
If the fate action of either one color is queue / RSS, all the
actions rules will be created on the flow splitting stage instead of
the policy adding stage.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:16 +0000 (11:54 +0300)]
net/mlx5: enable meter bucket overflow for yellow color
To support the meter policy for yellow action, the prerequisite is
that the hardware needs to support the EBS, as defined in the
RFC2697.
https://datatracker.ietf.org/doc/html/rfc2697
Then some of the packets can be marked as yellow if the tokens of C
bucket is not enough but enough in E bucket. The color could be used
for the further steering of the packets.
In the current implementation EBS and overflow were ignored when
creating a meter profile. With this commit, if EBS is set by the
application, the generation of yellow color will be enabled in the
hardware for flow rules steering of packets.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Bing Zhao [Wed, 21 Jul 2021 08:54:15 +0000 (11:54 +0300)]
net/mlx5: handle yellow case in default meter policy
In order to support the yellow color for the default meter policy,
the default policy action for yellow should be created together
with the green policy.
The default policy action for yellow action is the same as that for
green. In the same table, the same matcher will be reused for yellow
and the destination group will be the same.
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:43 +0000 (22:37 +0800)]
common/mlx5: remove legacy PCI driver
Clean up legacy PCI bus driver since all mlx5 PMDs are moved
to the new bus-agnostic driver interface.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:42 +0000 (22:37 +0800)]
crypto/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrade the driver to use mlx5 common driver
structure.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:41 +0000 (22:37 +0800)]
compress/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrade the driver to use mlx5 common driver
structure.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Thomas Monjalon [Wed, 21 Jul 2021 14:37:40 +0000 (22:37 +0800)]
vdpa/mlx5: support Sub-Function
RoCE disabling requirement is based on PCI address.
In order to support Sub-Function, a conversion is needed
in the case of an auxiliary device.
SF device can be probed with such devargs string:
auxiliary:mlx5_core.sf.<id>,class=vdpa
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Thomas Monjalon [Wed, 21 Jul 2021 14:37:39 +0000 (22:37 +0800)]
vdpa/mlx5: migrate to bus-agnostic common interface
Replace PCI-specific handling with bus-agnostic structures.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Thomas Monjalon [Wed, 21 Jul 2021 14:37:38 +0000 (22:37 +0800)]
vdpa/mlx5: define driver name as macro
Use a macro for the PMD driver name.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:37 +0000 (22:37 +0800)]
regex/mlx5: migrate to bus-agnostic common interface
To support auxiliary bus, upgrades driver to use mlx5 common driver
structure.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:36 +0000 (22:37 +0800)]
net/mlx5: check maximum Verbs port number
Verbs API doesn't support device port number larger than 255 by design.
Add check and fail probing with proper error log.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:35 +0000 (22:37 +0800)]
net/mlx5: support Sub-Function
Introduce SF support.
Similar to VF, SF on auxiliary bus is a portion of hardware PF,
no representor or bonding parameters for SF.
Devargs to support SF:
-a auxiliary:mlx5_core.sf.8,dv_flow_en=1
New global syntax to support SF:
-a bus=auxiliary,name=mlx5_core.sf.8/class=eth/driver=mlx5,dv_flow_en=1
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:34 +0000 (22:37 +0800)]
net/mlx5: migrate to bus-agnostic common interface
To support SubFunction based on auxiliary bus, common driver supports
new bus-agnostic driver.
This patch migrates net driver to new common driver.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:33 +0000 (22:37 +0800)]
net/mlx5: reduce PCI dependency
To support more bus types, remove PCI dependency where possible.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Thomas Monjalon [Wed, 21 Jul 2021 14:37:32 +0000 (22:37 +0800)]
common/mlx5: get PCI device address from any bus
A function is exported to allow retrieving the PCI address
of the parent PCI device of a Sub-Function in auxiliary bus sysfs.
The function mlx5_dev_to_pci_str() is accepting both PCI and auxiliary
devices. In case of a PCI device, it is simply using the device name.
The function mlx5_dev_to_pci_addr(), which is based on sysfs path
and do not use any device object, is renamed to mlx5_get_pci_addr()
for clarity purpose.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:31 +0000 (22:37 +0800)]
common/mlx5: support auxiliary bus
Add auxiliary bus support for Sub-Function.
As a limitation of current driver, NUMA node of device is detected
from PCI bus of device sysfs symbol link.
It will be removed once NUMA node file will be available in sysfs.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Thomas Monjalon [Wed, 21 Jul 2021 14:37:30 +0000 (22:37 +0800)]
common/mlx5: move description of PCI sysfs functions
The Linux-specific functions mlx5_get_pci_addr() and
mlx5_get_ifname_sysfs() are better described in the .h file.
The requirement for using mlx5_get_pci_addr() is made explicit:
the node /device must exist in the provided sysfs path.
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:29 +0000 (22:37 +0800)]
common/mlx5: add bus-agnostic layer
To support auxiliary bus, introduces common device driver and callbacks,
supposed to replace mlx5 common PCI bus driver.
Mlx5 class drivers, i.e. eth, vDPA, regex and compress normally consumes
single Verbs device context to probe a device. The Verbs device comes
from PCI address if the device is PCI bus device, from Auxiliary sysfs
if the device is auxiliary bus device. Currently only PCI bus is
supported.
Common device driver is a middle layer between mlx5 class drivers and
bus, resolve and abstract bus info to Verbs device for class drivers.
Both PCI bus driver and Auxiliary bus driver can utilize the common
driver layer to cast bus operations to mlx5 class drivers.
Legacy mlx5 common PCI bus driver still being used by mlx5 eth, vDPA,
regex and compress PMD, will be removed once all PMD drivers
migrate to new common driver.
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Wed, 21 Jul 2021 14:37:28 +0000 (22:37 +0800)]
common/mlx5: rename ethernet device class
To align with EAL class driver, rename internal class name
from "net" to "eth"
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Ivan Ilchenko [Tue, 20 Jul 2021 07:54:45 +0000 (10:54 +0300)]
net/virtio: fix Rx scatter offload
Report Rx scatter offload capability depending on VIRTIO_NET_F_MRG_RXBUF.
If Rx scatter is not requested, ensure that provided Rx buffers on
each Rx queue are big enough to fit Rx packets up to configured MTU.
Fixes:
ce17eddefc20 ("ethdev: introduce Rx queue offloads API")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Tue, 6 Jul 2021 08:29:34 +0000 (04:29 -0400)]
vhost: add thread-unsafe async registration
This patch adds thread unsafe version for async register and
unregister functions.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Mon, 19 Jul 2021 15:00:46 +0000 (11:00 -0400)]
vhost: rework async configuration structure
This patch reworks the async configuration structure to improve code
readability. In addition, add preserved padding fields on the structure
for future usage.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Mon, 19 Jul 2021 15:00:45 +0000 (11:00 -0400)]
vhost: fix lock on device readiness notification
The vhost notifies the application of device readiness via
vhost_user_notify_queue_state(), but calling this function
is not protected by the lock. This patch is to make this
function call lock protected.
Fixes:
d0fcc38f5fa4 ("vhost: improve device readiness notifications")
Cc: stable@dpdk.org
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>