David Marchand [Mon, 3 May 2021 16:43:42 +0000 (18:43 +0200)]
net/virtio: do not touch Tx offload flags
Tx offload flags are the application's responsibility.
Leave the mbuf alone and use a local storage for implicit tcp checksum
offloading in case of TSO.
Signed-off-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org>
Matan Azrad [Sun, 2 May 2021 10:45:10 +0000 (13:45 +0300)]
vdpa/mlx5: improve interrupt management
The driver should notify the guest for each traffic burst detected by CQ
polling.
The CQ polling trigger is defined by `event_mode` device argument,
either by busy polling on all the CQs or by blocked call to HW
completion event using DevX channel.
Also, the polling event modes can move to blocked call when the
traffic rate is low.
The current blocked call uses the EAL interrupt API, which suffers a lot
of overhead in API management and serves all the drivers and
libraries using only a single thread.
Use blocking FD of the DevX channel in order to do blocked call
directly by the DevX channel FD mechanism.
Signed-off-by: Matan Azrad <matan@nvidia.com> Acked-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
For now async vhost data path only supports split ring. This patch
enables packed ring in async vhost data path to make async vhost
compatible with virtio 1.1 spec.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
This patch moves some code of async vhost split ring into
inline functions to improve the readability. Also, it
changes the pointer index style of iterator to make the
code more concise.
Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com>
Xueming Li [Wed, 14 Apr 2021 14:14:04 +0000 (22:14 +0800)]
net/virtio: fix vectorized Rx queue rearm
When the Rx queue works in vectorized mode and rxd <= 512, under traffic
with a high PPS rate, testpmd often starts, receives rxd packets, and
then the packet count stops growing.
Testpmd starts with an rxq flush, which tries to receive MAX_PKT_BURST(512)
packets and drop them. When the Rx burst size >= the Rx queue size, all
descriptors in the used queue are consumed without rearm, so the device
can't receive more packets.
The next Rx burst returns at once since no used descriptors are found;
the rearm logic is skipped and the rx vq stays in a starving state.
To avoid rx vq starvation, this patch always checks the available queue
and rearms if needed, even if no used descriptor is reported by the device.
Fixes: fc3d66212fed ("virtio: add vector Rx") Fixes: 2d7c37194ee4 ("net/virtio: add NEON based Rx handler") Fixes: 52b5a707e6ca ("net/virtio: add Altivec Rx") Cc: stable@dpdk.org Signed-off-by: Xueming Li <xuemingl@nvidia.com> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
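The starvation described above can be reproduced with a toy model. This is only a sketch: struct rxq_model, REARM_THRESH, device_fill() and rx_burst() are hypothetical names, not the virtio PMD code. The rearm_first argument switches between the old ordering (return early when nothing is used) and the fixed ordering (rearm before checking the used count).

```c
/* Toy model of an Rx virtqueue; all names here are illustrative only. */
struct rxq_model {
	unsigned int size;     /* total descriptors in the ring */
	unsigned int used;     /* filled by the device, not yet received */
	unsigned int free_cnt; /* received by the driver, not yet rearmed */
};

#define REARM_THRESH 32u /* assumed threshold, not the driver's value */

/* Device side: it can only fill descriptors that are currently armed. */
static unsigned int device_fill(struct rxq_model *q, unsigned int n)
{
	unsigned int avail = q->size - q->used - q->free_cnt;

	if (n > avail)
		n = avail;
	q->used += n;
	return n;
}

/* Driver side: rearm_first != 0 models the fixed ordering. */
static unsigned int rx_burst(struct rxq_model *q, unsigned int burst,
			     int rearm_first)
{
	unsigned int n;

	if (rearm_first && q->free_cnt >= REARM_THRESH)
		q->free_cnt = 0; /* rearm even if nothing was used */
	if (q->used == 0)
		return 0;        /* old code never got past this point */
	if (!rearm_first && q->free_cnt >= REARM_THRESH)
		q->free_cnt = 0;
	n = burst < q->used ? burst : q->used;
	q->used -= n;
	q->free_cnt += n;
	return n;
}
```

With a 256-entry ring and a 512-packet burst, the old ordering leaves every descriptor unarmed after the first burst, while the fixed ordering rearms on the next call.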
Ciara Power [Wed, 5 May 2021 15:22:48 +0000 (15:22 +0000)]
telemetry: fix race on callbacks list
The list_commands() function accessed the callbacks list,
but did not take the lock. This may have caused inconsistencies if
callbacks were being registered at the same time.
This is now fixed to lock before iterating the list,
and unlock afterwards.
Fixes: f38748736eb2 ("telemetry: add default callback commands") Cc: stable@dpdk.org Reported-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ciara Power <ciara.power@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
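The locking pattern described in this fix can be sketched as follows. This is a minimal illustration, assuming a fixed-size table and a pthread mutex in place of whatever lock the telemetry library uses; callbacks, register_cmd() and list_cmds() are hypothetical names.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the telemetry callback table. */
#define MAX_CMDS 16

struct cmd_entry {
	const char *name;
};

static struct cmd_entry callbacks[MAX_CMDS];
static size_t num_callbacks;
static pthread_mutex_t callback_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer: adds an entry while holding the lock. */
static int register_cmd(const char *name)
{
	int ret = -1;

	pthread_mutex_lock(&callback_lock);
	if (num_callbacks < MAX_CMDS) {
		callbacks[num_callbacks++].name = name;
		ret = 0;
	}
	pthread_mutex_unlock(&callback_lock);
	return ret;
}

/* Reader: lock before iterating the list, unlock afterwards. */
static size_t list_cmds(const char **out, size_t max)
{
	size_t i, n;

	pthread_mutex_lock(&callback_lock);
	n = num_callbacks < max ? num_callbacks : max;
	for (i = 0; i < n; i++)
		out[i] = callbacks[i].name;
	pthread_mutex_unlock(&callback_lock);
	return n;
}
```

Taking the same lock in the reader means a concurrent registration can never be observed half-written.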
While working on RISC-V port I have encountered a situation where worker
threads get stuck in the rte_distributor_return_pkt() function in the
burst test.
Investigation showed some of the threads enter this function with
flag RTE_DISTRIB_GET_BUF set in the d->retptr64[0]. At the same time the
main thread has already passed rte_distributor_process() so nobody will
clear this flag and hence workers can't return.
What I've noticed is that adding a flush just after the last _process(),
similarly to how quit_workers() function is written in the
test_distributor.c fixes the issue.
Lukasz Wojciechowski reproduced the same issue on x86 using a VM with 32
emulated CPU cores to force some lcores not to be woken up.
Fixes: 7c3287a10535 ("test/distributor: add performance test for burst mode") Cc: stable@dpdk.org Signed-off-by: Stanislaw Kardach <kda@semihalf.com> Acked-by: David Hunt <david.hunt@intel.com> Tested-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com> Reviewed-by: Lukasz Wojciechowski <l.wojciechow@partner.samsung.com>
test/distributor: fix worker notification in burst mode
Because a single worker can process more than one packet from the
distributor, the final set of notifications in burst mode should be
sent one-by-one to ensure that each worker has a chance to wake up.
This fix mirrors the change done in the functional test by
commit f72bff0ec272 ("test/distributor: fix quitting workers in burst
mode").
This patch fixes issue with OVS 2.15 not working on
DPAA/FSLMC based platform due to missing support for
these busses in dev_iterate.
This patch adds dpaa_bus and fslmc to dev iterator
for bus arguments.
Fixes: 214ed1acd125 ("ethdev: add iterator to match devargs input") Cc: stable@dpdk.org Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com> Acked-by: Thomas Monjalon <thomas@monjalon.net>
Some logs format u64 variables, mostly using hexadecimal which was not
readable.
This patch formats most u64 variables in decimal, and add '0x' prefix
to the ones that are not adjusted.
Fixes: c37ca66f2b27 ("net/hns3: support RSS") Fixes: 2790c6464725 ("net/hns3: support device reset") Fixes: 8839c5e202f3 ("net/hns3: support device stats") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
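The two formatting styles mentioned above can be illustrated with standard C format macros. This is a sketch only; fmt_u64_dec() and fmt_u64_hex() are hypothetical helpers, not hns3 functions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Decimal form, used for most u64 values. */
static int fmt_u64_dec(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "%" PRIu64, v);
}

/* Hexadecimal form, with an explicit 0x prefix to make the base clear. */
static int fmt_u64_hex(char *buf, size_t len, uint64_t v)
{
	return snprintf(buf, len, "0x%" PRIx64, v);
}
```

Using PRIu64 and PRIx64 keeps the format strings correct on both 32-bit and 64-bit targets.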
VMDq is not supported yet, so remove the unused code.
Fixes: d51867db65c1 ("net/hns3: add initialization") Fixes: 1265b5372d9d ("net/hns3: add some definitions for data structure and macro") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Currently, driver uses the macro HNS3_DEFAULT_RX_BURST whose value is
32 to limit the vector Rx burst size, as a result, the burst size
can't exceed 32.
This patch fixes this problem by supporting larger burst sizes.
Also adjust HNS3_DEFAULT_RX_BURST to 64 as it performs better than 32.
Fixes: a3d4f4d291d7 ("net/hns3: support NEON Rx") Fixes: 952ebacce4f2 ("net/hns3: support SVE Rx") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
This patch improves data cache usage by:
1. Rearrange the frequently accessed rxq fields in the IO path into the
first 128B.
2. Rearrange the frequently accessed txq fields in the IO path into the
first 64B.
3. Make sure the ptype table is aligned to the cacheline size, which is
128B, instead of the min cacheline size, which is 64B, because the L1/L2
cacheline is 64B and the L3 cacheline is 128B on the Kunpeng ARM platform.
The performance gains are 1.5% in 64B packet macfwd scenarios.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
This patch deletes some unused capabilities, including:
1. Delete some unused firmware capabilities definition, which are:
UDP_GSO, ATR, INT_QL, SIMPLE_BD, TX_PUSH, FEC and PAUSE.
2. Delete some unused driver capabilities definition, which are:
UDP_GSO, TX_PUSH.
3. Also redefine HNS3_DEV_SUPPORT_* as enum type, and change some of
the values. Note: the HNS3_DEV_SUPPORT_* values are used only inside
the driver, so it's safe to change the values.
Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Flow rule items supplied by application must explicitly specify
network headers referred by integrity item. For example:
flow create 0 ingress
pattern
integrity level is 0 value mask l3_ok value spec l3_ok /
eth / ipv6 / end …
or
flow create 0 ingress
pattern
integrity level is 0 value mask l4_ok value spec 0 /
eth / ipv4 proto is udp / end …
Add integrity item definition to the rte_flow_desc_item array.
The new entry allows building an RTE flow item from data
stored in the rte_flow_item_integrity type.
Huisong Li [Thu, 29 Apr 2021 09:03:59 +0000 (17:03 +0800)]
net/hns3: fix MAC enable failure rollback
If driver fails to enable MAC, it does not need to rollback the MAC
configuration. This patch fixes it.
Fixes: bdaf190f8235 ("net/hns3: support link speed autoneg for PF") Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Kalesh AP [Fri, 23 Apr 2021 05:22:26 +0000 (10:52 +0530)]
net/bnxt: drop unused attribute
Remove "__rte_unused" instances that are wrongly marked.
Fixes: 6dc83230b43b ("net/bnxt: support port representor data path") Fixes: 1bf01f5135f8 ("net/bnxt: prevent device access when device is in reset") Cc: stable@dpdk.org Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Andrew Rybchenko [Wed, 28 Apr 2021 14:17:02 +0000 (17:17 +0300)]
net/sfc: fix mark support in EF100 native Rx datapath
Decouple user mark from user flag. Usage of mark does not require to
use flag as well. Flag is not actually supported yet.
Fixes: 1aacc3d388d3 ("net/sfc: support user mark and flag Rx for EF100") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru> Reviewed-by: Andy Moreton <amoreton@xilinx.com> Reviewed-by: Ivan Malov <ivan.malov@oktetlabs.ru>
When starting VF, VF will issue reset command to PF, wait a fixed
amount of time, and assume VF reset is done on PF side. However,
compared with kernel PF, DPDK PF needs more time to setup. If we
run DPDK PF to support DPDK VF, the original delay will not be
enough.
When we first start VF after PF is launched, the execution
time of the statement info.msg_buf = rte_zmalloc("msg_buffer",
info.buf_len, 0); in the function i40e_dev_handle_aq_msg is more
than 200ms. It may cause VF start error.
Since iavf can hardly trigger this issue and i40evf will be replaced
by iavf in future DPDK versions, this patch provides a workaround.
We extend VF reset waiting time from 200ms to 500ms so that
VF can start normally when using DPDK PF and DPDK VF in most cases.
Robin Zhang [Wed, 28 Apr 2021 08:04:52 +0000 (08:04 +0000)]
net/i40e: fix primary MAC type when starting port
When starting a port, all MAC addresses will be set. We should set the MAC
type of default MAC address as VIRTCHNL_ETHER_ADDR_PRIMARY.
Fixes: 3f604ddf33cf ("net/i40e: fix lack of MAC type when set MAC address") Cc: stable@dpdk.org Signed-off-by: Robin Zhang <robinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Robin Zhang [Wed, 28 Apr 2021 08:04:51 +0000 (08:04 +0000)]
net/iavf: fix primary MAC type when starting port
When starting a port, all MAC addresses will be set. We should set the MAC
type of default MAC address as VIRTCHNL_ETHER_ADDR_PRIMARY.
Fixes: b335e7203475 ("net/iavf: fix lack of MAC type when set MAC address") Cc: stable@dpdk.org Signed-off-by: Robin Zhang <robinx.zhang@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
The device name format used in ifpga_rawdev_create() was changed to
"IFPGA:%02x:%02x.%x", but the format used in ifpga_rawdev_destroy()
was left as "IFPGA:%x:%02x.%x", it should be changed synchronously.
To prevent further similar errors, macro "IFPGA_RAWDEV_NAME_FMT" is
defined to replace this format string.
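A shared format macro of this kind can be sketched as below. The format string is the one quoted above; ifpga_format_name() is a hypothetical helper added only for illustration.

```c
#include <stdio.h>

/* Single definition shared by the create and destroy paths. */
#define IFPGA_RAWDEV_NAME_FMT "IFPGA:%02x:%02x.%x"

/* Hypothetical helper: build the rawdev name from bus/device/function. */
static int ifpga_format_name(char *buf, size_t len, unsigned int bus,
			     unsigned int devid, unsigned int function)
{
	return snprintf(buf, len, IFPGA_RAWDEV_NAME_FMT, bus, devid, function);
}
```

Because both paths expand the same macro, a name built at creation time always matches the name looked up at destroy time.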
Wenzhuo Lu [Thu, 29 Apr 2021 01:33:57 +0000 (09:33 +0800)]
net/iavf: fix Rx function selection
A performance drop occurs because the Rx scalar path
is selected when AVX512 is disabled and some HW offload
is enabled.
Actually, the HW offload is supported by AVX2 and SSE.
In this scenario AVX2 path should be chosen.
This patch removes the offload related check for SSE and AVX2
as SSE and AVX2 do support the offload features.
No implementation change about the data path.
Fixes: eff56a7b9f97 ("net/iavf: add offload path for Rx AVX512") Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Michael Baum [Thu, 29 Apr 2021 09:55:42 +0000 (12:55 +0300)]
net/mlx5: use aging by counter when counter exists
The driver supports 2 mechanisms in order to support AGE action:
1. Aging by counter - HW counter will be configured to the flow traffic,
the driver polls the counter values efficiently to detect flow timeout.
2. Aging by ASO flow hit bit - HW ASO flow-hit bit is allocated for the
flow, the driver polls the bit efficiently to detect flow timeout.
ASO bit is only single bit resource while counter is 16 bytes, hence, it
is better to use ASO instead of counter for aging.
When a non-shared COUNT action is also configured to the flow, the
driver can use the same counter also for AGE action and no need to
create more ASO action for it.
The current code always uses ASO when it is supported in the device,
change it to reuse the non-shared counter if it exists in the flow.
Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Michael Baum [Thu, 29 Apr 2021 09:55:41 +0000 (12:55 +0300)]
net/mlx5: fix flow age event triggering
A FLOW_AGE event should be invoked when a new aged-out flow is detected
by the PMD after the user's last get-aged query call.
The PMD manages 2 flags for this information and checks them in order to
decide if an event should be invoked:
MLX5_AGE_EVENT_NEW - a new aged-out flow was detected after the last
check.
MLX5_AGE_TRIGGER - get-aged query was called after the last aged-out
flow.
Both flags were unset after the event was invoked.
When the user calls the get-aged query from the event callback, the TRIGGER
flag was set inside the user callback and unset directly after the
callback, which may stop the event from being invoked forever.
Unset the TRIGGER flag before invoking the event in order to allow the
user callback to set it.
Fixes: f935ed4b645a ("net/mlx5: support flow hit action for aging") Cc: stable@dpdk.org Reported-by: David Bouyeure <david.bouyeure@fraudbuster.mobi> Signed-off-by: Michael Baum <michaelba@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
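The ordering issue can be illustrated with a simplified model. This sketch is not the mlx5 code: the flag names, struct age_info and the helper functions are hypothetical, and the logic is reduced to the ordering of "clear flags" versus "invoke callback".

```c
#include <stddef.h>
#include <stdint.h>

#define AGE_EVENT_NEW (1u << 0) /* new aged-out flow since the last check */
#define AGE_TRIGGER   (1u << 1) /* get-aged query called since last aged-out flow */

struct age_info {
	uint32_t flags;
	void (*event_cb)(struct age_info *info); /* user callback, may be NULL */
	unsigned int events_raised;
};

/* Models the user's get-aged query, which sets the TRIGGER flag. */
static void get_aged_flows(struct age_info *info)
{
	info->flags |= AGE_TRIGGER;
}

/* Example user callback that queries aged flows from inside the event. */
static void requery_cb(struct age_info *info)
{
	get_aged_flows(info);
}

/*
 * Raise the event only if both flags are set. The flags are cleared
 * BEFORE the callback, so a TRIGGER set inside the callback survives.
 */
static void age_event_check(struct age_info *info)
{
	const uint32_t both = AGE_EVENT_NEW | AGE_TRIGGER;

	if ((info->flags & both) != both)
		return;
	info->flags &= ~both;
	info->events_raised++;
	if (info->event_cb != NULL)
		info->event_cb(info);
}
```

If the flags were cleared after the callback instead, the TRIGGER set by requery_cb() would be wiped and no later event would ever be raised.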
Michael Baum [Thu, 29 Apr 2021 09:55:38 +0000 (12:55 +0300)]
net/mlx5: support flow count action handle
Existing API supports counter action to count traffic of a single flow.
The user can share the count action among different flows using the
shared flag and the same counter ID in the count action configuration.
Recent patch [1] introduced the indirect action API.
Using this API, an action can be created as indirect, unattached to any
flow rule.
Multiple flows can then be created using the same indirect action.
The new API also supports query operation of an indirect action.
The new API is more efficient because the driver gets its own handle
for the count action instead of managing a mapping between the user ID
and the driver handle.
Support create, query and destroy indirect action operations for flow
count action.
Application will use the indirect action query operation to query this
count action.
In the meantime the old sharing mechanism (with the sharing flag)
continues to be supported, and the user can choose the way he wants to
share the counter.
The new indirect action API is only supported in DevX, so sharing
counter action in Verbs can only be done through the old mechanism.
Tx prepare should be called only when necessary to reduce the impact on
performance.
For partial TX offload, users need to call rte_eth_tx_prepare() to
invoke the tx_prepare callback of PMDs. In this callback, the PMDs
adjust the packet based on the offloading used by the user. (e.g. For
some PMDs, pseudo-headers need to be calculated when the TX cksum is
offloaded.)
However, users cannot be expected to know all hardware and PMD
characteristics. As a result, users cannot decide when they actually
need to call tx_prepare. Therefore, we should assume that the user
calls rte_eth_tx_prepare() when using any Tx offloading to ensure that
related functions work properly. Whether packets need to be adjusted
should be determined by PMDs. They can make judgments in the
dev_configure or queue_setup phase. When the related function is not
used, the pointer of tx_prepare should be set to NULL to reduce the
performance loss caused by invoking rte_eth_tx_prepare().
In this patch, if tx_prepare is not required for the offloading used by
the users, the tx_prepare pointer will be set to NULL.
Fixes: bba636698316 ("net/hns3: support Rx/Tx and related operations") Cc: stable@dpdk.org Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
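The idea of clearing the tx_prepare pointer can be sketched with a reduced device model. Everything here is hypothetical (the offload bits, struct dev_model, select_tx_prepare() and the tx_prepare() wrapper); it is not the ethdev or hns3 definition, only an illustration of choosing the callback at configuration time.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical offload bits; not the ethdev definitions. */
#define OFFLOAD_IPV4_CKSUM (1u << 0)
#define OFFLOAD_TCP_TSO    (1u << 1)
#define OFFLOADS_NEEDING_PREP (OFFLOAD_IPV4_CKSUM | OFFLOAD_TCP_TSO)

typedef uint16_t (*tx_prep_fn)(void *txq, void **pkts, uint16_t nb_pkts);

struct dev_model {
	uint32_t tx_offloads;
	tx_prep_fn tx_pkt_prepare;
};

/* Placeholder for the real packet adjustment work. */
static uint16_t prep_pkts(void *txq, void **pkts, uint16_t nb_pkts)
{
	(void)txq;
	(void)pkts;
	return nb_pkts;
}

/* Decide once, at configure time, whether the callback is needed. */
static void select_tx_prepare(struct dev_model *dev)
{
	if (dev->tx_offloads & OFFLOADS_NEEDING_PREP)
		dev->tx_pkt_prepare = prep_pkts;
	else
		dev->tx_pkt_prepare = NULL;
}

/* Caller-side wrapper: a NULL callback means no per-packet work at all. */
static uint16_t tx_prepare(struct dev_model *dev, void *txq, void **pkts,
			   uint16_t nb_pkts)
{
	if (dev->tx_pkt_prepare == NULL)
		return nb_pkts;
	return dev->tx_pkt_prepare(txq, pkts, nb_pkts);
}
```

The application can then call the wrapper unconditionally and only pays for the callback when an offload actually requires it.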
The hns3_is_csq() and cmq_ring_to_dev() macro were defined in previous
version but never used.
Fixes: 737f30e1c3ab ("net/hns3: support command interface with firmware") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Currently, driver uses gettimeofday() API to get the time, and
then calculate the time delta, the delta will be used mainly in
judging timeout process.
But the time obtained from the gettimeofday() API isn't monotonically
increasing. The process may fail if the system time is changed.
We use the following scheme to fix it:
1. Add hns3_clock_gettime() API which will get the monotonically
increasing time.
2. Add hns3_clock_gettime_ms() API which will get the milliseconds of
the monotonically increasing time.
3. Add hns3_clock_calctime_ms() API which will calc the milliseconds of
a given time.
Fixes: 2790c6464725 ("net/hns3: support device reset") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
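The three helpers listed above can be sketched with the POSIX monotonic clock. This is an assumption about the approach, not the driver source: mono_gettime(), mono_calctime_ms() and mono_gettime_ms() are hypothetical names that mirror the three APIs.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime() under strict C modes */

#include <stdint.h>
#include <time.h>

#define MSEC_PER_SEC  1000ULL
#define NSEC_PER_MSEC 1000000ULL

/* Read the monotonic clock, which is unaffected by wall-clock changes. */
static void mono_gettime(struct timespec *ts)
{
	clock_gettime(CLOCK_MONOTONIC, ts);
}

/* Convert a given time to milliseconds. */
static uint64_t mono_calctime_ms(const struct timespec *ts)
{
	return (uint64_t)ts->tv_sec * MSEC_PER_SEC +
	       (uint64_t)ts->tv_nsec / NSEC_PER_MSEC;
}

/* Current monotonic time in milliseconds. */
static uint64_t mono_gettime_ms(void)
{
	struct timespec ts;

	mono_gettime(&ts);
	return mono_calctime_ms(&ts);
}
```

A timeout check then compares two mono_gettime_ms() readings, so changing the system time cannot make the delta negative or falsely large.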
If the reset process takes too much time, the driver will log one error
message which formats the time delta, but the formatting uses
hexadecimal, which is not readable.
This patch fixes it by formatting in decimal format.
Fixes: 2790c6464725 ("net/hns3: support device reset") Cc: stable@dpdk.org Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
The fwd_config_setup() is called after init_fwd_streams().
The fwd_config_setup() will reinitialize forwarding streams.
This patch removes init_fwd_streams() from init_config().
Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:45 +0000 (14:40 +0800)]
app/testpmd: add forwarding configuration to DCB config
This patch adds fwd_config_setup() at the end of cmd_config_dcb_parsed()
to update "cur_fwd_config", so that the actual forwarding streams can be
queried by the "show config fwd" cmd.
Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:44 +0000 (14:40 +0800)]
app/testpmd: verify DCB config during forward config
Currently, the check for doing DCB test is done in
start_packet_forwarding(), which is called when the
"start" cmd is run. But fwd_config_setup() is used in many
scenarios, such as "port config all rxq".
This patch moves the check from start_packet_forwarding()
to fwd_config_setup().
Fixes: 7741e4cf16c0 ("app/testpmd: VMDq and DCB updates") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:43 +0000 (14:40 +0800)]
app/testpmd: check DCB info support for configuration
Currently, '.get_dcb_info' must be supported for the port doing DCB
test, or all information in 'rte_eth_dcb_info' are zero. It should be
prevented when user run cmd "port config 0 dcb vt off 4 pfc off".
This patch adds the check for support of reporting dcb info.
Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:42 +0000 (14:40 +0800)]
app/testpmd: fix DCB re-configuration
After DCB mode is configured, if we decrease the number of RX and TX
queues, fwd_config_setup() will be called to setup the DCB forwarding
configuration. And forwarding streams are updated based on new queue
numbers in fwd_config_setup(), but the mapping between the TC and
queues obtained by rte_eth_dev_get_dcb_info() is still old queue
numbers (old queue numbers are greater than new queue numbers).
In this case, a segmentation fault happens. So rte_eth_dev_configure()
should be called again to update the mapping between the TC and
queues before rte_eth_dev_get_dcb_info().
Like:
set nbcore 4
port stop all
port config 0 dcb vt off 4 pfc on
port start all
port stop all
port config all rxq 8
port config all txq 8
Fixes: 900550de04a7 ("app/testpmd: add dcb support") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:41 +0000 (14:40 +0800)]
app/testpmd: fix DCB forwarding configuration
After DCB mode is configured, the operations of port stop and port start
change the value of the global variable "dcb_test". As a result, the
forwarding configuration changes from DCB to RSS mode, namely, from
"dcb_fwd_config_setup()" to "rss_fwd_config_setup()".
Currently, the 'dcb_flag' field in struct 'rte_port' indicates whether
the port is configured with DCB. And it is sufficient to have
'dcb_config' as a global variable to control the DCB test status. So
this patch deletes the "dcb_test".
In addition, set 'dcb_config' at the end of init_port_dcb_config()
in case ports fail to enter DCB mode.
Fixes: 900550de04a7 ("app/testpmd: add dcb support") Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings") Fixes: 7741e4cf16c0 ("app/testpmd: VMDq and DCB updates") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
Huisong Li [Wed, 28 Apr 2021 06:40:40 +0000 (14:40 +0800)]
app/testpmd: fix forward lcores number for DCB
For the DCB forwarding test, each core is assigned to each traffic class.
Number of forwarding cores for DCB test must be equal or less than number
of total TC. Otherwise, the following problems may occur:
1/ Redundant polling threads will be created when forwarding cores number
is greater than total TC number.
2/ Two cores would try to use a same queue on a port when Rx/Tx queue
number is greater than the used TC number, which is not allowed.
Fixes: 900550de04a7 ("app/testpmd: add dcb support") Fixes: ce8d561418d4 ("app/testpmd: add port configuration settings") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Lijun Ou <oulijun@huawei.com> Acked-by: Xiaoyun Li <xiaoyun.li@intel.com>
When requested MTU is bigger than mbuf size and scattered Rx is not
enabled, setting MTU fails for VF.
But scattered Rx can be enabled in next port start if required, so
allow setting an MTU bigger than the mbuf size if the device is stopped,
independent of the scattered Rx configuration.
Fixes: a2beaa4a769e ("net/txgbe: support VF MTU update") Cc: stable@dpdk.org Signed-off-by: Jiawen Wu <jiawenwu@trustnetic.com>
After restoring the remote states, the return value of ioctl() is not
checked. Therefore, users cannot know whether the remote state is
restored successfully.
This patch adds a log for restoring failure.
Fixes: 4810d3af8343 ("net/tap: restore state of remote device when closing") Cc: stable@dpdk.org Signed-off-by: Chengchang Tang <tangchengchang@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
In function cons_parse_ntuple_filter, item->spec and item->mask
should both be confirmed non-null before memcmp is used on them. The
current check (item->spec || item->mask) only confirms that item->spec
or item->mask is non-null, so a null pointer can still be passed to
memcmp.
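The difference between the two checks can be shown in isolation. This is a sketch with a reduced, hypothetical struct flow_item and helper names; it is not the driver's parser, only the guard pattern.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced flow item: only the two pointers that matter. */
struct flow_item {
	const void *spec;
	const void *mask;
};

/* Insufficient guard: true when EITHER pointer is set. */
static bool either_set(const struct flow_item *item)
{
	return item->spec || item->mask;
}

/*
 * Safe comparison: memcmp runs only when BOTH pointers are non-NULL.
 * Returns false when a pointer is missing or len exceeds the zero buffer.
 */
static bool item_is_all_zero(const struct flow_item *item, size_t len)
{
	static const unsigned char zero[64];

	if (item->spec == NULL || item->mask == NULL || len > sizeof(zero))
		return false;
	return memcmp(item->spec, zero, len) == 0 &&
	       memcmp(item->mask, zero, len) == 0;
}
```

An item with only spec set passes either_set() but must not reach memcmp on mask, which is exactly the case the stricter guard rejects.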
Huisong Li [Sun, 25 Apr 2021 12:06:29 +0000 (20:06 +0800)]
net/hns3: fix link speed when port is down
When the port is in link-down state, it is meaningless to display the
port link speed. It should be an undefined state.
Fixes: 59fad0f32135 ("net/hns3: support link update operation") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
Huisong Li [Sun, 25 Apr 2021 12:06:28 +0000 (20:06 +0800)]
net/hns3: fix link status when port is stopped
When port is stopped, link down should be reported to user. For HNS3
PF driver, link status comes from link status of hardware. If the port
supports NCSI feature, hardware MAC will not be disabled. In this case,
even if the port is stopped, the link status is still Up. So driver
should set link down when the port is stopped.
Fixes: 59fad0f32135 ("net/hns3: support link update operation") Cc: stable@dpdk.org Signed-off-by: Huisong Li <lihuisong@huawei.com> Signed-off-by: Min Hu (Connor) <humin29@huawei.com>
net/mlx5: fix probing device in legacy bonding mode
If the device was configured as legacy bond one (without
involving E-Switch), the mlx5 PMD erroneously tried to deduce
the vport index raising the fatal error and preventing
device from being used.
The patch checks whether there is E-Switch present and we
should use vport index indeed.
Fixes: 2eb4d0107acc ("net/mlx5: refactor PCI probing on Linux") Fixes: d5c06b1b10ae ("net/mlx5: query vport index match mode and parameters") Cc: stable@dpdk.org Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
The mlx4 PMD tracks the buffers (mbufs) for the packets being
transmitted in a dedicated array named "elts". The tx_burst
routine frees the mbufs from this array once it needs to rearm
the hardware descriptor and store the new mbuf, so it looks
like a replacement of the mbuf pointer in the elts array.
On device stop the mlx4 PMD freed only the part of elts between
the tail and head pointers, leaking the rest of the buffers remaining in
the elts array.
Currently ASO meter must be followed by policy table, so this adds
support for connecting the meter to the policy table.
There are several cases to be considered:
1. For non-termination policy, connect meter to the default policy
table.
2. For non-RSS termination policy case, simply get the policy
table id and connect meter to it.
3. For RSS termination policy case, need to split the flow due
to RSS info in policy, and translate each sub-flow using that RSS,
then create the sub policy table to be connected.
4. In termination policy case, if there's no actions to modify the
packet before meter, no need to use set_tag to save meter id in
register. Only add a new flow in drop table using the same match
criteria as suf-flow, to save cache miss.
Li Zhang [Tue, 27 Apr 2021 10:43:53 +0000 (13:43 +0300)]
net/mlx5: prepare sub-policy for flow with meter
When a flow has an RSS action, the driver splits it into
sub-flows, and each sub-flow is finally configured with
a different HW TIR action.
Any RSS action configured in meter policy may cause
a split in the flow configuration.
To save performance, any TIR action will be configured
in different flow table, so policy can be split to
sub-policies per TIR in the flow creation time.
Create a function to prepare the policy and
its sub-policies for a configured flow with meter.
Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Li Zhang [Tue, 27 Apr 2021 10:43:52 +0000 (13:43 +0300)]
net/mlx5: support meter creation with policy
Create a meter with the new pre-defined policy.
The following cases are to be considered:
1. Add entry match with meter_id in global drop table.
2. For non-termination policy (policy id 0),
add jump rule to suffix table for green and
jump rule to drop table for red.
3. Allocate counter per meter in drop table.
4. Allocate meter resource per domain per color.
5. It can work with both ASO and legacy meter HW objects.
Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
Li Zhang [Tue, 27 Apr 2021 10:43:51 +0000 (13:43 +0300)]
net/mlx5: support meter policy operations
The MLX5 PMD validates the actions in a policy when adding
a new meter policy; if validation passes, it allocates the new
policy object from the meter policy indexed memory pool.
It is common to use the same policy for multiple meters.
MLX5 PMD supports two types of policy: termination policy and
no-termination policy.
Implement the next policy operations:
validate:
The driver doesn't support configuring actions in the flow
after the meter action, except in one case: when the meter policy
is configured to do nothing in GREEN\YELLOW and only the DROP action
in RED. This special policy is called the non-terminated policy
and is handled as a singleton object internally.
For all the terminated policies, the next actions are supported:
GREEN - QUEUE, RSS, PORT_ID, JUMP, DROP, MARK and SET_TAG.
YELLOW - not supported at all -> must be empty.
RED - must include DROP action.
Hence, in ingress case, for example,
QUEUE\RSS\JUMP must be configured as last action for GREEN color.
All the above limitations will be validated.
create:
Validate the policy configuration.
Prepare the related tables and actions.
destroy:
Release the created policy resources.
Signed-off-by: Li Zhang <lizh@nvidia.com> Acked-by: Matan Azrad <matan@nvidia.com>
When statically linked the function prandom_bytes is exposed
and might conflict with something in application. All driver
functions should use the same prefix.
Fixes: 9738793f28ec ("net/bnxt: add VNIC functions and structs") Cc: stable@dpdk.org Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
An application using rte_flow may define a large number of queues
but only use a small subset of them at any one time.
Since querying the status of each queue requires a request/spin/reply
with the firmware, optimize by skipping the request for queues not
running.
For those queues the statistics will be 0.
This cuts the cost of single xstats query in half and has even
bigger gain for simple stats query.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Thomas Monjalon [Wed, 21 Apr 2021 17:59:23 +0000 (19:59 +0200)]
vdpa/mlx5: improve portability of thread naming
The function pthread_setname_np is non-portable,
so it may be unavailable in old glibc or other systems.
The function rte_thread_setname works around these portability issues.
vhost: allocate and free packets in bulk in Tx packed
Move allocation out further and perform all allocation in bulk. The same
goes for freeing packets. In the process, also introduce
virtio_dev_pktmbuf_prep and make virtio_dev_pktmbuf_alloc use that.
Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Instead of calculating the address of a packed descriptor based on the
vq->desc_packed and vq->last_used_idx every time, store that base
address in desc_base. On arm, this saves 176 bytes in code size of
function in which vhost_flush_enqueue_batch_packed gets inlined.
Signed-off-by: Balazs Nemeth <bnemeth@redhat.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Cheng Jiang [Wed, 17 Mar 2021 05:40:54 +0000 (05:40 +0000)]
examples/vhost: fix ioat ring space in callbacks
We use ioat ring space for determining if ioat callbacks can enqueue a
packet to the ioat device. But one slot in the ioat ring can't be used
due to the ioat driver design, so we need to subtract one slot from the
ioat ring space to prevent ring size mismatch in ioat callbacks.
Fixes: 2aa47e94bfb2 ("examples/vhost: add ioat ring space count and check") Cc: stable@dpdk.org Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com> Reviewed-by: Jiayu Hu <jiayu.hu@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jiayu Hu [Tue, 20 Apr 2021 08:57:45 +0000 (04:57 -0400)]
vhost: fix redundant vring status change notification
When VHOST_USER_F_PROTOCOL_FEATURES is not negotiated,
there is no need for vhost_user_set_vring_kick() to
notify the application of vring enabled, as
vhost_user_msg_handler() also notifies the application.
This patch is to remove unnecessary vring_state_changed() call.
Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications") Cc: stable@dpdk.org Signed-off-by: Jiayu Hu <jiayu.hu@intel.com> Tested-by: Yinan Wang <yinan.wang@intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Min Hu (Connor) [Tue, 27 Apr 2021 08:51:21 +0000 (16:51 +0800)]
net/e1000: fix flow error message object
This patch fixes parameter misuse when setting the rte flow action error.
Fixes: c0688ef1eded ("net/igb: parse flow API n-tuple filter") Cc: stable@dpdk.org Signed-off-by: Min Hu (Connor) <humin29@huawei.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com>
When .get_monitor_addr API was introduced, it was implemented in the
i40e driver, but only for the physical function; the virtual function
portion of the driver does not support that API.
Add the missing function pointer to VF device structure.
The i40e driver is not meant to use the VF portion any more, as
currently i40e VF devices are supposed to be managed by iavf driver, but
add this just in case it needs backporting later.
Fixes: a683abf90a22 ("net/i40e: implement power management API") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Hunt <david.hunt@intel.com>
When .get_monitor_addr API was introduced, it was implemented in the
ixgbe driver, but only for the physical function; the virtual function
portion of the driver does not support that API.
Add the missing function pointer to VF device structure.
Fixes: 3982b7967bb7 ("net/ixgbe: implement power management API") Cc: stable@dpdk.org Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com> Reviewed-by: David Hunt <david.hunt@intel.com> Acked-by: Haiyue Wang <haiyue.wang@intel.com>
common/iavf: use macro to define offload/capability
Currently raw hex values are used to define specific bits for each
offload/capability in virtchnl.h. This can lead, and has led, to duplicate
defined bits. Fix this by using the BIT() macro so it's
immediately obvious which bits are used/available.
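The benefit of a BIT() macro over raw hex values can be seen in a small sketch. The flag names and bit positions below are hypothetical and are not the virtchnl.h values; BIT() is defined locally for the example.

```c
#include <stdint.h>

#define BIT(n) (1UL << (n))

/* Hypothetical capability flags; positions are for illustration only. */
#define EXAMPLE_CAP_L2        BIT(0)
#define EXAMPLE_CAP_RSS       BIT(1)
#define EXAMPLE_CAP_CRC       BIT(2)
/* BIT(3) is visibly unused and available. */
#define EXAMPLE_CAP_ADV_SPEED BIT(4)

/* Test whether a capability bit is set in a negotiated bitmap. */
static int cap_is_set(uint32_t caps, uint32_t cap)
{
	return (caps & cap) != 0;
}
```

With consecutive bit indices in one column, a duplicated or skipped position stands out in review, which is much harder to spot in a list of raw hex constants.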
Support for allowing VFs to negotiate the descriptor format was added
previously.
This support requires that the VF specify which descriptor format to use
when requesting Rx queues. The VF is supposed to request the set of
supported formats via the new VIRTCHNL_OP_GET_SUPPORTED_RXDIDS, and then
set one of the supported formats in the rxdid field of the
virtchnl_rxq_info structure.
The virtchnl.h header does not provide an enumeration of the format
values. The existing implementations in the PF directly use the values
from the DDP package.
Make the formats explicit by defining an enumeration of the RXDIDs.
Provide an enumeration for the values as well as the bit positions as
returned by the supported_rxdids data from the
VIRTCHNL_OP_GET_SUPPORTED_RXDIDS.
The value of offload VIRTCHNL_VF_OFFLOAD_CRC bit already existed as
VIRTCHNL_VF_CAP_ADV_LINK_SPEED. Fix this now by changing the value of
VIRTCHNL_VF_OFFLOAD_CRC to a currently unused value.
Also, move the define for VIRTCHNL_VF_CAP_ADV_LINK_SPEED in the correct
place to line up with the other bit values and add a comment for its
purpose. Hopefully this will prevent from defining duplicate bits moving
forward.
net/ice: refactor input set fields for switch filter
Input set has been divided into inner and outer part to distinguish
different fields. However, the parse method of switch filter doesn't
match this update. Refactor switch filter to distinguish inner and outer
input set in the same way as other filters.