dpdk.git
2 years ago net/mlx5: fix Tx metadata endianness in data path
Bing Zhao [Mon, 27 Sep 2021 08:02:03 +0000 (11:02 +0300)]
net/mlx5: fix Tx metadata endianness in data path

The metadata can be set in the mbuf dynamic field and then used in
flow rules steering for egress direction. The hardware requires
network order for both the insertion of a rule and sending a packet.
Strictly speaking, the endianness itself is not restricted, but the byte
order used for sending a packet and for its steering rule must be
consistent.

In the past, there was no endianness conversion for performance
reasons. The flow rule converted the metadata into little endian for
hardware (if needed) and the packet also hit the flow rule with little
endian.

After the metadata was converted to big endian, the missing adaptation
in the data path caused egress packets to miss the flow rule.

Converting the metadata to big endian before posting a WQE to the
hardware solves this issue.
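
The conversion described above can be sketched as follows. This is only
an illustration, not the driver code: the structure and function names
are hypothetical, and the standard htonl() stands in for the DPDK helper
rte_cpu_to_be_32().

```c
#include <stdint.h>
#include <arpa/inet.h> /* htonl(): host order to network (big-endian) order */

/* Hypothetical stand-in for the metadata field of a Tx descriptor. */
struct tx_desc_sketch {
    uint32_t flow_metadata_be; /* always stored in big-endian order */
};

/* Convert the CPU-order metadata to big endian before it is posted. */
static void
tx_desc_set_metadata_sketch(struct tx_desc_sketch *desc, uint32_t metadata_cpu)
{
    desc->flow_metadata_be = htonl(metadata_cpu);
}
```

With the conversion done at this single point, the bytes seen by the
hardware are the same on little-endian and big-endian hosts, which is
what keeps the packet consistent with its steering rule.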

Fixes: b57e414b48c0 ("net/mlx5: convert meta register to big-endian")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years ago net/mlx5: fix flow tables double release
Bing Zhao [Tue, 28 Sep 2021 04:08:51 +0000 (07:08 +0300)]
net/mlx5: fix flow tables double release

In the function mlx5_alloc_shared_dr(), various failures can trigger
the error cleanup process. In the caller, mlx5_dev_spawn(), once an
error occurs after mlx5_alloc_shared_dr(), mlx5_os_free_shared_dr() is
called to release all the resources.

To prevent a double release, the resource pointers should be checked
before releasing and set to NULL afterwards.

In mlx5_free_table_hash_list(), the pointer was not set to NULL after
the release, so a double release could cause a crash.

Setting the tables pointer to NULL, as is done for other resources,
avoids the double release and the crash.
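
The check-then-clear pattern can be sketched as below. The structure and
function names are hypothetical stand-ins, and plain free() replaces the
real table release routine.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a shared context that owns a table list. */
struct shared_ctx_sketch {
    void *tables;
};

/* Release the tables once; a second call is a harmless no-op. */
static void
free_tables_sketch(struct shared_ctx_sketch *ctx)
{
    if (ctx->tables == NULL)
        return;
    free(ctx->tables);
    ctx->tables = NULL; /* prevents a double release on the error path */
}
```

Because the pointer is cleared, it does not matter whether the function
is reached from the local error path, from the caller's cleanup, or from
both.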

Fixes: 54534725d2f3 ("net/mlx5: fix flow table hash list conversion")
Cc: stable@dpdk.org
Signed-off-by: Bing Zhao <bingz@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years ago net/mlx5: support new global device syntax
Xueming Li [Thu, 23 Sep 2021 06:45:26 +0000 (14:45 +0800)]
net/mlx5: support new global device syntax

This patch supports the new global device syntax, for example:
bus=pci,addr=BB:DD.F/class=eth/driver=mlx5,devargs,..

The driver parameter check now ignores the "driver" key, which is part
of the new global device syntax, instead of reporting an error.
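
A minimal sketch of such a key check is shown below. It assumes the
driver keeps a list of known parameter names; the function name and the
list handling are hypothetical and do not mirror the mlx5 code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if a devargs key should be accepted.
 * "driver" belongs to the global device syntax, so it is skipped
 * silently instead of being reported as an unknown parameter.
 */
static bool
devarg_key_accepted_sketch(const char *key, const char *const known[],
                           size_t n_known)
{
    if (strcmp(key, "driver") == 0)
        return true;
    for (size_t i = 0; i < n_known; i++) {
        if (strcmp(key, known[i]) == 0)
            return true;
    }
    return false; /* unknown key: the caller reports an error */
}
```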

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years ago net/virtio: revert forcing IOVA as VA mode for virtio-user
Maxime Coquelin [Thu, 30 Sep 2021 08:12:59 +0000 (10:12 +0200)]
net/virtio: revert forcing IOVA as VA mode for virtio-user

This patch removes the simplification in Virtio descriptors
handling, where their buffer addresses are IOVAs for Virtio
PCI devices, and VA-only for Virtio-user devices, which
added a requirement on Virtio-user that it only supported
IOVA as VA.

This change introduced a regression for applications using
Virtio-user and other physical PMDs that require IOVA as PA
because they don't use an IOMMU.

This patch reverts to the old behaviour, but the revert had to be
reworked because of the refactoring that happened in v21.02.

Fixes: 17043a2909bb ("net/virtio: force IOVA as VA mode for virtio-user")
Cc: stable@dpdk.org
Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years ago net/virtio-user: fix init when using existing tap
David Marchand [Tue, 28 Sep 2021 08:51:14 +0000 (10:51 +0200)]
net/virtio-user: fix init when using existing tap

When attaching to an existing mono queue tap, virtio-user did not
report that the virtio device was not properly initialised, which
prevented the port from being started later.

$ ip tuntap add test mode tap
$ dpdk-testpmd --vdev \
  net_virtio_user0,iface=test,path=/dev/vhost-net,queues=2 -- -i

...
virtio_user_dev_init_mac(): (/dev/vhost-net) No valid MAC in devargs or
device, use random
vhost_kernel_open_tap(): TUNSETIFF failed: Invalid argument
vhost_kernel_enable_queue_pair(): fail to open tap for vhost kernel
virtio_user_start_device(): (/dev/vhost-net) Failed to start device
...
Configuring Port 0 (socket 0)
vhost_kernel_open_tap(): TUNSETIFF failed: Invalid argument
vhost_kernel_enable_queue_pair(): fail to open tap for vhost kernel
virtio_set_multiple_queues(): Multiqueue configured but send command
failed, this is too late now...
Fail to start port 0: Invalid argument
Please stop the ports first
Done

The virtio-user with vhost-kernel backend was going through a lot
of complications to initialise tap fds only when using them.

For each qp enabled for the first time, a tapfd was created via
TUNSETIFF with unneeded additional steps (see below) and then mapped to
the right qp in the vhost-net backend.
Unneeded steps (as long as it has been done once for the port):
- tap features were queried while this is a constant on a running
  system,
- the device name in DPDK was updated,
- the mac address of the tap was set,

On subsequent qp state changes, the vhost-net backend fd mapping was
updated and the associated queue/tapfd were disabled/enabled via
TUNSETQUEUE.

Now, this patch simplifies the whole logic by keeping all tapfds opened
and in enabled state (from the tap point of view) at all times.

Unused ioctl defines are removed.

Tap features are validated earlier to fail initialisation asap.
Tap name discovery and mac address configuration are moved to the
configuration of qp 0.

To support attaching to a mono queue tap, the virtio-user driver now
tries to attach in multi queue first, then falls back to mono queue.

Finally (but this is more for consistency), VIRTIO_NET_F_MQ feature is
exposed only if the underlying tap supports multi queue.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years ago net/bnxt: fix tunnel port accounting
Ajit Khaparde [Fri, 24 Sep 2021 19:52:47 +0000 (12:52 -0700)]
net/bnxt: fix tunnel port accounting

Fix the tunnel port counting logic.
Currently we are incrementing the port count without checking
whether bnxt_hwrm_tunnel_dst_port_alloc returns success or failure.
Modify the logic to increment it only if the firmware returns success.
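
The corrected ordering can be sketched as follows. The types, the
callback and the two stub allocators are hypothetical; they only stand
in for the firmware call and the driver state.

```c
#include <stdint.h>

/* Hypothetical firmware call: returns 0 on success, nonzero on failure. */
typedef int (*port_alloc_fn_sketch)(uint16_t port);

/* Stub firmware calls, for illustration only. */
static int alloc_ok_sketch(uint16_t port) { (void)port; return 0; }
static int alloc_fail_sketch(uint16_t port) { (void)port; return -1; }

struct tunnel_state_sketch {
    uint16_t port;
    unsigned int port_cnt;
};

/* Account for the port only after the firmware accepted it. */
static int
tunnel_port_add_sketch(struct tunnel_state_sketch *st, uint16_t port,
                       port_alloc_fn_sketch alloc)
{
    int rc = alloc(port);

    if (rc != 0)
        return rc; /* failure: leave the counter untouched */
    st->port = port;
    st->port_cnt++;
    return 0;
}
```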

Fixes: 10d074b2022d ("net/bnxt: support tunneling")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
2 years ago net/bnxt: improve error recovery information messages
Kalesh AP [Fri, 24 Sep 2021 05:17:53 +0000 (10:47 +0530)]
net/bnxt: improve error recovery information messages

The error recovery async event messages are often mistaken
for errors. Improve the wording to clarify the meaning of
these events.
Also, take the first step towards more inclusive language.
The references to master will be changed to primary.
For example: "bnxt_is_master_func()" will be renamed to
"bnxt_is_primary_func()".

Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2 years ago net/bnxt: fix memzone free for Tx and Rx rings
Ajit Khaparde [Mon, 20 Sep 2021 23:11:51 +0000 (16:11 -0700)]
net/bnxt: fix memzone free for Tx and Rx rings

The device cleanup logic was freeing most of the ring related memory,
but was not freeing up the memzone associated with the rings.
This patch fixes the issue.

Fixes: 2eb53b134aae ("net/bnxt: add initial Rx code")
Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2 years ago net/bnxt: fix Tx queue startup state
Ajit Khaparde [Fri, 17 Sep 2021 20:20:45 +0000 (13:20 -0700)]
net/bnxt: fix Tx queue startup state

Default queue state of Tx queues on startup is not correct.
Fix this by setting the state when the port is started.

Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
2 years ago net/bnxt: fix function driver register/unregister
Kalesh AP [Wed, 22 Sep 2021 08:30:44 +0000 (14:00 +0530)]
net/bnxt: fix function driver register/unregister

1. Fix to use correct fields in the request structure of
   HWRM_FUNC_DRV_RGTR.
2. Remove the "flags" argument to bnxt_hwrm_func_driver_unregister()
   as it is not needed.

Fixes: beb3087f5056 ("net/bnxt: add driver register/unregister")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
2 years ago net/ice: support IEEE 1588 PTP
Simei Su [Tue, 28 Sep 2021 06:27:53 +0000 (14:27 +0800)]
net/ice: support IEEE 1588 PTP

Add ice support for new ethdev APIs to enable/disable and read/write/adjust
IEEE1588 PTP timestamps. Currently, only scalar path supports 1588 PTP,
vector path doesn't.

The example command for running ptpclient is as below:
./build/examples/dpdk-ptpclient -c 1 -n 3 -- -T 0 -p 0x1

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/ice: retry getting VF VSI map after failure
Dapeng Yu [Fri, 24 Sep 2021 08:08:20 +0000 (16:08 +0800)]
net/ice: retry getting VF VSI map after failure

The request to get the VF VSI map may fail when DCF is busy. This
patch adds a retry mechanism so that the request can succeed.
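
A bounded retry loop of this kind might look like the sketch below. The
callback type, the stub peer and the retry count are assumptions made
for illustration and are not taken from the ice driver.

```c
/* Hypothetical request callback: returns 0 on success, nonzero if busy. */
typedef int (*request_fn_sketch)(void *ctx);

/* Stub peer that fails a given number of times before succeeding. */
struct busy_peer_sketch {
    unsigned int failures_left;
};

static int
busy_request_sketch(void *ctx)
{
    struct busy_peer_sketch *peer = ctx;

    if (peer->failures_left > 0) {
        peer->failures_left--;
        return -1; /* peer is busy */
    }
    return 0;
}

/* Issue the request up to max_tries times and stop at the first success. */
static int
request_with_retry_sketch(request_fn_sketch req, void *ctx,
                          unsigned int max_tries)
{
    int rc = -1;

    for (unsigned int i = 0; i < max_tries; i++) {
        rc = req(ctx);
        if (rc == 0)
            break;
        /* a real driver would delay briefly here before retrying */
    }
    return rc;
}
```

Bounding the number of attempts keeps a permanently busy peer from
blocking the caller forever; the last error is returned instead.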

Fixes: b09d34ac8584 ("net/ice: fix flow redirector")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Acked-by: Haiyue Wang <haiyue.wang@intel.com>
2 years ago common/iavf: fix ARQ resource leak
Qiming Chen [Fri, 10 Sep 2021 03:12:49 +0000 (11:12 +0800)]
common/iavf: fix ARQ resource leak

In the iavf_init_arq function, if an exception occurs in the
iavf_config_arq_regs function, the previously allocated ARQ (Admin
Receive Queue) bufs resource is not released. This patch rolls back
the resources in the same way as the iavf_init_asq function does.

Fixes: 87aca6d8d8a4 ("net/iavf/base: fix command buffer memory leak")
Cc: stable@dpdk.org
Signed-off-by: Qiming Chen <chenqiming_huawei@163.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/iavf: fix Rx queue IRQ resource leak
Qiming Chen [Fri, 10 Sep 2021 06:54:57 +0000 (14:54 +0800)]
net/iavf: fix Rx queue IRQ resource leak

In the iavf_config_rx_queues_irqs function, the memory pointed to by the
intr_handle->intr_vec and qv_map addresses is not released in the
subsequent hook branch, resulting in resource leakage.

Fixes: f593944fc988 ("net/iavf: enable IRQ mapping configuration for large VF")
Cc: stable@dpdk.org
Signed-off-by: Qiming Chen <chenqiming_huawei@163.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/ice: enable Rx timestamp on flex descriptor
Simei Su [Sun, 26 Sep 2021 14:04:45 +0000 (22:04 +0800)]
net/ice: enable Rx timestamp on flex descriptor

Use the dynamic mbuf API to register the timestamp field and flag.
The ice device can write the Rx timestamp value into the dynamic
mbuf field via the flex descriptor. This feature is turned on by dev
config "enable-rx-timestamp". Currently, it is only supported
on the scalar path.

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/i40e: remove i40evf
Robin Zhang [Fri, 24 Sep 2021 06:22:26 +0000 (06:22 +0000)]
net/i40e: remove i40evf

The default VF driver for Intel 700 Series Ethernet Controller was
already switched to iavf in DPDK 21.05, so i40evf no longer needs to
be maintained. Remove the i40evf related code.

Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/iavf: remove i40evf devargs option
Robin Zhang [Fri, 24 Sep 2021 06:22:27 +0000 (06:22 +0000)]
net/iavf: remove i40evf devargs option

Since i40evf will be removed, there is no need to keep the devargs
option "driver=i40evf" in iavf.

Signed-off-by: Robin Zhang <robinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/iavf: support IPv4/L4 checksum RSS offload
Alvin Zhang [Fri, 24 Sep 2021 09:57:29 +0000 (17:57 +0800)]
net/iavf: support IPv4/L4 checksum RSS offload

Add support for RSS_IPV4_CHKSUM & RSS_L4_CHKSUM RSS offload types
in RSS flow.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago common/iavf: enable hash calculation based on IPv4 checksum
Alvin Zhang [Fri, 24 Sep 2021 09:57:28 +0000 (17:57 +0800)]
common/iavf: enable hash calculation based on IPv4 checksum

Add an IPv4 header checksum field selector, which can be used when
creating FDIR or RSS rules related to the IPv4 header checksum.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/ice: support IPv4/L4 checksum RSS offload
Alvin Zhang [Fri, 24 Sep 2021 09:53:41 +0000 (17:53 +0800)]
net/ice: support IPv4/L4 checksum RSS offload

Add support for RSS_IPV4_CHKSUM & RSS_L4_CHKSUM RSS offload types
in RSS flow.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/ice: support low Rx latency
Alvin Zhang [Fri, 24 Sep 2021 09:34:29 +0000 (17:34 +0800)]
net/ice: support low Rx latency

This patch adds a devarg parameter to enable/disable low Rx latency.

Signed-off-by: Alvin Zhang <alvinx.zhang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/ice: fix double free ACL flow entry
Dapeng Yu [Fri, 3 Sep 2021 10:04:11 +0000 (18:04 +0800)]
net/ice: fix double free ACL flow entry

Calling ice_flow_rem_entry() directly without checking entry_id may
cause an ACL flow entry to be freed more than once.

This patch looks up entry_id first and only then calls
ice_flow_rem_entry(), which avoids the defect.
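
The lookup-before-remove idea can be sketched with a small table. All
names below are hypothetical, and the fixed-size array is an assumption
made only to keep the example self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ENTRIES_SKETCH 4

/* Hypothetical flow table; an id of 0 marks a free slot. */
struct flow_table_sketch {
    uint64_t ids[MAX_ENTRIES_SKETCH];
    unsigned int removals; /* counts how many entries were really freed */
};

/* Return the slot holding id, or -1 if the entry does not exist. */
static int
find_entry_sketch(const struct flow_table_sketch *t, uint64_t id)
{
    if (id == 0)
        return -1;
    for (int i = 0; i < MAX_ENTRIES_SKETCH; i++) {
        if (t->ids[i] == id)
            return i;
    }
    return -1;
}

/* Remove an entry only if the lookup still finds it. */
static bool
remove_entry_checked_sketch(struct flow_table_sketch *t, uint64_t id)
{
    int slot = find_entry_sketch(t, id);

    if (slot < 0)
        return false; /* already gone: nothing to free */
    t->ids[slot] = 0;
    t->removals++;
    return true;
}
```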

Fixes: 40d466fa9f76 ("net/ice: support ACL filter in DCF")
Cc: stable@dpdk.org
Signed-off-by: Dapeng Yu <dapengx.yu@intel.com>
Reviewed-by: Simei Su <simei.su@intel.com>
2 years ago net/iavf: fix high CPU usage on frequent command
Qiming Chen [Sat, 11 Sep 2021 04:02:21 +0000 (12:02 +0800)]
net/iavf: fix high CPU usage on frequent command

A test scenario that continuously reads port statistics causes the CPU
usage to soar, which is not acceptable. Analysis shows that the VF and
PF command interaction is completed through the iavf_execute_vf_cmd
function.
After the message is sent, the function needs to wait for the interrupt
thread to obtain the response data from the PF. The rte_delay_ms
interface is used for this wait, but it does not release the CPU during
the waiting period, so the statistics query keeps occupying the CPU.
This is the root cause of the soaring CPU usage.

The command interaction belongs to the control plane and has no strict
performance requirements. It is recommended to wait with the
iavf_msec_delay interface instead, which does not take up CPU time.

Fixes: 22b123a36d07 ("net/avf: initialize PMD")
Cc: stable@dpdk.org
Signed-off-by: Qiming Chen <chenqiming_huawei@163.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
2 years ago net/virtio: do not use PMD log type
David Marchand [Thu, 16 Sep 2021 13:25:02 +0000 (15:25 +0200)]
net/virtio: do not use PMD log type

Fixes: 1982462eadea ("net/virtio: add Rx free threshold setting")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years ago net/virtio: remove blank lines in log
Thomas Monjalon [Thu, 16 Sep 2021 09:53:43 +0000 (11:53 +0200)]
net/virtio: remove blank lines in log

The macros PMD_*_LOG already include the line feed character.
Redundant \n are removed.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years ago vhost: normalize return type and function name
Xuan Ding [Thu, 16 Sep 2021 04:34:15 +0000 (04:34 +0000)]
vhost: normalize return type and function name

In some function definitions, place the return type and the function
name on separate lines to be consistent with DPDK coding style.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years ago net/virtio: report Tx descriptor limits in dev info
Ivan Ilchenko [Wed, 15 Sep 2021 12:23:27 +0000 (15:23 +0300)]
net/virtio: report Tx descriptor limits in dev info

Report max/min/align Tx descriptors limits in device info get callback.
Before calling the callback, rte_eth_dev_info_get() provides
default values of nb_min as zero and nb_max as UINT16_MAX that are
not correct for the driver, so one can't rely on them.

Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years ago vhost: rework RARP packet injection
David Marchand [Wed, 15 Sep 2021 14:54:47 +0000 (16:54 +0200)]
vhost: rework RARP packet injection

Caught by code review, this copy is unnecessary.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years ago net/virtio: wait device ready during reset
Xueming Li [Wed, 15 Sep 2021 10:12:04 +0000 (18:12 +0800)]
net/virtio: wait device ready during reset

According to the virtio spec, the device MUST reset when 0 is written
to device_status, and present 0 in device_status once reset is done.

This patch waits for the status value to become 0 during the reset
operation. If that does not happen within 3 seconds, the timeout is
logged and the reset continues.
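
The bounded wait can be sketched as a polling loop. The status reader
callback and the fake device are hypothetical, and the loop counts polls
instead of sleeping so that the example stays deterministic; with a
1 ms delay per iteration, 3000 polls would correspond to the 3 second
timeout mentioned above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical accessor for the device_status register. */
typedef uint8_t (*status_read_fn_sketch)(void *dev);

/* Fake device that reports a nonzero status for a number of reads. */
struct fake_dev_sketch {
    unsigned int polls_until_ready;
};

static uint8_t
fake_read_status_sketch(void *dev)
{
    struct fake_dev_sketch *d = dev;

    if (d->polls_until_ready == 0)
        return 0; /* reset is done */
    d->polls_until_ready--;
    return 0x0f; /* still resetting */
}

/* Poll until the status reads 0; give up after max_polls reads. */
static bool
wait_reset_done_sketch(void *dev, status_read_fn_sketch read_status,
                       unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (read_status(dev) == 0)
            return true;
        /* a real driver would sleep for about 1 ms here */
    }
    return false; /* timed out: the caller logs this and continues */
}
```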

Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years ago net/virtio: fix Tx completed mbuf leak on device stop
Ivan Ilchenko [Wed, 15 Sep 2021 09:19:42 +0000 (12:19 +0300)]
net/virtio: fix Tx completed mbuf leak on device stop

Free Tx completed mbufs on device stop. Not completed Tx mbufs cannot be
freed since they are still in use.

Fixes: c1f86306a026 ("virtio: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years ago net/virtio: fix Tx cleanup functions to have same signature
Ivan Ilchenko [Wed, 15 Sep 2021 09:19:41 +0000 (12:19 +0300)]
net/virtio: fix Tx cleanup functions to have same signature

There is a family of functions that clean up completed transmits.
Fix the packed virtqueue cleanup functions to have the same signature
as the split virtqueue ones. This lets all functions of the family
match the same callback prototype.

Fixes: 892dc798fa9c ("net/virtio: implement Tx path for packed queues")
Cc: stable@dpdk.org
Signed-off-by: Ivan Ilchenko <ivan.ilchenko@oktetlabs.ru>
Signed-off-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2 years ago vhost: clean IOTLB cache on vring stop
Eugenio Pérez [Fri, 27 Aug 2021 16:12:31 +0000 (18:12 +0200)]
vhost: clean IOTLB cache on vring stop

Old IOVA cache entries are left behind when the virtio driver changes
in the VM. If all these old entries have IOVA addresses lower than the
new IOVA entries, the vhost code needs to iterate over the whole cache
to find the new ones. If only one new IOVA entry is needed for the new
translations, this condition lasts forever.

This has been observed in the transition from virtio-net to testpmd's
vfio-pci driver, reducing the performance from more than 10Mpps to less
than 0.07Mpps if the hugepage address was higher than the networking
buffers. Since all new buffers are contained in this new gigantic page,
vhost needs to scan IOTLB_CACHE_SIZE - 1 entries for each translation
at worst.

Fixes: 69c90e98f483 ("vhost: enable IOMMU support")
Cc: stable@dpdk.org
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reported-by: Pei Zhang <pezhang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
2 years ago common/cnxk: support merging base steering rule
Satheesh Paul [Tue, 31 Aug 2021 04:16:14 +0000 (09:46 +0530)]
common/cnxk: support merging base steering rule

This patch adds an ROC API to merge base steering rule with rules
added by VF.

Signed-off-by: Satheesh Paul <psatheesh@marvell.com>
Reviewed-by: Kiran Kumar K <kirankumark@marvell.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years ago raw/cnxk_bphy: support reading NPA/SSO PF function
Tomasz Duszynski [Sun, 15 Aug 2021 23:12:02 +0000 (01:12 +0200)]
raw/cnxk_bphy: support reading NPA/SSO PF function

Add support for reading NPA/SSO pf_func which will be used
by a PSM to access NPA/SSO. PSM is a hardware block capable
of dispatching jobs to different blocks within a baseband
module.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: support reading NPA/SSO PF function
Tomasz Duszynski [Sun, 15 Aug 2021 23:12:01 +0000 (01:12 +0200)]
common/cnxk: support reading NPA/SSO PF function

Add support for reading NPA/SSO pf_func which will be used
by a PSM to access NPA/SSO. PSM is a hardware block capable
of dispatching jobs to different blocks within a baseband
module.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2 years ago raw/cnxk_bphy: fix device lookup
Tomasz Duszynski [Sun, 15 Aug 2021 23:12:00 +0000 (01:12 +0200)]
raw/cnxk_bphy: fix device lookup

The name needs to be prepared before the lookup, otherwise the
PMD will not be released.

Fixes: 24d9c5d59d5d ("raw/cnxk_bphy: add baseband PHY skeleton driver")
Cc: stable@dpdk.org
Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Reviewed-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/octeontx2: fix link event message size
Harman Kalra [Fri, 30 Jul 2021 16:08:06 +0000 (21:38 +0530)]
common/octeontx2: fix link event message size

Due to the wrong size of the mbox message allocated for sending link
status to the VF, an incorrect link status is observed.

Fixes: cb8d769fb6fe ("common/octeontx2: send link event to VF")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: update NPC MCAM range for cn98xx
Harman Kalra [Fri, 30 Jul 2021 16:10:08 +0000 (21:40 +0530)]
common/cnxk: update NPC MCAM range for cn98xx

NPC MCAM entry distribution is based on the maximum number of PFs and
LFs available. Fix the maximum number of PFs and LFs available on
cn98xx to correct the MCAM alloc entry range.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: support loop mode for cn98xx
Harman Kalra [Fri, 30 Jul 2021 16:10:07 +0000 (21:40 +0530)]
common/cnxk: support loop mode for cn98xx

In case of cn98xx, 2 NIX blocks and 4 LBK blocks are present. Moreover,
AF VFs are alternately attached to NIX0 and NIX1 to ensure load
balancing. To support loopback functionality between pairs, NIX0/NIX1
are attached to LBK1/LBK2 for transmission/reception respectively.
But in this default configuration NIX blocks cannot receive the
packets they sent from the same LBK, which is an important requirement
as some ODP applications only use one AF VF for loopback functionality.
To support this scenario, NIX0 can use LBK0 (NIX1 - LBK3) by setting a
loop flag while making the LF alloc mailbox request.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago raw/cnxk_bphy: use named value for queue count
Jakub Palider [Mon, 26 Jul 2021 13:58:15 +0000 (08:58 -0500)]
raw/cnxk_bphy: use named value for queue count

The queue count is used in a few places, so it was given a
reasonable name.

Signed-off-by: Jakub Palider <jpalider@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: align function naming
Jakub Palider [Mon, 26 Jul 2021 13:58:14 +0000 (08:58 -0500)]
common/cnxk: align function naming

There is an inconsistency in naming interrupt control
functions. This patch aligns names accordingly.

Signed-off-by: Jakub Palider <jpalider@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: reduce function visibility
Jakub Palider [Mon, 26 Jul 2021 13:58:13 +0000 (08:58 -0500)]
common/cnxk: reduce function visibility

Some functions are not used outside of local ROC scope. These need
updating classifiers and removal from header.

Signed-off-by: Jakub Palider <jpalider@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago raw/cnxk_bphy: do not include IRQ header directly
Tomasz Duszynski [Mon, 26 Jul 2021 13:58:12 +0000 (08:58 -0500)]
raw/cnxk_bphy: do not include IRQ header directly

One should only use roc_api.h which exports all internal headers.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: remove duplicated constant
Tomasz Duszynski [Mon, 26 Jul 2021 13:58:11 +0000 (08:58 -0500)]
common/cnxk: remove duplicated constant

Drop duplicated constant.

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago common/cnxk: return saner error codes
Tomasz Duszynski [Mon, 26 Jul 2021 13:58:10 +0000 (08:58 -0500)]
common/cnxk: return saner error codes

If particular LMAC does not exist then it's saner to return ENODEV
instead of EINVAL.
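
One way to picture the distinction is the sketch below. The function,
the presence mask and the LMAC count are hypothetical, and the split
between an out-of-range id and an absent device is an assumption made
for illustration rather than the actual ROC logic.

```c
#include <errno.h>

#define NUM_LMACS_SKETCH 4 /* assumed number of LMAC slots */

/* Return 0 if the LMAC exists, otherwise a negative errno value. */
static int
lmac_check_sketch(unsigned int present_mask, int lmac_id)
{
    if (lmac_id < 0 || lmac_id >= NUM_LMACS_SKETCH)
        return -EINVAL; /* malformed argument */
    if (!(present_mask & (1u << lmac_id)))
        return -ENODEV; /* well-formed id, but no such device */
    return 0;
}
```

Returning ENODEV lets the caller tell a missing device apart from a
programming error in the arguments.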

Signed-off-by: Tomasz Duszynski <tduszynski@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
2 years ago net/cnxk: add TM shaper and node operations
Satha Rao [Wed, 22 Sep 2021 06:11:48 +0000 (02:11 -0400)]
net/cnxk: add TM shaper and node operations

Implemented TM node, shaper profile, hierarchy_commit and
statistic operations.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago net/cnxk: add TM capabilities and queue rate limit handlers
Satha Rao [Wed, 22 Sep 2021 06:11:47 +0000 (02:11 -0400)]
net/cnxk: add TM capabilities and queue rate limit handlers

Initial version of TM implementation added basic infrastructure,
TM node_get, capabilities operations and rate limit queue operation.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: add handlers to get TM hierarchy internals
Satha Rao [Wed, 22 Sep 2021 06:11:46 +0000 (02:11 -0400)]
common/cnxk: add handlers to get TM hierarchy internals

Platform specific TM tree hierarchy details are part of common cnxk
driver. This patch introduces missing HAL APIs to return state of
TM hierarchy required to support ethdev TM operations inside cnxk PMD.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: support TM error type get
Satha Rao [Wed, 22 Sep 2021 06:11:45 +0000 (02:11 -0400)]
common/cnxk: support TM error type get

Different TM handlers return various platform specific errors.
This patch introduces a new API to convert these internal error
types to RTE_TM* error types.
Also update the error message API with the missing TM error types.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: handle packet mode shaper limits
Satha Rao [Wed, 22 Sep 2021 06:11:44 +0000 (02:11 -0400)]
common/cnxk: handle packet mode shaper limits

Add new macros to reflect HW shaper PPS limits. Add a new API to
validate input rates for packet mode. Increase the adjust value to
support lower PPS (<61).

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: increase sched weight and shaper burst limit
Nithin Dabilpuram [Wed, 22 Sep 2021 06:11:43 +0000 (02:11 -0400)]
common/cnxk: increase sched weight and shaper burst limit

Increase sched weight and shaper burst limit for cn10k.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: support SMQ flush
Satha Rao [Wed, 22 Sep 2021 06:11:42 +0000 (02:11 -0400)]
common/cnxk: support SMQ flush

Each NIX interface has one or more SMQs connected to SQs to send
packets. When flush is enabled on an SMQ, hardware pushes all packets
from the SMQ to the physical link. This API enables flush on all SMQs
of an interface.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago common/cnxk: set appropriate max frame size for SDP and LBK
Satha Rao [Wed, 22 Sep 2021 06:11:41 +0000 (02:11 -0400)]
common/cnxk: set appropriate max frame size for SDP and LBK

For the SDP interface, all platforms support a frame size of up to
65535. Update the API with a new check for the SDP interface.

Signed-off-by: Satha Rao <skoteshwar@marvell.com>
Acked-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
2 years ago mbuf: promote Tx offload helper to stable
Stephen Hemminger [Mon, 4 Oct 2021 19:33:01 +0000 (12:33 -0700)]
mbuf: promote Tx offload helper to stable

This function should be made stable now.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2 years ago mbuf: promote check helper to stable
Stephen Hemminger [Mon, 4 Oct 2021 19:33:00 +0000 (12:33 -0700)]
mbuf: promote check helper to stable

This one has been in for the required time period.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years ago mbuf: promote dynamic fields to stable
Stephen Hemminger [Mon, 4 Oct 2021 19:32:59 +0000 (12:32 -0700)]
mbuf: promote dynamic fields to stable

These functions to register dynamic fields were added in 19.11
and should be promoted to stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years ago mbuf: promote more helpers to stable
Stephen Hemminger [Mon, 4 Oct 2021 19:32:58 +0000 (12:32 -0700)]
mbuf: promote more helpers to stable

These two functions were added in 19.11 as experimental.
Time to promote them to stable status.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years ago mbuf: promote some helpers to stable
David Marchand [Sat, 2 Oct 2021 14:16:14 +0000 (16:16 +0200)]
mbuf: promote some helpers to stable

Those accessors were introduced more than two years ago
(rte_mbuf_to_priv in v18.08, rte_mbuf_*_addr* in v19.02).
Time to mark them stable.

rte_mbuf_to_baddr() could be removed, but since we lack a deprecation
notice, keep it as a simple wrapper.

Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
2 years ago ring: promote new sync modes and peek to stable
Sean Morrissey [Mon, 4 Oct 2021 09:22:18 +0000 (09:22 +0000)]
ring: promote new sync modes and peek to stable

These methods were introduced in 20.05.
There have been no changes in their public API since then.
They seem mature enough to remove the experimental tag.

Signed-off-by: Sean Morrissey <sean.morrissey@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
2 years agotest/mem: fix memory autotests on FreeBSD
Bruce Richardson [Fri, 17 Sep 2021 15:09:17 +0000 (16:09 +0100)]
test/mem: fix memory autotests on FreeBSD

The memory autotests were failing on FreeBSD, due to an incorrect errno
variable being checked for ENOTSUP. The test checked "errno" while the
DPDK API sets "rte_errno". Changing to check the right variable makes
the test behave properly.
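
The mismatch can be shown with a minimal, self-contained sketch. The
names lib_errno and lib_get_segment_fd() are hypothetical stand-ins for
rte_errno and the API under test; this is not the test code itself.

```c
#include <errno.h>

/* Hypothetical stand-in for rte_errno: the library's own error variable. */
static int lib_errno;

/* Hypothetical API call: fails and reports the reason via lib_errno only. */
int lib_get_segment_fd(void)
{
    lib_errno = ENOTSUP;
    return -1;
}

/* Buggy check: looks at the C library errno, which the API never set. */
int unsupported_per_errno(void)
{
    errno = 0;
    return lib_get_segment_fd() < 0 && errno == ENOTSUP;
}

/* Fixed check: looks at the variable the API actually sets. */
int unsupported_per_lib_errno(void)
{
    lib_errno = 0;
    return lib_get_segment_fd() < 0 && lib_errno == ENOTSUP;
}
```

With this sketch, only the second check recognizes the "not supported"
case, so a test built on the first one would treat it as a failure.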

Fixes: c3e35a0966b8 ("test/mem: check segment fd API")

Reported-by: Brandon Lo <blo@iol.unh.edu>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
2 years agoeal/freebsd: lock memory device to prevent conflicts
Bruce Richardson [Mon, 13 Sep 2021 14:08:48 +0000 (15:08 +0100)]
eal/freebsd: lock memory device to prevent conflicts

Only a single DPDK process on the system can be using the /dev/contigmem
mappings at a time, but this was never explicitly enforced, e.g. when
using --in-memory flag on two processes. To prevent possible conflict
issues, we lock the dev node when it's in use, preventing other DPDK
processes from starting up and causing problems for us.
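
One way to get this kind of exclusion is an advisory lock on the opened
node. The sketch below assumes flock() is the mechanism and uses a
hypothetical helper name; the path is supplied by the caller and is not
tied to /dev/contigmem here.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Open 'path' and take an exclusive, non-blocking advisory lock on it.
 * Returns the locked fd, or -1 if the file cannot be opened or is
 * already locked through another open file description.
 */
int open_and_lock(const char *path)
{
    int fd = open(path, O_RDWR);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        close(fd);   /* someone else holds the lock: refuse to start */
        return -1;
    }
    return fd;       /* keep the fd open for as long as the lock is needed */
}
```

The lock is dropped when the fd is closed, including at process exit,
so no explicit cleanup step is needed after a crash.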

Fixes: 764bf26873b9 ("add FreeBSD support")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
2 years agofib: promote API to stable
Vladimir Medvedkin [Mon, 6 Sep 2021 16:01:15 +0000 (17:01 +0100)]
fib: promote API to stable

The fib and fib6 APIs have been in since 19.11 and
should be marked as stable.

Signed-off-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
Acked-by: Conor Walsh <conor.walsh@intel.com>
2 years agorib: promote API to stable
Stephen Hemminger [Tue, 31 Aug 2021 21:49:38 +0000 (14:49 -0700)]
rib: promote API to stable

The rib and rib6 APIs have been in since 19.11 and
should be marked as stable.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
2 years agonet: promote string to ethernet to stable
Stephen Hemminger [Tue, 31 Aug 2021 20:07:18 +0000 (13:07 -0700)]
net: promote string to ethernet to stable

This function has been in since 19.11.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years agonet: promote make rarp packet function to stable
Xiao Wang [Wed, 8 Sep 2021 10:59:15 +0000 (18:59 +0800)]
net: promote make rarp packet function to stable

rte_net_make_rarp_packet was introduced in v18.02. There has been no
change in this public API since then, and it is still used by the vhost
lib and the virtio driver, so promote it to stable ABI.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
2 years agolog: promote some function to stable
Ivan Malov [Tue, 31 Aug 2021 18:14:35 +0000 (21:14 +0300)]
log: promote some function to stable

This one seems mature enough to be marked as stable.

Signed-off-by: Ivan Malov <ivan.malov@oktetlabs.ru>
Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agoeal: promote random generator with upper bound to stable
Mattias Rönnblom [Wed, 1 Sep 2021 07:29:12 +0000 (09:29 +0200)]
eal: promote random generator with upper bound to stable

Remove experimental tag from rte_rand_max().

Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agousertools: silence prompts for telemetry input pipe
Bruce Richardson [Mon, 13 Sep 2021 10:51:37 +0000 (11:51 +0100)]
usertools: silence prompts for telemetry input pipe

When the input to the script is coming from a device which is not a TTY
then we become less verbose and skip the prompts and helpful messages
about what is happening.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2 years agousertools: fix handling EOF for telemetry input pipe
Bruce Richardson [Mon, 13 Sep 2021 10:51:36 +0000 (11:51 +0100)]
usertools: fix handling EOF for telemetry input pipe

To allow the script to take queries from input pipes e.g. "echo
/ethdev/stats,0 | dpdk-telemetry.py", we need to handle the case of EOF
correctly without crashing with an exception. Do this by using a
try-except block around the input handling.

Fixes: 6a2967c112a3 ("usertools: add new telemetry script")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2 years agousertools: fix flake8 compliance of telemetry script
Bruce Richardson [Mon, 13 Sep 2021 10:51:35 +0000 (11:51 +0100)]
usertools: fix flake8 compliance of telemetry script

Fix style errors reported by flake8.

Fixes: 6a2967c112a3 ("usertools: add new telemetry script")
Fixes: 2d9a697e41ca ("usertools: add file-prefix option for telemetry")
Cc: stable@dpdk.org
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Kevin Laatz <kevin.laatz@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
2 years agotelemetry: promote API to stable
Bruce Richardson [Wed, 15 Sep 2021 16:55:35 +0000 (17:55 +0100)]
telemetry: promote API to stable

The telemetry APIs have been present and unchanged for >1 year now,
so remove experimental tag from them.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ciara Power <ciara.power@intel.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agomempool/stack: build on Windows
Jie Zhou [Fri, 1 Oct 2021 00:50:12 +0000 (17:50 -0700)]
mempool/stack: build on Windows

Enable build of mempool/stack on Windows.

Signed-off-by: Jie Zhou <jizh@linux.microsoft.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2 years agoefd: allow more CPU sockets in table creation
Pablo de Lara [Tue, 28 Sep 2021 13:58:39 +0000 (13:58 +0000)]
efd: allow more CPU sockets in table creation

rte_efd_create() was using uint8_t for the socket bitmask,
one of its parameters.
This limits the maximum number of NUMA sockets to 8.
Changing it to uint64_t raises the limit to 64, which should be
more future-proof.
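
The effect of the bitmask width can be seen in a small sketch. The two
helpers are hypothetical and only illustrate the truncation; they are
not part of the efd API.

```c
#include <stdint.h>

/* One bit per socket in an 8-bit mask: socket ids >= 8 are lost. */
uint8_t socket_mask_u8(unsigned int socket_id)
{
    return (uint8_t)(1ULL << socket_id);
}

/* One bit per socket in a 64-bit mask: valid for socket ids 0..63. */
uint64_t socket_mask_u64(unsigned int socket_id)
{
    return 1ULL << socket_id;
}
```

For socket 9 the 8-bit helper yields an empty mask, while the 64-bit
helper keeps bit 9 set.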

Coverity issue: 366390
Fixes: 56b6ef874f8 ("efd: new Elastic Flow Distributor library")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yipeng Wang <yipeng1.wang@intel.com>
Tested-by: David Christensen <drc@linux.vnet.ibm.com>
2 years agobitrate: promote free function to stable
Kevin Traynor [Fri, 9 Jul 2021 15:19:38 +0000 (16:19 +0100)]
bitrate: promote free function to stable

rte_stats_bitrate_free() has been in DPDK since 20.11.

Its signature is very basic as it just frees an opaque
data struct allocated in rte_stats_bitrate_create()
and returns void.

It's unlikely that such a basic signature would need to change
so might as well promote it to stable for the next major ABI.

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2 years agobitrate: fix calculation to match API description
Kevin Traynor [Fri, 9 Jul 2021 15:19:37 +0000 (16:19 +0100)]
bitrate: fix calculation to match API description

rte_stats_bitrate_calc() API states it returns 'Negative value on error'.

However, the implementation will return the error code from
rte_eth_stats_get() which may be non-zero on error.

Change the implementation of rte_stats_bitrate_calc() to match
the API description by always returning a negative value on error.
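
A minimal sketch of such a normalization, assuming the underlying call
may report an error as either a positive or a negative non-zero value.
The helper name is hypothetical.

```c
/* Map any non-zero status to a negative value; 0 (success) stays 0. */
int to_negative_on_error(int status)
{
    if (status > 0)
        return -status;
    return status;
}
```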

Fixes: 2ad7ba9a6567 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2 years agobitrate: fix registration to match API description
Kevin Traynor [Fri, 9 Jul 2021 15:19:36 +0000 (16:19 +0100)]
bitrate: fix registration to match API description

rte_stats_bitrate_reg() API states it returns 'Zero on success'.

However, the implementation directly returns the return of
rte_metrics_reg_names() which may be zero or positive on success,
with a positive value also indicating the index.

The user of rte_stats_bitrate_reg() should not care about the
index as it is stored in the opaque rte_stats_bitrates struct.

Change the implementation of rte_stats_bitrate_reg() to match
the API description by always returning zero on success.
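
The idea can be sketched as follows, assuming the inner registration
call returns an index (zero or positive) on success and a negative
value on failure. The helper name is hypothetical.

```c
/* Collapse "index >= 0" into plain success; pass errors through. */
int reg_result_to_status(int reg_ret)
{
    return reg_ret < 0 ? reg_ret : 0;
}
```

A caller can then test the result with a simple "!= 0" check without
mistaking a positive index for a failure.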

Fixes: 2ad7ba9a6567 ("bitrate: add bitrate statistics library")

Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
2 years agotelemetry: detach threads
Stephen Hemminger [Thu, 19 Aug 2021 02:38:19 +0000 (19:38 -0700)]
telemetry: detach threads

There are a number of telemetry threads which are created, and
nothing calls pthread_join() to wait for them.
Mark these threads as detached, so that the pthread library
can clean up state when the thread exits.
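
A minimal sketch of the pattern, using plain pthread calls. The worker
and helper names are hypothetical and the thread body is empty; the
point is only that a thread nobody joins gets detached right after
creation.

```c
#include <pthread.h>
#include <stddef.h>

/* Placeholder thread body; a real handler would serve a client here. */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Start a thread that will never be joined. Returns 0 on success. */
int start_detached_worker(void)
{
    pthread_t t;
    int ret = pthread_create(&t, NULL, worker, NULL);

    if (ret != 0)
        return ret;
    /* Without this, the thread's resources linger until a join. */
    return pthread_detach(t);
}
```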

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Ciara Power <ciara.power@intel.com>
2 years agoring: fix Doxygen comment of internal function
Cian Ferriter [Mon, 23 Aug 2021 17:28:44 +0000 (18:28 +0100)]
ring: fix Doxygen comment of internal function

Change "enqueue" to "dequeue" because the __rte_ring_move_cons_head()
function is updating the consumer head for dequeue.

Fixes: 0dfc98c507b1 ("ring: separate out head index manipulation")
Cc: stable@dpdk.org
Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Acked-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
2 years agoeal: remove sys/queue.h from public headers
William Tu [Tue, 24 Aug 2021 16:21:03 +0000 (16:21 +0000)]
eal: remove sys/queue.h from public headers

Currently there are some public headers that include 'sys/queue.h', which
is not POSIX, but usually provided by the Linux/BSD system library.
(Not in POSIX.1, POSIX.1-2001, or POSIX.1-2008. Present on the BSDs.)
The file is missing on Windows. During the Windows build, DPDK uses a
bundled copy, so building a DPDK library works fine.  But when OVS or other
applications use DPDK as a library, because some DPDK public headers
include 'sys/queue.h', on Windows, it triggers an error due to no such
file.

One solution is to install the 'lib/eal/windows/include/sys/queue.h' into
Windows environment, such as [1]. However, this means DPDK exports the
functionalities of 'sys/queue.h' into the environment, which might cause
symbols, macros, headers clashing with other applications.

The patch fixes it by removing the "#include <sys/queue.h>" from
DPDK public headers, so programs including DPDK headers don't depend
on the system to provide 'sys/queue.h'. When these public headers use
macros such as TAILQ_xxx, we replace it by the ones with RTE_ prefix.
For Windows, we copy the definitions from <sys/queue.h> to rte_os.h
in Windows EAL. Note that these RTE_ macros are compatible with
<sys/queue.h>, both at the level of API (to use with <sys/queue.h>
macros in C files) and ABI (to avoid breaking it).

Additionally, the TAILQ_FOREACH_SAFE is not part of <sys/queue.h>,
the patch replaces it with RTE_TAILQ_FOREACH_SAFE.

[1] http://mails.dpdk.org/archives/dev/2021-August/216304.html
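
To illustrate why a "safe" iteration macro is needed, here is a sketch
built on the <sys/queue.h> primitives. The macro name
SKETCH_TAILQ_FOREACH_SAFE and the node type are hypothetical; the macro
saves the next element before the loop body runs, so the body may
remove and free the current one.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical "safe" foreach: 'tvar' holds the next element in advance. */
#define SKETCH_TAILQ_FOREACH_SAFE(var, head, field, tvar)           \
    for ((var) = TAILQ_FIRST(head);                                 \
         (var) != NULL && ((tvar) = TAILQ_NEXT(var, field), 1);     \
         (var) = (tvar))

struct node {
    int value;
    TAILQ_ENTRY(node) next;
};
TAILQ_HEAD(node_list, node);

/*
 * Build a list with values 0..count-1, free every even-valued node
 * while iterating, then count and free what is left.
 */
int remove_even_nodes(int count)
{
    struct node_list list = TAILQ_HEAD_INITIALIZER(list);
    struct node *n, *tmp;
    int left = 0;

    for (int i = 0; i < count; i++) {
        n = malloc(sizeof(*n));
        if (n == NULL)
            break;
        n->value = i;
        TAILQ_INSERT_TAIL(&list, n, next);
    }
    SKETCH_TAILQ_FOREACH_SAFE(n, &list, next, tmp) {
        if (n->value % 2 == 0) {
            TAILQ_REMOVE(&list, n, next);
            free(n);
        }
    }
    SKETCH_TAILQ_FOREACH_SAFE(n, &list, next, tmp) {
        left++;
        TAILQ_REMOVE(&list, n, next);
        free(n);
    }
    return left;
}
```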

Suggested-by: Nick Connolly <nick.connolly@mayadata.io>
Suggested-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: Narcisa Vasile <navasile@linux.microsoft.com>
2 years agolib: remove sched.h from public headers
Dmitry Kozlyuk [Sat, 28 Aug 2021 22:13:45 +0000 (01:13 +0300)]
lib: remove sched.h from public headers

Public headers including POSIX-specific <sched.h> were unusable
on Windows. These includes were superfluous, remove them.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
2 years agoeal/windows: fix export list
Dmitry Kozlyuk [Sun, 29 Aug 2021 02:16:02 +0000 (05:16 +0300)]
eal/windows: fix export list

* Version and randomness API were not added to .def file by mistake,
  which is why they were later excluded from the export list.
* Device API stubs were added to EAL but not exported.

Fixes: edd66d57d55c ("eal/windows: add random function")
Fixes: 3d2fcb0e0aec ("eal/windows: add device event stubs")
Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2 years agoeal: remove Windows-specific list of common files
Dmitry Kozlyuk [Sun, 29 Aug 2021 02:16:01 +0000 (05:16 +0300)]
eal: remove Windows-specific list of common files

The majority of common EAL sources that are built for all platforms were
listed separately for Windows and for other OS. It seems that developers
adding modules to EAL perceived this as if Windows supported
only a limited subset of modules, and added new ones only to the other list.
Factor the truly common modules into a shared list,
then extend it with modules supported by different platforms.

When the two lists were created, UUID API implementation was removed
from Windows build (apparently by mistake), then excluded from the
export list for no reason other than not being built. Restore it.

Fixes: df3ff6be2b33 ("eal: simplify meson build of common directory")

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2 years agoeal/windows: export version function
William Tu [Thu, 5 Aug 2021 17:48:19 +0000 (17:48 +0000)]
eal/windows: export version function

When OVS inits, it calls rte_version to get the DPDK's version.
The patch fixes the error below by exposing rte_version symbol.
libopenvswitch.a(dpdk.c.obj) : error LNK2019: unresolved external symbol
rte_version referenced in function dpdk_init

Fixes: 5b637a848195 ("eal: fix querying DPDK version at runtime")
Cc: stable@dpdk.org
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
2 years agonet/iavf: build on Windows
Pallavi Kadam [Thu, 9 Sep 2021 23:23:19 +0000 (16:23 -0700)]
net/iavf: build on Windows

- Enable IAVF PMD build on Windows
- Replace x86intrin.h with rte_vect.h to avoid __m_prefetchw conflicting
  types
- Fix for pointer and integer sign warnings using Clang compiler on
  Windows
- Add extra cflags '-fno-asynchronous-unwind-tables'
  to avoid MinGW build error:
  Error: invalid register for .seh_savexmm

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Shivanshu Shukla <shivanshu.shukla@intel.com>
2 years agonet: enable random address on Windows
Pallavi Kadam [Thu, 9 Sep 2021 23:23:20 +0000 (16:23 -0700)]
net: enable random address on Windows

IAVF PMD needs to generate a random MAC address if it is not configured
by host.
'random' is now supported on Windows.

Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Reviewed-by: Ranjit Menon <ranjit.menon@intel.com>
Acked-by: Shivanshu Shukla <shivanshu.shukla@intel.com>
2 years agobus/pci: fix unknown NUMA node value on Windows
Pallavi Kadam [Mon, 27 Sep 2021 18:43:22 +0000 (11:43 -0700)]
bus/pci: fix unknown NUMA node value on Windows

Based on the rte_eth_dev_socket_id() documentation,
set the default numa_node to -1. When the API is unsuccessful,
set numa_node to 0.
This change more correctly resembles the Linux code.

Fixes: bf7cf1f947bd ("bus/pci: fix unknown NUMA node value on Windows")
Cc: stable@dpdk.org
Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Pallavi Kadam <pallavi.kadam@intel.com>
Acked-by: Ranjit Menon <ranjit.menon@intel.com>
2 years agokvargs: fix comments style
Olivier Matz [Wed, 29 Sep 2021 21:39:43 +0000 (23:39 +0200)]
kvargs: fix comments style

A '*' is missing at 2 places, add them.

Fixes: e1a00536c8ed ("kvargs: add a new library to parse key/value arguments")
Fixes: 3ab385063cb9 ("kvargs: add get by key")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agokvargs: add function to get from key and value
Olivier Matz [Wed, 29 Sep 2021 21:39:41 +0000 (23:39 +0200)]
kvargs: add function to get from key and value

A quite common scenario with kvargs is to look up a <key>=<value> in
a kvlist. For instance, check if name=foo is present in
name=toto,name=foo,name=bar. This is currently done in drivers/bus with
rte_kvargs_process() + the rte_kvargs_strcmp() handler.

This approach is not straightforward, and can be replaced by this new
function.

rte_kvargs_strcmp() is then removed.
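
The lookup itself can be sketched on a raw comma-separated string. The
helper below is a hypothetical, self-contained illustration and does
not use the kvargs data structures or API.

```c
#include <string.h>

/* Return 1 if "key=value" is an entry of the comma-separated list. */
int kv_list_has(const char *list, const char *key, const char *value)
{
    size_t klen = strlen(key);
    size_t vlen = strlen(value);
    const char *p = list;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */

        if (len == klen + 1 + vlen &&
            strncmp(p, key, klen) == 0 &&
            p[klen] == '=' &&
            strncmp(p + klen + 1, value, vlen) == 0)
            return 1;
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

With the example from above, looking for name=foo in
name=toto,name=foo,name=bar succeeds, while name=fo does not match.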

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agokvargs: promote get from key as stable
Olivier Matz [Wed, 29 Sep 2021 21:39:40 +0000 (23:39 +0200)]
kvargs: promote get from key as stable

The function rte_kvargs_get() is used by eal and pci bus driver since
its introduction in commit 3ab385063cb9 ("kvargs: add get by key") and
commit d2a66ad79480 ("bus: add device arguments name parsing"), in
dpdk 21.05.

Let's promote it as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agokvargs: promote delimited parsing as stable
Olivier Matz [Wed, 29 Sep 2021 21:39:39 +0000 (23:39 +0200)]
kvargs: promote delimited parsing as stable

This function is used by EAL to parse key/value strings separated with
specified delimiters.

It was introduced in 2018 by commit 5d6af85ab00c ("kvargs: introduce a
more flexible parsing function"), and can be promoted as stable.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Xueming Li <xuemingl@nvidia.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Ray Kinsella <mdr@ashroe.eu>
2 years agonet/af_packet: remove timestamp from packet status
Tudor Cornea [Thu, 23 Sep 2021 18:13:24 +0000 (21:13 +0300)]
net/af_packet: remove timestamp from packet status

We should eliminate the timestamp status from the packet
status. This should only matter if timestamping is enabled
on the socket, but we might hit a kernel bug, which is fixed
in newer releases.

For interfaces of type 'veth', the sent skb is forwarded
to the peer and back into the network stack which timestamps
it on the RX path if timestamping is enabled globally
(which happens if any socket enables timestamping).

When the skb is destructed, tpacket_destruct_skb() is called
and it calls __packet_set_timestamp() which doesn't check
the flags on the socket and returns the timestamp if it is
set in the skb (and for veth it is, as mentioned above).

See the following kernel commit for reference [1]:

net: packetmmap: fix only tx timestamp on request

The packetmmap tx ring should only return timestamps if requested
via setsockopt PACKET_TIMESTAMP, as documented. This allows
compatibility with non-timestamp aware user-space code which checks
tp_status == TP_STATUS_AVAILABLE; not expecting additional timestamp
flags to be set in tp_status.

[1] https://www.spinics.net/lists/kernel/msg3959391.html
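
A sketch of a status check that tolerates stray timestamp flags. The
SKETCH_ constants mirror the values of the corresponding TP_STATUS_*
flags in linux/if_packet.h and are redefined locally only to keep the
example self-contained; the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the Tx ring status values from linux/if_packet.h. */
#define SKETCH_TP_STATUS_AVAILABLE        0u
#define SKETCH_TP_STATUS_SEND_REQUEST     (1u << 0)
#define SKETCH_TP_STATUS_TS_SOFTWARE      (1u << 29)
#define SKETCH_TP_STATUS_TS_RAW_HARDWARE  (1u << 31)

/* A Tx slot is free if nothing but timestamp flags is set. */
bool tx_slot_available(uint32_t tp_status)
{
    uint32_t ts_bits = SKETCH_TP_STATUS_TS_SOFTWARE |
                       SKETCH_TP_STATUS_TS_RAW_HARDWARE;

    return (tp_status & ~ts_bits) == SKETCH_TP_STATUS_AVAILABLE;
}
```

A plain comparison against TP_STATUS_AVAILABLE would reject a slot
whose status carries only a timestamp flag; the masked comparison
accepts it.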

Signed-off-by: Mihai Pogonaru <pogonarumihai@gmail.com>
Signed-off-by: Tudor Cornea <tudor.cornea@gmail.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
2 years agoethdev: use extension header for GTP PSC item
Raslan Darawsheh [Mon, 23 Aug 2021 10:55:39 +0000 (13:55 +0300)]
ethdev: use extension header for GTP PSC item

This updates the gtp_psc flow item to use the net header
definition of gtp_psc, which is based on RFC 38415-g30.

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2 years agonet: add extension header for GTP PSC
Raslan Darawsheh [Mon, 23 Aug 2021 10:55:38 +0000 (13:55 +0300)]
net: add extension header for GTP PSC

Define a new rte header for the GTP PDU session container,
based on RFC 38415-g30.

Signed-off-by: Raslan Darawsheh <rasland@nvidia.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
2 years agonet/memif: fix chained mbuf determination
Junxiao Shi [Thu, 9 Sep 2021 14:42:06 +0000 (14:42 +0000)]
net/memif: fix chained mbuf determination

Previously, TX functions called rte_pktmbuf_is_contiguous to determine
whether an mbuf is chained. However, rte_pktmbuf_is_contiguous is
designed to work on the first mbuf of a packet only. In case a packet
contains three or more segment mbufs in a chain, it may cause truncated
packets or rte_mbuf_sanity_check panics.

This patch updates TX functions to determine chained mbufs using
mbuf_head->nb_segs field, which works in all cases. Moreover, it
ensures that the second cacheline is only accessed when a chained mbuf
is actually present.
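
The difference can be sketched with a stripped-down segment type.
struct seg and both helpers are hypothetical; the sketch assumes that
only the head segment carries the real segment count and that
follow-up segments hold nb_segs == 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal stand-in for a chained packet buffer segment. */
struct seg {
    struct seg *next;
    uint16_t nb_segs;    /* meaningful on the head segment only */
    uint16_t data_len;
};

/* Old approach: ask every segment whether it is "contiguous". */
uint32_t len_by_contiguous_check(const struct seg *m)
{
    uint32_t total = 0;

    for (;;) {
        total += m->data_len;
        if (m->nb_segs == 1 || m->next == NULL)
            break;       /* stops early on any non-head segment */
        m = m->next;
    }
    return total;
}

/* New approach: the head's segment count drives the walk. */
uint32_t len_by_head_count(const struct seg *head)
{
    uint32_t total = 0;
    uint16_t left = head->nb_segs;

    for (const struct seg *m = head; left > 0 && m != NULL;
         m = m->next, left--)
        total += m->data_len;
    return total;
}
```

On a three-segment chain the first helper stops after the second
segment, which corresponds to the truncated packets described above.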

Fixes: 09c7e63a71f9 ("net/memif: introduce memory interface PMD")
Fixes: 43b815d88188 ("net/memif: support zero-copy slave")
Cc: stable@dpdk.org
Signed-off-by: Junxiao Shi <git@mail1.yoursunny.com>
Reviewed-by: Jakub Grajciar <jgrajcia@cisco.com>
2 years agoethdev: group constant definitions in Doxygen
Thomas Monjalon [Mon, 30 Aug 2021 10:42:32 +0000 (12:42 +0200)]
ethdev: group constant definitions in Doxygen

A lot of flags are parts of a group but are documented alone.
The Doxygen syntax @{ and @} for grouping is used
to make flags appear together and have a common description.

Some Rx/Tx offload flags and RSS definitions are not grouped
because they need to be all properly documented first.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
2 years agonet/mlx5: fix shared RSS destruction
Dmitry Kozlyuk [Wed, 1 Sep 2021 08:07:52 +0000 (11:07 +0300)]
net/mlx5: fix shared RSS destruction

Shared RSS resources were released before checking that the shared RSS
has no more references. If it had, the destruction was aborted, leaving
the shared RSS in an invalid state where it could no longer be used.
Move reference counter check before resource release.
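
The ordering problem can be sketched with a hypothetical shared object;
the struct and function names are illustrative and not taken from the
driver.

```c
#include <stdbool.h>

/* Hypothetical shared object: a reference count plus owned resources. */
struct shared_obj {
    unsigned int refcnt;      /* flows still using the object */
    bool resources_valid;     /* true while the resources are allocated */
};

/* Buggy order: release first, then notice the object is still in use. */
int destroy_release_first(struct shared_obj *obj)
{
    obj->resources_valid = false;
    if (obj->refcnt > 0)
        return -1;            /* aborted, but the object is already broken */
    return 0;
}

/* Fixed order: check the references before touching anything. */
int destroy_check_first(struct shared_obj *obj)
{
    if (obj->refcnt > 0)
        return -1;            /* still referenced, left fully usable */
    obj->resources_valid = false;
    return 0;
}
```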

Fixes: d2046c09aa64 ("net/mlx5: support shared action for RSS")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agonet/mlx5: fix flow indirect action reference counting
Dmitry Kozlyuk [Wed, 1 Sep 2021 08:19:58 +0000 (11:19 +0300)]
net/mlx5: fix flow indirect action reference counting

When an indirect action is used in a flow rule with a pattern that
causes RSS expansion, each device flow generated by the expansion
incremented the reference counter of the action. When such a flow was
destroyed, its action reference counter was decremented only once.
The action remained marked as being used and could not be destroyed.
COUNT, AGE, and CONNTRACK indirect actions were affected
(for AGE the error was not immediately observable).
Increment the action counter only once for the original flow rule.

Fixes: 81073e1f8ce1 ("net/mlx5: support shared age action")
Fixes: 2d084f69aa26 ("net/mlx5: add translation of connection tracking action")
Fixes: f3191849f2c2 ("net/mlx5: support flow count action handle")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agonet/mlx5: report error on indirect CT action destroy
Dmitry Kozlyuk [Wed, 1 Sep 2021 08:19:57 +0000 (11:19 +0300)]
net/mlx5: report error on indirect CT action destroy

When an indirect CT action of mlx5 PMD could not be destroyed,
rte_action_handle_destroy() was returning (-1), but the error
structure was not filled. This led to a segfault in testpmd
on an attempt to print it. Fill the details for each possible
cause of this error.

Fixes: c5a49265fc23 ("net/mlx5: add ASO connection tracking destroy")
Cc: stable@dpdk.org
Signed-off-by: Dmitry Kozlyuk <dkozlyuk@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
2 years agocommon/mlx5: fix resource cleaning in device removal
Michael Baum [Sun, 12 Sep 2021 10:36:28 +0000 (13:36 +0300)]
common/mlx5: fix resource cleaning in device removal

The common remove function calls, in a loop, the remove function of
each registered driver.

If all removes succeeded, it returned 0 without freeing the device
allocated in the probe function. Otherwise, it freed the device.
In fact, exactly the opposite behavior is expected. If all removes
fail, it should return an error without freeing the device allocated
in the probe function. Otherwise, it should free the device and
return 0.

Replace it with the correct behavior.

Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class drivers")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
2 years agocommon/mlx5: fix device list operations concurrency
Michael Baum [Sun, 12 Sep 2021 10:36:27 +0000 (13:36 +0300)]
common/mlx5: fix device list operations concurrency

The mlx5 common driver has a global list of mlx5 devices which are
probed.

The probe function creates a device and inserts it into the list.
Similarly, the remove function removes the device from the list.
These operations are not safe, as they can be performed in
parallel by different threads.

Add a global lock for the list and use it for insertion and removal.
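
A minimal sketch of a list guarded by one global lock, assuming a
pthread mutex and a hand-rolled singly linked list. The type and
function names are hypothetical and do not reflect the driver's actual
structures.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical device entry in a global singly linked list. */
struct dev {
    struct dev *next;
    int id;
};

static struct dev *dev_list;
static pthread_mutex_t dev_list_lock = PTHREAD_MUTEX_INITIALIZER;

/* Insert at the head of the list while holding the global lock. */
void dev_list_insert(struct dev *d)
{
    pthread_mutex_lock(&dev_list_lock);
    d->next = dev_list;
    dev_list = d;
    pthread_mutex_unlock(&dev_list_lock);
}

/* Unlink 'd' while holding the lock. Returns 1 if it was on the list. */
int dev_list_remove(struct dev *d)
{
    int found = 0;

    pthread_mutex_lock(&dev_list_lock);
    for (struct dev **pp = &dev_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == d) {
            *pp = d->next;
            d->next = NULL;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&dev_list_lock);
    return found;
}
```

Every access to the list goes through the same mutex, so a probe and a
remove running on different threads cannot interleave their pointer
updates.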

Fixes: 8a41f4deccc3 ("common/mlx5: introduce layer for multiple class drivers")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>