dpdk.git
4 years agonet/bnxt: get VF representor action record
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:14 +0000 (18:20 +0530)]
net/bnxt: get VF representor action record

Added flow db api to get the vf representor action
record for a given flow.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: support VXLAN encap/decap templates
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:13 +0000 (18:20 +0530)]
net/bnxt: support VXLAN encap/decap templates

Two templates are added to ulp template db, an ingress rule
for vxlan decap and an egress rule for vxlan encap.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: set maximum flow count
Shuanglin Wang [Fri, 12 Jun 2020 12:50:12 +0000 (18:20 +0530)]
net/bnxt: set maximum flow count

User could set max flow count by passing a devarg
"-w 0000:0d:00.0,max_num_kflows=64" to a DPDK application;
The value must be not less than 32K and be power-of-2;
the default value is 32K.

Signed-off-by: Shuanglin Wang <shuanglin.wang@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: refactor and rename some fields and enums
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:08 +0000 (18:20 +0530)]
net/bnxt: refactor and rename some fields and enums

- rename regfile_wr_idx to regfile_idx
  The regfile index shall be used for both write and read operations.
  Hence the field is renamed.
- remove the unused enum BNXT_ULP_REGFILE_INDEX_CACHE_ENTRY_PTR
- rename the enums in the bnxt_ulp_resource_sub_type
  The enums in the bnxt_ulp_resource_sub_type are renamed to reflect
  the table types explicitly.
- rename an enum in the regfile index
  The BNXT_ULP_REGFILE_INDEX_ACTION_PTR_MAIN is renamed to
  BNXT_ULP_REGFILE_INDEX_MAIN_ACTION_PTR since it is the main
  action pointer.
- remove cache_tbl_id enums
  The bnxt_ulp_cache_tbl_id enums are not required any longer
  since the index is now calculated using resource sub type
  and direction.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: convert critical resource to enum
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:07 +0000 (18:20 +0530)]
net/bnxt: convert critical resource to enum

The critical resource field in the template table is assigned
enumeration values instead of hard coded values.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: use vport in PHY port action handler
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:06 +0000 (18:20 +0530)]
net/bnxt: use vport in PHY port action handler

The phy port action handler should get vport details and not
vnic id. The fix is to calculate the vport of the given
port.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: process action templates
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:05 +0000 (18:20 +0530)]
net/bnxt: process action templates

Extend index table processing to process action templates.
The index table processing is extended to address encapsulation fields
so that action template index table can be processed by a common index
processing function that can process both class and action index
tables.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: support action bitmap opcode
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:02 +0000 (18:20 +0530)]
net/bnxt: support action bitmap opcode

This enables using the action bitmap to update the action result
fields in the flow creation instead of using computed header fields.
Direction bit needs to be added to the action bitmap during
flow parsing, so that egress flows can be matched to the
template signature.
An example would be the usage of the vlan pop action bitmap that is
used to updated action result field as part of this commit.
Also the ulp action bitmap enumeration values that
contain open flow string are renamed.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: remove cache table ID from mapper class table
Kishore Padmanabha [Fri, 12 Jun 2020 12:50:00 +0000 (18:20 +0530)]
net/bnxt: remove cache table ID from mapper class table

The cache table id is not needed anymore since the value can
be calculated from resource sub type and direction.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add resource subtype to class and action tables
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:59 +0000 (18:19 +0530)]
net/bnxt: add resource subtype to class and action tables

Added support for resource sub type to class and action tables
renamed table id to resource type.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: change default identifier to global resource
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:58 +0000 (18:19 +0530)]
net/bnxt: change default identifier to global resource

The default identifier list in ulp mapper is extended to support
other truflow resource types and not just identifiers.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: update compute field list and access macros
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:57 +0000 (18:19 +0530)]
net/bnxt: update compute field list and access macros

The compute field is extended to support action fields and not
just header fields, hence CHF is changed to CF. The access macro
for compute field is renamed to address this.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add computed header field in result opcode
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:56 +0000 (18:19 +0530)]
net/bnxt: add computed header field in result opcode

Added support for computed header fields in the result field
processing. The computed header fields are fields that are extracted
from header fields or derived from data that is not part of the flow
command but shall be used in setting up of the flow rule.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: support more resource functions in flow database
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:54 +0000 (18:19 +0530)]
net/bnxt: support more resource functions in flow database

Added support to include more resource functions in the flow
database. The number of bits increased from 3 to 8 for storing
the resource function.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: remove fields from bitmap and mapper table
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:50 +0000 (18:19 +0530)]
net/bnxt: remove fields from bitmap and mapper table

Remove unnecessary fields from bitmap and mapper table.

- remove svif and VLAN info from header bitmap
The svif and vlan information are removed from header bitmap
signature so that the matching algorithm does not use these
fields to perform matching. So flows with or without vlan
tag could use the same flow template.
- remove mem field from mapper class table
Remove the unused mem field in the ulp mapper class table structure

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: distribute makefile to subdirectories
Kishore Padmanabha [Fri, 12 Jun 2020 12:49:49 +0000 (18:19 +0530)]
net/bnxt: distribute makefile to subdirectories

Created sub Makefile for tf_ulp and and tf_core for easy management.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/octeontx2: fix DMAC filtering
Harman Kalra [Wed, 3 Jun 2020 14:52:13 +0000 (20:22 +0530)]
net/octeontx2: fix DMAC filtering

Issue has been observed where packets are getting dropped
at DMAC filtering if a new dmac address is added before
starting of port.

Fixes: c43adf61682f ("net/octeontx2: add unicast MAC filter")
Cc: stable@dpdk.org
Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Sunil Kumar Kori <skori@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
4 years agovhost: split vDPA header file
Maxime Coquelin [Fri, 26 Jun 2020 14:04:41 +0000 (16:04 +0200)]
vhost: split vDPA header file

This patch split the vDPA header file in two, making
rte_vdpa_device structure opaque to the application.

Applications should only include rte_vdpa.h, while drivers
should include both rte_vdpa.h and rte_vdpa_dev.h.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: remove vDPA device count API
Maxime Coquelin [Fri, 26 Jun 2020 14:04:40 +0000 (16:04 +0200)]
vhost: remove vDPA device count API

This API is no more useful, this patch removes it.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agoexamples/vdpa: remove useless device count
Maxime Coquelin [Fri, 26 Jun 2020 14:04:39 +0000 (16:04 +0200)]
examples/vdpa: remove useless device count

The VDPA example now uses the vDPA class iterator, so
knowing the number of available devices beforehand is
no longer needed.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agoexamples/vdpa: use new wrappers instead of ops
Maxime Coquelin [Fri, 26 Jun 2020 14:04:38 +0000 (16:04 +0200)]
examples/vdpa: use new wrappers instead of ops

Now that wrappers to query number of queues, Virtio
features and Vhost-user protocol features are available,
let's make the vDPA example to use them.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: introduce wrappers for some vDPA ops
Maxime Coquelin [Fri, 26 Jun 2020 14:04:37 +0000 (16:04 +0200)]
vhost: introduce wrappers for some vDPA ops

This patch is preliminary work to make the vDPA device
structure opaque to the user application. Some callbacks
of the vDPA devices are used to query capabilities before
attaching to a Vhost port. This patch introduces wrappers
for these ops.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: use linked list for vDPA devices
Maxime Coquelin [Fri, 26 Jun 2020 14:04:36 +0000 (16:04 +0200)]
vhost: use linked list for vDPA devices

There is no more notion of device ID outside of vdpa.c.
We can now move from array to linked-list model for keeping
track of the vDPA devices.

There is no point in using array here, as all vDPA API are
used from the control path, so no performance concerns.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: remove useless vDPA API
Maxime Coquelin [Fri, 26 Jun 2020 14:04:35 +0000 (16:04 +0200)]
vhost: remove useless vDPA API

vDPA is no more used outside of the vDPA internals,
so remove rte_vdpa_get_device() API that is now useless.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: replace device ID in applications
Maxime Coquelin [Fri, 26 Jun 2020 14:04:34 +0000 (16:04 +0200)]
vhost: replace device ID in applications

This patch replaces the use of vDPA device ID with
vDPA device pointer. The goals is to remove the vDPA
device ID to avoid confusion with the Vhost ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: replace vDPA device ID in Vhost
Maxime Coquelin [Fri, 26 Jun 2020 14:04:33 +0000 (16:04 +0200)]
vhost: replace vDPA device ID in Vhost

This removes the notion of device ID in Vhost library
as a preliminary step to get rid of the vDPA device ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: replace device ID in vDPA ops
Maxime Coquelin [Fri, 26 Jun 2020 14:04:32 +0000 (16:04 +0200)]
vhost: replace device ID in vDPA ops

This patch is a preliminary step to get rid of the
vDPA device ID. It makes vDPA callbacks to use the
vDPA device struct as a reference instead of the ID.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: make vDPA framework bus agnostic
Maxime Coquelin [Fri, 26 Jun 2020 14:04:31 +0000 (16:04 +0200)]
vhost: make vDPA framework bus agnostic

This patch makes the vDPA framework to no more
support only PCI devices, but any devices by relying
on the generic device name as identifier.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agovhost: introduce vDPA device class
Maxime Coquelin [Fri, 26 Jun 2020 14:04:30 +0000 (16:04 +0200)]
vhost: introduce vDPA device class

This patch introduces vDPA device class. It will enable
application to iterate over the vDPA devices.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agobus/fslmc: fix iterating on a class type
Maxime Coquelin [Fri, 26 Jun 2020 14:04:29 +0000 (16:04 +0200)]
bus/fslmc: fix iterating on a class type

This patches fixes a null pointer dereferencing that happens
when the device string passed to the iterator is NULL. This
situation can happen when iterating on a class type.
For example:

RTE_DEV_FOREACH(dev, "class=eth", &dev_iter) {
    ...
}

Fixes: e67a61614d0b ("bus/fslmc: support device iteration")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agobus/dpaa: fix iterating on a class type
Maxime Coquelin [Fri, 26 Jun 2020 14:04:28 +0000 (16:04 +0200)]
bus/dpaa: fix iterating on a class type

This patches fixes a null pointer dereferencing that happens
when the device string passed to the iterator is NULL. This
situation can happen when iterating on a class type.
For example:

RTE_DEV_FOREACH(dev, "class=eth", &dev_iter) {
    ...
}

Fixes: e79df833d3f6 ("bus/dpaa: support hotplug ops")
Cc: stable@dpdk.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Adrián Moreno <amorenoz@redhat.com>
4 years agonet/i40e: support aarch32
Ruifeng Wang [Wed, 24 Jun 2020 07:10:15 +0000 (15:10 +0800)]
net/i40e: support aarch32

Expand vector PMD support to aarch32.
Enable i40e PMD by default for armv7 make build.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/ixgbe: fix include of vector header file
Ruifeng Wang [Wed, 24 Jun 2020 07:10:14 +0000 (15:10 +0800)]
net/ixgbe: fix include of vector header file

The include of 'arm_neon.h' causes issues to old gcc and aarch32.
Including 'rte_vect.h' instead fixes these issues.

Fixes: b20971b6cca0 ("net/ixgbe: implement vector driver for ARM")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/ixgbe: support aarch32
Ruifeng Wang [Wed, 24 Jun 2020 07:10:13 +0000 (15:10 +0800)]
net/ixgbe: support aarch32

Expand vector PMD support to aarch32.
Enable ixgbe PMD by default for armv7 make build.

Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agoeal/arm: add vcopyq intrinsic for aarch32
Ruifeng Wang [Wed, 24 Jun 2020 07:10:12 +0000 (15:10 +0800)]
eal/arm: add vcopyq intrinsic for aarch32

vcopyq_laneq_u32 should be implemented for aarch32 which doesn't have
the intrinsic.
This fixes build of examples/l3fwd for armv7.

Fixes: 3c4b4024c225 ("arch/arm: add vcopyq_laneq_u32 for old gcc")
Cc: stable@dpdk.org
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/mvpp2: fix non-EAL thread support
David Marchand [Tue, 16 Jun 2020 09:47:00 +0000 (11:47 +0200)]
net/mvpp2: fix non-EAL thread support

Caught by code inspection, for a non-EAL thread identified with
rte_lcore_id() == LCORE_ID_ANY, the code currently arbitrarily uses
lcore 0 while there is no guarantee this lcore is used.

Fixes: 3588aaa68eab ("net/mrvl: fix HIF objects allocation")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Liron Himi <lironh@marvell.com>
4 years agonet/qede: fix multicast drop in promiscuous mode
Devendra Singh Rawat [Thu, 18 Jun 2020 08:15:55 +0000 (13:45 +0530)]
net/qede: fix multicast drop in promiscuous mode

After enabling promiscuous mode all packets whose destination MAC
address is a multicast address were being dropped. This fix configures
H/W to receive all traffic in promiscuous mode. Promiscuous mode also
overrides allmulticast mode on/off status.

Fixes: 40e9f6fc1558 ("net/qede: enable VF-VF traffic with unmatched dest address")
Cc: stable@dpdk.org
Signed-off-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
4 years agonet/octeontx2: support CN98xx
Harman Kalra [Wed, 24 Jun 2020 12:46:48 +0000 (18:16 +0530)]
net/octeontx2: support CN98xx

New cn98xx SOC comes up with two NIX blocks wrt
cn96xx, cn93xx, to achieve higher performance.
Also the no of cores increased to 36 from 24.

Adding support for cn98xx where need a logic to
detect if the LF is attached to NIX0 or NIX1 and
then accordingly use the respective NIX block.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
4 years agonet/mlx5: fix host physical function representor naming
Viacheslav Ovsiienko [Tue, 23 Jun 2020 07:48:34 +0000 (07:48 +0000)]
net/mlx5: fix host physical function representor naming

The new kernel adds the names like "pf0" for Host PCI physical
function representor on Bluefield SmartNIC hosts. This patch
provides correct HPF representor recognition over the kernel
versions 5.7 and laters.

The following port naming formats are supported:

  - missing physical port name (no sysfs/netlink key) at all,
    master is assumed

  - decimal digits (for example "12"), representor is
    assumed, the value is the index of attached VF

  - "p" followed by decimal digits, for example "p2", master
    is assumed

  - "pf" followed by PF index, for example "pf0", Host PF
     representor is assumed on SmartNIC systems.

  - "pf" followed by PF index concatenated with "vf" followed by
     VF index, for example "pf0vf1", representor is assumed.
     If index of VF is "-1" it is a special case of Host PF
     representor, this representor must be indexed in devargs
     as 65535, for example representor=[0-3,65535] will
     allow representors for VF0, VF1, VF2, VF3 and for host PF.

Fixes: 79aa430721b1 ("common/mlx5: split common file under Linux directory")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/ice: initialize and update RSS based on user config
Junyu Jiang [Wed, 24 Jun 2020 02:09:39 +0000 (02:09 +0000)]
net/ice: initialize and update RSS based on user config

Initialize and update RSS configure based on user request
(rte_eth_rss_conf) from dev_configure and .rss_hash_update ops.
All previous default configure has been removed.

Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agocommon/mlx5: move doorbell record from net driver
Ori Kam [Fri, 19 Jun 2020 07:30:09 +0000 (07:30 +0000)]
common/mlx5: move doorbell record from net driver

The creation of DBR can be used by a number of different
Mellanox PMDs. for example RegEx / Net / VDPA.

This commits moves the DBR creation and release functions to common
folder.

Signed-off-by: Ori Kam <orika@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agocommon/mlx5: move some getter functions from net driver
Ophir Munk [Fri, 19 Jun 2020 07:30:08 +0000 (07:30 +0000)]
common/mlx5: move some getter functions from net driver

Getter functions such as: 'mlx5_os_get_ctx_device_name',
'mlx5_os_get_ctx_device_path', 'mlx5_os_get_dev_device_name',
'mlx5_os_get_umem_id' are implemented under net directory. To enable
additional devices (e.g. regex, vdpa) to access these getter functions
they are moved under common directory.

As part of this commit string sizes DEV_SYSFS_NAME_MAX and
DEV_SYSFS_PATH_MAX are increased by 1 to make sure that the destination
string size in strncpy() function is bigger than the source string size.
This update will avoid GCC version 8 error -Werror=stringop-truncation.

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: optimize free counter lookup
Suanming Mou [Thu, 18 Jun 2020 08:12:50 +0000 (16:12 +0800)]
net/mlx5: optimize free counter lookup

Currently, when allocate a new counter, it needs loop the whole
container pool list to get a free counter.

In the case with millions of counters allocated, and all the pools
are empty, allocate the new counter will still need to loop the
whole container pool list first, then allocate a new pool to get a
free counter. It wastes the cycles during the pool list traversal.

Add a global free counter list in the container helps to get the free
counters more efficiently.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: optimize single counter pool search
Suanming Mou [Thu, 18 Jun 2020 07:24:44 +0000 (15:24 +0800)]
net/mlx5: optimize single counter pool search

For single counter, when allocate a new counter, it needs to find the pool
it belongs in order to do the query together.

Once there are millions of counters allocated, the pool array in the
counter container will become very large. In this case, the pool search
from the pool array will become extremely slow.

Save the minimum and maximum counter ID to have a quick check of current
counter ID range. And start searching the pool from the last pool in the
container will mostly get the needed pool since counter ID increases
sequentially.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: manage shared counters in three-level table
Suanming Mou [Thu, 18 Jun 2020 07:24:43 +0000 (15:24 +0800)]
net/mlx5: manage shared counters in three-level table

Currently, to check if any shared counter with same ID existing, it will
have to loop the counter pools to search for the counter. Even add the
counter to the list will also not so helpful while there are thousands
of shared counters in the list.

Change Three-Level table to look up the counter index saved in the
relevant table entry will be more efficient.

This patch introduces the Three-level table to save the ID relevant
counter index in the table. Then the next while the same ID comes, just
check the table entry of this ID will get the counter index directly.
No search will be needed.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: add three-level table utility
Suanming Mou [Thu, 18 Jun 2020 07:24:42 +0000 (15:24 +0800)]
net/mlx5: add three-level table utility

For the case which data is linked with sequence increased index, the
array table will be more efficient than hash table once need to search
one data entry in large numbers of entries. Since the traditional hash
tables has fixed table size, when huge numbers of data saved to the hash
table, it also comes lots of hash conflict.

But simple array table also has fixed size, allocates all the needed
memory at once will waste lots of memory. For the case don't know the
exactly number of entries will be impossible to allocate the array.

Then the multiple level table helps to balance the two disadvantages.
Allocate a global high level table with sub table entries at first,
the global table contains the sub table entries, and the sub table will
be allocated only once the corresponding index entry need to be saved.
e.g. for up to 32-bits index, three level table with 10-10-12 splitting,
with sequence increased index, the memory grows with every 4K entries.

The currently implementation introduces 10-10-12 32-bits splitting
Three-Level table to help the cases which have millions of entries to
save. The index entries can be addressed directly by the index, no
search will be needed.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/mlx5: remove redundant newline from logs
David Marchand [Wed, 17 Jun 2020 13:53:24 +0000 (15:53 +0200)]
net/mlx5: remove redundant newline from logs

The DRV_LOG macro already appends a newline.

Fixes: 46287eacc1b1 ("net/mlx5: introduce hash list")
Fixes: 860897d2895a ("net/mlx5: reorganize flow tables with hash list")
Fixes: e484e4032332 ("net/mlx5: optimize tag traversal with hash list")
Fixes: 6801116688fe ("net/mlx5: fix multiple flow table hash list")
Cc: stable@dpdk.org
Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agonet/sfc: reap Tx descriptors at least once
Andrew Rybchenko [Fri, 19 Jun 2020 10:25:23 +0000 (11:25 +0100)]
net/sfc: reap Tx descriptors at least once

Improve cache hit and increase packet rate on benchmarks.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
4 years agovdpa/mlx5: support MTU feature
Matan Azrad [Thu, 18 Jun 2020 19:06:03 +0000 (19:06 +0000)]
vdpa/mlx5: support MTU feature

The guest virtio device may request MTU updating when the vhost backend
device exposes a capability to support it.

Expose the MTU feature capability.

At configuration time, check the requested MTU and update it in the HW
device.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agocommon/mlx5: share kernel interface name getter
Matan Azrad [Thu, 18 Jun 2020 19:06:02 +0000 (19:06 +0000)]
common/mlx5: share kernel interface name getter

Some configuration of the mlx5 port are done by the kernel net device
associated to the IB device represents the PCI device.

The DPDK mlx5 driver uses Linux system calls, for example ioctl, in
order to configure per port configurations requested by the DPDK user.

One of the basic knowledges required to access the correct kernel net
device is its name.

Move function to get interface name from IB device path to the common
library.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: adjust virtio queue protection domain
Matan Azrad [Tue, 2 Jun 2020 15:51:44 +0000 (15:51 +0000)]
vdpa/mlx5: adjust virtio queue protection domain

In other to fill the new requirement for virtq
configuration, set the single PD managed by the driver for
all the virtqs.

Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agocommon/mlx5: add virtio queue protection domain
Matan Azrad [Tue, 2 Jun 2020 15:51:43 +0000 (15:51 +0000)]
common/mlx5: add virtio queue protection domain

Starting from FW version 22.27.4002, it is required to
configure protection domain (PD) for each virtq created by
DevX.

Add PD requirement in virtq DevX APIs.

Cc: stable@dpdk.org
Signed-off-by: Matan Azrad <matan@mellanox.com>
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agoexamples/vdpa: add statistics show command
Matan Azrad [Thu, 18 Jun 2020 18:59:44 +0000 (18:59 +0000)]
examples/vdpa: add statistics show command

A new vDPA driver feature was added to query the virtq
statistics from the HW.

Use this feature to show the HW queues statistics for the virtqs.

Command description: stats X Y.
X is the device ID.
Y is the queue ID, Y=0xffff to show all the virtio queues
statistics of the device X.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: support virtio queue statistics get
Matan Azrad [Thu, 18 Jun 2020 18:59:43 +0000 (18:59 +0000)]
vdpa/mlx5: support virtio queue statistics get

Add support for statistics operations.

A DevX counter object is allocated per virtq in order to
manage the virtq statistics.

The counter object is allocated before the virtq creation
and destroyed after it, so the statistics are valid only in
the life time of the virtq.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agocommon/mlx5: support DevX virtq stats operations
Matan Azrad [Thu, 18 Jun 2020 18:59:42 +0000 (18:59 +0000)]
common/mlx5: support DevX virtq stats operations

Add DevX API to create and query virtio queue statistics
from the HW. The next counters are supported by the HW per
virtio queue:
received_desc.
completed_desc.
error_cqes.
bad_desc_errors.
exceed_max_chain.
invalid_buffer.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovhost: introduce operation to get vDPA queue stats
Matan Azrad [Thu, 18 Jun 2020 18:59:41 +0000 (18:59 +0000)]
vhost: introduce operation to get vDPA queue stats

The vDPA device offloads all the datapath of the vhost
device to the HW device.

In order to expose to the user traffic information this
patch introduces new 3 APIs to get traffic statistics, the
device statistics name and to reset the statistics per
virtio queue.

The statistics are taken directly from the vDPA driver
managing the HW device and can be different for each vendor
driver.

Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovhost: enable reply-ack systematically
Maxime Coquelin [Thu, 28 May 2020 09:03:47 +0000 (11:03 +0200)]
vhost: enable reply-ack systematically

As announced during v20.05 release cycle, this
patch makes reply-ack protocol feature to be enabled
unconditionally.

This protocol feature makes the communication between the
master and the slave more robust, avoiding for example
possible undefined behaviour with VHOST_USER_SET_MEM_TABLE.

Also, reply-ack support will be required for upcoming
VHOST_USER_SET_STATUS request.

Note that this protocol feature was disabled by default
because Qemu version 2.7.0 to 2.9.0 had a bug causing a
deadlock when reply-ack was negotiated and multiqueue
enabled. These Qemu version are now very old and no more
maintained, so we can reasonably consider we no more
support them.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
4 years agonet/ice/base: replace RSS profile locks
Qi Zhang [Thu, 11 Jun 2020 08:43:30 +0000 (16:43 +0800)]
net/ice/base: replace RSS profile locks

Replacing flow profile locks with RSS profile locks in the function to
remove all RSS rules for a given VSI. This is to align the locks used
for RSS rule addition to VSI and removal during VSI teardown to avoid
a race condition owing to several iterations of the above operations.
In function to get RSS rules for given VSI and protocol header replacing
the pointer reference of the RSS entry with a copy of hash value to
ensure thread safety.

Signed-off-by: Vignesh Sridhar <vignesh.sridhar@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: fix VSI ID mask to 10 bits
Qi Zhang [Thu, 11 Jun 2020 08:43:29 +0000 (16:43 +0800)]
net/ice/base: fix VSI ID mask to 10 bits

set_rss_lut failed due to incorrect vsi_id mask. vsi_id is 10 bit
but mask was 0x1FF whereas it should be 0x3FF.

For vsi_num >= 512, FW set_rss_lut has been failing with return code
EACCESS (vsi ownership issue) because software was providing
incorrect vsi_num (dropping 10th bit due to incorrect mask) for
set_rss_lut admin command

Fixes: a90fae1d0755 ("net/ice/base: add admin queue structures and commands")
Cc: stable@dpdk.org
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: choose TCP dummy packet by protocol
Qi Zhang [Thu, 11 Jun 2020 08:43:28 +0000 (16:43 +0800)]
net/ice/base: choose TCP dummy packet by protocol

In order to find proper dummy packets for switch filter,
it need to check ipv4 next protocol number, if it is 0x06,
which means next payload is TCP, we need to use TCP
format dummy packet.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: get tunnel type for recipe
Qi Zhang [Thu, 11 Jun 2020 08:43:27 +0000 (16:43 +0800)]
net/ice/base: get tunnel type for recipe

This patch add support to get tunnel type of recipe
after get recipe from FW. This will fix the issue in
function ice_find_recp() for tunnel type comparing.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: support flow director for GTPU with outer IPv6
Qi Zhang [Thu, 11 Jun 2020 08:43:26 +0000 (16:43 +0800)]
net/ice/base: support flow director for GTPU with outer IPv6

Add FDIR support for MAC_IPV6_GTPU type with outer IPv6 address, teid
and qfi fields matching.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: rename misleading variable
Qi Zhang [Thu, 11 Jun 2020 08:43:25 +0000 (16:43 +0800)]
net/ice/base: rename misleading variable

The grst_delay variable in ice_check_reset contains the maximum time
(in 100 msec units) that the driver will wait for a reset event to
transition to the Device Active state. The value is the sum of three
separate components:
1) The maximum time it may take for the firmware to process its
outstanding command before handling the reset request.
2) The value in RSTCTL.GRSTDEL (the delay firmware inserts between first
seeing the driver reset request and the actual hardware assertion).
3) The maximum expected reset processing time in hardware.

Referring to this total time as "grst_delay" is misleading and
potentially confusing to someone checking the code and cross-referencing
the hardware specification.

Fix this by renaming the variable to "grst_timeout", which is more
descriptive of its actual use.

Signed-off-by: Nick Nunley <nicholas.d.nunley@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: add commands for system diagnostic
Qi Zhang [Thu, 11 Jun 2020 08:43:24 +0000 (16:43 +0800)]
net/ice/base: add commands for system diagnostic

System diagnostic solution extend the ability to fetch FW
internal status data and error indication.

Signed-off-by: Sharon Haroni <sharon.haroni@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: support flow director for outer IP of GTPU
Qi Zhang [Thu, 11 Jun 2020 08:43:23 +0000 (16:43 +0800)]
net/ice/base: support flow director for outer IP of GTPU

Add outer IP address fields while generating the training packets for
GTPU, so that we can support FDIR based on outer IP of GTPU.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: refactor to avoid need to retry
Qi Zhang [Thu, 11 Jun 2020 08:43:22 +0000 (16:43 +0800)]
net/ice/base: refactor to avoid need to retry

The ice_discover_caps function is used to read the device and function
capabilities, updating the hardware capabilities structures with
relevant data.

The exact number of capabilities returned by the hardware is unknown
ahead of time. The AdminQ command will report the total number of
capabilities in the return buffer.

The current implementation involves requesting capabilities once,
reading this returned size, and then re-requested with that size.

This isn't really necessary. The firmware interface has a maximum size
of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than
ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities.

Avoid the retry loop by simply allocating a buffer of size
ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The
extra allocation isn't a big deal, as it will be released after we
finish parsing the capabilities.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice/base: adjust profile ID map locks
Qi Zhang [Thu, 11 Jun 2020 08:43:21 +0000 (16:43 +0800)]
net/ice/base: adjust profile ID map locks

The profile id map lock should be held till the caller completes
all references of that profile entries.

The current code releases the lock right after the match search.
This caused a driver issue when the profile map entries were
referenced after it was freed in other thread after the lock was
released earlier.

Also return type of get/set profile functions were changed to
return the ice status instead of the profile entry pointer.
This will prevent the caller referencing the profile fields
outside the lock.

Signed-off-by: Victor Raj <victor.raj@intel.com>
Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agobuild: replace meson OS detection with variable
Thomas Monjalon [Mon, 29 Jun 2020 20:31:19 +0000 (22:31 +0200)]
build: replace meson OS detection with variable

Some places were calling the meson function host_machine.system()
instead of the variables is_windows and is_linux defined
in config/meson.build.

At the same time, the missing "Linux restriction" reason is added to
pfe and octeontx2 crypto PMDs.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agoapp/flow-perf: use macro for cache alignment
Thomas Monjalon [Mon, 29 Jun 2020 20:57:50 +0000 (22:57 +0200)]
app/flow-perf: use macro for cache alignment

The macro __rte_cache_aligned is better suited for aligning
a structure on a cache line (of any size).

Fixes: 15c431864000 ("app/flow-perf: add packet forwarding support")

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Wisam Jaddo <wisamm@mellanox.com>
4 years agoring: enable for Windows
Fady Bader [Sun, 21 Jun 2020 12:06:19 +0000 (15:06 +0300)]
ring: enable for Windows

Building ring on Windows.

Signed-off-by: Fady Bader <fady@mellanox.com>
4 years agodevtools: add Windows cross-build test with MinGW
Thomas Monjalon [Sun, 14 Jun 2020 21:58:45 +0000 (23:58 +0200)]
devtools: add Windows cross-build test with MinGW

The Meson cross file is renamed from meson_mingw.txt to cross-mingw,
and is added to test-meson-builds.sh.

The only example supported on Windows so far is "helloworld",
that's why the default list of examples is overridden.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
4 years agodevtools: add ppc64 in meson build test
Thomas Monjalon [Sun, 14 Jun 2020 22:01:44 +0000 (00:01 +0200)]
devtools: add ppc64 in meson build test

Add cross-compilation support of a PPC target in the build test matrix.
The CPU is defined as Power8, running as little endian.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
4 years agodevtools: allow non-standard toolchain in meson test
Thomas Monjalon [Sun, 14 Jun 2020 22:18:44 +0000 (00:18 +0200)]
devtools: allow non-standard toolchain in meson test

If a compiler is not found in $PATH, the compilation test is skipped.
In some cases, the compiler could be found after extending $PATH
in an environment configuration script (called by load-devel-config).

The decision to skip is deferred to a later stage, after loading the
configuration script.

In such case, the variable DPDK_TARGET, used by the configuration script
as input, is the compiler name.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
4 years agodevtools: shrink cross-compilation test definition
Thomas Monjalon [Sun, 14 Jun 2020 22:03:29 +0000 (00:03 +0200)]
devtools: shrink cross-compilation test definition

Each cross-compilation case needs to define the target compiler
and the meson cross file.
Given the compiler is already defined in the cross file,
the latter is enough.

The function "build" is changed to accept a cross file alternatively
to the compiler name. In the case of a file (detected if readable),
the compiler is extracted with sed and tr, and the option --cross-file
is automatically added.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Reviewed-by: David Christensen <drc@linux.vnet.ibm.com>
4 years agoeal/windows: fix thread handle
Tasnim Bashar [Thu, 25 Jun 2020 19:25:39 +0000 (12:25 -0700)]
eal/windows: fix thread handle

Casting thread ID to handle is not accurate way to get thread handle.
Need to use OpenThread function to get thread handle from thread ID.

pthread_setaffinity_np and pthread_getaffinity_np functions
for Windows are affected because of it.

Signed-off-by: Tasnim Bashar <tbashar@mellanox.com>
4 years agobus/pci: support Windows with bifurcated drivers
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:40 +0000 (15:37 +0300)]
bus/pci: support Windows with bifurcated drivers

Uses SetupAPI.h functions to scan PCI tree.
Uses DEVPKEY_Device_Numa_Node to get the PCI NUMA node.
Uses SPDRP_BUSNUMBER and SPDRP_BUSNUMBER to get the BDF.
scanning currently supports types RTE_KDRV_NONE.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agobus/pci: introduce Windows support with stubs
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:39 +0000 (15:37 +0300)]
bus/pci: introduce Windows support with stubs

Addition of stub eal and bus/pci functions to compile
bus/pci for Windows.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agopci: fix address domain format size
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:36 +0000 (15:37 +0300)]
pci: fix address domain format size

the struct rte_pci_addr defines domain as uint32_t variable however
the PCI_PRI_FMT macro used for logging the struct sets the format
of domain to uint16_t.

The mismatch causes the following warning messages
in Windows clang build:

format specifies type 'unsigned short' but the argument
has type 'uint32_t' (aka 'unsigned int') [-Wformat]

Fixes: af75078fece3 ("first public release")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agopci: build on Windows
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:35 +0000 (15:37 +0300)]
pci: build on Windows

Added <sys/types.h> in rte_pci header file
to include off_t type since it is missing for Windows.

Define the implementation of the Linux function rte_pci_get_sysfs_path
in pci_common.c for Linux OS only as it is unneeded for other OSs
and to avoid the warning on deprecated call to getenv() on Windows:

"warning: 'getenv' is deprecated: This function or variable may be unsafe.
Consider using _dupenv_s instead."

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agopci: use OS generic memory mapping functions
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:34 +0000 (15:37 +0300)]
pci: use OS generic memory mapping functions

Changing all of PCIs Unix memory mapping to the
new memory allocation API wrapper.

Change all of PCI mapping function usage in
bus/pci to support the new API.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agoeal: move OS common options usage functions
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:33 +0000 (15:37 +0300)]
eal: move OS common options usage functions

Move common functions between Unix and Windows to eal_common_options.c.

Those functions are getter functions for rte_application_usage_hook.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agoeal: move OS common config objects
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:32 +0000 (15:37 +0300)]
eal: move OS common config objects

Move common functions between Unix and Windows to eal_common_config.c.

Those functions are getter functions for IOVA,
configuration, Multi-process.

Move rte_config, internal_config, early_mem_config and runtime_dir
to be defined in the common file with getter functions.

Refactor the users of the config variables above to use
the getter functions.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agobuild: generate version map file for MinGW
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:41 +0000 (15:37 +0300)]
build: generate version map file for MinGW

The MinGW build for Windows has special cases where exported
function contain additional prefix:

__emutls_v.per_lcore__*

To avoid adding those prefixed functions to the version.map file
the map_to_def.py script was modified to create a map file for MinGW
with the needed changed.

The file name was changed to map_to_win.py and lib/meson.build map output
was unified with drivers/meson.build output

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agobuild: fix drivers library path on Windows
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:38 +0000 (15:37 +0300)]
build: fix drivers library path on Windows

import library (/IMPLIB) in meson.build should use
the 'drivers' and not 'libs' folder.

The error is: fatal error LNK1149: output filename matches input filename.
The fix uses the correct folder.

Fixes: 5ed3766981 ("drivers: process shared link dependencies as for libs")
Cc: stable@dpdk.org
Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agobuild: skip pmdinfogen on Windows
Tal Shnaiderman [Mon, 29 Jun 2020 12:37:37 +0000 (15:37 +0300)]
build: skip pmdinfogen on Windows

pmdinfogen generation is currently unsupported for Windows.
The relevant part in meson.build is skipped.

Signed-off-by: Tal Shnaiderman <talshn@mellanox.com>
4 years agomk: add a paused deprecation warning before each build
Thomas Monjalon [Wed, 17 Jun 2020 23:52:32 +0000 (01:52 +0200)]
mk: add a paused deprecation warning before each build

DPDK 20.05 had some deprecation notes after "make config"
and after the build.
For DPDK 20.08, the config note is replaced with a warning
before the config and before the build.
After the warning, there is a pause which can be skipped
with the variable MAKE_PAUSE.

This deprecation process was discussed in the Technical Board:
http://mails.dpdk.org/archives/dev/2020-April/162839.html

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
4 years agodoc: update build instructions in the Linux guide
Thomas Monjalon [Wed, 17 Jun 2020 21:41:25 +0000 (23:41 +0200)]
doc: update build instructions in the Linux guide

Before removing the "make" build system completely,
the Linux guide instructions are made more concise and accurate.
Some detailed explanations are also available in
doc/guides/prog_guide/dev_kit_root_make_help.rst

This is the swan song for makefile system,
in order to have accurate information backported in LTS.

Cc: stable@dpdk.org
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
4 years agodoc: remove some build instructions where unneeded
Thomas Monjalon [Wed, 17 Jun 2020 21:37:51 +0000 (23:37 +0200)]
doc: remove some build instructions where unneeded

The build should be described only in few places,
in order to maintain up-to-date, accurate and detailed instructions.
This change is removing some of the unneeded repetitions.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: Jay Zhou <jianjay.zhou@huawei.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
4 years agodoc: remove outdated guidelines for library addition
Thomas Monjalon [Wed, 17 Jun 2020 21:29:32 +0000 (23:29 +0200)]
doc: remove outdated guidelines for library addition

There was a doc about how to extend DPDK by adding a library.
It could have been useful but was never updated,
so it is lacking a lot of explanations about doxygen,
meson, versioning, maintainership, etc.

Anyway such guidelines should fit in the contributors guide.
Better to completely remove this obsolete document.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: David Marchand <david.marchand@redhat.com>
4 years agoapp/flow-perf: add packet forwarding support
Wisam Jaddo [Thu, 4 Jun 2020 13:35:02 +0000 (13:35 +0000)]
app/flow-perf: add packet forwarding support

Introduce packet forwarding support to the app to do
some performance measurements.

The measurements are reported in term of packet per
second unit. The forwarding will start after the end
of insertion/deletion operations.

The support has single and multi performance measurements.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agoapp/flow-perf: add memory dump to app
Wisam Jaddo [Thu, 4 Jun 2020 13:35:01 +0000 (13:35 +0000)]
app/flow-perf: add memory dump to app

Introduce new feature to dump memory statistics of each socket
and a total for all before and after the creation.

This will give two main advantage:
1- Check the memory consumption for large number of flows
"insertion rate scenario alone"

2- Check that no memory leackage after doing insertion then
deletion.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agoapp/flow-perf: add deletion rate calculation
Wisam Jaddo [Thu, 4 Jun 2020 13:35:00 +0000 (13:35 +0000)]
app/flow-perf: add deletion rate calculation

Add the ability to test deletion rate for flow performance
application.

This feature is disabled by default, and can be enabled by
add "--deletion-rate" in the application command line options.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agoapp/flow-perf: add insertion rate calculation
Wisam Jaddo [Thu, 4 Jun 2020 13:34:59 +0000 (13:34 +0000)]
app/flow-perf: add insertion rate calculation

Add insertion rate calculation feature into flow
performance application.

The application now provide the ability to test
insertion rate of specific rte_flow rule, by
stressing it to the NIC, and calculate the
insertion rate.

The application offers some options in the command
line, to configure which rule to apply.

After that the application will start producing
rules with same pattern but increasing the outer IP
source address by 1 each time, thus it will give
different flow each time, and all other items will
have open masks.

The current design have single core insertion rate.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agoapp/flow-perf: add flow performance skeleton
Wisam Jaddo [Thu, 4 Jun 2020 13:34:58 +0000 (13:34 +0000)]
app/flow-perf: add flow performance skeleton

Add flow performance application skeleton.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Xiaoyu Min <jackmin@mellanox.com>
4 years agombuf: add dump of free dynamic flags
Xiaolong Ye [Sat, 13 Jun 2020 15:49:21 +0000 (23:49 +0800)]
mbuf: add dump of free dynamic flags

Add support to dump free_flags as below format:

Free bit in mbuf->ol_flags (0 = occupied, 1 = free):
  0000: 0 0 0 0 0 0 0 0
  0008: 0 0 0 0 0 0 0 0
  0010: 0 0 0 0 0 0 0 1
  0018: 1 1 1 1 1 1 1 1
  0020: 1 1 1 1 1 1 1 1
  0028: 1 0 0 0 0 0 0 0
  0030: 0 0 0 0 0 0 0 0
  0038: 0 0 0 0 0 0 0 0

Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
4 years agombuf: fix dynamic field dump log
Xiaolong Ye [Sat, 13 Jun 2020 15:49:20 +0000 (23:49 +0800)]
mbuf: fix dynamic field dump log

For each mbuf byte, free_space[i] == 0 means the space is occupied,
free_space[i] != 0 means space is free.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
4 years agombuf: fix free space update for dynamic field
Xiaolong Ye [Sat, 13 Jun 2020 15:49:19 +0000 (23:49 +0800)]
mbuf: fix free space update for dynamic field

The value free_space[i] is used to save the size of biggest aligned
element that can fit in the zone, current implementation has one flaw,
for example, if user registers dynfield1 (size = 4, align = 4, req = 124)
first, the free_space would be as below after registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 08 08 08 08 00 00 00 00

Then if user continues to register dynfield2 (size = 4, align = 4),
free_space would become:

  0070: 00 00 00 00 04 04 04 04
  0078: 04 04 04 04 00 00 00 00

Further request dynfield3 (size = 8, align = 8) would fail to register
due to alignment requirement can't be satisfied, though there is enough
space remained in mbuf.

This patch fixes above issue by saving alignment only in aligned zone,
after the fix, above registrations order can be satisfied, free_space
would be like:

After dynfield1 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 04 04 04 04 00 00 00 00

After dynfield2 registration:

  0070: 08 08 08 08 08 08 08 08
  0078: 00 00 00 00 00 00 00 00

After dynfield3 registration:

  0070: 00 00 00 00 00 00 00 00
  0078: 00 00 00 00 00 00 00 00

This patch also reduces iterations in process_score() by jumping align
steps in each loop.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
4 years agombuf: fix error code in dynamic field/flag registration
Xiaolong Ye [Sat, 13 Jun 2020 15:49:18 +0000 (23:49 +0800)]
mbuf: fix error code in dynamic field/flag registration

Set rte_errno as ENOMEM when allocation failure.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
4 years agombuf: fix boundary check at dynamic field registration
Xiaolong Ye [Sat, 13 Jun 2020 15:49:17 +0000 (23:49 +0800)]
mbuf: fix boundary check at dynamic field registration

We should make sure off + size < sizeof(struct rte_mbuf) to avoid
possible out-of-bounds access of free_space array, there is no issue
currently due to the low bits of free_flags (which is adjacent to
free_space) are always set to 0. But we shouldn't rely on it since it's
fragile and layout of struct mbuf_dyn_shm may be changed in the future.
This patch adds boundary check explicitly to avoid potential risk of
out-of-bounds access.

Fixes: 4958ca3a443a ("mbuf: support dynamic fields and flags")
Cc: stable@dpdk.org
Signed-off-by: Xiaolong Ye <xiaolong.ye@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
4 years agobus/pci: fix VF memory access
Haiyue Wang [Thu, 25 Jun 2020 03:50:46 +0000 (11:50 +0800)]
bus/pci: fix VF memory access

To fix CVE-2020-12888, the linux vfio-pci module will invalidate mmaps
and block MMIO access on disabled memory, it will send a SIGBUS to the
application:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=abafbc551fdd

When the application opens the vfio PCI device, the vfio-pci module will
enable the bus memory space through PCI read/write access. According to
the PCIe specification, the 'Memory Space Enable' is always zero for VF:

             Table 9-13 Command Register Changes

Bit Location | PF and VF Register Differences | PF         | VF
             | From Base                      | Attributes | Attributes
-------------+--------------------------------+------------+-----------
             | Memory Space Enable - Does not |            |
             | apply to VFs. Must be hardwired|  Base      |  0b
     1       | to 0b for VFs. VF Memory Space |            |
             | is controlled by the VF MSE bit|            |
             | in the VF Control register.    |            |
-------------+--------------------------------+------------+-----------

Afterwards the vfio-pci will initialize its own virtual PCI config space
data ('vconfig') by reading the VF's physical PCI config space, then the
'Memory Space Enable' bit in vconfig will always be 0b value. This will
make the vfio-pci treat the BAR memory space as disabled, and the SIGBUS
will be triggered if access these BARs.

By investigation, the VF PCI device *passthrough* into the Guest OS by
QEMU has the 'Memory Space Enable' with 1b value. That's because every
PCI driver will start to enable the memory space, and this action will
be hooked by vfio-pci virtual PCI read/write to set the 'Memory Space
Enable' in vconfig space to 1b. So VF runs in guest OS has 'Mem+', but
VF runs in host OS has 'Mem-'.

Align with PCI working mode in Guest/QEMU/Host, in DPDK, enable the PCI
bus memory space explicitly to avoid access on disabled memory.

Fixes: 33604c31354a ("vfio: refactor PCI BAR mapping")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Harman Kalra <hkalra@marvell.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Tested-by: Thierry Martin <thierry.martin.public@gmail.com>