dpdk.git
Chengwen Feng [Tue, 22 Sep 2020 12:03:21 +0000 (20:03 +0800)]
net/hns3: support flow action of queue region

Kunpeng 930 hardware supports spreading packets across a region of queues
that can be configured by an FDIR rule. This means the user can create an
FDIR rule whose action is a region of queues, and RSS then uses the region
info to spread packets.

As we know, RTE_FLOW_ACTION_TYPE_RSS is used to spread packets among
several queues, and the user can configure parameters such as
func/level/types/key/queue to control the RSS function, so we provide
this feature under the RTE_FLOW_ACTION_TYPE_RSS framework.

Because the RSS input tuple does not include the ETH header, we use the
following rule to distinguish a queue region configuration from a
general RSS configuration:
Case 1: the pattern has ETH and the action's queue_num > 0, which
indicates a queue region configuration.
Other cases: general RSS configuration.

So if the user wants to configure a flow in which ipv4=192.168.1.192 is
spread to the queue region of queues 0/1/2/3, the pattern should be:
  RTE_FLOW_ITEM_TYPE_ETH with spec=last=mask=NULL
  RTE_FLOW_ITEM_TYPE_IPV4 with spec=192.168.1.192 last=mask=NULL
  RTE_FLOW_ITEM_TYPE_END
the action should be:
  RTE_FLOW_ACTION_TYPE_RSS with queue_num=4 queue=0/1/2/3
  RTE_FLOW_ACTION_TYPE_END
After calling rte_flow_create, one FDIR rule will be created.
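
The rule that separates a queue region configuration from a general RSS
configuration can be sketched as below. This is a minimal illustration, not
the hns3 code: struct flow_summary, enum rss_conf_kind and
classify_rss_action() are hypothetical names standing in for the parsed
rte_flow pattern and RSS action.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of a parsed flow rule (not the hns3 types). */
struct flow_summary {
    bool pattern_has_eth;   /* an ETH item is present in the pattern */
    uint32_t rss_queue_num; /* queue_num of the RSS action */
};

enum rss_conf_kind {
    RSS_CONF_GENERAL,      /* general RSS configuration */
    RSS_CONF_QUEUE_REGION, /* RSS action describes a queue region */
};

/* Case 1: ETH in the pattern and queue_num > 0 means queue region. */
static enum rss_conf_kind
classify_rss_action(const struct flow_summary *f)
{
    if (f->pattern_has_eth && f->rss_queue_num > 0)
        return RSS_CONF_QUEUE_REGION;
    return RSS_CONF_GENERAL; /* every other case */
}
```

For the example above (ETH + IPV4 pattern, queue_num=4), the sketch selects
the queue region path.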

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Hongbo Zheng [Tue, 22 Sep 2020 12:03:20 +0000 (20:03 +0800)]
net/hns3: break loop in adding error stats

This patch removes a redundant operation during table traversal. In the
internal function hns3_error_int_stats_add, which adds error statistics,
the for loop can match at most one statistical item, so the loop can
break as soon as that item is found instead of traversing the remaining
table entries.
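
The idea can be sketched as follows. The table layout and the names
err_stat_entry and error_stats_add() are assumptions made for illustration;
they are not the hns3 definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical statistics table entry: one counter per error id. */
struct err_stat_entry {
    uint32_t err_id;
    uint64_t count;
};

/* Increment the counter that belongs to err_id, if it is in the table. */
static void
error_stats_add(struct err_stat_entry *table, size_t n, uint32_t err_id)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].err_id == err_id) {
            table[i].count++;
            break; /* ids are unique, so no later entry can match */
        }
    }
}
```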

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 22 Sep 2020 12:03:19 +0000 (20:03 +0800)]
net/hns3: add default to switch when parsing fd tuple

This patch fixes the following static check warning in the internal
function hns3_fd_convert_tuple:
    "The switch statement must have a 'default' branch".

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Hongbo Zheng [Tue, 22 Sep 2020 12:03:18 +0000 (20:03 +0800)]
net/hns3: skip VF register access when PF in FLR

According to the PCIe protocol, an FLR to a PF device resets the PF state
as well as the SR-IOV extended capability, including VF Enable, which
means that the VFs no longer exist.

While the PF device is in the FLR reset stage, the register state of the
VF device is not reliable, so the VF device's register state is not
checked during PF FLR.

In this case, we ignore the register states to avoid accessing
nonexistent registers, and return false from the internal function
hns3vf_is_reset_pending to indicate that there are no other reset states
for the PMD to process.

Fixes: 2790c6464725 ("net/hns3: support device reset")
Cc: stable@dpdk.org
Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 22 Sep 2020 12:03:17 +0000 (20:03 +0800)]
net/hns3: add TSO pseudo header calculation compatibility

On Kunpeng 920, when processing packets that need TSO, the network driver
needs to erase the L4 length value of the TCP TSO pseudo header and
recalculate the pseudo header checksum. Kunpeng 930 does not need the L4
length value of the TCP TSO pseudo header to be erased.
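
As an illustration of the difference, here is a minimal sketch of an IPv4
TCP pseudo header sum with and without the L4 length. It is not the driver
code: tcp_pseudo_hdr_sum() and its erase_l4_len parameter are hypothetical,
and addresses are taken in host byte order to keep the arithmetic easy to
follow (a real driver works on headers in network byte order).

```c
#include <stdbool.h>
#include <stdint.h>

/* Fold a 32-bit one's-complement accumulator down to 16 bits. */
static uint16_t
csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * One's-complement sum over the IPv4 pseudo header for TCP:
 * source address, destination address, protocol (6) and L4 length.
 * With erase_l4_len set, the length field is treated as zero, which is
 * what the text describes for hardware that needs it erased.
 */
static uint16_t
tcp_pseudo_hdr_sum(uint32_t src_ip, uint32_t dst_ip, uint16_t l4_len,
                   bool erase_l4_len)
{
    uint32_t sum = 0;

    sum += (src_ip >> 16) + (src_ip & 0xffff);
    sum += (dst_ip >> 16) + (dst_ip & 0xffff);
    sum += 6; /* IPPROTO_TCP */
    if (!erase_l4_len)
        sum += l4_len;
    return csum_fold(sum);
}
```

A caller would pick erase_l4_len from a per-device capability queried at
initialization, so both hardware generations share the same Tx path.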

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Hongbo Zheng [Tue, 22 Sep 2020 12:03:16 +0000 (20:03 +0800)]
net/hns3: add max number of segments compatibility

On Kunpeng 920, the maximum nb_segs of a non-TSO packet in the Tx
direction is 8; Kunpeng 930 raises this limit to 18. This patch sets the
corresponding value by querying, during initialization, the maximum
number of non-TSO nb_segs supported by the device.

Signed-off-by: Hongbo Zheng <zhenghongbo3@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Chengchang Tang [Tue, 22 Sep 2020 12:03:15 +0000 (20:03 +0800)]
net/hns3: add default case to switch in Rx VLAN processing

This patch fixes the following static check warning:
"The switch statement must have a 'default' branch".

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Chengchang Tang [Tue, 22 Sep 2020 12:03:14 +0000 (20:03 +0800)]
net/hns3: fix deleting default VLAN from PF

Currently, the default VLAN (VLAN ID 0) is never deleted from the
hardware VLAN table on an hns3 PF device. As a result, even when a
non-zero PVID is set by calling rte_eth_dev_set_vlan_pvid on an hns3 PF
device, packets with VLAN 0 and packets without a VLAN tag are still
received by the PF driver in the Rx direction.

This patch removes the restriction that VLAN 0 cannot be removed during
PVID configuration, so that packets without the PVID are filtered when a
PVID is set. The patch also adds VLAN 0 to the soft list when
initializing the VLAN configuration, so that VLAN 0 is deleted from the
hardware VLAN table when the device is closed.

Fixes: 411d23b9eafb ("net/hns3: support VLAN")
Cc: stable@dpdk.org
Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Wei Hu (Xavier) [Tue, 22 Sep 2020 12:03:13 +0000 (20:03 +0800)]
net/hns3: add VLAN configuration compatibility

Because of hardware limitations in the old version of the hns3 network
engine, there are some restrictions:
a) The hns3 PMD needs to select a different VLAN processing mode
   depending on whether a PVID is set, which means the driver needs to
   be aware of the PVID state.
b) On the transmit path, only two layers of VLAN tags are supported. If
   the total number of VLAN tags in the mbuf and VLAN tags offloaded by
   hardware (VLAN insertion by descriptor) exceeds two, the VLAN in the
   mbuf is overwritten by the VLAN in the descriptor.
c) If a port-based VLAN is set, only one VLAN header is allowed in the
   mbuf; otherwise the packet is discarded by hardware.

To remove these restrictions, two changes are implemented in the new
versions of the network engine:
1) add a new VLAN tag insertion mode, named tag shift mode;
2) add a new VLAN strip control bit, named strip hide enable;

In tag shift mode, the VLAN tag is shifted automatically when the
insertion position already holds a tag. The PMD therefore no longer
needs to consider the VLAN tag1 and tag2 configurations on the Tx side,
because the hardware handles them. However, the related configuration is
retained for compatibility with the old version of the network engine.

With VLAN strip hide, the hardware strips the VLAN tag and hides the
VLAN in the descriptor (the VLAN ID is exposed as zero and the related
STRIP_TAGP is off).

With these changes, the hns3 PMD no longer needs to be aware of the PVID
status, and it can send packets with multi-layer (more than two) VLAN
tags. Therefore, the hns3 PMD introduces the concept of a VLAN mode and
adds a new VLAN mode named HNS3_PVID_MODE to indicate that PVID-related
IO processing can be implemented by the hardware. The VF driver does not
need to be modified, because the related mailbox messages are not sent
by the PF kernel mode netdev driver under the new network engine and all
the related hardware configuration is on the PF side.

Signed-off-by: Chengchang Tang <tangchengchang@huawei.com>
Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
Somnath Kotur [Fri, 25 Sep 2020 10:40:44 +0000 (16:10 +0530)]
net/bnxt: check representors devargs before probe

Check num_rep before invoking the representor port probe. num_rep should
be non-zero if representor devargs are provided.

Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ajit Khaparde [Thu, 20 Aug 2020 03:51:02 +0000 (20:51 -0700)]
net/bnxt: support 200G PAM4 link

Thor-based NICs can support PAM4 as well as NRZ link negotiation.
With this patch we are adding support for 200G link speeds based on
PAM4 signaling. While PAM4 can negotiate speeds for 50G and 100G as
well, the PMD will use NRZ signaling for these speeds.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
David Liu [Tue, 15 Sep 2020 17:27:40 +0000 (13:27 -0400)]
app/testpmd: add EEPROM command

Add module EEPROM/EEPROM dump command
   "show port <port_id> (module_eeprom|eeprom)"
Commands will dump the content of the EEPROM/module
EEPROM for the selected port.

Signed-off-by: David Liu <dliu@iol.unh.edu>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Lance Richardson [Tue, 22 Sep 2020 17:30:35 +0000 (13:30 -0400)]
net/bnxt: fix queue get info

Return current offloads in rxq_info_get()/txq_info_get().

Fixes: 2fc201884be8 ("net/bnxt: support rxq/txq get information")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Lance Richardson [Tue, 22 Sep 2020 17:30:34 +0000 (13:30 -0400)]
net/bnxt: fix drop enable in get Rx queue info

Return correct value for rx_drop_en. Add per-queue field to
track rx_drop_en configuration.

Fixes: 2fc201884be8 ("net/bnxt: support rxq/txq get information")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kishore Padmanabha [Tue, 22 Sep 2020 07:06:32 +0000 (12:36 +0530)]
net/bnxt: fix crash during NAT configuration

Initialize the global parameters structure to avoid segmentation fault
in the TRUFLOW global configuration set API.

Fixes: 0a58be6f7c1e ("net/bnxt: add access to NAT global register")
Cc: stable@dpdk.org
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kishore Padmanabha [Tue, 22 Sep 2020 07:06:31 +0000 (12:36 +0530)]
net/bnxt: fix flow match to ignore packet type

The pkt_type field in the profile TCAM table needs to be ignored and
should not be set to the normal packet type. Packets that are segmented
by the transmit segmentation offload feature in the driver are not
marked with the normal pkt_type, which results in a profile TCAM table
miss. The flow is then not offloaded, which reduces throughput.

Fixes: fe82f3e02701 ("net/bnxt: support exact match templates")
Cc: stable@dpdk.org
Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Somnath Kotur [Tue, 22 Sep 2020 07:06:30 +0000 (12:36 +0530)]
net/bnxt: support representors on remote host domain

In the Stingray use case, representors conventionally run inside the
SoC domain and represent functions that are in the x86 domain. To
support building representors for endpoints that are not in the same
host domain, additional devargs have been added to the PMD:
rep-based-pf=<physical index> rep-is-pf=<VF=0 or PF=1>
where `rep-based-pf` specifies the physical index of the base PF
that the representor is derived from.
Since representors can be created for endpoint PFs as well,
rename struct bnxt_vf_representor to bnxt_representor, along with the
related dev_ops and function names.
The devargs have also been extended to specify the exact CoS queue,
along with flow control enablement, to be used for the conduit between
the representor and the endpoint function.
This is how a sample devargs string looks with all the extended devargs:

-w 0000:06:02.0,host-based-truflow=1,representor=[1],rep-based-pf=8,
rep-is-pf=1,rep-q-r2f=1,rep-fc-r2f=0,rep-q-f2r=1,rep-fc-f2r=1

Call CFA_PAIR_ALLOC only in case of Stingray instead of CFA_VFR_ALLOC.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Somnath Kotur [Tue, 22 Sep 2020 07:06:28 +0000 (12:36 +0530)]
net/bnxt: fix TruFlow devarg handling

Set the TRUFLOW Enable bit in bp->flags only if the value passed in
devargs was 1. Otherwise set it to 0.
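
A minimal sketch of that behaviour, assuming a hypothetical flag bit and
helper name (FLAG_TRUFLOW_EN and apply_truflow_devarg() are not the bnxt
definitions):

```c
#include <stdint.h>

/* Hypothetical bit position for the TruFlow enable flag. */
#define FLAG_TRUFLOW_EN (UINT32_C(1) << 3)

/*
 * Set the flag only when the devarg value is exactly 1; clear it for any
 * other value so a stale bit cannot survive. Other bits are untouched.
 */
static uint32_t
apply_truflow_devarg(uint32_t flags, unsigned long devarg_value)
{
    if (devarg_value == 1)
        flags |= FLAG_TRUFLOW_EN;
    else
        flags &= ~FLAG_TRUFLOW_EN;
    return flags;
}
```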

Fixes: 313ac35ac701 ("net/bnxt: support ULP session manager init")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Somnath Kotur [Tue, 22 Sep 2020 07:06:27 +0000 (12:36 +0530)]
net/bnxt: fix shift operation

In page_roundup(), left shifting by more than 31 bits can be undefined
behavior because the return value is an int, and page_getenum() can
return a value as high as 63.
Fix this by capping that return value to less than 32.
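
The hazard and the cap can be illustrated with the sketch below. It is not
the bnxt implementation: page_order_capped() and page_roundup_u32() are
hypothetical helpers that only show how bounding the exponent below 32 keeps
a 32-bit shift well defined.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Smallest exponent e with (1 << e) >= size, capped at 31 so the result
 * is always a valid shift count for a 32-bit value.
 */
static int
page_order_capped(size_t size)
{
    int order = 0;

    while (order < 31 && ((size_t)1 << order) < size)
        order++;
    return order;
}

/* Shift an unsigned 32-bit one; the exponent is guaranteed to be < 32. */
static uint32_t
page_roundup_u32(size_t size)
{
    return UINT32_C(1) << page_order_capped(size);
}
```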

Coverity issue: 343463
Fixes: b7778e8a1c00 ("net/bnxt: refactor to properly allocate resources for PF/VF")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Somnath Kotur [Tue, 22 Sep 2020 07:06:26 +0000 (12:36 +0530)]
net/bnxt: simplify representor Rx ring creation

rx_queue_setup_op for the representor was using a common function to
initialize the software data structures for the Rx ring. But that
routine also has code to initialize other rings that representors do not
need, such as the cp/agg rings.
Define and invoke a new function to set up structures just for the
representor Rx ring.

Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Kishore Padmanabha [Tue, 22 Sep 2020 07:06:25 +0000 (12:36 +0530)]
net/bnxt: support IPv6 VXLAN decap action

Add a template to support IPv6 VXLAN flows to enable support for
VXLAN decap for those flows.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ciara Loftus [Wed, 23 Sep 2020 07:34:39 +0000 (07:34 +0000)]
net/af_xdp: support shared UMEM

Kernel v5.10 will introduce the ability to efficiently share a UMEM
between AF_XDP sockets bound to different queue ids on the same or
different devices. This patch integrates that functionality into the AF_XDP
PMD.

A PMD will attempt to share a UMEM with others if the shared_umem=1 vdev
arg is set. UMEMs can only be shared across PMDs with the same mempool, up
to a limited number of PMDs governed by the size of the given mempool.
Sharing UMEMs is not supported for non-zero-copy (aligned) mode.

The benefit of sharing UMEM across PMDs is a saving in memory due to not
having to register the UMEM multiple times. Throughput was measured to
remain within 2% of the default mode (not sharing UMEM).

A version of libbpf >= v0.2.0 is required and the appropriate pkg-config
file for libbpf must be installed such that meson can determine the
version.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Steve Yang [Tue, 15 Sep 2020 06:46:22 +0000 (06:46 +0000)]
net/ixgbe: fix VF reset HW error handling

When starting a VF that has no initial MAC address assigned by the
underlying host PF driver, reuse the MAC address that was assigned when
the VF was initialized.

Fixes: f69166c9a3c9 ("net/ixgbe: fix reset error handling")
Cc: stable@dpdk.org
Signed-off-by: Steve Yang <stevex.yang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Suanming Mou [Wed, 16 Sep 2020 10:19:48 +0000 (18:19 +0800)]
net/mlx5: manage header reformat actions with hashed list

To manage encap/decap header reformat actions, the mlx5 PMD used a
singly linked list. Lookup and insertion took too long when there were
millions of objects, which hurt the flow insertion/deletion rate.

To improve performance, a hashed list is used instead. The list
implementation is updated to support non-unique keys with few
collisions.
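
A hashed list that tolerates non-unique keys can be sketched as below. This
is a generic illustration, not the mlx5 list: the hlist_* names, the bucket
count and the match callback are all assumptions. Entries that share a key
land in the same bucket, and the caller's callback picks the right one.

```c
#include <stddef.h>
#include <stdint.h>

#define HLIST_BUCKETS 64u /* power of two, chosen arbitrarily here */

struct hlist_entry {
    struct hlist_entry *next;
    uint64_t key; /* may be shared by several entries */
};

struct hlist {
    struct hlist_entry *heads[HLIST_BUCKETS];
};

/* Caller-supplied predicate used to tell apart entries with equal keys. */
typedef int (*hlist_match_fn)(const struct hlist_entry *e, const void *ctx);

/* Example predicate: match one specific entry by address. */
static int
hlist_match_ptr(const struct hlist_entry *e, const void *ctx)
{
    return e == ctx;
}

/* Push the entry at the head of its bucket; duplicates are allowed. */
static void
hlist_insert(struct hlist *h, struct hlist_entry *e)
{
    size_t b = (size_t)(e->key & (HLIST_BUCKETS - 1));

    e->next = h->heads[b];
    h->heads[b] = e;
}

/* Walk only one bucket; a NULL match accepts the first entry with the key. */
static struct hlist_entry *
hlist_lookup(struct hlist *h, uint64_t key, hlist_match_fn match,
             const void *ctx)
{
    struct hlist_entry *e = h->heads[key & (HLIST_BUCKETS - 1)];

    for (; e != NULL; e = e->next)
        if (e->key == key && (match == NULL || match(e, ctx)))
            return e;
    return NULL;
}
```

Lookup cost depends on the bucket length rather than the total number of
objects, which is the property that matters for the insertion rate.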

Signed-off-by: Suanming Mou <suanmingm@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Xueming Li [Tue, 15 Sep 2020 03:05:53 +0000 (03:05 +0000)]
net/mlx5: use bond index for netdev operations

In the case of bonding, the device ifindex was detected as the PF
ifindex, so any operation using the ifindex was applied to the PF
instead of the bond device. These operations include MTU get/set,
up/down, MAC address manipulation, etc.

This patch detects the bond interface ifindex and name for a PF that
joins a bond interface, and uses them by default for netdev operations.

Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@nvidia.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Viacheslav Ovsiienko [Sun, 13 Sep 2020 19:33:39 +0000 (19:33 +0000)]
net/mlx5: fix vectorized Rx burst check

The Rx queue start/stop feature is not supported if the vectorized
rx_burst routine is engaged. There was a typo in the routine address,
so the rx_burst type check was wrong.

Fixes: 161d103b231c ("net/mlx5: add queue start and stop")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Acked-by: Matan Azrad <matan@nvidia.com>
Kalesh AP [Tue, 22 Sep 2020 05:34:16 +0000 (11:04 +0530)]
net/bnxt: fix link status during device recovery

Driver should not send the phy_cfg request to bring link down
during reset recovery. If the driver sends the phy_cfg request
in recovery process, then FW needs to re-establish the link which
in turn increases the recovery time based on PHY type and link partners.

Fixes: df6cd7c1f73a ("net/bnxt: handle reset notify async event from FW")
Cc: stable@dpdk.org
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Lance Richardson [Mon, 21 Sep 2020 17:45:49 +0000 (13:45 -0400)]
net/bnxt: fix PCI per function stats

Fix to use correct value offset for PCI function stats.

Fixes: 5f9374de2a3a ("net/bnxt: add PCI function stats to extended stats")
Cc: stable@dpdk.org
Signed-off-by: Lance Richardson <lance.richardson@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Jeff Guo [Fri, 18 Sep 2020 05:46:56 +0000 (13:46 +0800)]
net/iavf: support GTPU outer and inner co-exist

Although currently only the GTPU inner hash is enabled and not the GTPU
outer hash, the outer protocol still needs to co-exist with the inner
protocol when configuring a GTPU inner hash rule. This allows the GTPU
inner hash to be supported for different outer protocols.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Haiyue Wang [Tue, 22 Sep 2020 06:40:24 +0000 (14:40 +0800)]
net/ice: refactor Rx FlexiMD handling

The hardware supports many kinds of FlexiMDs set into the Rx descriptor,
and the FlexiMDs can have different offsets in the descriptor according
to the DDP package setting.

The FlexiMD type and offset are identified by the RXDID, which is used
to set up the queue.

To make it easier to support different RXDIDs in the future, refactor
the Rx FlexiMD handling into functions mapped to the related RXDIDs.

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
Junyu Jiang [Tue, 22 Sep 2020 09:19:31 +0000 (09:19 +0000)]
net/i40e: fix byte counters

This patch fixes an issue where the Rx/Tx byte statistics counters
overflowed at the 48-bit limit, by enlarging the limit.
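
The message does not describe the mechanism. One common way to get past a
48-bit hardware counter is to accumulate modular deltas into a 64-bit
software counter, sketched below under that assumption. struct cnt48_ext and
cnt48_update() are hypothetical names, not the i40e code.

```c
#include <stdint.h>

#define CNT48_MASK ((UINT64_C(1) << 48) - 1)

/* Software extension of a 48-bit hardware counter. */
struct cnt48_ext {
    uint64_t last_raw; /* last 48-bit value read from hardware */
    uint64_t total;    /* accumulated 64-bit software counter */
};

/*
 * Fold a new raw reading into the 64-bit total. The difference is taken
 * modulo 2^48, so a single wrap between two reads is handled; the counter
 * must be polled often enough that it cannot wrap twice.
 */
static uint64_t
cnt48_update(struct cnt48_ext *c, uint64_t raw)
{
    uint64_t delta;

    raw &= CNT48_MASK;
    delta = (raw - c->last_raw) & CNT48_MASK;
    c->total += delta;
    c->last_raw = raw;
    return c->total;
}
```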

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
RongQing Li [Fri, 18 Sep 2020 11:32:31 +0000 (19:32 +0800)]
net/af_xdp: avoid deadlock due to empty fill queue

While receiving packets, reserving the fill queue can fail because the
buffer ring is shared between Tx and Rx and may be temporarily
unavailable. As a result, both the fill queue and the Rx queue will be
empty.

The kernel side then cannot receive packets because the fill queue is
empty, and DPDK cannot reserve the fill queue because it has no packets
to receive, so a deadlock occurs.

So move the fill queue reservation before xsk_ring_cons__peek to fix it.

Cc: stable@dpdk.org
Signed-off-by: RongQing Li <lirongqing@baidu.com>
Signed-off-by: Dongsheng Rong <rongdongsheng@baidu.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:35 +0000 (07:30 +0200)]
net/ena: expose ENI stats as additional xstats

The new HAL allows the driver to read extra ENI stats. The exact meaning
of each of them can be found in the base/ena_defs/ena_admin_defs.h file,
in the ena_admin_eni_stats structure.

The ena_eni_stats structure is exactly the same as ena_admin_eni_stats,
but it had to be added for compatibility with the xstats macros.

Reading ENI stats requires communication with the admin queue.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:34 +0000 (07:30 +0200)]
net/ena: lock dynamic usages of admin queue

There are some cases where admin queue commands are issued after the
configuration phase has finished. For example, the application could ask
for the driver statistics from multiple cores at once.

Because the admin queue is, by design, not thread-safe, a spinlock was
added to protect all uses of the admin queue after the configuration is
done.
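
The locking pattern can be sketched as follows. To keep the example free of
DPDK headers, a C11 atomic_flag stands in for the driver's spinlock, and
struct admin_queue with its cmds_issued field is a hypothetical placeholder
for the real, non-thread-safe queue state.

```c
#include <stdatomic.h>
#include <stdint.h>

struct admin_queue {
    atomic_flag lock;     /* spinlock guarding the fields below */
    uint64_t cmds_issued; /* stands in for the unsynchronized queue state */
};

static void
admin_queue_init(struct admin_queue *q)
{
    atomic_flag_clear(&q->lock);
    q->cmds_issued = 0;
}

/* Issue one command; callers on different cores are serialized. */
static uint64_t
admin_queue_submit(struct admin_queue *q)
{
    uint64_t id;

    while (atomic_flag_test_and_set_explicit(&q->lock,
                                             memory_order_acquire))
        ; /* spin until the current holder releases the lock */
    id = ++q->cmds_issued; /* critical section */
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
    return id;
}
```

Every runtime entry point that touches the admin queue (statistics reads,
for instance) would go through the same lock.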

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:33 +0000 (07:30 +0200)]
net/ena/base: update generation date and commit

The current ena_com version was generated on 26.04.2020.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:32 +0000 (07:30 +0200)]
net/ena/base: simplify loop copying Rx descriptors

Checking that cdesc is not NULL makes no sense if the idx argument is
not 0, so it can be skipped, as the error won't be detected anyway.

To simplify this, only the 'i' value is verified, and the code breaks
out of the infinite loop once all descriptors have been copied into the
buffer.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:31 +0000 (07:30 +0200)]
net/ena/base: convert values to 32-bit before shifting

When filling out the meta descriptor, all values should be converted to
the desired type (u32 in the case of the meta descriptor) to prevent
losing data.

For example, io_sq->phase is of type u8. If
ENA_ETH_IO_TX_META_DESC_PHASE_SHIFT were greater than 8, all data
would be lost.
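
A minimal sketch of the pattern, with a hypothetical shift value and helper
name (META_PHASE_SHIFT and meta_word_set_phase() are illustrative, not the
ena_com definitions): the narrow value is widened to the 32-bit target type
before it is shifted into place.

```c
#include <stdint.h>

#define META_PHASE_SHIFT 24 /* hypothetical bit position */
#define META_PHASE_MASK  (UINT32_C(1) << META_PHASE_SHIFT)

/*
 * Store the 1-bit phase into a 32-bit descriptor word. The cast makes the
 * shift happen in uint32_t, independent of the width of the source field.
 */
static uint32_t
meta_word_set_phase(uint32_t word, uint8_t phase)
{
    word &= ~META_PHASE_MASK;
    word |= ((uint32_t)phase << META_PHASE_SHIFT) & META_PHASE_MASK;
    return word;
}
```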

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:30 +0000 (07:30 +0200)]
net/ena/base: check null meta desc

Static code analysis showed that meta_desc can be NULL. To avoid
dereferencing a NULL pointer, an extra check was added to verify that
the pointer is in fact valid.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:29 +0000 (07:30 +0200)]
net/ena/base: store admin stats as 64-bit

To minimize the chance of integer overflow, the type of the admin
statistics was changed from u32 to u64.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:28 +0000 (07:30 +0200)]
net/ena/base: add missing unlikely

To align the error checking code with other parts of ena_com, the
condition that tests for the error was wrapped inside unlikely().
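
For readers unfamiliar with the idiom, a branch hint of this kind is
typically built on the GCC/Clang __builtin_expect builtin. The sketch below
assumes such a definition; copy_first_byte() is a hypothetical function used
only to show the error check being wrapped.

```c
#include <stddef.h>

/* Typical definition; falls back to a no-op hint on other compilers. */
#ifndef unlikely
#if defined(__GNUC__) || defined(__clang__)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define unlikely(x) (x)
#endif
#endif

/* The error path is marked as the cold branch; behavior is unchanged. */
static int
copy_first_byte(const unsigned char *src, unsigned char *dst)
{
    if (unlikely(src == NULL || dst == NULL))
        return -1;
    *dst = *src;
    return 0;
}
```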

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:27 +0000 (07:30 +0200)]
net/ena/base: cleanup coding style

* Function argument style improvement (space after *)
* Align indentation of the define
* Typo fix in the documentation
* Remove extra empty line after license (aligned with other files)
* Extra alignment of one line was fixed

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:26 +0000 (07:30 +0200)]
net/ena/base: check for RSS key configuration support

Setting the RSS hash function may not be supported by the device. In
that situation there is no need to fill in the default hash key or even
to allocate the hash key.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:25 +0000 (07:30 +0200)]
net/ena/base: do not use hardcoded RSS key buffer size

The layout of the RSS key buffer from the device's perspective is well
defined, so a constant should be used instead of a magic number. It also
doesn't have to be calculated dynamically.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:24 +0000 (07:30 +0200)]
net/ena/base: split RSS function and hash getters

There is no need to keep a single function for both the hash function
and the key. If the caller wanted to get only one of the values, it had
to pass NULL for the other, making the API harder to use.

Besides reading the function from the device, one can also use
ena_com_get_current_hash_function() to get the integer value that
represents the current hash function stored in the ena_com layer.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:23 +0000 (07:30 +0200)]
net/ena/base: add ENI stats

The Elastic Network Interface (ENI) stats can be acquired from the HW.

They can provide advanced values which can be further used by the
application for better flow management.

They aren't available to the DPDK application yet. The PMD must expose
them directly.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:22 +0000 (07:30 +0200)]
net/ena/base: rework setup of accelerated LLQ mode

The purpose of this change is general code simplification and
type safety improvement for the logical values.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:21 +0000 (07:30 +0200)]
net/ena/base: remove mmiowb not defined macro

As there is no replacement for mmiowb() and there is no need to use both
versions in the DPDK, this ifdef was simply removed.

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
Michal Krawczyk [Thu, 17 Sep 2020 05:30:20 +0000 (07:30 +0200)]
net/ena/base: fix release of wait event

The wait event is being accessed without making sure that the completion
context exists. The check for that is just below, so it can be used to
release the wait event safely.

Fixes: 3adcba9a8987 ("net/ena: update HAL to the newer version")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agonet/ena/base: make delay exponential in polling functions
Michal Krawczyk [Thu, 17 Sep 2020 05:30:19 +0000 (07:30 +0200)]
net/ena/base: make delay exponential in polling functions

Instead of the fixed 5 ms delay in the polling functions, use
values from a given range (by default from 100 us to 5000 us) and
increase them exponentially each time the operation isn't finished.

This change can improve responsiveness of the driver for the fast
operations.
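
A minimal sketch of such a bounded exponential delay, assuming the delay
doubles on every unfinished poll attempt. The names below are hypothetical
and are not taken from the ENA HAL; only the 100 us and 5000 us bounds come
from the text above.

```c
#include <stdint.h>

/* Default bounds named in the commit message. */
#define SKETCH_MIN_POLL_US 100U
#define SKETCH_MAX_POLL_US 5000U

/*
 * Hypothetical helper: delay to use for a given 0-based poll attempt.
 * The delay starts at the minimum, doubles per attempt (an assumption)
 * and is clamped to the maximum.
 */
static uint32_t sketch_backoff_delay_us(uint32_t attempt)
{
	uint64_t delay = SKETCH_MIN_POLL_US;

	/* Cap the shift count so the left shift stays well defined. */
	if (attempt > 16)
		attempt = 16;
	delay <<= attempt;

	if (delay > SKETCH_MAX_POLL_US)
		return SKETCH_MAX_POLL_US;
	return (uint32_t)delay;
}
```

With these bounds a fast operation is rechecked after 100 us, while a slow
one settles at the old 5 ms interval from the seventh attempt on.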

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agonet/ena/base: support admin status for resource busy
Michal Krawczyk [Thu, 17 Sep 2020 05:30:18 +0000 (07:30 +0200)]
net/ena/base: support admin status for resource busy

The admin command could return ENA_ADMIN_RESOURCE_BUSY status, which
means that the given resource cannot be used at the moment.

However, the request can be repeated, so it's being converted to the
ENA_COM_TRY_AGAIN error code.
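
The conversion can be pictured roughly as below. This is only a sketch: the
enum values are placeholders, and mapping the "try again" code to -EAGAIN
is an assumption made for illustration, not something stated in this
commit.

```c
#include <errno.h>

/* Placeholder status values; the real admin status codes are not shown here. */
enum sketch_admin_status {
	SKETCH_ADMIN_SUCCESS = 0,
	SKETCH_ADMIN_RESOURCE_BUSY = 1,
	SKETCH_ADMIN_OTHER_FAILURE = 2,
};

/* Assumed mapping of the driver-level error codes to errno values. */
#define SKETCH_COM_TRY_AGAIN (-EAGAIN)
#define SKETCH_COM_INVAL     (-EINVAL)

/* Hypothetical helper: translate an admin status into a driver error code. */
static int sketch_admin_status_to_errno(enum sketch_admin_status status)
{
	switch (status) {
	case SKETCH_ADMIN_SUCCESS:
		return 0;
	case SKETCH_ADMIN_RESOURCE_BUSY:
		/* The request may succeed later, so ask the caller to retry. */
		return SKETCH_COM_TRY_AGAIN;
	default:
		return SKETCH_COM_INVAL;
	}
}
```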

Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agonet/ena/base: specify delay operations
Michal Krawczyk [Thu, 17 Sep 2020 05:30:17 +0000 (07:30 +0200)]
net/ena/base: specify delay operations

ENA_MSLEEP() and ENA_UDELAY() expect different behavior: the first one
expects the driver to sleep, while the other expects it to busy wait.

For both cases, the rte_delay_(u|m)s() function was used, which could
either sleep or block, depending on the configuration.

To make the macros valid, the operations should be specified directly.
Because of that, the rte_delay_us_sleep() and rte_delay_us_block() are
now being used.
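
A sketch of the idea, written so that it builds without DPDK: the two
stand-in functions below take the place of rte_delay_us_sleep() and
rte_delay_us_block() and only record which one was called. The SKETCH_*
macro names are hypothetical; the real macro bodies may differ.

```c
/* Which kind of delay was requested last (for demonstration only). */
enum sketch_delay_kind { SKETCH_NONE, SKETCH_SLEEP, SKETCH_BLOCK };

static enum sketch_delay_kind sketch_last_kind = SKETCH_NONE;
static unsigned int sketch_last_us;

/* Stand-in for rte_delay_us_sleep(): would yield the CPU while waiting. */
static void sketch_delay_us_sleep(unsigned int us)
{
	sketch_last_kind = SKETCH_SLEEP;
	sketch_last_us = us;
}

/* Stand-in for rte_delay_us_block(): would busy wait. */
static void sketch_delay_us_block(unsigned int us)
{
	sketch_last_kind = SKETCH_BLOCK;
	sketch_last_us = us;
}

/* Each macro names its behavior explicitly instead of a generic delay. */
#define SKETCH_MSLEEP(ms) sketch_delay_us_sleep((ms) * 1000U)
#define SKETCH_UDELAY(us) sketch_delay_us_block(us)
```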

Fixes: 9ba7981ec992 ("ena: add communication layer for DPDK")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agonet/ena/base: use min/max macros with type conversion
Michal Krawczyk [Thu, 17 Sep 2020 05:30:16 +0000 (07:30 +0200)]
net/ena/base: use min/max macros with type conversion

Usage of RTE_MIN(MAX) in ENA_MIN32, ENA_MIN16, ENA_MIN8 (and same for
the MAX), was not enough, as the HAL code is assuming that those macros
will convert both arguments to the specified uintX_t type.

As RTE_MIN(MAX) is using the 'typeof' operator, the behavior won't be
the same, especially if the arguments have different types (and it could
cause compilation warnings).

To satisfy that, the ENA_MIN_T and ENA_MAX_T macros were added, which
are converting both arguments to the type which is being passed as an
argument.
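
A minimal sketch of type-converting min/max macros in the spirit of the
description above. The SKETCH_* names are hypothetical and this version
does not use RTE_MIN/RTE_MAX, so its arguments are evaluated more than
once.

```c
#include <stdint.h>

/* Cast both arguments to the requested type first, then compare. */
#define SKETCH_MIN_T(type, a, b) \
	((type)(a) < (type)(b) ? (type)(a) : (type)(b))
#define SKETCH_MAX_T(type, a, b) \
	((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/* Fixed-width wrappers built on top of the generic form. */
#define SKETCH_MIN16(a, b) SKETCH_MIN_T(uint16_t, a, b)
#define SKETCH_MAX8(a, b)  SKETCH_MAX_T(uint8_t, a, b)
```

For example, SKETCH_MIN16(65537, 10) yields 1, because 65537 is converted
to uint16_t before the comparison; a typeof-based macro would compare the
original values instead.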

Fixes: 9ba7981ec992 ("ena: add communication layer for DPDK")
Cc: stable@dpdk.org
Signed-off-by: Michal Krawczyk <mk@semihalf.com>
Reviewed-by: Igor Chauskin <igorch@amazon.com>
Reviewed-by: Guy Tzalik <gtzalik@amazon.com>
Reviewed-by: Stephen Hemminger <stephen@networkplumber.org>
4 years agocommon/octeontx2: upgrade mbox definition to version 9
Harman Kalra [Wed, 16 Sep 2020 17:48:19 +0000 (23:18 +0530)]
common/octeontx2: upgrade mbox definition to version 9

Update mailbox data structures to sync with the AF driver mbox
changes made to retrieve the VF's base steering rule.

Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Harman Kalra <hkalra@marvell.com>
4 years agonet/qede: fix milliseconds sleep macro
Devendra Singh Rawat [Mon, 27 Jul 2020 14:16:44 +0000 (19:46 +0530)]
net/qede: fix milliseconds sleep macro

The macro defined for milliseconds sleep was not putting the thread
to sleep and was simply calling a delay routine. This fix redefines
the macro to call the correct rte sleep API.

Fixes: ec94dbc57362 ("qede: add base driver")
Cc: stable@dpdk.org
Signed-off-by: Devendra Singh Rawat <dsinghrawat@marvell.com>
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: Rasesh Mody <rmody@marvell.com>
4 years agonet/qede/base: add missing licence
Ferruh Yigit [Thu, 10 Sep 2020 12:09:05 +0000 (13:09 +0100)]
net/qede/base: add missing licence

Adding BSD-3 SPDX license tag.

Fixes: 519438f7c17f ("net/qede/base: re-arrange few structures for DDC")
Cc: stable@dpdk.org
Reported-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Igor Russkikh <irusskikh@marvell.com>
4 years agonet/hinic/base: fix clock definition with glibc version
Xiaoyun Wang [Mon, 14 Sep 2020 14:31:46 +0000 (22:31 +0800)]
net/hinic/base: fix clock definition with glibc version

Sync with the community patch "fix compile error for old glibc
caused by CLOCK_MONOTONIC_RAW".

Fixes: efeed0894e9c ("net/hinic/base: avoid system time jump")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic/base: get default cos from chip
Xiaoyun Wang [Mon, 14 Sep 2020 14:31:45 +0000 (22:31 +0800)]
net/hinic/base: get default cos from chip

Get default cos of pf driver from chip configuration file.

Fixes: 6691acef0d3d ("net/hinic: support VF")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic: fix Rx nombuf stats
Xiaoyun Wang [Mon, 14 Sep 2020 14:31:44 +0000 (22:31 +0800)]
net/hinic: fix Rx nombuf stats

The rx_mbuf_alloc_failed value is not reset to 0 when stats are read from
the driver, so the counter may be increased every time this op is called.
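
The effect can be illustrated with a small sketch. It assumes that the
device-level counter is rebuilt from per-queue running totals on every
stats read; the structure and function names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_NUM_QUEUES 2

struct sketch_dev {
	/* Per-queue running totals of mbuf allocation failures (assumed). */
	uint64_t queue_nombuf[SKETCH_NUM_QUEUES];
	/* Device-level aggregate reported to the application. */
	uint64_t rx_mbuf_alloc_failed;
};

/* Hypothetical stats read: rebuild the aggregate from the queue totals. */
static uint64_t sketch_stats_get(struct sketch_dev *dev)
{
	unsigned int q;

	/* Without this reset, every call would add the totals once more. */
	dev->rx_mbuf_alloc_failed = 0;
	for (q = 0; q < SKETCH_NUM_QUEUES; q++)
		dev->rx_mbuf_alloc_failed += dev->queue_nombuf[q];

	return dev->rx_mbuf_alloc_failed;
}
```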

Fixes: cb7b6606ebff ("net/hinic: add RSS stats and promiscuous ops")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic: fix TCAM filter set
Xiaoyun Wang [Mon, 14 Sep 2020 14:31:43 +0000 (22:31 +0800)]
net/hinic: fix TCAM filter set

hinic supports two filter methods: linear table and TCAM table.
If enabling the TCAM filter fails but the linear table is fine,
the filter still needs to be enabled, so in this case the
driver should not close the fdir switch.

Fixes: f4ca3fd54c4d ("net/hinic: create and destroy flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic: fix filters on memory allocation failure
Xiaoyun Wang [Mon, 14 Sep 2020 14:31:42 +0000 (22:31 +0800)]
net/hinic: fix filters on memory allocation failure

If rte_zmalloc fails, the PMD should also delete the ntuple
filter, ethertype filter, or normal and TCAM filter that was
already added.
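
A sketch of this rollback pattern, with hypothetical names and a pluggable
allocator so that a failure can be simulated. It is not the hinic code,
only an illustration of undoing the hardware step when the later
allocation fails.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hardware state: number of filters currently installed. */
struct sketch_hw {
	int filters_installed;
};

typedef void *(*sketch_alloc_fn)(size_t size);

static int sketch_hw_add_filter(struct sketch_hw *hw)
{
	hw->filters_installed++;
	return 0;
}

static void sketch_hw_del_filter(struct sketch_hw *hw)
{
	hw->filters_installed--;
}

/* Allocator that always fails, used to exercise the error path. */
static void *sketch_failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/* Install the filter, then allocate the bookkeeping node for the flow. */
static int sketch_flow_create(struct sketch_hw *hw, sketch_alloc_fn alloc,
			      void **node_out)
{
	void *node;
	int ret;

	ret = sketch_hw_add_filter(hw);
	if (ret != 0)
		return ret;

	node = alloc(64);
	if (node == NULL) {
		/* Roll back the filter that was already added. */
		sketch_hw_del_filter(hw);
		return -ENOMEM;
	}

	*node_out = node;
	return 0;
}
```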

Fixes: d7964ce192e7 ("net/hinic: check memory allocations in flow creation")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/sfc: move MCDI helpers to common driver
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:43 +0000 (07:34 +0100)]
net/sfc: move MCDI helpers to common driver

These helpers will be reused by other libefx consumers, e.g. vDPA
driver.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: use MCDI control structure as libefx ops context
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:42 +0000 (07:34 +0100)]
net/sfc: use MCDI control structure as libefx ops context

Now the MCDI helpers interface is independent of the network driver
and may be moved into the common driver.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: add MCDI callback to poll management event queue
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:41 +0000 (07:34 +0100)]
net/sfc: add MCDI callback to poll management event queue

Management event queue polling is required in the case of
MCDI proxy authentication (client driver code).

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: add MCDI callback to schedule restart
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:40 +0000 (07:34 +0100)]
net/sfc: add MCDI callback to schedule restart

MC reboot handling is driver specific.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: add MCDI callbacks to allocate/free DMA memory
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:39 +0000 (07:34 +0100)]
net/sfc: add MCDI callbacks to allocate/free DMA memory

The net driver should use rte_eth_dma_zone_reserve(), but it is an
ethdev-specific API which is not available for vDPA.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: avoid panic in the case of MCDI timeout
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:38 +0000 (07:34 +0100)]
net/sfc: avoid panic in the case of MCDI timeout

Implement dummy MCDI timeout handling which simply rejects
further MCDI requests.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: avoid usage of NIC pointer from adapter context
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:37 +0000 (07:34 +0100)]
net/sfc: avoid usage of NIC pointer from adapter context

Prepare to avoid usage of the adapter context in common MCDI helpers.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: use own logging helper macros
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:36 +0000 (07:34 +0100)]
net/sfc: use own logging helper macros

Network driver logging macros depend on sfc_adapter which is
specific to the driver and cannot be used in common code.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: start to make MCDI helpers interface shareable
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:35 +0000 (07:34 +0100)]
net/sfc: start to make MCDI helpers interface shareable

sfc_adapter is a network driver specific structure which ultimately
should not be used in the shared MCDI helpers interface.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: make MCDI logging helper macros local
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:34 +0000 (07:34 +0100)]
net/sfc: make MCDI logging helper macros local

Prepare to move MCDI helpers to drivers/common.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: move MCDI helper interface to dedicated namespace
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:33 +0000 (07:34 +0100)]
net/sfc: move MCDI helper interface to dedicated namespace

MCDI helpers will be moved to common/sfc_efx and it is better
to do dummy renamings first before non-trivial changes.

Existing functionality should be split into common and network
driver specific parts. Prepare to do it.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: add dedicated header file with MCDI interface
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:32 +0000 (07:34 +0100)]
net/sfc: add dedicated header file with MCDI interface

MCDI helpers will be shared by net and vDPA drivers.
Prepare to move it to common/sfc_efx.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: introduce common driver library
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:31 +0000 (07:34 +0100)]
net/sfc: introduce common driver library

Move libefx (base driver) into common driver.

Prepare to add vDPA driver which will use the common driver as well.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc: include header with debug helpers directly
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:30 +0000 (07:34 +0100)]
net/sfc: include header with debug helpers directly

Avoid build failures on further restructuring.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
4 years agonet/sfc/base: decorate libefx internal extern functions
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:29 +0000 (07:34 +0100)]
net/sfc/base: decorate libefx internal extern functions

The decorator may be used in the future to instruct the linker
to put the functions into dedicated sections or to hide them.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@xilinx.com>
Reviewed-by: Mark Spender <mspender@xilinx.com>
Reviewed-by: Richard Houldsworth <rhouldsw@xilinx.com>
4 years agonet/sfc/base: decorate libefx API functions
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:28 +0000 (07:34 +0100)]
net/sfc/base: decorate libefx API functions

The decorators will be used in the future to mark libefx API
functions as __rte_internal.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Richard Houldsworth <rhouldsw@xilinx.com>
4 years agonet/sfc/base: add missing extern storage-class specifiers
Andrew Rybchenko [Thu, 17 Sep 2020 06:34:27 +0000 (07:34 +0100)]
net/sfc/base: add missing extern storage-class specifiers

libefx coding standard requires it and the specifier is
used for almost all functions in the header file.

Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Reviewed-by: Richard Houldsworth <rhouldsw@xilinx.com>
4 years agodoc: add new SWX pipeline type to release notes
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:10 +0000 (11:20 +0100)]
doc: add new SWX pipeline type to release notes

Add the new SWX pipeline type to the release notes.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add VXLAN encapsulation example
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:09 +0000 (11:20 +0100)]
examples/pipeline: add VXLAN encapsulation example

Add VXLAN encapsulation example to the SWX pipeline application. The
VXLAN tunnels can be generated with the vxlan_table.py script. Example
command line: ./build/pipeline -l0-1 -- -s ./examples/vxlan.cli

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add l2fwd with MAC swap example
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:08 +0000 (11:20 +0100)]
examples/pipeline: add l2fwd with MAC swap example

Add L2 Forwarding example with MAC destination and source address swap
to the SWX pipeline application. Example command line:
./build/pipeline -l0-1 -- -s ./examples/l2fwd_macswp.cli

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add l2fwd example
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:07 +0000 (11:20 +0100)]
examples/pipeline: add l2fwd example

Add L2 Forwarding example to the SWX pipeline application. Example
command line: ./build/pipeline -l0-1 -- -s ./examples/l2fwd.cli

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add configuration commands
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:06 +0000 (11:20 +0100)]
examples/pipeline: add configuration commands

Add CLI commands for application configuration and query.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add message passing mechanism
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:05 +0000 (11:20 +0100)]
examples/pipeline: add message passing mechanism

Add network-based connectivity mechanism for the application to allow
for the exchange of configuration messages through the network as
opposed to local CLI only.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoexamples/pipeline: add new example application
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:04 +0000 (11:20 +0100)]
examples/pipeline: add new example application

Add new example application to showcase the API of the newly
introduced SWX pipeline type.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agotable: add exact match SWX table
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:03 +0000 (11:20 +0100)]
table: add exact match SWX table

Add the exact match table type for the SWX pipeline. Used under the
hood by the SWX pipeline table instruction.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoport: add source and sink SWX ports
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:02 +0000 (11:20 +0100)]
port: add source and sink SWX ports

Add the PCAP file-based source (input) and sink (output) port types
for the SWX pipeline. The sink port is typically used to implement the
packet drop pipeline action. Used under the hood by the pipeline rx
and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agoport: add ethernet device SWX port
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:01 +0000 (11:20 +0100)]
port: add ethernet device SWX port

Add the Ethernet device input/output port type for the SWX pipeline.
Used under the hood by the pipeline rx and tx instructions.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX pipeline specification file
Cristian Dumitrescu [Thu, 1 Oct 2020 10:20:00 +0000 (11:20 +0100)]
pipeline: add SWX pipeline specification file

Add support for building the SWX pipeline based on a specification file
with syntax aligned to the P4 language. The specification file may be
generated by the P4C compiler in the future.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX table update high level API
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:59 +0000 (11:19 +0100)]
pipeline: add SWX table update high level API

High-level transaction-oriented API for SWX pipeline table updates. It
supports multi-table atomic updates, i.e. multiple tables can be
updated in a single step with only the before and after table set
visible to the packets. Uses the lower-level table update mechanisms.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX pipeline flush
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:58 +0000 (11:19 +0100)]
pipeline: add SWX pipeline flush

Flush the packets currently buffered by the SWX pipeline output ports.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX pipeline query API
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:57 +0000 (11:19 +0100)]
pipeline: add SWX pipeline query API

Query API to be used by the control plane to detect the configuration
and state of the SWX pipeline and its internal objects.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX instruction optimizer
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:56 +0000 (11:19 +0100)]
pipeline: add SWX instruction optimizer

Instruction optimizer. Detects frequent patterns and replaces them
with some more powerful vector-like pipeline instructions without any
user effort. Executes at instruction translation, not at run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX instruction verifier
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:55 +0000 (11:19 +0100)]
pipeline: add SWX instruction verifier

Instruction verifier. Executes at instruction translation time during
SWX pipeline build, i.e. at initialization instead of run-time.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: add SWX instruction description
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:54 +0000 (11:19 +0100)]
pipeline: add SWX instruction description

Added SWX instruction set reference table.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX jump and return instructions
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:53 +0000 (11:19 +0100)]
pipeline: introduce SWX jump and return instructions

The jump instructions are either unconditional (jmp) or conditional on
positive/negative tests such as header validity (jmpv/jmpnv), table
lookup hit/miss (jmph/jmpnh), executed action (jmpa/jmpna), equality
(jmpeq/jmpneq), comparison result (jmplt/jmpgt). The return
instruction resumes the pipeline execution after action subroutine.
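
To illustrate the control flow only, here is a toy interpreter with one
register and a handful of opcodes. The encoding, the opcode names and the
single-register model are all hypothetical and are not the SWX instruction
format.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_opcode {
	SK_ADD_IMM, /* reg += imm */
	SK_JMP,     /* unconditional jump */
	SK_JMPEQ,   /* jump if reg == imm */
	SK_JMPLT,   /* jump if reg < imm */
	SK_RETURN,  /* stop and hand control back to the caller */
};

struct sketch_instr {
	enum sketch_opcode op;
	size_t target; /* jump destination (instruction index) */
	uint64_t imm;  /* immediate operand */
};

/* Run the program until SK_RETURN; return the final register value. */
static uint64_t sketch_run(const struct sketch_instr *prog, uint64_t reg)
{
	size_t ip = 0;

	for (;;) {
		const struct sketch_instr *in = &prog[ip];

		switch (in->op) {
		case SK_ADD_IMM:
			reg += in->imm;
			ip++;
			break;
		case SK_JMP:
			ip = in->target;
			break;
		case SK_JMPEQ:
			ip = (reg == in->imm) ? in->target : ip + 1;
			break;
		case SK_JMPLT:
			ip = (reg < in->imm) ? in->target : ip + 1;
			break;
		case SK_RETURN:
		default:
			return reg;
		}
	}
}
```

A conditional jump either moves to its target or falls through to the next
instruction, which is enough to express loops and branches.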

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX extern instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:52 +0000 (11:19 +0100)]
pipeline: introduce SWX extern instruction

The extern instruction calls one of the member functions of a given
extern object or it calls the given extern function. The function
arguments must be written in advance to the mailbox. The results
are available in the same place after execution.
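
The mailbox convention can be sketched as follows. The mailbox layout and
the function below are made up for illustration; they only show arguments
and results sharing one structure.

```c
#include <stdint.h>

/* Hypothetical mailbox: inputs and output live in the same structure. */
struct sketch_mailbox {
	uint32_t a;      /* argument written before the call */
	uint32_t b;      /* argument written before the call */
	uint32_t result; /* filled in by the extern function */
};

/* Extern functions receive only a pointer to their mailbox. */
typedef void (*sketch_extern_fn)(void *mailbox);

/* Example extern function: reads its arguments, writes the result back. */
static void sketch_extern_sum(void *mailbox)
{
	struct sketch_mailbox *m = mailbox;

	m->result = m->a + m->b;
}
```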

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX table instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:51 +0000 (11:19 +0100)]
pipeline: introduce SWX table instruction

The table instruction looks up the input key in the table and then
it triggers the execution of the action found in the table entry. On
lookup miss, the default table action is executed.
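
A rough sketch of this lookup-then-act behavior. The types are
hypothetical, and a linear scan stands in for the real table lookup.

```c
#include <stddef.h>
#include <stdint.h>

/* An action takes its action data and returns an output port. */
typedef uint32_t (*sketch_action_fn)(uint32_t action_data);

struct sketch_entry {
	uint32_t key;
	sketch_action_fn action;
	uint32_t action_data;
};

struct sketch_table {
	const struct sketch_entry *entries;
	size_t n_entries;
	sketch_action_fn default_action; /* executed on lookup miss */
	uint32_t default_data;
};

/* Example actions: forward to a port, or drop (UINT32_MAX marks a drop). */
static uint32_t sketch_fwd(uint32_t port)
{
	return port;
}

static uint32_t sketch_drop(uint32_t unused)
{
	(void)unused;
	return UINT32_MAX;
}

/* Look up the key; run the entry's action on hit, the default on miss. */
static uint32_t sketch_table_exec(const struct sketch_table *t, uint32_t key)
{
	size_t i;

	for (i = 0; i < t->n_entries; i++)
		if (t->entries[i].key == key)
			return t->entries[i].action(t->entries[i].action_data);

	return t->default_action(t->default_data);
}
```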

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX SHR instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:50 +0000 (11:19 +0100)]
pipeline: introduce SWX SHR instruction

The shr (i.e. shift right) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX SHL instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:49 +0000 (11:19 +0100)]
pipeline: introduce SWX SHL instruction

The shl (i.e. shift left) instruction source can be header field (H),
meta-data field (M), extern object (E) or function (F) mailbox field,
table entry action data field (T) or immediate value (I). The
destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
4 years agopipeline: introduce SWX XOR instruction
Cristian Dumitrescu [Thu, 1 Oct 2020 10:19:48 +0000 (11:19 +0100)]
pipeline: introduce SWX XOR instruction

The xor (i.e. bitwise exclusive or) instruction source can be header
field (H), meta-data field (M), extern object (E) or function (F)
mailbox field, table entry action data field (T) or immediate value
(I). The destination is HMEF.

Signed-off-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>