dpdk.git
4 years agonet/vhost: fix interrupt mode
Maxime Coquelin [Fri, 24 Jul 2020 14:45:30 +0000 (16:45 +0200)]
net/vhost: fix interrupt mode

At .new_device() time, only the first vring pair is
now ready, other vrings are configured later.

Problem is that when application will setup and enable
interrupts, only the first queue pair Rx interrupt will
be enabled.

This patches fixes the issue by setting the number of
max interrupts to the number of Rx queues that will be
later initialized. Then, as soon as a Rx vring is ready
and interrupt enabled by the application, it removes the
corresponding uninitialized epoll event, and installs a
new one with the valid FD.

Fixes: 604052ae5395 ("net/vhost: support queue update")

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
4 years agonet/vhost: fix queue update
Maxime Coquelin [Thu, 23 Jul 2020 13:08:54 +0000 (15:08 +0200)]
net/vhost: fix queue update

Now that the vhost library saves the guest notifications
enablement value in its virtqueues metadata, it is not
necessary to do it in the vring_state_changed callback.

One effect of the patch is also to prevent possible
deadlock happening in vhost library.

Fixes: 604052ae5395 ("net/vhost: support queue update")

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
4 years agovhost: fix guest notification setting
Maxime Coquelin [Thu, 23 Jul 2020 13:08:53 +0000 (15:08 +0200)]
vhost: fix guest notification setting

If rte_vhost_enable_guest_notification is called before
the virtqueue is ready, the configuration is lost.

This patch fixes this by saving the guest notification
enablement value requested by the application, and apply
it before the virtqueue is made ready to the application.

Fixes: 604052ae5395 ("net/vhost: support queue update")

Reported-by: Yinan Wang <yinan.wang@intel.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Yinan Wang <yinan.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
4 years agonet/bnxt: remove EEM system memory support
Randy Schacher [Tue, 28 Jul 2020 21:03:42 +0000 (17:03 -0400)]
net/bnxt: remove EEM system memory support

Remove the memory management scheme for Extended Exact Match
using system memory. Using host memory scheme instead which
was the default anyway.

Fixes: b2da02480cb7 ("net/bnxt: support EEM system memory")

Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Peter Spreadborough <peter.spreadborough@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
4 years agoapp/testpmd: fix timestamp init in txonly mode
Viacheslav Ovsiienko [Wed, 29 Jul 2020 12:29:54 +0000 (12:29 +0000)]
app/testpmd: fix timestamp init in txonly mode

The testpmd application forwards data in multiple threads.
In the txonly mode the Tx timestamps must be initialized
on per thread basis to provide phase shift for the packet
burst being sent. This per thread initialization was performed
on zero value of the variable in thread local storage and
happened only once after testpmd forwarding start. Executing
"start" and "stop" commands did not cause thread local variables
zeroing and wrong timestamp values were used.

Fixes: 4940344dab1d ("app/testpmd: add Tx scheduling command")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
4 years agonet: fix IPv6 checksum with TSO
Yuying Zhang [Tue, 28 Jul 2020 17:09:50 +0000 (17:09 +0000)]
net: fix IPv6 checksum with TSO

The ol_flags check lacks of flag for IPv6 which causes checksum
flag configuration error while IPv6/TCP TSO packet is sent.
This patch fixes the issue by adding PKT_TX_TCP_SEG flag.

The rte_net_intel_cksum_flags_prepare() function prepares the
pseudo header checksum in packet data when doing checksum or TSO
offload.

Fixes: 520059a41aa9 ("net: check fragmented headers in non-debug as well")

Signed-off-by: Yuying Zhang <yuying.zhang@intel.com>
Tested-by: Xi Zhang <xix.zhang@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
4 years agonet/ice: calculate TCP header size for offload
Haiyue Wang [Wed, 29 Jul 2020 07:50:39 +0000 (15:50 +0800)]
net/ice: calculate TCP header size for offload

The ice needs the exact TCP header size including options for TCP
checksum offload, but according to PKT_TX_TCP_CKSUM note, l4_len
is not required to be set, so it needs to calculate the TCP header
size if not set.

Fixes: 17c7d0f9d6a4 ("net/ice: support basic Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/ice: fix GTPU down/uplink and extension conflict
Simei Su [Tue, 28 Jul 2020 11:07:56 +0000 (19:07 +0800)]
net/ice: fix GTPU down/uplink and extension conflict

When adding a RSS rule with GTPU_DWN/UP, it will write from top to
bottom for profile due to firmware limitation. If a RSS rule with
GTPU_EH already exists, then GTPU_DWN/UP packet will match GTPU_EH
profile. This patch solves this issue by remembering a gtpu_eh RSS
configure and removing it before the corresponding RSS configure
for downlink/uplink rule is issued.

Fixes: 2e2810fc1868 ("net/ice: fix GTPU RSS")

Signed-off-by: Simei Su <simei.su@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: add memory allocation check in RSS init
Yunjian Wang [Tue, 28 Jul 2020 13:11:27 +0000 (21:11 +0800)]
net/ice: add memory allocation check in RSS init

The function rte_zmalloc() could return NULL, the return
value need to be checked.

Fixes: 50370662b727 ("net/ice: support device and queue ops")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: fix memory leak when releasing VSI
Yunjian Wang [Tue, 28 Jul 2020 13:11:12 +0000 (21:11 +0800)]
net/ice: fix memory leak when releasing VSI

At the end of the vsi release, we should free the 'rss_lut'
and 'rss_key' for the vsi.

Fixes: 50370662b727 ("net/ice: support device and queue ops")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: fix GTPU TEID hash
Jeff Guo [Tue, 28 Jul 2020 09:20:57 +0000 (17:20 +0800)]
net/ice: fix GTPU TEID hash

Refine gtpu teid hash rule mapping for GTPU_IP/GTPU_EH/GTPU_DWN/GTPU_UP.

Fixes: 37e444b77814 ("net/ice: support hash for GTPU protocols")

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/i40e: fix bitmap free
Chenmin Sun [Tue, 28 Jul 2020 12:50:58 +0000 (20:50 +0800)]
net/i40e: fix bitmap free

This patch fixes the coverity warning #361024.
rte_bitmap_free() is not a right way to free a bitmap, replacing
it with rte_free().

Coverity issue: 361024
Fixes: febc61d350bf ("net/i40e: optimize flow director update rate")

Signed-off-by: Chenmin Sun <chenmin.sun@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: fix TCP checksum offload
Haiyue Wang [Tue, 28 Jul 2020 13:42:03 +0000 (21:42 +0800)]
net/ice: fix TCP checksum offload

The L4LEN field of the Descriptor Header Offset for TCP should be the
real length including the TCP options.

Fixes: 17c7d0f9d6a4 ("net/ice: support basic Rx/Tx")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agodoc: add vhost multi-queue reconnection issue
Xuan Ding [Tue, 28 Jul 2020 01:33:47 +0000 (01:33 +0000)]
doc: add vhost multi-queue reconnection issue

This patch added known issue for vhost multi-queue reconnection
with virtio-net/virtio-pmd.

Signed-off-by: Xuan Ding <xuan.ding@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/ifc: support vring update after device config
Chenbo Xia [Tue, 28 Jul 2020 14:32:24 +0000 (14:32 +0000)]
vdpa/ifc: support vring update after device config

The device ready state in vhost lib is now defined as the state
that first queue pair is ready. And kick/callfd may be updated
by QEMU when ifc device is configured.

Although now ifc driver only supports one queue pair, it still
has to update callfd when working with QEMU. This patch fixes
this vring update problem by implementing the set_vring_state
callback.

Suggested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Rosen Xu <rosen.xu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix event queue number query
Xueming Li [Tue, 28 Jul 2020 12:32:29 +0000 (12:32 +0000)]
vdpa/mlx5: fix event queue number query

Vdpa example failed on vq setup, the api to get event queue of specified
core failed.

Internal api devx_query_eqn expects index of event queue vectors, no
need to use cpu id. As the doorbell handling thread is per device, it's
sufficient to use default event queue.

This patch uses the default id(0) as event queue index.

Fixes: 8395927cdfaf ("vdpa/mlx5: prepare HW queues")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agonet/virtio-user: fix status management
Xiao Wang [Tue, 28 Jul 2020 06:52:12 +0000 (14:52 +0800)]
net/virtio-user: fix status management

Apart from the virtio status, there should be also a network related
status for link status management, current implementation mixes up these
two statuses.

One issue caused by this mixup is when virtio-user running in server mode
and vhost as a client connects to it, a RARP packet will be generated by
virtio-user due to VIRTIO_NET_S_ANNOUNCE bit is detected in the "status"
in interrupt handler.

VIRTIO_NET_S_LINK_UP and VIRTIO_NET_S_ANNOUNCE should be managed by a
separated field. This patch adds a "net_status" field for this purpose.

Fixes: e9efa4d93821 ("net/virtio-user: add new virtual PCI driver")
Cc: stable@dpdk.org
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix completion queue initialization
Xueming Li [Mon, 27 Jul 2020 14:29:53 +0000 (14:29 +0000)]
vdpa/mlx5: fix completion queue initialization

Vdpa device failed to initialize 2nd VQ during setup. From FW syndrome,
unsupported CQE size was specified in CQ initialization attributes.

The unsupported CQE size comes from uninitialized stack struct data, and
the struct has new fields defined recently which are not initialized in
vdpa code.

This patch initializes cq creation attributes with zero to avoid such
random data.

Fixes: 79a7e409a2f6 ("common/mlx5: prepare support of packet pacing")

Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix notification timing
Matan Azrad [Mon, 27 Jul 2020 14:00:44 +0000 (14:00 +0000)]
vdpa/mlx5: fix notification timing

The issue is relevant only for the timer event modes: 0 and 1.

When the HW finishes to consume a burst of the guest Rx descriptors,
it creates a CQE in the CQ.
When traffic stops, the mlx5 driver arms the CQ to get a notification
when a specific CQE index is created - the index to be armed is the
next CQE index which should be polled by the driver.

The mlx5 driver configured the kernel driver to send notification to
the guest callfd in the same time of the armed CQE event.
It means that the guest was notified only for each first CQE in a
poll cycle, so if the driver polled CQEs of all the virtio queue
available descriptors, the guest was not notified again for the rest
because there was no any new CQE to trigger the guest notification.

Hence, the Rx queues might be stuck when the guest didn't work with
poll mode.

Remove prior kernel notification, and do manual notification after CQ
polling.

Fixes: a9dd7275a149 ("vdpa/mlx5: optimize notification events")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix steering update in virtq unset
Matan Azrad [Mon, 27 Jul 2020 08:07:59 +0000 (08:07 +0000)]
vdpa/mlx5: fix steering update in virtq unset

When a virtq is destroyed by the driver, it must be removed from the
steering RQT which holds its reference.

The driver didn't remove the virtq from RQT before destroying it what
caused HW syndrome in virtq unset.

Remove the virtq from RQT before destroying it.

Fixes: 9f09b1ca15c5 ("vdpa/mlx5: recreate a virtq becoming enabled")
Cc: stable@dpdk.org
Signed-off-by: Xueming Li <xuemingl@mellanox.com>
Signed-off-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agodoc: add a supported NIC in mlx5 vDPA guide
Sergey Madaminov [Sat, 25 Jul 2020 03:03:07 +0000 (23:03 -0400)]
doc: add a supported NIC in mlx5 vDPA guide

Update the docs, adding MCX621102AN-ADAT to the list of NICs supported
by MLX5 vDPA driver.

Suggested-by: William Tu <u9012063@gmail.com>
Signed-off-by: Sergey Madaminov <sergey.madaminov@gmail.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix live migration termination
Matan Azrad [Fri, 24 Jul 2020 12:07:11 +0000 (12:07 +0000)]
vdpa/mlx5: fix live migration termination

There are a lot of per virtq operations in the live migration
handling.

Before the driver support for queue update, when a virtq was not valid,
all the LM handling was terminated.

But now, when the driver supports queue update, the virtq can be invalid
as legal stage.

Skip invalid virtq in LM handling.

Fixes: c47d6e83334e ("vdpa/mlx5: support queue update")

Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Xueming Li <xuemingl@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovhost: fix async callback return type
Patrick Fu [Thu, 23 Jul 2020 05:39:06 +0000 (13:39 +0800)]
vhost: fix async callback return type

The async copy device callbacks are used by async APIs to transfer data
and check completion status. Async APIs return the number of packets
successfully processed to the caller applications and no error
(negative) value is allowed for API return value. Thus, negative return
values from async device callbacks don't have meaningful usage, while
adding overhead in checking the return value validity. This patch change
the callback return values from "int" to "uint32_t" to get aligned with
async API definition.

Fixes: 78639d54563a ("vhost: introduce async enqueue registration API")

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agodoc: describe async API in vhost guide
Patrick Fu [Wed, 22 Jul 2020 15:01:52 +0000 (23:01 +0800)]
doc: describe async API in vhost guide

Update vhost guides to document vhost async APIs

Signed-off-by: Patrick Fu <patrick.fu@intel.com>
Reviewed-by: Chenbo Xia <chenbo.xia@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agovdpa/mlx5: fix compatibility with MISC4
Bing Zhao [Tue, 21 Jul 2020 08:13:40 +0000 (16:13 +0800)]
vdpa/mlx5: fix compatibility with MISC4

When dynamic flex parser feature is introduced, the support for misc
parameters 4 of flow table entry (FTE) match set is needed. The
structure of "mlx5_ifc_fte_match_param_bits" is extended with
"mlx5_ifc_fte_match_set_misc4_bits" at the end of it. The total size
of the FTE match set will be changed into 384 bytes from 320 bytes.
Low level user space driver (rdma-core) will have the validation of
the length of FTE match set. In the old release that no MISC4
supported in the rdma-core, and this will break the backward
compatibility, even if the MISC4 is not used in most cases, like
in vDPA driver.
In order not to break the compatibility old rdma-core, the length
adjustment needs to be done. In mlx5 vDPA driver, the lengths of
the matcher and value are both set to 320 without MISC4. There is
no need to change the structure definition, all bytes of the MISC4
will be discarded if it is not needed. Since the MISC4 parameter
is aligned with a 64B boundary and so does the whole FTE match set
parameter, there is no need to take any padding and alignment into
consideration when calculating the size.

Fixes: daa38a8924a0 ("net/mlx5: add flow translation of eCPRI header")

Signed-off-by: Bing Zhao <bingz@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
4 years agonet/bnxt: fix build with extra cflags
Ajit Khaparde [Fri, 24 Jul 2020 06:40:01 +0000 (23:40 -0700)]
net/bnxt: fix build with extra cflags

When we compile PMD with CFLAGS set to -O -g, build fails because of
uninitialized error. This patch fixes it.

Bugzilla ID: 509
Fixes: 1e46b3962620 ("net/bnxt: fill cfa action in Tx descriptor")

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
4 years agonet/bnxt: fix logical AND in if condition
Ajit Khaparde [Fri, 24 Jul 2020 00:04:37 +0000 (17:04 -0700)]
net/bnxt: fix logical AND in if condition

The if condition in bnxt_restore_mac_filters needs to check for
the result of logical AND. But it was not doing it resulting in
an incorrect check.

Fixes: b02f1573cd07 ("net/bnxt: restore MAC filters during reset recovery")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: cleanup VF representor device operations
Somnath Kotur [Thu, 23 Jul 2020 11:56:39 +0000 (17:26 +0530)]
net/bnxt: cleanup VF representor device operations

No need to access rx_cfa_code, cfa_code_map from the VF-Rep functions
anymore.

Fixes: 322bd6e70272 ("net/bnxt: add port representor infrastructure")

Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: enable shadow tables during session open
Mike Baucom [Thu, 23 Jul 2020 11:56:38 +0000 (17:26 +0530)]
net/bnxt: enable shadow tables during session open

Turn on shadow memory in the core to allow search before allocate.
This allows reuse of constrained resources.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add templates for search before alloc
Mike Baucom [Thu, 23 Jul 2020 11:56:37 +0000 (17:26 +0530)]
net/bnxt: add templates for search before alloc

Search before alloc allows reuse of constrained resources such as tcam,
encap, and source modifications.  The new templates will search the
entry and alloc only if necessary.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add TCAM table processing for search and alloc
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:36 +0000 (17:26 +0530)]
net/bnxt: add TCAM table processing for search and alloc

Added support for tcam table processing to enable the search
and allocate support. This also includes the tcam entry update
support.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
4 years agonet/bnxt: fix FW rule deletion on representor create
Venkat Duvvuru [Thu, 23 Jul 2020 11:56:34 +0000 (17:26 +0530)]
net/bnxt: fix FW rule deletion on representor create

Truflow stack adds VFR to VF and VF to VFR conduits when VF
representor is created. However, in the ingress direction the
VF's fw rules conflict with Truflow rules, resulting in not hitting
the Truflow VFR rules. To fix this, fw is going to remove it’s
VF rules when vf representor is created in Truflow mode and will
restore the removed rules when vf representor is destroyed.

This patch invokes the vf representor alloc and free hwrm commands
as part of which fw will do the above mentioned actions.

Fixes: 1e18ec58ed5c ("net/bnxt: create default flow rules for port reprentor")

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Shahaji Bhosle <sbhosle@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: fix port default rule create/destroy
Venkat Duvvuru [Thu, 23 Jul 2020 11:56:33 +0000 (17:26 +0530)]
net/bnxt: fix port default rule create/destroy

Currently, the flow_ids of port_to_app/app_to_port & tx_cfa_action
for the first port are getting over-written by the second port because
these fields are stored in the ulp context which is common across the
ports.

This patch fixes the problem by having per port structure to store these
fields.

Fixes: 9f702636d7ba ("net/bnxt: add port default rules for ingress and egress")

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
4 years agonet/bnxt: modify ULP mapper to use table search
Mike Baucom [Thu, 23 Jul 2020 11:56:32 +0000 (17:26 +0530)]
net/bnxt: modify ULP mapper to use table search

Modified ulp mapper to use the new tf_search_tbl_entry API.
When search before allocation is requested, mapper calls
tc_search_tbl_entry with the alloc flag.

- On HIT, the result and table index is returned.
- On MISS, the table index is returned but the result is
created and the table entry is set.
- On REJECT, the flow request is rejected.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add shadow table capability with search
Mike Baucom [Thu, 23 Jul 2020 11:56:31 +0000 (17:26 +0530)]
net/bnxt: add shadow table capability with search

- Added Index Table shadow tables for searching
- Added Search API to allow reuse of Table entries

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
4 years agonet/bnxt: update shadow TCAM to use TruFlow hash
Mike Baucom [Thu, 23 Jul 2020 11:56:30 +0000 (17:26 +0530)]
net/bnxt: update shadow TCAM to use TruFlow hash

Removed the hash calculation from tf_shadow_tcam in favor of using a
new common implementation.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add egress template with VLAN tag match
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:29 +0000 (17:26 +0530)]
net/bnxt: add egress template with VLAN tag match

Added egress template with VLAN tag match

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Shahaji Bhosle <sbhosle@broadcom.com>
4 years agonet/bnxt: ignore VLAN priority mask
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:28 +0000 (17:26 +0530)]
net/bnxt: ignore VLAN priority mask

This is a work around for the OVS setting offload rules that
are passing vlan priority mask as wild card and currently we
do not support it.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
4 years agonet/bnxt: configure PARIF for egress rules
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:27 +0000 (17:26 +0530)]
net/bnxt: configure PARIF for egress rules

The parif for the egress rules need to be dynamically
configured based on the port type.
PARIF is handler to a partition of the physical port.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
4 years agonet/bnxt: fix NAT template
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:26 +0000 (17:26 +0530)]
net/bnxt: fix NAT template

The template is updated to support additional combinations
of NAT actions.

Fixes: 2951f7f31112 ("net/bnxt: support NAT action items")

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: fix mark id update to mbuf
Venkat Duvvuru [Thu, 23 Jul 2020 11:56:25 +0000 (17:26 +0530)]
net/bnxt: fix mark id update to mbuf

When a packet is looped back from VF to VFR, it is marked to identify
the VFR interface. However, this mark_id shouldn't be percolated up to
the OVS as it is internal to pmd.
This patch fixes it by skipping mark injection into mbuf if the packet
is received on VFR interface.

Fixes: 6dc83230b43b ("net/bnxt: support port representor data path")

Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com>
Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
4 years agonet/bnxt: add TruFlow hash function
Mike Baucom [Thu, 23 Jul 2020 11:56:24 +0000 (17:26 +0530)]
net/bnxt: add TruFlow hash function

Added TruFlow hash API for common hash uses across TruFlow
core functions.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Farah Smith <farah.smith@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
4 years agonet/bnxt: modify ULP mapper to use TCAM search
Mike Baucom [Thu, 23 Jul 2020 11:56:23 +0000 (17:26 +0530)]
net/bnxt: modify ULP mapper to use TCAM search

Modified ulp mapper to use the new tf_search_tcam_entry API.
When search before allocation is requested, mapper calls
tc_search_tcam_entry with the alloc flag.

- On HIT, the result and tcam index is returned.
- On MISS, the tcam index is returned but the result is
created and the tcam entry is set.
- On REJECT, the flow request is rejected.

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: configure PARIF for offload miss rules
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:22 +0000 (17:26 +0530)]
net/bnxt: configure PARIF for offload miss rules

PARIF is handler to a partition of the physical port.
For the offload miss rules, the parif miss path needs to be
considered. The higher parif are reserved for handling this.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add access to NAT global register
Kishore Padmanabha [Thu, 23 Jul 2020 11:56:21 +0000 (17:26 +0530)]
net/bnxt: add access to NAT global register

Add support to enable or disable the NAT global registers.
The NAT feature is enabled in hardware during initialization
and disabled at deinitialization of the application.

Signed-off-by: Kishore Padmanabha <kishore.padmanabha@broadcom.com>
Reviewed-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/bnxt: add shadow and search capability to TCAM
Mike Baucom [Thu, 23 Jul 2020 11:56:20 +0000 (17:26 +0530)]
net/bnxt: add shadow and search capability to TCAM

- Add TCAM shadow tables for searching
- Add Search API to allow reuse of TCAM entries

Signed-off-by: Mike Baucom <michael.baucom@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
4 years agonet/mlx4: fix premature disabling of interrupt
Ophir Munk [Tue, 28 Jul 2020 11:54:06 +0000 (11:54 +0000)]
net/mlx4: fix premature disabling of interrupt

RXQ interrupts under Linux are based on the epoll mechanism. An
expected order of operations is as follows:
1. Call rte_eth_dev_rx_intr_enable(), to arm the CQ for receiving events
on data input.
2. Block on rte_epoll_wait() with an array of file descriptors
representing the CQ events. Upon data arrival the kernel will signal an
input event on the corresponding CQ fd.
3. Call rte_eth_dev_rx_intr_disable() after the event was received and
continue in polling mode. The mlx4 implementation of
rte_eth_dev_rx_intr_disable() is to get the CQ event and ack it.

In practice applications may wake up from rte_epoll_wait() due to
timeout with no event to ack but still call
rte_eth_dev_rx_intr_disable() unconditionally.  In such cases the call
should return EAGAIN (since the file descriptors are non-blocked), as
opposed to EINVAL which indicates a real failure.  In case of EAGAIN the
PMD should not warn on "unable to disable interrupt on rx queue".

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
4 years agonet/hinic: check memory allocations in flow creation
Yunjian Wang [Tue, 28 Jul 2020 12:34:46 +0000 (20:34 +0800)]
net/hinic: check memory allocations in flow creation

The function rte_zmalloc() could return NULL, the return
value need to be checked.

Fixes: f4ca3fd54c4d ("net/hinic: create and destroy flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
4 years agonet/hinic/base: avoid system time jump
Xiaoyun Wang [Sat, 25 Jul 2020 08:15:36 +0000 (16:15 +0800)]
net/hinic/base: avoid system time jump

Replace gettimeofday() with clock_gettime(CLOCK_MONOTONIC_RAW, &now),
the reason is same with
commit d08d304508a8 ("eal/linux: make alarm not affected by system time
jump")

Fixes: 81d53291a466 ("net/hinic/base: add various headers")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic/base: modify VHD type for SDI
Xiaoyun Wang [Sat, 25 Jul 2020 08:15:35 +0000 (16:15 +0800)]
net/hinic/base: modify VHD type for SDI

For ovs offload scenario, when fw processes the virtio header,
there is no need to offset; and for standard card scenarios,
fw does not care about the vhd_type parameter, so in order to
be compatible with these two scenarios, use 0 offset instead.

Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic: optimize Rx performance for x86
Xiaoyun Wang [Sat, 25 Jul 2020 08:15:34 +0000 (16:15 +0800)]
net/hinic: optimize Rx performance for x86

For x86 platform, the rq cqe without cache aligned, which can
improve performance for some gateway scenarios.

Fixes: 361a9ccf81d6 ("net/hinic: optimize Rx performance")
Cc: stable@dpdk.org
Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/hinic: refactor checksum functions
Xiaoyun Wang [Sat, 25 Jul 2020 08:15:33 +0000 (16:15 +0800)]
net/hinic: refactor checksum functions

Encapsulate different types of packet checksum preprocessing
into functions.

Signed-off-by: Xiaoyun Wang <cloud.wangxiaoyun@huawei.com>
4 years agonet/mlx5: fix Rx interrupt handling and cleanup
Dekel Peled [Mon, 27 Jul 2020 08:50:47 +0000 (11:50 +0300)]
net/mlx5: fix Rx interrupt handling and cleanup

Recent patch added creation of Rx CQ using DevX API.
The reading of events from DevX channel was not done correctly.
This patch fixes the event reading, using the correct data structure.
Cleanup after CQ creation, in case of error, is also updated.

Fixes: 08d1838f645a ("net/mlx5: implement CQ for Rx using DevX API")

Signed-off-by: Dekel Peled <dekelp@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/ice: fix hash action validation
Jeff Guo [Mon, 27 Jul 2020 07:58:31 +0000 (15:58 +0800)]
net/ice: fix hash action validation

An invalid rule should not be validated successfully. If the rule is not
in the supporting list, just return failure to application.

Fixes: 5ad3db8d4bdd ("net/ice: enable advanced RSS")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/ice: remove RSS for SCTP in PPPoE
Qi Zhang [Mon, 27 Jul 2020 05:16:04 +0000 (13:16 +0800)]
net/ice: remove RSS for SCTP in PPPoE

We don't support SCTP in PPPoE RSS, remove it.

Fixes: d117de460035 ("net/ice: fix GTPU/PPPoE packets with no hash value")
Fixes: 0b952714e9c1 ("net/ice: refactor PF hash flow")

Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Nannan Lu <nannan.lu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
4 years agonet/ice/base: fix RSS interference
Qi Zhang [Sat, 25 Jul 2020 12:14:24 +0000 (20:14 +0800)]
net/ice/base: fix RSS interference

A new symmetric RSS rule may force another asymmetric rule to be
symmetric, vice versa. The reason is due to the flow engine will
try to reuse the existing profile if the input set matches with the
new rule. The fix is to disable this optimization for RSS since we
are not at the situation as profile shortage.

Fixes: ddae0440353f ("net/ice/base: enable symmetric hash for RSS")
Cc: stable@dpdk.org
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Tested-by: Nannan Lu <nannan.lu@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
4 years agonet/i40e: enable QinQ stripping
Wei Zhao [Fri, 24 Jul 2020 07:01:37 +0000 (15:01 +0800)]
net/i40e: enable QinQ stripping

This patch enable i40e outer vlan strip on and off in QinQ
mode with mask bit of DEV_RX_OFFLOAD_QINQ_STRIP, users can
use "vlan set qinq_strip on 0" to enable or "vlan set
qinq_strip off 0" to disable i40e outer vlan strip when
try with testpmd app.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Reviewed-by: Qiming Yang <qiming.yang@intel.com>
4 years agonet/af_xdp: remove mempool freeing on umem destruction
Ciara Loftus [Fri, 24 Jul 2020 13:20:32 +0000 (13:20 +0000)]
net/af_xdp: remove mempool freeing on umem destruction

Other PMDs may be using the mempool, so don't free it when destroying the
UMEM.

Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
Cc: stable@dpdk.org
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
4 years agonet/mlx5: fix dynamic inline hint handling
Gregory Etelson [Thu, 23 Jul 2020 14:21:04 +0000 (17:21 +0300)]
net/mlx5: fix dynamic inline hint handling

The ConnectX NICs can transfer data from the host memory with two
approaches: provide the pointer to the data buffer, or do data inline
- copy the data to the transmit descriptor (WQE) entirely or only the
part of data. In some configurations the NIC hardware requires the
minimal data to be inline in the descriptor to operate correctly. And
there is the special dynamic flag to hint PMD not to inline the data
(for example, if buffer is located on some other device - storage or
GPU) on per packet basis.

If there was a packet with length shorter than the minimal inline data
length requested by the NIC hardware and the no-inline hint was set
the PMD tried to inline the packet with minimal required length
instead of actual packet's one.  This patch adds the missed length
check into no-inline hint handling branch.

Fixes: cacb44a09962 ("net/mlx5: add no-inline Tx flag")
Cc: stable@dpdk.org
Signed-off-by: Gregory Etelson <getelson@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/mlx5: fix metadata storing for NEON Rx
Viacheslav Ovsiienko [Thu, 23 Jul 2020 10:53:32 +0000 (10:53 +0000)]
net/mlx5: fix metadata storing for NEON Rx

There was the typo introducing the bug, affected the mlx5 vectorized
rx_burst on ARM architectures in case if CQE compression was enabled.

Fixes: 6c55b622a956 ("net/mlx5: set dynamic flow metadata in Rx queues")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/mlx5: fix UAR memory mapping type
Viacheslav Ovsiienko [Wed, 22 Jul 2020 14:59:08 +0000 (14:59 +0000)]
net/mlx5: fix UAR memory mapping type

The User Access Region is a special mechanism to provide direct
access to the hardware registers, and is the part of PCI address
space that is mapped to CPU virtual address. The mapping can be
performed with the type "Write-Combining" or "Non-Cached", and
these ones might be supported or not on different setups.

To prevent device probing failure the UAR allocation attempt
with alternative mapping type is performed. The datapath
takes the actual UAR mapping into account on queue creation.

There was another issue with NULL UAR base address.
OFED 5.0.x and Upstream rdma_core before v29 returned the NULL as
UAR base address if UAR was not the first object in the UAR page.
It caused the PMD failure and we should try to get another UAR
till we get the first one with non-NULL base address returned.

Fixes: fc4d4f732bbc ("net/mlx5: introduce shared UAR resource")

Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
4 years agonet/i40e: fix hash lookup table
Shougang Wang [Fri, 24 Jul 2020 09:38:34 +0000 (09:38 +0000)]
net/i40e: fix hash lookup table

The hash look up table (LUT) is managed by global register but it is not
initialized when RSS is disabled. Once user wants to enable RSS during
runtime, the LUT will not be initialized.
This patch fixes the issue by initializing the LUT whatever RSS enabled
or not.

Fixes: feaae285b342 ("net/i40e: support hash configuration in RSS flow")
Cc: stable@dpdk.org
Signed-off-by: Shougang Wang <shougangx.wang@intel.com>
Tested-by: Xi Zhang <xix.zhang@intel.com>
Acked-by: Jeff Guo <jia.guo@intel.com>
4 years agonet/iavf: delete unsupported RSS types
Jeff Guo [Fri, 24 Jul 2020 04:07:02 +0000 (12:07 +0800)]
net/iavf: delete unsupported RSS types

The combined hash type should be bound with prefix protocol when
configure it, so delete some useless and unsupported part for
rss types mapping.

Fixes: 7be10c3004be ("net/iavf: add RSS configuration for VF")
Cc: stable@dpdk.org
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/qede: remove dead code
Yunjian Wang [Fri, 17 Jul 2020 11:16:23 +0000 (19:16 +0800)]
net/qede: remove dead code

This patch removes logically dead code reported by coverity.

Coverity issue: 261777, 261778
Fixes: dd28bc8c6ef4 ("net/qede: fix VF port creation sequence")
Cc: stable@dpdk.org
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Rasesh Mody <rmody@marvell.com>
4 years agonet/iavf: add GTPU in default hash
Jeff Guo [Fri, 24 Jul 2020 02:21:52 +0000 (10:21 +0800)]
net/iavf: add GTPU in default hash

Add GTPU_IP and GTPU_EH hash in default.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/i40e: fix flow director MSI-X resource allocation
Mao Jiang [Thu, 23 Jul 2020 16:11:52 +0000 (00:11 +0800)]
net/i40e: fix flow director MSI-X resource allocation

FDIR allocating msix resource is not strictly necessary, if no
resource left, jump the error.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Mao Jiang <maox.jiang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/i40e: fix binding interrupt without MSI-X vector
Mao Jiang [Thu, 23 Jul 2020 15:27:10 +0000 (23:27 +0800)]
net/i40e: fix binding interrupt without MSI-X vector

The value of vsi->nb_msix shouldn't`t be zero, otherwise, all of
interrupts will be bind to vector 0.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Mao Jiang <maox.jiang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agodoc: update release notes for hns3 driver
Wei Hu (Xavier) [Wed, 22 Jul 2020 11:56:32 +0000 (19:56 +0800)]
doc: update release notes for hns3 driver

Add release notes for Hisilicon hns3 PMD driver.

Signed-off-by: Wei Hu (Xavier) <xavier.huwei@huawei.com>
4 years agodoc: announce dpaa-specific API parameter change
Sachin Saxena [Tue, 14 Jul 2020 11:32:42 +0000 (17:02 +0530)]
doc: announce dpaa-specific API parameter change

'port_id' storage size should be 'uint16_t', the API
'rte_pmd_dpaa_set_tx_loopback()' has it as 'uint8_t' but fixing it is an
ABI breakage, that is why planning the fix in v20.11 release where ABI
breakage is allowed.

Signed-off-by: Sachin Saxena <sachin.saxena@nxp.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Acked-by: Zhiyong Yang <zhiyong.yang@intel.com>
4 years agonet/mlx5: fix VF MAC address set over BlueField
Raslan Darawsheh [Wed, 22 Jul 2020 09:07:55 +0000 (12:07 +0300)]
net/mlx5: fix VF MAC address set over BlueField

When trying to set MAC address of an ethernet device and if it was
a representor, PMD sets the MAC over the corresponding VF instead.

For the case of HPF (Host PF representor on BlueField), PMD shouldn't
attempt to set it, since it doesn't have any corresponding VF and fails.

This will fix the issue by setting the MAC on the dev directly.

Fixes: 0d1d73170820 ("net/mlx5: set VF MAC address from host")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/mlx5: fix vectorized mini-CQE prefetching
Alexander Kozyrev [Wed, 22 Jul 2020 20:32:38 +0000 (20:32 +0000)]
net/mlx5: fix vectorized mini-CQE prefetching

There was an optimization work to prefetch all the CQEs before
their invalidation. It allowed us to speed up the mini-CQE
decompression process by preheating the cache in the vectorized
Rx routine.

Prefetching of the next mini-CQE, on the other hand, showed
no difference in the performance on x86 platform. So, that was
removed. Unfortunately this caused the performance drop on ARM.

Prefetch the mini-CQE as well as all the soon to be
invalidated CQEs to get both CQE and mini-CQE on the hot path.

Fixes: 28a4b96321a3 ("net/mlx5: prefetch CQEs for a faster decompression")
Cc: stable@dpdk.org
Signed-off-by: Alexander Kozyrev <akozyrev@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/iavf: disable simple XOR RSS hash function
Junfeng Guo [Thu, 23 Jul 2020 11:10:25 +0000 (11:10 +0000)]
net/iavf: disable simple XOR RSS hash function

Function simple_xor for AVF RSS is not required currently, thus we
just return rte_flow error when the command line has item simple_xor.

Signed-off-by: Junfeng Guo <junfeng.guo@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agonet/mlx5: optimize stack memory in probe
Michael Baum [Tue, 21 Jul 2020 12:05:16 +0000 (12:05 +0000)]
net/mlx5: optimize stack memory in probe

The device configuration struct is not small enough to be used as
function argument by value.

Call spawn function with device configuration by reference.

Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix unnecessary init in mark conversion
Michael Baum [Tue, 21 Jul 2020 12:04:37 +0000 (12:04 +0000)]
net/mlx5: fix unnecessary init in mark conversion

The flow_dv_convert_action_mark function defines an array of
field_modify_info structures and initializes the first entity.

In the first entity id field, it initializes to 0, even though its type
is an enum that has no value of 0.
In fact, the function does not use this id field before assigning the
appropriate register id into it, so the initialization is unnecessary.
Moreover, this initialization is int into enum, and it would be better
not to create a type conflict for no reason.

Wait for the first entity initialization until the appropriate register
id is already known.

Fixes: 55deee1715f0 ("net/mlx5: extend flow mark support")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: optimize critical section in device free
Michael Baum [Tue, 21 Jul 2020 12:03:38 +0000 (12:03 +0000)]
net/mlx5: optimize critical section in device free

When PMD releases shared IB device context, It locks the
mlx5_ibv_list_mutex lock throughout the function so that it does not
happen while removing a device from the list, another process will try
to insert another device into it.
On the other hand, having removed the device from the list even if it
has not yet released all of its resources, it should not care about
other processes and can release the lock.

However, the PMD does not release the lock even though it can, and
performs a number of operations, some of which include sleep and may be
long.
To improve this, shorten the lock time to the minimum necessary.

Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix unlimited parsing of switch info
Michael Baum [Tue, 21 Jul 2020 12:02:32 +0000 (12:02 +0000)]
net/mlx5: fix unlimited parsing of switch info

In mlx5_sysfs_switch_info function, the driver gets switch information
associated with network interface.

The driver writes the port name into buffer and translates it.
However, when it writes the name, it does not limit writing to the
buffer size.

Limit writing to the size of the buffer.

Fixes: 1256805dd54d ("net/mlx5: move Linux-specific functions")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx4: optimize stack memory size in probe
Michael Baum [Tue, 21 Jul 2020 12:01:09 +0000 (12:01 +0000)]
net/mlx4: optimize stack memory size in probe

The mlx4_pci_probe function sets a pointer to the mlx4_priv structure,
and during that function fills its fields one by one with relevant
values.

It wants to put a value in the intr_handle field that has all its fields
zero except 2. To do so, it initializes a local struct rte_intr_handle
type variable and updates it only 2 fields and assigns it into the
appropriate field. However, it initializes a very large structure on the
stack while not at all certain that this place exists and in any case it
is very wasteful.

Reset all fields directly to the pointer by memset, then format the 2
fields to the relevant values.

Fixes: 63c2f23c852a ("net/mlx4: use a single interrupt handle")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: remove ineffective increment in hairpin split
Michael Baum [Tue, 21 Jul 2020 12:00:04 +0000 (12:00 +0000)]
net/mlx5: remove ineffective increment in hairpin split

The flow_hairpin_split function defines a pointer called addr that
points to the list of items.
When the function wants to progress in the list, it adds the size of an
item to the pointer.

At the end of the function, it precedes the pointer one more time even
though it is not used afterwards. In fact, this line is unaffected and
the operation of the function would have been no different without it.

Remove the line where the pointer is preceded unnecessarily.

Fixes: d85c7b5ea59f ("net/mlx5: split hairpin flows")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix crash in NVGRE item translation
Michael Baum [Tue, 21 Jul 2020 11:59:04 +0000 (11:59 +0000)]
net/mlx5: fix crash in NVGRE item translation

The flow_dv_translate_item_nvgre function add NVGRE item to matcher and
to the value.
It defines a pointer named nvrge_m that receives the item's mask into
it, and then copies some of it to the matcher.

Before copying, it checks for mask validation, and in case the mask is
NULL the function gives it a pointer to rte_flow_item_nvgre_mask.
However, the function calls from the vni mask's field before the check,
and if there is no mask, it actually does dereference to the NULL
pointer and indeed the program crashes with segfault.

Move the call from the vni field to post-validation.

Fixes: cd18e1b72f73 ("net/mlx5: fix build on Arm")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix initialization of steering registers
Michael Baum [Tue, 21 Jul 2020 11:57:21 +0000 (11:57 +0000)]
net/mlx5: fix initialization of steering registers

The mlx5_flow_action_copy_mreg structure contains a field called src
type enum modify_reg, similarly the mlx5_rte_flow_item_tag field
contains a field called id type enum modify_reg.
The enum modify_reg variable represents different registers in the
system and it also has a field called REG_NONE whose value is 0 which
means that the register does not exist.

The flow_mreg_add_copy_action function sets a variable of struct
mlx5_flow_action_copy_mreg type, and initializes the src field to be 0.
Similarly the flow_create_split_metadata function sets a variable of
struct mlx5_rte_flow_item_tag type and initializes the id field to be 0.
In both functions, they initialize a enum modify_reg type variable with
an int type value while modify_reg has an appropriate field for that
value (REG_NONE).

Replace assigning 0 with REG_NONE in both functions.

Fixes: dd3c774f6ffb ("net/mlx5: add metadata register copy table")
Fixes: 71e254bc0294 ("net/mlx5: split Rx flows to provide metadata copy")
Cc: stable@dpdk.org
Signed-off-by: Michael Baum <michaelba@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix counter query
Suanming Mou [Wed, 22 Jul 2020 07:58:47 +0000 (15:58 +0800)]
net/mlx5: fix counter query

Currently, the counter query requires the counter ID should start
with 4 aligned. In none-batch mode, the counter pool might have the
chance to get the counter ID not 4 aligned. In this case, the counter
should be skipped, or the query will be failed.

Skip the counter with ID not 4 aligned as the first counter in the
none-batch count pool to avoid invalid counter query. Once having
new min_dcs ID in the poll less than the skipped counters, the
skipped counters will be returned to the pool free list to use.

Fixes: 5382d28c2110 ("net/mlx5: accelerate DV flow counter transactions")
Cc: stable@dpdk.org
Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: separate aging counter pool range
Suanming Mou [Wed, 22 Jul 2020 07:58:45 +0000 (15:58 +0800)]
net/mlx5: separate aging counter pool range

Currently, when allocate the counter or counter based age from group 0,
counter and age may share the same counter dcs ID range. Both age and
pure counter need to sync up with each other's container to check if
the ID range exists and update the min_dcs.

It comes two disadvantages:
1. If the ID range is shared, this counter range will be queried twice
   both from age and pure counter container in 1s.
2. The same range counter check between the two container makes the
   counter allocate sync min_dcs time to time with extra min_dcs
   updating.

This patch avoid the same ID range to be shared when allocate the new
pool. If the same ID range exists in other container, just add the
counter to the other container until get new range which saves the
min_dcs sync up time to time.

Signed-off-by: Suanming Mou <suanmingm@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agocommon/mlx5: fix queue doorbell record size
Viacheslav Ovsiienko [Tue, 21 Jul 2020 11:11:29 +0000 (11:11 +0000)]
common/mlx5: fix queue doorbell record size

When Rx/Tx queue was being created with DevX the allocated
doorbell record size was only uint64_t. That was definitely
less than size of CPU cacheline and it might have happened the
doorbell records attached to different queues handled by
different cores were allocated within same cacheline. It might
have caused the contention on doorbell record writing.

This patch extends the allocated memory size for doorbell
record to cacheline size.

Fixes: 21cae8580fd0 ("net/mlx5: allocate door-bells via DevX")
Cc: stable@dpdk.org
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Matan Azrad <matan@mellanox.com>
4 years agonet/mlx5: fix flow items size calculation
Raslan Darawsheh [Thu, 16 Jul 2020 12:14:55 +0000 (15:14 +0300)]
net/mlx5: fix flow items size calculation

flow_dv_get_item_len returns the actual header size of
an rte_flow item.

Changing any of the structs for rte_flow items by adding
or removing some extra fields will break this function.

This fixes the behavior by returning the actual header size
of each item.

Fixes: 34d41b7aa3bf ("net/mlx5: add VXLAN encap action to Direct Verbs")
Cc: stable@dpdk.org
Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
4 years agonet/mlx5: fix premature disabling of interrupt
Ophir Munk [Tue, 21 Jul 2020 14:41:07 +0000 (14:41 +0000)]
net/mlx5: fix premature disabling of interrupt

RXQ interrupts under Linux are based on the epoll mechanism. An expected
order of operations is as follows:
1. Call rte_eth_dev_rx_intr_enable(), to arm the CQ for receiving events
   on data input.
2. Block on rte_epoll_wait() with an array of file descriptors
   representing the CQ events. Upon data arrival the kernel will signal
   an input event on the corresponding CQ fd.
3. Call rte_eth_dev_rx_intr_disable() after the event was received and
   continue in polling mode. The mlx5 implementation of
   rte_eth_dev_rx_intr_disable() is to get the CQ event and ack it.

In practice applications may wake up from rte_epoll_wait() due to
timeout with no event to ack but still call
rte_eth_dev_rx_intr_disable() unconditionally.  In such cases the call
should return EAGAIN (since the file descriptors are non-blocked), as
opposed to EINVAL which indicates a real failure.  In case of EAGAIN the
PMD should not warn on "Unable to disable interrupt on Rx queue".

This commit fixes a earlier commit where the returned value 0 from
function devx_get_event() - was considered an error.

Fixes: 08d1838f645a ("net/mlx5: implement CQ for Rx using DevX API")

Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
Acked-by: Raslan Darawsheh <rasland@mellanox.com>
4 years agonet/ice: fix bytes statistics
Junyu Jiang [Tue, 21 Jul 2020 07:20:21 +0000 (07:20 +0000)]
net/ice: fix bytes statistics

This patch fixed the issue that rx/tx bytes overflowed
on 40 bit limitation by enlarging the limitation.

Fixes: a37bde56314d ("net/ice: support statistics")
Cc: stable@dpdk.org
Signed-off-by: Junyu Jiang <junyux.jiang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
4 years agoapp/regex: add RegEx test application
Yuval Avnery [Wed, 29 Jul 2020 18:09:57 +0000 (18:09 +0000)]
app/regex: add RegEx test application

Following the new RegEx class.
There is a need to create a dedicated test application in order to
validate this class and PMD.

Unlike net device this application loads data from a file.

This commit introduces the new RegEx test app.

The basic app flow:
1. Configure the RegEx device to use one queue, and set the rule
   database, using precompiled file.
2. Allocate mbufs based on the requested number of jobs, each job will
i  get one mbuf.
3. Enqueue as much as possible jobs.
4. Dequeue jobs.
5. if the number of dequeue jobs < requested number of jobs job to step

Signed-off-by: Ori Kam <orika@mellanox.com>
Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
4 years agoregex/mlx5: fix overrun on enqueueing
Yuval Avnery [Wed, 29 Jul 2020 02:14:51 +0000 (02:14 +0000)]
regex/mlx5: fix overrun on enqueueing

When enqueueing a buffer the PMD check if there is room
in its send queue (SQ).
The current implementation did not take into account that
queue indices are wrapping around, which may result in
consumer index (sq->ci) can have bigger value than than
the producer index (sq->pi).

Fixes: 4d4e245ad637 ("regex/mlx5: support enqueue")

Signed-off-by: Yuval Avnery <yuvalav@mellanox.com>
Acked-by: Ori Kam <orika@mellanox.com>
4 years agobus/vmbus: use SMP memory barrier for signaling read
Long Li [Fri, 17 Jul 2020 20:18:29 +0000 (13:18 -0700)]
bus/vmbus: use SMP memory barrier for signaling read

rte_smp_mb() uses the same locked ADD as the in-kernel vmbus driver,
and it has slightly performance improvement over rte_mb().

Signed-off-by: Long Li <longli@microsoft.com>
4 years agoapp/crypto-perf: support security protocol in PMDCC mode
David Coyle [Tue, 21 Jul 2020 14:56:48 +0000 (15:56 +0100)]
app/crypto-perf: support security protocol in PMDCC mode

This patch adds support for DOCSIS and PDCP security protocols to the
pmd-cyclecount mode of the crypto performance tool. Adding this support
involves freeing the correct session type (i.e. security or cryptodev
session) when the test ends, depending on the op_type specified.

Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agoapp/crypto-perf: fix mbuf lengths for DOCSIS
David Coyle [Thu, 16 Jul 2020 15:31:11 +0000 (16:31 +0100)]
app/crypto-perf: fix mbuf lengths for DOCSIS

Set the source mbuf data and packet lengths correctly for DOCSIS
performance tests.

Fixes: d4a131a9498d ("test/crypto-perf: support DOCSIS protocol")

Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agocrypto/armv8: remove redundant assert definition
Ruifeng Wang [Tue, 28 Jul 2020 09:24:06 +0000 (17:24 +0800)]
crypto/armv8: remove redundant assert definition

No need to define assert function in PMD since RTE provides the same.
Remove private definition and use RTE_VERIFY instead.

Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agocrypto/armv8: use dedicated log type
Ruifeng Wang [Tue, 28 Jul 2020 09:24:05 +0000 (17:24 +0800)]
crypto/armv8: use dedicated log type

armv8 crypto PMD used CRYPTODEV general log type.
Create a dedicated log type for the PMD to not pollute CRYPTODEV log type.

Typo in crypto dev name macro caused unexpected device name in log.
Fixed the typo to log with correct device name.

Fixes: 169ca3db550c ("crypto/armv8: add PMD optimized for ARMv8 processors")
Cc: stable@dpdk.org
Suggested-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agocrypto/armv8: remove debug option
Ruifeng Wang [Tue, 28 Jul 2020 09:24:04 +0000 (17:24 +0800)]
crypto/armv8: remove debug option

Typo in debug log switch macro caused debug log cannot be enabled.
Since no log used in data path, remove the debug option entirely
and have logs always enabled.

Resolved compilation error when debug log is enabled:
rte_armv8_pmd.c: In function ‘process_armv8_chained_op’:
rte_armv8_pmd.c:633:22: error: expected ‘)’ before ‘crypto_func’
  ARMV8_CRYPTO_ASSERT(crypto_func != NULL);
                      ^

Fixes: 169ca3db550c ("crypto/armv8: add PMD optimized for ARMv8 processors")
Cc: stable@dpdk.org
Reported-by: David Marchand <david.marchand@redhat.com>
Signed-off-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Akhil Goyal <akhil.goyal@nxp.com>
4 years agodoc: update QAT PMD release notes
Adam Dybkowski [Mon, 27 Jul 2020 10:14:08 +0000 (12:14 +0200)]
doc: update QAT PMD release notes

This patch updates 20.08 release notes inside
the part that describe changes in Intel QuickAssist PMD.

Fixes: faa57df0b458 ("crypto/qat: support ChaCha20-Poly1305")
Fixes: 9904ff684981 ("common/qat: improve multi-process handling")

Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
4 years agocommon/qat: support GEN2 device 200xx
Adam Dybkowski [Mon, 27 Jul 2020 10:14:07 +0000 (12:14 +0200)]
common/qat: support GEN2 device 200xx

This adds pci detection and documentation for Intel GEN2
QuickAssist device 200xx (PF Did 0x18ee, VF Did 0x18ef).

Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
4 years agocommon/qat: fix uninitialized variable
Adam Dybkowski [Fri, 24 Jul 2020 09:40:10 +0000 (11:40 +0200)]
common/qat: fix uninitialized variable

This patch fixes the uninitialized variable bug in QAT PMD.

Fixes: 9f27a860dc16 ("crypto/qat: move generic qp function to qp file")
Cc: stable@dpdk.org
Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
4 years agocommon/qat: remove unused fields
Adam Dybkowski [Tue, 21 Jul 2020 13:36:58 +0000 (15:36 +0200)]
common/qat: remove unused fields

This patch removes unused fields from structs qat_qp and
qat_qp_config, together with their initializations.

Signed-off-by: Adam Dybkowski <adamx.dybkowski@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>
4 years agocrypto/octeontx2: fix structure alignment
Tejasree Kondoj [Tue, 21 Jul 2020 04:16:18 +0000 (09:46 +0530)]
crypto/octeontx2: fix structure alignment

The structure cpt_request_info needs only 8 byte alignment.
This patch replaces __rte_cache_aligned of cpt_request_info
with __rte_aligned(8) and removes __rte_aligned(8) in
cpt_meta_info structure.

Fixes: fab634eb87ca ("crypto/octeontx2: support security session data path")

Signed-off-by: Tejasree Kondoj <ktejasree@marvell.com>
Acked-by: Anoob Joseph <anoobj@marvell.com>
4 years agocrypto/qat: fix DOCSIS performance
David Coyle [Tue, 21 Jul 2020 14:47:18 +0000 (15:47 +0100)]
crypto/qat: fix DOCSIS performance

DOCSIS protocol performance in the downlink direction can be improved
significantly in the QAT SYM PMD, especially for larger packets, by
pre-processing all CRC generations in a batch before building and
enqueuing any requests to the HW. This patch adds this optimization.

Fixes: 6f0ef237404b ("crypto/qat: support DOCSIS protocol")

Signed-off-by: David Coyle <david.coyle@intel.com>
Acked-by: Fiona Trahe <fiona.trahe@intel.com>