dpdk.git
6 years ago net/enic: support mbuf fast free offload
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:38 +0000 (02:29 -0700)]
net/enic: support mbuf fast free offload

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: use mbuf pointer array for inflight Tx packets
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:37 +0000 (02:29 -0700)]
net/enic: use mbuf pointer array for inflight Tx packets

WQ is currently using vnic_wq_buf to store mbuf pointers for Tx
packets. But, it contains an unused mempool pointer and mbuf is
unnecessarily cast to void pointer. Remove vnic_wq_buf entirely and
use an mbuf pointer array instead.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: add handlers to add/delete vxlan port number
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:36 +0000 (02:29 -0700)]
net/enic: add handlers to add/delete vxlan port number

The NIC has one configurable VXLAN port, which is set to the default
4789 upon vNIC reset. Adding a non-default port replaces this single
VXLAN port. Deleting the previously added non-default port restores
the VXLAN port to the hardware default.
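
The add/delete behavior described above can be sketched as a small state model. This is only an illustration, not the driver code: struct vxlan_model and the vxlan_* helpers are hypothetical names, and rejecting a delete for a port that was not added is an assumed policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VXLAN_DEFAULT_PORT 4789 /* hardware default named in the text */

/* Hypothetical software model of the NIC's single VXLAN port. */
struct vxlan_model {
    uint16_t port; /* port currently programmed */
};

/* Models a vNIC reset: the port returns to the default. */
static void vxlan_reset(struct vxlan_model *m)
{
    assert(m != NULL);
    m->port = VXLAN_DEFAULT_PORT;
}

/* Adding a port replaces the single configured port. */
static int vxlan_port_add(struct vxlan_model *m, uint16_t port)
{
    assert(m != NULL);
    m->port = port;
    return 0;
}

/*
 * Deleting succeeds only for the previously added non-default port
 * (an assumed policy) and restores the hardware default.
 */
static int vxlan_port_del(struct vxlan_model *m, uint16_t port)
{
    assert(m != NULL);
    if (port == VXLAN_DEFAULT_PORT || port != m->port)
        return -1;
    m->port = VXLAN_DEFAULT_PORT;
    return 0;
}
```

In this model, adding port 8472 after a reset leaves 8472 as the only VXLAN port, and deleting 8472 brings 4789 back.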

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: add devarg to specify ingress VLAN rewrite mode
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:35 +0000 (02:29 -0700)]
net/enic: add devarg to specify ingress VLAN rewrite mode

Add a new devarg "ig-vlan-rewrite" to allow the user to set
non-default rewrite mode. The UCS VIC may add/remove/modify the VLAN
header of an ingress packet depending on the ingress VLAN rewrite
mode.

By default, the driver sets the pass-through mode, which tells the NIC
"do not touch VLAN header and preserve it as is". This mode is usually
sufficient, but can complicate deployments for certain environments.
For example, OVS-DPDK in UCS blade environments may want to use "untag
default VLAN mode", which removes the VLAN header from an ingress
packet if it matches vNIC's default VLAN.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: report ring limits and preferred default values
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:34 +0000 (02:29 -0700)]
net/enic: report ring limits and preferred default values

Report min/max ring sizes, alignments, and so on, and rely on the
common checks implemented in the rte_ethdev layer.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: initialize RQ fetch index before enabling RQ
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:33 +0000 (02:29 -0700)]
net/enic: initialize RQ fetch index before enabling RQ

The fetch index must be initialized only when RQ is
disabled. Otherwise, it may lead to stale entries in IG descriptor
cache on the VIC.

Fixes: a74629cfa3a1 ("net/enic: enable RQ first and then post Rx buffers")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: do not overwrite admin Tx queue limit
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:32 +0000 (02:29 -0700)]
net/enic: do not overwrite admin Tx queue limit

Currently, enic_alloc_wq (via rte_eth_tx_queue_setup) may overwrite
the admin limit with a lower value. This is wrong as seen in the
following sequence.

1. UCS admin-set Tx queue limit (config.wq_desc_count) = 4096
2. Set up Tx queue with 512 descriptors.
   The admin limit (config.wq_desc_count) becomes 512.
3. Stop ports and now set up Tx queue with 1024 descriptors.
   This fails because 1024 is greater than the admin limit (512).

Do not modify the admin limit, and when queried, report the current
number of descriptors instead of the admin limit. The Rx queue setup
(enic_alloc_rq) does not have this problem.
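
The failing sequence and the fix can be reproduced with a small model. This is a sketch under assumptions, not the enic code: struct wq_model and both setup helpers are hypothetical, with admin_limit standing in for config.wq_desc_count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct wq_model {
    uint32_t admin_limit; /* models config.wq_desc_count, set by the UCS admin */
    uint32_t cur_desc;    /* descriptors currently configured on the queue */
};

/* Old behavior: a successful setup overwrites the admin limit. */
static int wq_setup_buggy(struct wq_model *wq, uint32_t nb_desc)
{
    assert(wq != NULL);
    if (nb_desc > wq->admin_limit)
        return -1;
    wq->admin_limit = nb_desc; /* bug: the limit shrinks permanently */
    wq->cur_desc = nb_desc;
    return 0;
}

/* Fixed behavior: only the current descriptor count changes. */
static int wq_setup_fixed(struct wq_model *wq, uint32_t nb_desc)
{
    assert(wq != NULL);
    if (nb_desc > wq->admin_limit)
        return -1;
    wq->cur_desc = nb_desc;
    return 0;
}
```

With an admin limit of 4096, the buggy helper accepts 512 and then rejects 1024, while the fixed helper accepts both and leaves the limit at 4096.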

Fixes: fefed3d1e62c ("enic: new driver")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: update the UDP RSS detection mechanism
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:31 +0000 (02:29 -0700)]
net/enic: update the UDP RSS detection mechanism

The UDP RSS interface has changed in the release firmware for 100G VIC
adapters. The capability bit is now in NIC_CFG. Also the driver is
supposed to use CMD_NIC_CFG_CHK and check if RSS config is
successful. No more changes are expected with respect to UDP RSS API.

Fixes: 94c351895888 ("net/enic: update UDP RSS controls")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/enic: fix receive packet types
Hyong Youb Kim [Fri, 29 Jun 2018 09:29:30 +0000 (02:29 -0700)]
net/enic: fix receive packet types

Fix missing or incorrect packet types discovered by DTS.
- Non-IP inner packets
  Set the tunnel flag.
- Inner Ethernet packets
  All supported tunnel packets have Ethernet as inner packets. So, set
  INNER_L2_ETHER for all tunnel types.
- IPv4 fragments carrying TCP/UDP
  The NIC indicates TCP/UDP based on the protocol in IP header. For
  fragments, ignore that bit and always set L4_FRAG.
- IPv6 fragments
  The NIC does recognize fragments (IPv6 packets with fragment extension
  headers). Set packet types for these.

Fixes: 93fb21fdbe23 ("net/enic: enable overlay offload for VXLAN and GENEVE")
Cc: stable@dpdk.org
Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
6 years ago net/mlx4: refine Rx packet type report
Moti Haimovsky [Thu, 28 Jun 2018 06:30:28 +0000 (09:30 +0300)]
net/mlx4: refine Rx packet type report

This commit refines the Rx Packet type flags reported by the PMD
for each packet being received in order to make the report more
accurate.

Signed-off-by: Moti Haimovsky <motih@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
6 years ago net/e1000: support descriptor status API
Wei Zhao [Fri, 29 Jun 2018 01:52:45 +0000 (09:52 +0800)]
net/e1000: support descriptor status API

rte_eth_rx_descriptor_status and rte_eth_tx_descriptor_status
are supported by igb VF.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years ago net/fm10k: support descriptor status API
Wei Zhao [Mon, 2 Jul 2018 07:15:58 +0000 (15:15 +0800)]
net/fm10k: support descriptor status API

rte_eth_rx_descriptor_status and rte_eth_tx_descriptor_status
are supported by fm10k.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years ago net/i40e: remove VF interrupt handler
Qi Zhang [Wed, 27 Jun 2018 13:15:27 +0000 (21:15 +0800)]
net/i40e: remove VF interrupt handler

For i40evf, the internal Rx interrupt and the adminq interrupt share the
same source, which wastes many CPU cycles in the interrupt handler on
the Rx path. Customers who require low latency (and set
I40E_ITR_INTERVAL to a small value) have complained that the resulting
flood of interrupts consumes significant CPU resources.

This patch disables the PCI interrupt and removes the interrupt handler,
replacing it with a low-frequency (50ms) polling routine implemented by
periodically registering an alarm callback. This saves significant CPU
time: on a typical x86 server with a 2.1GHz CPU and a low-latency
configuration (32us), CPU usage on the management core in testpmd, as
reported by the top command, dropped from 20% to 0%.

The new method also makes it possible to remove the compile option
I40E_ITR_INTERVAL, which was previously used to balance low latency
against low CPU usage. It is no longer needed because both can now be
achieved at the same time.

Suggested-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
6 years ago net/virtio: advertise support in-order feature
Marvin Liu [Mon, 2 Jul 2018 13:56:42 +0000 (21:56 +0800)]
net/virtio: advertise support in-order feature

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio: add in-order Rx/Tx into selection
Marvin Liu [Mon, 2 Jul 2018 13:56:41 +0000 (21:56 +0800)]
net/virtio: add in-order Rx/Tx into selection

Now that the IN_ORDER Rx/Tx paths have been added, update the Rx/Tx path
selection logic.

Rx path selection logic: if IN_ORDER and mergeable are enabled, select
the IN_ORDER Rx path. If IN_ORDER is enabled and both Rx offload and
mergeable are disabled, select the simple Rx path. Otherwise, select the
normal Rx path.

Tx path selection logic: if IN_ORDER is enabled, select the IN_ORDER Tx
path. Otherwise, select the default Tx path.
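
The selection rules above map directly onto two small decision functions. The sketch below is illustrative only: the enum names and function signatures are hypothetical and are not the identifiers used in the virtio PMD.

```c
#include <assert.h>
#include <stdbool.h>

enum rx_path { RX_PATH_NORMAL, RX_PATH_SIMPLE, RX_PATH_INORDER };
enum tx_path { TX_PATH_DEFAULT, TX_PATH_INORDER };

/* Rx: IN_ORDER + mergeable wins; IN_ORDER alone without offloads is simple. */
static enum rx_path select_rx_path(bool in_order, bool mergeable,
                                   bool rx_offload)
{
    if (in_order && mergeable)
        return RX_PATH_INORDER;
    if (in_order && !rx_offload)
        return RX_PATH_SIMPLE; /* mergeable is known to be off here */
    return RX_PATH_NORMAL;
}

/* Tx: IN_ORDER selects the in-order path, anything else the default. */
static enum tx_path select_tx_path(bool in_order)
{
    return in_order ? TX_PATH_INORDER : TX_PATH_DEFAULT;
}

/* A few self-checking examples of the rules. */
static void path_select_examples(void)
{
    assert(select_rx_path(true, true, false) == RX_PATH_INORDER);
    assert(select_rx_path(true, false, false) == RX_PATH_SIMPLE);
    assert(select_rx_path(true, false, true) == RX_PATH_NORMAL);
    assert(select_tx_path(false) == TX_PATH_DEFAULT);
}
```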

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio: support in-order Rx and Tx
Marvin Liu [Mon, 2 Jul 2018 13:56:40 +0000 (21:56 +0800)]
net/virtio: support in-order Rx and Tx

The IN_ORDER Rx function depends on the mergeable feature. Descriptor
allocation and freeing are done in bulk.

Virtio dequeue logic:
    dequeue_burst_rx(burst mbufs)
    for (each mbuf b) {
            if (b need merge) {
                    merge remained mbufs
                    add merged mbuf to return mbufs list
            } else {
                    add mbuf to return mbufs list
            }
    }
    if (last mbuf c need merge) {
            dequeue_burst_rx(required mbufs)
            merge last mbuf c
    }
    refill_avail_ring_bulk()
    update_avail_ring()
    return mbufs list

The IN_ORDER Tx function can support offloading features. Packets that
match the "can_push" option are handled by the simple xmit function.
Packets that do not match "can_push" are handled by the original xmit
function with the in-order flag.

Virtio enqueue logic:
    xmit_cleanup(used descs)
    for (each xmit mbuf b) {
            if (b can inorder xmit) {
                    add mbuf b to inorder burst list
                    continue
            } else {
                    xmit inorder burst list
                    xmit mbuf b by original function
            }
    }
    if (inorder burst list not empty) {
            xmit inorder burst list
    }
    update_avail_ring()

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio: extract common part for in-order functions
Marvin Liu [Mon, 2 Jul 2018 13:56:39 +0000 (21:56 +0800)]
net/virtio: extract common part for in-order functions

The IN_ORDER virtio-user Tx function supports Tx checksum offloading and
TSO, which are also supported by the normal Tx function. So extract the
common part into a separate function for reuse.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio: free in-order descriptors before device start
Marvin Liu [Mon, 2 Jul 2018 13:56:38 +0000 (21:56 +0800)]
net/virtio: free in-order descriptors before device start

Add a new function for freeing IN_ORDER descriptors. Descriptors are
allocated and freed sequentially when the IN_ORDER feature has been
negotiated, so there is no need to use a chain to manage freed
descriptors; updating the index is enough.
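
Why an index update suffices can be seen in a minimal ring model. This is a sketch, not the virtio PMD code: struct inorder_ring and inorder_free() are hypothetical, and the ring size is assumed to be a power of two.

```c
#include <assert.h>
#include <stdint.h>

struct inorder_ring {
    uint16_t size;      /* number of descriptors, assumed power of two */
    uint16_t free_cnt;  /* descriptors currently free */
    uint16_t used_cons; /* oldest in-flight descriptor, next to be freed */
};

/*
 * Free num descriptors. Because descriptors are consumed strictly in
 * ring order, no free-list chain is walked: the consumer index simply
 * advances and wraps at the end of the ring.
 */
static void inorder_free(struct inorder_ring *r, uint16_t num)
{
    assert((r->size & (r->size - 1)) == 0);
    assert(num <= r->size - r->free_cnt);
    r->used_cons = (uint16_t)((r->used_cons + num) & (r->size - 1));
    r->free_cnt = (uint16_t)(r->free_cnt + num);
}
```

For a ring of 8 entries with 6 in flight starting at index 6, freeing 4 of them moves the index to 2 and raises the free count from 2 to 6.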

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio-user: add mrg-rxbuf and in-order vdev parameters
Marvin Liu [Mon, 2 Jul 2018 13:56:37 +0000 (21:56 +0800)]
net/virtio-user: add mrg-rxbuf and in-order vdev parameters

Add parameters for configuring VIRTIO_NET_F_MRG_RXBUF and
VIRTIO_F_IN_ORDER feature bits. If feature is disabled, also update
corresponding unsupported feature bit.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio-user: add unsupported features mask
Marvin Liu [Mon, 2 Jul 2018 13:56:36 +0000 (21:56 +0800)]
net/virtio-user: add unsupported features mask

This patch introduces an unsupported features mask for the virtio-user
device. In virtio-user server mode, when reconnecting, virtio-user
retrieves the vhost device features as a base and then masks out the
unsupported features.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/virtio: add in-order feature bit definition
Marvin Liu [Mon, 2 Jul 2018 13:56:35 +0000 (21:56 +0800)]
net/virtio: add in-order feature bit definition

If VIRTIO_F_IN_ORDER has been negotiated, driver will use descriptors in
ring order: starting from offset 0 in the table, and wrapping around at
the end of the table. Also introduce use_inorder_[rt]x flag for
selection of IN_ORDER [RT]x handlers.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago vhost: advertise support in-order feature
Marvin Liu [Mon, 2 Jul 2018 13:56:34 +0000 (21:56 +0800)]
vhost: advertise support in-order feature

If devices always use descriptors in the same order in which they have
been made available, they can offer the VIRTIO_F_IN_ORDER feature. If
negotiated, this knowledge allows devices to notify the virtio driver of
the use of a batch of buffers by writing only the used ring index.

The vhost-user device supports this feature by default. If vhost
dequeue zero copy is enabled, VIRTIO_F_IN_ORDER should be disabled, as
vhost can’t assure that descriptors returned from the NIC are in order.

Signed-off-by: Marvin Liu <yong.liu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years ago net/bnxt: use correct flags during VLAN configuration
Somnath Kotur [Thu, 28 Jun 2018 20:15:49 +0000 (13:15 -0700)]
net/bnxt: use correct flags during VLAN configuration

Setting of VLAN filter cmd was being done with incorrect flag value.
We need to use inner vlan fields instead of outer vlan.

Fixes: 7fe5668d2ea3 ("net/bnxt: support VLAN filter and strip")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix Rx ring count limitation
Ajit Khaparde [Thu, 28 Jun 2018 20:15:48 +0000 (13:15 -0700)]
net/bnxt: fix Rx ring count limitation

Fixed size of fw_grp_ids in VNIC is limiting the number of Rx rings
being created. With this patch we are allocating fw_grp_ids dynamically,
allowing us to get over this artificial limit.

Fixes: 9738793f28ec ("net/bnxt: add VNIC functions and structs")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: check VF resources if resource manager is enabled
Ajit Khaparde [Thu, 28 Jun 2018 20:15:47 +0000 (13:15 -0700)]
net/bnxt: check VF resources if resource manager is enabled

If HWRM resource manager is enabled, check VF resources before proceeding.
Make sure there are enough resources allocated and return an error if
resources are insufficient.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix to move a flow to a different queue
Somnath Kotur [Thu, 28 Jun 2018 20:15:46 +0000 (13:15 -0700)]
net/bnxt: fix to move a flow to a different queue

While moving a flow to a different destination queue,
the l2_filter_id being passed to the FW command was incorrect.
Fix it by re-using the matching filter's l2_filter_id since
that is supposed to be the same in this case.

Fixes: 5ef3b79fdfe6 ("net/bnxt: support flow filter ops")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: allocate RSS context only if RSS mode is enabled
Ajit Khaparde [Thu, 28 Jun 2018 20:15:45 +0000 (13:15 -0700)]
net/bnxt: allocate RSS context only if RSS mode is enabled

Allocate RSS context only if RSS mode is enabled.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix incorrect IO address handling in Tx
Ajit Khaparde [Thu, 28 Jun 2018 20:15:44 +0000 (13:15 -0700)]
net/bnxt: fix incorrect IO address handling in Tx

rte_mbuf_data_iova returns a 64-bit address, but we are incorrectly
using only 32 bits of it. Use rte_cpu_to_le_64 instead of
rte_cpu_to_le_32.
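
The effect of keeping only 32 bits of a 64-bit address is easy to show in isolation. The sketch below is an assumption-laden illustration: struct tx_bd and both setters are hypothetical, and byte-order conversion is left out (a little-endian host is assumed), so no DPDK call appears here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical Tx buffer descriptor with a 64-bit address field. */
struct tx_bd {
    uint64_t addr;
};

/* Models the bug: a 32-bit conversion drops the upper address bits. */
static void set_addr_truncated(struct tx_bd *bd, uint64_t iova)
{
    assert(bd != NULL);
    bd->addr = (uint32_t)iova;
}

/* Models the fix: the full 64-bit address is stored. */
static void set_addr_full(struct tx_bd *bd, uint64_t iova)
{
    assert(bd != NULL);
    bd->addr = iova;
}
```

Any buffer located above 4 GiB, such as address 0x123456000, would be programmed as 0x23456000 by the truncating variant.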

Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix set MTU
Ajit Khaparde [Thu, 28 Jun 2018 20:15:43 +0000 (13:15 -0700)]
net/bnxt: fix set MTU

There is no need to update bnxt_hwrm_vnic_plcmode_cfg if new MTU is
not greater than the max data the mbuf can accommodate.

Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: check filter type before clearing it
Ajit Khaparde [Thu, 28 Jun 2018 20:15:42 +0000 (13:15 -0700)]
net/bnxt: check filter type before clearing it

In bnxt_free_filter_mem(), check the filter type and call the
appropriate HWRM command to clear the filter from HW.

Fixes: 5ef3b79fdfe6 ("net/bnxt: support flow filter ops")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: revert reset of L2 filter id
Somnath Kotur [Thu, 28 Jun 2018 20:15:41 +0000 (13:15 -0700)]
net/bnxt: revert reset of L2 filter id

The L2 filter id is needed in many scenarios particularly when
we are repurposing the same ntuple filter with different destination
queues. This patch reverts a commit in which the L2 filter id was being
reset in clear_ntuple_filter().

Fixes: 1383434c9089 ("net/bnxt: reset L2 filter id once filter is freed")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
6 years ago net/bnxt: fix Tx with multiple mbuf
Xiaoxin Peng [Thu, 28 Jun 2018 20:15:40 +0000 (13:15 -0700)]
net/bnxt: fix Tx with multiple mbuf

When using multiple mbufs to transmit large packets, we need to use the
total packet length (sum of all segments) to set txbd->flags_type.
Packets will not be sent when using tx_pkt->data_len (the length of the
first segment only).
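
The difference between the first segment's length and the total length can be sketched with a stand-in segment chain. struct seg and chain_total_len() are hypothetical; in a real rte_mbuf chain the first segment's pkt_len field already holds this sum, while data_len covers one segment only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one segment of a chained packet buffer. */
struct seg {
    uint16_t data_len; /* bytes held in this segment only */
    struct seg *next;  /* next segment, or NULL at the end of the chain */
};

/* Sum data_len over every segment to get the total packet length. */
static uint32_t chain_total_len(const struct seg *first)
{
    uint32_t total = 0;

    assert(first != NULL);
    for (const struct seg *s = first; s != NULL; s = s->next)
        total += s->data_len;
    return total;
}
```

For a packet split into segments of 1000, 1000 and 500 bytes, the first segment reports 1000 while the total is 2500, which is the value a length-based descriptor flag would need.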

Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Xiaoxin Peng <xiaoxin.peng@broadcom.com>
Reviewed-by: Herry Chen <herry.chen@broadcom.com>
Reviewed-by: Jason He <jason.he@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: update HWRM API to v1.9.2.9
Rob Miller [Thu, 28 Jun 2018 20:15:39 +0000 (13:15 -0700)]
net/bnxt: update HWRM API to v1.9.2.9

Update HWRM API to v1.9.2.9.

Signed-off-by: Rob Miller <rob.miller@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
6 years ago net/bnxt: check for invalid vNIC id
Jay Ding [Thu, 28 Jun 2018 20:15:38 +0000 (13:15 -0700)]
net/bnxt: check for invalid vNIC id

Passing an invalid fw_vnic_id to the firmware will cause the
bnxt_hwrm_vnic_plcmode_cfg command to fail.
Add a check for VNIC id before sending message to firmware.

Fixes: daef48efe5e5 ("net/bnxt: support set MTU")
Cc: stable@dpdk.org
Signed-off-by: Jay Ding <jay.ding@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: refactor filter/flow
Ajit Khaparde [Thu, 28 Jun 2018 20:15:37 +0000 (13:15 -0700)]
net/bnxt: refactor filter/flow

In preparation of more rte_flow support it has been decided to
separate out filter and flow into their own files. Functionally the
same.

Signed-off-by: Michael Wildt <michael.wildt@broadcom.com>
Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: move function check zero bytes
Scott Branden [Thu, 28 Jun 2018 20:15:36 +0000 (13:15 -0700)]
net/bnxt: move function check zero bytes

Move check_zero_bytes into new bnxt_util.h file.

Signed-off-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix queue start/stop operations
Ajit Khaparde [Thu, 28 Jun 2018 20:15:35 +0000 (13:15 -0700)]
net/bnxt: fix queue start/stop operations

Packets destined to the to-be-stopped queue should not be dropped
(neither in HW nor in the driver), so re-program the RSS Table without
this queue on stop and add it back to the table on start unless it
is a Representor VF.

Since 0th entry is used for default ring, use fw_grp_id + 1 to change
the RSS table population logic by programming valid IDs instead of the
default zeroth entry in case of an invalid fw_grp_id.

Destroy and recreate the trio of Rx rings (compl, Rx, AG) every time in
start so that HW is in sync with software.

Fixes: 9b63c6fd70e3 ("net/bnxt: support Rx/Tx queue start/stop")
Cc: stable@dpdk.org
Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Reviewed-by: Ray Jui <ray.jui@broadcom.com>
Reviewed-by: Scott Branden <scott.branden@broadcom.com>
Reviewed-by: Randy Schacher <stuart.schacher@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: support a StingRay VF id
Ajit Khaparde [Thu, 28 Jun 2018 20:15:34 +0000 (13:15 -0700)]
net/bnxt: support a StingRay VF id

Add support for StingRay VF device 0xd800

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix HW Tx checksum offload check
Ajit Khaparde [Thu, 28 Jun 2018 20:15:33 +0000 (13:15 -0700)]
net/bnxt: fix HW Tx checksum offload check

Add more checks for checksum calculation offload.
Also check for tunnel frames and select the proper
buffer descriptor size.

Fixes: 6eb3cc2294fd ("net/bnxt: add initial Tx code")
Cc: stable@dpdk.org
Signed-off-by: Xiaoxin Peng <xiaoxin.peng@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Jason He <jason.he@broadcom.com>
Reviewed-by: Qingmin Liu <qingmin.liu@broadcom.com>
6 years ago net/bnxt: set ring coalesce parameters for Stratus NIC
Ajit Khaparde [Thu, 28 Jun 2018 20:15:32 +0000 (13:15 -0700)]
net/bnxt: set ring coalesce parameters for Stratus NIC

Set ring coalesce parameters for Stratus NIC.
Other SKUs don't necessarily need this.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix close operation
Ajit Khaparde [Thu, 28 Jun 2018 20:15:31 +0000 (13:15 -0700)]
net/bnxt: fix close operation

We are not cleaning up all the memory and also not unregistering
the driver during device close operation. This patch fixes the issue.

Fixes: 893074951314 ("net/bnxt: free memory in close operation")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: set descriptor rings limits
Ajit Khaparde [Thu, 28 Jun 2018 20:15:30 +0000 (13:15 -0700)]
net/bnxt: set descriptor rings limits

Set MIN and MAX descriptor count for TX and RX rings.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: optimize Rx processing
Ajit Khaparde [Thu, 28 Jun 2018 20:15:29 +0000 (13:15 -0700)]
net/bnxt: optimize Rx processing

1) Use nb_rx_pkts instead of checking producer indices of Rx and
aggregator rings to decide if any Rx completions were processed.
2) Post Rx buffers early in Rx processing instead of waiting for
the budgeted burst quota.
3) Ring Rx CQ DB after Rx buffers are posted.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: support Tx batching
Ajit Khaparde [Thu, 28 Jun 2018 20:15:28 +0000 (13:15 -0700)]
net/bnxt: support Tx batching

Batch more than one Tx request such that only one completion
is generated by the HW. We request a Tx completion for first
and last Tx request in the batch.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago net/bnxt: fix clear port stats
Ajit Khaparde [Thu, 28 Jun 2018 20:15:27 +0000 (13:15 -0700)]
net/bnxt: fix clear port stats

PORT_CLR_STATS is not allowed for VFs, NPAR, MultiHost functions
or when SR-IOV is enabled.
Don't send the HWRM command in such cases.

Fixes: bfb9c2260be2 ("net/bnxt: support xstats get/reset")
Cc: stable@dpdk.org
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
6 years ago ethdev: add new offload flag to keep CRC
Ferruh Yigit [Fri, 29 Jun 2018 12:41:13 +0000 (13:41 +0100)]
ethdev: add new offload flag to keep CRC

DEV_RX_OFFLOAD_KEEP_CRC offload flag is added. PMDs that support
keeping CRC should advertise this offload capability.

The DEV_RX_OFFLOAD_CRC_STRIP flag will remain for one more release.
Until this flag is removed, the default behavior in PMDs is to keep the
CRC.

Until DEV_RX_OFFLOAD_CRC_STRIP flag is removed:
- Setting both KEEP_CRC & CRC_STRIP is INVALID
- Setting only CRC_STRIP: PMD should strip the CRC
- Setting only KEEP_CRC: PMD should keep the CRC
- Setting neither: PMD should keep the CRC
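
The four flag combinations listed above amount to a small truth table. The sketch below encodes it with stand-in names: OFFLOAD_CRC_STRIP, OFFLOAD_KEEP_CRC and crc_resolve() are hypothetical, and the bit values are not the real DEV_RX_OFFLOAD_* values.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in flag bits, chosen only for this example. */
#define OFFLOAD_CRC_STRIP (UINT64_C(1) << 0)
#define OFFLOAD_KEEP_CRC  (UINT64_C(1) << 1)

enum crc_action { CRC_ACTION_KEEP, CRC_ACTION_STRIP, CRC_ACTION_INVALID };

/* Resolve the requested Rx offload bits to what the PMD should do. */
static enum crc_action crc_resolve(uint64_t offloads)
{
    int strip = (offloads & OFFLOAD_CRC_STRIP) != 0;
    int keep = (offloads & OFFLOAD_KEEP_CRC) != 0;

    if (strip && keep)
        return CRC_ACTION_INVALID;
    if (strip)
        return CRC_ACTION_STRIP;
    return CRC_ACTION_KEEP; /* KEEP_CRC set, or neither flag set */
}

/* One check per row of the table. */
static void crc_resolve_examples(void)
{
    assert(crc_resolve(OFFLOAD_CRC_STRIP | OFFLOAD_KEEP_CRC)
           == CRC_ACTION_INVALID);
    assert(crc_resolve(OFFLOAD_CRC_STRIP) == CRC_ACTION_STRIP);
    assert(crc_resolve(OFFLOAD_KEEP_CRC) == CRC_ACTION_KEEP);
    assert(crc_resolve(0) == CRC_ACTION_KEEP);
}
```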

A helper function rte_eth_dev_is_keep_crc() has been added to be able to
change the no flag behavior with minimal changes in PMDs.

PMDs that don't report the DEV_RX_OFFLOAD_KEEP_CRC offload can remove
the rte_eth_dev_is_keep_crc() checks in the next release; the related
code is commented to help with that maintenance task.

DEV_RX_OFFLOAD_CRC_STRIP has also been added to virtual drivers. They
don't use CRC at all, so virtual PMDs should not return an error when an
application requests this offload.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Allain Legacy <allain.legacy@windriver.com>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years ago ethdev: add flow API to expand RSS flows
Nelio Laranjeiro [Thu, 28 Jun 2018 16:01:21 +0000 (18:01 +0200)]
ethdev: add flow API to expand RSS flows

Introduce a helper for PMDs to easily expand a flow items list with an
RSS action into multiple flow items lists with priority information.

For instance a user items list being "eth / end" with rss action types
"ipv4-udp ipv6-udp end" needs to be expanded into three items lists:

 - eth
 - eth / ipv4 / udp
 - eth / ipv6 / udp

to match the user request. Some drivers are unable to satisfy such a
request without this expansion; this API is there to help those.
Only PMDs should use this API for their internal processing; the
application will still handle a single flow.
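
The "eth / end" example above can be sketched as a toy expansion over pattern strings. This does not reproduce the helper added by the commit: expand_eth_rss() and the RSS_* bits are hypothetical, patterns are plain strings rather than rte_flow item arrays, and priority information is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in RSS type bits, not the real ETH_RSS_* values. */
#define RSS_IPV4_UDP (1u << 0)
#define RSS_IPV6_UDP (1u << 1)

/*
 * Expand the user pattern "eth" for the requested RSS types.
 * Writes at most cap pattern strings to out and returns the count.
 */
static size_t expand_eth_rss(unsigned int types, const char *out[],
                             size_t cap)
{
    size_t n = 0;

    assert(out != NULL);
    if (n < cap)
        out[n++] = "eth"; /* the original user pattern */
    if ((types & RSS_IPV4_UDP) && n < cap)
        out[n++] = "eth / ipv4 / udp";
    if ((types & RSS_IPV6_UDP) && n < cap)
        out[n++] = "eth / ipv6 / udp";
    return n;
}
```

Requesting both UDP types yields the three lists from the example, in the same order.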

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
6 years ago net/qede: fix Rx/Tx offload flags
Shahed Shaikh [Thu, 28 Jun 2018 06:01:58 +0000 (23:01 -0700)]
net/qede: fix Rx/Tx offload flags

 - We don't support QinQ offload, so removing it now.
 - Fix incorrect offload flags in default rxconf
   Since qede PMD does not support per queue rx offload, it
   should not set default_rxconf.offload flags in .dev_infos_get().
   Although these offloads are enabled by default, they are per port
   and not per queue.

Fixes: 946dfd18a4ec ("net/qede: convert to new Rx/Tx offloads API")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
6 years ago net/qede: fix default extended VLAN offload config
Rasesh Mody [Thu, 28 Jun 2018 06:01:57 +0000 (23:01 -0700)]
net/qede: fix default extended VLAN offload config

This patch disables extended VLAN offload by default as PMD does not
support it.

Fixes: d87246a43759 ("net/qede: enable and disable VLAN filtering")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years ago doc: update qede management firmware guide
Rasesh Mody [Thu, 28 Jun 2018 00:32:05 +0000 (17:32 -0700)]
doc: update qede management firmware guide

Fixes: c49a438fce90 ("doc: update qede guide and features")
Fixes: db86fbe54d90 ("doc: update qede PMD NIC guide")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years ago net/ixgbe: add tuned Rx/Tx parameters
Remy Horton [Wed, 27 Jun 2018 12:59:49 +0000 (20:59 +0800)]
net/ixgbe: add tuned Rx/Tx parameters

The optimal values of several transmission and reception related
parameters, such as burst sizes, descriptor ring sizes, and number
of queues, vary between different network interface devices. This
patch adds the values for the ixgbe PMD.

Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years ago net/mlx5: fix invalid error check
Adrien Mazarguil [Wed, 27 Jun 2018 09:20:52 +0000 (11:20 +0200)]
net/mlx5: fix invalid error check

Since its return type is unsigned, if_nametoindex() returns 0 in case of
error, never -1.
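
A correct check looks like the sketch below. if_nametoindex() is the standard function from <net/if.h>; the wrapper lookup_ifindex() is a hypothetical helper added for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <net/if.h>

/*
 * Look up an interface index by name.
 * Returns 0 and stores the index on success, -1 on failure.
 */
static int lookup_ifindex(const char *name, unsigned int *idx)
{
    unsigned int i;

    assert(name != NULL && idx != NULL);
    i = if_nametoindex(name);
    if (i == 0) /* 0 is the error value; a test against -1 never fires */
        return -1;
    *idx = i;
    return 0;
}
```

Calling it with a name that does not exist on the system returns -1 and leaves the output index untouched.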

Fixes: ccdcba53a3f4 ("net/mlx5: use Netlink to add/remove MAC addresses")
Cc: stable@dpdk.org
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years ago net/mlx5: increase number of strides
Yongseok Koh [Tue, 26 Jun 2018 12:39:25 +0000 (05:39 -0700)]
net/mlx5: increase number of strides

If WQE ID is used in CQE for Multi-Packet RQ, the CQE compression ratio
drops a little. To reach 100Gbps with 64B traffic, PCIe bandwidth must
be saved further by increasing the number of strides in a WQE. It is now
64 by default but adjustable by a PMD parameter, mprq_log_stride_num.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years ago net/mlx5: use stride index in Rx completion entry
Yongseok Koh [Tue, 26 Jun 2018 12:39:24 +0000 (05:39 -0700)]
net/mlx5: use stride index in Rx completion entry

Multi-Packet Receive Queue is to receive multiple packets on a single large
buffer. The number of consumed strides in CQE is accumulated to keep track
of the current stride index. However, it is safer to directly use stride
index in CQE to avoid out-of-order situation which can possibly be caused
by introducing LRO in the future.

If Rx CQE compression is enabled, HW can be configured to store the stride
index in a mini-CQE but this will need newer version of library/driver.
Therefore, since this change, MPRQ is only supported with the newer
library/driver and Rx hash result is not supported if MPRQ is enabled along
with Rx CQE compression.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years ago net/mlx5: add warning message for Multi-Packet RQ
Yongseok Koh [Tue, 26 Jun 2018 12:39:23 +0000 (05:39 -0700)]
net/mlx5: add warning message for Multi-Packet RQ

If Multi-Packet RQ is enabled but not supported by device or
kernel/library, print out a warning message.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
6 years ago net/mlx5: add new fields in Rx completion entry
Yongseok Koh [Tue, 26 Jun 2018 12:39:22 +0000 (05:39 -0700)]
net/mlx5: add new fields in Rx completion entry

Stride index is added to mlx5_mini_cqe8 structure and WQE ID is added to
mlx5_cqe structure.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years ago net/mlx5: change return value of Rx completion poll
Yongseok Koh [Tue, 26 Jun 2018 12:39:21 +0000 (05:39 -0700)]
net/mlx5: change return value of Rx completion poll

mlx5_rx_poll_len() returns Rx hash result extracted from either mini CQE or
regular CQE. As mini CQE may not have the hash result if configured
otherwise, it shouldn't assume the first DWORD of mini CQE is always hash
result. mlx5_rx_poll_len() is changed to return pointer to the mini CQE if
compressed.

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years ago net/mlx5: fix Rx buffer replenishment threshold
Yongseok Koh [Tue, 26 Jun 2018 11:33:35 +0000 (04:33 -0700)]
net/mlx5: fix Rx buffer replenishment threshold

The threshold of buffer replenishment for vectorized Rx burst is a constant
value (64). If the size of Rx queue is comparatively small, device could
run out of buffers. For example, if the size of Rx queue is 128, buffers
are replenished only twice per wraparound. This can cause jitter in
receiving packets and the jitter can cause unnecessary retransmission for
TCP connections.
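
The arithmetic behind the example can be sketched as follows. The message does not say how the threshold is corrected, so the policy below (a quarter of the queue size, capped at the old constant) is an assumption made for illustration, and both helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RX_MAX_BURST 64u /* the constant threshold named in the text */

/* Assumed policy: replenish after a quarter of the ring, capped at 64. */
static uint32_t replenish_thresh(uint32_t q_size)
{
    uint32_t quarter = q_size / 4;

    return quarter < RX_MAX_BURST ? quarter : RX_MAX_BURST;
}

/* How many times buffers are replenished per trip around the ring. */
static uint32_t replenishes_per_wrap(uint32_t q_size, uint32_t thresh)
{
    assert(thresh > 0);
    return q_size / thresh;
}
```

With a queue of 128 entries, the constant threshold gives 2 replenishments per wraparound, while the scaled threshold of 32 gives 4; large queues still use 64.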

Fixes: 6cb559d67b83 ("net/mlx5: add vectorized Rx/Tx burst for x86")
Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
6 years agonet/nfp: avoid access to sysfs resource0 file
Alejandro Lucero [Tue, 26 Jun 2018 13:29:21 +0000 (14:29 +0100)]
net/nfp: avoid access to sysfs resource0 file

The NFP CPP interface dynamically configures NFP CPP BARs for accessing
any NFP chip component from user space. This requires mapping specific
PCI BAR regions. However, it does not require remapping on top of the
usual mapping that the VFIO or UIO drivers already perform for the
device PCI BARs.

This patch avoids this remapping and therefore also avoids accessing
the device sysfs resource0 file to perform it.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years agonet/nfp: avoid sysfs resource file access
Alejandro Lucero [Tue, 26 Jun 2018 13:25:40 +0000 (14:25 +0100)]
net/nfp: avoid sysfs resource file access

Getting the bar size is required for NFP CPP interface configuration.
However, this information can be obtained from the VFIO or UIO driver
instead of accessing the sysfs resource file.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years agoapp/testpmd: fix missing count action fields
Nelio Laranjeiro [Thu, 31 May 2018 14:33:34 +0000 (16:33 +0200)]
app/testpmd: fix missing count action fields

The COUNT action has been modified and has several fields not addressable
through testpmd.  In addition, as those fields cannot be defined, testpmd
provides an empty configuration, which is undefined.

Fixes: fb8fd96d4251 ("ethdev: add shared counter to flow API")
Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/ena: fix SIGFPE with 0 Rx queue
Daria Kolistratova [Tue, 26 Jun 2018 17:38:56 +0000 (18:38 +0100)]
net/ena: fix SIGFPE with 0 Rx queue

When the number of Rx queues is 0 (which can happen when the application
does not receive), the driver fails with SIGFPE.
It happens when the application also requests ETH_MQ_RX_RSS_FLAG
in rte_dev->data->dev_conf.rxmode.mq_mode.
Fix it by adding a check for zero Rx queues.
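
A minimal sketch of this kind of guard is shown below. It assumes the
fault comes from a modulo by the Rx queue count while spreading RSS
entries; the commit message does not say where the division happens, and
fill_rss_table() is a hypothetical helper, not the driver's code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Fill an RSS indirection table round-robin over the Rx queues.
 * Returns 0 on success, -1 if there is no queue to distribute to.
 */
static int
fill_rss_table(uint16_t *tbl, size_t tbl_size, uint16_t nb_rx_queues)
{
	size_t i;

	/* Guard: "i % 0" is undefined and typically raises SIGFPE on x86. */
	if (nb_rx_queues == 0)
		return -1;

	for (i = 0; i < tbl_size; i++)
		tbl[i] = (uint16_t)(i % nb_rx_queues);
	return 0;
}
```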

Signed-off-by: Daria Kolistratova <daria.kolistratova@intel.com>
Acked-by: Michal Krawczyk <mk@semihalf.com>
6 years agonet/tap: support TSO (TCP Segment Offload)
Ophir Munk [Sat, 23 Jun 2018 23:17:41 +0000 (23:17 +0000)]
net/tap: support TSO (TCP Segment Offload)

This commit implements TCP segmentation offload in TAP.
The librte_gso library is used to segment large TCP payloads (e.g. packets
of 64K bytes) into smaller, MTU-sized buffers.
By supporting the TSO offload capability in software, a TAP device can be
used as a failsafe sub-device and be paired with another PCI device that
supports TSO in hardware.

For more details on librte_gso implementation please refer to dpdk
documentation.
The number of newly generated TCP TSO segments is limited to 64.
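
To give a feel for the 64-segment limit, here is a small arithmetic
sketch. The helper names are hypothetical, and how the driver reacts when
the limit would be exceeded is not stated here, so tso_fits() only reports
whether a payload stays within it.

```c
#include <stdint.h>

/* Segment limit quoted in the commit message. */
#define TSO_MAX_SEGS 64u

/* Segments needed for payload_len bytes at mss bytes per segment. */
static uint32_t
tso_nb_segs(uint32_t payload_len, uint16_t mss)
{
	if (mss == 0)
		return 0;
	return (payload_len + mss - 1) / mss; /* round up */
}

/* Non-zero if the payload can be cut into at most TSO_MAX_SEGS pieces. */
static int
tso_fits(uint32_t payload_len, uint16_t mss)
{
	uint32_t n = tso_nb_segs(payload_len, mss);

	return n != 0 && n <= TSO_MAX_SEGS;
}
```

For example, a 65536-byte payload with an MSS of 1460 bytes needs 45
segments and fits, while the same payload with an MSS of 536 bytes would
need 123 segments.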

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
6 years agonet/tap: calculate checksums of multi segs packets
Ophir Munk [Sat, 23 Jun 2018 23:17:40 +0000 (23:17 +0000)]
net/tap: calculate checksums of multi segs packets

Prior to this commit, IP/UDP/TCP checksum offload calculations
were skipped for multi-segment packets.
This commit enables TAP checksum calculations for multi-segment
packets.
The only restriction is that the first segment must contain
the headers of layers 3 (IP) and 4 (UDP or TCP).
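
The core difficulty with segmented data is that a 16-bit checksum word
may straddle two segments. A minimal sketch of an RFC 1071 style checksum
over a chain of buffers follows; struct seg is a hypothetical stand-in
for a chained packet buffer, not struct rte_mbuf, and this is not the
code used by the TAP PMD.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical chained buffer used only for this illustration. */
struct seg {
	const uint8_t *data;
	size_t len;
	const struct seg *next;
};

/* One's-complement checksum over all bytes of a segment chain. */
static uint16_t
chain_cksum(const struct seg *s)
{
	uint64_t sum = 0;
	int low_half = 0; /* next byte is the low half of a 16-bit word */
	size_t i;

	for (; s != NULL; s = s->next) {
		for (i = 0; i < s->len; i++) {
			/* Byte position carries over segment boundaries. */
			if (low_half)
				sum += s->data[i];
			else
				sum += (uint64_t)s->data[i] << 8;
			low_half = !low_half;
		}
	}
	/* Fold the carries back into 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}
```

Tracking the byte parity across segments gives the same result whether
the data sits in one buffer or is split at an odd offset.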

Reviewed-by: Raslan Darawsheh <rasland@mellanox.com>
Signed-off-by: Ophir Munk <ophirmu@mellanox.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
6 years agonet/qede: remove primary MAC removal
Rasesh Mody [Sat, 23 Jun 2018 21:20:33 +0000 (14:20 -0700)]
net/qede: remove primary MAC removal

This was added to dev_stop when setting the MTU required a vport restart.
Setting the MTU no longer requires a vport restart, as it only needs the
vport to be inactive and does not need the port reconfigured.

Fixes: d121a6b5f781 ("net/qede: fix VF MTU update")
Cc: stable@dpdk.org
Signed-off-by: Rasesh Mody <rasesh.mody@cavium.com>
6 years agonet/qede: fix legacy interrupt mode
Shahed Shaikh [Sat, 23 Jun 2018 21:20:32 +0000 (14:20 -0700)]
net/qede: fix legacy interrupt mode

qede pmd does not have support for legacy interrupt mode.
This causes slow path completion failure with uio_pci_generic module,
since it uses legacy interrupt (INTx) mode.

Fix this issue by installing legacy interrupt handler.

Fixes: ec94dbc57362 ("qede: add base driver")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
6 years agonet/qede: fix unicast MAC address handling in VF
Shahed Shaikh [Sat, 23 Jun 2018 21:20:31 +0000 (14:20 -0700)]
net/qede: fix unicast MAC address handling in VF

We did not register unicast mac configuration handlers
for VF causing failure in bonding of VFs.

Also, mac_addr_set operation requires mac_remove followed
by mac_add.

Fixes: 86a2265e59d7 ("qede: add SRIOV support")
Cc: stable@dpdk.org
Signed-off-by: Shahed Shaikh <shahed.shaikh@cavium.com>
6 years agonet/thunderx: fix build with gcc optimization on
Ferruh Yigit [Thu, 21 Jun 2018 18:14:50 +0000 (19:14 +0100)]
net/thunderx: fix build with gcc optimization on

Build error with gcc version 6.3.1 20161221 (Red Hat 6.3.1-1)
and EXTRA_CFLAGS="-O3":

.../drivers/net/thunderx/nicvf_ethdev.c:907:9:
   error: ‘txq’ may be used uninitialized in this function
   [-Werror=maybe-uninitialized]
  if (txq->pool_free == nicvf_single_pool_free_xmited_buffers)
      ~~~^~~~~~~~~~~
.../drivers/net/thunderx/nicvf_ethdev.c:886:20:
   note: ‘txq’ was declared here
  struct nicvf_txq *txq;
                    ^~~

The same error is reported for 'nicvf_eth_dev_init' and 'nicvf_dev_start';
it seems 'nicvf_set_tx_function' is inlined when optimization is enabled.

Initialize txq and add a NULL check before using it to fix the build.
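
The pattern behind this warning and its fix can be sketched as below.
The structure and function are hypothetical simplifications; the actual
nicvf code is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical, reduced Tx queue descriptor. */
struct txq {
	int single_pool;
};

/*
 * Pick a Tx mode from the last configured queue.
 * Returns -1 when no queue is configured.
 */
static int
pick_tx_mode(struct txq *const *queues, size_t nb_queues)
{
	struct txq *txq = NULL; /* the loop below may run zero times */
	size_t i;

	for (i = 0; i < nb_queues; i++)
		txq = queues[i];

	/* Without the initializer and this check, txq could be read unset. */
	if (txq == NULL)
		return -1;
	return txq->single_pool ? 1 : 0;
}
```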

Fixes: 7413feee662d ("net/thunderx: add device start/stop and close")
Cc: stable@dpdk.org
Reported-by: Richard Walsh <richard.walsh@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
6 years agonet/bonding: support add/remove MAC addrs
Alex Kiselev [Wed, 20 Jun 2018 08:37:00 +0000 (11:37 +0300)]
net/bonding: support add/remove MAC addrs

Add functions to add/remove MAC addresses

Signed-off-by: Alex Kiselev <alex@therouter.net>
Acked-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Chas Williams <chas3@att.com>
6 years agoeal: do not enable static log macro for ethdev
Ferruh Yigit [Tue, 19 Jun 2018 01:04:57 +0000 (02:04 +0100)]
eal: do not enable static log macro for ethdev

The static logging macro RTE_PMD_DEBUG_TRACE is enabled by a few DEBUG
config options, including RTE_LIBRTE_ETHDEV_DEBUG.

RTE_LIBRTE_ETHDEV_DEBUG is still used for data path logging, but all
ethdev logging has switched to dynamic logging, so there is no need to
enable the static logging macro for ethdev.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: convert static log type usage to dynamic
Ferruh Yigit [Tue, 19 Jun 2018 01:04:56 +0000 (02:04 +0100)]
ethdev: convert static log type usage to dynamic

Replace RTE_PMD_DEBUG_TRACE with RTE_ETHDEV_LOG.

RTE_PMD_DEBUG_TRACE uses a hardcoded PMD logtype and the ERR log level,
and is controlled by compile-time flags.
RTE_ETHDEV_LOG uses the dynamic ethdev_logtype.

Also a few minor cleanups, like
- use %u for unsigned values like port_id which is uint16_t
- use PRIx64 for owner_id
- Join some log lines
- Unify to not have a "." at the end of the log
- Unify log start with uppercase

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agoethdev: move log macro to header
Ferruh Yigit [Tue, 19 Jun 2018 01:04:55 +0000 (02:04 +0100)]
ethdev: move log macro to header

Move the macro to the header so that logging usage in the header can be
converted. Since the macro now lives in a public header, rename it with
the RTE prefix: ethdev_log -> RTE_ETHDEV_LOG.

Also add the logtype variable to the map file, since the logging macro is
used from other libraries.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Reviewed-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Thomas Monjalon <thomas@monjalon.net>
6 years agonet/pcap: consolidate duplicated code
Ido Goshen [Tue, 19 Jun 2018 14:37:26 +0000 (17:37 +0300)]
net/pcap: consolidate duplicated code

Signed-off-by: Ido Goshen <ido@cgstowernetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/pcap: fix multiple queues
Ido Goshen [Tue, 19 Jun 2018 14:37:25 +0000 (17:37 +0300)]
net/pcap: fix multiple queues

Change the open_rx/tx_pcap/iface functions to open only a single
pcap/dumper instead of looping num_of_queue times.
The num_of_queue loop is already performed by the
caller, rte_kvargs_process.

This fixes the following:
1. Open the N requested pcaps/dumpers instead of N^2
2. Stop leaking pcaps/dumpers that were overwritten by
   the sequential calls to the open_rx/tx_pcap/iface functions
3. Use the filename/iface argument of each queue and not just the last
   one, which overwrote the previous names
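
The shape of the fix can be sketched as follows: the per-argument handler
records exactly one entry, and the caller does the iteration. All names
here are hypothetical; process_args() only stands in for the iterating
caller and is not rte_kvargs_process.

```c
#include <stddef.h>

#define MAX_QUEUES 16

/* Hypothetical per-port list of opened pcaps/dumpers, by name. */
struct queue_list {
	const char *names[MAX_QUEUES];
	unsigned int count;
};

/* Per-argument handler: opens (here: records) one entry per call. */
static int
open_one(const char *value, struct queue_list *ql)
{
	if (ql->count >= MAX_QUEUES)
		return -1;
	ql->names[ql->count++] = value;
	return 0;
}

/* Stand-in for the caller that already loops over the arguments. */
static int
process_args(const char *const *values, size_t n, struct queue_list *ql)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (open_one(values[i], ql) != 0)
			return -1;
	return 0;
}
```

With N arguments this yields N entries, each keeping its own name. If the
handler also looped N times, it would open N^2 entries and overwrite
earlier ones.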

Fixes: 4c173302c307 ("pcap: add new driver")
Cc: stable@dpdk.org
Signed-off-by: Ido Goshen <ido@cgstowernetworks.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agonet/thunderx: add support for hardware first skip feature
Rakesh Kudurumalla [Mon, 18 Jun 2018 05:36:24 +0000 (11:06 +0530)]
net/thunderx: add support for hardware first skip feature

This feature is used to create a hole between HEADROOM
and the actual data. The size of the hole is specified in bytes as a
module parameter to the PMD.

Signed-off-by: Rakesh Kudurumalla <rkudurumalla@caviumnetworks.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
6 years agonet/mlx5: separate generic tunnel TSO from the standard one
Shahaf Shuler [Sun, 24 Jun 2018 06:22:26 +0000 (09:22 +0300)]
net/mlx5: separate generic tunnel TSO from the standard one

The generic tunnel TSO depended on the capabilities of the regular TSO
in order to be enabled.

Cc: stable@dpdk.org
Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
Acked-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agovhost: fix potential null pointer dereference
Tiwei Bie [Fri, 22 Jun 2018 03:53:05 +0000 (11:53 +0800)]
vhost: fix potential null pointer dereference

Coverity issue: 293097
Fixes: d90cf7d111ac ("vhost: support host notifier")

Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
6 years agovhost: fix missing increment of log cache count
Maxime Coquelin [Fri, 15 Jun 2018 13:48:46 +0000 (15:48 +0200)]
vhost: fix missing increment of log cache count

The log_cache_nb_elem counter was never incremented, resulting
in all dirty pages being missed during live migration.
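
Why a missing increment loses every entry can be seen in a reduced
sketch. Apart from the counter idea, the structure and function names
below are hypothetical and do not mirror the vhost code.

```c
#include <stdint.h>

#define LOG_CACHE_MAX 32

/* Hypothetical dirty-page cache that is flushed in batches. */
struct log_cache {
	uint64_t pages[LOG_CACHE_MAX];
	uint32_t nb_elem;
};

/* Record one dirty page; returns -1 when the cache is full. */
static int
log_cache_add(struct log_cache *c, uint64_t page)
{
	if (c->nb_elem >= LOG_CACHE_MAX)
		return -1;
	c->pages[c->nb_elem] = page;
	c->nb_elem++; /* without this, the flush below sees zero entries */
	return 0;
}

/* Copy cached pages to out[] and reset; returns the number flushed. */
static uint32_t
log_cache_flush(struct log_cache *c, uint64_t *out)
{
	uint32_t i, n = c->nb_elem;

	for (i = 0; i < n; i++)
		out[i] = c->pages[i];
	c->nb_elem = 0;
	return n;
}
```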

Fixes: c16915b87109 ("vhost: improve dirty pages logging performance")
Cc: stable@dpdk.org
Reported-by: Peng He <xnhp0320@icloud.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Tiwei Bie <tiwei.bie@intel.com>
6 years agonet/nfp: use generic PCI config access functions
Alejandro Lucero [Mon, 18 Jun 2018 20:06:12 +0000 (21:06 +0100)]
net/nfp: use generic PCI config access functions

This patch avoids direct access to the device config sysfs file by using
rte_pci_read_config instead.

Apart from duplicating code, it turns out this direct access does
not always work if non-root users execute DPDK apps. In those cases
it is mandatory to go through the VFIO-specific function for reading PCI
config space.

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
6 years agonet/i40e: do not reset device info data
Damjan Marion [Wed, 6 Jun 2018 20:31:25 +0000 (22:31 +0200)]
net/i40e: do not reset device info data

At this point valid data has already been set by rte_eth_get_device_info.
Resetting it zeroes the device field, and the consumer is then not able to
retrieve PCI data.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Cc: stable@dpdk.org
Signed-off-by: Damjan Marion <damarion@cisco.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: check illegal packets
Yanglong Wu [Wed, 20 Jun 2018 02:12:47 +0000 (10:12 +0800)]
net/i40e: check illegal packets

Some illegal packets lead to a TX/RX hang from which the device
can't recover automatically. This patch checks for those
illegal packets and protects TX/RX from hanging.

Signed-off-by: Yanglong Wu <yanglong.wu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
6 years agonet/i40e: workaround performance degradation
Haiyue Wang [Wed, 13 Jun 2018 05:52:41 +0000 (13:52 +0800)]
net/i40e: workaround performance degradation

The GL_SWR_PM_UP_THR value is not affected by the link speed; its
value is set according to the total number of ports for a better
pipe-monitor configuration.

All relevant device IDs below are considered (NICs, LOMs, Mezz
and Backplane):

Device-ID  Value        Comments
0x1572     0x03030303   10G SFI
0x1581     0x03030303   10G Backplane
0x1586     0x03030303   10G BaseT
0x1589     0x03030303   10G BaseT (FortPond)
0x1580     0x06060606   40G Backplane
0x1583     0x06060606   2x40G QSFP
0x1584     0x06060606   1x40G QSFP
0x1587     0x06060606   20G Backplane (HP)
0x1588     0x06060606   20G KR2 (HP)
0x158A     0x06060606   25G Backplane
0x158B     0x06060606   25G SFP28

Fixes: c9223a2bf53c ("i40e: workaround for XL710 performance")
Fixes: 75d133dd3296 ("net/i40e: enable 25G device")
Cc: stable@dpdk.org
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agoapp/testpmd: fix VLAN TCI mask set error for FDIR
Wei Zhao [Tue, 5 Jun 2018 09:12:11 +0000 (17:12 +0800)]
app/testpmd: fix VLAN TCI mask set error for FDIR

The VLAN TCI mask should be set to 0xEFFF, not 0x0;
the wrong mask causes a mask error when the register is set.
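
One way to read 0xEFFF, assuming the standard 802.1Q TCI layout (3-bit
PCP, 1-bit DEI/CFI, 12-bit VLAN ID): the mask covers PCP and VLAN ID and
leaves out the DEI/CFI bit. The commit message does not spell this out,
and the macro names below are hypothetical.

```c
#include <stdint.h>

/* 802.1Q Tag Control Information fields. */
#define TCI_PCP_MASK 0xE000u /* priority code point, bits 15-13 */
#define TCI_DEI_MASK 0x1000u /* drop eligible indicator, bit 12 */
#define TCI_VID_MASK 0x0FFFu /* VLAN identifier, bits 11-0 */

/* Everything except the DEI/CFI bit. */
#define TCI_FDIR_MASK (TCI_PCP_MASK | TCI_VID_MASK)

/* Keep only the TCI bits selected by mask. */
static uint16_t
tci_apply_mask(uint16_t tci, uint16_t mask)
{
	return (uint16_t)(tci & mask);
}
```

A mask of 0x0 would match on none of these bits.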

Fixes: d9d5e6f2f0ba ("app/testpmd: set default flow director mask")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/i40e: remove summarized global register change info
Beilei Xing [Thu, 7 Jun 2018 02:40:08 +0000 (10:40 +0800)]
net/i40e: remove summarized global register change info

The summarized global register change info is logged regardless of
whether there is a real global register change. Since
only real changes are logged now, there is no need to
summarize global register change info; keeping the summary would
cause misunderstanding.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: print real global changes
Beilei Xing [Thu, 7 Jun 2018 02:40:07 +0000 (10:40 +0800)]
net/i40e: print real global changes

Currently the global configuration is always logged, regardless of
whether there is a global change. But there is no value
in logging the info if the configuration has not changed.
This patch prints only real global changes.
Also, change the log level from DEBUG to WARNING.

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/i40e: fix shifts of 32-bit value
Beilei Xing [Wed, 23 May 2018 07:46:46 +0000 (15:46 +0800)]
net/i40e: fix shifts of 32-bit value

Cppcheck reports the following error:
(error) Shifting 32-bit value by 36 bits is undefined behaviour

According to the datasheet, the set PHY config command has both a PHY
type and a PHY type extension, so the PHY type extension should be
excluded when setting the PHY type.
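
The undefined shift and a defined alternative can be sketched as below.
The field widths (32-bit PHY type, 8-bit extension) and the helper names
are assumptions for illustration, not taken from the datasheet or the
driver.

```c
#include <stdint.h>

/* Bit n of a 64-bit mask, n < 64. Plain "1 << 36" on a 32-bit int is UB. */
static inline uint64_t
bit64(unsigned int n)
{
	return (uint64_t)1 << n;
}

/* Low 32 bits: the part assumed to go in the PHY type field. */
static inline uint32_t
phy_type_base(uint64_t mask)
{
	return (uint32_t)(mask & 0xFFFFFFFFu);
}

/* Bits 32-39: the part assumed to go in the PHY type extension field. */
static inline uint8_t
phy_type_ext(uint64_t mask)
{
	return (uint8_t)((mask >> 32) & 0xFF);
}
```

A type at bit 36 then lands only in the extension field (as bit 4) and
never reaches the 32-bit PHY type field.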

Fixes: 1bb8f661168d ("net/i40e: fix link down and negotiation")
Cc: stable@dpdk.org
Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
6 years agonet/ixgbe: fix mask bits register set error for FDIR
Wei Zhao [Fri, 15 Jun 2018 06:08:03 +0000 (14:08 +0800)]
net/ixgbe: fix mask bits register set error for FDIR

MAC address bits in mask registers should be set to zero
when the MAC mask is 0xFF; otherwise, if it is 0x0,
these bits should be set to 0x3F.

Fixes: 82fb702077f6 ("ixgbe: support new flow director modes for X550")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/ixgbe: fix tunnel type set error for FDIR
Wei Zhao [Thu, 14 Jun 2018 08:17:28 +0000 (16:17 +0800)]
net/ixgbe: fix tunnel type set error for FDIR

The tunnel type format should be translated to the format ixgbe requires
before the register is set in FDIR cloud mode. Also, some registers that
are not used in cloud mode but only in IP mode should be set
to zero, as the datasheet requests.

Fixes: 82fb702077f6 ("ixgbe: support new flow director modes for X550")
Fixes: 11777435c727 ("net/ixgbe: parse flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/ixgbe: fix tunnel id format error for FDIR
Wei Zhao [Wed, 13 Jun 2018 08:11:22 +0000 (16:11 +0800)]
net/ixgbe: fix tunnel id format error for FDIR

In cloud mode for FDIR, the tunnel ID should be set as the protocol
requires: the lower 8 bits should be set as reserved.

Fixes: 82fb702077f6 ("ixgbe: support new flow director modes for X550")
Fixes: 11777435c727 ("net/ixgbe: parse flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/ixgbe: add support for VLAN in IP mode FDIR
Wei Zhao [Wed, 13 Jun 2018 08:09:55 +0000 (16:09 +0800)]
net/ixgbe: add support for VLAN in IP mode FDIR

In IP mode FDIR, X550 supports not only the 4-tuple parameters
but also the VLAN TCI in the protocol, so add this feature to the flow
parser.

Fixes: 11777435c727 ("net/ixgbe: parse flow director filter")
Cc: stable@dpdk.org
Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/ixgbe: add query rule stats support for FDIR
Wei Zhao [Wed, 13 Jun 2018 08:08:50 +0000 (16:08 +0800)]
net/ixgbe: add query rule stats support for FDIR

There are many registers in X550 that hold statistics of
flow director filters, for example the number of added
or removed rules and the number of packets that matched or missed
for the port. All this important information
can be read from the X550 registers and displayed with the xstats
command.

Signed-off-by: Wei Zhao <wei.zhao1@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
6 years agonet/ixgbe: fix crash on detach
Pablo de Lara [Thu, 31 May 2018 09:53:07 +0000 (10:53 +0100)]
net/ixgbe: fix crash on detach

When detaching a port bound to ixgbe PMD, if the port
does not have any VFs, *vfinfo is not set and there is
a NULL dereference attempt, when calling
rte_eth_switch_domain_free(), which expects VFs to be used,
causing a segmentation fault.

Steps to reproduce:

./testpmd -- -i
testpmd> port stop all
testpmd> port close all
testpmd> port detach 0

Bugzilla ID: 57
Fixes: cf80ba6e2038 ("net/ixgbe: add support for representor ports")
Cc: stable@dpdk.org
Reported-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
6 years agonet/mlx5: fix error number handling
Yongseok Koh [Tue, 19 Jun 2018 23:13:13 +0000 (16:13 -0700)]
net/mlx5: fix error number handling

rte_errno should be saved only if an error has occurred, because
rte_errno could otherwise hold a garbage value.
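
The rule can be illustrated with plain errno standing in for rte_errno.
This is a sketch under that substitution; do_step(), cleanup() and
run_step() are hypothetical helpers, not mlx5 functions.

```c
#include <errno.h>

/* Hypothetical operation: fails with EINVAL when asked to. */
static int
do_step(int fail)
{
	if (fail) {
		errno = EINVAL;
		return -1;
	}
	return 0; /* on success errno is left untouched, possibly stale */
}

/* Hypothetical cleanup that may clobber errno as a side effect. */
static void
cleanup(void)
{
	errno = 0;
}

/* Returns 0 on success or a negative errno value on failure. */
static int
run_step(int fail)
{
	int ret = do_step(fail);
	int saved = 0;

	if (ret != 0)
		saved = errno; /* meaningful only after a failure */
	cleanup();
	if (ret != 0) {
		errno = saved; /* restore the real cause */
		return -saved;
	}
	return 0;
}
```

Saving errno unconditionally would capture whatever stale value it held
before a successful call.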

Fixes: a6d83b6a9209 ("net/mlx5: standardize on negative errno values")
Cc: stable@dpdk.org
Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/mlx5: clean-up developer logs
Nelio Laranjeiro [Tue, 5 Jun 2018 08:45:22 +0000 (10:45 +0200)]
net/mlx5: clean-up developer logs

Split maintainers' logs from user logs.

A lot of debug logs expose to users internal information on how
the PMD works.  Such logs should not be available to them and
thus should remain available only when the PMD is compiled in debug
mode.

This commit removes some useless debug logs, moves the maintainers' ones
under DEBUG and also moves dump into debug mode only.

Cc: stable@dpdk.org
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
6 years agonet/ena: enable write combining
Rafal Kozik [Fri, 29 Jun 2018 13:54:08 +0000 (15:54 +0200)]
net/ena: enable write combining

Write combining (WC) increases NIC performance by making better
use of the PCI bus. The ENA PMD can make use of this feature.

To enable it, load the igb_uio driver with wc_activate set to 1.

Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agobus/pci: enable write combining during mapping
Rafal Kozik [Fri, 29 Jun 2018 13:54:07 +0000 (15:54 +0200)]
bus/pci: enable write combining during mapping

Write combining (WC) increases NIC performance by making better
use of the PCI bus, but it cannot be used by all PMDs.

It is enabled only if RTE_PCI_DRV_WC_ACTIVATE is set in the
driver's flags. For it to work properly, the igb_uio driver must also be
loaded with wc_activate set to 1.

When mapping PCI resources, first check whether the resource supports WC
and then try to use it.
In case of failure, fall back to normal mode.

Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
6 years agobus/pci: reference driver structure before mapping
Rafal Kozik [Fri, 29 Jun 2018 13:54:06 +0000 (15:54 +0200)]
bus/pci: reference driver structure before mapping

Add the pointer to the driver structure before calling rte_pci_map_device.
This allows driver flags to be used for adjusting the configuration.

Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoigb_uio: add write combining option
Rafal Kozik [Fri, 29 Jun 2018 13:54:05 +0000 (15:54 +0200)]
igb_uio: add write combining option

Write combining (WC) increases NIC performance by making better
use of the PCI bus, but it cannot be used by all PMDs.

To get internal_addr, memory needs to be mapped. But as memory cannot be
mapped twice, with and without WC, this mapping should be skipped for
WC. [1]

To avoid breaking other drivers that could potentially use internal_addr,
the wc_activate parameter makes it possible to skip the mapping only for
those PMDs that do not use it.

[1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
section 5.3 and 5.4

Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agomaintainers: update for ethdev
Thomas Monjalon [Fri, 29 Jun 2018 14:58:57 +0000 (16:58 +0200)]
maintainers: update for ethdev

Ferruh and Andrew are doing excellent reviews and contributions
to the ethdev API.
They become official maintainers and are responsible for this major lib.

Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
Acked-by: Andrew Rybchenko <arybchenko@solarflare.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
6 years agoipc: fix locking while sending messages
Anatoly Burakov [Wed, 27 Jun 2018 09:44:25 +0000 (10:44 +0100)]
ipc: fix locking while sending messages

Previously, we were taking an exclusive lock to prevent secondary
processes from spinning up while we are sending our messages. However,
using exclusive locks had the effect of disallowing multiple
simultaneous unrelated messages/requests being sent, which was
not the intention behind locking.

Fix it to put a shared lock on the directory. That way, we still
prevent secondary process initializations while sending data over
IPC, but allow multiple unrelated transmissions to proceed.
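
The difference between the two lock modes can be sketched with flock(2)
on a directory. The helper names are hypothetical and this is not the EAL
code; it only shows that shared holders coexist while an exclusive
request is refused as long as any shared lock is held.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Take a shared lock on dir; returns the fd holding it, or -1. */
static int
lock_dir_shared(const char *dir)
{
	int fd = open(dir, O_RDONLY);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_SH) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Try a non-blocking exclusive lock; returns the fd, or -1 if busy. */
static int
try_lock_dir_exclusive(const char *dir)
{
	int fd = open(dir, O_RDONLY);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Release the lock and close the descriptor. */
static void
unlock_dir(int fd)
{
	flock(fd, LOCK_UN);
	close(fd);
}
```

In this picture, senders would each hold a shared lock, while a process
that must not start during a send would ask for the exclusive one.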

Fixes: 89f1fe7e6d95 ("eal: lock IPC directory on init and send")
Cc: stable@dpdk.org
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Tested-by: Qi Zhang <qi.z.zhang@intel.com>