dpdk.git
Ajit Khaparde [Tue, 11 Oct 2016 21:47:50 +0000 (16:47 -0500)]
net/bnxt: support async link notification

This patch adds support for receiving link notifications asynchronously.
The HW sends async notifications on the default completion ring. The
PMD processes these notifications and logs an appropriate message.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Ajit Khaparde [Thu, 29 Sep 2016 17:04:08 +0000 (12:04 -0500)]
doc: update bnxt guide

This patch reformats the Broadcom PMD driver documentation.

Also since the PMD now loads on a VF interface, update the
documentation accordingly.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Ajit Khaparde [Thu, 29 Sep 2016 17:03:44 +0000 (12:03 -0500)]
net/bnxt: fix crash when closing

This patch fixes a segfault encountered during the dev_uninit/close routine.
The KNI sample app can be used to reproduce the issue.

Fixes: c09f57b49c13 ("net/bnxt: add start/stop/link update operations")

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Chas Williams [Tue, 11 Oct 2016 23:05:01 +0000 (19:05 -0400)]
net/bnx2x: merge debug register operations into headers

The register reads and writes should simply be static inline functions
instead of being defined alternately as routines or macros depending on
whether debugging is enabled.

Fix bnx2x_reg_read32() returning 0 during debug unaligned reads.

Fixes: b5bf7719221d ("bnx2x: driver support routines")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
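
A minimal sketch of the idea in the commit above: one static inline accessor pair that is the same in debug and non-debug builds. The names `sk_reg_read32` and `sk_reg_write32` and their signatures are hypothetical stand-ins, not the bnx2x functions, and the sketch assumes naturally aligned offsets.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical accessors: a single definition, used whether or not
 * debugging is enabled. Offsets are assumed to be 4-byte aligned. */
static inline uint32_t sk_reg_read32(const volatile void *bar, size_t offset)
{
    return *(const volatile uint32_t *)((const volatile uint8_t *)bar + offset);
}

static inline void sk_reg_write32(volatile void *bar, size_t offset, uint32_t val)
{
    *(volatile uint32_t *)((volatile uint8_t *)bar + offset) = val;
}
```

Because these are functions rather than macros, the compiler type-checks the arguments and each argument is evaluated exactly once.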
Chas Williams [Tue, 11 Oct 2016 23:05:00 +0000 (19:05 -0400)]
net/bnx2x: do not return structs

bnx2x_loop_obtain_resources() returns a struct.  This routine either
succeeds or fails -- We don't need a struct for that.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:59 +0000 (19:04 -0400)]
net/bnx2x: check return codes during VF mailbox operation

Refactor bnx2x_do_req4pf() to be easier to read and return errors when
the transaction fails -- Previously, it could succeed when the control
channel was down.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:58 +0000 (19:04 -0400)]
net/bnx2x: serialize access to VF mailbox

The pf2vf mailbox can only be used by one thread at a time.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
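
A rough sketch of what serializing a mailbox can look like, using a C11 `atomic_flag` as a spinlock. The `sk_mbox` type and both functions are hypothetical; the driver's own locking primitive and mailbox layout are not shown in this log, and the "reply" here is only a placeholder for the real exchange.

```c
#include <stdatomic.h>

/* Hypothetical mailbox: only one request may be in flight at a time. */
struct sk_mbox {
    atomic_flag busy;   /* set while a transaction is in progress */
    int requests;       /* number of completed transactions */
};

static void sk_mbox_init(struct sk_mbox *m)
{
    atomic_flag_clear(&m->busy);
    m->requests = 0;
}

static int sk_mbox_transact(struct sk_mbox *m, int request)
{
    int reply;

    /* Spin until this thread owns the mailbox. */
    while (atomic_flag_test_and_set_explicit(&m->busy, memory_order_acquire))
        ;

    reply = request + 1;    /* placeholder for the real request/reply exchange */
    m->requests++;

    /* Release the mailbox for the next caller. */
    atomic_flag_clear_explicit(&m->busy, memory_order_release);
    return reply;
}
```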
Chas Williams [Tue, 11 Oct 2016 23:04:57 +0000 (19:04 -0400)]
net/bnx2x: replace macro with static function

Replace BNX2X_TLV_APPEND() with the clearer and safer bnx2x_add_tlv().
bnx2x_add_tlv() was previously prototyped at some point but can be static.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:56 +0000 (19:04 -0400)]
net/bnx2x: restrict Rx mask flags sent to the PF

Don't use bnx2x_fill_accept_flags() to fill the Rx mask in the VF,
since the PF only handles a subset of the existing flags. Now
bnx2x_fill_accept_flags() can be static.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:55 +0000 (19:04 -0400)]
net/bnx2x: remove unused Rx queue code

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:54 +0000 (19:04 -0400)]
net/bnx2x: remove delay during device startup

This 2.5s delay doesn't seem to serve any purpose other than being a
pause after logging the device configuration.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:53 +0000 (19:04 -0400)]
net/bnx2x: remove unused preprocessor code

ELINK_INCLUDE_EMUL and ELINK_INCLUDE_FPGA are never defined.  Remove them
along with enumeration constants dependent on their inclusion.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Chas Williams [Tue, 11 Oct 2016 23:04:52 +0000 (19:04 -0400)]
net/bnx2x: get cache line size from build configuration

Correctly hint the cache line size.  Remove unused macros associated
with the cache line size.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Nélio Laranjeiro [Tue, 11 Oct 2016 14:44:50 +0000 (16:44 +0200)]
net/mlx5: fix Rx function selection

mlx5_rx_queue_setup() was setting the Rx function by itself instead of
using priv_select_rx_function() written for that purpose.

Fixes: cdab90cb5c8d ("net/mlx5: add Tx/Rx burst function selection wrapper")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
John Daley [Wed, 24 Aug 2016 19:07:02 +0000 (12:07 -0700)]
net/enic: add ethernet VLAN packet type

Enic is capable of recognizing packets to be delivered to the
app with single VLAN tags. Advertise this with the ptype
RTE_PTYPE_L2_ETHER_VLAN and set the ptype for VLAN packets.

Signed-off-by: John Daley <johndale@cisco.com>
Wei Dai [Mon, 26 Sep 2016 01:23:36 +0000 (09:23 +0800)]
doc: add ixgbe supported chipsets and NICs

Get the list of all chipsets and NICs supported
by the ixgbe driver from
https://downloadcenter.intel.com/download/14687

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
John W. Linville [Thu, 29 Sep 2016 17:39:36 +0000 (13:39 -0400)]
net/bnxt: fix bit shift size

Some(?) compilers will treat the unmarked constant 1 as a 32-bit
integer, but the shift operation is in a loop that could run up to
63 times -- undefined behavior!

Coverity issue: 127546
Fixes: 778b759ba10e ("net/bnxt: add MAC address")

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
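
A small sketch of the shift-width issue described above. The helper `low_bits_mask` is hypothetical and is not the bnxt code; it only shows that giving the constant an explicit 64-bit type keeps every shift count from 0 to 63 well defined, whereas a plain `1` has type `int` and must not be shifted by the width of `int` or more.

```c
#include <stdint.h>

/* Hypothetical helper: return a mask with the low `nbits` bits set. */
static uint64_t low_bits_mask(unsigned int nbits)
{
    uint64_t mask = 0;
    unsigned int i;

    for (i = 0; i < nbits && i < 64; i++)
        mask |= UINT64_C(1) << i;   /* 64-bit constant: defined for i in 0..63 */

    return mask;
}
```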
John W. Linville [Thu, 29 Sep 2016 17:39:35 +0000 (13:39 -0400)]
net/i40e: do not use VSI before NULL check

Coverity issue: 127556
Fixes: 440499cf5376 ("net/i40e: support floating VEB")

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
John W. Linville [Thu, 29 Sep 2016 17:39:34 +0000 (13:39 -0400)]
net/bnxt: ensure entry length is unsigned

Otherwise, the inherent cast when multiplying entry_length by max_vnics
in the call to rte_memzone_reserve could promote max_vnics to a signed
value, causing hilarity to ensue...

Coverity issue: 127557
Fixes: 9738793f28ec ("net/bnxt: add VNIC functions and structs")

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
John W. Linville [Thu, 29 Sep 2016 17:39:33 +0000 (13:39 -0400)]
net/ena: improve safety of string handling

Use sizeof dest rather than sizeof src to limit the copy length,
and replace strncpy with snprintf to ensure NUL termination.

Coverity issue: 127795
Fixes: 372c1af5ed8f ("net/ena: add dedicated memory area for extra device info")

Signed-off-by: John W. Linville <linville@tuxdriver.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
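
The pattern described in the commit above can be sketched as follows. `copy_bounded` is a hypothetical helper, not a function from the ena driver; it bounds the copy by the destination size and relies on `snprintf` always terminating the output when the size is nonzero.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, bounded by the size of dst.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);

    /* snprintf returns the length it wanted to write, excluding the
     * terminator, so a value >= dst_size means truncation. */
    return n >= 0 && (size_t)n < dst_size;
}
```

A typical call would be `copy_bounded(info->field, sizeof(info->field), name)`, where `info->field` is an array, so that `sizeof` yields the destination capacity rather than a pointer size.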
Rasesh Mody [Thu, 6 Oct 2016 05:36:37 +0000 (22:36 -0700)]
net/bnx2x: fix socket id for slowpath memory

When the DMA allocation routine is invoked in the context of a non-EAL
thread, the API rte_lcore_id() returns -1 and indexing on that in
rte_lcore_to_socket_id() leads to segfault. The fix is to use
SOCKET_ID_ANY as the socket_id for all slowpath memory allocation.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
Rasesh Mody [Thu, 6 Oct 2016 05:36:36 +0000 (22:36 -0700)]
net/bnx2x: fix maximum PF queues

Fix the max number of PF rx/tx queues. Set the value based
on BNX2X_MAX_RSS_COUNT() rather than hard coding it to 128.

Fixes: 540a211084a7 ("bnx2x: driver core")

Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
Acked-by: Chas Williams <3chas3@gmail.com>
John Daley [Thu, 29 Sep 2016 20:56:39 +0000 (13:56 -0700)]
net/enic: extend flow director support for 1300 series

1300 series Cisco adapter firmware version 2.0(13) for UCS
C-series servers and 3.1(2) for blade servers supports more
filtering capabilities. The feature can be enabled via Cisco
CIMC or UCSM with the 'advanced filters' radio button. When
enabled, these additional flow director modes are available:
RTE_ETH_FLOW_NONFRAG_IPV4_OTHER
RTE_ETH_FLOW_NONFRAG_IPV4_SCTP
RTE_ETH_FLOW_NONFRAG_IPV6_UDP
RTE_ETH_FLOW_NONFRAG_IPV6_TCP
RTE_ETH_FLOW_NONFRAG_IPV6_SCTP
RTE_ETH_FLOW_NONFRAG_IPV6_OTHER

Changes:
- Detect and set an 'advanced filters' flag dependent on the adapter
  capability.
- Implement RTE_ETH_FILTER_INFO filter op to return the flow types
  available dependent on whether advanced filters are enabled.
- Use a function pointer to select how filters are added to the adapter:
  copy_fltr_v1() for older firmware/adapters or copy_fltr_v2() for
  adapters which support advanced filters.
- Apply fdir global masks to filters when in advanced filter mode.
- Update documentation.

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: Nelson Escobar <neescoba@cisco.com>
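
A sketch of selecting a filter-copy routine through a function pointer, as the change list above describes. The names `copy_fltr_v1` and `copy_fltr_v2` come from the commit message, but the `sk_` types, signatures and function bodies below are hypothetical and only illustrate the selection mechanism.

```c
#include <stdbool.h>

/* Hypothetical filter record; the real enic structures differ. */
struct sk_filter {
    int flow_type;
    int version;    /* which copy routine filled this in */
};

typedef void (*sk_copy_fltr_fn)(struct sk_filter *dst, int flow_type);

static void sk_copy_fltr_v1(struct sk_filter *dst, int flow_type)
{
    dst->flow_type = flow_type;
    dst->version = 1;   /* older firmware/adapters */
}

static void sk_copy_fltr_v2(struct sk_filter *dst, int flow_type)
{
    dst->flow_type = flow_type;
    dst->version = 2;   /* adapters with advanced filters enabled */
}

/* Choose the routine once, based on the detected adapter capability. */
static sk_copy_fltr_fn sk_select_copy_fltr(bool adv_filters)
{
    return adv_filters ? sk_copy_fltr_v2 : sk_copy_fltr_v1;
}
```

Selecting the pointer once at init time keeps the capability check out of the per-filter path.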
John Daley [Thu, 29 Sep 2016 20:56:38 +0000 (13:56 -0700)]
net/enic/base: update VIC adapter

Update the VIC adapter file which is common with the firmware and
other VIC drivers. This is needed to support new capabilities
for 1300 adapters, including advanced filtering, which is available
in VIC firmware version 2.0(13) for UCS rack servers and 3.1(2) for
blade servers.

Signed-off-by: John Daley <johndale@cisco.com>
John Daley [Thu, 29 Sep 2016 20:56:37 +0000 (13:56 -0700)]
net/enic: fix crash with removed flow director filters

When flow director filters were removed as an enic device was
stopped, the filters were freed but the pointer was not set to
NULL, so the next stop would try to free them again.

Fixes: fefed3d1e62c ("enic: new driver")

Signed-off-by: John Daley <johndale@cisco.com>
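
The double free described above follows a common pattern, sketched here with hypothetical names (`sk_fdir`, `sk_fdir_cleanup`) and plain `malloc`/`free` instead of the driver's allocator. Clearing the pointer after freeing makes a second cleanup call harmless.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-device flow director state. */
struct sk_fdir {
    int *nodes;     /* dynamically allocated filter storage */
};

static void sk_fdir_cleanup(struct sk_fdir *fdir)
{
    free(fdir->nodes);      /* free(NULL) is a no-op */
    fdir->nodes = NULL;     /* so a repeated stop does not free twice */
}
```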
John Daley [Thu, 29 Sep 2016 20:56:36 +0000 (13:56 -0700)]
net/enic: fix flow director

The wrong queue id was being used in the enic
flow director code after the scattered Rx feature
was added.

Fixes: 856d7ba7ed22 ("net/enic: support scattered Rx")

Signed-off-by: John Daley <johndale@cisco.com>
Nelson Escobar [Thu, 29 Sep 2016 20:55:05 +0000 (13:55 -0700)]
net/enic: document how to configure vNIC parameters

Update the enic guide to better explain how to set up vNIC parameters
on the Cisco VIC since the introduction of Rx scatter, and print an
error message for the case of having 1 RQ configured in the vNIC,
referring to the documentation for more information.

Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
David Marchand [Fri, 7 Oct 2016 13:04:13 +0000 (15:04 +0200)]
net/mlx: align drivers to latest naming convention

Fixes: 2f45703c17ac ("drivers: make driver names consistent")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:54 +0000 (14:05 +0200)]
net/thunderx: increase driver version to 2.0

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:52 +0000 (14:05 +0200)]
net/thunderx: document secondary queue set support

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:51 +0000 (14:05 +0200)]
net/thunderx: add final bits for secondary queue support

Summary:
 - add secondary qset support in device stats
 - add support for releasing mbufs from RBDR for >8 queues
 - add support for releasing mbufs from RX queues for >8 queues
 - support >8 queues in tx_queue_setup
 - support >8 queues in rx_queue_setup
 - support up to 96 queues per device (dev_info->max_rx_queues)
 - add secondary qset support in rbdr_rte_mempool_get
 - support >8 queues in multiprocess mode (do not reconfigure VFs)
 - setup periodic alarm accordingly for type of VFs:
   * primary VF   - handle events on queues and link status
   * secondary VF - handle events on queues
 - initialize hardware capabilities in secondary qsets

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:50 +0000 (14:05 +0200)]
net/thunderx: add secondary queue set support in configure

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:49 +0000 (14:05 +0200)]
net/thunderx: add secondary queue set support in start

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:48 +0000 (14:05 +0200)]
net/thunderx: add secondary queue set support in stop/close

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:47 +0000 (14:05 +0200)]
net/thunderx: add helpers for secondary queue set

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:46 +0000 (14:05 +0200)]
net/thunderx: remove private data to ethdev link

In multiprocess mode, a nicvf struct shared between processes cannot
use its eth_dev pointer to refer to the master device. Remove the pointer
along with all references to it, refactoring the code where needed.

This change fixes multiprocess issues detected in stats.

Fixes: 7413feee662d ("net/thunderx: add device start/stop and close")

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:45 +0000 (14:05 +0200)]
net/thunderx: add secondary queue set in interrupt functions

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:44 +0000 (14:05 +0200)]
net/thunderx: add functions to store qsets

These functions (nicvf_svf) are a DPDK specialization of the
base/nicvf_bsvf.[ch] ones.

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:43 +0000 (14:05 +0200)]
net/thunderx/base: add secondary queue set support

Changes:
 - add new message sqs_alloc in mailbox
 - add a queue container to hold secondary qsets
 - add nicvf_mbox_request_sqs
 - handle new mailbox messages for secondary queue set support
 - register secondary queue sets for further reuse
 - register the number of secondary queue sets in MSG_QS_CFG

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:42 +0000 (14:05 +0200)]
net/thunderx/base: add functions to store qsets

This interface (nicvf_bsvf) will be used for secondary queue set support.

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:41 +0000 (14:05 +0200)]
net/thunderx: fix Tx checksum handling

The symbols PKT_TX_TCP_CKSUM and PKT_TX_UDP_CKSUM are not bits in a
bitmask. Always set l3_offset for Tx offloads, not just when
PKT_TX_IP_CKSUM is set.

Fixes: 1c421f18e095 ("net/thunderx: add single and multi-segment Tx")

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
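
A sketch of why testing these symbols as single bits goes wrong. The `SK_` constants below are local stand-ins whose values are assumed to mirror the two-bit L4 checksum field of the mbuf Tx flags at the time; they are defined here so the example is self-contained and should be checked against rte_mbuf.h before relying on them.

```c
#include <stdint.h>

/* Assumed layout: a 2-bit field, not four independent bits. */
#define SK_TX_L4_NO_CKSUM (UINT64_C(0) << 52)
#define SK_TX_TCP_CKSUM   (UINT64_C(1) << 52)
#define SK_TX_SCTP_CKSUM  (UINT64_C(2) << 52)
#define SK_TX_UDP_CKSUM   (UINT64_C(3) << 52)
#define SK_TX_L4_MASK     (UINT64_C(3) << 52)

/* Buggy: treats the TCP value as a flag bit, so UDP (binary 11) matches too. */
static int sk_wants_tcp_cksum_buggy(uint64_t ol_flags)
{
    return (ol_flags & SK_TX_TCP_CKSUM) != 0;
}

/* Correct: mask out the field, then compare against the value. */
static int sk_wants_tcp_cksum(uint64_t ol_flags)
{
    return (ol_flags & SK_TX_L4_MASK) == SK_TX_TCP_CKSUM;
}
```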
Kamil Rytarowski [Fri, 30 Sep 2016 12:05:40 +0000 (14:05 +0200)]
net/thunderx: cleanup

Refactored features:
 - enable nicvf_qset_rbdr_precharge to handle secondary queue sets
 - rte_free already handles NULL pointer
 - check mempool flags to predict being contiguous in memory
 - prohibit the use of a mempool with multiple memory chunks
 - simplify local construct of accessing nb_rx_queues
 - enable NICVF_CAP_CQE_RX2 on CN88XX PASS2.0 hardware
 - remove redundant check for RSS size in nicvf_eth_dev_init

Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Signed-off-by: Zyta Szpak <zyta.szpak@semihalf.com>
Signed-off-by: Slawomir Rosek <slawomir.rosek@semihalf.com>
Signed-off-by: Radoslaw Biernacki <rad@semihalf.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Damjan Marion [Thu, 6 Oct 2016 06:38:10 +0000 (02:38 -0400)]
net/i40e: enable bad checksum flags in vector Rx

Decode the checksum flags from the Rx descriptor, setting
the appropriate bit in the mbuf ol_flags field when the flag
indicates a bad checksum.

Signed-off-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Damjan Marion [Thu, 6 Oct 2016 06:38:09 +0000 (02:38 -0400)]
net/i40e: add packet type metadata in vector Rx

The ptype is decoded from the Rx descriptor and stored
in the packet type field in the mbuf using the same function
in the non-vector driver.

Signed-off-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Acked-by: Qi Zhang <qi.z.zhang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Eric Kinzie [Thu, 4 Aug 2016 18:24:43 +0000 (11:24 -0700)]
net/bonding: enable slave VLAN filter

SR-IOV virtual functions cannot rely on promiscuous mode for the reception
of VLAN tagged frames. Program the VLAN filter for each slave when a
VLAN is configured for the bonding master.

Signed-off-by: Eric Kinzie <ehkinzie@gmail.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Eric Kinzie [Thu, 4 Aug 2016 18:24:42 +0000 (11:24 -0700)]
net/bonding: validate speed after link up

It's possible for the bonding driver to mistakenly reject an interface
based on its as yet unnegotiated link speed and duplex. Always allow
the interface to be added to the bonding interface, but require link
properties validation to succeed before the slave is activated.

Fixes: 2efb58cbab6e ("bond: new link bonding library")

Signed-off-by: Eric Kinzie <ehkinzie@gmail.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Jingjing Wu [Fri, 30 Sep 2016 06:46:23 +0000 (14:46 +0800)]
doc: add limitations for i40e PMD

This patch adds "Limitations or Known issues" section for
i40e PMD, including three items:
1. MPLS packet classification on X710/XL710
2. 16 Byte Descriptor cannot be used on DPDK VF
3. Link down with i40e kernel driver after DPDK application exits

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Nélio Laranjeiro [Wed, 28 Sep 2016 12:11:18 +0000 (14:11 +0200)]
net/mlx5: return RSS hash result in mbuf

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:38 +0000 (16:39 +0100)]
kni: move kernel version ifdefs to compat header

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:37 +0000 (16:39 +0100)]
kni: prefer uint32_t to unsigned int

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:36 +0000 (16:39 +0100)]
kni: update log messages

Remove some function entry logs and change the log level of some others.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:35 +0000 (16:39 +0100)]
kni: remove compile time debug configuration

Since KNI has switched to kernel dynamic debugging, it is possible to
remove the compile time debug log configuration.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:34 +0000 (16:39 +0100)]
kni: move functions to eliminate declarations

Function implementations are kept the same.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:33 +0000 (16:39 +0100)]
kni: remove unnecessary messages for out of memory

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:32 +0000 (16:39 +0100)]
kni: update kernel logging

Switch to dynamic logging functions. Depending on the kernel configuration,
this may cause previously visible logs to disappear.

How to enable dynamic logging:
https://www.kernel.org/doc/Documentation/dynamic-debug-howto.txt

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:31 +0000 (16:39 +0100)]
kni: prefer ether_addr_copy to memcpy

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:30 +0000 (16:39 +0100)]
kni: prefer min_t to min

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:29 +0000 (16:39 +0100)]
kni: enclose macros with complex values in parens

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:28 +0000 (16:39 +0100)]
kni: do not use assignment in if condition

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:27 +0000 (16:39 +0100)]
kni: move trailing statement on next line

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:26 +0000 (16:39 +0100)]
kni: move comparison constants on the right

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:25 +0000 (16:39 +0100)]
kni: remove useless return

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:24 +0000 (16:39 +0100)]
kni: prefer unsigned int to unsigned

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:23 +0000 (16:39 +0100)]
kni: fix spacing and line lengths

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:22 +0000 (16:39 +0100)]
kni: make static struct const

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:21 +0000 (16:39 +0100)]
kni: uninitialize global variables

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:20 +0000 (16:39 +0100)]
kni: move externs to the header file

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Vladyslav Buslov [Sat, 24 Sep 2016 13:13:02 +0000 (16:13 +0300)]
kni: support core id parameter in single threaded mode

Allow binding KNI thread to specific core in single threaded mode
by setting core_id and force_bind config parameters.

Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:40:45 +0000 (14:40 +0800)]
app/test: verify LPM tbl8 recycle

As a bug fix for LPM tbl8 recycling has been introduced,
add a test case to verify that a tbl8 group is correctly
freed when it only includes a rule with depth=24.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:42:37 +0000 (14:42 +0800)]
lpm: remove redundant check when adding rule

When a rule with depth > 24 is added into an existing
rule with depth <= 24, a new tbl8 is allocated and the existing
rule first fills the whole new tbl8. The valid field of
each entry in this tbl8 is therefore always true, and the depth of each
entry is always <= 24, before the new rule with depth > 24 is added.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:39:51 +0000 (14:39 +0800)]
lpm: fix freeing unused sub-table on rule delete

When all rules with depth > 24 are deleted from the same sub-table
(tbl8 group) and only a rule with depth <= 24 is left in it,
this sub-table (tbl8 group) should be recycled.

Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: af75078fece3 ("first public release")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
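
A simplified sketch of the recycle condition described above; it is not the rte_lpm implementation. The entry layout, the `sk_` names and the group size of 256 entries are assumptions made for illustration. A group qualifies when every entry is valid and carries the same depth, and that depth is at most 24, including the boundary case of exactly 24.

```c
#include <stdint.h>

#define SK_TBL8_GROUP_ENTRIES 256   /* assumed group size */
#define SK_MAX_DEPTH_TBL24    24

/* Hypothetical tbl8 entry; the real bit-field layout differs. */
struct sk_tbl8_entry {
    uint32_t next_hop;
    uint8_t valid;
    uint8_t depth;
};

/* Return 1 if the whole group is covered by a single rule of depth <= 24,
 * so it could be folded back into tbl24 and the group freed. */
static int sk_tbl8_can_recycle(const struct sk_tbl8_entry *group)
{
    int i;

    if (!group[0].valid || group[0].depth > SK_MAX_DEPTH_TBL24)
        return 0;

    for (i = 1; i < SK_TBL8_GROUP_ENTRIES; i++) {
        if (!group[i].valid || group[i].depth != group[0].depth)
            return 0;
    }
    return 1;
}
```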
John Ousterhout [Wed, 12 Oct 2016 19:38:32 +0000 (12:38 -0700)]
log: respect logger configured before EAL init

Before this patch, application-specific loggers could not be
installed before rte_eal_init completed (the initialization process
called rte_openlog_stream, overwriting any previously installed
logger). This made it impossible for an application to capture the
initial log messages generated during rte_eal_init. This patch changes
initialization so that information from a previous call to
rte_openlog_stream is not lost. Specifically:
* The default log stream is now maintained separately from an
  application-specific log stream installed with rte_openlog_stream.
* rte_eal_common_log_init has been renamed to eal_log_set_default,
  since this is all it does. It no longer invokes rte_openlog_stream; it
  just updates the default stream. Also, this method now returns void,
  rather than int, since there are no errors.
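
The split between the two streams can be pictured with the sketch below.
All names (sketch_set_default, sketch_open_stream, sketch_current_stream)
are hypothetical stand-ins for the EAL internals, and the lookup order
(application stream, then default stream, then stderr) is an assumption
based on the description in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-ins for the EAL's private logging state. */
static FILE *default_stream; /* role of the stream set by eal_log_set_default */
static FILE *app_stream;     /* role of the stream set by rte_openlog_stream */

/* Update only the default stream; the application stream is untouched. */
static void sketch_set_default(FILE *f)
{
	assert(f != NULL);
	default_stream = f;
}

/* Install (or, with NULL, remove) the application-specific stream. */
static void sketch_open_stream(FILE *f)
{
	app_stream = f;
}

/* Pick the stream a log message is written to. */
static FILE *sketch_current_stream(void)
{
	if (app_stream != NULL)
		return app_stream;     /* the application's choice always wins */
	if (default_stream != NULL)
		return default_stream; /* otherwise the EAL default */
	return stderr;                 /* nothing configured yet */
}
```

Because the initialization code only ever touches the default stream in
this model, a stream installed by the application before init stays in
effect.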

This patch also removes the "early log" mechanism and cleans up the
log initialization mechanism:
* The default log stream defaults to stderr on all platforms if
  eal_log_set_default hasn't been invoked (Linux used to use stdout
  during the first part of initialization).
* Removed rte_eal_log_early_init; all of the desired functionality can
  be achieved by calling eal_log_set_default.
* Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one
  function, rte_eal_log_init, which is not needed or invoked for BSD.
* Removed declaration for eal_default_log_stream in rte_log.h (it's now
  private to eal_common_log.c).
* Moved call to rte_eal_log_init earlier in rte_eal_init for Linux, so
  that it starts using the preferred log ASAP.

Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>
7 years agodoc: fix file argument of debug functions
Mauricio Vasquez B [Fri, 2 Sep 2016 11:01:51 +0000 (13:01 +0200)]
doc: fix file argument of debug functions

Previous patch updated the functions without updating all the comments.

Fixes: 591a9d7985c1 ("add FILE argument to debug functions")

Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agonet/virtio: support TSO
Olivier Matz [Thu, 13 Oct 2016 14:16:11 +0000 (16:16 +0200)]
net/virtio: support TSO

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: support LRO
Olivier Matz [Thu, 13 Oct 2016 14:16:10 +0000 (16:16 +0200)]
net/virtio: support LRO

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: support Tx checksum offload
Olivier Matz [Thu, 13 Oct 2016 14:16:09 +0000 (16:16 +0200)]
net/virtio: support Tx checksum offload

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: support Rx checksum offload
Olivier Matz [Thu, 13 Oct 2016 14:16:08 +0000 (16:16 +0200)]
net/virtio: support Rx checksum offload

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agoapp/testpmd: display LRO segment size
Olivier Matz [Thu, 13 Oct 2016 14:16:07 +0000 (16:16 +0200)]
app/testpmd: display LRO segment size

In csumonly engine, display the value of LRO segment if the
LRO flag is set.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agombuf: add flag for LRO
Olivier Matz [Thu, 13 Oct 2016 14:16:06 +0000 (16:16 +0200)]
mbuf: add flag for LRO

When receiving coalesced packets in virtio, the original size of the
segments is provided. This is useful information because it allows
resegmenting with the same size.

Add a new RX flag in mbuf that can be set when packets are coalesced by
a hardware or virtual driver, when the m->tso_segsz field is valid and is
set to the segment size of the original packets.

This flag is used in the next commits in the virtio pmd.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agombuf: add new Rx checksum flags
Olivier Matz [Thu, 13 Oct 2016 14:16:04 +0000 (16:16 +0200)]
mbuf: add new Rx checksum flags

Following discussions in [1] and [2], introduce a new bit to
describe the Rx checksum status in mbuf.

Before this patch, only one flag was available:
  PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK.

And same for L3:
  PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK.

This had 2 issues:
- it was not possible to differentiate "checksum good" from
  "checksum unknown".
- it was not possible for a virtual driver to say "the checksum
  in packet may be wrong, but data integrity is valid".

This patch tries to solve this issue by having 4 states (2 bits)
for the IP and L4 Rx checksums. New values are:

 - PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
   -> the application should verify the checksum by sw
 - PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
   -> the application can drop the packet without additional check
 - PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
   -> the application can accept the packet without verifying the
      checksum by sw
 - PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
   data, but the integrity of the L4 data is verified.
   -> the application can process the packet but must not verify the
      checksum by sw. It has to take care to recalculate the cksum
      if the packet is transmitted (either by sw or using tx offload)

  And same for L3 (replace L4 by IP in description above).
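
  A minimal sketch of how an application could act on the four states.
  The SK_ names and the bit positions are assumptions made for this
  illustration; the real values are those defined in rte_mbuf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions, chosen for illustration only. */
#define SK_RX_L4_CKSUM_BAD     (1ULL << 3)
#define SK_RX_L4_CKSUM_GOOD    (1ULL << 8)
#define SK_RX_L4_CKSUM_MASK    (SK_RX_L4_CKSUM_BAD | SK_RX_L4_CKSUM_GOOD)
#define SK_RX_L4_CKSUM_UNKNOWN 0ULL
#define SK_RX_L4_CKSUM_NONE    (SK_RX_L4_CKSUM_BAD | SK_RX_L4_CKSUM_GOOD)

enum sk_action {
	SK_VERIFY_IN_SW,        /* UNKNOWN: verify the checksum in software */
	SK_DROP,                /* BAD: drop without additional check */
	SK_ACCEPT,              /* GOOD: accept without verifying */
	SK_ACCEPT_RECALC_ON_TX  /* NONE: accept, recalculate if transmitted */
};

/* Map the two L4 checksum bits of the offload flags to an action. */
static enum sk_action sk_l4_cksum_action(uint64_t ol_flags)
{
	switch (ol_flags & SK_RX_L4_CKSUM_MASK) {
	case SK_RX_L4_CKSUM_BAD:
		return SK_DROP;
	case SK_RX_L4_CKSUM_GOOD:
		return SK_ACCEPT;
	case SK_RX_L4_CKSUM_NONE:
		return SK_ACCEPT_RECALC_ON_TX;
	default: /* SK_RX_L4_CKSUM_UNKNOWN */
		return SK_VERIFY_IN_SW;
	}
}
```

  The point of the mask is that the two bits are compared as one 2-bit
  value, so NONE is not mistaken for either BAD or GOOD.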

This commit tries to be compatible with existing applications that
only check the existing flag (CKSUM_BAD).

[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-June/040007.html

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet: add function to calculate checksum in mbuf
Olivier Matz [Thu, 13 Oct 2016 14:16:03 +0000 (16:16 +0200)]
net: add function to calculate checksum in mbuf

This function can be used to calculate the checksum of data embedded in
mbuf, that can be composed of several segments.
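
As an illustration of the idea, the sketch below computes the Internet
(one's complement) checksum over a chain of segments. struct sk_seg and
sk_chain_cksum() are hypothetical names standing in for a chained mbuf
and the new helper; this is not the function added by this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment type standing in for a chained mbuf. */
struct sk_seg {
	const uint8_t *data;
	size_t len;
	const struct sk_seg *next;
};

/*
 * Checksum `len` bytes starting `off` bytes into the chain.
 * Bytes at even positions of the summed range are the high byte of a
 * 16-bit word, so a word split across two segments is handled
 * naturally. Stores the complemented sum as a host-order value and
 * returns 0, or returns -1 if the chain is shorter than off + len.
 */
static int sk_chain_cksum(const struct sk_seg *seg, size_t off, size_t len,
			  uint16_t *cksum)
{
	uint64_t sum = 0;
	size_t pos = 0; /* position of the next byte within the summed range */

	assert(cksum != NULL);

	/* Skip the segments that lie entirely before the offset. */
	while (seg != NULL && off >= seg->len) {
		off -= seg->len;
		seg = seg->next;
	}

	while (seg != NULL && pos < len) {
		for (size_t i = off; i < seg->len && pos < len; i++, pos++)
			sum += (pos & 1) ? seg->data[i]
					 : (uint64_t)seg->data[i] << 8;
		off = 0;
		seg = seg->next;
	}
	if (pos < len)
		return -1;

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	*cksum = (uint16_t)~sum;
	return 0;
}
```

The result is the same whether the data sits in one segment or is split
at an odd byte boundary.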

This function will be used by the virtio pmd in next commits to calculate
the checksum in software in case the protocol is not recognized.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: reinitialize device when configuring
Olivier Matz [Thu, 13 Oct 2016 14:16:02 +0000 (16:16 +0200)]
net/virtio: reinitialize device when configuring

Add the ability to reset the virtio device in the configure callback
if the features flag changed since previous reset. This will be possible
with the introduction of offload support in next commits.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: move control queue configuration
Olivier Matz [Thu, 13 Oct 2016 14:16:01 +0000 (16:16 +0200)]
net/virtio: move control queue configuration

Move the configuration of control queue in the configure callback.
This is needed by next commit, which introduces the reinitialization
of the device in the configure callback to change the feature flags.
Therefore, the control queue will have to be restarted at the same
place.

As virtio_dev_cq_queue_setup() is called from a place where
config->max_virtqueue_pairs is not available, we need to store this in
the private structure. It replaces max_rx_queues and max_tx_queues which
have the same value. The log showing the value of max_rx_queues and
max_tx_queues is also removed since config->max_virtqueue_pairs is
already displayed above.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: move device initialization in a function
Olivier Matz [Thu, 13 Oct 2016 14:16:00 +0000 (16:16 +0200)]
net/virtio: move device initialization in a function

Move all code related to device initialization in a new function
virtio_init_device().

This commit brings no functional change, it prepares the next commits
that will add the offload support. For that, it will be needed to
reinitialize the device from ethdev->configure(), using this new
function.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agovhost: fix Windows VM hang
Zhihong Wang [Tue, 20 Sep 2016 02:00:12 +0000 (22:00 -0400)]
vhost: fix Windows VM hang

This patch fixes a Windows VM compatibility issue in the DPDK 16.07 vhost
code, which causes the guest to hang once any packets are enqueued when
mrg_rxbuf is turned on. The fix is to set the right id and len in the used
ring.

As defined in virtio spec 0.95 and 1.0, in each used ring element, id means
the index of the start of the used descriptor chain, and len means the total
length of the descriptor chain which was written to. In the 16.07 code,
however, the index of the last descriptor is assigned to id, and the length
of the last descriptor is assigned to len.
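
The rule can be sketched as below. The struct layouts follow the
descriptor and used ring element layouts of the virtio spec, but the sk_
names and the helper are hypothetical, and the sketch assumes for
simplicity that every descriptor in the chain was written in full.

```c
#include <assert.h>
#include <stdint.h>

#define SK_DESC_F_NEXT 1 /* chain continues via the `next` field */

struct sk_desc {
	uint64_t addr;
	uint32_t len;
	uint16_t flags;
	uint16_t next;
};

struct sk_used_elem {
	uint32_t id;  /* index of the start (head) of the used chain */
	uint32_t len; /* total length written to the chain */
};

/*
 * Build the used ring element for the chain starting at `head`:
 * id is the head index and len is the sum over the whole chain, not
 * the index and length of the last descriptor.
 */
static struct sk_used_elem sk_fill_used(const struct sk_desc *table,
					uint16_t table_size, uint16_t head)
{
	struct sk_used_elem e = { .id = head, .len = 0 };
	uint16_t idx = head;
	uint16_t walked = 0;

	for (;;) {
		/* Guard against out-of-range indexes and looping chains. */
		assert(idx < table_size && walked < table_size);
		e.len += table[idx].len;
		walked++;
		if (!(table[idx].flags & SK_DESC_F_NEXT))
			break;
		idx = table[idx].next;
	}
	return e;
}
```

For a chain 2 -> 0 -> 3 with lengths 100, 200 and 50, this yields id 2
and len 350, where the 16.07 behavior described above would report id 3
and len 50.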

How to test?

 1. Start testpmd in the host with a vhost port.

 2. Start a Windows VM image with qemu and connect to the vhost port.

 3. Start io forwarding with tx_first in host testpmd.

For 16.07 code, the Windows VM will hang once any packets are enqueued.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/vhost: add an option to enable dequeue zero copy
Yuanhan Liu [Sun, 9 Oct 2016 07:28:00 +0000 (15:28 +0800)]
net/vhost: add an option to enable dequeue zero copy

Add an option, dequeue-zero-copy, to enable this feature in vhost-pmd.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agoexamples/vhost: add --dequeue-zero-copy option
Yuanhan Liu [Sun, 9 Oct 2016 07:27:59 +0000 (15:27 +0800)]
examples/vhost: add --dequeue-zero-copy option

Add an option, --dequeue-zero-copy, to enable dequeue zero copy.

One thing worth noting while using dequeue zero copy is the nb_tx_desc
has to be small enough so that the eth driver will hit the mbuf free
threshold easily and thus free mbuf more frequently.

The reason behind that is, when dequeue zero copy is enabled, guest Tx
used vring will be updated only when corresponding mbuf is freed. If mbuf
is not freed frequently, the guest Tx vring could be starved.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agovhost: add a flag to enable dequeue zero copy
Yuanhan Liu [Sun, 9 Oct 2016 07:27:58 +0000 (15:27 +0800)]
vhost: add a flag to enable dequeue zero copy

Dequeue zero copy is disabled by default. Add a new flag,
``RTE_VHOST_USER_DEQUEUE_ZERO_COPY``, to explicitly enable it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agovhost: add dequeue zero copy
Yuanhan Liu [Sun, 9 Oct 2016 07:27:57 +0000 (15:27 +0800)]
vhost: add dequeue zero copy

The basic idea of dequeue zero copy is, instead of copying data from
the desc buf, here we let the mbuf reference the desc buf addr directly.

Doing so, however, has one major issue: we can't update the used ring
at the end of rte_vhost_dequeue_burst. Because we don't do the copy
here, an update of the used ring would let the driver reclaim the
desc buf. As a result, DPDK might reference a stale memory region.

To update the used ring properly, this patch does several tricks:

- When an mbuf references a desc buf, refcnt is incremented by 1.

  This pins the mbuf, so that an mbuf free from DPDK
  won't actually free it; instead, refcnt is decremented by 1.

- We chain all those mbufs together (by tailq).

  And we check the chain every time on the rte_vhost_dequeue_burst entrance,
  to see if the mbuf is freed (when refcnt equals 1). If that
  happens, it means we are the last user of this mbuf and we are
  safe to update the used ring.

- "struct zcopy_mbuf" is introduced, to associate an mbuf with the
  right desc idx.
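
The three tricks can be sketched together as below. This is a simplified
model, not the patch's code: struct sk_mbuf and the sk_ helpers are
hypothetical, and a singly linked list is used in place of the tailq.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, minimal mbuf: only the reference count matters here. */
struct sk_mbuf {
	uint16_t refcnt;
};

/* Plays the role of "struct zcopy_mbuf": ties an mbuf to a desc idx. */
struct sk_zcopy_mbuf {
	struct sk_mbuf *mbuf;
	uint16_t desc_idx;
	struct sk_zcopy_mbuf *next;
};

/* Pin the mbuf when it starts referencing a desc buf. */
static void sk_zcopy_pin(struct sk_zcopy_mbuf *z, struct sk_mbuf *m,
			 uint16_t desc_idx)
{
	m->refcnt++;
	z->mbuf = m;
	z->desc_idx = desc_idx;
	z->next = NULL;
}

/* An application-side free only drops one reference. */
static void sk_mbuf_free(struct sk_mbuf *m)
{
	assert(m->refcnt > 0);
	m->refcnt--;
}

/*
 * Run on dequeue entrance: unlink every entry whose mbuf is held only
 * by the pin (refcnt == 1) and report its desc idx in `done`, so the
 * caller can update the used ring. Returns the number of entries
 * reported.
 */
static size_t sk_zcopy_reap(struct sk_zcopy_mbuf **head, uint16_t *done,
			    size_t max)
{
	struct sk_zcopy_mbuf **pp = head;
	size_t n = 0;

	while (*pp != NULL && n < max) {
		struct sk_zcopy_mbuf *z = *pp;

		if (z->mbuf->refcnt == 1) {
			done[n++] = z->desc_idx;
			z->mbuf->refcnt = 0; /* drop the pin */
			*pp = z->next;       /* unlink the entry */
		} else {
			pp = &z->next;
		}
	}
	return n;
}
```

Only descriptors reported by the reap step are safe to return to the
guest, since no mbuf still points into their buffers.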

Dequeue zero copy is introduced for performance reasons, and some rough
tests show about a 50% performance boost for a packet size of 1500B. For
small packets (e.g. 64B), it actually slows things down a bit (by up to
15%). That is expected because this patch introduces some extra work,
which outweighs the benefit of saving a copy of a few bytes.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agovhost: introduce last available index for dequeue
Yuanhan Liu [Sun, 9 Oct 2016 07:27:56 +0000 (15:27 +0800)]
vhost: introduce last available index for dequeue

So far, we retrieve both the used ring and avail ring idx from the var
last_used_idx; this has not been a problem because the used ring is updated
immediately after those avail entries are consumed.

But that's not true when dequeue zero copy is enabled: the used ring is
updated only when the mbuf is consumed. Thus, we need to use another var to
note the last avail ring idx we have consumed.

Therefore, last_avail_idx is introduced.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agovhost: get guest/host physical address mappings
Yuanhan Liu [Sun, 9 Oct 2016 07:27:55 +0000 (15:27 +0800)]
vhost: get guest/host physical address mappings

So that we can convert a guest physical address to host physical
address, which will be used in later Tx zero copy implementation.

MAP_POPULATE is set while mmapping guest memory regions, to make
sure the page tables are set up so that rte_mem_virt2phy() can
yield a proper physical address.
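
A guest-to-host physical address lookup over such mappings could look
like the sketch below. struct sk_guest_page and sk_gpa_to_hpa() are
hypothetical names for this illustration, and returning 0 for "no
mapping" is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one physically contiguous guest range. */
struct sk_guest_page {
	uint64_t guest_phys_addr;
	uint64_t host_phys_addr;
	uint64_t size;
};

/*
 * Translate a guest physical address to a host physical address.
 * The whole [gpa, gpa + len) range must fall inside one mapping,
 * because two adjacent guest ranges need not be adjacent on the host.
 * Returns 0 when no single mapping covers the range.
 */
static uint64_t sk_gpa_to_hpa(const struct sk_guest_page *pages, size_t n,
			      uint64_t gpa, uint64_t len)
{
	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		const struct sk_guest_page *p = &pages[i];

		if (gpa >= p->guest_phys_addr && len <= p->size &&
		    gpa - p->guest_phys_addr <= p->size - len)
			return gpa - p->guest_phys_addr + p->host_phys_addr;
	}
	return 0;
}
```

A caller that gets 0 back would have to fall back to copying, since the
buffer cannot be referenced as one physically contiguous area.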

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agovhost: simplify memory regions handling
Yuanhan Liu [Sun, 9 Oct 2016 07:27:54 +0000 (15:27 +0800)]
vhost: simplify memory regions handling

For historical reasons (vhost-cuse came before vhost-user), some
fields for maintaining the vhost-user memory mappings (such as the mmapped
address and size, which are needed to unmap on destroy) are kept in the
"orig_region_map" struct, a structure that is defined only in the vhost-user
source file.

The right way to go is to remove the structure and move all those fields
into the virtio_memory_region struct. But we simply couldn't do that before,
because it would have broken the ABI.

Now, thanks to the ABI refactoring, this is no longer a blocking issue.
So here it goes: this patch removes orig_region_map and
redefines virtio_memory_region to include all necessary info.

With that, we can simplify the guest/host address conversion a bit.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
7 years agonet/virtio: support IOMMU platform
Jason Wang [Wed, 28 Sep 2016 08:25:12 +0000 (16:25 +0800)]
net/virtio: support IOMMU platform

Negotiate VIRTIO_F_IOMMU_PLATFORM to have IOMMU support.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: support modern device id
Jason Wang [Wed, 28 Sep 2016 08:25:11 +0000 (16:25 +0800)]
net/virtio: support modern device id

Add modern device id and rename VIRTIO_PCI_DEVICEID_MIN to
VIRTIO_PCI_LEGACY_DEVICEID_NET. While at it, remove unused macros too.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: add missing driver name
David Marchand [Fri, 7 Oct 2016 13:03:13 +0000 (15:03 +0200)]
net/virtio: add missing driver name

The driver name has been lost with the eal rework.
Restore it.

Fixes: c830cb295411 ("drivers: use PCI registration macro")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: set MTU
Souvik Dey [Sun, 9 Oct 2016 03:38:26 +0000 (11:38 +0800)]
net/virtio: set MTU

Virtio interfaces do not currently allow the user to specify a particular
Maximum Transmission Unit (MTU). Consequently, the MTU of Virtio interfaces
is typically set to the Ethernet default value of 1500.
This is problematic in cloud deployments, in which a specific
(and potentially non-standard) MTU needs to be set by a DHCP server and
honored by all interfaces across the traffic path. To achieve
this, Virtio interfaces should support setting the MTU.
When GRE/VXLAN tunneling is used for internal communication, the
infrastructure adds overhead to the packet over and
above the maximum Ethernet frame size of 1518. To take care of this
overhead, the DHCP server corrects the L3 MTU to 1454. But since virtio
interfaces did not have the MTU set functionality, the MTU sent by the
DHCP server was ignored and the instance would still send packets with an
MTU of 1500, which after encapsulation become larger than 1518 and are
eventually dropped in the infrastructure.
By adding a 'set_mtu' function to the Virtio driver, we can
honor the MTU sent by the DHCP server. The DHCP server/controller can
then leverage this 'set_mtu' functionality to resolve the above
mentioned issue of packets getting dropped due to incorrect size.

Signed-off-by: Souvik Dey <sodey@sonusnet.com>
Reviewed-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agodoc: fix typo in pdump guide
Mark Kavanagh [Thu, 6 Oct 2016 10:36:36 +0000 (11:36 +0100)]
doc: fix typo in pdump guide

- Fix copy/paste error in description of how to capture both rx
  & tx traffic in a single pcap file
- Replace duplicate word with what original author presumably
  intended, such that description now makes sense

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: clarify usage of testpmd MAC forward mode
Mark Kavanagh [Fri, 9 Sep 2016 16:15:52 +0000 (17:15 +0100)]
doc: clarify usage of testpmd MAC forward mode

Explain default testpmd behavior in mac fwd mode to remove
ambiguity/confusion regarding the user's ability to specify Ethernet
addresses.

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: add xstats commands in testpmd guide
Maryam Tahhan [Wed, 7 Sep 2016 10:45:57 +0000 (11:45 +0100)]
doc: add xstats commands in testpmd guide

Update the testpmd user guide with instructions for retrieving extended
NIC statistics.

Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agoapp/testpmd: support 25G and 50G speeds
Ajit Khaparde [Wed, 12 Oct 2016 21:26:31 +0000 (16:26 -0500)]
app/testpmd: support 25G and 50G speeds

Support for configuring 25G and 50G speeds is missing from testpmd; add it.
This patch also updates the testpmd user guide accordingly.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>