Adrien Mazarguil [Fri, 24 Jun 2016 13:18:00 +0000 (15:18 +0200)]
net/mlx5: re-add Tx gather support
Compared to its previous incarnation, the software limit on the number of
mbuf segments is gone (previously MLX5_PMD_SGE_WR_N, set to 4 by
default), hence there is no need for the linearization code and related
buffers that permanently consumed a non-negligible amount of memory to
handle oversized mbufs.
The resulting code is both lighter and faster.
With the addition of this code, older GCC versions (such
as 4.8.5) may complain about 'wqe' variable being uninitialized, so
initialize it preemptively, even though it is not necessary to do so.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:57 +0000 (15:17 +0200)]
net/mlx5: support multi-packet send
This feature enables the Tx burst function to emit up to 5 packets using
only two work queue entries (WQEs) on devices that support it. This saves
PCI bandwidth and improves performance.
Yaacov Hazan [Fri, 24 Jun 2016 13:17:56 +0000 (15:17 +0200)]
net/mlx5: support inline send
Implement the inline send feature, which copies packet data directly into
work queue entries (WQEs) for improved latency. The maximum packet
size and the minimum number of Tx queues to qualify for inline send
are user-configurable.
This feature is effective when HW causes a performance bottleneck.
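One possible reading of the two user-configurable limits is sketched below. The structure `inline_cfg_sketch`, its field names and the function `pkt_qualifies_for_inline` are hypothetical and are not taken from the mlx5 PMD; the sketch only assumes that a packet is inlined when enough Tx queues are configured and the packet is small enough.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the two user-configurable limits. */
struct inline_cfg_sketch {
	uint32_t max_inline_len; /* largest packet eligible for inlining; 0 disables */
	uint32_t min_txqs;       /* inline only when at least this many Tx queues exist */
};

/* Return true when a packet of pkt_len bytes would be copied into the WQE. */
static bool
pkt_qualifies_for_inline(const struct inline_cfg_sketch *cfg,
			 uint32_t nb_txqs, uint32_t pkt_len)
{
	/* Feature disabled, or too few Tx queues for inlining to pay off. */
	if (cfg->max_inline_len == 0 || nb_txqs < cfg->min_txqs)
		return false;
	/* Only packets that fit within the configured limit are inlined. */
	return pkt_len <= cfg->max_inline_len;
}
```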
Adrien Mazarguil [Fri, 24 Jun 2016 13:17:55 +0000 (15:17 +0200)]
net/mlx5: replace countdown with threshold for Tx completions
Replacing the variable countdown (which depends on the number of
descriptors) with a fixed relative threshold known at compile time improves
performance by reducing the TX queue structure footprint and the amount of
code to manage completions during a burst.
Completions are now requested at most once per burst, after the threshold
is reached.
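A minimal sketch of the threshold idea follows. The names `txq_sketch`, `TX_COMP_THRESH` and `txq_burst_needs_completion`, as well as the value 32, are assumptions made for illustration; the commit message does not state the actual constant or field layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical compile-time threshold; the real value is not given here. */
#define TX_COMP_THRESH 32u

/* Hypothetical reduced Tx queue state: one counter, no per-queue countdown. */
struct txq_sketch {
	uint16_t elts_comp; /* packets posted since the last completion request */
};

/*
 * Account for a burst of `sent` packets and report whether the last work
 * request of that burst should ask for a completion. The check runs once
 * per burst, so at most one completion is requested per burst.
 */
static bool
txq_burst_needs_completion(struct txq_sketch *txq, uint16_t sent)
{
	uint16_t comp = (uint16_t)(txq->elts_comp + sent);

	if (comp >= TX_COMP_THRESH) {
		txq->elts_comp = 0; /* restart counting after the request */
		return true;
	}
	txq->elts_comp = comp; /* below the threshold: keep accumulating */
	return false;
}
```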
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:54 +0000 (15:17 +0200)]
net/mlx5: handle Rx CQE compression
Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.
This patch does not perform the entire decompression step, as that would
be really expensive. Instead, the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.
Intermediate empty CQE64 entries are handed back to HW without further
processing.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:50 +0000 (15:17 +0200)]
net/mlx5: add support for configuration through kvargs
The intent is to replace the remaining compile-time options and environment
variables with a common means of runtime configuration. This commit only
adds the kvargs handling code; subsequent commits will update the rest.
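To illustrate the kind of "key=value,key=value" string such runtime configuration relies on, here is a standalone sketch. It does not use the DPDK kvargs library; `devargs_get_ulong` and the key names in the usage note are hypothetical stand-ins chosen for this example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up `key` in a "k1=v1,k2=v2" string. On success, store the numeric
 * value in *out and return 0. Return -1 when the key is absent or its
 * value is not a complete number.
 */
static int
devargs_get_ulong(const char *args, const char *key, unsigned long *out)
{
	size_t klen = strlen(key);
	const char *p = args;

	while (p != NULL && *p != '\0') {
		const char *end = strchr(p, ',');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		/* Match "key=" exactly, not just a common prefix. */
		if (len > klen && strncmp(p, key, klen) == 0 && p[klen] == '=') {
			const char *val = p + klen + 1;
			char *stop;
			unsigned long v = strtoul(val, &stop, 0);

			/* Reject empty values and trailing garbage. */
			if (stop == val || stop != p + len)
				return -1;
			*out = v;
			return 0;
		}
		p = end ? end + 1 : NULL; /* advance to the next pair */
	}
	return -1;
}
```

For example, with the made-up string "alpha=1,beta=0x20", looking up "beta" stores 32 in the output argument, while looking up a key that is not present returns -1.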
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:48 +0000 (15:17 +0200)]
net/mlx5: update prerequisites for upcoming enhancements
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:43 +0000 (15:17 +0200)]
net/mlx5: remove Rx scatter support
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:42 +0000 (15:17 +0200)]
net/mlx5: remove Tx gather support
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:41 +0000 (15:17 +0200)]
net/mlx5: split memory registration function
Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.
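The split can be pictured as follows. This is a sketch under stated assumptions: `txq_mr_cache`, `mr_lookup`, `mr_register_slow`, the cache size and the fake lkey values are all hypothetical, and the slow path only simulates registration instead of performing system calls. The `noinline` attribute is GCC/Clang syntax.

```c
#include <stddef.h>
#include <stdint.h>

#define MR_CACHE_N 8 /* hypothetical cache size */

struct mr_entry {
	const void *mp; /* memory pool described by this entry, NULL if unused */
	uint32_t lkey;  /* key obtained when the pool was registered */
};

struct txq_mr_cache {
	struct mr_entry mr[MR_CACHE_N];
	unsigned int slow_calls; /* slow-path invocations, for illustration */
};

/* Slow path: stands in for the real registration, kept out of line. */
static uint32_t __attribute__((noinline))
mr_register_slow(struct txq_mr_cache *c, const void *mp, unsigned int idx)
{
	c->slow_calls++;
	if (idx == MR_CACHE_N)
		idx = 0; /* cache full: overwrite the first slot (simplistic) */
	c->mr[idx].mp = mp;
	c->mr[idx].lkey = 0x1000u + idx; /* fake key for the sketch */
	return c->mr[idx].lkey;
}

/* Fast path: only the lookup stays in the inlined data path function. */
static inline uint32_t
mr_lookup(struct txq_mr_cache *c, const void *mp)
{
	unsigned int i;

	for (i = 0; i != MR_CACHE_N; ++i) {
		if (c->mr[i].mp == NULL)
			break; /* first unused slot: pool not cached yet */
		if (c->mr[i].mp == mp)
			return c->mr[i].lkey; /* cache hit, the common case */
	}
	return mr_register_slow(c, mp, i);
}
```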
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:40 +0000 (15:17 +0200)]
net: fix PCI class id
Use the RTE_PCI_DEVICE macro to set all fields rather than setting them
individually in the code. This shortens the code and helps protect it
against future changes to the rte_pci_id structure.
Fixes: 701c8d80c820 ("pci: support class id probing") Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
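The benefit of initializing through a macro can be shown with a stand-in. The structure `pci_id_sketch` and the macro `SKETCH_PCI_DEVICE` below are hypothetical and only mimic the idea; they are not the DPDK definitions, and the vendor/device values are merely illustrative.

```c
#include <stdint.h>

#define SKETCH_ANY_ID       0xffffu   /* wildcard for 16-bit ID fields */
#define SKETCH_CLASS_ANY_ID 0xffffffu /* wildcard for the class field */

/* Hypothetical stand-in for a PCI ID table entry. */
struct pci_id_sketch {
	uint32_t class_id;
	uint16_t vendor_id;
	uint16_t device_id;
	uint16_t subsystem_vendor_id;
	uint16_t subsystem_device_id;
};

/*
 * One macro fills every field. If a field is added to the structure later,
 * only this macro changes and every driver table picks up the new default.
 */
#define SKETCH_PCI_DEVICE(vend, dev) \
	.class_id = SKETCH_CLASS_ANY_ID, \
	.vendor_id = (vend), \
	.device_id = (dev), \
	.subsystem_vendor_id = SKETCH_ANY_ID, \
	.subsystem_device_id = SKETCH_ANY_ID

static const struct pci_id_sketch id_map_sketch[] = {
	{ SKETCH_PCI_DEVICE(0x15b3, 0x1013) }, /* illustrative IDs */
	{ .vendor_id = 0 },                    /* sentinel entry */
};
```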
Beilei Xing [Thu, 23 Jun 2016 15:11:58 +0000 (23:11 +0800)]
net/ixgbe: fix single VLAN tag to be outer VLAN tag
Previously, a single VLAN header was treated as the inner VLAN header,
but generally, a single VLAN header is treated as the outer
VLAN header.
The patch fixes the ether type of a single VLAN type, and
enables configuring inner and outer TPID for double VLAN.
Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Beilei Xing [Wed, 22 Jun 2016 02:53:51 +0000 (10:53 +0800)]
net/i40e: fix single VLAN tag to be outer VLAN tag
In the current i40e codebase, if a single VLAN header is added to a packet,
it is treated as the inner VLAN header. Generally, a single VLAN header is
treated as the outer VLAN header, so update the driver behaviour
accordingly.
Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Nelson Escobar [Thu, 16 Jun 2016 19:19:05 +0000 (12:19 -0700)]
net/enic: support scattered Rx
For performance reasons, this patch uses 2 VIC RQs per RQ presented to
DPDK.
The VIC requires that each descriptor be marked as either a start of
packet (SOP) descriptor or a non-SOP descriptor. A one RQ solution
requires skipping descriptors when receiving small packets and results
in bad performance when receiving many small packets.
The 2 RQ solution makes use of the VIC feature that allows a receive
on primary queue to 'spill over' into another queue if the receive is
too large to fit in the buffer assigned to the descriptor on the
primary queue. This means that there is no skipping of descriptors
when receiving small packets and results in much better performance.
Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Beilei Xing [Thu, 16 Jun 2016 13:36:28 +0000 (21:36 +0800)]
net/e1000: configure outer VLAN TPID field
This patch enables configuring the outer TPID for double VLAN.
Note that all other TPID values, for single VLANs or inner VLAN in the
QinQ case, are read only.
Sony Chacko [Thu, 16 Jun 2016 05:47:10 +0000 (22:47 -0700)]
net/qede: enable VF-VF traffic with unmatched dest address
This patch enables VF to VF traffic with unmatched destination addresses.
The steps to enable this are:
- Enable promiscuous mode filter settings.
- Check for VF mode and enable promiscuous mode settings for VF.
- Check filter configuration to ensure conflicting filter modes
are not set.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Harish Patil [Thu, 16 Jun 2016 05:47:09 +0000 (22:47 -0700)]
net/qede: support 100G
- Add device id to the PCI table
- Add polling for the slowpath events for CMT mode device
- Add prerequisites to allow 100g mode
* Min number of queues needed is 2
* Only even number of queues are allowed
- Update documentation
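The two queue-count prerequisites listed above can be sketched as a simple check. The function name `queue_count_ok_for_100g` is hypothetical and is not taken from the qede PMD.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true when nb_queues satisfies both stated 100G prerequisites. */
static bool
queue_count_ok_for_100g(uint16_t nb_queues)
{
	if (nb_queues < 2)
		return false;            /* at least 2 queues are needed */
	return (nb_queues % 2) == 0; /* only even queue counts are allowed */
}
```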
The ixgbe PMD Rx function(s) miss some packet types that are:
- correctly recognised by the underlying HW.
- marked as supported by ixgbe_dev_supported_ptypes_get().
Fixes: 9586ebd358d5 ("ixgbe: replace some offload flags with packet type") Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Tested-by: Olivier Matz <olivier.matz@6wind.com>
Nelson Escobar [Tue, 14 Jun 2016 23:54:05 +0000 (16:54 -0700)]
net/enic: improve packet type identification
- add l4 ptypes to the ones we report as supporting
- report/use RTE_PTYPE_L3_IPV4_EXT_UNKNOWN and
RTE_PTYPE_L3_IPV6_EXT_UNKNOWN instead of RTE_PTYPE_L3_IPV4 and
RTE_PTYPE_L3_IPV6 as the VIC can't distinguish between packets with
extensions and those without extensions.
- correctly set the ptype bits on packets that are both tcp/udp
and a frag
- set RTE_PTYPE_L4_NONFRAG on ip packets we know are not udp, tcp,
or fragments.
Fixes: 947d860c821f ("enic: improve Rx performance") Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Zhe Tao [Tue, 14 Jun 2016 05:24:16 +0000 (13:24 +0800)]
net/i40e: fix offload flags for vector Rx
The flags for RSS and flow director are not set correctly in the vector
Rx function, so applications which use these flags will not work
correctly.
The problem is caused by incorrect constants for masking,
shuffling and shifting the descriptor bytes to create the resultant
flags in the mbuf. Correcting the constants fixes the problem.
Fixes: 9ed94e5bb04e ("i40e: add vector Rx") Signed-off-by: Zhe Tao <zhe.tao@intel.com> Reviewed-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com> Acked-by: Jingjing Wu <jingjing.wu@intel.com>
If MSIX is available, the vector count given by the table size is one
less than the actual count. This count also limits the receive and
transmit queue resources the VF can support.
Fixes: 540a211084a7 ("bnx2x: driver core") Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com> Acked-by: Harish Patil <harish.patil@qlogic.com>
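The off-by-one comes from the way the MSI-X capability encodes its table size: the Table Size field (bits 10:0 of the Message Control register) holds N - 1 for a table of N vectors. A minimal sketch, with the hypothetical helper name `msix_vector_count`:

```c
#include <stdint.h>

/* Table Size occupies bits 10:0 of the MSI-X Message Control register. */
#define MSIX_CTRL_TABLE_SIZE_MASK 0x07ffu

/* Convert the raw Message Control value into the actual vector count. */
static uint16_t
msix_vector_count(uint16_t msg_ctrl)
{
	/* The field is encoded as N - 1, so add one to get the real count. */
	return (uint16_t)((msg_ctrl & MSIX_CTRL_TABLE_SIZE_MASK) + 1u);
}
```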
Eric Kinzie [Fri, 27 May 2016 19:44:05 +0000 (12:44 -0700)]
net/bonding: allow external state machine in mode 4
Provide functions to allow an external 802.3ad state machine to transmit
and receive LACPDUs and to set the collection/distribution flags on
slave interfaces.
Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Declan Doherty <declan.doherty@intel.com>
Eric Kinzie [Sat, 7 May 2016 03:45:24 +0000 (20:45 -0700)]
net/bonding: inherit maximum Rx packet length
Instead of a hard-coded maximum receive length, allow the bonded interface
to inherit this limit from the slave interfaces. This allows
an application that uses jumbo frames to pass realistic values to
rte_eth_dev_configure without causing an error.
Before the bonding interface is configured, allow slaves with any
max_rx_pktlen to be added and remember the lowest of these values as
a candidate value. During dev_configure, set the bond device's
max_rx_pktlen to the candidate value. After this point only slaves
with a max_rx_pktlen greater or equal to that of the bonding device
can be added.
If all slaves are removed, the bond device's pktlen is cleared.
Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Declan Doherty <declan.doherty@intel.com>
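The inheritance rules described above can be sketched as a small state machine. Everything here (`bond_sketch`, `bond_add_slave`, `bond_configure`, `bond_remove_slave`) is a hypothetical model for illustration and not the bonding PMD's code. What happens to the configured state once all slaves are removed is not described in the commit message; the sketch leaves it untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the bonded device state relevant to this change. */
struct bond_sketch {
	bool configured;                  /* dev_configure has run */
	uint32_t candidate_max_rx_pktlen; /* lowest slave limit seen; 0 = none */
	uint32_t max_rx_pktlen;           /* limit exposed by the bond device */
	unsigned int nb_slaves;
};

/* Add a slave with the given limit; return 0 on success, -1 if rejected. */
static int
bond_add_slave(struct bond_sketch *b, uint32_t slave_len)
{
	if (b->configured) {
		/* After configuration the slave must cover the bond's limit. */
		if (slave_len < b->max_rx_pktlen)
			return -1;
	} else if (b->candidate_max_rx_pktlen == 0 ||
		   slave_len < b->candidate_max_rx_pktlen) {
		/* Before configuration, remember the lowest value seen. */
		b->candidate_max_rx_pktlen = slave_len;
	}
	b->nb_slaves++;
	return 0;
}

/* Configuration step: the candidate becomes the bond device's limit. */
static void
bond_configure(struct bond_sketch *b)
{
	b->max_rx_pktlen = b->candidate_max_rx_pktlen;
	b->configured = true;
}

/* Remove one slave; clear the packet length when none are left. */
static void
bond_remove_slave(struct bond_sketch *b)
{
	if (b->nb_slaves == 0)
		return;
	if (--b->nb_slaves == 0) {
		b->candidate_max_rx_pktlen = 0;
		b->max_rx_pktlen = 0;
	}
}
```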
Jerin Jacob [Fri, 17 Jun 2016 13:29:31 +0000 (18:59 +0530)]
net/thunderx/base: add mailbox for PF/VF communication
DPDK nicvf driver doesn't have access to NIC's PF address space.
Introduce a mailbox mechanism to communicate with PF driver through
shared 128bit register interface.
Xiao Wang [Mon, 6 Jun 2016 09:00:47 +0000 (17:00 +0800)]
net/fm10k: fix promiscuous receive for VF
When app tries to change promisc/allmulti setting, fm10k will check if a
valid glort is acquired, and exit without doing anything if not.
For VFs, this glort value is not necessary, and so the check can be
removed. This avoids unnecessary failures of the API call and also saves
the time taken for the mailbox communication between VF and PF in
the case when the glort check passes.
Fixes: df02ba864695 ("fm10k: support promiscuous mode") Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com> Acked-by: Jing Chen <jing.d.chen@intel.com>
Olivier Matz [Mon, 13 Jun 2016 11:24:29 +0000 (13:24 +0200)]
net/xenvirt: fix build after mempool changes
The field elt_va_start has been removed from the mempool structure,
and it was not replaced in xenvirt.
Fix this by getting the mempool objects address by using the address of
the first memory chunk list.
Note that this won't work with a mempool composed of several chunks,
but that was already the case before.
Fixes: 84121f197187 ("mempool: store memory chunks in a list") Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:33 +0000 (14:23 -0700)]
net/bnxt: free memory in close operation
This patch adds code to free all resources except the ones corresponding
to HWRM, which are required to notify the HWRM that the driver is unloaded
(these are freed in uninit()).
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:28 +0000 (14:23 -0700)]
net/bnxt: allocate rings and groups
Add top-level functions to initialize ring groups, and functions
to allocate and free all the rings via HWRM.
A ring group is identified by an index. It consists of Rx or Tx ring id,
completion ring id and a statistics context. Once a ring group is
initialized, use this group index while creating the rings in the ASIC
using the appropriate HWRM API added via earlier patches.
Functions added:
bnxt_free_cp_ring
Calls the HWRM function generic ring free with arguments specific
to a completion ring and sanitizes the host completion structure
bnxt_free_all_hwrm_rings
Frees all the HWRM allocated hardware rings
bnxt_free_all_hwrm_resources
Frees all the resources allocated via the HWRM in the hardware
bnxt_alloc_hwrm_rings
Allocates all the HWRM rings needed in the current configuration
This should be the last functionality needed to add start/stop
device operations.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:27 +0000 (14:23 -0700)]
net/bnxt: set L2 filters
New HWRM call:
bnxt_clear_hwrm_vnic_filters
This patch adds code to set and clear L2 filters from the
corresponding VNIC. These filters will determine the Rx flows.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:25 +0000 (14:23 -0700)]
net/bnxt: add ring group alloc/free
Add HWRM API for ring group alloc/free functions, associated structs and
definitions.
This API allocates and does basic preparation for a ring group in ASIC.
A ring group is identified by an index. It consists of Rx ring id,
completion ring id and a statistics context.
New HWRM calls:
bnxt_hwrm_ring_grp_alloc
Allocates and does basic preparation for a ring group
bnxt_hwrm_ring_grp_free
Frees a ring group and cleans up its resources
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:24 +0000 (14:23 -0700)]
net/bnxt: add ring alloc/free
Add HWRM API calls to allocate and free TX, RX and Completion rings
in the hardware along with the associated structs and definitions.
This informs the hardware of how the specific rings were set up in the
host and allocates them in the HWRM, setting up the doorbell registers
etc. as needed, returning an ID for the ring.
Basic ring alloc/free calls:
bnxt_hwrm_ring_alloc
This command allocates and does basic preparation for a ring.
bnxt_hwrm_ring_free
This command is used to free a ring and associated resources.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:20 +0000 (14:23 -0700)]
net/bnxt: allow configuration of a VNIC
This patch adds APIs to allow configuration of a VNIC.
The functions allocate and free the Class of Service (COS) and
Load Balance context corresponding to the VNIC in the chip.
New HWRM calls:
bnxt_hwrm_vnic_ctx_alloc:
Used to allocate COS/Load Balance context of VNIC
bnxt_hwrm_vnic_ctx_free:
Used to free COS/Load Balance context of VNIC
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>