Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:54 +0000 (15:17 +0200)]
net/mlx5: handle Rx CQE compression
Mini (compressed) completion queue entries (CQEs) are returned by the
NIC when PCI back pressure is detected, in which case the first CQE64
contains common packet information followed by a number of CQE8
providing the rest, followed by a matching number of empty CQE64
entries to be used by software for decompression.
This patch does not perform the entire decompression step, as it would be
expensive. Instead, the first CQE64 is consumed and an internal
context is maintained to interpret the following CQE8 entries directly.
Intermediate empty CQE64 entries are handed back to HW without further
processing.
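The approach described above can be sketched as follows. This is a simplified model, not the mlx5 implementation: the struct layouts, field names (mini_count, flow_tag, byte_count, rx_hash) and the sk_zip_* helpers are all hypothetical and chosen only to show how a context kept across calls avoids expanding each mini entry into a full CQE64.

```c
#include <stdint.h>

/* Hypothetical, simplified layouts; the real CQE formats differ. */
struct sk_title_cqe {            /* stands in for the first CQE64 */
	uint32_t mini_count;     /* number of mini entries that follow */
	uint32_t flow_tag;       /* information common to the whole session */
};

struct sk_mini_cqe {             /* stands in for one CQE8 */
	uint32_t byte_count;     /* per-packet length */
	uint32_t rx_hash;        /* per-packet hash */
};

struct sk_rx_info {
	uint32_t len;
	uint32_t hash;
	uint32_t flow_tag;
};

/* Context kept between calls instead of decompressing into CQE64 entries. */
struct sk_zip_ctx {
	struct sk_title_cqe title;       /* copy of the consumed title entry */
	const struct sk_mini_cqe *minis; /* mini entries of the session */
	uint32_t next;                   /* index of the next mini entry */
};

/* Consume the title entry and start a compressed session. */
static void sk_zip_begin(struct sk_zip_ctx *z, const struct sk_title_cqe *t,
			 const struct sk_mini_cqe *minis)
{
	z->title = *t;
	z->minis = minis;
	z->next = 0;
}

/* Return 1 and fill *out for the next packet, or 0 when the session ends. */
static int sk_zip_next(struct sk_zip_ctx *z, struct sk_rx_info *out)
{
	const struct sk_mini_cqe *m;

	if (z->next >= z->title.mini_count)
		return 0;
	m = &z->minis[z->next++];
	out->len = m->byte_count;        /* per-packet data from the mini entry */
	out->hash = m->rx_hash;
	out->flow_tag = z->title.flow_tag; /* common data from the title entry */
	return 1;
}
```

Once sk_zip_next() returns 0, a real driver would hand the unused CQE64 slots back to hardware; that step is not modelled here.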
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:50 +0000 (15:17 +0200)]
net/mlx5: add support for configuration through kvargs
The intent is to replace the remaining compile-time options and environment
variables with a common means of runtime configuration. This commit only
adds the kvargs handling code; subsequent commits will update the rest.
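To illustrate the kind of "key=value,key=value" string that kvargs-style configuration deals with, here is a minimal standalone sketch. It deliberately does not use DPDK's librte_kvargs; sk_kv_parse(), struct sk_kv_pair and the option names in the usage note are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define SK_KV_MAX 8
#define SK_KV_LEN 32

struct sk_kv_pair {
	char key[SK_KV_LEN];
	char value[SK_KV_LEN];
};

/*
 * Parse "k1=v1,k2=v2" into out[]. Returns the number of pairs, or -1 when
 * the string is malformed, a token is too long, or there are too many pairs.
 */
static int sk_kv_parse(const char *args, struct sk_kv_pair *out, int max)
{
	const char *p = args;
	int n = 0;

	while (*p != '\0') {
		size_t klen = strcspn(p, "=,");
		const char *v;
		size_t vlen;

		/* A key must be non-empty and followed by '='. */
		if (klen == 0 || klen >= SK_KV_LEN || p[klen] != '=' || n == max)
			return -1;
		v = p + klen + 1;
		vlen = strcspn(v, ",");
		if (vlen >= SK_KV_LEN)
			return -1;
		memcpy(out[n].key, p, klen);
		out[n].key[klen] = '\0';
		memcpy(out[n].value, v, vlen);
		out[n].value[vlen] = '\0';
		n++;
		p = v + vlen;
		if (*p == ',')
			p++;             /* skip the separator */
	}
	return n;
}
```

For example, sk_kv_parse("opt_a=1,opt_b=on", pairs, SK_KV_MAX) yields two pairs, after which a driver would match each key against the options it supports.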
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:48 +0000 (15:17 +0200)]
net/mlx5: update prerequisites for upcoming enhancements
The latest version of Mellanox OFED exposes hardware definitions necessary
to implement data path operation bypassing Verbs. Update the minimum
version requirement to MLNX_OFED >= 3.3 and clean up compatibility checks
for previous releases.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:43 +0000 (15:17 +0200)]
net/mlx5: remove Rx scatter support
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. RX scatter cannot be maintained during the
transition and will be reimplemented later.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:42 +0000 (15:17 +0200)]
net/mlx5: remove Tx gather support
This is done in preparation for bypassing Verbs entirely for the data path
as a performance improvement. TX gather cannot be maintained during the
transition and will be reimplemented later.
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:41 +0000 (15:17 +0200)]
net/mlx5: split memory registration function
Except for the first time when memory registration occurs, the lkey is
always cached. Since memory registration is slow and performs system calls,
performance can be improved by moving that code to its own function outside
of the data path so only the lookup code is left in the original inlined
function.
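The split described above can be sketched as an inlined cache lookup that falls back to an out-of-line registration function. This is only an illustration under assumed names: struct sk_mr_cache, sk_mr_lookup() and sk_mr_register() are hypothetical, and the "registration" here just hands out a fake key instead of performing a real (slow) memory registration.

```c
#include <stdint.h>

#define SK_MR_CACHE_SIZE 4
#define SK_LKEY_INVALID UINT32_MAX

struct sk_mr_entry {
	const void *pool;   /* memory pool the key belongs to */
	uint32_t lkey;      /* cached key */
};

struct sk_mr_cache {
	struct sk_mr_entry entries[SK_MR_CACHE_SIZE];
	unsigned int used;
	unsigned int registrations; /* counts slow-path calls, for illustration */
};

/*
 * Slow path, kept out of the inlined function: stands in for the real
 * registration, which would involve system calls.
 */
static uint32_t sk_mr_register(struct sk_mr_cache *c, const void *pool)
{
	uint32_t lkey;

	if (c->used == SK_MR_CACHE_SIZE)
		return SK_LKEY_INVALID;
	lkey = 0x100u + c->registrations; /* fake key for the sketch */
	c->registrations++;
	c->entries[c->used].pool = pool;
	c->entries[c->used].lkey = lkey;
	c->used++;
	return lkey;
}

/* Fast path: only the lookup is inlined into the data path. */
static inline uint32_t sk_mr_lookup(struct sk_mr_cache *c, const void *pool)
{
	for (unsigned int i = 0; i < c->used; i++)
		if (c->entries[i].pool == pool)
			return c->entries[i].lkey;
	return sk_mr_register(c, pool);
}
```

After the first call for a given pool, every later sk_mr_lookup() for that pool is served from the cache without touching the slow path.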
Nélio Laranjeiro [Fri, 24 Jun 2016 13:17:40 +0000 (15:17 +0200)]
net: fix PCI class id
Use the RTE_PCI_DEVICE macro to set all fields rather than setting
them individually in the code. This shortens the code and helps guard
against future changes to the rte_pci_id structure.
Fixes: 701c8d80c820 ("pci: support class id probing") Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Beilei Xing [Thu, 23 Jun 2016 15:11:58 +0000 (23:11 +0800)]
net/ixgbe: fix single VLAN tag to be outer VLAN tag
Previously, a single VLAN header was treated as the inner VLAN,
but generally, a single VLAN header is treated as the outer
VLAN header.
The patch fixes the ether type of a single VLAN type, and
enables configuring inner and outer TPID for double VLAN.
Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Beilei Xing [Wed, 22 Jun 2016 02:53:51 +0000 (10:53 +0800)]
net/i40e: fix single VLAN tag to be outer VLAN tag
In the current i40e codebase, if a single VLAN header is added to a packet,
it's treated as the inner VLAN. Generally, a single VLAN header is
treated as the outer VLAN header, so update the driver behaviour
appropriately.
Fixes: 19b16e2f6442 ("ethdev: add vlan type when setting ether type") Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Nelson Escobar [Thu, 16 Jun 2016 19:19:05 +0000 (12:19 -0700)]
net/enic: support scattered Rx
For performance reasons, this patch uses 2 VIC RQs per RQ presented to
DPDK.
The VIC requires that each descriptor be marked as either a start of
packet (SOP) descriptor or a non-SOP descriptor. A one RQ solution
requires skipping descriptors when receiving small packets and results
in bad performance when receiving many small packets.
The 2 RQ solution makes use of the VIC feature that allows a receive
on primary queue to 'spill over' into another queue if the receive is
too large to fit in the buffer assigned to the descriptor on the
primary queue. This means that there is no skipping of descriptors
when receiving small packets and results in much better performance.
Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
Beilei Xing [Thu, 16 Jun 2016 13:36:28 +0000 (21:36 +0800)]
net/e1000: configure outer VLAN TPID field
This patch enables configuring the outer TPID for double VLAN.
Note that all other TPID values, for single VLANs or inner VLAN in the
QinQ case, are read only.
Sony Chacko [Thu, 16 Jun 2016 05:47:10 +0000 (22:47 -0700)]
net/qede: enable VF-VF traffic with unmatched dest address
This patch enables VF to VF traffic with unmatched destination addresses.
The steps to enable this are:
- Enable promiscuous mode filter settings.
- Check for VF mode and enable promiscuous mode settings for VF.
- Check filter configuration to ensure conflicting filter modes
are not set.
Signed-off-by: Sony Chacko <sony.chacko@qlogic.com>
Harish Patil [Thu, 16 Jun 2016 05:47:09 +0000 (22:47 -0700)]
net/qede: support 100G
- Add device id to the PCI table
- Add polling for the slowpath events for CMT mode device
- Add prerequisites to allow 100g mode
* Min number of queues needed is 2
* Only even number of queues are allowed
- Update documentation
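The two queue-count prerequisites listed above can be expressed as a small check. The function name sk_100g_queue_count_valid() is hypothetical and is not taken from the qede driver.

```c
#include <stdbool.h>

/* Valid only when at least 2 queues are requested and the count is even. */
static bool sk_100g_queue_count_valid(unsigned int nb_queues)
{
	return nb_queues >= 2 && nb_queues % 2 == 0;
}
```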
The ixgbe PMD Rx functions miss some packet types that are:
- correctly recognised by the underlying HW.
- marked as supported by ixgbe_dev_supported_ptypes_get().
Fixes: 9586ebd358d5 ("ixgbe: replace some offload flags with packet type") Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com> Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com> Tested-by: Olivier Matz <olivier.matz@6wind.com>
Nelson Escobar [Tue, 14 Jun 2016 23:54:05 +0000 (16:54 -0700)]
net/enic: improve packet type identification
- add l4 ptypes to the ones we report as supporting
- report/use RTE_PTYPE_L3_IPV4_EXT_UNKNOWN and
RTE_PTYPE_L3_IPV6_EXT_UNKNOWN instead of RTE_PTYPE_L3_IPV4 and
RTE_PTYPE_L3_IPV6 as the VIC can't distinguish between packets with
extensions and those without extensions.
- correctly set the ptype bits on packets that are both tcp/udp
and a frag
- set RTE_PTYPE_L4_NONFRAG on ip packets we know are not udp, tcp,
or fragments.
Fixes: 947d860c821f ("enic: improve Rx performance") Signed-off-by: Nelson Escobar <neescoba@cisco.com> Reviewed-by: John Daley <johndale@cisco.com>
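The classification rules in the enic packet type change above can be sketched as a small decision function. The enums below are hypothetical local stand-ins for the RTE_PTYPE_* values, the input flags are assumed descriptor bits, and treating "fragment" as taking precedence over tcp/udp is an assumption of this sketch rather than something the commit message spells out.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RTE_PTYPE_* values named above. */
enum sk_l3 { SK_L3_IPV4_EXT_UNKNOWN, SK_L3_IPV6_EXT_UNKNOWN };
enum sk_l4 { SK_L4_TCP, SK_L4_UDP, SK_L4_FRAG, SK_L4_NONFRAG };

struct sk_ptype {
	enum sk_l3 l3;
	enum sk_l4 l4;
};

/* Classify an IP packet from flags a completion descriptor might report. */
static struct sk_ptype sk_classify(bool is_ipv6, bool is_tcp, bool is_udp,
				   bool is_frag)
{
	struct sk_ptype pt;

	/* Extension headers cannot be detected, so report EXT_UNKNOWN. */
	pt.l3 = is_ipv6 ? SK_L3_IPV6_EXT_UNKNOWN : SK_L3_IPV4_EXT_UNKNOWN;

	if (is_frag)
		pt.l4 = SK_L4_FRAG;     /* assumed: frag wins over tcp/udp */
	else if (is_tcp)
		pt.l4 = SK_L4_TCP;
	else if (is_udp)
		pt.l4 = SK_L4_UDP;
	else
		pt.l4 = SK_L4_NONFRAG;  /* IP, but not udp, tcp or a fragment */
	return pt;
}
```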
Zhe Tao [Tue, 14 Jun 2016 05:24:16 +0000 (13:24 +0800)]
net/i40e: fix offload flags for vector Rx
The flags for RSS and flow director are not set correctly in the vector
Rx function, so applications which use these flags will not work
correctly.
The problem is caused by incorrect constants for masking,
shuffling and shifting the descriptor bytes, to create the resultant
flags in the mbuf. Correcting the constants fixes the problem.
Fixes: 9ed94e5bb04e ("i40e: add vector Rx") Signed-off-by: Zhe Tao <zhe.tao@intel.com> Reviewed-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com> Acked-by: Jingjing Wu <jingjing.wu@intel.com>
If MSIX is available, the vector count given by the table size is one
less than the actual count. This count also limits the receive and
transmit queue resources the VF can support.
Fixes: 540a211084a7 ("bnx2x: driver core") Signed-off-by: Charles (Chas) Williams <ciwillia@brocade.com> Acked-by: Harish Patil <harish.patil@qlogic.com>
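The off-by-one described in the MSI-X note above comes from the Table Size field of the MSI-X Message Control register, which is encoded as N-1. A minimal sketch of the conversion follows; the macro and function names are hypothetical and are not taken from the bnx2x driver.

```c
#include <stdint.h>

/* Table Size occupies bits 10:0 of MSI-X Message Control, encoded as N-1. */
#define SK_MSIX_TABLE_SIZE_MASK 0x07ffu

/* Return the actual number of MSI-X vectors for a Message Control value. */
static inline unsigned int sk_msix_vector_count(uint16_t msg_ctrl)
{
	return (msg_ctrl & SK_MSIX_TABLE_SIZE_MASK) + 1u;
}
```

A driver that sizes its receive and transmit queue resources from this value needs the corrected count, not the raw field.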
Eric Kinzie [Fri, 27 May 2016 19:44:05 +0000 (12:44 -0700)]
net/bonding: allow external state machine in mode 4
Provide functions to allow an external 802.3ad state machine to transmit
and receive LACPDUs and to set the collection/distribution flags on
slave interfaces.
Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Declan Doherty <declan.doherty@intel.com>
Eric Kinzie [Sat, 7 May 2016 03:45:24 +0000 (20:45 -0700)]
net/bonding: inherit maximum Rx packet length
Instead of a hard-coded maximum receive length, allow the bonded interface
to inherit this limit from the slave interfaces. This allows
an application that uses jumbo frames to pass realistic values to
rte_eth_dev_configure without causing an error.
Before the bonding interface is configured, allow slaves with any
max_rx_pktlen to be added and remember the lowest of these values as
a candidate value. During dev_configure, set the bond device's
max_rx_pktlen to the candidate value. After this point only slaves
with a max_rx_pktlen greater than or equal to that of the bonding device
can be added.
If all slaves are removed, the bond device's pktlen is cleared.
Signed-off-by: Eric Kinzie <ehkinzie@gmail.com> Acked-by: Declan Doherty <declan.doherty@intel.com>
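The inheritance rules of the max_rx_pktlen change above can be modelled with a few lines of state handling. This is a sketch under assumed names (struct sk_bond and the sk_bond_* helpers are hypothetical), not the bonding PMD code; what happens when slaves are re-added after the value has been cleared is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

struct sk_bond {
	uint32_t candidate_max_rx_pktlen; /* lowest value seen before configure */
	uint32_t max_rx_pktlen;           /* value fixed at configure time */
	unsigned int nb_slaves;
	bool configured;
};

/* Returns 0 when the slave is accepted, -1 when it is rejected. */
static int sk_bond_add_slave(struct sk_bond *b, uint32_t slave_max)
{
	if (b->configured) {
		/* After configure: only slaves that can match the bond device. */
		if (slave_max < b->max_rx_pktlen)
			return -1;
	} else if (b->nb_slaves == 0 || slave_max < b->candidate_max_rx_pktlen) {
		/* Before configure: accept anything, remember the lowest value. */
		b->candidate_max_rx_pktlen = slave_max;
	}
	b->nb_slaves++;
	return 0;
}

/* Stands in for dev_configure: adopt the candidate value. */
static void sk_bond_configure(struct sk_bond *b)
{
	b->max_rx_pktlen = b->candidate_max_rx_pktlen;
	b->configured = true;
}

/* When the last slave goes away, the bond device's pktlen is cleared. */
static void sk_bond_remove_slave(struct sk_bond *b)
{
	if (b->nb_slaves > 0 && --b->nb_slaves == 0) {
		b->candidate_max_rx_pktlen = 0;
		b->max_rx_pktlen = 0;
	}
}
```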
Jerin Jacob [Fri, 17 Jun 2016 13:29:31 +0000 (18:59 +0530)]
net/thunderx/base: add mailbox for PF/VF communication
The DPDK nicvf driver doesn't have access to the NIC's PF address space.
Introduce a mailbox mechanism to communicate with the PF driver through
a shared 128-bit register interface.
Xiao Wang [Mon, 6 Jun 2016 09:00:47 +0000 (17:00 +0800)]
net/fm10k: fix promiscuous receive for VF
When an application tries to change the promisc/allmulti setting, fm10k
checks whether a valid glort has been acquired, and exits without doing
anything if not.
For VFs, this glort value is not necessary, so the check can be
removed. This avoids unnecessary failures of the API call, as well as
the time taken for the mailbox communication between VF and PF in
the case where the glort check passes.
Fixes: df02ba864695 ("fm10k: support promiscuous mode") Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com> Acked-by: Jing Chen <jing.d.chen@intel.com>
Olivier Matz [Mon, 13 Jun 2016 11:24:29 +0000 (13:24 +0200)]
net/xenvirt: fix build after mempool changes
The field elt_va_start has been removed from the mempool structure,
and it was not replaced in xenvirt.
Fix this by getting the address of the mempool objects from the address of
the first memory chunk in the list.
Note that it won't work with a mempool composed of several chunks,
but it was already the case before.
Fixes: 84121f197187 ("mempool: store memory chunks in a list") Reported-by: Christian Ehrhardt <christian.ehrhardt@canonical.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:33 +0000 (14:23 -0700)]
net/bnxt: free memory in close operation
This patch adds code to free all resources except the ones corresponding
to HWRM, which are required to notify the HWRM that the driver is unloaded
(these are freed in uninit()).
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:28 +0000 (14:23 -0700)]
net/bnxt: allocate rings and groups
Add top-level functions to initialize ring groups, and functions
to allocate and free all the rings via HWRM.
A ring group is identified by an index. It consists of Rx or Tx ring id,
completion ring id and a statistics context. Once a ring group is
initialized, use this group index while creating the rings in the ASIC
using the appropriate HWRM API added via earlier patches.
Functions added:
bnxt_free_cp_ring
Calls the HWRM function generic ring free with arguments specific
to a completion ring and sanitizes the host completion structure
bnxt_free_all_hwrm_rings
Frees all the HWRM allocated hardware rings
bnxt_free_all_hwrm_resources
Frees all the resources allocated via the HWRM in the hardware
bnxt_alloc_hwrm_rings
Allocates all the HWRM rings needed in the current configuration
This should be the last functionality needed to add start/stop
device operations.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:27 +0000 (14:23 -0700)]
net/bnxt: set L2 filters
New HWRM call:
bnxt_clear_hwrm_vnic_filters
This patch adds code to set and clear L2 filters from the
corresponding VNIC. These filters will determine the Rx flows.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:25 +0000 (14:23 -0700)]
net/bnxt: add ring group alloc/free
Add HWRM API for ring group alloc/free functions, associated structs and
definitions.
This API allocates and does basic preparation for a ring group in ASIC.
A ring group is identified by an index. It consists of Rx ring id,
completion ring id and a statistics context.
New HWRM calls:
bnxt_hwrm_ring_grp_alloc
Allocates and does basic preparation for a ring group
bnxt_hwrm_ring_grp_free
Frees a ring group and cleans up its resources
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:24 +0000 (14:23 -0700)]
net/bnxt: add ring alloc/free
Add HWRM API calls to allocate and free TX, RX and Completion rings
in the hardware along with the associated structs and definitions.
This informs the hardware of how the specific rings were set up in the
host and allocates them in the HWRM, setting up the doorbell registers
etc. as needed, returning an ID for the ring.
Basic ring alloc/free calls:
bnxt_hwrm_ring_alloc
This command allocates and does basic preparation for a ring.
bnxt_hwrm_ring_free
This command is used to free a ring and associated resources.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:20 +0000 (14:23 -0700)]
net/bnxt: allow configuration of a VNIC
This patch adds APIs to allow configuration of a VNIC.
The functions allocate and free the Class of Service (COS) and
Load Balance context corresponding to the VNIC in the chip.
New HWRM calls:
bnxt_hwrm_vnic_ctx_alloc:
Used to allocate COS/Load Balance context of VNIC
bnxt_hwrm_vnic_ctx_free:
Used to free COS/Load Balance context of VNIC
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:16 +0000 (14:23 -0700)]
net/bnxt: add HWRM function reset command
Add bnxt_hwrm_func_reset() function and supporting structs and macros.
New HWRM calls:
bnxt_hwrm_func_reset:
This command puts the function into the reset state.
In the reset state, global and port related features of the
chip are not available.
This command resets a hardware function (PCIe function) and
frees any resources used by the function. The driver initiates this
command to prepare the function for re-use. This command may also be
initiated by a driver prior to doing its own configuration.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:15 +0000 (14:23 -0700)]
net/bnxt: allocate Rx/Tx and completion rings
Perform allocation and free()ing of ring and information structures
for the TX, RX, and completion rings. The previous patches had
so far provided top level stubs and generic ring support, while this
patch does the real allocation and freeing of the memory specific to
each different type of generic ring.
For example, bnxt_init_tx_ring_struct() or bnxt_init_rx_ring_struct()
now allocates memory based on the socket_id being provided.
bnxt_tx_queue_setup_op() or bnxt_rx_queue_setup_op() have gone through
some reformatting to perform a graceful cleanup in case memory
allocation fails.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
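The "graceful cleanup in case memory allocation fails" mentioned in the ring allocation change above is commonly written in C with a single failure label that releases whatever was allocated so far. The sketch below shows that pattern with hypothetical names (struct sk_queue, sk_queue_setup(), sk_queue_release()); the sk_alloc_budget and sk_live_allocs variables exist only so the failure path can be exercised, and none of this is taken from the bnxt driver.

```c
#include <stddef.h>
#include <stdlib.h>

/* Test hooks for the sketch: -1 means allocations never fail. */
static int sk_alloc_budget = -1;
static int sk_live_allocs;

static void *sk_alloc(size_t n)
{
	void *p;

	if (sk_alloc_budget == 0)
		return NULL;            /* simulated allocation failure */
	if (sk_alloc_budget > 0)
		sk_alloc_budget--;
	p = calloc(1, n);
	if (p != NULL)
		sk_live_allocs++;
	return p;
}

static void sk_free(void *p)
{
	if (p != NULL) {
		sk_live_allocs--;
		free(p);
	}
}

struct sk_queue {
	void *ring;     /* descriptor ring */
	void *cp_ring;  /* completion ring */
	void *sw_ring;  /* software bookkeeping */
};

/* Safe on NULL and on partially initialized queues. */
static void sk_queue_release(struct sk_queue *q)
{
	if (q == NULL)
		return;
	sk_free(q->sw_ring);
	sk_free(q->cp_ring);
	sk_free(q->ring);
	sk_free(q);
}

static struct sk_queue *sk_queue_setup(size_t nb_desc)
{
	struct sk_queue *q = sk_alloc(sizeof(*q)); /* zeroed by calloc */

	if (q == NULL)
		return NULL;
	q->ring = sk_alloc(nb_desc * 16);
	if (q->ring == NULL)
		goto fail;
	q->cp_ring = sk_alloc(nb_desc * 16);
	if (q->cp_ring == NULL)
		goto fail;
	q->sw_ring = sk_alloc(nb_desc * sizeof(void *));
	if (q->sw_ring == NULL)
		goto fail;
	return q;
fail:
	sk_queue_release(q); /* frees only what was actually allocated */
	return NULL;
}
```

Because the queue structure is zeroed on allocation, the release function can be shared between the normal teardown path and the failure path.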
Ajit Khaparde [Wed, 15 Jun 2016 21:23:14 +0000 (14:23 -0700)]
net/bnxt: add initial Rx code
This patch adds initial implementation of rx_pkt_burst() function for Rx.
bnxt_recv_pkts() is the top level function for doing Rx.
This patch also adds code to allocate rings in the ASIC.
For each Rx queue allocated in the PMD driver, a corresponding ring
in hardware will be created. Every time a frame is received, an Rx ring
is selected based on the hardware configuration like RSS, MAC or VLAN,
COS and such. The hardware uses a completion ring to indicate the
availability of a packet.
This patch also brings in functions like bnxt_init_one_rx_ring() and
bnxt_init_rx_ring_struct(), which initialize various structures before
Rx can begin.
bnxt_init_rxbds() initializes the Rx Buffer Descriptors while
bnxt_alloc_rx_data() allocates a buffer in the host to receive the
incoming packet.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:13 +0000 (14:23 -0700)]
net/bnxt: add initial Tx code
Initial implementation of tx_pkt_burst for transmit.
bnxt_xmit_pkts() is the top level function that is called during Tx.
bnxt_handle_tx_cp() is used to check and process the Tx completions
that the hardware generates for the Tx Buffer Descriptors sent to it.
This patch also adds code to allocate rings in the hardware.
For each Tx queue allocated in the PMD driver, a corresponding ring
in hardware will be created. Every time a Tx request is initiated
via the bnxt_xmit_pkts() call, a Buffer Descriptor is created and
is sent to the hardware via the associated Tx ring.
On completing the Tx operation, the hardware generates the status
in the form of a completion. This completion is processed by the
bnxt_handle_tx_cp() function.
Functions like bnxt_init_tx_ring_struct() and bnxt_init_one_tx_ring()
are used to initialize various members of the structure before
starting Tx operations.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:12 +0000 (14:23 -0700)]
net/bnxt: add statistics
Add the bnxt_stats_get_op() and bnxt_stats_reset_op() dev_ops to
get and reset statistics. It also brings in the associated HWRM calls
to handle the requests appropriately.
We also have the bnxt_free_stats() function which will be used in the
follow on patches to free the memory allocated by the driver for
statistics.
New HWRM calls:
bnxt_hwrm_stat_clear:
This command clears statistics of a context
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:11 +0000 (14:23 -0700)]
net/bnxt: add Rx queue create/destroy
In this patch we are adding the bnxt_rx_queue_setup_op() and
bnxt_rx_queue_release_op() functions. These will be tied to the
rx_queue_setup and rx_queue_release dev_ops in a subsequent patch.
In these functions we allocate/free memory for the RX queues.
This still requires support to create an RX ring in the ASIC which
will be completed in a future commit. Each Rx queue created via the
rx_queue_setup dev_op will have an associated Rx ring in the hardware.
The Rx logic in the hardware picks an Rx ring for each Rx frame received
by the hardware depending on the properties like RSS, MAC and VLAN
settings configured in the hardware. These packets in the end arrive
on the Rx queue corresponding to the Rx ring in the hardware.
We are also adding some functions like bnxt_mq_rx_configure(),
bnxt_free_rx_mbufs() and bnxt_free_rxq_stats() which will be used in
subsequent patches.
We are also adding hwrm_vnic_rss_cfg_* structures, which will be used
in subsequent patches to enable RSS configuration.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>
Ajit Khaparde [Wed, 15 Jun 2016 21:23:10 +0000 (14:23 -0700)]
net/bnxt: add Tx queue create/destroy
In this patch we are adding the bnxt_tx_queue_setup_op() and
bnxt_tx_queue_release_op() functions. These will be tied to the
tx_queue_setup and tx_queue_release dev_ops in a subsequent patch.
In these functions we allocate/free memory for the TX queues.
This still requires support to create a TX ring in the ASIC which
will be completed in a future commit. Each Tx queue created via the
tx_queue_setup dev_op will have an associated Tx ring in the hardware.
A Tx request coming on the Tx queue gets sent to the corresponding
Tx ring in the ASIC for subsequent transmission.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Stephen Hurd <stephen.hurd@broadcom.com> Reviewed-by: David Christensen <david.christensen@broadcom.com>