Chen Jing D(Mark) [Thu, 28 Jan 2016 09:45:59 +0000 (17:45 +0800)]
fm10k: optimize mbuf freeing in non-vector Tx
When the TX function frees a batch of mbufs, it frees them one by
one. This change scans the free list, merges the requests that
belong to the same pool, and frees each group in a single call,
which reduces the cycles spent on freeing mbufs.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
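The pool-merging idea in the fm10k Tx freeing change above can be sketched in plain C. This is a minimal illustration, not the driver code: `struct pool`, `struct buf`, `pool_put_bulk` and `free_merged` are hypothetical stand-ins for the mempool, the mbuf and the bulk-free path, and the sketch only merges consecutive buffers that share a pool.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a buffer pool and a buffer; not DPDK types. */
struct pool {
    unsigned long bulk_calls;   /* number of bulk-free requests received */
    unsigned long freed;        /* total number of buffers returned */
};

struct buf {
    struct pool *pool;          /* pool this buffer was allocated from */
};

/* Return n buffers to one pool in a single request. */
static void pool_put_bulk(struct pool *p, struct buf **bufs, size_t n)
{
    (void)bufs;                 /* a real pool would store the pointers */
    p->bulk_calls++;
    p->freed += n;
}

/*
 * Scan the free list and merge consecutive buffers that share a pool,
 * issuing one bulk request per run instead of one request per buffer.
 */
static void free_merged(struct buf **bufs, size_t n)
{
    size_t start = 0;

    assert(bufs != NULL || n == 0);
    while (start < n) {
        size_t end = start + 1;

        while (end < n && bufs[end]->pool == bufs[start]->pool)
            end++;
        pool_put_bulk(bufs[start]->pool, &bufs[start], end - start);
        start = end;
    }
}
```

With a free list whose buffers come from pools A, A, B, B, A, this sketch issues three bulk requests instead of five single frees.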
Shaopeng He [Fri, 5 Feb 2016 02:46:11 +0000 (10:46 +0800)]
fm10k: fix switch manager high CPU usage
The fm10k switch core uses source MAC + VID + SGLORT to do a
lookup in the MAC table. If there is no match, an exception
interrupt is sent to the switch manager. Too many of these
exception interrupts cause high CPU usage on the switch manager
side.
To reproduce this issue, run DPDK testpmd on a server with one
fm10k NIC and MAC-forward test traffic from one fm10k port to
another. The CPU usage of the switch manager goes up to about 20%
at a test traffic rate of 10 Gbps, compared to near 0% with no
test traffic.
This patch fixes this issue. A default SGLORT is assigned
to each TX queue. This default value works for non-VMDq mode
and current VMDq example. For advanced VMDq usage, e.g.
different source MAC address for different TX queue, FTAG
forwarding function could be used to change this default
SGLORT value.
Fixes:
9ae6068c86da ("fm10k: add dev start/stop")
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Shaopeng He [Thu, 4 Feb 2016 12:43:21 +0000 (20:43 +0800)]
fm10k: enable broadcast loopback suppression
In FM10K, a single PCIe port can be split into several logical ports,
such as SRIOV PF/VF devices and VMDQ objects. To better manage them,
FM10K silicon assigns a unique GLORT ID to each logical port.
When a logical port sends a broadcast packet, the silicon will flood
it to all logical ports, including the one that sent the broadcast packet.
To prevent this, silicon has an rxq register to store the glort id of
the logical port that queue binds to.
FM10K has a switch core inside, which has a loopback suppression
mechanism in the switch level. Switch level loopback suppression mostly
works for the ether port traffic.
This patch assigns a SGLORT for each RX queue, and enables PCIe port
level loopback suppression.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Shaopeng He [Fri, 5 Feb 2016 04:57:50 +0000 (12:57 +0800)]
examples/l3fwd-power: fix memory leak for non-IP packets
Previously, l3fwd-power only processed IP and IPv6 packets. The mbufs
of other packets were not freed, which caused a memory leak.
This patch fixes this issue.
Fixes:
3c0184cc0c60 ("examples: replace some offload flags with packet type")
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Shaopeng He [Fri, 5 Feb 2016 04:57:49 +0000 (12:57 +0800)]
fm10k: make default VID available in initialization
When the PF establishes a connection with Switch Manager(SM), it receives
a logical port range from SM, and registers certain logical ports from
that range. Then a default VID will be sent back from the SM.
This whole transaction - finishing with the default VID being set -
needs to be completed before dev_init returns. If not, the interrupt
setting will subsequently be changed in dev_start according to the RX
queue number, and that can cause this transaction to fail.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Shaopeng He [Fri, 5 Feb 2016 04:57:48 +0000 (12:57 +0800)]
fm10k: add Rx queue interrupt enable/disable functions
Interrupt mode framework has per-queue enable/disable functions.
Implement these two functions for fm10k driver.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Shaopeng He [Fri, 5 Feb 2016 04:57:47 +0000 (12:57 +0800)]
fm10k: remove Rx queue interrupts when stopping
Previously, the dev_stop function only stopped the rx/tx queues. This
patch adds logic to disable the rx queue interrupts and to clean up the
datapath event and the queue/vector map.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Shaopeng He [Fri, 5 Feb 2016 04:57:46 +0000 (12:57 +0800)]
fm10k: setup Rx interrupt for PF and VF
In interrupt mode, each rx queue can have one interrupt to notify the
application when packets are available in that queue. Several queues
can also share one interrupt.
Currently, fm10k needs one separate interrupt for mailbox. So, only those
drivers which support multiple interrupt vectors e.g. vfio-pci can work
in fm10k interrupt mode.
This patch uses the RXINT/INT_MAP registers to map interrupt causes
(rx queue and other events) to vectors, and enable these interrupts
through kernel drivers like vfio-pci.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
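The queue-to-vector mapping described in the Rx interrupt setup above can be sketched as follows. This is an illustration under stated assumptions, not the fm10k register programming: the names are hypothetical, vector 0 is assumed to be the one reserved for the mailbox, and queues are assumed to share the remaining vectors round-robin when there are more queues than vectors.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MBX_VEC      0u  /* assumption: vector 0 serves the mailbox */
#define SKETCH_RX_VEC_START 1u  /* assumption: rx queues start at vector 1 */

/*
 * Map an rx queue index to an interrupt vector. At least two vectors are
 * needed: one for the mailbox and one or more for the rx queues. When
 * there are fewer rx vectors than queues, several queues share a vector.
 */
static uint32_t rx_queue_to_vector(uint32_t queue, uint32_t nb_vectors)
{
    uint32_t nb_rx_vectors;

    assert(nb_vectors >= 2);
    nb_rx_vectors = nb_vectors - SKETCH_RX_VEC_START;
    return SKETCH_RX_VEC_START + (queue % nb_rx_vectors);
}
```

The requirement of at least two vectors mirrors the point made in the commit message: a driver that exposes only a single interrupt vector cannot serve both the mailbox and the rx queues.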
Shaopeng He [Fri, 5 Feb 2016 04:57:45 +0000 (12:57 +0800)]
fm10k: support Rx descriptor check
rx_descriptor_done is used by the interrupt mode example application
(l3fwd-power) to check the rxd DD bit and decide the RX trend.
l3fwd-power then adjusts the CPU frequency according to
the result.
Signed-off-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
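A descriptor-done check of the kind described above can be sketched in a few lines of C. The ring layout, the `SKETCH_RXD_STATUS_DD` bit value and all type names here are assumptions for illustration and are not taken from the fm10k driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RXD_STATUS_DD 0x1u   /* hypothetical "descriptor done" bit */

struct sketch_rx_desc {
    uint32_t status;                /* written back by the hardware */
};

struct sketch_rxq {
    struct sketch_rx_desc *ring;    /* descriptor ring */
    uint16_t nb_desc;               /* number of descriptors in the ring */
    uint16_t next_dd;               /* next descriptor software will check */
};

/*
 * Return 1 if the descriptor `offset` entries ahead of the current
 * position has its DD bit set, 0 otherwise (including when the offset
 * is out of range).
 */
static int sketch_rx_descriptor_done(const struct sketch_rxq *q,
                                     uint16_t offset)
{
    uint16_t idx;

    assert(q != NULL && q->ring != NULL && q->nb_desc > 0);
    if (offset >= q->nb_desc)
        return 0;
    idx = (uint16_t)((q->next_dd + offset) % q->nb_desc);
    return (q->ring[idx].status & SKETCH_RXD_STATUS_DD) != 0;
}
```

An application can probe a few offsets ahead of the current position to estimate how full the ring is, which is the kind of RX trend signal the commit message refers to.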
Chen Jing D(Mark) [Wed, 30 Dec 2015 08:35:35 +0000 (16:35 +0800)]
fm10k: allocate logical ports for flow director
In fm10k, a PF, a VF, a VMDQ pool or the queues bound to a flow director
rule can each be considered a logical port. The original implementation
only creates a single port for all cases. This change creates 128 logical
ports: the first 64 for PF and VMDQ, the second 64 for flow director.
Registers DGLORTDEC/DGLORTMAP define rules for how to classify packets
into different queues. Currently only PF and VMDQ cases are considered.
This change adds rules for flow director.
Signed-off-by: Chen Jing D(Mark) <jing.d.chen@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Xiao Wang [Fri, 18 Dec 2015 03:09:18 +0000 (11:09 +0800)]
fm10k: fix VLAN flag in scattered Rx
In the fm10k_recv_scattered_pkts function, a packet is stored as a linked
list of segments, so offload flags such as PKT_RX_VLAN_PKT should be set
in the first segment.
Fixes:
6b59a3bc82b1 ("fm10k: fix VLAN in Rx mbuf")
Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
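The first-segment rule in the scattered Rx fix above can be illustrated with a small sketch. The `struct seg` type, the flag value and `build_packet` are hypothetical; the point is only that per-packet flags are carried by the head of the segment chain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RX_VLAN_FLAG (1ULL << 0)  /* hypothetical offload flag bit */

/* Hypothetical packet segment; a packet is a singly linked chain of these. */
struct seg {
    uint64_t ol_flags;
    struct seg *next;
};

/*
 * Link nseg segments into one packet and record the per-packet offload
 * flags on the first segment only, since that is where consumers of the
 * packet look for them.
 */
static struct seg *build_packet(struct seg *segs, size_t nseg,
                                uint64_t pkt_flags)
{
    size_t i;

    assert(segs != NULL && nseg > 0);
    for (i = 0; i < nseg; i++) {
        segs[i].ol_flags = 0;
        segs[i].next = (i + 1 < nseg) ? &segs[i + 1] : NULL;
    }
    segs[0].ol_flags |= pkt_flags;
    return &segs[0];
}
```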
Remy Horton [Wed, 9 Mar 2016 13:29:24 +0000 (13:29 +0000)]
i40e: support default MAC address setting
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Remy Horton [Wed, 9 Mar 2016 13:29:24 +0000 (13:29 +0000)]
i40e: add EEPROM and registers dumping
Signed-off-by: Remy Horton <remy.horton@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Jingjing Wu [Wed, 9 Mar 2016 08:22:46 +0000 (16:22 +0800)]
i40e: support setting VF MAC address
This patch implements the ops for adding and removing MAC
addresses in the i40evf driver. The functions are assigned as follows:
.mac_addr_add = i40evf_add_mac_addr,
.mac_addr_remove = i40evf_del_mac_addr,
To support setting multiple MAC addresses, this patch also
extends MAC address addition and deletion at device
start and stop. Each VF can have a maximum of 64 MAC
addresses.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Zhe Tao [Wed, 9 Mar 2016 05:38:43 +0000 (13:38 +0800)]
i40e: add VEB switching support
The VEB switching feature for i40e enables switching between the
VSIs connected to the virtual bridge. The old implementation set the
virtual bridge mode to VEPA, which is port aggregation. Switching is
enabled by setting the loopback mode for the specific VSIs that connect
to the PF or VFs.
VEB/VSI/VEPA are concepts not specific to the i40e HW; they come
from the 802.1Qbg spec.
IEEE EVB tutorial:
http://www.ieee802.org/802_tutorials/2009-11/evb-tutorial-draft-20091116_v09.pdf
VEB: a virtual switch that can forward packets based on a specific match
field.
VSI: a virtual interface connecting the VEB/VEPA and a virtual machine.
VEPA: a virtual Ethernet port aggregator that sends packets from the
VSI upstream to the LAN port.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rami Rosen [Fri, 26 Feb 2016 18:33:54 +0000 (20:33 +0200)]
i40e: fix typo in a comment
This patch fixes a typo in a comment in the definition of
the i40e_pf struct.
Fixes:
4861cde46116 ("i40e: new poll mode driver")
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Jingjing Wu [Thu, 25 Feb 2016 07:33:35 +0000 (15:33 +0800)]
examples/vmdq_dcb: support X710
Currently, the example vmdq_dcb only works on Intel(R) 82599 NICs.
This patch extends this sample to make it work both on Intel(R) 82599
and X710/XL710 NICs by making the following changes:
1. add VMDQ base queue checking to avoid forwarding on PF queues.
2. assign each VMDQ pool to a MAC address.
3. add more arguments (nb-tcs, enable-rss) to change the default
setting
4. extend the max number of queues from 128 to 1024.
This patch also reworks the user guide for the vmdq_dcb sample.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Jingjing Wu [Thu, 25 Feb 2016 07:33:33 +0000 (15:33 +0800)]
i40e: enable DCB in VMDQ VSIs
Previously, DCB (Data Center Bridging) was only enabled on the PF, and
queue mapping and BW configuration were only done on the PF.
This patch enables DCB for VMDQ VSIs (Virtual Station Interfaces)
with the following steps:
1. Take BW and ETS(Enhanced Transmission Selection)
configuration on VEB(Virtual Ethernet Bridge).
2. Take BW and ETS configuration on VMDQ VSIs.
3. Update TC(Traffic Class) and queues mapping on VMDQ VSIs.
To enable DCB on VMDQ, the number of TCs should not be larger than
the number of queues in VMDQ pools, and the number of queues per
VMDQ pool is specified by CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM
in config/common_* file.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:37 +0000 (16:14 +0800)]
i40evf: use base driver defined interface
It removes the i40evf_set_mac_type() defined in PMD, and reuses
i40e_set_mac_type() defined in base driver.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:36 +0000 (16:14 +0800)]
i40e/base: add base driver release info
It adds base driver release information such as release date,
for better tracking in the future.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:35 +0000 (16:14 +0800)]
i40e/base: update AQ command structures and macros
Several structures and macros are added or updated, such
as 'struct i40e_aqc_get_link_status',
'struct i40e_aqc_run_phy_activity' and
'struct i40e_aqc_lldp_set_local_mib_resp'.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:34 +0000 (16:14 +0800)]
i40e/base: add AQ thermal sensor control struct
It adds the new AQ command and struct for managing a
thermal sensor.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:33 +0000 (16:14 +0800)]
i40e/base: add virtchnl offload for X722 PCTYPES
X722 supports expanded versions of the TCP and UDP PCTYPES for RSS.
Add a virtchnl offload to support this.
Without this patch, VF drivers will not be able to support
the correct PCTYPES for X722, and UDP flows will not fan out.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:32 +0000 (16:14 +0800)]
i40e/base: add some register definitions
This patch adds 7 new register definitions for programming the
parser, flow director and RSS blocks in the HW.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:31 +0000 (16:14 +0800)]
i40e: use AQ for Rx control register read/write
RX control register read/write functions are added, as direct
reads and writes may fail under stress with small-packet traffic.
After the adminq is ready, all RX control registers should be read
and written through these dedicated functions.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:30 +0000 (16:14 +0800)]
i40e/base: fix coding style
Clean up coding style in base code
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:29 +0000 (16:14 +0800)]
i40e/base: save VSI resource count on update
When updating a VSI, save off the number of allocated and
unallocated VSIs as we do when adding a VSI.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:28 +0000 (16:14 +0800)]
i40e/base: fix driver load failure
Fix the driver load failure when linking with some
PHY types, as the amount of time it takes for
GLGEN_RSTAT_DEVSTATE to be set increases greatly on those PHY
types, which can lead to a timeout.
Fixes:
9aeefed05538 ("i40e/base: support ESS")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:27 +0000 (16:14 +0800)]
i40e/base: avoid unwanted Tx traffic mirroring
In Multi-Function Mode (MFP), particularly when the PF VSI is set
in limited promiscuous mode, the HW switch was still mirroring the
outgoing packets from other VSIs (VF/VMDq) onto the PF VSI.
This sets a new bit to avoid the above mirroring when the PF VSI is
in limited promiscuous mode in MFP, similar to the default port
VSI.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:26 +0000 (16:14 +0800)]
i40e/base: support LED blinking with new PHY
This patch adds functions to blink the LED on devices using a new
PHY, since the MAC registers used in other designs do not work in
this device configuration.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:25 +0000 (16:14 +0800)]
i40e/base: add AQ switch configuration
Add the support code for calling the AdminQ API call
aq_set_switch_config.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:24 +0000 (16:14 +0800)]
i40e/base: add VEB statistics control
With the latest firmware, statistics gathering can now be enabled and
disabled in the HW switch, so we need to add a parameter to allow the
driver to set it as desired. At the same time, the L2 cloud filtering
parameter has been removed as it was never used.
Older drivers working with the newer firmware and newer drivers working
with older firmware will not run into problems with these bits as the
defaults are reasonable and there is no overlap in the bit definitions.
Also, newer drivers will be forced to update because of the change in
function call parameters, a reminder that the functionality exists.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:23 +0000 (16:14 +0800)]
i40e/base: support mirroring rules
This patch implements necessary functions related to port
mirroring features such as add/delete mirror rule, function
to set promiscuous VLAN mode for VSI if mirror rule_type is
"VLAN Mirroring".
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:22 +0000 (16:14 +0800)]
i40e/base: set shared bit for multicast filters
Add the use of the new shared MAC filter bit for multicast
and broadcast filters in order to make better use of the
filters available from the device.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:21 +0000 (16:14 +0800)]
i40e/base: fix PHY NVM interaction
This patch fixes a problem where the NVMUpdate Tool, when
using the PHY NVM feature, gets bad data from the PHY because
of contention on the MDIO interface from get phy capability
calls made by the driver during regular operations. The problem
is fixed by checking whether media is available before calling
the get phy capability function, because that bit is not set when
the device is in PHY interaction mode.
Fixes:
842ea1996335 ("i40e/base: save link module type")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:20 +0000 (16:14 +0800)]
i40e/base: unify the capability function
The device capabilities were defined in two places, and neither had
all the definitions. They really belong with the AQ API definition,
so this patch removes the other set of definitions and fills in the
missing item.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:19 +0000 (16:14 +0800)]
i40e/base: fix proxy for X722
The recently added proxy opcodes should be available only with
X722_SUPPORT, so move them into the #ifdef, and reorder these
to be in numerical order with the rest of the opcodes. Several
structs that were added are unnecessary, so they are removed
here.
Fixes:
788fc17b2dec ("i40e/base: support proxy config for X722")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:18 +0000 (16:14 +0800)]
i40e/base: fix wake on lan for X722
The recently added Wake on LAN (WOL) opcodes should be
available only with X722_SUPPORT, so move them into the #ifdef,
and reorder these to be in numerical order with the rest of the
opcodes. Several structs that were added are unnecessary, so
they are removed here.
Fixes:
3c89193a36fd ("i40e/base: support WOL config for X722")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:17 +0000 (16:14 +0800)]
i40e: update device ids
Add new device IDs for backplane and QSFP+ adapters, and delete
the deprecated one for backplane.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:16 +0000 (16:14 +0800)]
i40e/base: fix uncertain event descriptor issue
In one obscure corner case, it was possible to clear the NVM update
wait flag when no update_done message was actually received. This
patch cleans the event descriptor before use, and moves the opcode
check to where it won't get done if there was no event to clean.
Fixes:
8db9e2a1b232 ("i40e: base driver")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:15 +0000 (16:14 +0800)]
i40e/base: set AQ count after memory allocation
The standard way to check if the AQ is enabled is to look at
the count field, so the driver should only set this field after it
has successfully allocated memory.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:14 +0000 (16:14 +0800)]
i40e/base: fix missing check for stopped admin queue
It's possible that while a caller is waiting for the spinlock, another
entity (that owns the spinlock) has shut down the admin queue.
If the waiting caller then attempts to use the queue, it will panic.
This patch adds a check for this condition on the receive side. This
matches an existing check on the send queue side.
Fixes:
8db9e2a1b232 ("i40e: base driver")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
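The stopped-queue check described above follows a common pattern: re-validate shared state after taking the lock. The sketch below is a simplified model with hypothetical names; the `locked` field stands in for a real spinlock, and a `count` of zero is assumed to mean the queue is not initialized or was shut down.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_ERR_QUEUE_EMPTY (-1)  /* hypothetical error code */

struct sketch_ring {
    uint16_t count;  /* 0 means the queue is not set up or was shut down */
    int locked;      /* stand-in for a real spinlock */
};

static void sketch_lock(struct sketch_ring *r)
{
    assert(!r->locked);
    r->locked = 1;
}

static void sketch_unlock(struct sketch_ring *r)
{
    assert(r->locked);
    r->locked = 0;
}

/* Receive one event, refusing to touch a queue that has been shut down. */
static int sketch_recv(struct sketch_ring *r)
{
    int ret = 0;

    sketch_lock(r);
    /* Another owner may have shut the queue down while we were waiting. */
    if (r->count == 0) {
        ret = SKETCH_ERR_QUEUE_EMPTY;
        goto out;
    }
    /* ... processing of one event would go here ... */
out:
    sketch_unlock(r);
    return ret;
}
```

The check sits after the lock is taken, not before, because the state can change while the caller is blocked waiting for the lock.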
Helin Zhang [Tue, 8 Mar 2016 08:14:13 +0000 (16:14 +0800)]
i40e/base: limit version check of DCB
XL710/X710 devices require FW version checks to properly handle
DCB configurations from the FW while other devices (e.g. X722)
do not, so limit these checks to XL710/X710 only.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:12 +0000 (16:14 +0800)]
i40e/base: support NVM read on X722
In X722, NVM reads can't be done through SRCTL registers. They
require AQ calls, which in turn require grabbing the NVM lock.
Unfortunately, some paths need to acquire the lock once, perform
a series of reads, and then release it.
This patch creates an unsafe version of the read calls, so
that it can be called from the paths that need the bulk access.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
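The split between a locked read and an "unsafe" read that expects the caller to hold the lock can be sketched as follows. All names are hypothetical and the NVM is modelled as an in-memory array; this is not the i40e base code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical NVM model: an array of words plus an ownership flag. */
struct sketch_nvm {
    const uint16_t *words;
    size_t nwords;
    int owned;           /* nonzero while a caller holds ownership */
    unsigned acquires;   /* how many times ownership was taken */
};

static int nvm_acquire(struct sketch_nvm *n)
{
    if (n->owned)
        return -1;       /* busy */
    n->owned = 1;
    n->acquires++;
    return 0;
}

static void nvm_release(struct sketch_nvm *n)
{
    n->owned = 0;
}

/* "Unsafe" read: the caller must already hold ownership. */
static int nvm_read_word_unlocked(struct sketch_nvm *n, size_t off,
                                  uint16_t *out)
{
    assert(n->owned);
    if (off >= n->nwords)
        return -1;
    *out = n->words[off];
    return 0;
}

/* Safe single read: acquire, read one word, release. */
static int nvm_read_word(struct sketch_nvm *n, size_t off, uint16_t *out)
{
    int ret = nvm_acquire(n);

    if (ret != 0)
        return ret;
    ret = nvm_read_word_unlocked(n, off, out);
    nvm_release(n);
    return ret;
}

/* Bulk path: acquire once, read every word, then release. */
static int nvm_sum(struct sketch_nvm *n, uint16_t *sum)
{
    uint16_t w, acc = 0;
    size_t i;
    int ret = nvm_acquire(n);

    if (ret != 0)
        return ret;
    for (i = 0; i < n->nwords && ret == 0; i++) {
        ret = nvm_read_word_unlocked(n, i, &w);
        if (ret == 0)
            acc = (uint16_t)(acc + w);
    }
    nvm_release(n);
    if (ret == 0)
        *sum = acc;
    return ret;
}
```

The bulk path takes ownership once for the whole scan, which is the use case the unlocked variant exists for.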
Helin Zhang [Tue, 8 Mar 2016 08:14:11 +0000 (16:14 +0800)]
i40e/base: add flag for X722 register access
Instead of doing the MAC check, use a flag that gets set per
MAC. This way there is less chance of user error, and multiple
MACs can be given the capability in a single place
rather than cluttering the code with MAC checks.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:10 +0000 (16:14 +0800)]
i40e/base: acquire NVM ownership before reading it
SW needs to acquire the NVM ownership before issuing an AQ read
to the X722 NVM; otherwise it will get EBUSY from the firmware.
The ownership should also be released when done.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 08:14:09 +0000 (16:14 +0800)]
i40e/base: fix compilation warnings
Fix compilation warnings in base code on some platforms.
Fixes:
bd6651c2d2d7 ("i40e/base: use bit shift macros")
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Helin Zhang [Tue, 8 Mar 2016 06:42:09 +0000 (14:42 +0800)]
i40evf: rework MAC address validation
Use ether API of 'is_valid_assigned_ether_addr' to validate
MAC address.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Helin Zhang [Tue, 8 Mar 2016 06:42:08 +0000 (14:42 +0800)]
i40e: generate MAC address for VF
Generate a MAC address for each VF during PF host
initialization.
Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Julien Meunier [Thu, 4 Feb 2016 11:02:16 +0000 (12:02 +0100)]
i40e: fix VLAN filtering
VLAN filtering was always performed, even if hw_vlan_filter was
disabled. During device initialization, default filter
RTE_MACVLAN_PERFECT_MATCH was applied. In this situation, all incoming
VLAN frames were dropped by the card (increase of the register RUPP - Rx
Unsupported Protocol).
In order to restore default behavior, if HW VLAN filtering is activated,
set a filter to match MAC and VLAN. If not, set a filter to only match
MAC.
Fixes:
4861cde46116 ("i40e: new poll mode driver")
Fixes:
912b595146d6 ("i40e: mac vlan filter")
Signed-off-by: Julien Meunier <julien.meunier@6wind.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Rich Lane [Wed, 23 Dec 2015 08:08:00 +0000 (00:08 -0800)]
i40e: fix inverted check for no refcount
The no-refcount path was being taken without the application opting
in to it.
Fixes:
4861cde46116 ("i40e: new poll mode driver")
Reported-by: Mike Stolarchuk <mike.stolarchuk@bigswitch.com>
Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Jingjing Wu [Thu, 25 Feb 2016 07:33:34 +0000 (15:33 +0800)]
ixgbe: disallow unsupported Rx mode
The multi queue mode ETH_MQ_RX_VMDQ_DCB_RSS is not supported in
ixgbe driver.
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Bernard Iremonger [Tue, 8 Mar 2016 17:10:27 +0000 (17:10 +0000)]
ixgbe: fix VF close to remove MAC address
Call the ixgbevf_remove_mac_addr() function in the ixgbevf_dev_close()
function to ensure that the VF traffic goes to the PF after stop,
close and detach of the VF.
Fixes:
af75078fece3 ("first public release")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Bernard Iremonger [Tue, 8 Mar 2016 17:10:26 +0000 (17:10 +0000)]
ixgbe: add more information to multiqueue error message
Add the nb_rx_q and nb_tx_q values to the error message
to give details about the error.
Fixes:
27b609cbd1c6 ("ethdev: move the multi-queue mode check to specific drivers")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Bernard Iremonger [Tue, 8 Mar 2016 17:10:25 +0000 (17:10 +0000)]
ixgbe: fix releasing queues twice when detaching VF
Releasing the rx and tx queues is already done in ixgbe_dev_close()
so it does not need to be done in eth_ixgbevf_dev_uninit().
Fixes:
2866c5f1b87e ("ixgbe: support port hotplug")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Zhe Tao [Thu, 10 Mar 2016 15:26:22 +0000 (15:26 +0000)]
ixgbe: fix VF Rx/Tx function assignment
When a DPDK secondary process initializes ixgbevf, it always uses
the simple RX function or the LRO RX function, which differs from the
RX/TX function selection logic used by the primary process.
Use ixgbe_set_tx_function and ixgbe_set_rx_function to select the
RX/TX functions when a secondary process calls the init function for eth dev.
Fixes:
9d8a92628f21 ("ixgbe: remove simple scalar scattered Rx method")
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Wenzhuo Lu [Fri, 26 Feb 2016 03:05:29 +0000 (11:05 +0800)]
ixgbe: support link speed auto-negotiation on X550em_x
Normally, auto-negotiation is supported by FW and SW need not care about
it. But on x550em_x, FW doesn't support auto-neg. As the x550em_x ports
are 10G, if we connect the port with a peer which is 1G, the link will
always be down.
We need to support auto-neg in SW to avoid this link down issue. As we
already have the code to handle the link speed setting, all we need is a
trigger. When the advertised link speed changes, a PHY interrupt is
triggered. So, we should handle this interrupt and call ixgbe_handle_lasi
to set the link speed correctly.
Please be aware that this only works when auto-neg is on. If the auto-neg
of the peer port is turned off and its speed is set manually, we should
also set the speed of our own port manually.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 06:24:47 +0000 (14:24 +0800)]
ixgbe: support multicast promiscuous mode on VF
Add multicast promiscuous mode support to the ixgbe VF driver.
Please note that to use this promiscuous mode, both the PF
and VF drivers need to support it, because this VF feature is
configured on the PF.
If using a kernel PF driver + DPDK VF driver, make sure the kernel PF
driver supports VF multicast promiscuous mode. If using DPDK PF + DPDK VF,
make sure the PF driver is the same version as the VF driver.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
Acked-by: Xiao Wang <xiao.w.wang@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:06 +0000 (16:55 +0800)]
ixgbe: support new devices and MAC types
Add the support for new devices and mac types, as supported by the base
code update.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:05 +0000 (16:55 +0800)]
ixgbe/base: update readme
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:04 +0000 (16:55 +0800)]
ixgbe/base: abstract out link read/write
It's more valuable to abstract the link read/write interface. As such,
change the following method names, and add them to a new link info
structure:
read_i2c_combined => read_link
read_i2c_combined_unlocked => read_link_unlocked
write_i2c_combined => write_link
write_i2c_combined_unlocked => write_link_unlocked
This will allow X550EM_a to override these methods for MDIO access
while X550EM_x provides methods to use I2C combined access.
Initially the structure is just method pointers and a bus
address.
Two functions involved in combined I2C accesses were moved from
ixgbe_phy.c to ixgbe_x550.c. The underlying functions that carry
out the combined I2C accesses were left in ixgbe_phy.c because
they share some functions with other I2C methods.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
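The link abstraction described above, a structure of method pointers plus a bus address, can be sketched like this. The type and function names are hypothetical and the two backends only record which bus was used; they do not reflect the real ixgbe base code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 8u

enum sketch_bus { SKETCH_BUS_NONE, SKETCH_BUS_I2C, SKETCH_BUS_MDIO };

struct sketch_hw;

/* Method pointers plus a bus address, as the commit message describes. */
struct sketch_link_info {
    int (*read_link)(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                     uint16_t *val);
    int (*write_link)(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                      uint16_t val);
    uint8_t addr;
};

struct sketch_hw {
    struct sketch_link_info link;
    uint16_t regs[SKETCH_NUM_REGS];  /* fake register file for the demo */
    enum sketch_bus last_bus;        /* which backend handled the last op */
};

/* Shared helper: both fake backends access the same register file. */
static int fake_access(struct sketch_hw *hw, enum sketch_bus bus,
                       uint16_t reg, uint16_t *rd, const uint16_t *wr)
{
    if (reg >= SKETCH_NUM_REGS)
        return -1;
    hw->last_bus = bus;
    if (wr != NULL)
        hw->regs[reg] = *wr;
    if (rd != NULL)
        *rd = hw->regs[reg];
    return 0;
}

static int i2c_read(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                    uint16_t *val)
{
    (void)addr;
    return fake_access(hw, SKETCH_BUS_I2C, reg, val, NULL);
}

static int i2c_write(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                     uint16_t val)
{
    (void)addr;
    return fake_access(hw, SKETCH_BUS_I2C, reg, NULL, &val);
}

static int mdio_read(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                     uint16_t *val)
{
    (void)addr;
    return fake_access(hw, SKETCH_BUS_MDIO, reg, val, NULL);
}

static int mdio_write(struct sketch_hw *hw, uint8_t addr, uint16_t reg,
                      uint16_t val)
{
    (void)addr;
    return fake_access(hw, SKETCH_BUS_MDIO, reg, NULL, &val);
}

/* Device setup picks the backend once; callers never name the bus. */
static void init_link(struct sketch_hw *hw, enum sketch_bus bus, uint8_t addr)
{
    hw->link.addr = addr;
    hw->link.read_link = (bus == SKETCH_BUS_MDIO) ? mdio_read : i2c_read;
    hw->link.write_link = (bus == SKETCH_BUS_MDIO) ? mdio_write : i2c_write;
}

static int link_read(struct sketch_hw *hw, uint16_t reg, uint16_t *val)
{
    assert(hw->link.read_link != NULL);
    return hw->link.read_link(hw, hw->link.addr, reg, val);
}

static int link_write(struct sketch_hw *hw, uint16_t reg, uint16_t val)
{
    assert(hw->link.write_link != NULL);
    return hw->link.write_link(hw, hw->link.addr, reg, val);
}
```

Callers use `link_read` and `link_write` without knowing which bus sits underneath, so one device family can plug in MDIO methods while another keeps I2C combined access.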
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:03 +0000 (16:55 +0800)]
ixgbe/base: set MDIO speed after MAC reset
The MDIO clock speed must be reconfigured after the MAC reset.
Otherwise the MDIO clock speed becomes invalid, and the driver reads
invalid PHY register values. The driver now sets the MDIO clock
speed prior to initializing PHY ops and again after the MAC reset.
As the MDIO speed now gets set in more than one place, make a
function for it so it will always be done correctly.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:02 +0000 (16:55 +0800)]
ixgbe/base: fix setting flow director flag twice
Do not set FDIRCTRL.DROP_NO_MATCH in ixgbe_init_fdir_perfect_82599();
this bit is already set in ixgbe_set_fdir_drop_queue_82599(), which
makes more sense for drivers that call that function.
This resolves an issue where packets were being dropped when switching
to perfect filters mode.
Setting this bit makes no sense in perfect filters mode for the
driver as we do not want to route all packets that don't match an FDIR
rule to a single queue and instead fall back to RSS.
Drivers that need this bit set can call ixgbe_set_fdir_drop_queue_82599()
and the ones that don't, can preserve the old behavior.
Fixes:
2241ce281646 ("ixgbe/base: add flow director drop queue")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:01 +0000 (16:55 +0800)]
ixgbe/base: add register definition for SGMII busy
The X550EM_a device provides the MAC_SGMII_BUSY register to
indicate when slow SGMII register writes complete. Add
definitions for the register. No definitions are provided for
the individual bits under the theory that it is better to wait
for everything to complete when needed rather than try to map
out which reads need to wait for which writes. So we should wait
when anything is marked as "busy".
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:55:00 +0000 (16:55 +0800)]
ixgbe/base: ignore manageability for PHY power on
Instead of not defining the callback for set_phy_power when
manageability is enabled, put the check in the set_phy_power
function so that only turning the power off is conditional on
management, but not turning the PHY on.
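A minimal sketch of the control flow described above. The names
struct phy_state, mng_enabled and set_phy_power() are hypothetical and do
not reproduce the ixgbe base code signatures; the point is only that the
manageability check guards the power-off path and nothing else.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the PHY state; not the ixgbe hw struct. */
struct phy_state {
    bool mng_enabled; /* manageability firmware is using the PHY */
    bool powered;
};

/* Returns 0 on success, -1 if the request was refused. */
static int set_phy_power(struct phy_state *phy, bool on)
{
    assert(phy != 0);
    if (!on && phy->mng_enabled)
        return -1; /* only power-off is conditional on manageability */
    phy->powered = on;
    return 0;
}
```

Returning an error on a refused power-off is an assumption of this sketch;
the real driver may simply ignore the request.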
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:59 +0000 (16:54 +0800)]
ixgbe/base: set VF MAC address only when acked by PF
This patch resolves an issue where the VF MAC address is zeroed out
when the VF driver is loaded while the PF interface is down.
The solution is to set it only when we get an ACK from the PF.
Fixes:
6202266e5680 ("ixgbe/base: vf changes")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:58 +0000 (16:54 +0800)]
ixgbe/base: add sw-firmware sync for resource sharing on X550em_a
Use a PHY token, shared between software and firmware, for PHY access on
X550EM_a.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:57 +0000 (16:54 +0800)]
ixgbe/base: support X550em_x V2 device
Only x550em_x V1 was supported before. Now V2 is supported.
A mask for V1 and V2 is defined and used to support both.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wenzhuo Lu [Sun, 14 Feb 2016 08:54:56 +0000 (16:54 +0800)]
ixgbe/base: support X550em_a device
Add new X550EM_a devices and their mac types, X550EM_a
and X550EM_a_vf.
Update the code to use the new devices and mac types.
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Michael Qiu [Fri, 29 Jan 2016 05:58:10 +0000 (13:58 +0800)]
ixgbe: fix disable interrupt twice
Currently, the ixgbe VF and PF disable interrupts twice, once in the
stop stage and once in the uninit stage. This causes an error:
testpmd> quit
Shutting down port 0...
Stopping ports...
Done
Closing ports...
EAL: Error disabling MSI-X interrupts for fd 26
Done
because the interrupt has already been disabled in the stop stage.
Since it is enabled in the init stage, it is better to remove the
disable from the stop stage.
Fixes:
0eb609239efd ("ixgbe: enable Rx queue interrupts for PF and VF")
Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Stephen Hemminger [Wed, 13 Jan 2016 04:54:10 +0000 (20:54 -0800)]
ixgbe: fix whitespace
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Stephen Hemminger [Fri, 13 Nov 2015 16:10:13 +0000 (08:10 -0800)]
ixgbe: speed up non-vector Tx
The freeing of mbufs in ixgbe is one of the observable hot spots
under load. Optimize it by doing a bulk free of mbufs, using code similar
to i40e and fm10k.
Drop the no longer needed micro-optimization for the no refcount flag.
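The bulk free idea can be sketched as follows. This is not the ixgbe code:
struct pool, struct buf, pool_put_bulk() and FREE_BATCH are hypothetical
stand-ins for the mempool, the mbuf and the bulk put call, chosen so the
sketch compiles on its own. Consecutive buffers from the same pool are
collected and returned with one call instead of one call per buffer.

```c
#include <assert.h>

#define FREE_BATCH 64 /* assumed batch size */

struct pool { unsigned put_calls; unsigned freed; };
struct buf  { struct pool *pool; };

/* Stand-in for a bulk "return to pool" call; only counts here. */
static void pool_put_bulk(struct pool *p, struct buf **bufs, unsigned n)
{
    (void)bufs;
    p->put_calls++;
    p->freed += n;
}

/* Free n buffers, merging runs of buffers that share a pool. */
static void free_bufs(struct buf **bufs, unsigned n)
{
    struct buf *batch[FREE_BATCH];
    unsigned nb = 0;

    assert(bufs != 0 || n == 0);
    for (unsigned i = 0; i < n; i++) {
        /* Flush when the pool changes or the batch is full. */
        if (nb > 0 &&
            (batch[0]->pool != bufs[i]->pool || nb == FREE_BATCH)) {
            pool_put_bulk(batch[0]->pool, batch, nb);
            nb = 0;
        }
        batch[nb++] = bufs[i];
    }
    if (nb > 0)
        pool_put_bulk(batch[0]->pool, batch, nb);
}
```

In the common case where all Tx buffers come from one pool, the whole burst
is returned with a single call.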
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Stephen Hemminger [Fri, 22 Jan 2016 01:38:37 +0000 (17:38 -0800)]
igb: set default thresholds based on MAC type
This brings the DPDK igb driver in line with the behavior used by
the current Linux driver. The IGB hardware has several different
MAC types and the threshold values that work vary based on the hardware.
Since DPDK 1.8 it has been up to devices to provide the correct default
configuration parameters. But the igb driver gives values that are broken
on some devices, and that always cause a warning message at startup.
Please test this on real hardware, I don't have the luxury of a
hardware lab full of variations of this chip.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Bernard Iremonger [Wed, 2 Mar 2016 16:09:06 +0000 (16:09 +0000)]
e1000: fix VF MAC address on close
Allow reprogramming of the RAR with a zero MAC address,
to ensure that the VF traffic goes to the PF after
stop, close and detach of the VF.
Fixes:
be2d648a2dd3 ("igb: add PF support")
Fixes:
d82170d27918 ("igb: add VF support")
Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Yury Kylulin [Tue, 9 Feb 2016 09:09:43 +0000 (12:09 +0300)]
e1000: support VF promiscuous and allmulticast
Enable promiscuous and allmulticast mode control from the VF using
rte_eth_promiscuous_enable()/rte_eth_promiscuous_disable() and
rte_eth_allmulticast_enable()/rte_eth_allmulticast_disable().
For promiscuous mode, the host/PF igb driver should be built with
IGB_ENABLE_VF_PROMISC.
For allmulticast mode, the "allmulti" flag should be set on the
appropriate PF:
ifconfig eth0 allmulti
Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Ravi Kerur [Wed, 2 Mar 2016 13:59:34 +0000 (05:59 -0800)]
e1000: support I217 and I218 devices
Modified driver and eal code to support I217 and I218 Intel NICs.
Compiled and tested (via testpmd) on Ubuntu 14.04 for target
x86_64-native-linuxapp-gcc
Compiled for target x86_64-native-linuxapp-clang
Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
John Daley [Tue, 8 Mar 2016 18:49:07 +0000 (10:49 -0800)]
enic: fix last packet not being sent
The last packet of the Tx burst function array was not being
emitted until the subsequent call. The NIC descriptor index
was being set to the current Tx descriptor instead of one past
the descriptor, as required by the NIC.
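A small sketch of the off-by-one, under the assumption that the index
wraps at the ring size. posted_index_after() is a hypothetical helper, not
a function from the enic PMD.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Index to post to the NIC after filling the descriptor at last_filled:
 * one past that descriptor, wrapping at the end of the ring.
 */
static uint32_t posted_index_after(uint32_t last_filled, uint32_t ring_size)
{
    assert(ring_size > 0 && last_filled < ring_size);
    return (last_filled + 1) % ring_size;
}
```

Posting last_filled itself would leave the final descriptor of the burst
unsent until the next call moves the index forward.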
Fixes:
d739ba4c6abf ("enic: improve Tx packet rate")
Signed-off-by: John Daley <johndale@cisco.com>
John Daley [Fri, 4 Mar 2016 21:09:00 +0000 (13:09 -0800)]
enic: improve Rx performance
This is a wholesale replacement of the Enic PMD receive path in order
to improve performance and code clarity. The changes are:
- Simplify and reduce code path length of receive function.
- Put most of the fast-path receive functions in one file.
- Reduce the number of posted_index updates (pay attention to
rx_free_thresh).
- Remove the unneeded container structure around the RQ mbuf ring.
- Prefetch next mbuf and descriptors while processing the current one.
- Use a lookup table for converting CQ flags to mbuf flags.
Signed-off-by: John Daley <johndale@cisco.com>
Yoann Desmouceaux [Wed, 24 Feb 2016 23:06:15 +0000 (00:06 +0100)]
enic: fix DMA address of outgoing packets
The enic PMD send function used a constant offset instead
of relying on the data_off in the mbuf to find the start of the packet.
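A sketch of the address computation, assuming a reduced mbuf with only the
two fields that matter here. struct mini_mbuf and pkt_dma_addr() are
hypothetical names for illustration and are not the DPDK definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced mbuf: only the fields needed for the example. */
struct mini_mbuf {
    uint64_t buf_dma_addr; /* bus address of the start of the buffer */
    uint16_t data_off;     /* offset of the packet data in the buffer */
};

/* DMA address of the first byte of packet data. */
static uint64_t pkt_dma_addr(const struct mini_mbuf *m)
{
    assert(m != 0);
    return m->buf_dma_addr + m->data_off;
}
```

A constant offset only gives the right address while every mbuf keeps the
default headroom; once an application prepends or strips headers, data_off
changes and the constant points at the wrong bytes.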
Fixes:
fefed3d1e62c ("enic: new driver")
Signed-off-by: Yoann Desmouceaux <ydesmouc@cisco.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Rahul Lakkireddy [Thu, 25 Feb 2016 09:37:53 +0000 (15:07 +0530)]
cxgbe: fix PCI info copy to ports under same PF
Chelsio NIC ports share a single PF. Move rte_eth_copy_pci_info()
to copy the PCI device information to the remaining ports as well.
Also update license year to 2016.
Fixes:
eeefe73f0af1 ("drivers: copy PCI device info to ethdev data")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Fri, 12 Feb 2016 11:45:30 +0000 (17:15 +0530)]
cxgbe: fix memory leak after initialization failure
Add missing code to free adapter when the device initialization fails.
Fixes:
8318984927ff ("cxgbe: add pmd skeleton")
Reported-by: Seth Arnold <seth.arnold@canonical.com>
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Tue, 19 Jan 2016 10:17:08 +0000 (15:47 +0530)]
cxgbe: fix setting wrong MTU
max_rx_pkt_len already includes ETHER_HDR_LEN and ETHER_CRC_LEN on top of
the MTU. But the firmware also adds ETHER_HDR_LEN and ETHER_CRC_LEN to the
MTU specified. Fix by subtracting these values from max_rx_pkt_len before
passing it to firmware.
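The arithmetic can be sketched as below. fw_mtu_from_max_rx_pkt_len() is a
hypothetical helper name; the two constants use the usual Ethernet values
(14-byte header, 4-byte CRC).

```c
#include <assert.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14 /* dst MAC + src MAC + ethertype */
#define ETHER_CRC_LEN 4  /* frame check sequence */

/* MTU to hand to firmware, which adds header and CRC length itself. */
static uint32_t fw_mtu_from_max_rx_pkt_len(uint32_t max_rx_pkt_len)
{
    assert(max_rx_pkt_len >= ETHER_HDR_LEN + ETHER_CRC_LEN);
    return max_rx_pkt_len - (ETHER_HDR_LEN + ETHER_CRC_LEN);
}
```

For a standard 1518-byte frame this yields 1500, so the firmware arrives
back at 1518 instead of 1536.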
Fixes:
4b2eff452d2e ("cxgbe: enable jumbo frames")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Rahul Lakkireddy [Tue, 19 Jan 2016 10:17:07 +0000 (15:47 +0530)]
cxgbe: fix allocated size for RSS table
The size of each entry in the port's RSS table is actually 2 bytes
and not 1 byte. A segfault occurs when accessing part of port 0's RSS
table because it gets overwritten by port 1's part of the
RSS table. Fix by setting the size of each entry appropriately.
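A sketch of the sizing, assuming one contiguous table split into per-port
slices of 16-bit entries. The helper names and the layout are illustrative
assumptions, not the cxgbe data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed for a table of total_entries 16-bit RSS entries. */
static size_t rss_table_bytes(size_t total_entries)
{
    return total_entries * sizeof(uint16_t);
}

/* Byte offset of a port's slice inside the shared table. */
static size_t port_rss_offset(size_t port, size_t entries_per_port)
{
    assert(entries_per_port > 0);
    return port * entries_per_port * sizeof(uint16_t);
}
```

Sizing the slices with 1-byte entries halves every offset, so port 1's
slice starts in the middle of port 0's entries.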
Fixes:
92c8a63223e5 ("cxgbe: add device configuration and Rx support")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Charles (Chas) Williams [Thu, 31 Dec 2015 00:37:51 +0000 (19:37 -0500)]
bnx2x: determine queue sizes sooner
The VF needs to determine the queue sizes before .dev_infos_get
so that it can hint to the upper layer the proper sizes. Move
bnx2x_vf_get_resources() to .eth_dev_init and probe with the guesses
from bnx2x_init_rte().
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Charles (Chas) Williams [Thu, 31 Dec 2015 00:37:50 +0000 (19:37 -0500)]
bnx2x: fix resource allocation error handling
bnx2x_loop_obtain_resources() returns a struct containing the status and
the error message. If bnx2x_do_req4pf() fails, it shouldn't return both
of these fields set to 0 indicating failure and no error.
Further, bnx2x_do_req4pf() needs to be able to fail and return NO_RESOURCES
so that bnx2x_loop_obtain_resources() can negotiate reduced resource
requirements. This requires additional checking around bnx2x_do_req4pf().
Fixes:
540a211084a7 ("bnx2x: driver core")
Signed-off-by: Chas Williams <3chas3@gmail.com>
Acked-by: Rasesh Mody <rasesh.mody@qlogic.com>
Stephen Hemminger [Tue, 5 Jan 2016 16:32:00 +0000 (08:32 -0800)]
bnx2x: remove unused variable
The mbuf_alloc_size is a leftover from BSD or some other code base.
It is set but never used in the DPDK driver. With it removed, the related
defines can also be eliminated.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Harish Patil <harish.patil@qlogic.com>
Liming Sun [Wed, 10 Feb 2016 05:15:21 +0000 (00:15 -0500)]
mpipe: fix crash when testpmd is quit under load
Fix the hang/crash when quitting testpmd under a high
traffic rate. The following issues were found and fixed:
1. edesc->size is not initialized properly in mpipe_do_xmit() and could
cause buffer leak or corruption when HW buffer return is used.
2. Check the 'idesc.be' error bit in mpipe_recv_flush() to make sure
buffer is valid before releasing it. This is to avoid issues when
running out of buffers.
3. priv->rx_buffers counter is not accurate when HW buffer return is
used. Remove this counter to simplify the code.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:38 +0000 (09:30 -0500)]
mpipe: fix link initialization ordering
The mpipe link structure is initialized in mpipe_link_init().
Currently it is only called from eth_dev_ops.dev_start, which
caused crashes when link management APIs (like promiscuous_enable)
were called before eth_dev_ops.dev_start(). This patch fixes it
by calling mpipe_link_init() in rte_pmd_mpipe_devinit().
Fixes:
a8dd50513dea ("mpipe: add TILE-Gx mPIPE poll mode driver")
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:37 +0000 (09:30 -0500)]
mpipe: optimize buffer return mechanism
This patch optimizes the mpipe buffer return. When
a packet is received, instead of allocating and refilling the
buffer stack right away, it tracks the number of pending buffers
and uses HW buffer return as an optimization when the pending
number is below a certain threshold. This saves two MMIO writes and
improves performance, especially for bidirectional traffic.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Liming Sun [Fri, 8 Jan 2016 14:30:36 +0000 (09:30 -0500)]
mk: support native build on TILE-Gx
The CROSS variable has an empty default value (for native builds) and
must be set when using a cross-toolchain.
Signed-off-by: Liming Sun <lsun@ezchip.com>
Acked-by: Zhigang Lu <zlu@ezchip.com>
Thomas Monjalon [Tue, 15 Mar 2016 18:43:55 +0000 (19:43 +0100)]
doc: fix IPsec entry in the release notes
It was inserted in the "Resolved Issues" section.
Move the entry with the new features.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tetsuya Mukawa [Mon, 14 Mar 2016 08:53:32 +0000 (17:53 +0900)]
vhost: fix default value of kickfd and callfd
Currently, the default values of kickfd and callfd are -1.
If the values are -1, the current code assumes kickfd and callfd haven't
been initialized yet. The vhost library then assumes the virtqueue isn't
ready for processing.
But callfd and kickfd will be set to -1 when "--enable-kvm"
isn't specified on the QEMU command line. It means we cannot treat -1 as
the uninitialized state.
The patch defines -1 and -2 as VIRTIO_INVALID_EVENTFD and
VIRTIO_UNINITIALIZED_EVENTFD, and uses VIRTIO_UNINITIALIZED_EVENTFD for
the default values of kickfd and callfd.
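A sketch of how the two sentinels separate "never set" from "set to -1".
The macro names and values come from the description above; struct vq_fds,
vq_fds_init() and vq_fds_ready() are hypothetical and only illustrate the
idea.

```c
#include <assert.h>

#define VIRTIO_INVALID_EVENTFD       (-1)
#define VIRTIO_UNINITIALIZED_EVENTFD (-2)

/* Hypothetical holder for the two eventfds of a virtqueue. */
struct vq_fds {
    int kickfd;
    int callfd;
};

static void vq_fds_init(struct vq_fds *q)
{
    assert(q != 0);
    q->kickfd = VIRTIO_UNINITIALIZED_EVENTFD;
    q->callfd = VIRTIO_UNINITIALIZED_EVENTFD;
}

/* Ready once the front end has set both fds, even if it set them to -1. */
static int vq_fds_ready(const struct vq_fds *q)
{
    return q->kickfd != VIRTIO_UNINITIALIZED_EVENTFD &&
           q->callfd != VIRTIO_UNINITIALIZED_EVENTFD;
}
```

With a single -1 sentinel, a guest started without "--enable-kvm" would
never be seen as ready.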
Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:46 +0000 (12:32 +0800)]
vhost: avoid dead loop chain
If a malicious guest forges a dead loop chain, it could lead to a dead
loop of copying the desc buf to mbuf, which results in all mbufs being
exhausted.
Add a variable, nr_desc, to avoid such a case.
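A sketch of the guard, under the assumption that a legitimate chain can
never hold more descriptors than the ring has slots. struct desc, F_NEXT
and walk_chain() are simplified, hypothetical stand-ins for the vring
types; the sketch also rejects an out-of-range next index.

```c
#include <assert.h>
#include <stdint.h>

#define F_NEXT 1u /* assumed flag: descriptor has a successor */

struct desc {
    uint32_t len;
    uint16_t flags;
    uint16_t next;
};

/*
 * Walk the chain starting at head. Returns the number of descriptors
 * visited, or -1 if the chain loops or leaves the ring.
 */
static int walk_chain(const struct desc *ring, uint16_t ring_size,
                      uint16_t head)
{
    uint32_t nr_desc = 1;
    uint16_t idx = head;

    assert(ring != 0);
    if (head >= ring_size)
        return -1;
    while (ring[idx].flags & F_NEXT) {
        idx = ring[idx].next;
        /* More links than slots means the guest forged a loop. */
        if (idx >= ring_size || ++nr_desc > ring_size)
            return -1;
    }
    return (int)nr_desc;
}
```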
Suggested-by: Huawei Xie <huawei.xie@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:45 +0000 (12:32 +0800)]
vhost: check for ring descriptors overflow
A malicious guest may easily forge an illegal vring desc buf.
To make our vhost robust, we need to make sure desc->next will not
go beyond the vq->desc[] array.
Suggested-by: Rich Lane <rich.lane@bigswitch.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:44 +0000 (12:32 +0800)]
vhost: do sanity check for ring descriptor length
We need to make sure that desc->len is bigger than the size of the virtio
net header. Otherwise, unexpected behaviour might happen because
"desc_avail" would become a huge number in the following code:
    desc_avail = desc->len - vq->vhost_hlen;
For the dequeue code path, it will try to allocate enough mbufs to hold a
desc buf of that size, which ends up consuming all mbufs and leaves no
free mbuf available. Therefore, you might see an error message:
    Failed to allocate memory for mbuf.
Also, for both the dequeue and enqueue code paths, while data is copied
from/to the desc buf, the huge "desc_avail" would result in accessing
memory that does not belong to the desc buf, which could lead to memory
access errors.
A malicious guest could easily forge such a malformed vring desc buf.
Restarting an interrupted DPDK application inside the guest also
triggers this issue, as all huge pages are reset to 0 during DPDK re-init,
leading to desc->len being 0.
Therefore, this patch does a sanity check for desc->len, to make vhost
robust.
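The wrap-around and the check can be sketched as follows.
desc_payload_len() is a hypothetical helper, and the exact comparison used
by the patch may differ; the sketch only rejects lengths for which the
unsigned subtraction would wrap.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Store the number of payload bytes after the header in *avail and
 * return 0, or return -1 if the descriptor cannot hold the header.
 */
static int desc_payload_len(uint32_t desc_len, uint32_t hdr_len,
                            uint32_t *avail)
{
    assert(avail != 0);
    if (desc_len < hdr_len)
        return -1; /* desc_len - hdr_len would wrap to a huge value */
    *avail = desc_len - hdr_len;
    return 0;
}
```

With desc_len 0 and a 12-byte header, the unchecked unsigned subtraction
gives a value close to 4 GiB, which is what drives the mbuf exhaustion
described above.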
Reported-by: Rich Lane <rich.lane@bigswitch.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:43 +0000 (12:32 +0800)]
vhost: remove wrong unlikely prediction in Rx
VIRTIO_NET_F_MRG_RXBUF is a default feature supported by vhost.
Adding unlikely for VIRTIO_NET_F_MRG_RXBUF detection doesn't
make sense to me at all.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:42 +0000 (12:32 +0800)]
vhost: remove rte_memcpy from header copy
First of all, rte_memcpy() is mostly useful for copying big packets
by leveraging advanced hardware instructions like AVX. But for the virtio
net hdr, which is 12 bytes at most, invoking rte_memcpy() will not
introduce any performance boost.
And, to my surprise, rte_memcpy() is VERY huge. Since rte_memcpy()
is inlined, it increases the binary code size linearly every time
we call it at a different place. Replacing the two rte_memcpy() calls
with direct copies saves nearly 12K bytes of code size!
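One way to write such a direct copy is a plain struct assignment, sketched
below. struct net_hdr mirrors the 12-byte layout of the virtio net header
with the mergeable-buffers field, but it is declared locally for the
example, and copy_hdr() is a hypothetical helper rather than the code in
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Local 12-byte header layout used only for this example. */
struct net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Fixed-size copy; the compiler can emit this as a few plain moves. */
static void copy_hdr(struct net_hdr *dst, const struct net_hdr *src)
{
    assert(dst != 0 && src != 0);
    *dst = *src;
}
```

Because the size is known at compile time, no size-dispatching copy
routine needs to be inlined at the call site.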
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 14 Mar 2016 07:35:22 +0000 (15:35 +0800)]
vhost: refactor mergeable Rx
The current virtio_dev_merge_rx() implementation looks just like the
old rte_vhost_dequeue_burst(): full of twisted logic, with the
same code block repeated in many different places.
However, the logic of virtio_dev_merge_rx() is quite similar to
virtio_dev_rx(). The big difference is that the mergeable one
could allocate more than one available entry to hold the data.
Fetching all available entries to vec_buf at once makes the
difference a bit bigger.
The refactored code looks like below:
    while (mbuf_has_not_drained_totally || mbuf_has_next) {
        if (this_desc_has_no_room) {
            this_desc = fetch_next_from_vec_buf();
            if (it is the last of a desc chain)
                update_used_ring();
        }
        if (this_mbuf_has_drained_totally)
            mbuf = fetch_next_mbuf();
        COPY(this_desc, this_mbuf);
    }
This patch removes quite a few lines of code and therefore makes it much
more readable.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:40 +0000 (12:32 +0800)]
vhost: refactor Rx
This is a simple refactor, as there isn't any twisted logic in the old
code. Here I just broke the code up and introduced two helper functions,
reserve_avail_buf() and copy_mbuf_to_desc(), to make the code more
readable.
Also, it saves nearly 1K bytes of binary code size.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 10 Mar 2016 04:32:39 +0000 (12:32 +0800)]
vhost: refactor dequeueing
The current rte_vhost_dequeue_burst() implementation is a bit messy,
with twisted logic and repeated code here and there.
However, rte_vhost_dequeue_burst() actually does a simple job: copy
the packet data from vring desc to mbuf. What's tricky here is:
- desc buff could be chained (by the desc->next field), so you need
to fetch the next one if the current one is wholly drained.
- One mbuf might not be big enough to hold all desc buff, hence you
need to chain the mbuf as well, by the mbuf->next field.
The simplified code looks like the following:
    while (this_desc_is_not_drained_totally || has_next_desc) {
        if (this_desc_has_drained_totally) {
            this_desc = next_desc();
        }
        if (mbuf_has_no_room) {
            mbuf = allocate_a_new_mbuf();
        }
        COPY(mbuf, desc);
    }
Note that the old code does special handling for skipping the virtio
header. However, that can be done simply by adjusting the desc_avail
and desc_offset variables:
    desc_avail = desc->len - vq->vhost_hlen;
    desc_offset = vq->vhost_hlen;
This refactor makes the code much more readable (IMO), and it also reduces
binary code size.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>