dpdk.git
Beilei Xing [Mon, 12 Sep 2016 09:41:46 +0000 (17:41 +0800)]
net/i40e: fix parsing QinQ packets type

Previously, the PTYPE field in the RX descriptors was not set properly
for QinQ packets. The wrong PTYPE was generated because the outer tag did
not have ORT/PIT configured, so fix this issue by configuring ORT/PIT.
This patch also changes bitmask of outer VLAN tag in L2 header
to support RSS and flow director for QinQ.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 4072d503aaa5 ("i40e: fix VLAN bitmasks for input set")

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Rich Lane [Tue, 2 Aug 2016 19:34:56 +0000 (12:34 -0700)]
net/i40e: fix null pointer dereferences when using VMDq+RSS

When using VMDQ+RSS, the queue ids used by the application are not
contiguous (see i40e_pf_config_rss). Most of the driver already handled
this, but there were a few cases where it assumed all configured queues
had been set up.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 6b4537128394 ("i40e: free queue memory when closing")
Fixes: 8e109464c022 ("i40e: allow vector Rx and Tx usage")

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Alex Zelezniak [Tue, 30 Aug 2016 01:23:29 +0000 (20:23 -0500)]
net/ixgbe: fix VF reset to apply to correct VF

In SR-IOV configuration, queues 0 - nb_rx_queues belong to VF0,
which means that with the current implementation when a reset mbox
message comes from any VF, it affects the settings of VF0.

Fix this by using PF queue index to update the correct queue.

Fixes: dbb0b8737f64 ("ixgbe: add vlan offload support")

Signed-off-by: Alex Zelezniak <alexz@att.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Wang Wei [Tue, 6 Sep 2016 12:05:17 +0000 (20:05 +0800)]
net/ixgbe: start Rx/Tx after all config done

Starting Rx/Tx before the flow director configuration causes the driver
not to receive packets from the NIC.

Signed-off-by: Wang Wei <lnykww@gmail.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Beilei Xing [Wed, 17 Aug 2016 01:58:06 +0000 (09:58 +0800)]
net/i40e: fix dropping packets with ethertype 0x88A8

In FW default settings, Ethertype 0x88A8 is treated as S-TAG,
and packets with S-TAG should be received in Port Virtualizer mode.
However, Port Virtualizer mode is not initialized in DPDK, so X710 will
drop packets with Ethertype 0x88A8.
This patch fixes this issue by turning off S-TAG identification.

Fixes: 4861cde46116 ("i40e: new poll mode driver")

Signed-off-by: Beilei Xing <beilei.xing@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
John Daley [Wed, 17 Aug 2016 22:15:26 +0000 (15:15 -0700)]
net/enic: fix bad L4 checksum flag on ICMP packets

The bad L4 checksum flag was set on IP packets which were not
also TCP or UDP packets. This includes ICMP, IGMP and OSPF packets.

L4 ptypes were being treated as bits instead of values within the
L4 mask causing the code to check L4 checksum in the completion
queue and incorrectly set the L4 bad checksum flag.
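The difference between treating L4 ptypes as bits and as values can be sketched as follows. This is only an illustration, not the enic code: the constants are local copies assumed to mirror the RTE_PTYPE_L4_* encoding, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the L4 packet-type encoding, assumed to mirror
 * RTE_PTYPE_L4_*: an enumerated 4-bit field, not independent flags. */
#define PTYPE_L4_MASK 0x00000f00u
#define PTYPE_L4_TCP  0x00000100u
#define PTYPE_L4_UDP  0x00000200u
#define PTYPE_L4_ICMP 0x00000500u

/* Buggy form (hypothetical): tests individual bits, so ICMP (0x500)
 * shares a bit with TCP (0x100) and is treated as having an L4 checksum. */
static bool l4_csum_applies_bitwise(uint32_t ptype)
{
	return (ptype & (PTYPE_L4_TCP | PTYPE_L4_UDP)) != 0;
}

/* Corrected form (hypothetical): isolate the field, compare whole values. */
static bool l4_csum_applies_by_value(uint32_t ptype)
{
	uint32_t l4 = ptype & PTYPE_L4_MASK;

	return l4 == PTYPE_L4_TCP || l4 == PTYPE_L4_UDP;
}
```

With the bitwise test an ICMP ptype passes the check and would go on to have its L4 checksum inspected; with the value comparison it does not.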

Fixes: 947d860c821f ("enic: improve Rx performance")

Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
Xiao Wang [Fri, 5 Aug 2016 03:17:43 +0000 (11:17 +0800)]
net/fm10k: fix MAC address removal from switch

When testpmd quits with two ports, the second port's MAC address
remains in the MAC table of switch manager.

There needs to be some time for HW to quiesce when closing a port,
otherwise a subsequent port close won't be handled correctly.

This patch adds a delay after turning off a logic port, just as
the kernel driver does.

Fixes: 8b5c9ec20b7b ("fm10k: support VMDQ in MAC/VLAN filter")

Reported-by: Xueqin Lin <xueqin.lin@intel.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Nelson Escobar [Tue, 9 Aug 2016 21:42:04 +0000 (14:42 -0700)]
net/enic: move link checking init to probe time

The enic DMAs link status information to the host and this requires a
little setup. This setup was being done as a result of calling
rte_eth_dev_start(). But applications expect to be able to check link
status before calling rte_eth_dev_start().

This patch moves the link status setup to enic_init() which is called
at device probe time so that link status can be checked anytime.

Fixes: fefed3d1e62c ("enic: new driver")

Signed-off-by: Nelson Escobar <neescoba@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
Dror Birkman [Tue, 20 Sep 2016 12:08:56 +0000 (15:08 +0300)]
net/pcap: fix memory leak in jumbo frames

If rte_pktmbuf_alloc() fails on any segment that is not the initial
segment, previously allocated mbufs are not freed.
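The cleanup pattern this fix calls for can be sketched as follows. The sketch uses plain malloc-style segments rather than mbufs; struct seg, seg_alloc(), chain_alloc(), chain_free() and the live_segs counter are all hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one segment of a chained packet buffer. */
struct seg {
	struct seg *next;
};

static int live_segs; /* outstanding allocations, for illustration only */

/* Hypothetical allocator that fails once *budget reaches zero. */
static struct seg *seg_alloc(int *budget)
{
	struct seg *s;

	if (*budget <= 0)
		return NULL;
	s = calloc(1, sizeof(*s));
	if (s != NULL) {
		(*budget)--;
		live_segs++;
	}
	return s;
}

static void chain_free(struct seg *head)
{
	while (head != NULL) {
		struct seg *next = head->next;

		free(head);
		live_segs--;
		head = next;
	}
}

/* Build a chain of nb_segs segments. If any allocation after the first
 * fails, free the segments already chained instead of leaking them. */
static struct seg *chain_alloc(int nb_segs, int *budget)
{
	struct seg *head = NULL, *tail = NULL;

	for (int i = 0; i < nb_segs; i++) {
		struct seg *s = seg_alloc(budget);

		if (s == NULL) {
			chain_free(head); /* the step a leaking version omits */
			return NULL;
		}
		if (head == NULL)
			head = s;
		else
			tail->next = s;
		tail = s;
	}
	return head;
}
```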

Fixes: 6db141c91e1f ("pcap: support jumbo frames")

Signed-off-by: Dror Birkman <dror.birkman@lightcyber.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:51 +0000 (10:53 +0200)]
net/mlx5: remove gather loop on segments

The Tx function used a double loop to send segmented packets; this can be
done in a single loop.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:50 +0000 (10:53 +0200)]
net/mlx5: reduce memory overhead for WQE handling

The PMD uses only power-of-two numbers of Work Queue Elements (aka WQE), so
storing the number of elements as its log2 reduces the size of the container
needed to store it.
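A minimal sketch of the idea, assuming a hypothetical queue structure (struct txq_sketch and its helpers are not the mlx5 definitions): one byte holds the log2 of the count, and the count and index mask are recomputed with a shift.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical queue descriptor: one byte holds log2 of the element
 * count instead of a 16- or 32-bit count. */
struct txq_sketch {
	uint8_t elts_n; /* log2(number of elements) */
};

/* log2 of n, where n must be a non-zero power of two. */
static uint8_t log2_pow2(uint32_t n)
{
	uint8_t l = 0;

	assert(n != 0 && (n & (n - 1)) == 0);
	while ((n >>= 1) != 0)
		l++;
	return l;
}

/* Recover the element count from its log2. */
static uint32_t txq_elts(const struct txq_sketch *q)
{
	return UINT32_C(1) << q->elts_n;
}

/* Mask used to wrap a running index into the ring. */
static uint32_t txq_mask(const struct txq_sketch *q)
{
	return txq_elts(q) - 1;
}
```

The trade-off is one extra shift when the count is needed, in exchange for a smaller field in a structure that sits in the data path.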

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:49 +0000 (10:53 +0200)]
net/mlx5: reduce memory overhead for BF handling

Blue Flame (aka BF) is a buffer allocated with a power-of-two size; its
size is returned by Verbs in log2.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:48 +0000 (10:53 +0200)]
net/mlx5: reduce memory overhead for CQE handling

The PMD uses only power-of-two numbers of Completion Queue Elements (aka CQE),
so storing the number of elements as its log2 reduces the size of the
container needed to store it.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:47 +0000 (10:53 +0200)]
net/mlx5: reduce memory overhead of Rx/Tx descriptors

The PMD uses only power-of-two numbers of descriptors, so storing the number
of elements as its log2 reduces the size of the container needed to store it.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Nélio Laranjeiro [Tue, 20 Sep 2016 08:53:46 +0000 (10:53 +0200)]
net/mlx5: rework hardware structures

Rework Work Queue Element (aka WQE) structures to fit PMD needs.
A WQE is an aggregation of 16-byte elements known as "data segments"
(aka dseg).

The only common part is the first two elements, i.e. the control segment,
which defines the job type, and the Ethernet segment, which embeds offload
requests along with other information. After that, a WQE can have:
  - a raw data packet,
  - a data pointer to the packet itself,
  - both.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Alejandro Lucero [Fri, 16 Sep 2016 11:11:14 +0000 (12:11 +0100)]
net/nfp: unregister interrupt callback when closing

In an application using the hotplug feature, unplugging a device without
unregistering the interrupt callback makes the interrupt handling unstable.

Fixes: 6c53f87b3497 ("nfp: add link status interrupt")

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Alejandro Lucero [Fri, 16 Sep 2016 11:11:04 +0000 (12:11 +0100)]
net/nfp: fix copying MAC address

Fixes: defb9a5dd156 ("nfp: introduce driver initialization")

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Alejandro Lucero [Fri, 16 Sep 2016 11:10:48 +0000 (12:10 +0100)]
net/nfp: use random MAC address if not configured

Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Jerin Jacob [Thu, 21 Jul 2016 14:01:47 +0000 (19:31 +0530)]
net/thunderx: support 81xx SoC

81xx NIC subsystem differs in new PCI subsystem_device_id and
NICVF_CAP_CQE_RX2 capability.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Jerin Jacob [Thu, 21 Jul 2016 14:01:46 +0000 (19:31 +0530)]
net/thunderx: add tunneling extension info capability flag

Certain ThunderX SoC passes have an additional optional word
in the Rx descriptor to hold tunneling extension info.
Based on this capability, the location where the packet pointer
address is stored in the Rx descriptor will vary.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Jerin Jacob [Thu, 21 Jul 2016 14:01:45 +0000 (19:31 +0530)]
net/thunderx: remove generic passX references

The thunderx PMD needs to support multiple SoC
variants in the ThunderX family.
Remove generic pass references from the driver, as different SoCs
can have the same pass number.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Bruce Richardson [Mon, 19 Sep 2016 14:36:54 +0000 (15:36 +0100)]
net/mlx: fix debug build with gcc 6.1

With recent gcc versions, e.g. gcc 6.1, compilation of mlx drivers with
debug enabled produces lots of errors complaining that "pedantic" is
not a warning level that can be ignored.

error: ‘-pedantic’ is not an option that controls warnings [-Werror=pragmas]
 #pragma GCC diagnostic ignored "-pedantic"
                                 ^~~~~~~~~~~

These errors can be removed by changing the "-pedantic" to "-Wpedantic".
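As an illustration of the accepted spelling, the sketch below wraps a GNU extension in a push/ignore/pop region. It is not taken from the mlx drivers: answer() is a hypothetical function, and the block assumes a compiler that knows -Wpedantic (GCC 4.8 or later, or clang).

```c
#include <assert.h>

/* Save the diagnostic state and silence pedantic warnings for the region
 * below. "-Wpedantic" names a warning option, which is what the pragma
 * expects; "-pedantic" is rejected by recent GCC versions. */
#ifdef __GNUC__
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
#endif

/* Hypothetical function using a statement expression, a GNU extension
 * that -Wpedantic would otherwise report. */
static int answer(void)
{
#ifdef __GNUC__
	return ({ int v = 42; v; });
#else
	return 42;
#endif
}

/* Restore the previous diagnostic state. */
#ifdef __GNUC__
#pragma GCC diagnostic pop
#endif
```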

Fixes: 7fae69eeff13 ("mlx4: new poll mode driver")
Fixes: 771fa900b73a ("mlx5: introduce new driver for Mellanox ConnectX-4 adapters")

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:59 +0000 (12:17 +0100)]
net/pcap: fix checkpatch warnings

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:58 +0000 (12:17 +0100)]
net/pcap: remove rte prefix from static functions

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:57 +0000 (12:17 +0100)]
net/pcap: coding convention updates

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:56 +0000 (12:17 +0100)]
net/pcap: fix missing Tx interface assignment

Missing pcap assignment may cause pcap file/interface to be opened
again, and previous one not closed.

Fixes: 1e38a7c66923 ("pcap: fix storage of name and type in queues")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:55 +0000 (12:17 +0100)]
net/pcap: simplify function

Simplify function rte_eth_from_pcaps_common by using interim variables.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:54 +0000 (12:17 +0100)]
net/pcap: remove redundant assignment

data->name assigned twice.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:53 +0000 (12:17 +0100)]
net/pcap: remove unnecessary check

Both fields are members of the same struct type, so one cannot be bigger
than the other.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:52 +0000 (12:17 +0100)]
net/pcap: update single interface handling

Remove hardcoded single interface values, make it more obvious.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:51 +0000 (12:17 +0100)]
net/pcap: reorder functions

Reorder functions to be able to remove function declarations in .c file.
Function definitions not modified.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:50 +0000 (12:17 +0100)]
net/pcap: reorder includes

Remove unused ones and sort remaining.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:49 +0000 (12:17 +0100)]
net/pcap: make const array static

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:48 +0000 (12:17 +0100)]
net/pcap: group stats related fields into a struct

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:47 +0000 (12:17 +0100)]
net/pcap: use single interface flag instead of hardcoding

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:46 +0000 (12:17 +0100)]
net/pcap: remove duplicated max queue number check

Remove duplicated check by reorganizing the code, no functional change.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:45 +0000 (12:17 +0100)]
net/pcap: move comment

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:44 +0000 (12:17 +0100)]
net/pcap: do not carry kvlist argument

Don't carry the kvlist argument into the sub-function and use it there; use
the kvlist argument in the upper level of the call stack.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:43 +0000 (12:17 +0100)]
net/pcap: do not carry numa node argument

Instead of defining the numa_node variable in the upper level of the call
stack and carrying it into the sub-function, set it where it needs to be used.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:42 +0000 (12:17 +0100)]
net/pcap: update function to allow reuse

The rte_eth_from_pcaps and rte_eth_from_pcaps_n_dumpers functions are very
similar, so rte_eth_from_pcaps has been updated and reused.

No functional update.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:41 +0000 (12:17 +0100)]
net/pcap: check max queue number

The number of queues is defined by devargs; a check is added to make sure
this number is not bigger than the configured maximum number of queues.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:40 +0000 (12:17 +0100)]
net/pcap: reorganize private structs

struct rx_pcaps and tx_pcaps are used to point to parsed devargs, but this
is not clear from their current names.

Merged both into single struct and modified struct name and field names.

Functionality not changed, only struct names.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:39 +0000 (12:17 +0100)]
net/pcap: use macros for parameter string

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:38 +0000 (12:17 +0100)]
net/pcap: convert config option to a macro

pcap PMD is using ring PMD configuration parameters to set max number of
queues. This creates an unnecessary dependency and confusion.

Stop using configuration parameter to set max number of queues and
convert this variable into a macro within source code, to simplify
configuration file.

Default value of macro is same as ring parameter's default.

The pcap PMD doesn't need to be configured in such detail as to set Rx and
Tx max queue numbers separately, so the same macro is used for both.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ali Volkan Atli [Wed, 27 Jul 2016 13:11:09 +0000 (16:11 +0300)]
net/e1000: fix returned number of available Rx descriptors

Fixes: 0f6b7c7f7a37 ("igb: use DD bit to count RX available descriptors")

Signed-off-by: Ali Volkan Atli <volkan.atli@argela.com.tr>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:55 +0000 (13:53 +0200)]
net/mlx5: fix inline logic

To improve performance, the NIC expects large packets to have a pointer
to a cache-aligned address. The old inline code could break this assumption,
which hurts performance.

Fixes: 2a66cf378954 ("net/mlx5: support inline send")

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:54 +0000 (13:53 +0200)]
net/mlx5: re-factorize functions

Rework logic of wqe_write() and wqe_write_vlan() which are pretty similar
to keep a single one.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:53 +0000 (13:53 +0200)]
net/mlx5: force inline for completion function

This function was supposed to be inlined, but was not because several
functions call it. This function should always be inlined to avoid
external function calls and to optimize code in the data path.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:52 +0000 (13:53 +0200)]
net/mlx5: fix flow director drop mode

Rejected packets were routed to a polled queue. This patch routes them to a
dummy queue which is not polled.

Fixes: 76f5c99e6840 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:51 +0000 (13:53 +0200)]
net/mlx5: refactor allocation of flow director queues

This is done to prepare support for drop queues, which are not related to
existing Rx queues and need to be managed separately.

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Raslan Darawsheh [Wed, 14 Sep 2016 11:53:50 +0000 (13:53 +0200)]
net/mlx5: fix removing VLAN filter

memmove was passed the number of elements following index i as its byte
count, whereas it should be passed that number multiplied by the size of
each element.
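The corrected size computation can be sketched as below. vlan_remove_at() and its parameters are hypothetical names for illustration, not the mlx5 function.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Remove entry i from an array of n VLAN IDs by shifting the tail down.
 * Returns the new element count. */
static unsigned int vlan_remove_at(uint16_t *ids, unsigned int n,
				   unsigned int i)
{
	assert(i < n);
	/* memmove() takes a byte count: (elements after i) * sizeof(element).
	 * Passing only (n - i - 1) would move too few bytes whenever the
	 * element is wider than one byte. */
	memmove(&ids[i], &ids[i + 1], (n - i - 1) * sizeof(ids[0]));
	return n - 1;
}
```

With 16-bit VLAN IDs, omitting the sizeof factor would move only half of the bytes that follow the removed entry.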

Fixes: e9086978 ("mlx5: support VLAN filtering")

Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
Adrien Mazarguil [Wed, 14 Sep 2016 11:53:49 +0000 (13:53 +0200)]
net/mlx5: fix Rx VLAN offload capability report

This capability is implemented but not reported.

Fixes: f3db9489188a ("mlx5: support Rx VLAN stripping")

Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:48 +0000 (13:53 +0200)]
net/mlx5: fix inconsistent return value in flow director

The return value in DPDK is a negative errno on failure.
Since internal functions in the mlx driver return positive
values, this value needs to be negated when it is returned to
the DPDK layer.
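The convention can be sketched with two hypothetical functions (neither name exists in the mlx5 driver): an internal helper returning a positive errno, and a wrapper that negates it at the API boundary.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical internal helper following the positive-errno convention:
 * 0 on success, a positive errno value on failure. */
static int internal_add_filter(int valid)
{
	return valid ? 0 : EINVAL;
}

/* Hypothetical API-facing wrapper: negate the result so that callers
 * see 0 on success and a negative errno on failure. */
static int api_add_filter(int valid)
{
	int ret = internal_add_filter(valid);

	return -ret;
}
```

Callers that test for failure with `ret < 0` would miss a positive error code, which is why the sign has to be consistent at the boundary.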

Fixes: 76f5c99 ("mlx5: support flow director")

Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Rami Rosen [Sat, 6 Aug 2016 06:16:23 +0000 (09:16 +0300)]
doc: fix typo in VF guide

This patch fixes a typo in doc/guides/nics/intel_vf.rst.

Fixes: fc1f2750a3ec ("doc: programmers guide")

Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Sagi Grimberg [Tue, 2 Aug 2016 14:41:21 +0000 (17:41 +0300)]
net/mlx5: fix possible NULL dereference in Rx path

The user is allowed to call ->rx_pkt_burst() even without free
mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
on the first iteration (where pkt is still NULL). This would cause us
to deref a NULL pkt (reset refcount and free).

Fix this by checking the pkt before freeing it.

Fixes: a1bdb71a32da ("net/mlx5: fix crash in Rx")

Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Jeff Guo [Wed, 7 Sep 2016 09:38:40 +0000 (05:38 -0400)]
net/i40e: add packet type translation for X722

To make the PCTYPE in x722 compatible with original PCTYPE in
flow director (FD) filters, the PCTYPE in the FD programming
descriptor needs to be translated into a different PCTYPE using
GLQF_FD_PCTYPE table.
Translation needs to be done before the FD filter is programmed.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Jeff Guo [Wed, 7 Sep 2016 09:38:02 +0000 (05:38 -0400)]
net/i40e: add new packet types for device X722

There are 6 new PCTYPEs enabled in the device x722.

The 6 new PCTYPEs are as follows:
* NonF Unicast IPv4, UDP
* NonF Multicast IPv4, UDP
* NonF IPv4, TCP, SYN, no ACK
* NonF Unicast IPv6, UDP
* NonF Multicast IPv6, UDP

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Yong Wang [Mon, 29 Aug 2016 19:18:50 +0000 (12:18 -0700)]
net/vmxnet3: enable LRO

This change enables device LRO if requested.

The current implementation of jumbo frame Rx can be used for LRO
directly without changes.

Note that since jumbo frame uses both ring0 and ring1, it cannot
be enabled in UPT (VMDirectPath) mode.

Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:49 +0000 (12:18 -0700)]
net/vmxnet3: update NIC documentation

Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:48 +0000 (12:18 -0700)]
net/vmxnet3: update feature doc

Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:47 +0000 (12:18 -0700)]
net/vmxnet3: reallocate shared memzone on re-config

When adding a DPDK port to ovs-vswitchd with DPDK, the vmxnet3 device
fails to activate due to mismatched magic number.  This failure causes
following operations to run: start the port, stop the port,
reconfigure and re-start the port.

During reconfigure, if there is an existing memzone, driver will reuse
it. But reconfigure may request different number of Tx/Rx queues.
This results in a memzone with wrong size and potential invalid memory
access.

To fix this, free the memzone if found and reserve a new one.

Signed-off-by: Yong Wang <yongwang@vmware.com>
Reviewed-by: Guolin Yang <gyang@vmware.com>
Reviewed-by: Daniele Di Proietto <ddiproietto@vmware.com>
Tested-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:46 +0000 (12:18 -0700)]
net/vmxnet3: coding style changes

Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:45 +0000 (12:18 -0700)]
net/vmxnet3: improve error checks and return values

Signed-off-by: Yong Wang <yongwang@vmware.com>
Reviewed-by: Juho Snellman <jsnell@iki.fi>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yury Kylulin [Mon, 29 Aug 2016 16:50:48 +0000 (19:50 +0300)]
net/i40e: fix mbuf leak during Rx queue release

For the vector PMD, release all mbufs from the Rx queue if no packets
are received after device start.

Fixes: 9ed94e5bb04e ("i40e: add vector Rx")

Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Yury Kylulin [Mon, 29 Aug 2016 16:50:47 +0000 (19:50 +0300)]
net/ixgbe: fix mbuf leak during Rx queue release

For the vector PMD, release all mbufs from the Rx queue if no packets
are received after device start.

Fixes: 11b220c6498d ("ixgbe: fix release queue mbufs")

Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Panu Matilainen [Wed, 5 Oct 2016 12:14:08 +0000 (15:14 +0300)]
ip_frag: fix missing dependency on hash library

Not sure what exactly changed and where, but I've started getting
build failures on Fedora rawhide i386:
    lib/librte_ip_frag/ip_frag_internal.c:36:23: fatal error:
    rte_jhash.h: No such file or directory
     #include <rte_jhash.h>
                       ^
Looking at librte_ip_frag, it clearly depends on librte_hash, so
it's probably more a question of something commonly masking the issue.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Ferruh Yigit [Fri, 30 Sep 2016 10:10:30 +0000 (11:10 +0100)]
kni: remove unnecessary ethtool files

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Ferruh Yigit [Fri, 30 Sep 2016 10:10:29 +0000 (11:10 +0100)]
kni: remove unused ethtool files

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Maxime Coquelin [Tue, 4 Oct 2016 12:05:24 +0000 (14:05 +0200)]
app/testpmd: reset headroom after txonly packet allocation

This patch fixes txonly raw packets allocations by resetting the
available headroom.

Indeed, some PMDs such as Virtio might prepend some data to the
packet, causing the mbuf's data_off field to be decremented each
time the mbuf gets re-allocated.

For the Virtio PMD, this means that single descriptors are used only
the first few times an mbuf gets allocated; at some point there is no
longer enough headroom to store the header.

Another alternative would be to use the standard API to allocate the
packets, which does reset the headroom, but the impact on performance is
too big to consider this an option.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Maxime Coquelin [Tue, 4 Oct 2016 12:05:23 +0000 (14:05 +0200)]
mbuf: add function to reset headroom

Some applications use the rte_mbuf_raw_alloc() function to improve
performance by not resetting mbuf's fields to their default state.

This can, however, be problematic for mbuf consumers that need some
headroom, meaning that data_off field gets decremented after
allocation. When the mbuf is re-used afterwards, there might not
be enough room for the consumer to prepend anything, if the data_off
field is not reset to its default value.

This patch adds a new rte_pktmbuf_reset_headroom() function that
applications can call to reset the data_off field.
This patch also replaces the current data_off assignments in the mbuf
lib with a call to this function.
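The effect of resetting data_off can be sketched with a stand-in structure. This is not the rte_mbuf definition: struct mbuf_sketch, SKETCH_HEADROOM (an assumed default of 128 bytes) and both helpers are hypothetical, and the cap at buf_len is an assumption about how such a reset would reasonably behave.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed default headroom, standing in for RTE_PKTMBUF_HEADROOM. */
#define SKETCH_HEADROOM 128

/* Minimal stand-in for the two mbuf fields involved. */
struct mbuf_sketch {
	uint16_t buf_len;  /* total buffer length */
	uint16_t data_off; /* offset of packet data, i.e. available headroom */
};

/* Restore data_off to the default headroom, capped at the buffer length. */
static void sketch_reset_headroom(struct mbuf_sketch *m)
{
	m->data_off = m->buf_len < SKETCH_HEADROOM ?
		      m->buf_len : SKETCH_HEADROOM;
}

/* A consumer prepending a header of len bytes; returns -1 when the
 * remaining headroom is too small. */
static int sketch_prepend(struct mbuf_sketch *m, uint16_t len)
{
	if (len > m->data_off)
		return -1;
	m->data_off = (uint16_t)(m->data_off - len);
	return 0;
}
```

Without the reset, a second 100-byte prepend on a reused buffer fails because only 28 bytes of headroom remain; after the reset it succeeds again.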

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 19 Sep 2016 12:34:41 +0000 (14:34 +0200)]
mbuf: fix error handling on pool creation

On error, the mempool object has to be freed, and rte_errno should be a
positive value.

Fixes: 152ca517900b ("mbuf: use default mempool handler from config")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Byron Marohn [Tue, 4 Oct 2016 23:25:15 +0000 (00:25 +0100)]
hash: modify lookup bulk pipeline

This patch replaces the pipelined rte_hash lookup mechanism with a
loop-and-jump model, which performs significantly better,
especially for smaller table sizes and smaller table occupancies.

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
Byron Marohn [Tue, 4 Oct 2016 23:25:14 +0000 (00:25 +0100)]
hash: add vectorized comparison

In lookup bulk function, the signatures of all entries
are compared against the signature of the key that is being looked up.
Now that all the signatures are together, they can be compared
with vector instructions (SSE, AVX2), achieving higher lookup performance.

Also, entries per bucket are increased to 8 when using processors
with AVX2, as 256 bits can be compared at once, which is the size of
8x32-bit signatures.
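The comparison being vectorized can be sketched in portable scalar C. This is a stand-in, not the rte_hash code and not a vector implementation: it computes the per-entry hit mask that a single 256-bit compare followed by a mask extraction would yield. struct bucket_sketch and bucket_match_mask() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define BUCKET_ENTRIES 8 /* 8 x 32-bit signatures = 256 bits */

/* Hypothetical bucket with all current signatures stored contiguously,
 * which is what makes a single wide comparison possible. */
struct bucket_sketch {
	uint32_t sig_current[BUCKET_ENTRIES];
};

/* Scalar equivalent of the wide compare: returns a bitmask with bit i
 * set when entry i holds the signature being looked up. */
static uint32_t bucket_match_mask(const struct bucket_sketch *b, uint32_t sig)
{
	uint32_t mask = 0;

	for (unsigned int i = 0; i < BUCKET_ENTRIES; i++)
		mask |= (uint32_t)(b->sig_current[i] == sig) << i;
	return mask;
}
```

A vector version would replace the loop with one compare over all eight signatures; the resulting mask is then scanned for candidate entries whose full keys still have to be compared.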

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
8 years agohash: reorganize bucket structure
Byron Marohn [Tue, 4 Oct 2016 23:25:13 +0000 (00:25 +0100)]
hash: reorganize bucket structure

Move current signatures of all entries together in the bucket
and same with all alternative signatures, instead of having
current and alternative signatures together per entry in the bucket.
This will be beneficial in the next commits, where a vectorized
comparison will be performed, achieving better performance.

The alternative signatures have been moved away from
the current signatures, to make the key indices be consecutive
to the current signatures, as these two fields are used by lookup,
so they are in the same cache line.
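
The layout described above can be sketched as a struct of arrays. The
field names, the 8-entry size and the 64-byte cache line are
assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BUCKET_ENTRIES 8
#define SKETCH_CACHE_LINE 64

/* All current signatures first, then the key indices, then the
 * alternative signatures, instead of one (current, alt) pair per entry. */
struct sketch_bucket {
	uint32_t sig_current[SKETCH_BUCKET_ENTRIES];
	uint32_t key_idx[SKETCH_BUCKET_ENTRIES];
	uint32_t sig_alt[SKETCH_BUCKET_ENTRIES];
};

/* True when the two fields used by lookup end within one cache line. */
static inline bool sketch_lookup_fields_share_line(void)
{
	return offsetof(struct sketch_bucket, sig_alt) <= SKETCH_CACHE_LINE;
}
```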

Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
8 years agohash: reorder hash structure
Pablo de Lara [Tue, 4 Oct 2016 23:25:12 +0000 (00:25 +0100)]
hash: reorder hash structure

In order to optimize lookup performance, hash structure
is reordered, so all fields used for lookup will be
in the first cache line.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
8 years agotimer: fix lag delay
Karmarkar Suyash [Wed, 21 Sep 2016 20:54:27 +0000 (16:54 -0400)]
timer: fix lag delay

For periodic timers, if a lag was introduced, the previous code added
further delay when the next periodic timer was initialized, because it
did not take the accumulated lag into account. With this fix, the next
occurrence of the timer is scheduled taking the lag into account.
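
A small sketch of the difference, assuming the next expiry is computed
either from the time the callback actually ran or from the previously
scheduled expiry. The function names are hypothetical and the real
timer code is more involved.

```c
#include <assert.h>
#include <stdint.h>

/* Drifting variant: a late callback pushes every later expiry back. */
static inline uint64_t
sketch_next_expiry_drifting(uint64_t now, uint64_t period)
{
	assert(period > 0);
	return now + period;
}

/* Lag-aware variant: the schedule stays anchored to the previous expiry. */
static inline uint64_t
sketch_next_expiry_lag_aware(uint64_t prev_expire, uint64_t period)
{
	assert(period > 0);
	return prev_expire + period;
}
```

With a period of 100 ticks, a timer due at tick 1000 that is serviced
at tick 1030 would next fire at 1130 in the first variant and at 1100
in the second.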

Fixes: 9b15ba89 ("timer: use a skip list")

Signed-off-by: Karmarkar Suyash <skarmarkar@sonusnet.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
8 years agomem: fix hugepage mapping error messages
Jean Tourrilhes [Tue, 4 Oct 2016 17:17:03 +0000 (10:17 -0700)]
mem: fix hugepage mapping error messages

Running secondary is tricky due to the need to map the memory region
at the right place in VM, which is whatever primary has chosen. If the
base address for primary happens to be already mapped in the
secondary, we will hit precisely these error messages (depending on
whether we fail on the config region or the hugepages). This is why there is
already a comment about ASLR.

The issue is that in most cases, remapping does not happen and "errno"
is not changed and therefore stale. In our case, we got a "permission
denied", which sent us down the wrong track. It's such a common error
for secondary that I feel this error message should be unambiguous and
helpful.
The call to close was also moved because close() may override errno.
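
The distinction can be sketched as follows, assuming an anonymous
mapping with an address hint. The function and enum names are
hypothetical; the point is that errno is only meaningful when mmap()
itself fails, and that it must be captured before any later call can
overwrite it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

enum sketch_map_result {
	SKETCH_MAP_OK,         /* mapped exactly where requested */
	SKETCH_MAP_ERRNO,      /* mmap() failed, *saved_errno is valid */
	SKETCH_MAP_WRONG_ADDR, /* mmap() succeeded elsewhere, errno is stale */
};

static inline enum sketch_map_result
sketch_map_at(void *wanted, size_t len, void **out, int *saved_errno)
{
	void *addr;

	assert(out != NULL && saved_errno != NULL);
	*out = NULL;
	*saved_errno = 0;
	addr = mmap(wanted, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED) {
		*saved_errno = errno; /* capture before any other call */
		return SKETCH_MAP_ERRNO;
	}
	if (addr != wanted) {
		/* No syscall failed here, so errno must not be reported. */
		munmap(addr, len);
		return SKETCH_MAP_WRONG_ADDR;
	}
	*out = addr;
	return SKETCH_MAP_OK;
}
```

A caller would print strerror() only for the SKETCH_MAP_ERRNO case and
an explicit "address already in use" style message for
SKETCH_MAP_WRONG_ADDR.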

Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
8 years agoeal: fix C++ link of delay function pointer
Konstantin Ananyev [Mon, 3 Oct 2016 17:27:25 +0000 (18:27 +0100)]
eal: fix C++ link of delay function pointer

When compiling with C++, it treats
void (*rte_delay_us)(unsigned int us);
as definition of the global variable.
So further linking with librte_eal fails.

Fixes: b4d63fb62240 ("eal: customize delay function")

Steps to reproduce:

$ cat rttm1.cpp
#include <iostream>
#include <rte_eal.h>
#include <rte_cycles.h>

using namespace std;

int main(int argc, char *argv[])
{
        int ret = rte_eal_init(argc, argv);
        rte_delay_us(1);
        cout << "return code ";
        cout << ret;
        return ret;
}

$ g++ -m64 -I/${RTE_SDK}/${RTE_TARGET}/include -c  -o rttm1.o rttm1.cpp
$ gcc -m64 -pthread -o rttm1 rttm1.o -ldl -Wl,-lstdc++ \
  -L/${RTE_SDK}/${RTE_TARGET}/lib -Wl,-lrte_eal
.../librte_eal.a(eal_common_timer.o):
(.bss+0x0): multiple definition of `rte_delay_us'
rttm1.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status

$ nm rttm1.o | grep rte_delay_us
0000000000000092 t _GLOBAL__sub_I_rte_delay_us
0000000000000000 B rte_delay_us
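
One common way to avoid this kind of clash is to keep only an extern
declaration in the shared header and a single definition in one source
file. The sketch below shows that pattern in a single file with
hypothetical names; it is not the patch itself.

```c
#include <assert.h>
#include <stddef.h>

/* What the shared header would carry: a declaration, not a definition. */
extern void (*sketch_delay_us)(unsigned int us);

/* Records the last requested delay so the hook's effect is observable. */
static unsigned int sketch_last_delay;

static void sketch_default_delay(unsigned int us)
{
	sketch_last_delay = us;
}

/* What exactly one source file would carry: the single definition. */
void (*sketch_delay_us)(unsigned int us) = sketch_default_delay;

/* Call through the hook after checking that it is set. */
static inline void sketch_call_delay(unsigned int us)
{
	assert(sketch_delay_us != NULL);
	sketch_delay_us(us);
}
```

With the extern keyword, a C++ translation unit that includes the
header no longer emits its own copy of the variable.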

Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agonet/vhost: add extended statistics
Zhiyong Yang [Thu, 29 Sep 2016 12:35:49 +0000 (20:35 +0800)]
net/vhost: add extended statistics

This feature adds vhost pmd extended statistics from per port perspective
in order to meet the requirements of the applications such as OVS etc.
RX/TX xstats count the bytes without CRC. This is different from physical
NIC stats with CRC.

The statistics counters are based on RFC 2819 and RFC 2863 as follows:

rx/tx_good_packets
rx/tx_total_bytes
rx/tx_missed_pkts
rx/tx_broadcast_packets
rx/tx_multicast_packets
rx/tx_unicast_packets
rx/tx_undersize_errors
rx/tx_size_64_packets
rx/tx_size_65_to_127_packets
rx/tx_size_128_to_255_packets
rx/tx_size_256_to_511_packets
rx/tx_size_512_to_1023_packets
rx/tx_size_1024_to_1522_packets
rx/tx_1523_to_max_packets
rx/tx_errors
rx_fragmented_errors
rx_jabber_errors
rx_unknown_protos_packets

No API is changed or added. The existing xstats functions are used:
rte_eth_xstats_get_names() to retrieve what kinds of vhost xstats are
supported,
rte_eth_xstats_get() to retrieve vhost extended statistics,
rte_eth_xstats_reset() to reset vhost extended statistics.

The usage of vhost pmd xstats is the same as virtio pmd xstats.
For example, when the test-pmd application is running in interactive mode,
vhost pmd xstats will support the following two commands:

show port xstats all | port_id will show vhost xstats
clear port xstats all | port_id will reset vhost xstats

net/virtio pmd xstats(the function virtio_update_packet_stats) is used
as reference when implementing the feature.

Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/vhost: move statistics into a structure
Zhiyong Yang [Thu, 29 Sep 2016 12:35:48 +0000 (20:35 +0800)]
net/vhost: move statistics into a structure

The patch moves all stats counters to a newly defined struct vhost_stats,
in order to manage all stats counters in a unified way and simplify the
subsequent function implementation (vhost_dev_xstats_reset).
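
The benefit of grouping the counters can be sketched like this. The
struct and its fields are hypothetical stand-ins, not the layout
introduced by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-queue counters gathered in one place. */
struct sketch_vhost_stats {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t missed_pkts;
};

/* With all counters in one struct, a reset is a single memset. */
static inline void sketch_stats_reset(struct sketch_vhost_stats *s)
{
	assert(s != NULL);
	memset(s, 0, sizeof(*s));
}
```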

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/vhost: retrieve vid for a given port
Ciara Loftus [Tue, 13 Sep 2016 13:47:43 +0000 (14:47 +0100)]
net/vhost: retrieve vid for a given port

In some cases when using the vHost PMD, certain vHost library functions
may still need to be accessed. One such example is the
rte_vhost_get_queue_num function which returns the number of virtqueues
reported by the guest - information which is not exposed by the PMD.

This commit introduces a new rte_eth_vhost function that returns the
'vid' associated with a given port id. This allows the PMD user to call
vHost library functions which require the 'vid' value.

Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio: add NEON based Rx handler
Jerin Jacob [Thu, 18 Aug 2016 04:12:11 +0000 (12:12 +0800)]
net/virtio: add NEON based Rx handler

Added a NEON-based Rx vector implementation.
The new handler is selected based on NEON availability at runtime.
Updated the release notes and MAINTAINERS file.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
8 years agonet/virtio: select data handler depending on CPU flag
Jerin Jacob [Tue, 5 Jul 2016 12:49:25 +0000 (18:19 +0530)]
net/virtio: select data handler depending on CPU flag

Introduced cpuflag based run-time detection to select the
SSE based simple Rx handler

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio: move SSE based Rx code to separate file
Jerin Jacob [Tue, 5 Jul 2016 12:49:24 +0000 (18:19 +0530)]
net/virtio: move SSE based Rx code to separate file

Split out SSE instruction based virtio simple Rx
implementation to a separate file

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio: cleanup conditional compilation
Jerin Jacob [Tue, 5 Jul 2016 12:49:23 +0000 (18:19 +0530)]
net/virtio: cleanup conditional compilation

Removed unnecessary compile time dependency on "use_simple_rxtx".

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet: fix clang build
Yuanhan Liu [Mon, 26 Sep 2016 04:29:13 +0000 (12:29 +0800)]
net: fix clang build

Interestingly, clang and gcc have different prototypes for _mm_prefetch().
For gcc, we have

   _mm_prefetch (const void *__P, enum _mm_hint __I)

While for clang, it's

   #define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))

That's how the following error comes with clang:

   error: cast from 'const void *' to 'void *' drops const qualifier
   [-Werror,-Wcast-qual]
           _mm_prefetch((const void *)rused, _MM_HINT_T0);
   /usr/lib/llvm-3.8/bin/../lib/clang/3.8.0/include/xmmintrin.h:684:58:
   note: expanded from macro '_mm_prefetch'
            #define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a),
                                          0, (sel)))

What's weird is that the build was actually okay before. I hit it while
applying Jerin's vector support for ARM patch set: he just moved this piece
of code to another file, nothing else changed.

This patch fixes the issue when Jerin's patchset is applied. Thus, I think
it's still needed.

Similarly, make the same change to other _mm_prefetch users, just in case
this weird issue shows up again somehow later.

Fixes: fc3d66212fed ("virtio: add vector Rx")
Fixes: c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
Fixes: 9ed94e5bb04e ("i40e: add vector Rx")
Fixes: 7092be8437bd ("fm10k: add vector Rx")

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
8 years agonet/virtio_user: fix error management during init
Jianfeng Tan [Tue, 27 Sep 2016 19:11:06 +0000 (19:11 +0000)]
net/virtio_user: fix error management during init

Currently, when a virtio_user device fails to start (e.g., the vhost
unix socket does not exist), the init function does not return struct
rte_eth_dev (and some other structs) back to the ether layer. What's
more, it does not report the error to the upper layer.

The fix is to free those structs and report error when failing to
start virtio_user devices.

Fixes: ce2eabdd43ec ("net/virtio-user: add virtual device")

Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio_user: fix wrong sequence of messages
Jianfeng Tan [Tue, 27 Sep 2016 19:11:05 +0000 (19:11 +0000)]
net/virtio_user: fix wrong sequence of messages

When virtio_user is used with VPP's native vhost user, it cannot
send/receive any packets.

The root cause is that vpp-vhost-user interprets the message
VHOST_USER_SET_FEATURES as putting this device into init state,
i.e., zeroing all related structures. However, the previous code
sent this message last in the whole initialization process,
which led to all previously sent information being zeroed.

To fix this issue, we rearrange the sequence of those messages.
  - step 0, send VHOST_USER_SET_VRING_CALL so that vhost allocates
    virtqueue structures;
  - step 1, send VHOST_USER_SET_FEATURES to confirm the features;
  - step 2, send VHOST_USER_SET_MEM_TABLE to share mem regions;
  - step 3, send VHOST_USER_SET_VRING_NUM, VHOST_USER_SET_VRING_BASE,
    VHOST_USER_SET_VRING_ADDR, VHOST_USER_SET_VRING_KICK for each
    queue;
  - ...
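
The ordering above can be sketched as a flat table. The message names
come from the list, while the array, the helper and the flattening of
the per-queue messages in step 3 are assumptions; the steps elided in
the list are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Messages in the order they are sent, as far as the list above states. */
static const char *const sketch_init_sequence[] = {
	"VHOST_USER_SET_VRING_CALL", /* step 0 */
	"VHOST_USER_SET_FEATURES",   /* step 1 */
	"VHOST_USER_SET_MEM_TABLE",  /* step 2 */
	"VHOST_USER_SET_VRING_NUM",  /* step 3, repeated for each queue */
	"VHOST_USER_SET_VRING_BASE",
	"VHOST_USER_SET_VRING_ADDR",
	"VHOST_USER_SET_VRING_KICK",
};

/* Return the position of msg in the sequence, or -1 if it is not listed. */
static inline int sketch_position_of(const char *msg)
{
	size_t n = sizeof(sketch_init_sequence) / sizeof(sketch_init_sequence[0]);

	assert(msg != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(sketch_init_sequence[i], msg) == 0)
			return (int)i;
	return -1;
}
```

The important property is that VHOST_USER_SET_FEATURES comes before
the memory table and per-queue messages, so a backend that resets its
state on that message has nothing to lose yet.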

Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer")

Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio_user: fix first queue pair without multiqueue
Jianfeng Tan [Tue, 27 Sep 2016 19:11:04 +0000 (19:11 +0000)]
net/virtio_user: fix first queue pair without multiqueue

When virtio_user is used with OVS-DPDK (with mq disabled), it cannot
receive any packets. This is because no queue is enabled at all when
mq is disabled.

To fix it, we should consistently make sure the 1st queue is enabled,
which is also the behaviour QEMU takes.
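
A minimal sketch of the intended rule, with a hypothetical helper: the
first queue pair is always enabled, and further pairs are enabled only
when multiqueue is negotiated and they were requested.

```c
#include <assert.h>
#include <stdbool.h>

/* Decide whether queue pair pair_idx should be enabled. */
static inline bool
sketch_queue_pair_enabled(unsigned int pair_idx, bool mq_enabled,
			  unsigned int requested_pairs)
{
	assert(requested_pairs >= 1);
	if (pair_idx == 0)
		return true; /* always enabled, with or without mq */
	return mq_enabled && pair_idx < requested_pairs;
}
```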

Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer")

Reported-by: Ning Li <lining18@jd.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: support indirect Tx descriptors
Maxime Coquelin [Tue, 27 Sep 2016 08:42:49 +0000 (10:42 +0200)]
vhost: support indirect Tx descriptors

Indirect descriptors are usually supported by virtio-net devices,
allowing a larger number of requests to be dispatched.

When the virtio device sends a packet using indirect descriptors,
only one slot is used in the ring, even for large packets.

The main effect is to improve the 0% packet loss benchmark.
A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
(fwd io for host, macswap for VM) on DUT shows a +50% gain for
zero loss.

On the downside, a micro-benchmark using testpmd txonly in VM and
rxonly on host shows a loss between 1 and 4%. But depending on
the needs, the feature can be disabled at VM boot time by passing
the indirect_desc=off argument to the vhost-user device in Qemu.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agotools: fix virtio interface name when binding
Gary Mussar [Fri, 2 Sep 2016 13:16:33 +0000 (09:16 -0400)]
tools: fix virtio interface name when binding

The dpdk-devbind.py script does not find/display the ifname for virtio
interfaces since the "net" directory is not directly under the device
directory but rather under a subdirectory.
eg.
> dpdk-devbind.py --status
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=

This change searches for the first "net" directory under the device
directory hierarchy.
eg.
0000:00:03.0 'Virtio network device' if=ens3 drv=virtio-pci unused=

Fixes: 629395b063e8 ("igb_uio: remove PCI id table")

Signed-off-by: Gary Mussar <gmussar@ciena.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio: fix xstats name
Zhiyong Yang [Wed, 7 Sep 2016 06:11:00 +0000 (14:11 +0800)]
net/virtio: fix xstats name

We have a stat named "size_1024_1517_packets", while the code
actually counts the range "[1024, 1518]", which is obviously wrong.
The code is as follows in the function virtio_update_packet_stats.

else if (s < 1519)
stats->size_bins[6]++;

We could either fix it by correcting the "if" check in the code,
or fix it by just renaming the stats to conform to the code. The
latter solution is taken because that's what the RFC2819 suggests.
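
For reference, the size bins can be sketched with plain comparisons.
This is an illustrative helper with hypothetical names, not the body of
virtio_update_packet_stats; the upper bound of bin 6 matches the
"s < 1519" check quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SIZE_BINS 8

/* Map a packet length in bytes to its size-bin index. */
static inline unsigned int sketch_size_bin(uint32_t len)
{
	if (len < 64)
		return 0; /* undersize */
	if (len == 64)
		return 1;
	if (len < 128)
		return 2; /* 65 to 127 */
	if (len < 256)
		return 3; /* 128 to 255 */
	if (len < 512)
		return 4; /* 256 to 511 */
	if (len < 1024)
		return 5; /* 512 to 1023 */
	if (len < 1519)
		return 6; /* 1024 to 1518 */
	return 7;         /* 1519 and above */
}

/* Count one packet of the given length in its bin. */
static inline void
sketch_count_packet(uint64_t bins[SKETCH_SIZE_BINS], uint32_t len)
{
	assert(bins != NULL);
	bins[sketch_size_bin(len)]++;
}
```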

Fixes: 76d4c652e07d ("virtio: add extended stats")

Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agonet/virtio: enable indirect descriptors feature
Pierre Pfister [Wed, 7 Sep 2016 02:46:18 +0000 (10:46 +0800)]
net/virtio: enable indirect descriptors feature

Virtio indirect descriptors are supported by the data-path
but the feature bit is never set during feature negotiation.

This patch simply adds VIRTIO_RING_F_INDIRECT_DESC back to
the supported features bit mask, hence enabling the use of
indirect descriptors when the feature is negotiated with the
device.
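
Feature negotiation can be sketched as a bitwise AND of what the device
offers and what the driver supports. The helper names are hypothetical;
bit 28 is the indirect descriptor feature bit in the virtio
specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit number of the indirect descriptor feature (virtio specification). */
#define SKETCH_RING_F_INDIRECT_DESC 28

/* Only features offered by the device and supported by the driver remain. */
static inline uint64_t
sketch_negotiate(uint64_t device_features, uint64_t driver_supported)
{
	return device_features & driver_supported;
}

/* Test whether a given feature bit is set. */
static inline bool sketch_has_feature(uint64_t features, unsigned int bit)
{
	assert(bit < 64);
	return ((features >> bit) & 1u) != 0;
}
```

If the driver's supported mask lacks the bit, the negotiated set lacks
it too, even when the device offers it.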

Signed-off-by: Pierre Pfister <ppfister@cisco.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: remove obsolete comment
Matthias Gatto [Fri, 2 Sep 2016 15:05:21 +0000 (17:05 +0200)]
vhost: remove obsolete comment

As new_device and destroy_device use an int instead of a
"struct virtio_net *", the comment about setting VIRTIO_DEV_RUNNING
doesn't make sense anymore. Also, if I've understood the code
correctly, the drivers take care of setting the flag before calling the
callbacks, so I believe this comment is obsolete and have removed it.

Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: simplify features set/get
Yuanhan Liu [Thu, 18 Aug 2016 08:48:43 +0000 (16:48 +0800)]
vhost: simplify features set/get

No need to use a pointer to store/retrieve features.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
8 years agovhost: get device once
Yuanhan Liu [Thu, 18 Aug 2016 08:48:42 +0000 (16:48 +0800)]
vhost: get device once

Invoke get_device() at the beginning of vhost_user_msg_handler, so that
we only need to check the return value once. This saves many duplicate
get-and-check device sequences.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
8 years agovhost: unify function names
Yuanhan Liu [Thu, 18 Aug 2016 08:48:41 +0000 (16:48 +0800)]
vhost: unify function names

Some functions have the prefix "user_", while others have "vhost_".
Make them all start with "vhost_user_" to unify the function names.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
8 years agovhost: fold common message handlers
Yuanhan Liu [Thu, 18 Aug 2016 08:48:40 +0000 (16:48 +0800)]
vhost: fold common message handlers

For historical reasons (we used to have 2 vhost implementations), some
messages are handled in two calls: the vhost specific implementation
handles it first and then invokes the common one to do further handling.

We have only one implementation now, so we can write one method for
each message. Fold those common handlers into the corresponding vhost
user handlers.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
8 years agovhost: refactor code structure
Yuanhan Liu [Thu, 18 Aug 2016 08:48:39 +0000 (16:48 +0800)]
vhost: refactor code structure

The code structure is a bit messy now. For example, vhost-user message
handling is spread to three different files:

    vhost-net-user.c  virtio-net.c  virtio-net-user.c

Where, vhost-net-user.c is the entrance to handle all those messages
and then invoke the right method for a specific message. Some of them
are stored at virtio-net.c, while others are stored at virtio-net-user.c.

The truth is all of them should be in one file, vhost_user.c.

So this patch refactors the source code structure: mainly on renaming
files and moving code from one file to another file that is more suitable
for storing it. Thus, no functional changes are made.

After the refactor, the code structure becomes:

- socket.c      handles all vhost-user socket file related stuff, such
                as, socket file creation for server mode, reconnection
                for client mode.

- vhost.c       mainly stuff like vhost device creation/destroy/reset.
                Most of the vhost API implementation is there, too.

- vhost_user.c  all stuff about vhost-user messages handling goes there.

- virtio_net.c  all stuff about virtio-net should go there. It has virtio
                net Rx/Tx implementation only so far: it's just a rename
                from vhost_rxtx.c

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
8 years agovhost: remove sub-directory
Yuanhan Liu [Thu, 18 Aug 2016 08:48:38 +0000 (16:48 +0800)]
vhost: remove sub-directory

We now have one vhost implementation; no sub source dir is needed.
Remove it by moving the files to the upper dir.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>