dpdk.git
David Marchand [Fri, 29 Jul 2016 12:28:36 +0000 (14:28 +0200)]
ivshmem: remove library and its EAL integration

Following discussions on the mailing list [1] and since nobody stood up to
implement the necessary cleanups, here is the ivshmem integration removal.

There is not much to say about this patch: a lot of code is being removed.
The default configuration file for the packet_ordering example is replaced
with the "native" x86 file.
The only tricky part is in eal_memory, with the memseg index handling.

More cleanups can be done after this but will come in subsequent patchsets.

[1]: http://dpdk.org/ml/archives/dev/2016-June/040844.html

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Pablo de Lara [Fri, 29 Jul 2016 18:20:49 +0000 (19:20 +0100)]
doc: fix references to old binding script

The dpdk_nic_bind.py script has been renamed to dpdk-devbind.py,
but some references to the old script have remained.
This commit completes the renaming.

Fixes: a5d7a3f77ddc ("unify tools naming")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Jim Harris [Tue, 16 Aug 2016 22:46:46 +0000 (15:46 -0700)]
contigmem: zero all pages during mmap

On Linux, all huge pages are zeroed by the kernel before
first access by the DPDK application.  But on FreeBSD,
the contigmem driver would only zero the contiguous
memory regions during initial driver load.

DPDK commit b78c91751 eliminated the explicit memset()
operation for rte_zmalloc(), which was OK on Linux
because the kernel zeroes the pages during application start,
but this broke FreeBSD when restarting the application.
This patch therefore explicitly zeroes the pages before they are
mmap'd, to ensure behavior equivalent to Linux.
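
The idea can be illustrated with a small userspace sketch. This is not
the contigmem driver code: the buffer, its size and the function names
(region, REGION_SIZE, region_map, region_is_zero) are hypothetical and
only model "zero on every map" versus "zero once at load".

```c
#include <stddef.h>
#include <string.h>

#define REGION_SIZE 4096 /* hypothetical size of one contiguous buffer */

/* Stand-in for a buffer that the driver allocates once at load time. */
static unsigned char region[REGION_SIZE];

/*
 * Hypothetical map hook: zero the buffer every time it is handed to an
 * application, not only when the driver is loaded.
 */
static unsigned char *region_map(void)
{
    memset(region, 0, sizeof(region));
    return region;
}

/* Return 1 if every byte of the buffer is zero, 0 otherwise. */
static int region_is_zero(const unsigned char *p, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

With this shape, an application that dirties the buffer and then
restarts (maps again) sees zeroed memory, which is what an allocator
that skips its own memset() relies on.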

Fixes: b78c9175118f ("mem: do not zero out memory on zmalloc")

Reported-by: Daniel Verkamp <daniel.verkamp@intel.com>
Signed-off-by: Jim Harris <james.r.harris@intel.com>
Tested-by: Daniel Verkamp <daniel.verkamp@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Aleksey Katargin [Mon, 1 Aug 2016 09:00:43 +0000 (14:00 +0500)]
table: fix symbol exports

Fixes: 8aa327214ceb ("table: hash")
Fixes: 68866e2417cc ("table: add 16-byte hash operations computed on lookup")

Signed-off-by: Aleksey Katargin <gureedo@gmail.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Thomas Monjalon [Mon, 1 Aug 2016 19:39:42 +0000 (21:39 +0200)]
version: 16.11-rc0

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Mon, 1 Aug 2016 12:54:06 +0000 (14:54 +0200)]
mbuf: remove deprecated internal function

The function __rte_mbuf_raw_alloc was reserved for internal use and
has been deprecated in favor of the public function rte_mbuf_raw_alloc.
It can be safely removed now.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Fri, 29 Jul 2016 12:55:28 +0000 (14:55 +0200)]
log: remove deprecated history dump

The log history feature was deprecated in 16.07.
The remaining empty functions are removed in 16.11.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Thomas Monjalon [Fri, 29 Jul 2016 13:34:29 +0000 (15:34 +0200)]
doc: postpone mempool ABI breakage

It was planned to remove some mempool functions which have been
deprecated since 16.07.
As no other mempool ABI change is planned in 16.11, it is better
to postpone and group all mempool ABI changes in 17.02.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
John McNamara [Fri, 29 Jul 2016 11:59:14 +0000 (12:59 +0100)]
doc: generate NIC overview table from ini files

Convert the NIC feature table in the overview doc into a set of ini
files and add functions into the Sphinx conf.py file to auto-generate
them back into an RST table.

The reason for doing this is to make it easier for PMD maintainers to
update the feature matrix that makes up the table and to avoid
frequent and hard to resolve conflicts in doc/guides/nics/overview.rst.

A NIC/PMD feature matrix is now an ini file like the following:

    $ head doc/guides/nics/nic_features/i40e.ini
    ;
    ; Features of the i40e network driver.
    ;
    [Features]
    Link status          = Y
    Link status event    = Y
    Rx interrupt         = Y
    Queue start/stop     = Y
    ...

The output RST table matches the existing table with the column
headers sorted.
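
As an illustration of the file format only, a "Name = Value" line from
such an ini file could be split as in the sketch below. The patch itself
does this in the Sphinx conf.py (Python); this C function and its name
(sketch_parse_feature) are hypothetical and not part of DPDK.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Parse one "Name = Value" line. Returns 1 and fills key/val on
 * success. Returns 0 for comments (';'), section headers ('['), blank
 * lines, lines without '=', and fields that do not fit the buffers.
 */
static int
sketch_parse_feature(const char *line, char *key, size_t klen,
                     char *val, size_t vlen)
{
    const char *eq;
    size_t n;

    while (isblank((unsigned char)*line))
        line++;
    if (*line == ';' || *line == '[' || *line == '\0' || *line == '\n')
        return 0;

    eq = strchr(line, '=');
    if (eq == NULL)
        return 0;

    /* Key: text before '=', with trailing blanks trimmed. */
    n = (size_t)(eq - line);
    while (n > 0 && isblank((unsigned char)line[n - 1]))
        n--;
    if (n == 0 || n >= klen)
        return 0;
    memcpy(key, line, n);
    key[n] = '\0';

    /* Value: text after '=', with surrounding blanks trimmed. */
    eq++;
    while (isblank((unsigned char)*eq))
        eq++;
    n = strcspn(eq, "\r\n");
    while (n > 0 && isblank((unsigned char)eq[n - 1]))
        n--;
    if (n >= vlen)
        return 0;
    memcpy(val, eq, n);
    val[n] = '\0';
    return 1;
}
```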

Signed-off-by: John McNamara <john.mcnamara@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
John McNamara [Fri, 29 Jul 2016 11:23:27 +0000 (12:23 +0100)]
doc: add template release notes for 16.11

Add template release notes for DPDK 16.11 with inline
comments and explanations of the various sections.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
Thomas Monjalon [Thu, 28 Jul 2016 18:48:41 +0000 (20:48 +0200)]
version: 16.07.0

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Fan Zhang [Thu, 19 May 2016 14:18:35 +0000 (15:18 +0100)]
doc: announce API change in port library

The API changes are planned for rte_port_source_params and
rte_port_sink_params, which will be supported from release 16.11.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Yuanhan Liu [Fri, 15 Jul 2016 12:28:33 +0000 (20:28 +0800)]
doc: announce vhost-cuse removal

Vhost-cuse was invented before vhost-user existed. Both do essentially
the same thing: a vhost-net implementation in user space. But they
are not exactly the same.

First, vhost-cuse is harder to use, and no one seems to care about it
either. Furthermore, since v2.1, a large majority of development effort
has gone to vhost-user. For example, we extended the vhost-user spec to
add multiple queue support. We also added vhost-user live migration in
v16.04 and, most recently, vhost-user reconnect, which allows a vhost app
to restart without restarting the guest. These are very important
features for product usage and none of them works for vhost-cuse.

You can now see that the difference between vhost-user and vhost-cuse is
big (and will keep growing as time moves forward), that you should never
use vhost-cuse, and that we should drop it completely.

Removing it would also result in a much cleaner code base, making all
kinds of extensions easier.

So vhost-cuse is marked as deprecated in this release and will be
removed in the next release (v16.11).

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Thomas Monjalon [Wed, 20 Jul 2016 16:35:46 +0000 (18:35 +0200)]
doc: announce ivshmem support removal

There was a prior call with an explanation of what needs to be done:
http://dpdk.org/ml/archives/dev/2016-June/040844.html
- Qemu patch upstreamed
- IVSHMEM PCI device managed by a PCI driver
- No DPDK objects (ring/mempool) allocated by EAL

As nobody seems interested, it is time to remove this code which
makes EAL improvements harder.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Olivier Matz [Wed, 20 Jul 2016 07:16:14 +0000 (09:16 +0200)]
doc: announce ABI change for mbuf structure

For 16.11, the mbuf structure will be modified, implying ABI breakage.
Some discussions already took place here:
http://www.dpdk.org/dev/patchwork/patch/12878/

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: John Daley <johndale@cisco.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tomasz Kulasek [Thu, 21 Jul 2016 15:24:19 +0000 (17:24 +0200)]
doc: announce ABI change for Tx preparation

This is an ABI deprecation notice for DPDK 16.11 in librte_ether about
changes in rte_eth_dev and rte_eth_desc_lim structures.

As discussed in that thread:

http://dpdk.org/ml/archives/dev/2015-September/023603.html

Depending on the HW offload requested, different NIC models might impose
different requirements on packets to be transmitted, in terms of:

 - Max number of fragments per packet allowed
 - Max number of fragments per TSO segment
 - The way the pseudo-header checksum should be pre-calculated
 - L3/L4 header fields filling
 - etc.

MOTIVATION:
-----------

1) Some work cannot (and should not) be done in rte_eth_tx_burst.
   However, this work is sometimes required, and currently it is left
   to the application.

2) Different hardware may have different requirements for TX offloads,
   a different subset may be supported, and so on.

3) Some parameters (e.g. the number of segments in the ixgbe driver) may
   hang the device. These parameters may vary between devices.

   For example, i40e HW allows 8 fragments per packet, but that is after
   TSO segmentation, while ixgbe has a 38-fragment pre-TSO limit.

4) Fields in the packet may require different initialization (e.g.
   pseudo-header checksum precalculation, sometimes done in a different
   way depending on packet type, and so on). Currently the application
   needs to take care of it.

5) Using an additional API (rte_eth_tx_prep) before rte_eth_tx_burst
   makes it possible to prepare the packet burst in a form acceptable
   to the specific device.

6) Some additional checks may be done in debug mode, keeping the
   tx_burst implementation clean.

PROPOSAL:
---------

To help users deal with all this variety, we propose to:

1. Introduce the rte_eth_tx_prep() function to do the necessary
   preparation of a packet burst so that it can be safely transmitted on
   the device with the desired HW offloads (set/reset the checksum field
   according to the hardware requirements) and to check HW constraints
   (number of segments per packet, etc).

   Since the limitations and requirements may differ between devices,
   this requires extending the rte_eth_dev structure with a new function
   pointer "tx_pkt_prep", which can be implemented in the driver to
   prepare and verify packets in a device-specific way before the burst.
   This should prevent the application from sending malformed packets.

2. New fields will also be introduced in rte_eth_desc_lim:
   nb_seg_max and nb_mtu_seg_max, providing information about the
   maximum number of segments in TSO and non-TSO packets acceptable by
   the device.

   This information helps the application avoid creating packets that
   the device cannot handle.
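
The kind of check a driver could perform in its "tx_pkt_prep" callback
can be sketched as follows. This is only an illustration under stated
assumptions: the structures and the function (sketch_pkt,
sketch_desc_lim, sketch_tx_prep) are simplified, hypothetical stand-ins
rather than the real DPDK definitions, and the mapping of nb_seg_max to
TSO packets and nb_mtu_seg_max to non-TSO packets is one reading of the
description above.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not the real DPDK structures. */
struct sketch_pkt {
    uint16_t nb_segs; /* number of segments in the packet */
    int      is_tso;  /* non-zero if TSO is requested */
};

struct sketch_desc_lim {
    uint16_t nb_seg_max;     /* assumed: limit for TSO packets */
    uint16_t nb_mtu_seg_max; /* assumed: limit for non-TSO packets */
};

/*
 * Check a burst against the device limits and return the number of
 * leading packets that can be passed on to the transmit function.
 * Checking stops at the first packet that violates a limit.
 */
static uint16_t
sketch_tx_prep(const struct sketch_desc_lim *lim,
               const struct sketch_pkt *pkts, uint16_t nb_pkts)
{
    uint16_t i;

    for (i = 0; i < nb_pkts; i++) {
        uint16_t max = pkts[i].is_tso ? lim->nb_seg_max
                                      : lim->nb_mtu_seg_max;

        if (pkts[i].nb_segs > max)
            break;
    }
    return i;
}
```

Returning a count of leading valid packets mirrors the way the result
of rte_eth_tx_prep is used in the application example of this message:
the caller sends the first nb_prep packets and decides what to do with
the rest.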

APPLICATION (CASE OF USE):
--------------------------

1) The application should initialize the burst of packets to send,
   setting the required tx offload flags and required fields such as
   l2_len, l3_len, l4_len, and tso_segsz.

2) The application passes the burst to rte_eth_tx_prep to check the
   conditions required to send the packets through the NIC.

3) The result of rte_eth_tx_prep can be used to send the valid packets
   and/or restore the invalid ones if the function fails.

e.g.

    for (i = 0; i < nb_pkts; i++) {

        /* initialize or process packet */

        bufs[i]->tso_segsz = 800;
        bufs[i]->ol_flags = PKT_TX_TCP_SEG | PKT_TX_IPV4
                | PKT_TX_IP_CKSUM;
        bufs[i]->l2_len = sizeof(struct ether_hdr);
        bufs[i]->l3_len = sizeof(struct ipv4_hdr);
        bufs[i]->l4_len = sizeof(struct tcp_hdr);
    }

    /* Prepare burst of TX packets */
    nb_prep = rte_eth_tx_prep(port, 0, bufs, nb_pkts);

    if (nb_prep < nb_pkts) {
        printf("tx_prep failed\n");

        /* drop or restore invalid packets */

    }

    /* Send burst of TX packets */
    nb_tx = rte_eth_tx_burst(port, 0, bufs, nb_prep);

    /* Free any unsent packets. */

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Thomas Monjalon [Tue, 26 Jul 2016 16:22:21 +0000 (18:22 +0200)]
doc: announce renaming of ethdev library

The right name of ethdev should be dpdk_netdev. However:
1/ We are using the rte_ prefix in the code and library names.
2/ The API uses rte_ethdev.
That's why 16.11 will just have the rte_ prefix prepended to
the library filename, like every other library.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Pablo de Lara [Sat, 9 Jul 2016 16:56:34 +0000 (17:56 +0100)]
doc: announce driver name changes

Driver names for the supported devices in DPDK do not follow
a naming convention. Some use a prefix, some do not,
and some have long names. Driver names are used when creating
virtual devices, so it is useful to have consistency in the names.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Rahul Lakkireddy [Thu, 28 Jul 2016 10:15:28 +0000 (15:45 +0530)]
doc: remove deprecation notice related to new flow types

Remove deprecation notice pertaining to introduction of new flow
types in favor of a more generic filtering infrastructure proposal.

Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
Yulong Pei [Thu, 28 Jul 2016 06:18:32 +0000 (14:18 +0800)]
doc: add tested hardware and systems for 16.07

Add a new section on tested platforms, NICs and OSes to the release notes.

Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
John McNamara [Wed, 27 Jul 2016 13:26:55 +0000 (14:26 +0100)]
doc: improve wording of new features in 16.07

Improve the wording of some text in the "new features" section of
the release notes.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Olga Shern [Wed, 27 Jul 2016 09:27:26 +0000 (12:27 +0300)]
doc: update guide and release notes for mlx5

Signed-off-by: Olga Shern <olgas@mellanox.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Shreyansh Jain [Tue, 26 Jul 2016 13:31:41 +0000 (19:01 +0530)]
doc: fix path to testpmd app

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Jeff Guo [Wed, 27 Jul 2016 02:56:34 +0000 (22:56 -0400)]
doc: add known issue for promiscuous mode in i40e VF

When the i40e Linux kernel driver is used as the host driver and DPDK
handles the i40e VF, promiscuous mode does not work in the i40e VF.
It is not currently supported by the DPDK i40e VF driver.

Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Dumitru Ceara [Tue, 26 Jul 2016 10:46:09 +0000 (12:46 +0200)]
net/i40e: fix metadata in first mbuf of scattered Rx

The driver is incorrectly setting the RSS field in the last mbuf in
the packet chain instead of the first. Moreover, the last mbuf might
have already been freed if it only contained the Ethernet CRC.

Also, fix the call to i40e_rxd_build_fdir to store the fdir flags in
the first mbuf of the chain instead of the last.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 5a21d9715f81 ("i40e: report flow director matching")

Signed-off-by: Dumitru Ceara <dumitru.ceara@gmail.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Sankar Chokkalingam [Wed, 27 Jul 2016 09:35:30 +0000 (02:35 -0700)]
examples/ip_pipeline: fix flow classification config

This is an example configuration for flow classification.
This fix changes the offset and mask values so that the hash is
computed correctly.
It does not involve any code change and does not impact compilation,
build or performance.

Fixes: 93771a569daa ("examples/ip_pipeline: rework flow classification CLI")

Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Nikhil Rao [Tue, 26 Jul 2016 14:12:40 +0000 (19:42 +0530)]
ethdev: fix documentation for queue start/stop

Fix documentation for the rte_eth_dev_tx/rx_queue_start/stop() functions.

Fixes: 2de9f8551ff9 ("ethdev: fix documentation for queue start/stop")

Signed-off-by: Nikhil Rao <nikhil.rao@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Wei Dai [Wed, 27 Jul 2016 11:25:56 +0000 (19:25 +0800)]
eal: fix tail blank check in --lcores argument

A trailing blank after an lcore group or cpu set makes the check of
its end character fail.
For example, with --lcores '(0-3)@(0-3)   ,(4-5)@(4-5)',
the next character after the cpu set (0-3) is neither ','
nor '\0', which fails the check in eal_parse_lcores().
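
A minimal sketch of the corrected end-of-group check, assuming the fix
amounts to skipping blanks before testing the terminator. The helper
name (sketch_group_end_ok) is hypothetical; it is not the code of
eal_parse_lcores().

```c
#include <ctype.h>

/*
 * Return 1 if, after skipping any blanks, the text at `end` is a valid
 * terminator for one lcore group: ',' or the end of the string.
 */
static int sketch_group_end_ok(const char *end)
{
    while (isblank((unsigned char)*end))
        end++;
    return *end == ',' || *end == '\0';
}
```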

Fixes: 53e54bf81700 ("eal: new option --lcores for cpu assignment")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Wei Dai [Wed, 27 Jul 2016 11:23:41 +0000 (19:23 +0800)]
eal: fix parsing of option --lcores

The '-' in an lcore set overrides the cpu set of the following
lcore sets in the argument of the EAL option --lcores.
For example, with --lcores '0-2,(3-5)@(3,4),6@(5,6),7@(5-7)',
0-2 sets lflags=1, which suppresses the following
cpu sets (3,4), (5,6) and (5-7) after @.

Fixes: 53e54bf81700 ("eal: new option --lcores for cpu assignment")

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Wei Dai [Wed, 27 Jul 2016 11:22:31 +0000 (19:22 +0800)]
eal: remove redundant code to parse --lcores

The local variable i is not referenced by any other code in
the function eal_parse_lcores(), so it can be removed.

Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tetsuya Mukawa [Thu, 28 Jul 2016 05:53:59 +0000 (14:53 +0900)]
maintainers: update email address

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Thomas Monjalon [Mon, 25 Jul 2016 16:29:37 +0000 (18:29 +0200)]
version: 16.07-rc5

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Mon, 25 Jul 2016 19:32:03 +0000 (21:32 +0200)]
mempool: fix unsafe removal from list by callback

If a mempool is removed from the list by a callback function
during rte_mempool_walk(), the TAILQ_FOREACH loop will fail unexpectedly.
It is fixed by using the safe version of the loop macro.
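
The difference can be sketched with a generic list walk. The list type,
the walk function and the callback (sketch_pool, sketch_walk,
sketch_remove_cb) are hypothetical and are not the mempool code. The
example assumes a BSD-style <sys/queue.h>; since some C libraries do
not provide the _SAFE variant, the macro is defined locally when
missing.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Some C libraries do not ship the _SAFE variant; define it if needed. */
#ifndef TAILQ_FOREACH_SAFE
#define TAILQ_FOREACH_SAFE(var, head, field, tvar)          \
    for ((var) = TAILQ_FIRST(head);                         \
         (var) && ((tvar) = TAILQ_NEXT((var), field), 1);   \
         (var) = (tvar))
#endif

struct sketch_pool {
    int id;
    TAILQ_ENTRY(sketch_pool) next;
};
TAILQ_HEAD(sketch_pool_list, sketch_pool);

/*
 * Walk the list and call cb on each element. The next pointer is saved
 * before cb runs, so cb may unlink and free the current element.
 * Returns the number of elements visited.
 */
static int
sketch_walk(struct sketch_pool_list *list,
            void (*cb)(struct sketch_pool_list *, struct sketch_pool *))
{
    struct sketch_pool *p, *tmp;
    int visited = 0;

    TAILQ_FOREACH_SAFE(p, list, next, tmp) {
        cb(list, p);
        visited++;
    }
    return visited;
}

/* Example callback that removes and frees every element it is given. */
static void
sketch_remove_cb(struct sketch_pool_list *list, struct sketch_pool *p)
{
    TAILQ_REMOVE(list, p, next);
    free(p);
}
```

With plain TAILQ_FOREACH, the loop would read the next pointer from the
element after the callback had already freed it.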

Reported-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Thomas Monjalon [Mon, 25 Jul 2016 12:56:37 +0000 (14:56 +0200)]
maintainers: add an entry for the stable branches

This git tree will be used to backport some fixes from the
master branch to maintain some "stable releases".
The minor version number z will be incremented for these releases:
YY.MM.z

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jingjing Wu [Mon, 25 Jul 2016 05:36:09 +0000 (13:36 +0800)]
net/i40e: fix VSI removing when releasing

The VSI structure needs to be removed from the TAILQ list when it is
released. But a child VSI was removed again after the structure had
been freed, which causes a core dump when DPDK i40e is used as the PF
host driver.

This patch fixes it by only removing the child VSI from the TAILQ before
sending the adminq command to remove it from hardware.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 440499cf5376 ("net/i40e: support floating VEB")

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Maxime Coquelin [Mon, 25 Jul 2016 14:09:58 +0000 (16:09 +0200)]
vhost: fix off-by-one error on descriptor number check

nr_desc is not an index but the number of descriptors,
so it can be equal to the virtqueue size.
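
The boundary case can be written out as a tiny sketch. The helper name
(sketch_nr_desc_ok) is hypothetical and this is not the vhost code; it
only shows that a count, unlike an index, is still valid when it equals
the size.

```c
#include <stdint.h>

/*
 * Return 1 if a chain of nr_desc descriptors fits in a virtqueue with
 * vq_size entries, 0 otherwise.
 */
static int sketch_nr_desc_ok(uint32_t nr_desc, uint32_t vq_size)
{
    /* nr_desc is a count, so nr_desc == vq_size is still valid. */
    return nr_desc <= vq_size;
}
```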

Fixes: a436f53ebfeb ("vhost: avoid dead loop chain")

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Hiroyuki Mikita [Sun, 17 Jul 2016 18:08:00 +0000 (03:08 +0900)]
timer: fix corruption with reset

When timer_cb resets another running timer on the same lcore,
the list of expired timers is chained to the pending-list.
This commit prevents a running timer from being reset
by a timer_cb other than its own.

Fixes: a4b7a5a45cf5 ("timer: fix race condition")

Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
Hiroyuki Mikita [Sun, 17 Jul 2016 17:35:50 +0000 (02:35 +0900)]
timer: remove unnecessary list insertion

When timer_set_running_state() fails in rte_timer_manage(),
the failed timer is put back on the pending-list.
In this case, another core is trying to reset or stop the timer,
so it does not need to be on the pending-list.

Fixes: a4b7a5a45cf5 ("timer: fix race condition")

Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
Hiroyuki Mikita [Sun, 17 Jul 2016 14:35:39 +0000 (23:35 +0900)]
timer: fix pending-list manipulation

This commit fixes incorrect pending-list manipulation
when getting list of expired timers in rte_timer_manage().

When timer_get_prev_entries() sets pending_head on prev,
the pending-list is broken:
the next of pending_head always becomes NULL.
At this depth level, there is no need to manipulate the list.

Fixes: 9b15ba895b9f ("timer: use a skip list")

Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
Jerin Jacob [Sun, 24 Jul 2016 17:07:40 +0000 (22:37 +0530)]
ring: fix single consumer dequeue performance

Using rte_smp_wmb() instead of rte_smp_rmb() in the sc dequeue function
creates the additional overhead of waiting for all the STOREs
to the local buffer from ring buffer memory to complete.
The sc dequeue function requires only a LOAD-STORE barrier, where LOADs
from ring buffer memory need to complete before the tail pointer update.
Change to rte_smp_rmb() to provide the required LOAD-STORE barrier.
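
The ordering requirement can be sketched with C11 atomics. This is not
the rte_ring code and it does not use the DPDK barrier macros: a release
store on the tail stands in for "barrier, then plain store", and it
orders more than the LOAD-STORE pair that is strictly needed. The ring
layout and names (sketch_ring, sketch_enqueue, sketch_sc_dequeue) are
hypothetical and cover a single producer and a single consumer only.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* must be a power of two */

struct sketch_ring {
    void *slots[SKETCH_RING_SIZE];
    _Atomic uint32_t prod_tail; /* written by the producer only */
    _Atomic uint32_t cons_tail; /* written by the consumer only */
};

/* Single-producer enqueue. Returns 0 on success, -1 if the ring is full. */
static int sketch_enqueue(struct sketch_ring *r, void *obj)
{
    uint32_t prod = atomic_load_explicit(&r->prod_tail, memory_order_relaxed);
    uint32_t cons = atomic_load_explicit(&r->cons_tail, memory_order_acquire);

    if (prod - cons == SKETCH_RING_SIZE)
        return -1;
    r->slots[prod & (SKETCH_RING_SIZE - 1)] = obj;
    /* Publish the slot contents before the new producer tail. */
    atomic_store_explicit(&r->prod_tail, prod + 1, memory_order_release);
    return 0;
}

/* Single-consumer dequeue. Returns 0 on success, -1 if the ring is empty. */
static int sketch_sc_dequeue(struct sketch_ring *r, void **obj)
{
    uint32_t cons = atomic_load_explicit(&r->cons_tail, memory_order_relaxed);
    uint32_t prod = atomic_load_explicit(&r->prod_tail, memory_order_acquire);

    if (cons == prod)
        return -1;
    /* LOAD from ring memory ... */
    *obj = r->slots[cons & (SKETCH_RING_SIZE - 1)];
    /*
     * ... must complete before the tail update becomes visible, or the
     * producer could overwrite the slot while it is still being read.
     */
    atomic_store_explicit(&r->cons_tail, cons + 1, memory_order_release);
    return 0;
}
```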

Fixes: ecc7d10e448e ("ring: guarantee dequeue ordering before tail update")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Thomas Monjalon [Mon, 25 Jul 2016 10:10:55 +0000 (12:10 +0200)]
mk: fix link with glibc < 2.17

There is a dependency on librt with old glibc.
The -lrt option was added everywhere it is needed but was also
added in some application makefiles as the first link option.
The problem is that this option is only effective if added after
the objects or libraries using it (except if using --whole-archive).
And the -lrt options placed after them were removed to avoid duplicates.

This resulted in errors when linking the test application:
eal_timer.c:(.text+0x128): undefined reference to `clock_gettime'
eal_timer.c:(.text+0x166): undefined reference to `clock_gettime'
eal_alarm.c:(.text+0xda): undefined reference to `clock_gettime'
eal_alarm.c:(.text+0x211): undefined reference to `clock_gettime'

It is fixed by removing superfluous -lrt in app makefiles.

Fixes: 281948b4753e ("mk: fix missing librt dependencies")
Fixes: 2f6414f4baf1 ("mk: fix static link with glibc < 2.17")

Reported-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Ferruh Yigit [Mon, 25 Jul 2016 12:55:49 +0000 (13:55 +0100)]
mk: fix build with clang < 3.5

clang versions < 3.5 do not support the -z linker option,
and some FreeBSD boxes still have a clang version < 3.5 as the default.

compile error:
clang: error: unknown argument: '-z'

Fixes: fd591c4c4e35 ("mk: check shared library dependencies")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Ferruh Yigit [Mon, 25 Jul 2016 12:55:48 +0000 (13:55 +0100)]
mk: fix clang version query

-dumpversion exists for gcc compatibility and does not return the actual
clang version. It has returned only 4.2.1 for a long time.

Fixes: 2ef6eea891e5 ("mk: add clang toolchain")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Bruce Richardson <bruce.richardson@intel.com>
Thomas Monjalon [Fri, 22 Jul 2016 20:41:56 +0000 (22:41 +0200)]
version: 16.07-rc4

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Yuanhan Liu [Tue, 19 Jul 2016 04:17:49 +0000 (12:17 +0800)]
maintainers: add git tree for virtio/vhost

Add a git tree line for the virtio/vhost section, to state explicitly
that developers are encouraged to base their patches on that tree.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Wed, 20 Jul 2016 10:34:45 +0000 (12:34 +0200)]
maintainers: split networking and crypto drivers

There are now 2 different sections for drivers/net/ and drivers/crypto/.
This makes it possible to declare some dedicated git trees.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Thomas Monjalon [Wed, 20 Jul 2016 13:38:54 +0000 (15:38 +0200)]
unify tools naming

The following tools may be installed system-wide.
It may be cleaner and more convenient to find them with the same
dpdk- prefix (especially for autocompletion).
Moreover, the script dpdk_nic_bind.py deserves a new name because it is
not restricted to NICs and can be used for e.g. crypto.

These files are renamed:
pmdinfogen       -> dpdk-pmdinfogen
pmdinfo.py       -> dpdk-pmdinfo.py
dpdk_pdump       -> dpdk-pdump
dpdk_proc_info   -> dpdk-procinfo
dpdk_nic_bind.py -> dpdk-devbind.py
setup.sh         -> dpdk-setup.sh

The tools pmdinfogen, pmdinfo.py and dpdk_pdump are new in 16.07.

The scripts dpdk_nic_bind.py and setup.sh may have been used with
previous releases by end users. That's why symbolic links still
provide the old names in the installed tools directory.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
John McNamara [Sun, 17 Jul 2016 13:19:08 +0000 (14:19 +0100)]
doc: update sphinx installation instructions

Update the Sphinx installation instructions in the documentation
contributors guide to reflect the fact that in the 1.4+ versions
of Sphinx the ReadTheDocs theme must also be installed. Previously,
in version 1.3.x, it was installed by default.

Also change 'yum' to 'dnf' for package installations.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
John McNamara [Sun, 17 Jul 2016 13:11:54 +0000 (14:11 +0100)]
doc: fix sphinx highlighting warnings

Fix warnings raised by Python Sphinx 1.4.5:

    guides/sample_app_ug/ip_pipeline.rst:334:
    WARNING: Could not lex literal_block as "ini". Highlighting skipped.

    guides/sample_app_ug/l2_forward_real_virtual.rst:467:
    WARNING: Could not lex literal_block as "c". Highlighting skipped.

    guides/sample_app_ug/l3_forward.rst:293:
    WARNING: Could not lex literal_block as "c". Highlighting skipped.

    guides/sample_app_ug/vm_power_management.rst:162:
    WARNING: Could not lex literal_block as "xml". Highlighting skipped.

These warnings arise from invalid syntax in code-block directives.

Fixes: f1e779ec5b50 ("doc: update ip pipeline app guide")
Fixes: d0dff9ba445e ("doc: sample application user guide")
Fixes: c75f4e6a7a2b ("doc: add vm power mgmt app")

Signed-off-by: John McNamara <john.mcnamara@intel.com>
John McNamara [Tue, 19 Jul 2016 13:16:35 +0000 (14:16 +0100)]
doc: fix release notes for 16.07

Fix grammar, spelling and formatting of DPDK 16.07 release notes.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
Pablo de Lara [Wed, 20 Jul 2016 12:31:12 +0000 (13:31 +0100)]
doc: add cryptodev shared library version to release notes

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Jingjing Wu [Tue, 19 Jul 2016 03:31:10 +0000 (11:31 +0800)]
doc: add flow bifurcation howto on Linux

Flow Bifurcation is a mechanism which uses features of advanced
Ethernet devices to split traffic between queues. It provides
the capability to let the kernel driver and DPDK driver co-exist
and take advantage of both.

It is achieved by using SR-IOV and the NIC's advanced filtering. This
patch describes Flow Bifurcation and adds the user guide for ixgbe
and i40e NICs.

Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Bernard Iremonger [Mon, 18 Jul 2016 14:30:27 +0000 (15:30 +0100)]
doc: add VM live migration image

This patch adds an image of the test configuration for Live Migration
of a VM using vhost_user on the host.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Bernard Iremonger [Mon, 18 Jul 2016 14:30:26 +0000 (15:30 +0100)]
doc: add VM live migration howto with vhost-user

This patch describes the procedure to be followed to perform
Live Migration of a VM with Virtio PMD running on a host which
is running the vhost_user sample application (vhost-switch).

It includes sample host and VM scripts used in the procedure.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: add VF live migration image
Bernard Iremonger [Tue, 19 Jul 2016 15:09:29 +0000 (16:09 +0100)]
doc: add VF live migration image

This patch adds an image of the test configuration for Live Migration
with virtio and SR-IOV.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: add VF live migration howto with bonded virtio
Bernard Iremonger [Tue, 19 Jul 2016 15:09:28 +0000 (16:09 +0100)]
doc: add VF live migration howto with bonded virtio

This patch describes the procedure to be followed
to perform Live Migration of a VM with Virtio and VF PMDs
using the bonding PMD.

It includes sample host and VM scripts used in the procedure,
and a sample switch configuration.

Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: fix vhost setup in tep_termination guide
Mark Kavanagh [Thu, 21 Jul 2016 13:10:13 +0000 (14:10 +0100)]
doc: fix vhost setup in tep_termination guide

- Fix vhost setup flags
- Add minor edits to improve readability and consistency

Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: note a pitfall on vhost reconnect feature
Yuanhan Liu [Tue, 19 Jul 2016 04:17:48 +0000 (12:17 +0800)]
doc: note a pitfall on vhost reconnect feature

The vhost feature negotiation only happens at the virtio reset stage, for
example when a virtio-net device is first initialized, or when the DPDK
virtio PMD initializes. That means that if the vhost application restarts
after the negotiation and reconnects, the feature negotiation process will
not be triggered again, meaning the info is lost. To make reconnect work,
QEMU simply saves the negotiated features before the restart and restores
them afterwards.

Therefore, the vhost supported features must be exactly the same before
and after the restart. For example, if TSO is disabled and then enabled,
nothing will work and undefined issues might happen.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agodoc: update release notes and guide for enic
John Daley [Thu, 21 Jul 2016 09:11:58 +0000 (02:11 -0700)]
doc: update release notes and guide for enic

Signed-off-by: John Daley <johndale@cisco.com>
7 years agodoc: fix macro name in mempool guide
Shreyansh Jain [Mon, 18 Jul 2016 11:33:00 +0000 (17:03 +0530)]
doc: fix macro name in mempool guide

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agoexamples/l2fwd-ivshmem: fix build with icc
Ferruh Yigit [Fri, 22 Jul 2016 14:13:36 +0000 (15:13 +0100)]
examples/l2fwd-ivshmem: fix build with icc

icc version 16.0.2, compile error:

examples/l2fwd-ivshmem/host/host.c(157):
error #3656: variable "total_vm_packets_dropped"
             may be used before its value is set
        total_vm_packets_dropped += ctrl->vm_ports[portid].stats.dropped;
        ^

Fixes: 6aa497249172 ("examples/l2fwd-ivshmem: import sample application")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
7 years agoapp/pdump: cleanup rings upon failures
Reshma Pattan [Fri, 22 Jul 2016 13:44:04 +0000 (14:44 +0100)]
app/pdump: cleanup rings upon failures

On failure, create_mp_ring_vdev() exits without freeing the rte rings it
created, so the pdump tool cannot be rerun successfully.
Add rte ring cleanup logic on the failure paths.

Fixes: caa7028276b8 ("app/pdump: add tool for packet capturing")

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
7 years agonet/i40e: fix unsafe tailq element removal
Pablo de Lara [Fri, 22 Jul 2016 14:02:02 +0000 (15:02 +0100)]
net/i40e: fix unsafe tailq element removal

i40e driver was removing elements when iterating tailq lists
with TAILQ_FOREACH macro, which is not safe.
It is especially visible since the memory is zeroed on free
(commit ea0bddbd14e6).

Instead, TAILQ_FOREACH_SAFE macro is used when removing/freeing
these elements.

Fixes: 4861cde46116 ("i40e: new poll mode driver")
Fixes: 440499cf5376 ("net/i40e: support floating VEB")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
7 years agoeal: add tailq safe iterator macro
Pablo de Lara [Fri, 22 Jul 2016 14:02:01 +0000 (15:02 +0100)]
eal: add tailq safe iterator macro

Removing/freeing elements within a TAILQ_FOREACH loop is not safe.
FreeBSD defines TAILQ_FOREACH_SAFE macro, which permits
these operations safely.
This patch defines this macro for Linux systems, where it is not defined.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
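
Note (illustrative, not part of the commit): a minimal sketch of the safe
iterator described in the tailq commit above. The macro body mirrors the
FreeBSD definition; struct item and remove_odd() are hypothetical names
used only for this example.

```c
#include <stdlib.h>
#include <sys/queue.h>

/* Define the safe iterator only where <sys/queue.h> lacks it. */
#ifndef TAILQ_FOREACH_SAFE
#define TAILQ_FOREACH_SAFE(var, head, field, tvar)           \
	for ((var) = TAILQ_FIRST((head));                        \
	     (var) && ((tvar) = TAILQ_NEXT((var), field), 1);    \
	     (var) = (tvar))
#endif

/* Hypothetical element type, for illustration only. */
struct item {
	int value;
	TAILQ_ENTRY(item) next;
};
TAILQ_HEAD(item_list, item);

/* Remove and free every element with an odd value.
 * Returns the number of elements removed. */
static int
remove_odd(struct item_list *list)
{
	struct item *it, *tmp;
	int removed = 0;

	/* tmp holds the successor, so freeing 'it' does not break iteration. */
	TAILQ_FOREACH_SAFE(it, list, next, tmp) {
		if (it->value % 2 != 0) {
			TAILQ_REMOVE(list, it, next);
			free(it);
			removed++;
		}
	}
	return removed;
}
```

With plain TAILQ_FOREACH, the loop step would read the next pointer out of
an element that has already been freed.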
7 years agomem: fix check of physical address retrieval
Michal Jastrzebski [Fri, 22 Jul 2016 14:33:50 +0000 (16:33 +0200)]
mem: fix check of physical address retrieval

In rte_mem_virt2phy(), the return value of read(), which indicates the
number of bytes read, was ignored. This could cause a wrong PFN (page
frame number) mask to be read from the pagemap file.
Now, when read() returns fewer than sizeof(uint64_t) bytes,
rte_mem_virt2phy() returns an error.

Coverity issue: 13212
Fixes: 40b966a211ab ("ivshmem: library changes for mmaping using ivshmem")

Signed-off-by: Michal Jastrzebski <michalx.k.jastrzebski@intel.com>
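
Note (illustrative, not part of the commit): a minimal sketch of the
short-read check described above. read_u64() is a hypothetical helper, not
the DPDK function; it only shows that anything other than a full
sizeof(uint64_t) read is treated as an error.

```c
#include <stdint.h>
#include <unistd.h>

/* Read one 64-bit entry from the current position of fd.
 * Returns 0 on success, -1 on error or on a short read. */
static int
read_u64(int fd, uint64_t *out)
{
	uint64_t value;
	ssize_t n = read(fd, &value, sizeof(value));

	/* Covers both n < 0 (error) and 0 <= n < 8 (partial read). */
	if (n != (ssize_t)sizeof(value))
		return -1;
	*out = value;
	return 0;
}
```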
7 years agoscripts: validate ABI faster with parallel make jobs
Neil Horman [Wed, 20 Jul 2016 19:02:17 +0000 (15:02 -0400)]
scripts: validate ABI faster with parallel make jobs

John McNamara and I were discussing enhancing the validate_abi script to
build the dpdk tree faster with multiple jobs.
There's no reason not to do it, so this implements that requirement.

It uses a DPDK_MAKE_JOBS variable that can be set by the user to limit
the job count.  By default the job count is set to the number of online
cpus.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
7 years agoexamples/ip_pipeline: fix performance with default config
Sankar Chokkalingam [Mon, 18 Jul 2016 18:23:26 +0000 (11:23 -0700)]
examples/ip_pipeline: fix performance with default config

In TM, the read size should be smaller than the write size to improve
performance.
This enables the TM ports to push maximum packets to the output port.

This fix changes the burst_read value from 64 to 24 in default_tm_params.

Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
7 years agoexamples/ip_pipeline: fix IPv6 flow classification
Sankar Chokkalingam [Mon, 18 Jul 2016 18:15:43 +0000 (11:15 -0700)]
examples/ip_pipeline: fix IPv6 flow classification

IP Pipeline application with the configuration for Flow Classification
IPV6 did not instantiate.
Parse error in section "PIPELINE1": entry "dma_src_mask" too long

The dma_src_mask check in pipeline_passthrough_parse_args() is wrong.

This fix increases the length of dma_src_mask by 1 for NUL termination
and corrects the validation of the dma_src_mask length.
This fix is also propagated to pipeline_fc_parse_args() for key_mask_str
validation.

Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
7 years agoexamples/ip_pipeline: fix action flow bulk command
Sankar Chokkalingam [Mon, 18 Jul 2016 17:32:58 +0000 (10:32 -0700)]
examples/ip_pipeline: fix action flow bulk command

Error while executing action flow bulk command
pipeline> p 1 action flow bulk ./config/action.txt
Command "action flow bulk" failed
pipeline>

The flow action entries are added successfully.
But the return value is not computed correctly.
Due to this, the error message appears on CLI.

The return value is computed with rsp->n_flows after rsp pointer is freed.
This fix computes the return value before rsp pointer is freed.

Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
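
Note (illustrative, not part of the commit): a minimal sketch of the
ordering fix described above, where the return value is computed before
the response is freed. struct response and consume_response() are
hypothetical stand-ins, not the ip_pipeline types.

```c
#include <stdlib.h>

/* Hypothetical response message, for illustration only. */
struct response {
	int status;
	unsigned n_flows;
};

/* Returns 0 if the response reports exactly 'expected' flows, -1 otherwise.
 * Takes ownership of rsp and frees it. */
static int
consume_response(struct response *rsp, unsigned expected)
{
	int ret;

	if (rsp == NULL)
		return -1;

	/* Read everything needed from rsp before releasing it. */
	ret = (rsp->status == 0 && rsp->n_flows == expected) ? 0 : -1;
	free(rsp);
	return ret;
}
```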
7 years agoexamples/performance-thread: add missing braces
Pablo de Lara [Mon, 18 Jul 2016 11:20:25 +0000 (12:20 +0100)]
examples/performance-thread: add missing braces

pthread_detach() function was returning 0 even when not calling
lthread_detach(), due to missing braces in conditional
(extra indentation was applied, giving a hint this is the correct fix).

Fixes: 433ba6228f9a ("examples/performance-thread: add pthread_shim app")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Tested-by: John McNamara <john.mcnamara@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
7 years agoexamples/vhost: fix performance
Jianfeng Tan [Thu, 21 Jul 2016 00:42:45 +0000 (00:42 +0000)]
examples/vhost: fix performance

We found a significant performance drop introduced by the commit below
when the vhost example is started with --mergeable 0 and, inside the VM,
the kernel virtio-net driver is used to do IP-based forwarding.

The commit, 859b480d5afd ("vhost: add guest offload setting"), adds
support for VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6,
in vhost lib. But inside vhost example, the way to disable tso only
excludes the direction from virtio to vhost, but not the opposite
direction. When mergeable is disabled, it triggers big_packets path
of virtio-net driver to prepare to receive possible big packets with
size of 64K. Because mergeable is off, for each entry of avail ring,
virtio driver uses 19 desc chained together, with one desc pointing
to header, other 18 desc pointing to 4K-sized pages. But QEMU only
creates 256 desc entries for each vq, which results in that only 13
packets can be received. VM kernel can quickly handle those packets
and go to sleep (HLT).

As QEMU has no option to set the number of desc entries of a vq, here
we disable VIRTIO_NET_F_GUEST_TSO4 and VIRTIO_NET_F_GUEST_TSO6
along with VIRTIO_NET_F_HOST_TSO4 and VIRTIO_NET_F_HOST_TSO6 when we
disable TSO in the vhost example, to prevent the VM kernel virtio driver
from going into the big_packets path.

Fixes: 9fd72e3cbd29 ("examples/vhost: add virtio offload")

Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
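
Note (illustrative, not part of the commit): the "only 13 packets" figure
above follows from integer division of the vq size by the chain length.
packets_per_vq() is a hypothetical helper that just restates that
arithmetic.

```c
/* Number of whole packets a virtqueue can hold when every packet
 * consumes a chain of 'descs_per_packet' descriptors. */
static unsigned
packets_per_vq(unsigned vq_entries, unsigned descs_per_packet)
{
	return vq_entries / descs_per_packet;
}
```

With 256 entries and chains of 1 header desc plus 18 page descs,
packets_per_vq(256, 19) gives 13, leaving 9 descriptors unused.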
7 years agoexamples/ipsec-secgw: call start function
Hemant Agrawal [Thu, 21 Jul 2016 10:54:04 +0000 (16:24 +0530)]
examples/ipsec-secgw: call start function

The usual device sequence is configure, queue setup and start.
Crypto device should be started before use.

Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
7 years agoexamples/l2fwd-crypto: call start function
Hemant Agrawal [Thu, 21 Jul 2016 10:54:03 +0000 (16:24 +0530)]
examples/l2fwd-crypto: call start function

The usual device sequence is configure, queue setup and start.
Crypto device should be started before use.

Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
7 years agoexamples/ipsec-secgw: fix build with gcc 4.5
Sergio Gonzalez Monroy [Tue, 19 Jul 2016 11:06:00 +0000 (12:06 +0100)]
examples/ipsec-secgw: fix build with gcc 4.5

GCC 4.5.x does not handle initialization of anonymous unions and/or
structs well.

To make the compiler happy, we name those anonymous unions/structs.

Fixes: 906257e965b7 ("examples/ipsec-secgw: support IPv6")

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
7 years agocryptodev: fix memory leak in parameter parsing
Pablo de Lara [Mon, 18 Jul 2016 13:21:04 +0000 (14:21 +0100)]
cryptodev: fix memory leak in parameter parsing

When parsing the parameters for virtual device initialization,
rte_kvargs structure was being freed only if there was an error,
not when parsing was successful.

Coverity issue: 124568
Fixes: f3e764fa2fb7 ("cryptodev: uninline parameter parsing")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Reshma Pattan <reshma.pattan@intel.com>
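
Note (illustrative, not part of the commit): a minimal sketch of a single
cleanup path that frees the parsed arguments on success as well as on
error. kvlist_parse(), kvlist_free() and live_objects are hypothetical
stand-ins used only to make the example self-contained; they are not the
rte_kvargs API.

```c
#include <stdlib.h>

static int live_objects;  /* demo-only counter of unfreed allocations */

struct kvlist { int unused; };

static struct kvlist *
kvlist_parse(const char *args)
{
	struct kvlist *kv;

	if (args == NULL)
		return NULL;
	kv = malloc(sizeof(*kv));
	if (kv != NULL)
		live_objects++;
	return kv;
}

static void
kvlist_free(struct kvlist *kv)
{
	free(kv);
	live_objects--;
}

/* 'fail' simulates an error while processing the parsed arguments. */
static int
parse_params(const char *args, int fail)
{
	struct kvlist *kv = kvlist_parse(args);
	int ret = 0;

	if (kv == NULL)
		return -1;
	if (fail)
		ret = -1;

	/* Reached on both the success and the error path. */
	kvlist_free(kv);
	return ret;
}
```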
7 years agonet/virtio-user: fix inconsistent name
Jianfeng Tan [Fri, 22 Jul 2016 02:24:47 +0000 (02:24 +0000)]
net/virtio-user: fix inconsistent name

The commit cb6696d22023 ("drivers: update registration macro usage")
changes the name from virtio-user to virtio_user, because hyphen
cannot be used in a C symbol name. However, this commit does not
update the strings in docs and source code, which could lead to
failure to start this device as per the docs.

This patch updates related strings in the docs and source code.

Fixes: cb6696d22023 ("drivers: update registration macro usage")

Reported-by: Tiwei Bie <tiwei.bie@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/fm10k: fix RSS hash config
Xiao Wang [Thu, 21 Jul 2016 08:24:30 +0000 (16:24 +0800)]
net/fm10k: fix RSS hash config

Sometimes an application only wants to update the RSS hash function
without updating the RSS key, but the fm10k PMD returns EINVAL in this case.

If the rss_key is NULL, we don't need to check the rss_key_len.

Fixes: 57033cdf8fdc ("fm10k: add PF RSS")

Reported-by: Xueqin Lin <xueqin.lin@intel.com>
Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
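
Note (illustrative, not part of the commit): a minimal sketch of the check
described above, where the key length is validated only when a key is
supplied. struct rss_conf, DEMO_RSS_KEY_LEN and check_rss_conf() are
hypothetical; the real driver code and key size may differ.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RSS_KEY_LEN 40  /* assumed minimum key size, for illustration */

/* Hypothetical configuration struct, for illustration only. */
struct rss_conf {
	const uint8_t *rss_key;  /* NULL means "leave the key unchanged" */
	uint8_t rss_key_len;
	uint64_t rss_hf;         /* hash function selection */
};

static int
check_rss_conf(const struct rss_conf *conf)
{
	/* A NULL key is a hash-function-only update: skip the length check. */
	if (conf->rss_key != NULL && conf->rss_key_len < DEMO_RSS_KEY_LEN)
		return -EINVAL;
	return 0;
}
```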
7 years agonet/i40e: fix out-of-bounds writes in vector Rx
Sergey Dyasly [Thu, 21 Jul 2016 11:03:38 +0000 (14:03 +0300)]
net/i40e: fix out-of-bounds writes in vector Rx

Rx loop inside _recv_raw_pkts_vec() ignores nb_pkts argument and always
tries to receive RTE_I40E_VPMD_RX_BURST (32) packets. This is a violation
of rte_eth_rx_burst() API and can lead to memory corruption (out-of-bounds
writes to struct rte_mbuf **rx_pkts) if nb_pkts is less than 32.

Fix this by actually using nb_pkts inside the loop.

Fixes: 9ed94e5bb04e ("i40e: add vector Rx")

Signed-off-by: Sergey Dyasly <s.dyasly@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Adam Bynes <adambynes@outlook.com>
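
Note (illustrative, not part of the commit): a minimal sketch of a receive
loop that honours nb_pkts instead of always filling a full burst.
recv_burst() and MAX_BURST are hypothetical; packets are modelled as plain
ints, and the real vector code processes several descriptors per
iteration.

```c
#include <stdint.h>

#define MAX_BURST 32  /* stands in for RTE_I40E_VPMD_RX_BURST */

/* Copy up to nb_pkts entries from 'ready' into rx_pkts.
 * Never writes more than nb_pkts slots of rx_pkts. */
static uint16_t
recv_burst(int *rx_pkts, uint16_t nb_pkts, const int *ready, uint16_t n_ready)
{
	uint16_t i;

	if (nb_pkts > MAX_BURST)
		nb_pkts = MAX_BURST;
	if (nb_pkts > n_ready)
		nb_pkts = n_ready;

	/* The loop bound is the caller's request, not the burst constant. */
	for (i = 0; i < nb_pkts; i++)
		rx_pkts[i] = ready[i];
	return i;
}
```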
7 years agonet/i40e: fix speed capabilities
Ido Barnea [Thu, 21 Jul 2016 23:25:31 +0000 (01:25 +0200)]
net/i40e: fix speed capabilities

Signed-off-by: Ido Barnea <ibarnea@cisco.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
7 years agonet/ixgbe/base: fix C++ build
Ido Barnea [Thu, 21 Jul 2016 23:20:35 +0000 (01:20 +0200)]
net/ixgbe/base: fix C++ build

Signed-off-by: Ido Barnea <ibarnea@cisco.com>
7 years agonet/enic: heed VLAN strip flag
John Daley [Tue, 19 Jul 2016 22:43:45 +0000 (15:43 -0700)]
net/enic: heed VLAN strip flag

The configure function enicpmd_dev_configure() was not paying attention
to the rxmode VLAN strip bit. Set the VLAN strip mode according to the bit.

Fixes: fefed3d1e62c ("enic: new driver")

Signed-off-by: John Daley <johndale@cisco.com>
Reviewed-by: David Harton <dharton@cisco.com>
Tested-by: David Harton <dharton@cisco.com>
7 years agonet/enic: fix possible Rx corruption
John Daley [Tue, 19 Jul 2016 22:42:43 +0000 (15:42 -0700)]
net/enic: fix possible Rx corruption

Initialize the mbuf data offset to RTE_PKTMBUF_HEADROOM as the
enic takes ownership of them. If allocated mbufs had some offset
other than RTE_PKTMBUF_HEADROOM, the application would read mbuf
data starting at the wrong place and misinterpret the packet.

Fixes: 856d7ba7ed22 ("net/enic: support scattered Rx")

Reviewed-by: Nelson Escobar <neescoba@cisco.com>
Signed-off-by: John Daley <johndale@cisco.com>
7 years agonet/ena: fix icc build
Ferruh Yigit [Tue, 19 Jul 2016 09:33:03 +0000 (10:33 +0100)]
net/ena: fix icc build

drivers/net/ena/base/ena_com.c(346):
error #3656: variable "dev_node" may be used before its value is set
                ENA_MEM_ALLOC_COHERENT_NODE(ena_dev->dmadev,
                ^

drivers/net/ena/base/ena_com.c(399):
error #3656: variable "prev_node" may be used before its value is set
        ENA_MEM_ALLOC_COHERENT_NODE(ena_dev->dmadev,
        ^

Fixes: 3d3edc265fc8 ("net/ena: make coherent memory allocation NUMA-aware")

Reported-by: Eoin Breen <eoin.breen@intel.com>
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Jan Medala <jan@semihalf.com>
7 years agonet/bnx2x: fix mempool name length
Rasesh Mody [Wed, 20 Jul 2016 19:09:34 +0000 (12:09 -0700)]
net/bnx2x: fix mempool name length

This patch fixes following error:
   EAL: Detected 36 lcore(s)
   EAL: Probing VFIO support...
   PMD: bnxt_rte_pmd_init() called for (null)
   EAL: PCI device 0000:08:00.0 on NUMA socket 0
   EAL:   probe driver: 14e4:16a1 rte_bnx2x_pmd
   EAL: PCI device 0000:08:00.1 on NUMA socket 0
   EAL:   probe driver: 14e4:16a1 rte_bnx2x_pmd
   Lcore 0: RX port 0
   Lcore 1: RX port 1
   Initializing port 0... EAL: Error - exiting with code: 1
     Cause: Cannot configure device: err=-6, port=0

Fixes: 540a2110 ("bnx2x: driver core")
Fixes: 85cf0079 ("mem: avoid memzone/mempool/ring name truncation")

Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
7 years agonet/bnx2x: disable fast path interrupts
Rasesh Mody [Wed, 20 Jul 2016 19:09:33 +0000 (12:09 -0700)]
net/bnx2x: disable fast path interrupts

Disable fastpath interrupts and remove unneeded delay in
bnx2x_interrupt_action(). This patch fixes and prevents performance
degradation (up to 50% drop) for BNX2X PMD.

Fixes: 540a2110 ("bnx2x: driver core")

Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com>
7 years agonet/virtio: fix crash on null dereference
Yuanhan Liu [Tue, 19 Jul 2016 02:39:53 +0000 (10:39 +0800)]
net/virtio: fix crash on null dereference

The rxq/txq for the queue_release callback could be NULL, for example
when rte_eth_dev_configure() fails and the queue is not set up at all.

A simple NULL check fixes the crash.

Fixes: 01ad44fd374f ("net/virtio: split Rx/Tx queue")

Reported-by: Olivier Matz <olivier.matz@6wind.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agonet/virtio: fix packet corruption
Olivier Matz [Tue, 19 Jul 2016 12:31:59 +0000 (14:31 +0200)]
net/virtio: fix packet corruption

The support of virtio-user changed the way the mbuf dma address is
retrieved, using a physical address in case of virtio-pci and a virtual
address in case of virtio-user.

This change introduced some possible memory corruption in packets,
replacing:
  m->buf_physaddr + RTE_PKTMBUF_HEADROOM
by:
  m->buf_physaddr + m->data_off     (through a macro)

This patch fixes this issue, restoring the original behavior.

By the way, it also reworks the macros, adding a "VIRTIO_" prefix and
API comments.

Fixes: f24f8f9fee8a ("net/virtio: allow virtual address to fill vring descriptors")

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agovhost: fix unregistering in client mode
Ilya Maximets [Thu, 21 Jul 2016 12:55:36 +0000 (15:55 +0300)]
vhost: fix unregistering in client mode

Currently, when 'rte_vhost_driver_unregister()' is called, the connection
to QEMU is not closed. This makes it impossible to register the driver
again and reconnect to the same virtual machine.

This scenario is reproducible with OVS. While executing of the following
command vhost port will be re-created (will be executed
'rte_vhost_driver_register()' followed by 'rte_vhost_driver_unregister()')
network will be broken and QEMU possibly will crash:

ovs-vsctl set Interface vhost1 ofport_request=15

Fix this by closing all established connections on driver unregister and
removing pending connections from the reconnection list.

Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
7 years agovhost: fix connect hang in client mode
Ilya Maximets [Thu, 21 Jul 2016 13:19:35 +0000 (16:19 +0300)]
vhost: fix connect hang in client mode

If something abnormal happens to QEMU, 'connect()' can block the calling
thread (e.g. the main thread of OVS) forever or for a really long time.
This can break the whole application or block the reconnection thread.

Example with OVS:

ovs_rcu(urcu2)|WARN|blocked 512000 ms waiting for main to quiesce
(gdb) bt
#0  connect () from /lib64/libpthread.so.0
#1  vhost_user_create_client (vsocket=0xa816e0)
#2  rte_vhost_driver_register
#3  netdev_dpdk_vhost_user_construct
#4  netdev_open (name=0xa664b0 "vhost1")
[...]
#11 main

Fix that by setting non-blocking mode for client sockets for connection.

Fixes: 64ab701c3d1e ("vhost: add vhost-user client mode")

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
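
Note (illustrative, not part of the commit): a minimal sketch of putting a
descriptor into non-blocking mode with fcntl(), which is one common way to
keep connect() from blocking. set_nonblocking() is a hypothetical helper;
the patch itself may be structured differently.

```c
#include <fcntl.h>

/* Switch fd to non-blocking mode while preserving its other status flags.
 * Returns 0 on success, -1 on error. */
static int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
		return -1;
	return 0;
}
```

On a non-blocking socket, connect() returns promptly; if the connection
cannot complete at once it fails with an errno such as EINPROGRESS or
EAGAIN, which the caller has to handle, for example by retrying later.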
7 years agoethdev: fix overwriting driver-specific stats
Remy Horton [Tue, 19 Jul 2016 11:05:17 +0000 (12:05 +0100)]
ethdev: fix overwriting driver-specific stats

After doing a driver callout to fill in the driver specific
parts of struct rte_eth_stats, rte_eth_stats_get() overwrites
the rx_nombuf member regardless of whether the driver itself
has assigned a value. Any driver-assigned value should take
priority.

Fixes: af75078fece3 ("first public release")

Signed-off-by: Remy Horton <remy.horton@intel.com>
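
Note (illustrative, not part of the commit): one way to let a
driver-assigned value take priority is to store the generic default first
and call the driver afterwards. This is a sketch under that assumption;
struct stats, stats_get() and the two demo drivers are hypothetical.

```c
#include <string.h>

/* Hypothetical stats struct, for illustration only. */
struct stats {
	unsigned long ipackets;
	unsigned long rx_nombuf;
};

typedef void (*drv_stats_get_t)(struct stats *s);

/* Demo driver that reports its own rx_nombuf value. */
static void drv_sets_nombuf(struct stats *s) { s->rx_nombuf = 42; }

/* Demo driver that leaves rx_nombuf alone. */
static void drv_leaves_nombuf(struct stats *s) { s->ipackets = 1; }

static void
stats_get(struct stats *s, unsigned long generic_nombuf, drv_stats_get_t drv)
{
	memset(s, 0, sizeof(*s));
	s->rx_nombuf = generic_nombuf;  /* default, written first */
	drv(s);                         /* driver may overwrite it */
}
```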
7 years agoapp/test: increase memory allocated for greedy autotests
Thomas Monjalon [Tue, 19 Jul 2016 17:15:34 +0000 (19:15 +0200)]
app/test: increase memory allocated for greedy autotests

The autotest lists, requirements and distribution need a big rework
to reduce the amount of cores and memory required.
The root cause is not addressed yet.

This patch just increases some memory allocation for some greedy tests
which often fail because of memory fragmentation:
LPM6 and reentrancy tests in groups 3 and 6 respectively.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Olivier Matz <olivier.matz@6wind.com>
7 years agoapp/test: disable filtering with stripped binary
Thomas Monjalon [Tue, 19 Jul 2016 16:43:19 +0000 (18:43 +0200)]
app/test: disable filtering with stripped binary

The unavailable tests are filtered out by autotest by looking for
the symbols in the binary:

PCI autotest:                  Skipped [Not Available]       [00m 00s]
Malloc autotest:               Success                       [00m 00s]

This results in everything being skipped if the binary has no symbols (stripped):

PCI autotest:                  Skipped [Not Available]       [00m 00s]
Malloc autotest:               Skipped [Not Available]       [00m 00s]

This case is handled by getting back to the old behaviour if the binary
has no symbol information:

PCI autotest:                  Fail [Not found]              [00m 00s]
Malloc autotest:               Success                       [00m 00s]

Fixes: d553c8f2b1a2 ("app/test: filter out unavailable tests")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Tested-by: Olivier Matz <olivier.matz@6wind.com>
7 years agoapp/test: fix mempool freeing
Santosh Shukla [Thu, 21 Jul 2016 11:49:47 +0000 (17:19 +0530)]
app/test: fix mempool freeing

The test_mempool function does not use the pointer variable 'mp_ext',
yet frees it. Remove that variable and free 'mp_stack' instead.

Fixes: ea151eb48a04 ("app/test: migrate custom mempool handler to stack handler")

Signed-off-by: Santosh Shukla <santosh.shukla@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years agoapp/test: fix ring size
Jerin Jacob [Mon, 18 Jul 2016 05:55:54 +0000 (11:25 +0530)]
app/test: fix ring size

rte_ring_create expects the size of the ring to
be a power of 2. The REFCNT_RING_SIZE value is not a
power of 2 when RTE_MAX_LCORE == 96.
Fix it by aligning the size to the next power of 2.

Fixes: af75078f ("first public release")

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
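
Note (illustrative, not part of the commit): a stand-alone sketch of
rounding a size up to the next power of 2, as the fix above describes.
align_pow2() is a hypothetical name and not necessarily the helper the
patch uses.

```c
#include <stdint.h>

/* Round v up to the next power of two (v itself if it already is one).
 * Assumes 0 < v <= 2^31. */
static uint32_t
align_pow2(uint32_t v)
{
	v--;
	v |= v >> 1;   /* smear the highest set bit downwards */
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}
```

For example, align_pow2(96) yields 128, while a value that is already a
power of 2 is returned unchanged.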
7 years agoring: guarantee dequeue ordering before tail update
Juhamatti Kuusisaari [Fri, 15 Jul 2016 04:39:51 +0000 (07:39 +0300)]
ring: guarantee dequeue ordering before tail update

Consumer queue dequeuing must be guaranteed to be done fully before
the tail is updated. A read barrier does not guarantee this, so it is
changed to a write barrier just before the tail update, which in practice
guarantees the correct order of reads and writes.

Signed-off-by: Juhamatti Kuusisaari <juhamatti.kuusisaari@coriant.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
7 years agomempool: fix missing registration of free function
Zoltan Kiss [Wed, 20 Jul 2016 17:14:00 +0000 (18:14 +0100)]
mempool: fix missing registration of free function

The new mempool handler interface forgets to register the free() function
of the ops. Introduced in this patch:

Fixes: 449c49b93a6b ("mempool: support handler operations")

Signed-off-by: Zoltan Kiss <zoltan.kiss@schaman.hu>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
7 years agomempool: adjust name size in related data types
Zoltan Kiss [Wed, 20 Jul 2016 17:16:38 +0000 (18:16 +0100)]
mempool: adjust name size in related data types

A recent patch brought up an issue about the size of the 'name' fields:

85cf0079 mem: avoid memzone/mempool/ring name truncation

These relations should be observed:

1. Each ring creates a memzone with a prefixed name:
RTE_RING_NAMESIZE <= RTE_MEMZONE_NAMESIZE - strlen(RTE_RING_MZ_PREFIX)

2. There are some mempool handlers which create a ring with a prefixed
name:
RTE_MEMPOOL_NAMESIZE <= RTE_RING_NAMESIZE - strlen(RTE_MEMPOOL_MZ_PREFIX)

3. A mempool can create up to RTE_MAX_MEMZONE pre and postfixed memzones:
sprintf(postfix, "_%d", RTE_MAX_MEMZONE)
RTE_MEMPOOL_NAMESIZE <= RTE_MEMZONE_NAMESIZE -
strlen(RTE_MEMPOOL_MZ_PREFIX) - strlen(postfix)

Setting all of them to 32 hides this restriction from the application.
This patch decreases the mempool and ring string size to accommodate
these prefixes, but it doesn't apply the 3rd constraint. Applications
relying on these constants need to be recompiled, otherwise they'll run
into ENAMETOOLONG issues.
The size of the arrays is kept at 32 for ABI compatibility; it can be
decreased next time the ABI changes.

Signed-off-by: Zoltan Kiss <zoltan.kiss@schaman.hu>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
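
Note (illustrative, not part of the commit): a worked instance of
relations 1 and 2 above. The prefix strings "RG_" and "MP_" and the
memzone name size of 32 are assumptions made for this sketch; check the
DPDK headers for the actual values.

```c
#include <string.h>

#define DEMO_MEMZONE_NAMESIZE  32     /* assumed */
#define DEMO_RING_MZ_PREFIX    "RG_"  /* assumed ring memzone prefix */
#define DEMO_MEMPOOL_MZ_PREFIX "MP_"  /* assumed mempool prefix */

/* Relation 1: largest ring name size whose prefixed memzone name fits. */
static unsigned
ring_namesize(void)
{
	return DEMO_MEMZONE_NAMESIZE - (unsigned)strlen(DEMO_RING_MZ_PREFIX);
}

/* Relation 2: largest mempool name size whose prefixed ring name fits. */
static unsigned
mempool_namesize(void)
{
	return ring_namesize() - (unsigned)strlen(DEMO_MEMPOOL_MZ_PREFIX);
}
```

Under these assumptions the ring name size comes out as 29 and the mempool
name size as 26.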
7 years agomem: allow full length name
Zoltan Kiss [Wed, 20 Jul 2016 17:16:39 +0000 (18:16 +0100)]
mem: allow full length name

(strlen(name) == sizeof(mz->name) - 1) is a valid case, change the
condition to reflect that.
Move it earlier to avoid lookup with invalid name.
Change errno to ENAMETOOLONG.

Fixes: 85cf0079 ("mem: avoid memzone/mempool/ring name truncation")

Signed-off-by: Zoltan Kiss <zoltan.kiss@schaman.hu>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
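
Note (illustrative, not part of the commit): a minimal sketch of the
corrected boundary check, in which a name of exactly sizeof(mz->name) - 1
characters is accepted. DEMO_NAME_SIZE and check_name() are hypothetical
stand-ins for the memzone code.

```c
#include <errno.h>
#include <string.h>

#define DEMO_NAME_SIZE 32  /* stands in for sizeof(mz->name) */

/* Returns 0 if name plus its terminating NUL fits in DEMO_NAME_SIZE bytes,
 * -ENAMETOOLONG otherwise. */
static int
check_name(const char *name)
{
	/* strlen == DEMO_NAME_SIZE - 1 is the longest valid name. */
	if (strlen(name) > DEMO_NAME_SIZE - 1)
		return -ENAMETOOLONG;
	return 0;
}
```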
7 years agoeal/ppc: fix memory barrier for IBM POWER
Chao Zhu [Fri, 15 Jul 2016 02:30:19 +0000 (10:30 +0800)]
eal/ppc: fix memory barrier for IBM POWER

On weakly ordered memory architectures like POWER, rte_smp_wmb/rte_smp_rmb
need to use CPU instructions, not compiler barriers. This patch fixes
this. Also, to improve performance on PPC64, use the lightweight sync
instruction instead of the sync instruction.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
7 years agomk: fix FreeBSD build
Sergio Gonzalez Monroy [Tue, 19 Jul 2016 13:40:37 +0000 (14:40 +0100)]
mk: fix FreeBSD build

The sed syntax of '0,/regexp/' is GNU-specific and fails with
non-GNU sed on FreeBSD.

To solve the issue we can use awk instead to remove duplicates.

The awk script basically keeps the last config value, while
maintaining order and comments from original config file.

Fixes: b2063f104db7 ("mk: filter duplicate configuration entries")

Signed-off-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Acked-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>