dpdk.git
8 years agocmdline: bump library version
Thomas Monjalon [Wed, 9 Mar 2016 14:11:16 +0000 (15:11 +0100)]
cmdline: bump library version

There was an ABI change in the release 16.04.

Fixes: fb76dd26a31d ("cmdline: increase command line buffer")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
8 years agoethdev: bump library version
Thomas Monjalon [Wed, 9 Mar 2016 14:09:50 +0000 (15:09 +0100)]
ethdev: bump library version

There was an ABI change and more are coming in the release 16.04.

Fixes: a9963a86b2e1 ("ethdev: increase RETA entry size")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
8 years agoeal: check if primary process is alive
Harry van Haaren [Wed, 9 Mar 2016 13:37:25 +0000 (13:37 +0000)]
eal: check if primary process is alive

This patch adds a new function to the EAL API:
int rte_eal_primary_proc_alive(const char *path);

The function indicates whether a primary process is currently alive.
It is implemented by testing for a write lock on the config file.

The use case for this functionality is that a secondary
process can wait until a primary process starts by polling
the function. Once the primary is running, the secondary can
continue to poll to detect whether the primary process has
quit unexpectedly.
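
A minimal sketch of the lock test described above, assuming a POSIX
system. The helper name primary_alive_sketch is hypothetical and this is
not the EAL implementation; it only shows how a write lock held by
another process can be detected without taking the lock.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if another process holds a lock that
 * would block a write lock on the file at `path`, 0 if no such lock is
 * held, and -1 if the file cannot be opened or queried.
 */
static int primary_alive_sketch(const char *path)
{
	struct flock fl = {
		.l_type = F_WRLCK,    /* ask: could a write lock be taken? */
		.l_whence = SEEK_SET,
		.l_start = 0,
		.l_len = 0,           /* 0 means "to end of file" */
	};
	int fd = open(path, O_RDWR);
	int ret;

	if (fd < 0)
		return -1;
	/* F_GETLK reports only locks held by other processes. */
	ret = fcntl(fd, F_GETLK, &fl);
	close(fd);
	if (ret < 0)
		return -1;
	return fl.l_type != F_UNLCK;
}
```

A secondary process could call such a helper in a loop with a short
sleep, first waiting for 1 and later watching for a return to 0.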

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Maryam Tahhan <maryam.tahhan@intel.com>
8 years agoeal: fix race condition in multi-process startup
Harry van Haaren [Wed, 9 Mar 2016 13:37:24 +0000 (13:37 +0000)]
eal: fix race condition in multi-process startup

This patch fixes a race-condition when a primary and
secondary process simultaneously probe PCI devices.

This is implemented by moving the rte_eal_mcfg_complete()
function call in rte_eal_init() to after rte_eal_pci_probe().
The memory mapping of PCI devices in the secondary process *must*
happen after the primary has finished doing the mapping as it
relies on information written by the primary.

The end result is that the secondary process waits longer:
the primary completes its PCI probing and only then
notifies the secondary process.

This race-condition became visible during the development of
a function that allows a secondary process to poll until
a primary process exists. The secondary would then probe PCI
devices at the same time, causing an error during rte_eal_init().

Linux EAL:
Fixes: 916e4f4f4e45 ("memory: fix for multi process support")

BSD EAL:
Fixes: 764bf26873b9 ("add FreeBSD support")

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
8 years agoigb_uio: deprecate extended tag
Helin Zhang [Mon, 22 Feb 2016 03:59:45 +0000 (11:59 +0800)]
igb_uio: deprecate extended tag

Deprecate the sys files 'extended_tag' and
'max_read_request_size', which were not documented.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
8 years agopci: remove config of extended tag
Helin Zhang [Mon, 22 Feb 2016 03:59:44 +0000 (11:59 +0800)]
pci: remove config of extended tag

Remove PCI configuration of 'extended tag' and 'max read request
size', as they are not required by all devices. This lets each PMD
configure them if necessary.
In addition, 'pci_config_space_set()' is deprecated.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoi40e: enable extended tag
Helin Zhang [Mon, 22 Feb 2016 03:59:43 +0000 (11:59 +0800)]
i40e: enable extended tag

The PCIe 'Extended Tag' feature is important for 40G performance.
Enable it during each port initialization to ensure
high performance.

Signed-off-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
8 years agokeepalive: fix spacing
Harry van Haaren [Tue, 8 Mar 2016 10:50:40 +0000 (10:50 +0000)]
keepalive: fix spacing

This patch removes double newlines between functions
in keepalive.[hc] aligning it with the rest of DPDK.

Fixes: 75583b0d1efd ("eal: add keep alive monitoring")

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
8 years agokeepalive: set timestamp on core registration
Harry van Haaren [Tue, 8 Mar 2016 10:50:39 +0000 (10:50 +0000)]
keepalive: set timestamp on core registration

This patch sets a timestamp on each lcore when it is registered
for keepalive. This causes the first values read by the monitor
to show time since the core was registered, instead of the delta
between 0 and the timestamp counter.

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
8 years agodoc: fix keepalive sample app guide
Harry van Haaren [Tue, 8 Mar 2016 10:50:38 +0000 (10:50 +0000)]
doc: fix keepalive sample app guide

This patch fixes some mismatches between the keepalive code
and the docs. Struct names and descriptions were not in line
with the codebase.

Fixes: e64833f2273a ("examples/l2fwd-keepalive: add sample application")

Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
8 years agomaintainers: claim responsibility for igb_uio
Ferruh Yigit [Fri, 4 Mar 2016 15:07:11 +0000 (15:07 +0000)]
maintainers: claim responsibility for igb_uio

igb_uio has no maintainer, so claim responsibility for it.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Helin Zhang <helin.zhang@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoigb_uio: cast private data to correct struct type
Ferruh Yigit [Thu, 3 Mar 2016 17:08:19 +0000 (17:08 +0000)]
igb_uio: cast private data to correct struct type

This was working fine because the addresses of the two structs are the same:

struct A {
	struct B b;
} a;

In the sample above, "a" and "b" have the same address.

Now cast the private data back to the correct struct type, the one
that was stored.
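
The address equivalence described above can be sketched as follows. The
struct and function names are hypothetical stand-ins, not the igb_uio
types. The sketch shows why a cast to the wrong type happens to work
when the inner struct is the first member, and a recovery based on
offsetof() that does not depend on member position.

```c
#include <stddef.h>

struct info_sketch {            /* stands in for the inner struct B */
	int irq;
};

struct dev_sketch {             /* stands in for the outer struct A */
	struct info_sketch info;    /* first member, so its offset is 0 */
	int mode;
};

/* Recover the outer struct from a pointer to its embedded member. */
static struct dev_sketch *dev_from_info(struct info_sketch *info)
{
	return (struct dev_sketch *)
		((char *)info - offsetof(struct dev_sketch, info));
}
```

If a member were later added before "info", a plain cast between the
two pointer types would silently break, while the offset-based form
would still return the right address.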

Fixes: af75078fece3 ("first public release")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agoigb_uio: use macros for array size calculation
Ferruh Yigit [Fri, 4 Mar 2016 11:21:16 +0000 (11:21 +0000)]
igb_uio: use macros for array size calculation

Minor code cleanup.
Remove array size calculations and remove unnecessary assignment.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agodoc: fix number of supported bonding modes
Ferruh Yigit [Mon, 7 Mar 2016 10:25:36 +0000 (10:25 +0000)]
doc: fix number of supported bonding modes

The document mentions 4 bonding modes, but there are more.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
8 years agodoc: add known clang compilation issue
Pablo de Lara [Thu, 18 Feb 2016 15:21:09 +0000 (15:21 +0000)]
doc: add known clang compilation issue

Add known issue about DPDK not compiling on some CPUs
with clang versions older than 3.7.0.

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
8 years agodoc: comment unsupported ixgbe malicious driver detection
Wenzhuo Lu [Mon, 7 Mar 2016 05:28:10 +0000 (13:28 +0800)]
doc: comment unsupported ixgbe malicious driver detection

Announce that Malicious Driver Detection is not supported.

Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Shaopeng He <shaopeng.he@intel.com>
8 years agodoc: fix references in sample apps guide
Mauricio Vasquez B [Thu, 25 Feb 2016 17:02:27 +0000 (18:02 +0100)]
doc: fix references in sample apps guide

Many references in the sample applications user guide are wrong because
they are hard-coded and section numbers have changed over time.
This patch changes those references to dynamic ones, so that if
section numbers change, the references get updated automatically.

Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
8 years agoexamples/ip_pipeline: add link identification
Fan Zhang [Tue, 1 Mar 2016 10:35:52 +0000 (10:35 +0000)]
examples/ip_pipeline: add link identification

This patch adds a link identification feature to packet framework. To
identify a link, the user can either use the existing port-mask option or
specify a PCI device in every LINK section in the configuration file.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agoexamples/ip_pipeline: measure CPU utilization
Fan Zhang [Mon, 22 Feb 2016 14:07:04 +0000 (14:07 +0000)]
examples/ip_pipeline: measure CPU utilization

This patch adds CPU utilization measurement and idle cycle rate
computation to packet framework. The measurement is done by measuring
the cycles spent while a thread pulls zero packets from the RX queue. These
cycles are treated as idle cycles (or headroom). A CLI command is added
to display the idle cycle rate of a specific thread. The CLI command format
is as follows:

t <thread_id> headroom

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agoexamples/ip_pipeline: clean config parser
Fan Zhang [Wed, 17 Feb 2016 11:14:11 +0000 (11:14 +0000)]
examples/ip_pipeline: clean config parser

This patch updates the pipeline configuration file parser, cleans up nested
if/else conditions, and adds clearer error messages.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agoexamples/ip_pipeline: fix CPU socket id
Jasvinder Singh [Wed, 27 Jan 2016 11:47:09 +0000 (11:47 +0000)]
examples/ip_pipeline: fix CPU socket id

This patch fixes the socket-id error in ip_pipeline sample
application running over uni-processor systems.

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agoport: fix crash for ring writer nodrop
Jasvinder Singh [Wed, 2 Mar 2016 21:19:58 +0000 (21:19 +0000)]
port: fix crash for ring writer nodrop

Error log:
 [APP] Initializing PIPELINE0 ...
 pipeline> [APP] Initializing PIPELINE1 ...
 [PIPELINE1] Pass-through
 [APP] Initializing PIPELINE2 ...
 [PIPELINE2] Pass-through
 Segmentation fault (core dumped)

Fixes: 5f4cd47309d6 ("port: add ring writer nodrop")
Fixes: d58f69c54172 ("port: add ring multi reader or writer")

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
8 years agoport: fix crash for ethdev writer nodrop
Jasvinder Singh [Wed, 2 Mar 2016 21:19:29 +0000 (21:19 +0000)]
port: fix crash for ethdev writer nodrop

Error log:
 [APP] Initializing PIPELINE0 ...
 pipeline> [APP] Initializing PIPELINE1 ...
 [PIPELINE1] Pass-through
 Segmentation fault (core dumped)

Fixes: 304c8091e90a ("port: add ethdev writer nodrop")

Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agodoc: add gcc-multilib as linux package hint
Harry van Haaren [Tue, 16 Feb 2016 13:40:03 +0000 (13:40 +0000)]
doc: add gcc-multilib as linux package hint

When compiling for i686 targets compilation could fail
if the 32bit libc6-dev package is not installed. The
gcc-multilib package is a meta-package that will pull
in the necessary dependencies, making setup easier for
beginners.

Reported-by: Weichun Chen <weichunx.chen@intel.com>
Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
Acked-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
8 years agoapp/testpmd: fix numa socket detection
Stephen Hurd [Wed, 13 Jan 2016 22:23:36 +0000 (14:23 -0800)]
app/testpmd: fix numa socket detection

Previously, max_socket was set to the highest numbered socket with
an enabled lcore.  The intent is to set it to the highest socket
regardless of it being enabled.

Fixes: 7acf894d07d1 ("app/testpmd: detect numa socket count")

Signed-off-by: Stephen Hurd <shurd@broadcom.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoapp/testpmd: fix error message when setting Tx VLAN
Wang Xiao W [Fri, 5 Feb 2016 04:50:23 +0000 (12:50 +0800)]
app/testpmd: fix error message when setting Tx VLAN

When using testpmd, sometimes we forget the right order of port_id and
vid in "tx_vlan set (port_id) vlan_id[, vlan_id_outer]\n" command, and
input "tx_vlan set 51 0", we'll get a strange prompt saying "Error, as
QinQ has been enabled.".

In cmd_tx_vlan_set_parsed function, the first thing we do is checking
the port's vlan_offload capability, rather than checking validity of the
port_id, therefore if it's an invalid port_id we'll get the above wrong
message. We should always make sure that we get a valid port_id before
we do other things.

The same issue applies to the cmd_tx_vlan_set_qinq_parsed function.

Fixes: 92ebda07ee58 ("app/testpmd: add qinq stripping and insertion")

Signed-off-by: Wang Xiao W <xiao.w.wang@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoeal/arm: check support of armv8.1 atomics
Jerin Jacob [Wed, 2 Mar 2016 13:20:59 +0000 (18:50 +0530)]
eal/arm: check support of armv8.1 atomics

armv8.1 adds support for new atomic instructions.
From Linux kernel v4.3 onwards, the presence of atomic instruction
support can be detected through HWCAP_ATOMICS.
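
A minimal sketch of such a check, assuming Linux with glibc's
getauxval(). The helper name is hypothetical, and the fallback value of
HWCAP_ATOMICS (bit 8) is taken from the arm64 kernel headers. On any
other platform the helper simply reports 0.

```c
#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#ifndef HWCAP_ATOMICS
#define HWCAP_ATOMICS (1UL << 8)  /* value from the arm64 kernel headers */
#endif
#endif

/* Hypothetical helper: 1 if the kernel reports ARMv8.1 atomics, else 0. */
static int have_armv81_atomics(void)
{
#if defined(__linux__) && defined(__aarch64__)
	/* AT_HWCAP is the hardware capability bitmask passed by the kernel. */
	return (getauxval(AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
	return 0;  /* not arm64 Linux: nothing to report */
#endif
}
```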

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
8 years agoconfig: remove EAL flags for OS environment
Thomas Monjalon [Fri, 4 Mar 2016 23:02:02 +0000 (00:02 +0100)]
config: remove EAL flags for OS environment

CONFIG_RTE_LIBRTE_EAL_*APP can be replaced by CONFIG_RTE_EXEC_ENV_*APP.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Keith Wiles <keith.wiles@intel.com>
8 years agoconfig: remove duplicate information
Keith Wiles [Fri, 4 Mar 2016 18:11:12 +0000 (12:11 -0600)]
config: remove duplicate information

To clean up the configuration files and reduce the amount of
duplicate configuration information, add a new file called
common_base which contains just about all of the configuration
lines in one place. The common_bsdapp and common_linuxapp files
then include this one file, and each OS specific file adds only
its delta configuration lines.

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoconfig: fix missing 64-bit flag on FreeBSD
Keith Wiles [Fri, 4 Mar 2016 17:03:26 +0000 (11:03 -0600)]
config: fix missing 64-bit flag on FreeBSD

Until now, the generic 64-bit flag was used only for ARM or Linux,
and was not defined for BSD environment.

Fixes: d05e7115f466 ("mem: support layout of IBM Power")

Signed-off-by: Keith Wiles <keith.wiles@intel.com>
8 years agodoc: tidy sections in release notes
Thomas Monjalon [Fri, 4 Mar 2016 15:44:26 +0000 (16:44 +0100)]
doc: tidy sections in release notes

Fixes: 5499c1fc9baa ("examples/vhost: fix mbuf allocation")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoethdev: fix byte order consistency of flow director
Jingjing Wu [Mon, 1 Feb 2016 02:48:21 +0000 (10:48 +0800)]
ethdev: fix byte order consistency of flow director

Fix a byte order issue in the ethdev library: the structures
for setting fdir's mask and flow entry were inconsistent. Mask
inputs are now in big endian.

Fixes: 2d4c1a9ea2ac ("ethdev: add new flow director masks")
Fixes: 76c6f89e80d4 ("ixgbe: support new flow director masks")

Reported-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Jingjing Wu <jingjing.wu@intel.com>
Acked-by: Zhe Tao <zhe.tao@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
8 years agolpm: merge tbl24 and tbl8 structures
Bruce Richardson [Tue, 24 Nov 2015 14:25:56 +0000 (14:25 +0000)]
lpm: merge tbl24 and tbl8 structures

The tbl8 and tbl24 structures were essentially identical except for
slightly different names for one or two fields. Merge these two
structures into a single structure definition.

Two fields have been renamed as part of this change: the
"ext_entry" field in the tbl24 has been renamed to "valid_group" to match
the tbl8 value to make the merge easier, and the "tbl8_gindex" field has
been renamed to "group_idx". The "valid_group" field now serves two
purposes: in a tbl8 it indicates if the group, i.e. the tbl8, is valid,
and in a tbl24, it indicates if the "group_idx" is valid, i.e. whether
the value is a next_hop or a tbl8 index. [The name "group_idx" was used
to make this latter link between the fields clearer]
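
The dual role of "valid_group" can be sketched as below. The layout is
hypothetical: the field names come from the text above, but the field
widths and the union are assumptions made for illustration, not the
librte_lpm definition.

```c
#include <stdint.h>

/* Hypothetical merged entry; field widths are assumptions. */
struct lpm_entry_sketch {
	union {
		uint8_t next_hop;   /* meaningful when valid_group == 0 */
		uint8_t group_idx;  /* tbl8 group index when valid_group == 1 */
	};
	uint8_t valid       : 1;    /* entry holds usable data */
	uint8_t valid_group : 1;    /* tbl24: value is a tbl8 index */
	uint8_t depth       : 6;    /* prefix length of the rule */
};

/* For a tbl24 entry: does it point to a tbl8 group, not a next hop? */
static int tbl24_points_to_tbl8(const struct lpm_entry_sketch *e)
{
	return e->valid && e->valid_group;
}
```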

Suggested-by: Vladimir Medvedkin <medvedkinv@gmail.com>
Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
8 years agombuf: get DMA address
Ravi Kerur [Fri, 4 Mar 2016 09:09:40 +0000 (10:09 +0100)]
mbuf: get DMA address

Macros RTE_MBUF_DATA_DMA_ADDR and RTE_MBUF_DATA_DMA_ADDR_DEFAULT
are defined in each PMD driver file. Convert macros to inline
functions and move them to common lib/librte_mbuf/rte_mbuf.h file.
PMD drivers include the rte_mbuf.h file directly or indirectly, hence no
additional header file inclusion is necessary.
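
A sketch of the macro-to-function conversion, under stated assumptions:
the struct below is a trimmed-down stand-in rather than the real mbuf
layout, and the address formula (buffer address plus data offset) is an
assumption about what the per-driver macros computed.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the mbuf. */
struct mbuf_sketch {
	uint64_t buf_physaddr;  /* physical address of the buffer start */
	uint16_t data_off;      /* offset of packet data within the buffer */
};

/* Macro style, as a driver might define it locally:
 *   #define DATA_DMA_ADDR(mb) ((mb)->buf_physaddr + (mb)->data_off)
 */

/* Function style: the argument is type-checked and evaluated once. */
static inline uint64_t mbuf_data_dma_addr_sketch(const struct mbuf_sketch *mb)
{
	return mb->buf_physaddr + mb->data_off;
}
```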

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
8 years agocmdline: fix missing include
Marc Sune [Wed, 2 Mar 2016 23:52:02 +0000 (00:52 +0100)]
cmdline: fix missing include

cmdline_parse_*.h headers use struct cmdline_token_hdr /
cmdline_parse_token_hdr_t which is defined in cmdline_parse.h, but
do not include it, forcing manual inclusion.

This commit includes cmdline_parse.h in all cmdline_parse_*.h.

Signed-off-by: Marc Sune <marcdevel@gmail.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
8 years agomlx5: increase RETA table size
Nelio Laranjeiro [Tue, 12 Jan 2016 10:49:09 +0000 (11:49 +0100)]
mlx5: increase RETA table size

ConnectX-4 NICs can handle at most 512 entries in RETA table.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
8 years agoethdev: increase RETA entry size
Nelio Laranjeiro [Tue, 12 Jan 2016 10:49:08 +0000 (11:49 +0100)]
ethdev: increase RETA entry size

Several NICs can handle 512 entries/queues in their RETA table,
an 8 bit field is not large enough for them.

Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
8 years agocmdline: increase command line buffer
Nelio Laranjeiro [Tue, 12 Jan 2016 10:49:07 +0000 (11:49 +0100)]
cmdline: increase command line buffer

Allow long command lines in testpmd (like flow director with IPv6, ...).

Signed-off-by: John McNamara <john.mcnamara@intel.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
8 years agoconfig: enable virtio for ARM
Santosh Shukla [Tue, 1 Mar 2016 10:02:19 +0000 (15:32 +0530)]
config: enable virtio for ARM

Remove _VIRTIO_PMD=n from the arch config and let the arch use _VIRTIO_PMD
from config/common_linuxapp.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: restrict vector Rx/Tx to x86 SSSE3
Santosh Shukla [Tue, 1 Mar 2016 10:02:18 +0000 (15:32 +0530)]
virtio: restrict vector Rx/Tx to x86 SSSE3

Temporary implementation to let virtio operate in non-vec mode on archs
which don't support the _ssse_ cpuflag.

todo:
1) Move virtio_recv_pkts_vec() implementation to
   drivers/virtio/virtio_vec_<arch>.h file.
2) Remove use_simple_rxtx flag, so that virtio/virtio_vec_<arch>.h
   files to provide vectored/non-vectored rx/tx apis.

Fixes: fc3d66212fed ("virtio: add vector Rx")
Fixes: c121c8d6d31a ("virtio: add simple Tx")
Fixes: 8d8393fb1861 ("virtio: pick simple Rx/Tx")

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal/linux: change hugepage sorting to avoid overlapping memcpy
Ralf Hoffmann [Thu, 7 Jan 2016 14:54:02 +0000 (15:54 +0100)]
eal/linux: change hugepage sorting to avoid overlapping memcpy

With only one hugepage or already sorted hugepage addresses, the sort
function called memcpy with same src and dst pointer. Debugging with
valgrind will issue a warning about overlapping area. This patch changes
the sort method to qsort to avoid this behavior. The separate sort
function is no longer necessary.
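
A minimal sketch of sorting with qsort(), assuming the pages are ordered
by a 64-bit address. The struct and function names are hypothetical; the
real hugepage struct has more fields.

```c
#include <stdint.h>
#include <stdlib.h>

struct hugepage_sketch {        /* hypothetical stand-in */
	uint64_t addr;
};

/* qsort() comparator: ascending order by address. */
static int cmp_addr(const void *a, const void *b)
{
	const struct hugepage_sketch *p1 = a;
	const struct hugepage_sketch *p2 = b;

	/* Compare instead of subtracting: the difference may not fit an int. */
	if (p1->addr < p2->addr)
		return -1;
	return p1->addr > p2->addr;
}

static void sort_by_addr(struct hugepage_sketch *pages, size_t n)
{
	qsort(pages, n, sizeof(pages[0]), cmp_addr);
}
```

The caller no longer copies elements itself, so the hand-written
self-copy that valgrind flagged is gone.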

Suggested-by: Jay Rolette <rolette@infiniteio.com>
Signed-off-by: Ralf Hoffmann <ralf.hoffmann@allegro-packets.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
8 years agoeal/linux: fix build with hpet
Yi Lu [Thu, 28 Jan 2016 14:16:40 +0000 (14:16 +0000)]
eal/linux: fix build with hpet

Fix compile error when enabling CONFIG_RTE_LIBEAL_USE_HPET.

Error messages:
lib/librte_eal/linuxapp/eal/eal_timer.c: In function ‘rte_eal_hpet_init’:
lib/librte_eal/linuxapp/eal/eal_timer.c:222:2: error:
implicit declaration of function ‘rte_thread_setname’

Fixes: badb3688ffa8 ("eal/linux: fix build with glibc < 2.12")

Signed-off-by: Yi Lu <luyi68@live.com>
Acked-by: David Marchand <david.marchand@6wind.com>
8 years agoeal: fix symbol map version number
Thomas Monjalon [Tue, 1 Mar 2016 07:45:38 +0000 (08:45 +0100)]
eal: fix symbol map version number

The version 2.3 has been renamed 16.04.

Fixes: 6d7de6d2e357 ("version: switch to year.month numbers")

Reported-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agomk: fix error message
Thomas Monjalon [Wed, 2 Mar 2016 15:35:29 +0000 (16:35 +0100)]
mk: fix error message

When specifying a wrong directory with RTE_SDK and RTE_TARGET
to build an application, the error message about missing config
file was wrong.

Fixes: 6b62a72a70d0 ("mk: install a standard cutomizable tree")

Reported-by: Steeven Lee <steeven@gmail.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agomk: stop on warning only in developer build
Panu Matilainen [Thu, 12 Feb 2015 15:18:20 +0000 (17:18 +0200)]
mk: stop on warning only in developer build

Add RTE_DEVEL_BUILD make-variable which can be used to do things
differently when doing development vs building a release,
autodetected from source root .git presence and overridable via
commandline. It is used to enable the -Werror compiler flag and may
be extended to other checks.

Failing the build on warnings is a useful developer tool but it's bad
for release tarballs which can and do get built with newer
compilers than what was used/available during development. Compilers
routinely add new warnings so code which built silently with cc X
might no longer do so with X+1. This doesn't make the existing code
any buggier and failing the build in this case does not help
to improve the quality of an already released version either.

This changes the default flags, which can be tuned with EXTRA_CFLAGS.

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
8 years agomk: replace the combined library with a linker script
Panu Matilainen [Tue, 24 Nov 2015 14:31:17 +0000 (16:31 +0200)]
mk: replace the combined library with a linker script

The physically linked-together combined library has been an increasing
source of problems, as was predicted when library and symbol versioning
was introduced. Replace the complex and fragile construction with a
simple linker script which achieves the same without all the problems,
remove the related kludges from eg mlx drivers.

Since creating the linker script is practically zero cost, remove the
config option and just create it always.

Based on a patch by Sergio Gonzalez Monroy; the linker script approach
was initially suggested by Neil Horman.

Suggested-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Suggested-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agohash: fix CRC32c computation
Didier Pallard [Fri, 19 Feb 2016 11:00:31 +0000 (12:00 +0100)]
hash: fix CRC32c computation

Fix crc32c hash functions to return a valid crc32c value for
data lengths not multiple of 4 bytes.
ARM code is not tested.
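
For comparison, a bit-at-a-time CRC32c reference can be sketched as
below. This is not the DPDK hash function: the function name is
hypothetical, and the initial value and final inversion follow the
standard CRC-32C convention, which is an assumption here. Because it
consumes one byte at a time, lengths that are not a multiple of 4 need
no special case.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c_sketch(const void *data, size_t len)
{
	const uint8_t *p = data;
	uint32_t crc = 0xFFFFFFFFu;     /* standard initial value */

	while (len--) {
		crc ^= *p++;
		for (int bit = 0; bit < 8; bit++) {
			/* Shift right; apply the polynomial if the low bit was set. */
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
		}
	}
	return ~crc;                    /* standard final inversion */
}
```

With these conventions the 9-byte string "123456789" yields the
well-known check value 0xE3069283.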

Fixes: af75078fece3 ("first public release")

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoapp/test: fix CRC hash values
Didier Pallard [Fri, 19 Feb 2016 11:00:30 +0000 (12:00 +0100)]
app/test: fix CRC hash values

Add some small key lengths (below 4 bytes), and fix odd key lengths
expected returned values for CRC computation to match real CRC values.

Fixes: 6298d2c55ae8 ("app/test: add new functional tests for hash functions")

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agombuf: provide bulk allocation
Huawei Xie [Sun, 28 Feb 2016 12:44:56 +0000 (20:44 +0800)]
mbuf: provide bulk allocation

rte_pktmbuf_alloc_bulk allocates a bulk of packet mbufs.

There is a related thread about this bulk API.
http://dpdk.org/dev/patchwork/patch/4718/
Thanks to Konstantin's loop unrolling.

Attached the wiki page about duff's device. It explains the performance
optimization through loop unwinding, and also the most dramatic use of
case label fall-through.
https://en.wikipedia.org/wiki/Duff%27s_device

In this implementation, while() loop is used because we could not assume
count is strictly positive. Using while() loop saves one line of check.
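
The loop shape described above can be sketched as follows. The function
name and the placeholder assignment are hypothetical; the real code
initializes mbufs rather than writing 1 into an int array. The point is
the switch that jumps into the middle of a while() loop, which handles
count == 0 without a separate check.

```c
#include <stddef.h>

/* Set items[0..count-1] to 1 with a 4-way unrolled Duff's device. */
static void init_all_sketch(int *items, unsigned int count)
{
	unsigned int idx = 0;

	switch (count % 4) {
	case 0:
		while (idx != count) {
			items[idx++] = 1;
			/* fall through */
	case 3:
			items[idx++] = 1;
			/* fall through */
	case 2:
			items[idx++] = 1;
			/* fall through */
	case 1:
			items[idx++] = 1;
		}
	}
}
```

When count is 0, the switch enters at case 0 and the while() condition
is false immediately, so nothing is written.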

Signed-off-by: Gerald Rogers <gerald.rogers@intel.com>
Signed-off-by: Huawei Xie <huawei.xie@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
8 years agovhost: broadcast RARP by injecting in receiving mbuf array
Yuanhan Liu [Mon, 22 Feb 2016 14:36:11 +0000 (22:36 +0800)]
vhost: broadcast RARP by injecting in receiving mbuf array

Broadcast RARP packet by injecting it to receiving mbuf array at
rte_vhost_dequeue_burst().

Commit 33226236a35e ("vhost: handle request to send RARP") iterates over
all host interfaces and then broadcasts the packet on all of them.  It did
notify the switches about the new location of the migrated VM; however,
the mac learning table in the target host is wrong (at least in my
test with OVS):

    $ ovs-appctl fdb/show ovsbr0
     port  VLAN  MAC                Age
        1     0  b6:3c:72:71:cd:4d   10
    LOCAL     0  b6:3c:72:71:cd:4e   10
    LOCAL     0  52:54:00:12:34:68    9
        1     0  56:f6:64:2c:bc:c0    1

Where 52:54:00:12:34:68 is the mac of the VM. As you can see from the
above, the port learned is "LOCAL", which is the "ovsbr0" port. That
is reasonable, since we indeed send the pkt by the "ovsbr0" interface.

The wrong mac table leads all the packets for the VM to go to "ovsbr0"
in the end, which ends up with all packets being lost, until the guest
sends an ARP request (or reply) to refresh the mac learning table.

Jianfeng then came up with a solution that I had thought of first but NAKed
myself, out of concern that it had potential issues [0]. The solution is as
the title states: broadcast the RARP packet by injecting it into the
receiving mbuf array at rte_vhost_dequeue_burst(). Seeing the idea raised
again made me think twice; it now looked like a false concern to me. I have
done a rough verification: it worked as expected.

[0]: http://dpdk.org/ml/archives/dev/2016-February/033527.html

Another note is that while preparing this version, I found that DPDK has
some ARP related structures and macros defined. So, use them instead of
the one from standard header files here.

Cc: Thibaut Collet <thibaut.collet@6wind.com>
Suggested-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoconfig: use unaligned types for ARMv7
Jan Viktorin [Wed, 9 Dec 2015 15:16:17 +0000 (16:16 +0100)]
config: use unaligned types for ARMv7

This patch reduces number of warnings from 53 to 40.
It removes the usual false positives utilizing unaligned_uint*_t data types.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
8 years agolog: add missing symbols
Stephen Hemminger [Thu, 17 Dec 2015 00:38:34 +0000 (16:38 -0800)]
log: add missing symbols

rte_get_log_type and rte_get_log_level functions have been available
for many versions. But they are missing from the shared library map
and therefore do not get exported correctly.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoexamples/l3fwd: rework exact-match
Tomasz Kulasek [Mon, 29 Feb 2016 10:33:07 +0000 (11:33 +0100)]
examples/l3fwd: rework exact-match

The current implementation of Exact-Match uses a different execution path
than LPM. Unifying them allows reusing a big part of the LPM code and
slightly increases the performance of Exact-Match.

Main changes:
-------------
* Packet classification stage is separated from the rest of path for both
  LPM and EM.
* Packet processing, modifying and transmit part is the same for LPM and EM
  and mostly based on the current LPM implementation.
* Shared code is moved to the common file "l3fwd_sse.h".
* Since sequential packet classification in the EM path seems to be faster
  than the multi hash lookup used before, it is used by default. The old
  implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled
  with the HASH_LOOKUP_MULTI global define at compilation time.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agocfgfile: support looking up sections by index
Rich Lane [Thu, 25 Feb 2016 20:43:03 +0000 (12:43 -0800)]
cfgfile: support looking up sections by index

This is useful when sections have duplicate names.

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agojobstats: add abort function
Marcin Kerlin [Fri, 12 Feb 2016 16:04:41 +0000 (17:04 +0100)]
jobstats: add abort function

This patch adds a new function, rte_jobstats_abort.
It marks *job* as finished, and the time spent on this work is added to
management time instead of execution time.
This function should be used instead of rte_jobstats_finish when a condition
defined by the application occurs, for example when receiving n>0 packets.
An example of usage is added to the l2fwd-jobstats example.
At maximum load, the do-while loop inside the Idle job is executed only once
because one or more jobs are waiting to be executed, so this time is excluded
from the execution time by calling rte_jobstats_abort().

Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
8 years agomk: fix armv7 machine name
Jan Viktorin [Tue, 16 Feb 2016 18:35:06 +0000 (19:35 +0100)]
mk: fix armv7 machine name

The CONFIG_RTE_MACHINE must not contain hyphens to work correctly. This was
initially done only for the file name defconfig_arm-armv7a-linuxapp-gcc. This
patch fixes install-sdk goal. Otherwise, it creates a wrong directory for this
platform.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
8 years agoexamples/vhost: fix out of sequence packets
Jianfeng Tan [Tue, 19 Jan 2016 19:18:11 +0000 (03:18 +0800)]
examples/vhost: fix out of sequence packets

Issue description: when packets go through vhost example to virtio
device and come back to another virtio device or physical NIC, the
sequence of packets will be changed.

Reported-by: Thomas Long <thomas.long@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: fix mbuf allocation
Jianfeng Tan [Thu, 18 Feb 2016 00:08:39 +0000 (08:08 +0800)]
examples/vhost: fix mbuf allocation

How to reproduce:

1. Start vhost-switch
./examples/vhost/build/vhost-switch -c 0x3 -n 4 -- -p 1 --stat 0
2. Start VM with a virtio port
$ $QEMU -smp cores=2,sockets=1 -m 4G -cpu host -enable-kvm \
  -chardev socket,id=char1,path=<path to vhost-user socket> \
  -device virtio-net-pci,netdev=vhostuser1 \
  -netdev vhost-user,id=vhostuser1,chardev=char1 \
  -object memory-backend-file,id=mem,size=4G,mem-path=<hugetlbfs path>,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -hda <path to VM img>
3. Start l2fwd in VM
$ ./examples/l2fwd/build/l2fwd -c 0x1 -n 4 -m 1024 -- -p 0x1
4. Use ixia to inject packets in a small data bit rate.

Error:

vhost-switch keeps printing error message:
failed to allocate memory for mbuf.

Root cause:

The number of mbufs allocated for a port is calculated by the formula below.
NUM_MBUFS_PER_PORT = ((MAX_QUEUES*RTE_TEST_RX_DESC_DEFAULT) + \
(num_switching_cores*MAX_PKT_BURST) + \
(num_switching_cores*RTE_TEST_TX_DESC_DEFAULT) +\
(num_switching_cores*MBUF_CACHE_SIZE))
Suppose num_switching_cores is 1 and MBUF_CACHE_SIZE is 128.
When initializing the port, the master core fills its mbuf mempool cache,
so some mbufs are left in that cache, for example 121.
So the total number of mbufs which can be used is:
(MAX_PKT_BURST + MBUF_CACHE_SIZE - 121) = (32 + 128 - 121) = 39.
What makes it worse is that there is a buffer to store mbufs
(which will be tx_burst to the physical port); if it occupies some mbufs,
there may be fewer than 32 mbufs left, so vhost dequeue prints out
this message.

In all, the formula fails to include the master core's mbuf mempool cache.
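
The arithmetic above can be checked with a small sketch. The constants 32 and 128 and the figure 121 are the ones quoted in this message; the function names are hypothetical, and the rule that a dequeue needs a full burst of MAX_PKT_BURST free mbufs is an assumption drawn from the "< 32 mbufs left" remark.

```c
#include <assert.h>

/* Values quoted in the commit message. */
enum { MAX_PKT_BURST = 32, MBUF_CACHE_SIZE = 128 };

/* mbufs left for the switching core once the master core's cache holds some. */
static int usable_mbufs(int cached_on_master)
{
    assert(cached_on_master >= 0);
    return MAX_PKT_BURST + MBUF_CACHE_SIZE - cached_on_master;
}

/* Assumption: a dequeue wants a full burst, so it can fail below 32 free mbufs. */
static int dequeue_can_fail(int cached_on_master, int held_in_tx_buffer)
{
    return usable_mbufs(cached_on_master) - held_in_tx_buffer < MAX_PKT_BURST;
}
```

With 121 mbufs parked in the master core's cache, usable_mbufs(121) gives 39, and holding as few as 8 of them in the TX buffer already drops the count below one burst.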

Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
8 years agoexamples/l3fwd: modularize
Ravi Kerur [Thu, 25 Feb 2016 10:24:24 +0000 (11:24 +0100)]
examples/l3fwd: modularize

The main problem with l3fwd is that it is too monolithic with everything
being in one file, and the various options all controlled by compile time
flags. This means that it's hard to read and understand, and when making
any changes, you need to go to a lot of work to try and ensure you cover
all the code paths, since a compile of the app will not touch large parts
of the l3fwd codebase.

The following changes were made to fix the issues mentioned above:

- Split out the various lpm and hash specific functionality into separate
  files, so that l3fwd code has one file for common code e.g. args
  processing, mempool creation, and then individual files for the various
  forwarding approaches.

  The new files are:
  main.c (Common code for args processing, mempool creation, etc)
  l3fwd_em.c (Hash/Exact match aka 'EM' functionality)
  l3fwd_em_sse.h (SSE4_1 buffer optimized 'EM' code)
  l3fwd_lpm.c (Longest Prefix Match aka 'LPM' functionality)
  l3fwd_lpm_sse.h (SSE4_1 buffer optimized 'LPM' code)
  l3fwd.h (Common include for 'EM' and 'LPM')

- The choosing of the lpm/hash path should be done at runtime, not
  compile time, via a command-line argument. This will ensure that
  both code paths get compiled in a single go

  The following examples show the runtime options provided:

  Select 'LPM' or 'EM' at run time, e.g.
                > l3fwd -c 0x1 -n 1 -- -p 0x1 -E ... (EM)
                > l3fwd -c 0x1 -n 1 -- -p 0x1 -L ... (LPM)
  Options "E" and "L" are mutually exclusive.
  If neither is selected, "L" is the default.
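
The selection rule described above (either -E or -L, never both, LPM by default) could be sketched as follows. This is not the l3fwd argument parser: enum fwd_mode and select_mode() are hypothetical names, and the sketch scans the application arguments by hand instead of using the option parsing the sample really relies on.

```c
#include <assert.h>
#include <string.h>

enum fwd_mode { FWD_LPM, FWD_EM, FWD_INVALID };

/* Scan the application arguments for -E / -L and pick a forwarding mode. */
static enum fwd_mode select_mode(int argc, const char *const argv[])
{
    int want_em = 0;
    int want_lpm = 0;

    assert(argc >= 1 && argv != NULL);
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-E") == 0)
            want_em = 1;
        else if (strcmp(argv[i], "-L") == 0)
            want_lpm = 1;
    }
    if (want_em && want_lpm)
        return FWD_INVALID;            /* the options are mutually exclusive */
    return want_em ? FWD_EM : FWD_LPM; /* LPM when neither is given */
}
```

Because both paths are chosen at run time, a single build contains the EM and the LPM code.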

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Tested-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agoethdev: support unidirectional configuration
Reshma Pattan [Tue, 5 Jan 2016 16:34:58 +0000 (16:34 +0000)]
ethdev: support unidirectional configuration

The user should be able to configure an ethdev with zero rx queues or zero
tx queues, but not with both set to zero.
After this change, rte_eth_dev_tx_queue_config and
rte_eth_dev_rx_queue_config allocate memory for rx/tx queues only
when the number of rx/tx queues is nonzero.
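
A rough sketch of that rule, assuming a made-up struct toy_dev and toy_dev_configure() rather than the real ethdev structures: reject the configuration only when both counts are zero, and allocate a queue array only for a nonzero count.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical device state; not struct rte_eth_dev. */
struct toy_dev {
    void **rx_queues;
    void **tx_queues;
    unsigned nb_rx_q;
    unsigned nb_tx_q;
};

static int toy_dev_configure(struct toy_dev *dev, unsigned nb_rx_q,
                             unsigned nb_tx_q)
{
    assert(dev != NULL);
    if (nb_rx_q == 0 && nb_tx_q == 0)
        return -EINVAL;                 /* a port with no queues is useless */

    dev->rx_queues = NULL;
    dev->tx_queues = NULL;
    if (nb_rx_q != 0) {                 /* allocate only when needed */
        dev->rx_queues = calloc(nb_rx_q, sizeof(*dev->rx_queues));
        if (dev->rx_queues == NULL)
            return -ENOMEM;
    }
    if (nb_tx_q != 0) {
        dev->tx_queues = calloc(nb_tx_q, sizeof(*dev->tx_queues));
        if (dev->tx_queues == NULL) {
            free(dev->rx_queues);
            dev->rx_queues = NULL;
            return -ENOMEM;
        }
    }
    dev->nb_rx_q = nb_rx_q;
    dev->nb_tx_q = nb_tx_q;
    return 0;
}
```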

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agocryptodev: allow full control from secondary process
Reshma Pattan [Tue, 5 Jan 2016 16:34:57 +0000 (16:34 +0000)]
cryptodev: allow full control from secondary process

The macro RTE_PROC_PRIMARY_OR_ERR_RET is blocking the secondary process from
using the API. API access should be given to both secondary and primary.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agoethdev: allow full control from secondary process
Reshma Pattan [Tue, 5 Jan 2016 16:34:56 +0000 (16:34 +0000)]
ethdev: allow full control from secondary process

Macros RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET
are blocking the secondary process from using the APIs.
API access should be given to both secondary and primary.

Reported-by: Sean Harte <sean.harte@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agodoc: fix Linux version required by QAT driver
John Griffin [Wed, 10 Feb 2016 23:28:01 +0000 (23:28 +0000)]
doc: fix Linux version required by QAT driver

Fixing the version of the kernel required in the QAT documentation.

Signed-off-by: John Griffin <john.griffin@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoqat: fix build on 32-bit systems
John Griffin [Thu, 18 Feb 2016 10:57:32 +0000 (10:57 +0000)]
qat: fix build on 32-bit systems

Fix the build of the QuickAssist driver on 32-bit systems. For example:
drivers/crypto/qat/qat_crypto.c: In function ‘qat_alg_write_mbuf_entry’:
drivers/crypto/qat/qat_crypto.c:408:34: error:
cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
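
This message does not show how the cast was corrected. A common way to silence this class of error, given here only as a sketch with hypothetical helper names, is to go through uintptr_t so that the pointer-to-integer conversion happens at the pointer's native width before widening to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Convert via uintptr_t first; a direct pointer-to-uint64_t cast warns on 32-bit. */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Reverse direction; the value must fit in a pointer on this platform. */
static void *u64_to_ptr(uint64_t v)
{
    assert((uint64_t)(uintptr_t)v == v);
    return (void *)(uintptr_t)v;
}
```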

Fixes: 1703e94ac5ce ("qat: add driver for QuickAssist devices")

Signed-off-by: John Griffin <john.griffin@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoaesni_mb: fix strict-aliasing compilation rule
Declan Doherty [Mon, 15 Feb 2016 17:06:07 +0000 (17:06 +0000)]
aesni_mb: fix strict-aliasing compilation rule

When compiling the AESNI_MB PMD with GCC 4.4.7 on Centos 6.7 a "dereferencing
pointer ‘obj_p’ does break strict-aliasing rules" warning occurs in the
get_session() function.

Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoaesni_mb: fix wrong return value
Pablo de Lara [Mon, 15 Feb 2016 16:45:04 +0000 (16:45 +0000)]
aesni_mb: fix wrong return value

cryptodev_aesni_mb_init was returning the device id of
the device just created, but rte_eal_vdev_init
(the function that calls the first one), was expecting 0 or
negative value.
This made it impossible to create more than one aesni_mb device
from the command line.

Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix typos
Pablo de Lara [Fri, 12 Feb 2016 09:17:25 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix typos

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix auth params setting
Pablo de Lara [Fri, 12 Feb 2016 09:17:24 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix auth params setting

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix incorrect params in command line help
Pablo de Lara [Fri, 12 Feb 2016 09:17:23 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix incorrect params in command line help

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix total stats
Pablo de Lara [Fri, 12 Feb 2016 09:17:22 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix total stats

Reset total statistics (sum of all port statistics) before
adding up the new accumulated statistics per port.

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agovfio: support PCI ioport
Santosh Shukla [Sun, 21 Feb 2016 14:18:01 +0000 (19:48 +0530)]
vfio: support PCI ioport

Include vfio map/rd/wr support for pci ioport.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovfio: ignore mapping for ioport region
Santosh Shukla [Sun, 21 Feb 2016 14:18:00 +0000 (19:48 +0530)]
vfio: ignore mapping for ioport region

vfio_pci_mmap() tries to map all PCI BARs. ioport regions are not mapped in
vfio/kernel, so skip the mmap for ioport regions.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal/linux: never check iopl for arm
Santosh Shukla [Sun, 21 Feb 2016 14:17:59 +0000 (19:47 +0530)]
eal/linux: never check iopl for arm

The iopl() syscall is not supported on linux-arm/arm64, so always return 0.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoaesni_mb: fix build clean
Thomas Monjalon [Thu, 18 Feb 2016 19:16:32 +0000 (20:16 +0100)]
aesni_mb: fix build clean

The variable AESNI_MULTI_BUFFER_LIB_PATH is not required for
make clean

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agombuf_offload: fix header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:51:19 +0000 (17:51 +0100)]
mbuf_offload: fix header for C++

When built in a C++ application, the include fails for 2 reasons:

rte_mbuf_offload.h:128:24: error:
invalid conversion from ‘void*’ to ‘rte_pktmbuf_offload_pool_private*’ [-fpermissive]
    rte_mempool_get_priv(mpool);
                        ^
The cast must be explicit for C++.

rte_mbuf_offload.h:304:1: error: expected declaration before ‘}’ token

There was a closing brace for __cplusplus but not an opening one.

Fixes: 78c8709b5ddb ("mbuf_offload: introduce library to attach offloads to mbuf")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agohash: fix header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:06:05 +0000 (17:06 +0100)]
hash: fix header for C++

When built in a C++ application, the jhash include fails:

rte_jhash.h:123:22: error:
invalid conversion from ‘const void*’ to ‘const uint32_t*’ [-fpermissive]
  const uint32_t *k = key;
                      ^
The cast must be explicit for C++.
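
A minimal illustration of the explicit cast, written so that it compiles as both C and C++. sum_words() is a hypothetical helper, not part of rte_jhash.h, and it assumes the key is 4-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum 32-bit words of a key passed as an untyped pointer. */
static uint32_t sum_words(const void *key, size_t n_words)
{
    assert(key != NULL || n_words == 0);
    /* C converts const void * implicitly; C++ needs the cast spelled out. */
    const uint32_t *k = (const uint32_t *)key;
    uint32_t sum = 0;

    for (size_t i = 0; i < n_words; i++)
        sum += k[i];
    return sum;
}
```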

Fixes: 8718219a8737 ("hash: add new jhash functions")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoeal: fix keep alive header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:14:17 +0000 (17:14 +0100)]
eal: fix keep alive header for C++

When built in a C++ application, the keepalive include fails:

rte_keepalive.h:142:41: error: ‘ALIVE’ was not declared in this scope
  keepcfg->state_flags[rte_lcore_id()] = ALIVE;
                                         ^
C++ requires a scope operator to access an enum declared inside a struct.
There was also a namespace issue for the values (no RTE prefix).
The solution is to move the struct and related code out of the header file.

Fixes: 75583b0d1efd ("eal: add keep alive monitoring")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
8 years agovhost: check memory map before address translation
Pavel Fedin [Wed, 13 Jan 2016 07:32:57 +0000 (10:32 +0300)]
vhost: check memory map before address translation

Malfunctioning virtio clients may not send VHOST_USER_SET_MEM_TABLE for
some reason. This causes NULL dereference in qva_to_vva().

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: remove device operations pointers
Rich Lane [Fri, 19 Feb 2016 18:10:16 +0000 (10:10 -0800)]
vhost: remove device operations pointers

The vhost_net_device_ops indirection is unnecessary because there is only
one implementation of the vhost common code.
Removing it makes the code more readable.

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agomempool: fix leak when creation fails
Olivier Matz [Tue, 16 Feb 2016 14:40:10 +0000 (15:40 +0100)]
mempool: fix leak when creation fails

Since commits ff909fe21f and 4e32101f9b, it is now possible to free
memzones and rings.

The rte_mempool_create() should be modified to take advantage of this
and not leak memory when an allocation fails.

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
8 years agovhost: fix leak of fds and mmaps
Rich Lane [Wed, 10 Feb 2016 18:40:55 +0000 (10:40 -0800)]
vhost: fix leak of fds and mmaps

The common vhost code only supported a single mmap per device. vhost-user
worked around this by saving the address/length/fd of each mmap after the end
of the rte_virtio_memory struct. This only works if the vhost-user code frees
dev->mem, since the common code is unaware of the extra info. The
VHOST_USER_RESET_OWNER message is one situation where the common code frees
dev->mem and leaks the fds and mappings. This happens every time I shut down a
VM.

The new code calls back into the implementation (vhost-user or vhost-cuse) to
clean up these resources.

The vhost-cuse changes are only compile tested.

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: remove duplicate header include
Yuanhan Liu [Fri, 29 Jan 2016 04:58:03 +0000 (12:58 +0800)]
vhost: remove duplicate header include

unistd.h has been included twice; remove one.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: enable log_shmfd protocol feature
Yuanhan Liu [Fri, 29 Jan 2016 04:58:02 +0000 (12:58 +0800)]
vhost: enable log_shmfd protocol feature

This claims vhost-user live migration support: the SET_LOG_BASE
request will be sent only when this feature flag is set.

Besides this flag, another feature flag must be set to make
vhost-user live migration work: VHOST_F_LOG_ALL, which, however,
was enabled a long time ago.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: handle request to send RARP
Yuanhan Liu [Fri, 29 Jan 2016 04:58:01 +0000 (12:58 +0800)]
vhost: handle request to send RARP

In the former patch we enabled the GUEST_ANNOUNCE feature, so that the
guest OS broadcasts a GARP message after migration to notify the
switch about the new location of the migrated VM. However,
GUEST_ANNOUNCE is only available since kernel v3.5. For older kernels,
the VHOST_USER_SEND_RARP request comes to the rescue.

The payload of this new request is the MAC address of the migrated VM;
with that, we can construct a RARP message and then broadcast it
to host interfaces.

This is how the patch works:

- list all interfaces, with the help of SIOCGIFCONF ioctl command

- construct an RARP message and broadcast it
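
The second step can be sketched as below. The frame layout follows RFC 903 (EtherType 0x8035, opcode 3 for a reverse request); build_rarp() is a hypothetical helper, and the choice to leave both IP fields zero and to repeat the VM's MAC as the target hardware address is an assumption, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RARP_FRAME_LEN 42  /* 14-byte Ethernet header + 28-byte ARP body */

/* Fill 'frame' with a broadcast RARP request announcing 'mac'. */
static size_t build_rarp(uint8_t frame[RARP_FRAME_LEN], const uint8_t mac[6])
{
    assert(frame != NULL && mac != NULL);
    memset(frame, 0, RARP_FRAME_LEN);

    memset(frame, 0xff, 6);            /* destination: broadcast */
    memcpy(frame + 6, mac, 6);         /* source: migrated VM's MAC */
    frame[12] = 0x80;                  /* EtherType 0x8035 (RARP) */
    frame[13] = 0x35;

    frame[14] = 0x00; frame[15] = 0x01; /* hardware type: Ethernet */
    frame[16] = 0x08; frame[17] = 0x00; /* protocol type: IPv4 */
    frame[18] = 6;                      /* hardware address length */
    frame[19] = 4;                      /* protocol address length */
    frame[20] = 0x00; frame[21] = 0x03; /* opcode 3: reverse request */
    memcpy(frame + 22, mac, 6);         /* sender hardware address */
    /* bytes 28..31: sender IP, left as 0.0.0.0 */
    memcpy(frame + 32, mac, 6);         /* target hardware address */
    /* bytes 38..41: target IP, left as 0.0.0.0 */
    return RARP_FRAME_LEN;
}
```

The resulting buffer would then be sent on each interface found in the first step.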

Cc: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: claim support of guest announce
Yuanhan Liu [Fri, 29 Jan 2016 04:58:00 +0000 (12:58 +0800)]
vhost: claim support of guest announce

It's actually a feature already enabled in Linux kernel (since v3.5).
What we need to do is simply to claim that we support this feature,
and nothing else.

With that, the guest will send an ARP message after live migration
to notify the switches about the new location of migrated VM.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: log vring desc buffer changes
Yuanhan Liu [Fri, 29 Jan 2016 04:57:59 +0000 (12:57 +0800)]
vhost: log vring desc buffer changes

Every time we copy a buf to vring desc, we need to log it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: log used vring changes
Yuanhan Liu [Fri, 29 Jan 2016 04:57:57 +0000 (12:57 +0800)]
vhost: log used vring changes

Introduce the vhost_log_write() helper function to log the dirty pages we
touched. The page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
page is represented by 1 bit in the log.

Therefore, vhost_log_write() simply finds the right bit for the related
page we are going to change, and sets it to 1. dev->log_base denotes the
start of the dirty page bitmap.

Every time we update the virtio used ring, we need to log it. This is
done by a new vhost_log_write() wrapper, vhost_log_used_vring().
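
The bitmap idea can be sketched as follows, assuming one bit per 4096-byte page as stated above. toy_log_write() is a hypothetical stand-in and not the vhost_log_write() added by this patch; in particular, the bit order within each byte is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LOG_PAGE 4096ULL  /* same page size as VHOST_LOG_PAGE */

/* Mark every page touched by [addr, addr + len) as dirty in the bitmap. */
static void toy_log_write(uint8_t *log_base, uint64_t addr, uint64_t len)
{
    assert(log_base != NULL);
    if (len == 0)
        return;

    uint64_t page = addr / TOY_LOG_PAGE;
    uint64_t last = (addr + len - 1) / TOY_LOG_PAGE;

    for (; page <= last; page++)
        log_base[page / 8] |= (uint8_t)(1u << (page % 8));
}
```

A write that straddles a page boundary sets two bits, which is why the loop runs from the first to the last touched page.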

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: handle dirty pages logging request
Yuanhan Liu [Fri, 29 Jan 2016 04:57:56 +0000 (12:57 +0800)]
vhost: handle dirty pages logging request

VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
vhost-user) where we should log dirty pages, and how big the log
buffer is.

This request introduces a new payload:

    typedef struct VhostUserLog {
            uint64_t mmap_size;
            uint64_t mmap_offset;
    } VhostUserLog;

Also, a fd is delivered from QEMU by ancillary data.

With that information, an area of memory is mmapped and assigned
to dev->log_base for logging dirty pages.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: fix build dependency
Panu Matilainen [Thu, 18 Feb 2016 09:47:43 +0000 (11:47 +0200)]
vhost: fix build dependency

Commit d0cf91303d73 added a dependency on librte_net headers to vhost
but did not add it to the Makefile, which makes builds
non-deterministic. Curiously, it is the non-parallel build that is
consistently broken by this missing dependency (usually it is the other
way around): building without -j(n) fails with:

lib/librte_vhost/vhost_rxtx.c:41:20:
fatal error: rte_ip.h: No such file or directory

Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: add virtio offload
Jijiang Liu [Fri, 5 Feb 2016 07:31:41 +0000 (15:31 +0800)]
examples/vhost: add virtio offload

Change the code in the vhost sample to test the virtio offload feature.

These changes include:

1. add two test options: tx-csum and tso.

2. add the virtio_tx_offload() function to test the vhost TX offload
   feature for the VM to NIC case.

However, the VM to VM case does not need to call this function;
  the reason is explained in patch 2.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: remove IPv4 header definition
Jijiang Liu [Fri, 5 Feb 2016 07:31:40 +0000 (15:31 +0800)]
examples/vhost: remove IPv4 header definition

Remove the ipv4_hdr structure definition in the vhost sample.

The same structure is already defined in the rte_ip.h file, so we
  remove the definition from the sample and include that header file.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: add guest offload setting
Jijiang Liu [Fri, 5 Feb 2016 07:31:39 +0000 (15:31 +0800)]
vhost: add guest offload setting

Add guest offload setting in vhost lib.

Virtio 1.0 spec (5.1.6.4 Processing of Incoming Packets) says:

    1. If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
       VIRTIO_NET_HDR_F_NEEDS_CSUM bit in flags can be set: if so,
       the packet checksum at offset csum_offset from csum_start
       and any preceding checksums have been validated. The checksum
       on the packet is incomplete and csum_start and csum_offset
       indicate how to calculate it (see Packet Transmission point 1).

    2. If the VIRTIO_NET_F_GUEST_TSO4, TSO6 or UFO options were
       negotiated, then gso_type MAY be something other than
       VIRTIO_NET_HDR_GSO_NONE, and gso_size field indicates the
       desired MSS (see Packet Transmission point 2).

In order to support these features, the following changes are added:

1. Extend 'VHOST_SUPPORTED_FEATURES' macro to add the offload features negotiation.

2. Enqueue these offloads: convert some fields in mbuf to the fields in virtio_net_hdr.

Some further explanation of the implementation follows.

For the VM2VM case, there is no need to compute the checksum, because the
  data is considered reliable enough. Setting VIRTIO_NET_HDR_F_NEEDS_CSUM
  at the RX side lets the TCP layer bypass checksum validation,
  so that the RX side can receive the packet in the end.

In terms of us-vhost, at the vhost RX side, the offload information is
  inherited from the mbuf, which in turn inherits it from the TX side. If
  that information is still available at the RX side, the packet comes from
  another VM on the same host. So it is safe to set
  VIRTIO_NET_HDR_F_NEEDS_CSUM to skip checksum validation.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: add Tx offload capabilities
Jijiang Liu [Fri, 5 Feb 2016 07:31:38 +0000 (15:31 +0800)]
vhost: add Tx offload capabilities

Add vhost TX offload (CSUM and TSO) support capabilities in vhost lib.

In order to support these features, the following changes are added:

1. Extend 'VHOST_SUPPORTED_FEATURES' macro to add the offload features
   negotiation.

2. Dequeue TX offload: convert the fields in virtio_net_hdr to the
   related fields in mbuf.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: use PCI ioport API
David Marchand [Tue, 16 Feb 2016 20:37:04 +0000 (21:37 +0100)]
virtio: use PCI ioport API

Move all os / arch specifics to eal.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal: introduce PCI ioport API
David Marchand [Tue, 16 Feb 2016 20:37:03 +0000 (21:37 +0100)]
eal: introduce PCI ioport API

Most of the code is inspired by the virtio driver.
The rte_pci_ioport structure is filled at map time with everything needed
for later read / write calls.
At the moment, the base field is used to store an x86 ioport (uint16_t) and
will be reused for other arches.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: fix check when mapping PCI resources
David Marchand [Tue, 16 Feb 2016 20:37:02 +0000 (21:37 +0100)]
virtio: fix check when mapping PCI resources

According to the API, rte_eal_pci_map_device is only successful when
returning 0.

Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: fix FreeBSD build
David Marchand [Tue, 16 Feb 2016 20:37:01 +0000 (21:37 +0100)]
virtio: fix FreeBSD build

Fixes: c52afa68d763 ("virtio: move left PCI stuff in the right file")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal: remove compiler optimization workaround
Thomas Monjalon [Tue, 2 Feb 2016 23:10:26 +0000 (00:10 +0100)]
eal: remove compiler optimization workaround

The compiler optimization was disabled a long time ago
without describing what the exact issue was.
Maybe it does not apply anymore.
As it looks unneeded, let's remove this strange pragma.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoeal/ppc: adapt CPU flags check to the arch
Thomas Monjalon [Tue, 2 Feb 2016 23:10:25 +0000 (00:10 +0100)]
eal/ppc: adapt CPU flags check to the arch

The structure feature_entry does not need leaf/subleaf
which were copied from x86 CPUID implementation.

On x86, a valid flag is detected with the non-zero leaf value.
This check is replaced by a check with a dummy "none" register.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoeal/arm: adapt CPU flags check to the arch
Thomas Monjalon [Tue, 2 Feb 2016 23:10:24 +0000 (00:10 +0100)]
eal/arm: adapt CPU flags check to the arch

The structure feature_entry does not need leaf/subleaf
which were copied from x86 CPUID implementation.

On x86, a valid flag is detected with the non-zero leaf value.
This check is replaced by a check with a dummy "none" register.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>