dpdk.git
8 years agoconfig: use unaligned types for ARMv7
Jan Viktorin [Wed, 9 Dec 2015 15:16:17 +0000 (16:16 +0100)]
config: use unaligned types for ARMv7

This patch reduces the number of warnings from 53 to 40.
It removes the usual false positives by using the unaligned_uint*_t data types.
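
For illustration only, a minimal sketch of how an unaligned integer type can be declared with the GCC/Clang aligned(1) attribute. The typedef and helper names are hypothetical and are not taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical typedef: a uint32_t that the compiler must not assume is
 * 4-byte aligned (GCC/Clang extension). */
typedef uint32_t my_unaligned_uint32_t __attribute__((aligned(1)));

/* Read a 32-bit value at an arbitrary byte offset. The cast does not
 * raise the required alignment, so no cast-align warning is expected.
 * Note: like the original idiom, this relies on compiler behavior
 * rather than strict ISO C aliasing rules. */
uint32_t read_u32_at(const uint8_t *buf, size_t off)
{
	const my_unaligned_uint32_t *p =
		(const my_unaligned_uint32_t *)(buf + off);
	return *p;
}

/* Portable alternative: memcpy never assumes alignment. */
uint32_t read_u32_memcpy(const uint8_t *buf, size_t off)
{
	uint32_t v;

	memcpy(&v, buf + off, sizeof(v));
	return v;
}
```

Both helpers should return the same value for any offset; the memcpy variant is the safer choice when portability matters more than matching the existing style.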

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
8 years agolog: add missing symbols
Stephen Hemminger [Thu, 17 Dec 2015 00:38:34 +0000 (16:38 -0800)]
log: add missing symbols

The rte_get_log_type and rte_get_log_level functions have been available
for many versions, but they are missing from the shared library map
and therefore do not get exported correctly.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoexamples/l3fwd: rework exact-match
Tomasz Kulasek [Mon, 29 Feb 2016 10:33:07 +0000 (11:33 +0100)]
examples/l3fwd: rework exact-match

The current implementation of Exact-Match uses a different execution path than
LPM. Unifying them allows reuse of a large part of the LPM code and slightly
increases the performance of Exact-Match.

Main changes:
-------------
* Packet classification stage is separated from the rest of path for both
  LPM and EM.
* Packet processing, modifying and transmit part is the same for LPM and EM
  and mostly based on the current LPM implementation.
* Shared code is moved to the common file "l3fwd_sse.h".
* Sequential packet classification in the EM path seems to be faster than
  the multi hash lookup used before, so it is now the default. The old
  implementation is moved to the file l3fwd_em_hlm_sse.h and can be enabled
  with the HASH_LOOKUP_MULTI global define at compile time.

Signed-off-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agocfgfile: support looking up sections by index
Rich Lane [Thu, 25 Feb 2016 20:43:03 +0000 (12:43 -0800)]
cfgfile: support looking up sections by index

This is useful when sections have duplicate names.
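
A minimal sketch of why index lookup helps, assuming a simplified, hypothetical section table (the struct and function names below are illustrative, not the cfgfile API):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a parsed config file. */
struct section {
	const char *name;
};

struct cfg {
	const struct section *sections;
	size_t num_sections;
};

/* Name lookup returns only the first match, so later sections with
 * the same name are unreachable. */
const struct section *find_by_name(const struct cfg *c, const char *name)
{
	for (size_t i = 0; i < c->num_sections; i++)
		if (strcmp(c->sections[i].name, name) == 0)
			return &c->sections[i];
	return NULL;
}

/* Index lookup reaches every section; NULL when out of range. */
const struct section *find_by_index(const struct cfg *c, size_t idx)
{
	return idx < c->num_sections ? &c->sections[idx] : NULL;
}
```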

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
8 years agojobstats: add abort function
Marcin Kerlin [Fri, 12 Feb 2016 16:04:41 +0000 (17:04 +0100)]
jobstats: add abort function

This patch adds a new function, rte_jobstats_abort.
It marks *job* as finished, and the time of this work is added to management
time instead of execution time.
This function should be used instead of rte_jobstats_finish when an
application-defined condition occurs, for example when receiving
n>0 packets.
An example of its usage is added to the l2fwd-jobstats example.
At maximum load, the do-while loop inside the Idle job executes only once because
one or more jobs are waiting to be executed, so this time should not be counted
as execution time; calling rte_jobstats_abort() excludes it.

Signed-off-by: Marcin Kerlin <marcinx.kerlin@intel.com>
Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
8 years agomk: fix armv7 machine name
Jan Viktorin [Tue, 16 Feb 2016 18:35:06 +0000 (19:35 +0100)]
mk: fix armv7 machine name

The CONFIG_RTE_MACHINE value must not contain hyphens to work correctly. This was
initially done only for the file name defconfig_arm-armv7a-linuxapp-gcc. This
patch fixes the install-sdk goal, which otherwise creates a wrong directory for
this platform.

Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
8 years agoexamples/vhost: fix out of sequence packets
Jianfeng Tan [Tue, 19 Jan 2016 19:18:11 +0000 (03:18 +0800)]
examples/vhost: fix out of sequence packets

Issue description: when packets go through the vhost example to a virtio
device and come back to another virtio device or physical NIC, the
packets arrive out of order.

Reported-by: Thomas Long <thomas.long@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: fix mbuf allocation
Jianfeng Tan [Thu, 18 Feb 2016 00:08:39 +0000 (08:08 +0800)]
examples/vhost: fix mbuf allocation

How to reproduce:

1. Start vhost-switch
./examples/vhost/build/vhost-switch -c 0x3 -n 4 -- -p 1 --stat 0
2. Start VM with a virtio port
$ $QEMU -smp cores=2,sockets=1 -m 4G -cpu host -enable-kvm \
  -chardev socket,id=char1,path=<path to vhost-user socket> \
  -device virtio-net-pci,netdev=vhostuser1 \
  -netdev vhost-user,id=vhostuser1,chardev=char1 \
  -object memory-backend-file,id=mem,size=4G,mem-path=<hugetlbfs path>,share=on \
  -numa node,memdev=mem -mem-prealloc \
  -hda <path to VM img>
3. Start l2fwd in VM
$ ./examples/l2fwd/build/l2fwd -c 0x1 -n 4 -m 1024 -- -p 0x1
4. Use IXIA to inject packets at a low data rate.

Error:

vhost-switch keeps printing error message:
failed to allocate memory for mbuf.

Root cause:

The number of mbufs allocated for a port is calculated by the formula below.
NUM_MBUFS_PER_PORT = ((MAX_QUEUES*RTE_TEST_RX_DESC_DEFAULT) + \
(num_switching_cores*MAX_PKT_BURST) + \
(num_switching_cores*RTE_TEST_TX_DESC_DEFAULT) +\
(num_switching_cores*MBUF_CACHE_SIZE))
Suppose num_switching_cores is 1 and MBUF_CACHE_SIZE is 128.
When initializing the port, the master core fills its mbuf mempool cache,
so some mbufs are left in that cache, for example 121.
So the total number of mbufs that can be used is:
(MAX_PKT_BURST + MBUF_CACHE_SIZE - 121) = (32 + 128 - 121) = 39.
What makes it worse is that there is a buffer to store mbufs
(which will be tx_burst to the physical port); if it occupies some mbufs,
there may be fewer than 32 mbufs left, so vhost dequeue prints
this message.

In short, the formula fails to account for the master core's mbuf mempool cache.
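
The arithmetic above can be reproduced with a short sketch. The constants are the values quoted in this message; the function names and the buffered_for_tx parameter are hypothetical.

```c
/* Values quoted in the commit message. */
enum {
	MAX_PKT_BURST = 32,
	MBUF_CACHE_SIZE = 128,
};

/* mbufs usable by one switching core when cached_on_master mbufs are
 * stranded in the master core's mempool cache. */
int usable_mbufs(int cached_on_master)
{
	return MAX_PKT_BURST + MBUF_CACHE_SIZE - cached_on_master;
}

/* Returns 1 when a full burst can still be allocated after
 * buffered_for_tx mbufs are held for transmission, 0 otherwise. */
int can_dequeue_burst(int cached_on_master, int buffered_for_tx)
{
	return usable_mbufs(cached_on_master) - buffered_for_tx
		>= MAX_PKT_BURST;
}
```

With 121 mbufs stranded, holding as few as 8 mbufs for transmission already leaves fewer than one burst available.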

Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
8 years agoexamples/l3fwd: modularize
Ravi Kerur [Thu, 25 Feb 2016 10:24:24 +0000 (11:24 +0100)]
examples/l3fwd: modularize

The main problem with l3fwd is that it is too monolithic with everything
being in one file, and the various options all controlled by compile time
flags. This means that it's hard to read and understand, and when making
any changes, you need to go to a lot of work to try and ensure you cover
all the code paths, since a compile of the app will not touch large parts
of the l3fwd codebase.

The following changes were made to fix the issues mentioned above:

- Split out the various lpm and hash specific functionality into separate
  files, so that l3fwd code has one file for common code e.g. args
  processing, mempool creation, and then individual files for the various
  forwarding approaches.

  The new files are:
  main.c (Common code for args processing, mempool creation, etc)
  l3fwd_em.c (Hash/Exact match aka 'EM' functionality)
  l3fwd_em_sse.h (SSE4_1 buffer optimized 'EM' code)
  l3fwd_lpm.c (Longest Prefix Match aka 'LPM' functionality)
  l3fwd_lpm_sse.h (SSE4_1 buffer optimized 'LPM' code)
  l3fwd.h (Common include for 'EM' and 'LPM')

- The choosing of the lpm/hash path should be done at runtime, not
  compile time, via a command-line argument. This will ensure that
  both code paths get compiled in a single go

  The following examples show the runtime options provided.

  Select 'LPM' or 'EM' at run time, e.g.
                > l3fwd -c 0x1 -n 1 -- -p 0x1 -E ... (EM)
                > l3fwd -c 0x1 -n 1 -- -p 0x1 -L ... (LPM)
  Options "E" and "L" are mutually exclusive.
  If neither is selected, "L" is the default.
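
A minimal sketch of the selection rule described above (mutually exclusive -E and -L, with LPM as the default). The enum and function names are hypothetical, and the actual l3fwd argument parser may be structured differently.

```c
#include <string.h>

enum fwd_mode { FWD_LPM, FWD_EM, FWD_INVALID };

/* Scan application arguments for -E / -L. Returns FWD_INVALID when both
 * are given; LPM is the default when neither is present. */
enum fwd_mode parse_fwd_mode(int argc, char *const argv[])
{
	int want_em = 0, want_lpm = 0;

	for (int i = 1; i < argc; i++) {
		if (strcmp(argv[i], "-E") == 0)
			want_em = 1;
		else if (strcmp(argv[i], "-L") == 0)
			want_lpm = 1;
	}
	if (want_em && want_lpm)
		return FWD_INVALID;
	return want_em ? FWD_EM : FWD_LPM;
}
```

Because the mode is a runtime value, both forwarding paths are compiled in every build and only the dispatch differs.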

Signed-off-by: Ravi Kerur <rkerur@gmail.com>
Signed-off-by: Piotr Azarewicz <piotrx.t.azarewicz@intel.com>
Tested-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Tomasz Kulasek <tomaszx.kulasek@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agoethdev: support unidirectional configuration
Reshma Pattan [Tue, 5 Jan 2016 16:34:58 +0000 (16:34 +0000)]
ethdev: support unidirectional configuration

The user should be able to configure an ethdev with zero rx or tx queues,
but not with both set to zero.
After this change, rte_eth_dev_tx_queue_config and
rte_eth_dev_rx_queue_config allocate memory for rx/tx queues only
when the number of rx/tx queues is nonzero.
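
The rule can be sketched as follows. Both helpers are hypothetical illustrations and not the ethdev implementation.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical check: either direction may be zero, but not both. */
int check_queue_counts(uint16_t nb_rx_q, uint16_t nb_tx_q)
{
	return (nb_rx_q == 0 && nb_tx_q == 0) ? -EINVAL : 0;
}

/* Allocate the queue pointer array only for a nonzero count; a zero
 * count yields NULL and is not treated as an error by the caller. */
void **alloc_queue_array(uint16_t nb_queues)
{
	if (nb_queues == 0)
		return NULL;
	return calloc(nb_queues, sizeof(void *));
}
```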

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agocryptodev: allow full control from secondary process
Reshma Pattan [Tue, 5 Jan 2016 16:34:57 +0000 (16:34 +0000)]
cryptodev: allow full control from secondary process

The macro RTE_PROC_PRIMARY_OR_ERR_RET blocks the secondary process from
using the API. API access should be given to both secondary and primary processes.

Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agoethdev: allow full control from secondary process
Reshma Pattan [Tue, 5 Jan 2016 16:34:56 +0000 (16:34 +0000)]
ethdev: allow full control from secondary process

Macros RTE_PROC_PRIMARY_OR_ERR_RET and RTE_PROC_PRIMARY_OR_RET
are blocking the secondary process from using the APIs.
API access should be given to both secondary and primary.

Reported-by: Sean Harte <sean.harte@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years agodoc: fix Linux version required by QAT driver
John Griffin [Wed, 10 Feb 2016 23:28:01 +0000 (23:28 +0000)]
doc: fix Linux version required by QAT driver

Fixing the version of the kernel required in the QAT documentation.

Signed-off-by: John Griffin <john.griffin@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoqat: fix build on 32-bit systems
John Griffin [Thu, 18 Feb 2016 10:57:32 +0000 (10:57 +0000)]
qat: fix build on 32-bit systems

Fix the build of the QuickAssist driver on 32-bit systems, for example:
drivers/crypto/qat/qat_crypto.c: In function ‘qat_alg_write_mbuf_entry’:
drivers/crypto/qat/qat_crypto.c:408:34: error:
cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
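
A common way to avoid this class of error is to convert through uintptr_t. The sketch below shows the idiom with hypothetical helper names; it is not necessarily the exact change made in qat_crypto.c.

```c
#include <stdint.h>

/* Converting through uintptr_t keeps the pointer-to-integer cast at the
 * pointer's own width; widening to 64 bits is then an ordinary integer
 * conversion, valid on both 32-bit and 64-bit builds. */
uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(uintptr_t)p;
}

/* Reverse direction, for a value produced by ptr_to_u64(). */
void *u64_to_ptr(uint64_t v)
{
	return (void *)(uintptr_t)v;
}
```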

Fixes: 1703e94ac5ce ("qat: add driver for QuickAssist devices")

Signed-off-by: John Griffin <john.griffin@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoaesni_mb: fix strict-aliasing compilation rule
Declan Doherty [Mon, 15 Feb 2016 17:06:07 +0000 (17:06 +0000)]
aesni_mb: fix strict-aliasing compilation rule

When compiling the AESNI_MB PMD with GCC 4.4.7 on CentOS 6.7, a "dereferencing
pointer ‘obj_p’ does break strict-aliasing rules" warning occurs in the
get_session() function.

Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")

Signed-off-by: Declan Doherty <declan.doherty@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoaesni_mb: fix wrong return value
Pablo de Lara [Mon, 15 Feb 2016 16:45:04 +0000 (16:45 +0000)]
aesni_mb: fix wrong return value

cryptodev_aesni_mb_init was returning the device id of
the device just created, but rte_eal_vdev_init
(the function that calls it) was expecting 0 or
a negative value.
This made it impossible to create more than one aesni_mb device
from the command line.

Fixes: 924e84f87306 ("aesni_mb: add driver for multi buffer based crypto")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix typos
Pablo de Lara [Fri, 12 Feb 2016 09:17:25 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix typos

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix auth params setting
Pablo de Lara [Fri, 12 Feb 2016 09:17:24 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix auth params setting

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix incorrect params in command line help
Pablo de Lara [Fri, 12 Feb 2016 09:17:23 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix incorrect params in command line help

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agoexamples/l2fwd-crypto: fix total stats
Pablo de Lara [Fri, 12 Feb 2016 09:17:22 +0000 (09:17 +0000)]
examples/l2fwd-crypto: fix total stats

Reset total statistics (sum of all port statistics) before
adding up the new accumulated statistics per port.
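
A minimal sketch of the pattern, assuming a hypothetical two-counter stats struct. Since the per-port counters are already cumulative, the total has to be zeroed on every refresh.

```c
#include <stddef.h>
#include <stdint.h>

struct port_stats {
	uint64_t rx, tx;
};

/* Sum per-port counters into *total. Zeroing first matters when the
 * same total object is reused on every refresh; without it each call
 * would add the accumulated per-port values again. */
void sum_port_stats(const struct port_stats *ports, size_t n,
		    struct port_stats *total)
{
	total->rx = 0;
	total->tx = 0;
	for (size_t i = 0; i < n; i++) {
		total->rx += ports[i].rx;
		total->tx += ports[i].tx;
	}
}
```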

Fixes: 387259bd6c67 ("examples/l2fwd-crypto: add sample application")

Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agovfio: support PCI ioport
Santosh Shukla [Sun, 21 Feb 2016 14:18:01 +0000 (19:48 +0530)]
vfio: support PCI ioport

Include vfio map/rd/wr support for pci ioport.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovfio: ignore mapping for ioport region
Santosh Shukla [Sun, 21 Feb 2016 14:18:00 +0000 (19:48 +0530)]
vfio: ignore mapping for ioport region

vfio_pci_mmap() tries to map all PCI BARs. ioport regions are not mapped in
vfio/kernel, so skip mmapping for ioport.

Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal/linux: never check iopl for arm
Santosh Shukla [Sun, 21 Feb 2016 14:17:59 +0000 (19:47 +0530)]
eal/linux: never check iopl for arm

The iopl() syscall is not supported on linux-arm/arm64, so always return 0.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoaesni_mb: fix build clean
Thomas Monjalon [Thu, 18 Feb 2016 19:16:32 +0000 (20:16 +0100)]
aesni_mb: fix build clean

The variable AESNI_MULTI_BUFFER_LIB_PATH is not required for
make clean.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Declan Doherty <declan.doherty@intel.com>
8 years agombuf_offload: fix header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:51:19 +0000 (17:51 +0100)]
mbuf_offload: fix header for C++

When built in a C++ application, the include fails for 2 reasons:

rte_mbuf_offload.h:128:24: error:
invalid conversion from ‘void*’ to ‘rte_pktmbuf_offload_pool_private*’ [-fpermissive]
    rte_mempool_get_priv(mpool);
                        ^
The cast must be explicit for C++.

rte_mbuf_offload.h:304:1: error: expected declaration before ‘}’ token

There was a closing brace for __cplusplus but not an opening one.
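
Both problems can be seen in a small header-style sketch. The helper below is hypothetical; it only illustrates the explicit cast and the paired __cplusplus guards.

```c
#include <stdint.h>

#ifdef __cplusplus
extern "C" {	/* opening brace, paired with the closing one below */
#endif

/* Hypothetical helper: the explicit cast is redundant in C but
 * required when a C++ translation unit includes this header. */
static inline uint32_t first_word_le(const void *key)
{
	const uint8_t *k = (const uint8_t *)key;

	/* Assemble a little-endian 32-bit word from four bytes. */
	return (uint32_t)k[0] | ((uint32_t)k[1] << 8) |
	       ((uint32_t)k[2] << 16) | ((uint32_t)k[3] << 24);
}

#ifdef __cplusplus
}
#endif
```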

Fixes: 78c8709b5ddb ("mbuf_offload: introduce library to attach offloads to mbuf")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agohash: fix header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:06:05 +0000 (17:06 +0100)]
hash: fix header for C++

When built in a C++ application, the jhash include fails:

rte_jhash.h:123:22: error:
invalid conversion from ‘const void*’ to ‘const uint32_t*’ [-fpermissive]
  const uint32_t *k = key;
                      ^
The cast must be explicit for C++.

Fixes: 8718219a8737 ("hash: add new jhash functions")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years agoeal: fix keep alive header for C++
Thomas Monjalon [Fri, 5 Feb 2016 16:14:17 +0000 (17:14 +0100)]
eal: fix keep alive header for C++

When built in a C++ application, the keepalive include fails:

rte_keepalive.h:142:41: error: ‘ALIVE’ was not declared in this scope
  keepcfg->state_flags[rte_lcore_id()] = ALIVE;
                                         ^
C++ requires a scope operator to access an enum declared inside a struct.
There was also a namespace issue for the values (no RTE prefix).
The solution is to move the struct and related code out of the header file.

Fixes: 75583b0d1efd ("eal: add keep alive monitoring")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
8 years agovhost: check memory map before address translation
Pavel Fedin [Wed, 13 Jan 2016 07:32:57 +0000 (10:32 +0300)]
vhost: check memory map before address translation

Malfunctioning virtio clients may not send VHOST_USER_SET_MEM_TABLE for
some reason. This causes NULL dereference in qva_to_vva().
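
A minimal sketch of the guard, assuming a simplified, hypothetical region table. The field and function names are illustrative and do not mirror the vhost structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified region table. */
struct mem_region {
	uint64_t qemu_va;	/* start address in the client's address space */
	uint64_t size;
	uint64_t host_va;	/* where the region is mapped in this process */
};

struct mem_table {
	size_t nregions;
	struct mem_region regions[8];
};

/* Translate a client virtual address; returns 0 when the table was
 * never set or no region covers the address. */
uint64_t translate(const struct mem_table *mem, uint64_t qemu_va)
{
	if (mem == NULL)	/* client never sent its memory table */
		return 0;
	for (size_t i = 0; i < mem->nregions; i++) {
		const struct mem_region *r = &mem->regions[i];

		if (qemu_va >= r->qemu_va && qemu_va - r->qemu_va < r->size)
			return r->host_va + (qemu_va - r->qemu_va);
	}
	return 0;
}
```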

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: remove device operations pointers
Rich Lane [Fri, 19 Feb 2016 18:10:16 +0000 (10:10 -0800)]
vhost: remove device operations pointers

The vhost_net_device_ops indirection is unnecessary because there is only
one implementation of the vhost common code.
Removing it makes the code more readable.

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agomempool: fix leak when creation fails
Olivier Matz [Tue, 16 Feb 2016 14:40:10 +0000 (15:40 +0100)]
mempool: fix leak when creation fails

Since commits ff909fe21f and 4e32101f9b, it is now possible to free
memzones and rings.

The rte_mempool_create() should be modified to take advantage of this
and not leak memory when an allocation fails.
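
The general shape of such a fix is an unwind path that releases, in reverse order, whatever was acquired before the failure. The sketch below uses plain malloc/free and hypothetical names, including a fail_storage flag used only to exercise the error path; it does not use the memzone or ring APIs.

```c
#include <stdlib.h>

struct pool {
	void *ring;
	void *storage;
};

/* Hypothetical constructor: fail_storage forces the second allocation
 * to fail so the unwind path can be exercised. */
struct pool *pool_create(size_t ring_sz, size_t storage_sz, int fail_storage)
{
	struct pool *p = calloc(1, sizeof(*p));

	if (p == NULL)
		return NULL;

	p->ring = malloc(ring_sz);
	if (p->ring == NULL)
		goto err_pool;

	p->storage = fail_storage ? NULL : malloc(storage_sz);
	if (p->storage == NULL)
		goto err_ring;

	return p;

err_ring:
	free(p->ring);	/* release what was already acquired, newest first */
err_pool:
	free(p);
	return NULL;
}

void pool_free(struct pool *p)
{
	if (p == NULL)
		return;
	free(p->storage);
	free(p->ring);
	free(p);
}
```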

Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
8 years agovhost: fix leak of fds and mmaps
Rich Lane [Wed, 10 Feb 2016 18:40:55 +0000 (10:40 -0800)]
vhost: fix leak of fds and mmaps

The common vhost code only supported a single mmap per device. vhost-user
worked around this by saving the address/length/fd of each mmap after the end
of the rte_virtio_memory struct. This only works if the vhost-user code frees
dev->mem, since the common code is unaware of the extra info. The
VHOST_USER_RESET_OWNER message is one situation where the common code frees
dev->mem and leaks the fds and mappings. This happens every time I shut down a
VM.

The new code calls back into the implementation (vhost-user or vhost-cuse) to
clean up these resources.

The vhost-cuse changes are only compile tested.

Signed-off-by: Rich Lane <rich.lane@bigswitch.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: remove duplicate header include
Yuanhan Liu [Fri, 29 Jan 2016 04:58:03 +0000 (12:58 +0800)]
vhost: remove duplicate header include

unistd.h has been included twice; remove one.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: enable log_shmfd protocol feature
Yuanhan Liu [Fri, 29 Jan 2016 04:58:02 +0000 (12:58 +0800)]
vhost: enable log_shmfd protocol feature

Enable this feature flag to claim support for vhost-user live migration:
the SET_LOG_BASE request will be sent only when this feature flag
is set.

Besides this flag, another feature flag is needed
to make vhost-user live migration work: VHOST_F_LOG_ALL,
which, however, was enabled a long time ago.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: handle request to send RARP
Yuanhan Liu [Fri, 29 Jan 2016 04:58:01 +0000 (12:58 +0800)]
vhost: handle request to send RARP

The former patch enabled the GUEST_ANNOUNCE feature, so that the
guest OS broadcasts a GARP message after migration to notify the
switch about the new location of the migrated VM. However,
GUEST_ANNOUNCE is available only since kernel v3.5. For older kernels,
the VHOST_USER_SEND_RARP request comes to the rescue.

The payload of this new request is the MAC address of the migrated VM;
with that, we can construct a RARP message and then broadcast it
to host interfaces.

This is how the patch works:

- list all interfaces, with the help of SIOCGIFCONF ioctl command

- construct an RARP message and broadcast it

Cc: Thibaut Collet <thibaut.collet@6wind.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: claim support of guest announce
Yuanhan Liu [Fri, 29 Jan 2016 04:58:00 +0000 (12:58 +0800)]
vhost: claim support of guest announce

It's actually a feature already enabled in Linux kernel (since v3.5).
What we need to do is simply to claim that we support such feature,
and nothing else.

With that, the guest will send an ARP message after live migration
to notify the switches about the new location of migrated VM.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: log vring desc buffer changes
Yuanhan Liu [Fri, 29 Jan 2016 04:57:59 +0000 (12:57 +0800)]
vhost: log vring desc buffer changes

Every time we copy a buf to vring desc, we need to log it.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: log used vring changes
Yuanhan Liu [Fri, 29 Jan 2016 04:57:57 +0000 (12:57 +0800)]
vhost: log used vring changes

Introduce the vhost_log_write() helper function to log the dirty pages we
touched. The page size is hard-coded to 4096 (VHOST_LOG_PAGE), and each
page is represented by 1 bit.

Therefore, vhost_log_write() simply finds the right bit for the
page we are about to change and sets it to 1. dev->log_base denotes the
start of the dirty page bitmap.

Every time we update the virtio used ring, we need to log it. This is
done by a new vhost_log_write() wrapper, vhost_log_used_vring().
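
A minimal sketch of the bitmap update described above. The function names are hypothetical, and the bit order within each byte (least significant bit first) is an assumption of this sketch.

```c
#include <stdint.h>

#define LOG_PAGE_SIZE 4096u	/* page size quoted in the commit message */

/* Set the bit for one page in a byte-addressed bitmap. */
void log_page(uint8_t *log_base, uint64_t page)
{
	log_base[page / 8] |= (uint8_t)(1u << (page % 8));
}

/* Mark every page overlapped by [addr, addr + len) as dirty. */
void log_write(uint8_t *log_base, uint64_t addr, uint64_t len)
{
	if (len == 0)
		return;

	uint64_t first = addr / LOG_PAGE_SIZE;
	uint64_t last = (addr + len - 1) / LOG_PAGE_SIZE;

	for (uint64_t page = first; page <= last; page++)
		log_page(log_base, page);
}
```

A write that straddles a page boundary marks both pages, which is why the range is computed from the first and last byte rather than from the start address alone.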

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: handle dirty pages logging request
Yuanhan Liu [Fri, 29 Jan 2016 04:57:56 +0000 (12:57 +0800)]
vhost: handle dirty pages logging request

VHOST_USER_SET_LOG_BASE request is used to tell the backend (dpdk
vhost-user) where we should log dirty pages, and how big the log
buffer is.

This request introduces a new payload:

    typedef struct VhostUserLog {
            uint64_t mmap_size;
            uint64_t mmap_offset;
    } VhostUserLog;

Also, an fd is delivered from QEMU via ancillary data.

With this information, an area of memory is mmapped and assigned
to dev->log_base for logging dirty pages.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
8 years agovhost: fix build dependency
Panu Matilainen [Thu, 18 Feb 2016 09:47:43 +0000 (11:47 +0200)]
vhost: fix build dependency

Commit d0cf91303d73 added a dependency on librte_net headers to vhost
but did not add this to the Makefile, which makes builds
non-deterministic. Curiously, it is the non-parallel build that is
consistently broken by this missing dependency (usually it is the other
way around): trying to build without -j(n) fails with:

lib/librte_vhost/vhost_rxtx.c:41:20:
fatal error: rte_ip.h: No such file or directory

Fixes: d0cf91303d73 ("vhost: add Tx offload capabilities")

Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: add virtio offload
Jijiang Liu [Fri, 5 Feb 2016 07:31:41 +0000 (15:31 +0800)]
examples/vhost: add virtio offload

Change the code in the vhost sample to test the virtio offload feature.

These changes include:

1. add two test options: tx-csum and tso.

2. add virtio_tx_offload() function to test vhost TX offload feature
   for VM to NIC case;

however, for VM to VM case, it doesn't need to call this function,
  the reason is explained in patch 2.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoexamples/vhost: remove IPv4 header definition
Jijiang Liu [Fri, 5 Feb 2016 07:31:40 +0000 (15:31 +0800)]
examples/vhost: remove IPv4 header definition

Remove the ipv4_hdr structure definition in the vhost sample.

The same structure is already defined in the rte_ip.h file, so we
  remove the definition from the sample and include that header file.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: add guest offload setting
Jijiang Liu [Fri, 5 Feb 2016 07:31:39 +0000 (15:31 +0800)]
vhost: add guest offload setting

Add guest offload setting in vhost lib.

Virtio 1.0 spec (5.1.6.4 Processing of Incoming Packets) says:

    1. If the VIRTIO_NET_F_GUEST_CSUM feature was negotiated, the
       VIRTIO_NET_HDR_F_NEEDS_CSUM bit in flags can be set: if so,
       the packet checksum at offset csum_offset from csum_start
       and any preceding checksums have been validated. The checksum
       on the packet is incomplete and csum_start and csum_offset
       indicate how to calculate it (see Packet Transmission point 1).

    2. If the VIRTIO_NET_F_GUEST_TSO4, TSO6 or UFO options were
       negotiated, then gso_type MAY be something other than
       VIRTIO_NET_HDR_GSO_NONE, and gso_size field indicates the
       desired MSS (see Packet Transmission point 2).

In order to support these features, the following changes are added,

1. Extend 'VHOST_SUPPORTED_FEATURES' macro to add the offload features negotiation.

2. Enqueue these offloads: convert some fields in mbuf to the fields in virtio_net_hdr.

Further explanation of the implementation:

For the VM2VM case, there is no need to compute the checksum, since we
  consider the data reliable enough, and setting VIRTIO_NET_HDR_F_NEEDS_CSUM
  at the RX side lets the TCP layer bypass checksum validation,
  so that the RX side can receive the packet in the end.

In terms of us-vhost, at vhost RX side, the offload information is
  inherited from mbuf, which is in turn inherited from TX side. If we
  can still get those info at RX side, it means the packet is from
  another VM at same host. So, it's safe to set the
  VIRTIO_NET_HDR_F_NEEDS_CSUM, to skip checksum validation.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovhost: add Tx offload capabilities
Jijiang Liu [Fri, 5 Feb 2016 07:31:38 +0000 (15:31 +0800)]
vhost: add Tx offload capabilities

Add vhost TX offload (CSUM and TSO) support capabilities in vhost lib.

In order to support these features, the following changes are added:

1. Extend 'VHOST_SUPPORTED_FEATURES' macro to add the offload features
   negotiation.

2. Dequeue TX offload: convert the fields in virtio_net_hdr to the
   related fields in mbuf.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: use PCI ioport API
David Marchand [Tue, 16 Feb 2016 20:37:04 +0000 (21:37 +0100)]
virtio: use PCI ioport API

Move all os / arch specifics to eal.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal: introduce PCI ioport API
David Marchand [Tue, 16 Feb 2016 20:37:03 +0000 (21:37 +0100)]
eal: introduce PCI ioport API

Most of the code is inspired by the virtio driver.
The rte_pci_ioport structure is filled at map time with anything needed for later
read / write calls.
At the moment, the base field is used to store an x86 ioport (uint16_t) and will
be reused for other arches.

Signed-off-by: David Marchand <david.marchand@6wind.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: fix check when mapping PCI resources
David Marchand [Tue, 16 Feb 2016 20:37:02 +0000 (21:37 +0100)]
virtio: fix check when mapping PCI resources

According to the api, rte_eal_pci_map_device is only successful when
returning 0.

Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agovirtio: fix FreeBSD build
David Marchand [Tue, 16 Feb 2016 20:37:01 +0000 (21:37 +0100)]
virtio: fix FreeBSD build

Fixes: c52afa68d763 ("virtio: move left PCI stuff in the right file")

Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years agoeal: remove compiler optimization workaround
Thomas Monjalon [Tue, 2 Feb 2016 23:10:26 +0000 (00:10 +0100)]
eal: remove compiler optimization workaround

The compiler optimization was disabled a long time ago
without describing what was the exact issue.
Maybe it does not apply anymore.
As it looks unneeded, let's remove this strange pragma.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoeal/ppc: adapt CPU flags check to the arch
Thomas Monjalon [Tue, 2 Feb 2016 23:10:25 +0000 (00:10 +0100)]
eal/ppc: adapt CPU flags check to the arch

The structure feature_entry does not need leaf/subleaf
which were copied from x86 CPUID implementation.

On x86, a valid flag is detected with the non-zero leaf value.
This check is replaced by a check with a dummy "none" register.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoeal/arm: adapt CPU flags check to the arch
Thomas Monjalon [Tue, 2 Feb 2016 23:10:24 +0000 (00:10 +0100)]
eal/arm: adapt CPU flags check to the arch

The structure feature_entry does not need leaf/subleaf
which were copied from x86 CPUID implementation.

On x86, a valid flag is detected with the non-zero leaf value.
This check is replaced by a check with a dummy "none" register.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Tested-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years agoeal: move CPU flag functions out of headers
Thomas Monjalon [Tue, 2 Feb 2016 23:10:23 +0000 (00:10 +0100)]
eal: move CPU flag functions out of headers

The patch c344eab3ee has moved the hardware definition of CPU flags.
Now the functions checking these hardware flags are also moved.
The function rte_cpu_get_flag_enabled() is no longer inline.

The benefits are:
- remove rte_cpu_feature_table from the ABI (recently added)
- hide hardware details from the API
- allow adapting structures per arch (done in next patch)

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years agoeal: get CPU flag name
Thomas Monjalon [Tue, 2 Feb 2016 22:59:49 +0000 (23:59 +0100)]
eal: get CPU flag name

The new function rte_cpu_get_flag_name() is added to the EAL API.
It is implemented (duplicated) in each arch because the next patch
will remove the public exposure of the feature tables.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoexamples: fix build dependencies
Thomas Monjalon [Fri, 5 Feb 2016 14:43:56 +0000 (15:43 +0100)]
examples: fix build dependencies

When building for ARM, some examples were failing to compile because
some of their dependencies were disabled.
Declaring these dependencies prevents the build from trying to compile
unsupported examples.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years agoexamples/ethtool: fix build
Thomas Monjalon [Fri, 5 Feb 2016 14:38:02 +0000 (15:38 +0100)]
examples/ethtool: fix build

When building for ARM, the spinlock structure was not found.
It appears to be a mismatch with rwlock which is not used in this file.

Fixes: bda68ab9d1e7 ("examples/ethtool: add user-space ethtool sample application")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Remy Horton <remy.horton@intel.com>
8 years agoexamples/ip_pipeline: fix build for x86_64 without SSE4.2
Thomas Monjalon [Wed, 3 Feb 2016 18:56:39 +0000 (19:56 +0100)]
examples/ip_pipeline: fix build for x86_64 without SSE4.2

The compiler cannot use _mm_crc32_u64:

examples/ip_pipeline/pipeline/hash_func.h:165:9:
error: implicit declaration of function '_mm_crc32_u64' is invalid in C99

Fixes: 947024a26df7 ("examples/ip_pipeline: rework passthrough pipeline")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago examples/l3fwd: fix build without SSE4.1
Thomas Monjalon [Wed, 3 Feb 2016 18:56:38 +0000 (19:56 +0100)]
examples/l3fwd: fix build without SSE4.1

clang reports this error:
examples/l3fwd/main.c:550:1: error: unused function 'send_packetsx4'

The function is used only when ENABLE_MULTI_BUFFER_OPTIMIZE is 1.

Fixes: 96ff445371e0 ("examples/l3fwd: reorganise and optimize LPM code path")
Fixes: 6f1c1e28d98e ("examples/l3fwd: fix build with exact-match enabled")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
8 years ago examples/distributor: fix build for non-x86 arch
Jerin Jacob [Fri, 12 Feb 2016 11:13:51 +0000 (16:43 +0530)]
examples/distributor: fix build for non-x86 arch

_mm_prefetch is defined only in x86 compilers.
Use the rte_prefetch_non_temporal() abstraction instead of _mm_prefetch(x, 0)
in order to build the distributor application for non-x86 platforms.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years ago eal: introduce non-temporal prefetch
Jerin Jacob [Fri, 12 Feb 2016 11:13:50 +0000 (16:43 +0530)]
eal: introduce non-temporal prefetch

Add a non-temporal/transient/stream version of rte_prefetch0().

The non-temporal prefetch is intended as a hint that the processor
will use the prefetched data only once or for a short period,
unlike the rte_prefetch0() function, which implies that the
prefetched data will be used repeatedly.
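
As a rough illustration of the hint, the sketch below uses the GCC/Clang builtin __builtin_prefetch() with locality 0 while streaming through a buffer that is read exactly once. The ex_ names and the 64-byte line size are assumptions; this is not the per-arch implementation added by the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical wrapper: read access (0), locality 0 means the data
 * need not be kept in the cache after it has been used. */
static inline void
ex_prefetch_non_temporal(const void *p)
{
	__builtin_prefetch(p, 0, 0);
}

/* Sum a buffer that is read only once, hinting one line ahead. */
uint64_t
ex_sum_once(const uint8_t *buf, size_t len)
{
	uint64_t sum = 0;
	size_t i;

	assert(buf != NULL || len == 0);
	for (i = 0; i < len; i++) {
		if (i % EX_LINE == 0 && i + EX_LINE < len)
			ex_prefetch_non_temporal(&buf[i + EX_LINE]);
		sum += buf[i];
	}
	return sum;
}
```

The prefetch is only a hint, so the result is the same with or without it; only cache behaviour may differ.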

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Jan Viktorin <viktorin@rehivetech.com>
8 years ago ethdev: reduce alignment requirement for 128-byte cache line
Jerin Jacob [Fri, 29 Jan 2016 07:45:55 +0000 (13:15 +0530)]
ethdev: reduce alignment requirement for 128-byte cache line

Slow-path data structures need not be 128-byte cache aligned.
Reduce the alignment to 64 bytes to save memory.

No behavior change for systems with 64-byte cache lines, as the minimum
cache line size is 64.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years ago bitmap: optimize for 128-bytes cache line
Jerin Jacob [Fri, 29 Jan 2016 07:45:54 +0000 (13:15 +0530)]
bitmap: optimize for 128-bytes cache line

The existing rte_bitmap library implementation is optimally configured to
run on 64-byte cache lines; extend it to 128-byte cache line targets.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years ago mbuf: fix performance with 128-byte cache line
Jerin Jacob [Fri, 29 Jan 2016 07:45:53 +0000 (13:15 +0530)]
mbuf: fix performance with 128-byte cache line

No need to split mbuf structure to two cache lines for 128-byte cache
line size targets as it can fit on a single 128-byte cache line.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years ago eal: introduce new cache line macros
Jerin Jacob [Fri, 29 Jan 2016 07:45:52 +0000 (13:15 +0530)]
eal: introduce new cache line macros

- RTE_CACHE_LINE_MIN_SIZE (supported minimum cache line size)
- __rte_cache_min_aligned (force minimum cache line alignment)
- RTE_CACHE_LINE_SIZE_LOG2 (express cache line size in terms of log2)
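
A minimal sketch of how the three macros might relate on a 128-byte cache line target. The EX_/ex_ names and the concrete values are assumptions for illustration, not the definitions from the patch:

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed values for a target with 128-byte cache lines. */
#define EX_CACHE_LINE_SIZE      128
#define EX_CACHE_LINE_SIZE_LOG2 7
#define EX_CACHE_LINE_MIN_SIZE  64

#define ex_cache_aligned     __attribute__((aligned(EX_CACHE_LINE_SIZE)))
#define ex_cache_min_aligned __attribute__((aligned(EX_CACHE_LINE_MIN_SIZE)))

static_assert((1 << EX_CACHE_LINE_SIZE_LOG2) == EX_CACHE_LINE_SIZE,
	      "log2 value must match the cache line size");

/* Fast-path data: aligned to the full cache line of the target. */
struct ex_fast_path {
	uint64_t pkts;
} ex_cache_aligned;

/* Slow-path data: the minimum alignment is enough and saves memory. */
struct ex_slow_path {
	uint64_t resets;
} ex_cache_min_aligned;
```

With these values the slow-path structure occupies 64 bytes instead of 128, which is the kind of saving the minimum alignment is meant to give.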

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
8 years ago config: clean cache line size selection scheme
Jerin Jacob [Mon, 7 Dec 2015 14:22:50 +0000 (19:52 +0530)]
config: clean cache line size selection scheme

By default, all targets are configured with a 64-byte cache line
size. Targets with a different cache line size can override it
through the target-specific config file.

Selected ThunderX and power8 as CONFIG_RTE_CACHE_LINE_SIZE=128 targets
based on existing configuration.

Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago config: add a common x86 flag
Thomas Monjalon [Fri, 5 Feb 2016 21:23:15 +0000 (22:23 +0100)]
config: add a common x86 flag

Intel Architecture (IA), also called x86, comes in three variants:
- i686
- x86_x32
- x86_64

The code common to all of these architectures can now be guarded
by a single flag RTE_ARCH_X86.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years ago config: remove obsolete machine descriptions
Thomas Monjalon [Fri, 5 Feb 2016 21:20:08 +0000 (22:20 +0100)]
config: remove obsolete machine descriptions

More and more machines and architectures are added without keeping
the lists up-to-date.
Replace the lists with a pointer to the reference directory.
The same kind of pointer is used for the supported compilers and environments.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years ago config: remove useless explicit includes of generated header
Thomas Monjalon [Mon, 8 Feb 2016 14:18:22 +0000 (15:18 +0100)]
config: remove useless explicit includes of generated header

The file rte_config.h is automatically generated and included.
No need to #include it.

The example performance-thread needs a makefile fix to avoid
overwriting the default cflags.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago doc: rename release notes 2.3 to 16.04
Bruce Richardson [Wed, 10 Feb 2016 17:02:12 +0000 (17:02 +0000)]
doc: rename release notes 2.3 to 16.04

Updated release documentation to reflect new numbering scheme.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago version: switch to year.month numbers
Bruce Richardson [Wed, 10 Feb 2016 17:02:11 +0000 (17:02 +0000)]
version: switch to year.month numbers

As discussed on list, switch numbering scheme to be based on year/month.
Release 2.3 then becomes 16.04.

    Ref: http://dpdk.org/ml/archives/dev/2015-December/030336.html

Also, added zero padding to the month so that it appears as 16.04 and
not 16.4 in "make showversion" and rte_version().
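
The zero padding can be illustrated with a "%02d" conversion. This is a sketch only; the EX_ macros and ex_format_version() are hypothetical names, not the macros used by rte_version():

```c
#include <assert.h>
#include <stdio.h>

#define EX_VER_YEAR  16 /* assumed year component */
#define EX_VER_MONTH 4  /* assumed month component */

/* Write "<year>.<month>" with the month padded to two digits.
 * Returns what snprintf() returns. */
int
ex_format_version(char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%d.%02d", EX_VER_YEAR, EX_VER_MONTH);
}
```

With a plain "%d" for the month the same call would produce 16.4, which sorts and reads differently from 16.04.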

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago doc: drop old naming of the project
Thomas Monjalon [Mon, 8 Feb 2016 10:30:07 +0000 (11:30 +0100)]
doc: drop old naming of the project

It was requested by Intel, more than one year ago, to replace the name
"Intel DPDK" by "DPDK".
Some references to the old name were still in some docs and code comments,
leading to confusion.

Fixes: ac8ada004c12 ("doc: remove Intel references from release notes")

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
8 years ago remove extra parentheses in return statement
Huawei Xie [Wed, 27 Jan 2016 13:58:30 +0000 (21:58 +0800)]
remove extra parentheses in return statement

Fix the error reported by checkpatch:
  "ERROR: return is not a function, parentheses are not required"

Remove parentheses around returned logical expressions, like:
  "return (logical expressions)"

Remove parentheses around returned function calls, like:
  "return (rte_mempool_lookup(...))"

Fixes: 6307b909b8e0 ("lib: remove extra parenthesis after return")

Signed-off-by: Huawei Xie <huawei.xie@intel.com>
8 years ago eal/linux: support built-in kernel modules
Kamil Rytarowski [Thu, 28 Jan 2016 13:13:54 +0000 (14:13 +0100)]
eal/linux: support built-in kernel modules

Currently rte_eal_check_module() detects Linux kernel modules by reading
/proc/modules. Built-in ones aren't listed there and therefore are not
found.

Add support for checking built-in modules by parsing the sysfs files.

This commit obsoletes the /proc/modules parsing approach.
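
A minimal sketch of a sysfs-style check, assuming a module counts as present when a directory named after it exists under a given root. ex_check_module() and its root parameter are hypothetical and are not the signature of rte_eal_check_module():

```c
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return 1 if <sysfs_root>/<name> is a directory, 0 if it is not,
 * and -1 if the path does not fit into the local buffer. */
int
ex_check_module(const char *sysfs_root, const char *name)
{
	char path[4096];
	struct stat st;
	int n;

	assert(sysfs_root != NULL && name != NULL);
	n = snprintf(path, sizeof(path), "%s/%s", sysfs_root, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (stat(path, &st) != 0)
		return 0; /* no such entry */
	return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

A caller on Linux would typically pass "/sys/module" as the root; taking the root as a parameter keeps the helper easy to test.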

Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years ago tools: support binding to built-in kernel modules
Kamil Rytarowski [Thu, 28 Jan 2016 13:13:53 +0000 (14:13 +0100)]
tools: support binding to built-in kernel modules

Currently dpdk_nic_bind.py detects Linux kernel modules by reading
/proc/modules. Built-in ones aren't listed there and therefore
are not found by the script.

Add support for checking built-in modules by parsing the sysfs files.

This commit obsoletes the /proc/modules parsing approach.

Signed-off-by: Kamil Rytarowski <kamil.rytarowski@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
8 years ago tools: support Python 3 in bind script
Dawid Jurczak [Wed, 27 Jan 2016 19:59:44 +0000 (20:59 +0100)]
tools: support Python 3 in bind script

This patch fixes syntax errors when binding an Ethernet device
on systems where Python 3 is the default.
Backward compatibility with Python 2 is preserved.

Signed-off-by: Dawid Jurczak <dawid_jurek@vp.pl>
8 years ago tools: fix unbinding failure handling
Jeff Shaw [Tue, 9 Feb 2016 00:33:46 +0000 (16:33 -0800)]
tools: fix unbinding failure handling

We should call sys.exit(), not divide sys by exit().

Signed-off-by: Jeff Shaw <jeffrey.b.shaw@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
8 years ago doc: introduce networking driver matrix
Thomas Monjalon [Wed, 27 Jan 2016 20:07:09 +0000 (21:07 +0100)]
doc: introduce networking driver matrix

In order to better compare the drivers and check what is missing
for a common baseline, we need to fill a matrix.

A CSS trick is used to fit the HTML page.
The PDF output needs some LaTeX wizardry.

Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
8 years ago doc: add a further example in ACL guide
Antonio Fischetti [Mon, 11 Jan 2016 17:44:54 +0000 (17:44 +0000)]
doc: add a further example in ACL guide

Add a further ACL example where the elements of the search key
do not fit entirely into the 4 consecutive bytes of all
input fields.

Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
8 years ago doc: fix multi-process guide
Ferruh Yigit [Tue, 2 Feb 2016 13:11:48 +0000 (13:11 +0000)]
doc: fix multi-process guide

* Remove the outdated chapter reference to Multi-process support.

* The HTML output converts "--" to "-", which is wrong when explaining
  command arguments; use fixed-width quotes for them.

Fixes: fc1f2750a3ec ("doc: programmers guide")

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
8 years ago eal/x86: fix build with clang for old AVX
Zhihong Wang [Thu, 4 Feb 2016 02:12:34 +0000 (21:12 -0500)]
eal/x86: fix build with clang for old AVX

When configuring RTE_MACHINE to "default", rte_memcpy implementation
is the default one (old AVX).
In this code, clang raises a warning thanks to -Wsometimes-uninitialized:

rte_memcpy.h:838:6: error:
variable 'srcofs' is used uninitialized whenever 'if' condition is false
        if (dstofss > 0) {
            ^~~~~~~~~~~
rte_memcpy.h:849:6: note: uninitialized use occurs here
        if (srcofs == 0) {
            ^~~~~~

It is fixed by moving srcofs initialization out of the condition.
Also dstofss calculation is corrected.
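
The shape of the fix can be sketched as follows: the source offset is computed unconditionally, after the optional adjustment, so it is initialized on every path. The function name, the 32-byte alignment and the arithmetic are assumptions for illustration and are not the code from rte_memcpy.h:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_ALIGN_MASK 0x1F /* assumed 32-byte alignment unit */

/* Store in *head the number of bytes to copy first so that dst becomes
 * aligned, and return the misalignment of src after that head copy. */
size_t
ex_src_offset_after_align(const void *dst, const void *src, size_t *head)
{
	size_t dstofss;
	size_t srcofs;

	assert(head != NULL);
	dstofss = (size_t)((uintptr_t)dst & EX_ALIGN_MASK);
	if (dstofss > 0)
		dstofss = (EX_ALIGN_MASK + 1) - dstofss;
	*head = dstofss;

	/* Outside the condition: assigned whether or not dst was aligned. */
	srcofs = (size_t)(((uintptr_t)src + dstofss) & EX_ALIGN_MASK);
	return srcofs;
}
```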

Fixes: 1ae817f9f887 ("eal/x86: tune memcpy for platforms without AVX512")

Reported-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago virtio: move ioport macros
Yuanhan Liu [Tue, 2 Feb 2016 13:48:20 +0000 (21:48 +0800)]
virtio: move ioport macros

virtio_pci.c is the only file that references the macros
VIRTIO_READ/WRITE_REG_X. Move them there.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: support specification 1.0
Yuanhan Liu [Tue, 2 Feb 2016 13:48:19 +0000 (21:48 +0800)]
virtio: support specification 1.0

A modern (v1.0) virtio PCI device defines several PCI capabilities.
Each cap has a corresponding configuration structure, and the
cap.bar and cap.offset fields tell us where to find it.

First, we map the PCI resources with rte_eal_pci_map_device().
We can then easily locate a cfg structure by:

    cfg_addr = dev->mem_resources[cap.bar].addr + cap.offset;

Therefore, the entry point for enabling modern (v1.0) PCI device support
is to iterate over the PCI capability list and locate the configs
we care about, which are:

- common cfg

  For generic virtio and virtqueue configuration, such as setting/getting
  features, enabling a specific queue, and so on.

- notify cfg

  Combining with `queue_notify_off' from common cfg, we could use it to
  notify a specific virt queue.

- device cfg

  Where virtio_net_config structure is located.

- isr cfg

  Where to read isr (interrupt status).

If any of the above caps is not found, we fall back to the legacy virtio
handling.

On success, hw->vtpci_ops is assigned to modern_ops, where all
operations are implemented by reading/writing one (or a few) specific
configuration fields in the above 4 cfg structures. And that's basically
how this patch works.

Besides those changes, virtio 1.0 introduces a new status field:
FEATURES_OK, which is set after features negotiation is done.

Last, set the VIRTIO_F_VERSION_1 feature flag.
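
The address computation described above (BAR base plus capability offset) can be sketched with a few sanity checks. The structures, field names and the limit of 6 BARs are assumptions for illustration; they are not the definitions used by the driver:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_BARS 6 /* assumed number of PCI BARs */

/* Hypothetical stand-ins for a mapped BAR and a virtio PCI capability. */
struct ex_mem_resource {
	void *addr;   /* mapped base address, NULL if not mapped */
	uint64_t len; /* length of the mapping in bytes */
};

struct ex_pci_cap {
	uint8_t bar;     /* which BAR holds the structure */
	uint32_t offset; /* offset of the structure inside the BAR */
	uint32_t length; /* length of the structure */
};

/* Return the address of the cfg structure, or NULL if the cap is invalid. */
void *
ex_get_cfg_addr(const struct ex_mem_resource *res, const struct ex_pci_cap *cap)
{
	const struct ex_mem_resource *r;

	assert(res != NULL && cap != NULL);
	if (cap->bar >= EX_MAX_BARS)
		return NULL;
	r = &res[cap->bar];
	if (r->addr == NULL)
		return NULL; /* BAR not mapped */
	if ((uint64_t)cap->offset + cap->length > r->len)
		return NULL; /* structure would run past the mapping */
	return (uint8_t *)r->addr + cap->offset;
}
```

Returning NULL for any inconsistent capability gives the caller a single condition on which to fall back to legacy handling.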

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago pci: export device mapping functions
Yuanhan Liu [Tue, 2 Feb 2016 13:48:18 +0000 (21:48 +0800)]
pci: export device mapping functions

Normally we could set the RTE_PCI_DRV_NEED_MAPPING flag so that EAL will
invoke pci_map_device internally for us. From that point of view, there
is no need to export pci_map_device.

However, for the virtio pmd driver, which is designed to work without
first binding to UIO (or something similar), pci_map_device() will fail,
which ends up with the virtio pmd driver being skipped. Therefore, we
cannot set RTE_PCI_DRV_NEED_MAPPING blindly in the virtio pmd driver.

Therefore, this patch exports pci_map_device and lets the virtio pmd call
it when necessary.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: David Marchand <david.marchand@6wind.com>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: retrieve header size from device setting
Yuanhan Liu [Tue, 2 Feb 2016 13:48:17 +0000 (21:48 +0800)]
virtio: retrieve header size from device setting

The mergeable virtio net hdr format has been the standard and the
only virtio net hdr format since virtio 1.0. Therefore, we can
no longer hardcode hdr_size to "sizeof(struct virtio_net_hdr)"
in virtio_recv_pkts(); otherwise, there would be a mismatch of
hdr size between rte_vhost_enqueue_burst() and virtio_recv_pkts(),
leading to packet corruption.

Instead, we should retrieve it from hw->vtnet_hdr_size; we will
do proper settings at eth_virtio_dev_init() in later patches.
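
A sketch of the size difference and of choosing the header size once per device. The struct layouts follow the usual virtio net header fields, but the ex_ names and the modern/mrg_rxbuf flags are assumptions rather than the driver's real structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Legacy header without the mergeable-buffers field. */
struct ex_virtio_net_hdr {
	uint8_t flags;
	uint8_t gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Mergeable header: the same fields plus a buffer count. */
struct ex_virtio_net_hdr_mrg_rxbuf {
	struct ex_virtio_net_hdr hdr;
	uint16_t num_buffers;
};

static_assert(sizeof(struct ex_virtio_net_hdr) == 10, "legacy header");
static_assert(sizeof(struct ex_virtio_net_hdr_mrg_rxbuf) == 12, "mergeable header");

/* Hypothetical device state; only vtnet_hdr_size is named in the text. */
struct ex_hw {
	int modern;    /* device follows virtio 1.0 */
	int mrg_rxbuf; /* mergeable RX buffers were negotiated */
	size_t vtnet_hdr_size;
};

/* Decide the header size once, so the RX path never hardcodes it. */
void
ex_set_hdr_size(struct ex_hw *hw)
{
	assert(hw != NULL);
	if (hw->modern || hw->mrg_rxbuf)
		hw->vtnet_hdr_size = sizeof(struct ex_virtio_net_hdr_mrg_rxbuf);
	else
		hw->vtnet_hdr_size = sizeof(struct ex_virtio_net_hdr);
}
```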

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: switch to 64 bit features
Yuanhan Liu [Tue, 2 Feb 2016 13:48:16 +0000 (21:48 +0800)]
virtio: switch to 64 bit features

Switch to 64 bit features, which virtio 1.0 supports.

Legacy virtio supports only 32 bit features, so it complains loudly
and quits when trying to set features beyond 32 bits.
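
A minimal sketch of the legacy-side check, assuming a 32-bit feature register and a 64-bit feature word. The function is hypothetical, and it returns an error where the text above describes the driver quitting:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Write the low 32 feature bits to *reg. Refuse any feature bit at
 * position 32 or above, which a legacy device cannot express. */
int
ex_legacy_set_features(uint64_t features, uint32_t *reg)
{
	assert(reg != NULL);
	if ((features >> 32) != 0) {
		fprintf(stderr, "legacy virtio: only 32 bit features are allowed\n");
		return -1;
	}
	*reg = (uint32_t)features;
	return 0;
}
```

VIRTIO_F_VERSION_1 is feature bit 32, so it is exactly the kind of bit this check rejects for a legacy device.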

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: move left PCI stuff in the right file
Yuanhan Liu [Tue, 2 Feb 2016 13:48:15 +0000 (21:48 +0800)]
virtio: move left PCI stuff in the right file

virtio_pci.c is a more proper place for pci stuff; virtio_ethdev is not.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: introduce PCI implementation structure
Yuanhan Liu [Tue, 2 Feb 2016 13:48:14 +0000 (21:48 +0800)]
virtio: introduce PCI implementation structure

Introduce struct virtio_pci_ops, to let legacy virtio (v0.95) and
modern virtio (1.0) have different implementations of a
specific pci action, such as reading the host status.

With that, this patch reimplements all exported pci functions, in
a way like:

    vtpci_foo_bar(struct virtio_hw *hw)
    {
            hw->vtpci_ops->foo_bar(hw);
    }

This way, we only need to pay attention to those pci related functions
when adding virtio 1.0 support.

This patch introduces a new vtpci function, vtpci_init(), to do
proper virtio pci settings. It's pretty simple so far: it just sets
hw->vtpci_ops to legacy_ops as we don't support 1.0 yet.
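
The dispatch pattern described above can be sketched as a table of function pointers selected at init time. All ex_ names are hypothetical, and the two status operations are stand-ins for the real set of pci actions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ex_hw;

/* One entry per pci action; legacy and modern devices would each
 * provide their own table. */
struct ex_pci_ops {
	uint8_t (*get_status)(struct ex_hw *hw);
	void (*set_status)(struct ex_hw *hw, uint8_t status);
};

struct ex_hw {
	const struct ex_pci_ops *ops;
	uint8_t status_reg; /* stands in for a device register */
};

static uint8_t
ex_legacy_get_status(struct ex_hw *hw)
{
	return hw->status_reg;
}

static void
ex_legacy_set_status(struct ex_hw *hw, uint8_t status)
{
	hw->status_reg = status;
}

static const struct ex_pci_ops ex_legacy_ops = {
	.get_status = ex_legacy_get_status,
	.set_status = ex_legacy_set_status,
};

/* Only the legacy table exists in this sketch. */
void
ex_vtpci_init(struct ex_hw *hw)
{
	assert(hw != NULL);
	hw->ops = &ex_legacy_ops;
}

/* Exported wrappers only forward to the selected implementation. */
uint8_t
ex_vtpci_get_status(struct ex_hw *hw)
{
	return hw->ops->get_status(hw);
}

void
ex_vtpci_set_status(struct ex_hw *hw, uint8_t status)
{
	hw->ops->set_status(hw, status);
}
```

Adding a second implementation then means adding another table and changing only the selection in the init function.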

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: define offset as size_t type
Yuanhan Liu [Tue, 2 Feb 2016 13:48:13 +0000 (21:48 +0800)]
virtio: define offset as size_t type

offset arg of vtpci_read/write_dev_config is derived from offsetof(),
which is of size_t type, instead of uint64_t. So, define it as size_t
type.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago virtio: do not set vring address again at queue startup
Yuanhan Liu [Tue, 2 Feb 2016 13:48:12 +0000 (21:48 +0800)]
virtio: do not set vring address again at queue startup

We have already set it up in virtio_dev_queue_setup(), and a vq
restart will not reset the settings.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Reviewed-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Tested-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Acked-by: Huawei Xie <huawei.xie@intel.com>
8 years ago doc: add example text to release notes
John McNamara [Mon, 1 Feb 2016 15:24:47 +0000 (15:24 +0000)]
doc: add example text to release notes

Added example text to each of the release notes sections to show
the preferred format.

Signed-off-by: John McNamara <john.mcnamara@intel.com>
8 years ago eal: move cpu flags out of headers
Ferruh Yigit [Thu, 28 Jan 2016 12:20:23 +0000 (12:20 +0000)]
eal: move cpu flags out of headers

Move cpu_feature_table array from arch specific rte_cpuflags.h files to
new arch specific rte_cpuflags.c files.

The main motivation is to avoid static variable declarations in
header files. cpu_feature_table has many copies in the final binary; it
even exists in some object files that do not use this variable at all.

This can also serve as a sample for creating architecture specific source
files and moving functions which are not performance sensitive from
architecture header files to source files.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
8 years ago lib: remove keyword extern for functions
Ferruh Yigit [Thu, 28 Jan 2016 14:31:23 +0000 (14:31 +0000)]
lib: remove keyword extern for functions

Remove the "extern" keyword from function prototypes
in header files.

Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
8 years ago vfio: support no-IOMMU mode
Anatoly Burakov [Thu, 28 Jan 2016 11:57:54 +0000 (11:57 +0000)]
vfio: support no-IOMMU mode

This commit is adding a generic mechanism to support multiple IOMMU
types. For now, it's only type 1 (x86 IOMMU) and no-IOMMU (a special
VFIO mode that doesn't use IOMMU at all), but it's easily extended
by adding necessary definitions to eal_vfio.h, and DMA mapping
functions to eal_pci_vfio.c.

Since type 1 IOMMU module is no longer necessary to have VFIO,
we fix the module check to check for vfio-pci instead. It's not
ideal and triggers VFIO checks more often (and thus produces more
error output, which was the reason behind the module check in the
first place), so we compensate for that by providing more verbose
logging, indicating whether VFIO initialization has succeeded or
failed.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Signed-off-by: Santosh Shukla <sshukla@mvista.com>
Tested-by: Santosh Shukla <sshukla@mvista.com>
8 years ago eal/x86: fix build with gcc 5.3.1
Michael Qiu [Thu, 28 Jan 2016 07:30:34 +0000 (15:30 +0800)]
eal/x86: fix build with gcc 5.3.1

On Fedora 22 with GCC version 5.3.1, compilation
fails with an error:

    include/rte_memcpy.h:309:7: error: "RTE_MACHINE_CPUFLAG_AVX2"
                                is not defined [-Werror=undef]
    #elif RTE_MACHINE_CPUFLAG_AVX2
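
The warning arises because "#elif NAME" evaluates NAME as an expression, and -Wundef complains when NAME was never defined. Testing with defined() avoids that. The sketch below uses hypothetical EX_HAVE_* macros rather than the RTE_MACHINE_CPUFLAG_* ones, and it is not the exact change made by this patch:

```c
#include <assert.h>

/* Would trigger -Wundef when EX_HAVE_AVX2 is not defined:
 *     #elif EX_HAVE_AVX2
 * The defined() form below is safe whether or not the macro exists. */
#if defined(EX_HAVE_AVX512F)
#define EX_MEMCPY_IMPL "avx512"
#elif defined(EX_HAVE_AVX2)
#define EX_MEMCPY_IMPL "avx2"
#else
#define EX_MEMCPY_IMPL "sse"
#endif

static_assert(sizeof(EX_MEMCPY_IMPL) > 1, "an implementation must be selected");

/* Report which implementation was selected at compile time. */
const char *
ex_memcpy_impl(void)
{
	return EX_MEMCPY_IMPL;
}
```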

Fixes: 9484092baad3 ("eal/x86: optimize memcpy for AVX512 platforms")

Signed-off-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago app/test: adjust alignment unit for memcpy performance
Zhihong Wang [Mon, 18 Jan 2016 03:05:13 +0000 (22:05 -0500)]
app/test: adjust alignment unit for memcpy performance

Decide alignment unit for memcpy perf test based on predefined macros.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago eal/x86: tune memcpy for platforms without AVX512
Zhihong Wang [Mon, 18 Jan 2016 03:05:14 +0000 (22:05 -0500)]
eal/x86: tune memcpy for platforms without AVX512

For prior platforms, add a condition for unaligned handling, to keep this
operation from interrupting the batch copy loop for aligned cases.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago eal/x86: optimize memcpy for AVX512 platforms
Zhihong Wang [Mon, 18 Jan 2016 03:05:12 +0000 (22:05 -0500)]
eal/x86: optimize memcpy for AVX512 platforms

Implement AVX512 memcpy and choose the right implementation based on
predefined macros, to make full utilization of hardware resources and
deliver high performance.

In current DPDK, memcpy holds a large proportion of execution time in
libs like Vhost, especially for large packets, and this patch can bring
considerable benefits for AVX512 platforms.

The implementation is based on the current DPDK memcpy framework, some
background introduction can be found in these threads:
http://dpdk.org/ml/archives/dev/2014-November/008158.html
http://dpdk.org/ml/archives/dev/2015-January/011800.html

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago eal/x86: identify AVX512 CPU flag
Zhihong Wang [Mon, 18 Jan 2016 03:05:10 +0000 (22:05 -0500)]
eal/x86: identify AVX512 CPU flag

Read CPUID to check if AVX512 is supported by CPU.
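
For illustration, AVX512F support is reported in bit 16 of EBX for CPUID leaf 7, subleaf 0. The sketch below reads it with the <cpuid.h> helper shipped with GCC and Clang and returns 0 on other architectures. The ex_ names are hypothetical, and this is not the EAL feature table code:

```c
#include <assert.h>
#include <stdint.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

#define EX_AVX512F_BIT 16 /* CPUID.(EAX=7,ECX=0):EBX bit 16 */

static_assert(EX_AVX512F_BIT < 32, "bit index must fit in EBX");

/* Decode the AVX512F bit from a raw EBX value. */
int
ex_ebx_has_avx512f(uint32_t ebx)
{
	return (int)((ebx >> EX_AVX512F_BIT) & 1u);
}

/* Return 1 if the CPU reports AVX512F, 0 otherwise. */
int
ex_cpu_has_avx512f(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
		return 0; /* leaf 7 not available */
	return ex_ebx_has_avx512f(ebx);
#else
	return 0;
#endif
}
```

This checks only what the CPU reports; whether the operating system has enabled the wider register state is a separate question that the sketch does not cover.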

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago mk: predefine AVX512 macro for compiler
Zhihong Wang [Mon, 18 Jan 2016 03:05:11 +0000 (22:05 -0500)]
mk: predefine AVX512 macro for compiler

Predefine AVX512 macro if AVX512 is enabled by compiler.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
8 years ago examples/l3fwd: handle SIGINT and SIGTERM
Zhihong Wang [Wed, 30 Dec 2015 21:59:51 +0000 (16:59 -0500)]
examples/l3fwd: handle SIGINT and SIGTERM

Handle SIGINT and SIGTERM in l3fwd.
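
A common shape for this kind of handling is a handler that only sets a flag, with the main loop polling the flag so it can leave cleanly. The sketch below assumes that shape; the ex_ names are hypothetical and this is not the code added to l3fwd:

```c
#include <assert.h>
#include <signal.h>

/* Set from the handler, polled by the main loop. */
static volatile sig_atomic_t ex_force_quit;

static void
ex_signal_handler(int signum)
{
	if (signum == SIGINT || signum == SIGTERM)
		ex_force_quit = 1;
}

/* Install the handler for both signals; return 0 on success. */
int
ex_install_handlers(void)
{
	if (signal(SIGINT, ex_signal_handler) == SIG_ERR)
		return -1;
	if (signal(SIGTERM, ex_signal_handler) == SIG_ERR)
		return -1;
	return 0;
}

/* Run until a signal arrives or max_iters is reached; return the
 * number of iterations completed. */
unsigned long
ex_main_loop(unsigned long max_iters)
{
	unsigned long n = 0;

	assert(max_iters > 0);
	while (!ex_force_quit && n < max_iters)
		n++; /* a real application would poll its ports here */
	return n;
}
```

After the loop returns, the application can stop and close its ports before exiting instead of being killed in the middle of an operation.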

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
8 years ago examples/l2fwd: handle SIGINT and SIGTERM
Zhihong Wang [Wed, 30 Dec 2015 21:59:50 +0000 (16:59 -0500)]
examples/l2fwd: handle SIGINT and SIGTERM

Handle SIGINT and SIGTERM in l2fwd.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
8 years ago app/testpmd: handle SIGINT and SIGTERM
Zhihong Wang [Wed, 30 Dec 2015 21:59:49 +0000 (16:59 -0500)]
app/testpmd: handle SIGINT and SIGTERM

Handle SIGINT and SIGTERM in testpmd.

Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Michael Qiu <michael.qiu@intel.com>