Fan Zhang [Tue, 18 Apr 2017 11:34:31 +0000 (12:34 +0100)]
examples/l2fwd-crypto: add cryptodev mask option
Previously, l2fwd-crypto application did not give user the
flexibility to decide which crypto device(s) will be used.
In this patch, a new cryptodev_mask option is added to the
application. Same as portmask, the cryptodev_mask avails the
user to mask out the unwanted crypto devices in the system.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Pablo de Lara [Mon, 17 Apr 2017 12:23:37 +0000 (13:23 +0100)]
examples/l2fwd-crypto: fix AEAD tests when AAD is zero
For AEAD algorithms, additional authenticated data (AAD)
can be passed, but it is optional, so its size can be zero.
However, it is required to set this length to zero in the crypto
operation to avoid undefined behaviour.
Fixes: 617a7949c98a ("examples/l2fwd-crypto: parse AAD parameter") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Wed, 12 Apr 2017 09:26:26 +0000 (10:26 +0100)]
app/crypto-perf: fix AEAD tests when AAD is zero
For AEAD algorithms, additional authenticated data (AAD)
can be passed, but it is optional, so its size can be zero.
Therefore, test can be run if no memory is allocated.
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Mon, 10 Apr 2017 09:58:07 +0000 (10:58 +0100)]
app/crypto-perf: fix length for wireless algos
When SNOW3G/KASUMI/ZUC algorithms are used, ciphering
and authentication lengths have to be passed as bits
and not as bytes.
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Fan Zhang [Tue, 18 Apr 2017 11:28:14 +0000 (12:28 +0100)]
crypto/scheduler: improve parameters parsing
This patch improves the cryptodev scheduler PMD's commandline
parsing capability. Originally, the scheduler's slave option
requires the slave vdev(s) being declared prior to it. This
patch removes this limitation by storing the slave names
temporarily and attaching them later.
Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Tue, 18 Apr 2017 11:39:39 +0000 (12:39 +0100)]
test/crypto: create unique driver name
Since commit <dda987315ca2> ("vdev: make virtual bus use
its device struct"), rte_eal_vdev_init cannot be called
with same name twice.
If several devices with the same driver are needed
(as in the crypto scheduler test), then driver name argument
has to be unique, concatenating the driver name and an index.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Fan Zhang <roy.fan.zhang@intel.com>
Pablo de Lara [Tue, 18 Apr 2017 11:39:38 +0000 (12:39 +0100)]
test/crypto: create only one virtual device if needed
Instead of creating two virtual devices per PMD, if they have not
been initialized from EAL already, create only one, as only the
first device is used for the crypto tests.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
This patch fixes the crypto operation resubmission problem in crypto
perferformance test. Originally, when needed crypto ops amount is
smaller than the enqueued crypto ops in the last round, one or more
processed crypto operations will be re-enqueued.
Fixes: f8be1786b1b8 ("app/crypto-perf: introduce performance test application") Cc: stable@dpdk.org Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Tue, 11 Apr 2017 14:42:59 +0000 (15:42 +0100)]
app/crypto-perf: fix possible overflow
In the latency test, when number of enqueued operations
is less than the burst size, the timestamp value of the
non-enqueued operations was being stored, even though
those operations were being freed.
This could cause an array overflow, since it could store
more values than the total number of operations.
Fixes: 5d75fb09d3be ("app/crypto-perf: fix invalid latency for QAT") Cc: stable@dpdk.org Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Fan Zhang [Mon, 10 Apr 2017 15:00:54 +0000 (16:00 +0100)]
crypto/scheduler: fix queue pair configuration
This patch fixes the queue pair configuration for the scheduler PMD.
The queue pairs of a scheduler may have different nb_descriptors sizes,
which was not the case. Also, the maximum available objects in a
queue pair is 1 object smaller than nb_descriptors. This patch fixes
these issues.
Fixes: a783aa634410 ("crypto/scheduler: add packet size based mode") Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com> Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
On IBM POWER platform, when mapping /dev/zero file to hugepage memory
space, mmap will not respect the requested address hint. This will cause
the memory initialization for the second process fails. This patch adds
the required mmap flags to make it work. Beside this, users need to set
the nr_overcommit_hugepages to expand the VA range. When
doing the initialization, users need to set both nr_hugepages and
nr_overcommit_hugepages to the same value, like 64, 128, etc.
When we pass --log-level=0, it disables the logs. This level is
not displayed properly by the function that dumps the registered log
types (it shows "unknown"). Show "disabled" instead.
Before:
./build/app/test --log-level=0
RTE>>dump_log_types
global log level is unknown
...
After:
./build/app/test --log-level=0
RTE>>dump_log_types
global log level is disabled
...
Fix misuse of regular expression functions, which was producing a
segfault.
After the patch, it works properly:
$ ./build/app/test --no-huge --log-level=pmd,3
RTE>>dump_log_types
[...]
id 30: user7, level is debug
id 31: user8, level is debug
id 32: pmd.i40e.init, level is critical
id 33: pmd.i40e.driver, level is critical
Coverity issue: 143472 Fixes: a5279180f510 ("eal: change several log levels matching a regexp") Reported-by: Jianfeng Tan <jianfeng.tan@intel.com> Signed-off-by: Olivier Matz <olivier.matz@6wind.com> Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
This field is only used in the initialization phase. Remove it since the
global log level can also be retrieved using a public API:
rte_log_get_global_level().
The initialization of the default log level (from configuration) was
removed by mistake in a previous commit. The global log level was
wrongly set to debug when no --log-level argument was passed. Restore
this initialization.
Before:
$ ./build/app/test
RTE>>dump_log_types
global log level is debug
...
After:
$ ./build/app/test
RTE>>dump_log_types
global log level is info
...
Pablo de Lara [Wed, 19 Apr 2017 14:06:34 +0000 (15:06 +0100)]
log: redefine logtype values
After the changes in commit c1b5fa94a46f
("eal: support dynamic log types"), logtype is not treated as a
bitmask, but a decimal value. Therefore, values have to be
converted.
Fixes: c1b5fa94a46f ("eal: support dynamic log types") Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com> Acked-by: Olivier Matz <olivier.matz@6wind.com>
Bruce Richardson [Fri, 14 Apr 2017 15:18:15 +0000 (16:18 +0100)]
event/sw: fix events mis-identified as needing reorder
When taking events from a port, we checked the history list to check if the
event needed to be put back in order i.e. originally came from a reordered
queue type. The check for reordering involved checking if the reorder
buffer entry pointer was null. However, after that pointer was used it was
never cleared to null again.
This caused problems when we had mixed reordered and atomic or parallel
events, as the events from the latter two queue types were misidentified as
needing reordering. This let in some cases to crashes, but mostly led to
dropping events, and then application lock-up.
Fixes: 617995dfc5b2 ("event/sw: add scheduling logic") Signed-off-by: Bruce Richardson <bruce.richardson@intel.com> Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
Harry van Haaren [Tue, 18 Apr 2017 09:58:40 +0000 (10:58 +0100)]
event/sw: fix credit return on invalid queue id
This patch returns a credit when an rte_event is
enqueued with an invalid queue_id. Previously a
credit was leaked from the system.
Note that the eventdev instance does not attempt
to free any resources that the rte_event owns. As
a result, resources owned by the rte_event are leaked.
Eg. if the rte_event represents an rte_mbuf, the mbuf
will not be freed, and causes a leak from the mempool.
Fixes: 656af9180014 ("event/sw: add worker core functions") Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: David Hunt <david.hunt@intel.com>
Harry van Haaren [Mon, 10 Apr 2017 15:56:43 +0000 (16:56 +0100)]
event/sw: fix hashing of flow on ordered ingress
The flow id of packets was not being hashed on ingress
on an ordered queue. Fix by applying same hashing as is
applied in the atomic queue case. The hashing itself is
broken out into a macro to avoid duplication of code.
Fixes: 617995dfc5b2 ("event/sw: add scheduling logic") Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Yuanhan Liu [Wed, 19 Apr 2017 05:26:01 +0000 (13:26 +0800)]
vhost: fix dequeue zero copy
For zero copy mode, we need pin the mbuf to not let the underlaying PMD
driver (or the app) free the mbuf. Currently, only the heading mbuf is
pinned. However, the mbuf free function would try to free all mbufs
in the mbuf chain (-1 to the refcnt). This may lead the head mbuf being
still pinned, while the other subsequent mbufs are actually freed. Which
is wrong.
It becomes more fatal after the mbuf refactor, more specificly, after
the commit 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool"). The
refcnt resets to 1 after the last real reference. OTOH, it leads to a
situtation that we never know one mbuf is actually freed or not. This
would result the mbuf __just__ after the heading mbuf being freed twice:
it's firstly freed (and put back to mempool) when the underlaying PMD
finishes the DMA. Later, it will then be freed again when vhost unpins
it. Meaning, one mbuf may be returned to the mempool twice, while in
turn, being allocated twice later. Something uncertain may happen then.
For example, the VM2VM case becomes broken.
Fixes: b0a985d1f340 ("vhost: add dequeue zero copy") Cc: stable@dpdk.org Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Zhiyong Yang [Wed, 19 Apr 2017 06:29:21 +0000 (14:29 +0800)]
net/virtio: support to turn on/off traffic flow
Current virtio_dev_stop only disables interrupt and marks link down,
When it is invoked, tx/rx traffic flows still work. This is a strange
behavior. The patch supports the switch of flow.
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Jianfeng Tan [Wed, 19 Apr 2017 02:30:33 +0000 (02:30 +0000)]
net/virtio-user: fix address on 32-bit system
virtio-user cannot work on 32-bit system as higher 32-bit of the
addr field (64-bit) in the desc is filled with non-zero value
which should not happen for a 32-bit system.
In case of virtio-user, we use buf_addr of mbuf to fill the
virtqueue desc addr. This is a regression bug. For 32-bit system,
the first 4 bytes of mbuf is buf_addr, with following 8 bytes for
buf_phyaddr. With below wrong definition, both buf_addr and lower
4 bytes buf_phyaddr are obtained to fill the virtqueue desc.
#define VIRTIO_MBUF_ADDR(mb, vq) \
(*(uint64_t *)((uintptr_t)(mb) + (vq)->offset))
Fixes: 25f80d108780 ("net/virtio: fix packet corruption") Cc: stable@dpdk.org Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 14 Apr 2017 07:53:18 +0000 (15:53 +0800)]
vhost: avoid memory write on net header when necessary
Like what we did for virtio PMD driver [0][1], we could also apply such
trick to vhost, to avoid the memory write on net header when necessary.
[0]: c9ea670c1dc7 ("net/virtio: fix performance regression due to TSO")
[1]: 16994abee215 ("net/virtio: optimize header reset on any layout")
With this, the cache issue of the mergeable path is again greatly reduced:
even the write of "num_buffers" could be avoided. A quick PVP test shows
the gap between the mergeable Rx and non-mergeable Rx is pretty small now:
they are basically the same in my test.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Fri, 14 Apr 2017 06:10:30 +0000 (06:10 +0000)]
net/virtio-user: fix link status
Previously, we miss to set intr_handle->fd which will be used as
target file for epoll to check LSC.
As a result, stdin (0) is used and intr thread keeps busy whenever
data comes from stdin.
To fix this, we use vhostfd as the target file for epoll to check
the link status change events. And we move intr_handle initialization
after vhost backend settup to make sure vhostfd is initialized.
Fixes: 35c4f8554833 ("net/virtio-user: support to report net status") Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Fri, 14 Apr 2017 06:36:45 +0000 (14:36 +0800)]
net/virtio: fix link status always being up
The virtio port link status will always be UP, even the port is stopped:
testpmd> port stop 0
Stopping ports...
Checking link statuses...
Port 0 Link Up - speed 10000 Mbps - full-duplex
Done
The link status is queried by link_update callback when LSC is disabled.
Which in turn queries the "status" field. However, the "status" is
read-only. I couldn't think of some proper ways to change the status
without doing device reset.
Instead of doing (the heavy) reset at stop, this patch introduced a flag,
which is set to 1 and 0 on start and stop, respectively. When it's set to
0, the link status is set to DOWN unconditionally.
Fixes: a85786dc816f ("virtio: fix states handling during initialization") Cc: stable@dpdk.org Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 17 Apr 2017 07:27:04 +0000 (15:27 +0800)]
vhost: fix use after free
A "return" is missing on error, which could lead to a "use after free"
issue (about var "conn").
Coverity issue: 143476 Fixes: 65388b43f592 ("vhost: fix fd leaks for vhost-user server mode") Reported-by: John McNamara <john.mcnamara@intel.com> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jianfeng Tan [Thu, 13 Apr 2017 10:11:27 +0000 (10:11 +0000)]
net/virtio-user: fix feature negotiation
The feature negotiation in virtio-user is proven to be broken,
which results in device initialization failure.
Originally, we get features from vhost backend, and remove those
that are not supported. But when new feature is added, for example,
VIRTIO_NET_F_MTU, we fail to remove this new feature. Then, this
new feature will be negotiated, as both frontend and backend claim
to support this feature.
To fix it, we add a macro to record supported features, as a filter
to remove newly added features.
Fixes: 37a7eb2ae816 ("net/virtio-user: add device emulation layer") Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Xiao Wang [Tue, 11 Apr 2017 10:44:28 +0000 (03:44 -0700)]
net/virtio: fix queue notify
According to spec, we should write virtqueue index into the notify
address, rather than 1. Besides, some HW backend may rely on the data
written to identify which queue need to serve.
Fixes: 6ba1f63b5ab0 ("virtio: support specification 1.0") Cc: stable@dpdk.org Signed-off-by: Xiao Wang <xiao.w.wang@intel.com> Reviewed-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
With the Enhanced multi packet send addition, the defaults were made
in order to get the maximum out of the box performance.
Features like tso, don't use the enhanced send, however the defaults
are still valid. This cause Tx queue creation to fail.
Currently the argument process is done without indication which
parameter was forced by the application and which one is on it
default value.
This becomes problematic when different features requires different
defaults. For example, Enhanced multi packet send and TSO.
This commit modifies the argument process, enabling to differ
which parameter was forced by the application.
Andrew Rybchenko [Tue, 18 Apr 2017 13:03:08 +0000 (14:03 +0100)]
net/sfc: correct RSS hash availability condition
RSS hash is computed by hardware if corresponding Rx filter (for
example, default Rx filters) has RSS flag set which is set if
the number of RSS channels is greater than zero.
Fixes: 4ec1fc3ba881 ("net/sfc: add basic stubs for RSS support on driver attach") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Tue, 18 Apr 2017 13:03:06 +0000 (14:03 +0100)]
net/sfc: reset RSS channels back to 0 on close
Fixes: 4ec1fc3ba881 ("net/sfc: add basic stubs for RSS support on driver attach") Cc: stable@dpdk.org Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Andrew Rybchenko [Tue, 18 Apr 2017 12:51:01 +0000 (13:51 +0100)]
net/sfc: remove logically dead code
Coverity issue: 1419717 Fixes: a9825ccf5bb8 ("net/sfc: support flow API filters") Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com> Reviewed-by: Andy Moreton <amoreton@solarflare.com>
Current implementation is error-prone if the max inline size
(txq->max_inilne) is decoupled from txq->inline_en and becomes zero. If it
becomes zero, HW can crash due to WQ overflow.
Qiming Yang [Thu, 13 Apr 2017 03:08:33 +0000 (11:08 +0800)]
doc: add known igb_uio issue for i40e
When insmod "igb_uio" with "intr_mode=legacy and test link
status interrupt. Since INTx interrupt is not supported by
X710/XL710/XXV710, it will cause Input/Output error when
reading file descriptor.
Signed-off-by: Qiming Yang <qiming.yang@intel.com>
Acked-by Jingjing Wu <jingjing.wu@intel.com> Acked-by: John McNamara <john.mcnamara@intel.com>
This patch corrects the description on Physical and Virtual Function
Infrastructure of Intel NICs. The RSS part description should belong
to ixgbe but not i40e.
This patch also add more notes to describe the queue number on Intel
X710/XL710 NICs.
Jerin Jacob [Fri, 14 Apr 2017 09:41:07 +0000 (15:11 +0530)]
net/thunderx: reduce writes to mbuf
With the mbuf rework, we now have 8 contiguous bytes to be
rearmed in the mbuf at 8B naturally aligned address.
Use single 8B write to avoid multiple 2B writes in Rx path.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Charles Myers [Thu, 13 Apr 2017 22:15:24 +0000 (12:15 -1000)]
net/mlx4: fix Rx after mbuf alloc failure
Fixes issue where mlx4 driver stops receiving packets when mbuf
allocation fails in mlx4_rx_burst().
This issue appears to be caused because the code doesn't recycle the
existing mbuf to the sges array when mbuf allocation fails as is done
in the code right above it which handles (wc.status != IBV_WC_SUCCESS).
Copying the code from the above case fixes the issue.
Fixes: acac55f16412 ("mlx4: use MOFED 3.0 fast verbs interface for Rx operations") Cc: stable@dpdk.org Signed-off-by: Charles Myers <charles.myers@spirent.com> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Result:
EAL: Initializing pmd_bond for eth_bond0
EAL: Create bonded device eth_bond0 on port 4 in mode 2 on socket 0.
rte_eth_dev_configure: ethdev port_id=4 \
max_rx_pkt_len 9018 > max valid value 2048
It's expected that slaves will be added to bonded device inside
'rte_eth_dev_configure' and proper 'max_rx_pktlen' configured
for all of them.
Failure happens because of hardcoded low value of 'max_rx_pktlen'.
Increasing of this value to ETHER_MAX_JUMBO_FRAME_LEN will allow
above scenario (attach + configure).
It is important because it is the way OVS wants to work with
all DPDK devices (including virtual).
Changing the default hardcoded value makes no harm because
all the slaves' related code uses only 'candidate_max_rx_pktlen'
variable.
Some applications like OVS knows nothing about the
device type and wants to use same API to work with
all of them. But bond_pmd, unlike other pmds, requires
additional step (removing of all the slaves) before
closing the device.
In fact that bond_pmd automatically adds all the
devices from kvargs to bonding on configuration it
also should remove all of them on close.
This change is intended to have the same API for physical
and virtual devices. It allows us to handle virtual
devices in OVS in a common way.
DPDK community has several emails discussion on this topic,
these mails link is bellow:
http://dpdk.org/ml/archives/dev/2017-March/060379.html,
http://dpdk.org/ml/archives/dev/2017-March/060295.html,
items like VLAN can already have several valid "types"
(0x88a8, 0x8100, 0x9100), and who knows what will come up
in the future.
And ixgbe_flow just ignores the types when do filter configuration.
So it may be reasonable to delete the related tpid check process.
Also add some more comment log on stack explanation.
Since the queue release API does not allow failures (return value is void)
and the flow API does not allow a queue to be released as long as a flow
rule depends on it, the only rational decision to avoid undefined behavior
is to panic in this situation.
VF link status rely on PF's notification, so when PF link status
be updated, it should notify VF to update link status also.
Current implementation only cover part of the situation when PF's link
status is updated, call i40e_notify_all_vfs_link_status in
i4e_dev_link_update will cover all situationa.
The patch adds 4 APIs to support configurable
PTYPE mapping for i40e device.
rte_pmd_i40e_ptype_mapping_get.
rte_pmd_i40e_ptype_mapping_replace.
rte_pmd_i40e_ptype_mapping_reset.
rte_pmd_i40e_ptype_mapping_update.
The mapping from hardware defined packet type to software defined packet
type can be updated/reset/read out with these APIs.
Also a software ptype with the most significent bit set will be regarded
as a user defined ptype (RTE_PMD_I40E_PTYPE_USER_DEFINE_MASK) so
application can use it to defined its own PTYPE naming system.
The mapping from hardware defined packet type to software defined
packet type is static for i40e device, the patch let each ethdev to
to have their own copy of mapping table, this give the possibility
that different ethdev can be set different PTYPE mapping rule which
is the requirement to support following hardware's dynamic PTYPE
feature.
Add a section in NIC drivers documentation to explain compiling and
testing of a PMD. It also mentions about host setup, which is required
before running testpmd.
Add label "testpmd_ug" to refer user guide.
Signed-off-by: Shijith Thotton <shijith.thotton@caviumnetworks.com> Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Acked-by: John McNamara <john.mcnamara@intel.com>
As the hardware determines which core will process which packet,
performance is boosted by direct cache warming/stashing as well
as by providing biasing for core-to-flow affinity, which ensures
that flow-specific data structures can remain in the core’s cache.
This patch enables the one cache line data stashing for packet
annotation data and packet context