Ferruh Yigit [Fri, 26 Aug 2016 11:17:41 +0000 (12:17 +0100)]
net/pcap: check max queue number
Number of queues is defined by devargs, a check added to be sure this
number is not bigger than configured max number of queue.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:40 +0000 (12:17 +0100)]
net/pcap: reorganize private structs
struct rx_pcaps and tx_pcaps used to point parsed devargs, but it is not
clear with current names.
Merged both into single struct and modified struct name and field names.
Functionality not changed, only struct names.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:39 +0000 (12:17 +0100)]
net/pcap: use macros for parameter string
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ferruh Yigit [Fri, 26 Aug 2016 11:17:38 +0000 (12:17 +0100)]
net/pcap: convert config option to a macro
pcap PMD is using ring PMD configuration parameters to set max number of
queues. This creates an unnecessary dependency and confusion.
Stop using configuration parameter to set max number of queues and
convert this variable into a macro within source code, to simplify
configuration file.
Default value of macro is same as ring parameter's default.
pcap pmd doesn't need to be configured in a detail to set rx and tx max
queue numbers separately, so using same macro for both queues.
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Ali Volkan Atli [Wed, 27 Jul 2016 13:11:09 +0000 (16:11 +0300)]
net/e1000: fix returned number of available Rx descriptors
Fixes:
0f6b7c7f7a37 ("igb: use DD bit to count RX available descriptors")
Signed-off-by: Ali Volkan Atli <volkan.atli@argela.com.tr>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:55 +0000 (13:53 +0200)]
net/mlx5: fix inline logic
To improve performance the NIC expects for large packets to have a pointer
to a cache aligned address, old inline code could break this assumption
which hurts performance.
Fixes:
2a66cf378954 ("net/mlx5: support inline send")
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Signed-off-by: Vasily Philipov <vasilyf@mellanox.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:54 +0000 (13:53 +0200)]
net/mlx5: re-factorize functions
Rework logic of wqe_write() and wqe_write_vlan() which are pretty similar
to keep a single one.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Nélio Laranjeiro [Wed, 14 Sep 2016 11:53:53 +0000 (13:53 +0200)]
net/mlx5: force inline for completion function
This function was supposed to be inlined, but was not because several
functions calls it. This function should always be inline avoid
external function calls and to optimize code in data-path.
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:52 +0000 (13:53 +0200)]
net/mlx5: fix flow director drop mode
Packet rejection was routed to a polled queue. This patch route them to a
dummy queue which is not polled.
Fixes:
76f5c99e6840 ("mlx5: support flow director")
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Signed-off-by: Nelio Laranjeiro <nelio.laranjeiro@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:51 +0000 (13:53 +0200)]
net/mlx5: refactor allocation of flow director queues
This is done to prepare support for drop queues, which are not related to
existing Rx queues and need to be managed separately.
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Raslan Darawsheh [Wed, 14 Sep 2016 11:53:50 +0000 (13:53 +0200)]
net/mlx5: fix removing VLAN filter
memmove was moving bytes as the number of elements next to i, while it
should move the number of elements multiplied by the size of each element.
Fixes:
e9086978 ("mlx5: support VLAN filtering")
Signed-off-by: Raslan Darawsheh <rdarawsheh@asaltech.com>
Adrien Mazarguil [Wed, 14 Sep 2016 11:53:49 +0000 (13:53 +0200)]
net/mlx5: fix Rx VLAN offload capability report
This capability is implemented but not reported.
Fixes:
f3db9489188a ("mlx5: support Rx VLAN stripping")
Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Yaacov Hazan [Wed, 14 Sep 2016 11:53:48 +0000 (13:53 +0200)]
net/mlx5: fix inconsistent return value in flow director
The return value in DPDK is negative errno on failure.
Since internal functions in mlx driver return positive
values need to negate this value when it returned to
dpdk layer.
Fixes:
76f5c99 ("mlx5: support flow director")
Signed-off-by: Yaacov Hazan <yaacovh@mellanox.com>
Rami Rosen [Sat, 6 Aug 2016 06:16:23 +0000 (09:16 +0300)]
doc: fix typo in VF guide
This patch fixes a typo in doc/guides/nics/intel_vf.rst.
Fixes:
fc1f2750a3ec ("doc: programmers guide")
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Sagi Grimberg [Tue, 2 Aug 2016 14:41:21 +0000 (17:41 +0300)]
net/mlx5: fix possible NULL dereference in Rx path
The user is allowed to call ->rx_pkt_burst() even without free
mbufs in the pool. In this scenario we'll fail allocating a rep mbuf
on the first iteration (where pkt is still NULL). This would cause us
to deref a NULL pkt (reset refcount and free).
Fix this by checking the pkt before freeing it.
Fixes:
a1bdb71a32da ("net/mlx5: fix crash in Rx")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
Jeff Guo [Wed, 7 Sep 2016 09:38:40 +0000 (05:38 -0400)]
net/i40e: add packet type translation for X722
To make the PCTYPE in x722 compatible with original PCTYPE in
flow director (FD) filters, the PCTYPE in the FD programming
descriptor needs to be translated into a different PCTYPE using
GLQF_FD_PCTYPE table.
Translation needs to be done before the FD filter is programmed.
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Jeff Guo [Wed, 7 Sep 2016 09:38:02 +0000 (05:38 -0400)]
net/i40e: add new packet types for device X722
There are 6 new PCTYPEs enabled in the device x722.
The 6 new PCTYPEs As below:
* NonF Unicast IPv4, UDP
* NonF Multicast IPv4, UDP
* NonF IPv4, TCP, SYN, no ACK
* NonF Unicast IPv6, UDP
* NonF Multicast IPv6, UDP
Signed-off-by: Jeff Guo <jia.guo@intel.com>
Acked-by: Jingjing Wu <jingjing.wu@intel.com>
Yong Wang [Mon, 29 Aug 2016 19:18:50 +0000 (12:18 -0700)]
net/vmxnet3: enable LRO
This change enables device LRO if requested.
The current implementation of jumbo frame Rx can be used for LRO
directly without changes.
Note that since jumbo frame uses both ring0 and ring1, it cannot
be enabled in UPT (VMDirectPath) mode.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:49 +0000 (12:18 -0700)]
net/vmxnet3: update NIC documentation
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:48 +0000 (12:18 -0700)]
net/vmxnet3: update feature doc
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:47 +0000 (12:18 -0700)]
net/vmxnet3: reallocate shared memzone on re-config
When adding a DPDK port to ovs-vswitchd with DPDK, the vmxnet3 device
fails to activate due to mismatched magic number. This failure causes
following operations to run: start the port, stop the port,
reconfigure and re-start the port.
During reconfigure, if there is an existing memzone, driver will reuse
it. But reconfigure may request different number of Tx/Rx queues.
This results in a memzone with wrong size and potential invalid memory
access.
To fix this, free the memzone if found and reserve a new one.
Signed-off-by: Yong Wang <yongwang@vmware.com>
Reviewed-by: Guolin Yang <gyang@vmware.com>
Reviewed-by: Daniele Di Proietto <ddiproietto@vmware.com>
Tested-by: Daniele Di Proietto <ddiproietto@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:46 +0000 (12:18 -0700)]
net/vmxnet3: coding style changes
Signed-off-by: Yong Wang <yongwang@vmware.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yong Wang [Mon, 29 Aug 2016 19:18:45 +0000 (12:18 -0700)]
net/vmxnet3: improve error checks and return values
Signed-off-by: Yong Wang <yongwang@vmware.com>
Reviewed-by: Juho Snellman <jsnell@iki.fi>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Yury Kylulin [Mon, 29 Aug 2016 16:50:48 +0000 (19:50 +0300)]
net/i40e: fix mbuf leak during Rx queue release
For the vector PMD, release all mbufs from the Rx queue if no packets
are received after device start.
Fixes:
9ed94e5bb04e ("i40e: add vector Rx")
Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Beilei Xing <beilei.xing@intel.com>
Yury Kylulin [Mon, 29 Aug 2016 16:50:47 +0000 (19:50 +0300)]
net/ixgbe: fix mbuf leak during Rx queue release
For the vector PMD, release all mbufs from the Rx queue if no packets
are received after device start.
Fixes:
11b220c6498d ("ixgbe: fix release queue mbufs")
Signed-off-by: Yury Kylulin <yury.kylulin@intel.com>
Acked-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Panu Matilainen [Wed, 5 Oct 2016 12:14:08 +0000 (15:14 +0300)]
ip_frag: fix missing dependency on hash library
Not sure what exactly changed and where, but I've started getting
build failures on Fedora rawhide i386:
lib/librte_ip_frag/ip_frag_internal.c:36:23: fatal error:
rte_jhash.h: No such file or directory
#include <rte_jhash.h>
^
Looking at librte_ip_frag, it clearly depends on librte_hash so
its probably more a question of something commonly masking the issue.
Signed-off-by: Panu Matilainen <pmatilai@redhat.com>
Ferruh Yigit [Fri, 30 Sep 2016 10:10:30 +0000 (11:10 +0100)]
kni: remove unnecessary ethtool files
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Ferruh Yigit [Fri, 30 Sep 2016 10:10:29 +0000 (11:10 +0100)]
kni: remove unused ethtool files
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Maxime Coquelin [Tue, 4 Oct 2016 12:05:24 +0000 (14:05 +0200)]
app/testpmd: reset headroom after txonly packet allocation
This patch fixes txonly raw packets allocations by resetting the
available headroom.
Indeed, some PMDs such as Virtio might prepend some data to the
packet, resulting in mbuf's data_off field to be decremented each
time the mbuf gets re-allocated.
For Virtio PMD, it means that we use only single descriptors for the
first times mbufs get allocated, as at some point there is not
enough headroom to store the header.
Other alternative would be use standard API to allocate the packets,
which does reset the headroom, but the impact on performance is too
big to consider this an option.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Maxime Coquelin [Tue, 4 Oct 2016 12:05:23 +0000 (14:05 +0200)]
mbuf: add function to reset headroom
Some application use rte_mbuf_raw_alloc() function to improve
performance by not resetting mbuf's fields to their default state.
This can be however problematic for mbuf consumers that need some
headroom, meaning that data_off field gets decremented after
allocation. When the mbuf is re-used afterwards, there might not
be enough room for the consumer to prepend anything, if the data_off
field is not reset to its default value.
This patch adds a new rte_pktmbuf_reset_headroom() function that
applications can call to reset the data_off field.
This patch also replaces current data_off affectations in the mbuf
lib with a call to this function.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 19 Sep 2016 12:34:41 +0000 (14:34 +0200)]
mbuf: fix error handling on pool creation
On error, the mempool object has to be freed, and rte_errno should be a
positive value.
Fixes:
152ca517900b ("mbuf: use default mempool handler from config")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Byron Marohn [Tue, 4 Oct 2016 23:25:15 +0000 (00:25 +0100)]
hash: modify lookup bulk pipeline
This patch replaces the pipelined rte_hash lookup mechanism with a
loop-and-jump model, which performs significantly better,
especially for smaller table sizes and smaller table occupancies.
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
Byron Marohn [Tue, 4 Oct 2016 23:25:14 +0000 (00:25 +0100)]
hash: add vectorized comparison
In lookup bulk function, the signatures of all entries
are compared against the signature of the key that is being looked up.
Now that all the signatures are together, they can be compared
with vector instructions (SSE, AVX2), achieving higher lookup performance.
Also, entries per bucket are increased to 8 when using processors
with AVX2, as 256 bits can be compared at once, which is the size of
8x32-bit signatures.
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
Byron Marohn [Tue, 4 Oct 2016 23:25:13 +0000 (00:25 +0100)]
hash: reorganize bucket structure
Move current signatures of all entries together in the bucket
and same with all alternative signatures, instead of having
current and alternative signatures together per entry in the bucket.
This will be benefitial in the next commits, where a vectorized
comparison will be performed, achieving better performance.
The alternative signatures have been moved away from
the current signatures, to make the key indices be consecutive
to the current signatures, as these two fields are used by lookup,
so they are in the same cache line.
Signed-off-by: Byron Marohn <byron.marohn@intel.com>
Signed-off-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
Pablo de Lara [Tue, 4 Oct 2016 23:25:12 +0000 (00:25 +0100)]
hash: reorder hash structure
In order to optimize lookup performance, hash structure
is reordered, so all fields used for lookup will be
in the first cache line.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Acked-by: Sameh Gobriel <sameh.gobriel@intel.com>
Karmarkar Suyash [Wed, 21 Sep 2016 20:54:27 +0000 (16:54 -0400)]
timer: fix lag delay
For periodic timers, if the lag gets introduced, the current code
added additional delay when the next peridoc timer was initialized
by not taking into account the delay added, with this fix the code
would start the next occurrence of timer keeping in account the
lag added. Corrected the behavior.
Fixes:
9b15ba89 ("timer: use a skip list")
Signed-off-by: Karmarkar Suyash <skarmarkar@sonusnet.com>
Acked-by: Robert Sanford <rsanford@akamai.com>
Jean Tourrilhes [Tue, 4 Oct 2016 17:17:03 +0000 (10:17 -0700)]
mem: fix hugepage mapping error messages
Running secondary is tricky due to the need to map the memory region
at the right place in VM, which is whatever primary has chosen. If the
base address for primary happens to by already mapped in the
secondary, we will hit precisely these error messages (depending if we
fail on the config region or the hugepages). This is why there is
already a comment about ASLR.
The issue is that in most cases, remapping does not happen and "errno"
is not changed and therefore stale. In our case, we got a "permission
denied", which sent us down the wrong track. It's such a common error
for secondary that I feel this error message should be unambiguous and
helpful.
The call to close was also moved because close() may override errno.
Signed-off-by: Jean Tourrilhes <jt@labs.hpe.com>
Konstantin Ananyev [Mon, 3 Oct 2016 17:27:25 +0000 (18:27 +0100)]
eal: fix C++ link of delay function pointer
When compiling with C++, it treats
void (*rte_delay_us)(unsigned int us);
as definition of the global variable.
So further linking with librte_eal fails.
Fixes:
b4d63fb62240 ("eal: customize delay function")
Steps to reproduce:
$ cat rttm1.cpp
using namespace std;
int main(int argc, char *argv[])
{
int ret = rte_eal_init(argc, argv);
rte_delay_us(1);
cout << "return code ";
cout << ret;
return ret;
}
$ g++ -m64 -I/${RTE_SDK}/${RTE_TARGET}/include -c -o rttm1.o rttm1.cpp
$ gcc -m64 -pthread -o rttm1 rttm1.o -ldl -Wl,-lstdc++ \
-L/${RTE_SDK}/${RTE_TARGET}/lib -Wl,-lrte_eal
.../librte_eal.a(eal_common_timer.o):
(.bss+0x0): multiple definition of `rte_delay_us'
rttm1.o:(.bss+0x0): first defined here
collect2: error: ld returned 1 exit status
$ nm rttm1.o | grep rte_delay_us
0000000000000092 t _GLOBAL__sub_I_rte_delay_us
0000000000000000 B rte_delay_us
Signed-off-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Zhiyong Yang [Thu, 29 Sep 2016 12:35:49 +0000 (20:35 +0800)]
net/vhost: add extended statistics
This feature adds vhost pmd extended statistics from per port perspective
in order to meet the requirements of the applications such as OVS etc.
RX/TX xstats count the bytes without CRC. This is different from physical
NIC stats with CRC.
The statistics counters are based on RFC 2819 and RFC 2863 as follows:
rx/tx_good_packets
rx/tx_total_bytes
rx/tx_missed_pkts
rx/tx_broadcast_packets
rx/tx_multicast_packets
rx/tx_unicast_packets
rx/tx_undersize_errors
rx/tx_size_64_packets
rx/tx_size_65_to_127_packets;
rx/tx_size_128_to_255_packets;
rx/tx_size_256_to_511_packets;
rx/tx_size_512_to_1023_packets;
rx/tx_size_1024_to_1522_packets;
rx/tx_1523_to_max_packets;
rx/tx_errors
rx_fragmented_errors
rx_jabber_errors
rx_unknown_protos_packets;
No API is changed or added.
rte_eth_xstats_get_names() to retrieve what kinds of vhost xstats are
supported,
rte_eth_xstats_get() to retrieve vhost extended statistics,
rte_eth_xstats_reset() to reset vhost extended statistics.
The usage of vhost pmd xstats is the same as virtio pmd xstats.
for example, when test-pmd application is running in interactive mode
vhost pmd xstats will support the two following commands:
show port xstats all | port_id will show vhost xstats
clear port xstats all | port_id will reset vhost xstats
net/virtio pmd xstats(the function virtio_update_packet_stats) is used
as reference when implementing the feature.
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Zhiyong Yang [Thu, 29 Sep 2016 12:35:48 +0000 (20:35 +0800)]
net/vhost: move statistics into a structure
The patch moves all stats counters to a new defined struct vhost_stats
as follows, in order to manage all stats counters in a unified way and
simplify the subsequent function implementation(vhost_dev_xstats_reset).
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Ciara Loftus [Tue, 13 Sep 2016 13:47:43 +0000 (14:47 +0100)]
net/vhost: retrieve vid for a given port
In some cases when using the vHost PMD, certain vHost library functions
may still need to be accessed. One such example is the
rte_vhost_get_queue_num function which returns the number of virtqueues
reported by the guest - information which is not exposed by the PMD.
This commit introduces a new rte_eth_vhost function that returns the
'vid' associated with a given port id. This allows the PMD user to call
vHost library functions which require the 'vid' value.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jerin Jacob [Thu, 18 Aug 2016 04:12:11 +0000 (12:12 +0800)]
net/virtio: add NEON based Rx handler
Added neon based Rx vector implementation.
Selection of the new handler based neon availability at runtime.
Updated the release notes and MAINTAINERS file.
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jianbo Liu <jianbo.liu@linaro.org>
Jerin Jacob [Tue, 5 Jul 2016 12:49:25 +0000 (18:19 +0530)]
net/virtio: select data handler depending on CPU flag
Introduced cpuflag based run-time detection to select the
SSE based simple Rx handler
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jerin Jacob [Tue, 5 Jul 2016 12:49:24 +0000 (18:19 +0530)]
net/virtio: move SSE based Rx code to separate file
Split out SSE instruction based virtio simple Rx
implementation to a separate file
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jerin Jacob [Tue, 5 Jul 2016 12:49:23 +0000 (18:19 +0530)]
net/virtio: cleanup conditional compilation
Removed unnecessary compile time dependency on "use_simple_rxtx".
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Mon, 26 Sep 2016 04:29:13 +0000 (12:29 +0800)]
net: fix clang build
Interestingly, clang and gcc has different prototype for _mm_prefetch().
For gcc, we have
_mm_prefetch (const void *__P, enum _mm_hint __I)
While for clang, it's
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), 0, (sel)))
That's how the following error comes with clang:
error: cast from 'const void *' to 'void *' drops const qualifier
[-Werror,-Wcast-qual]
_mm_prefetch((const void *)rused, _MM_HINT_T0);
/usr/lib/llvm-3.8/bin/../lib/clang/3.8.0/include/xmmintrin.h:684:58:
note: expanded from macro '_mm_prefetch'
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a),
0, (sel)))
What's weird is that the build was actaully Okay before. I met it while
apply Jerin's vector support for ARM patch set: he just move this piece
of code to another file, nothing else changed.
This patch fix the issue when Jerin's patchset is applied. Thus, I think
it's still needed.
Similarly, make the same change to other _mm_prefetch users, just in case
this weird issue shows up again somehow later.
Fixes:
fc3d66212fed ("virtio: add vector Rx")
Fixes:
c95584dc2b18 ("ixgbe: new vectorized functions for Rx/Tx")
Fixes:
9ed94e5bb04e ("i40e: add vector Rx")
Fixes:
7092be8437bd ("fm10k: add vector Rx")
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Jing Chen <jing.d.chen@intel.com>
Jianfeng Tan [Tue, 27 Sep 2016 19:11:06 +0000 (19:11 +0000)]
net/virtio_user: fix error management during init
Currently, when virtio_user device fails to be started (e.g., vhost
unix socket does not exit), the init function does not return struct
rte_eth_dev (and some other structs) back to ether layer. And what's
more, it does not report the error to upper layer.
The fix is to free those structs and report error when failing to
start virtio_user devices.
Fixes:
ce2eabdd43ec ("net/virtio-user: add virtual device")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 27 Sep 2016 19:11:05 +0000 (19:11 +0000)]
net/virtio_user: fix wrong sequence of messages
When virtio_user is used with VPP's native vhost user, it cannot
send/receive any packets.
The root cause is that vpp-vhost-user translates the message
VHOST_USER_SET_FEATURES as puting this device into init state,
aka, zero all related structures. However, previous code
puts this message at last in the whole initialization process,
which leads to all previous information are zeroed.
To fix this issue, we rearrange the sequence of those messages.
- step 0, send VHOST_USER_SET_VRING_CALL so that vhost allocates
virtqueue structures;
- step 1, send VHOST_USER_SET_FEATURES to confirm the features;
- step 2, send VHOST_USER_SET_MEM_TABLE to share mem regions;
- step 3, send VHOST_USER_SET_VRING_NUM, VHOST_USER_SET_VRING_BASE,
VHOST_USER_SET_VRING_ADDR, VHOST_USER_SET_VRING_KICK for each
queue;
- ...
Fixes:
37a7eb2ae816 ("net/virtio-user: add device emulation layer")
Reported-by: Zhihong Wang <zhihong.wang@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jianfeng Tan [Tue, 27 Sep 2016 19:11:04 +0000 (19:11 +0000)]
net/virtio_user: fix first queue pair without multiqueue
When virtio_user is used with OVS-DPDK (with mq disabled), it cannot
receive any packets. This is because no queue is enabled at all when
mq is disabled.
To fix it, we should consistently make sure the 1st queue is enabled,
which is also the behaviour QEMU takes.
Fixes:
37a7eb2ae816 ("net/virtio-user: add device emulation layer")
Reported-by: Ning Li <lining18@jd.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Maxime Coquelin [Tue, 27 Sep 2016 08:42:49 +0000 (10:42 +0200)]
vhost: support indirect Tx descriptors
Indirect descriptors are usually supported by virtio-net devices,
allowing to dispatch a larger number of requests.
When the virtio device sends a packet using indirect descriptors,
only one slot is used in the ring, even for large packets.
The main effect is to improve the 0% packet loss benchmark.
A PVP benchmark using Moongen (64 bytes) on the TE, and testpmd
(fwd io for host, macswap for VM) on DUT shows a +50% gain for
zero loss.
On the downside, micro-benchmark using testpmd txonly in VM and
rxonly on host shows a loss between 1 and 4%. But depending on
the needs, feature can be disabled at VM boot time by passing
indirect_desc=off argument to vhost-user device in Qemu.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Gary Mussar [Fri, 2 Sep 2016 13:16:33 +0000 (09:16 -0400)]
tools: fix virtio interface name when binding
The dpdk-devbind.py script does not find/display the ifname for virtio
interfaces since the "net" directory is not directly under the device
directory but rather under a subdirectory.
eg.
> dpdk-devbind.py --status
0000:00:03.0 'Virtio network device' if= drv=virtio-pci unused=
This change searches for the first "net" directory under the device
directory hierarchy.
eg.
0000:00:03.0 'Virtio network device' if=ens3 drv=virtio-pci unused=
Fixes:
629395b063e8 ("igb_uio: remove PCI id table")
Signed-off-by: Gary Mussar <gmussar@ciena.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Zhiyong Yang [Wed, 7 Sep 2016 06:11:00 +0000 (14:11 +0800)]
net/virtio: fix xstats name
We have a stats named "size_1024_1517_packets", while the code
actually counts the range "[1024, 1518]", which is obviously wrong.
The code is as follows in the function virtio_update_packet_stats.
else if (s < 1519)
stats->size_bins[6]++;
We could either fix it by correcting the "if" check in the code,
or fix it by just renaming the stats to conform to the code. The
latter solution is taken because that's what the RFC2819 suggests.
Fixes:
76d4c652e07d ("virtio: add extended stats")
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Pierre Pfister [Wed, 7 Sep 2016 02:46:18 +0000 (10:46 +0800)]
net/virtio: enable indirect descriptors feature
Virtio indirect descriptors are supported by the data-path
but the feature bit is never set during feature negociation.
This patch simply adds VIRTIO_RING_F_INDIRECT_DESC back to
the supported features bit mask, hence enabling the use of
indirect descriptors when the feature is negociated with the
device.
Signed-off-by: Pierre Pfister <ppfister@cisco.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Matthias Gatto [Fri, 2 Sep 2016 15:05:21 +0000 (17:05 +0200)]
vhost: remove obsolete comment
As new_device and destroy_device use an int instead of a
"struct virtio_net *", The comment about setting VIRTIO_DEV_RUNNING
doesn't make sense anymore, plus If I've correctly understand the
code, the drivers take care of setting the flag before calling the
callbacks, so I guess that this comment is obsolet and I've remove it.
Signed-off-by: Matthias Gatto <matthias.gatto@outscale.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:43 +0000 (16:48 +0800)]
vhost: simplify features set/get
No need to use a pointer to store/retrieve features.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:42 +0000 (16:48 +0800)]
vhost: get device once
Invoke get_device() at the beginning of vhost_user_msg_handler, so that
we could check the return value once. Which could save tons of duplicate
get-and-check device.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:41 +0000 (16:48 +0800)]
vhost: unify function names
Some functions are with prefix "user_", while others with "vhost_".
Making them all starting with "vhost_user_" to unify the function names.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:40 +0000 (16:48 +0800)]
vhost: fold common message handlers
Due to history reason (that we have 2 vhost implementations), some
messages are handled in two calls: vhost specific implementation
handles it first and then invoke the common one to do another handling.
We have one implementation only now, we could write one method for
each message. Here fold those common handles to corresponding vhost
user handler.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:39 +0000 (16:48 +0800)]
vhost: refactor code structure
The code structure is a bit messy now. For example, vhost-user message
handling is spread to three different files:
vhost-net-user.c virtio-net.c virtio-net-user.c
Where, vhost-net-user.c is the entrance to handle all those messages
and then invoke the right method for a specific message. Some of them
are stored at virtio-net.c, while others are stored at virtio-net-user.c.
The truth is all of them should be in one file, vhost_user.c.
So this patch refactors the source code structure: mainly on renaming
files and moving code from one file to another file that is more suitable
for storing it. Thus, no functional changes are made.
After the refactor, the code structure becomes to:
- socket.c handles all vhost-user socket file related stuff, such
as, socket file creation for server mode, reconnection
for client mode.
- vhost.c mainly on stuff like vhost device creation/destroy/reset.
Most of the vhost API implementation are there, too.
- vhost_user.c all stuff about vhost-user messages handling goes there.
- virtio_net.c all stuff about virtio-net should go there. It has virtio
net Rx/Tx implementation only so far: it's just a rename
from vhost_rxtx.c
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:38 +0000 (16:48 +0800)]
vhost: remove sub-directory
We now have one vhost implementation; no sub source dir is needed.
Remove it by move them to upper dir.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Yuanhan Liu [Thu, 18 Aug 2016 08:48:37 +0000 (16:48 +0800)]
vhost: remove vhost-cuse
remove vhost-cuse code, including the eventfd_link kernel module that
is for vhost-cuse only.
The lib/virt/qemu-wrap.py is also removed, as it's mainly for vhost-cuse
usage.
As we have one vhost implementation now, one vhost config option is
needed only. Thus, CONFIG_RTE_LIBRTE_VHOST_USER is removed.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Jianfeng Tan [Thu, 18 Aug 2016 05:46:13 +0000 (05:46 +0000)]
examples/vhost: remove VLAN strip option
When VMDQ is enabled, different NICs have different behaviors for
disabling VLAN strip. In detail, i40e only enables/disables it of
PF's main vsi; fm10k cannot disable VLAN strip, etc. We now remove
this option, --vlan-strip, to reduce any confusion. And now, VLAN
strip will be enabled and cannot be disabled.
Reported-by: Qian Xu <qian.q.xu@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jiayu Hu [Sat, 20 Aug 2016 10:11:36 +0000 (06:11 -0400)]
examples/vhost: support multiple socket files
When examples/vhost runs in client mode, only one QEMU can be connected.
This is because that examples/vhost just supports one socket file. This
patch is to add multiple sockets support for examples/vhost.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jiayu Hu [Sat, 20 Aug 2016 10:10:33 +0000 (06:10 -0400)]
examples/vhost: rename --dev-basename to --socket-file
In examples/vhost, "dev-basename" is a program option, which is to set
the vhost-net socket used by vhost-user, or the character device used
by vhost-cuse. Since vhost-cuse should be dropped, and "dev-basename"
is not a suitable name for the vhost-net socket. Therefore, this patch
is to change this option name for examples/vhost.
Signed-off-by: Jiayu Hu <jiayu.hu@intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Fri, 26 Aug 2016 13:15:32 +0000 (15:15 +0200)]
tools: fix json output of pmdinfo
Using dpdk-pmdinfo with the '-r' flag does not produce a json output as
documented. Instead, the python representation of the json object is
shown, which is nearly the same, but cannot be properly parsed by a json
parser.
python repr (before):
{u'pci_ids': [[5549, 1968, 65535, 65535]], u'name': u'vmxnet3'}
json (after):
{"pci_ids": [[5549, 1968, 65535, 65535]], "name": "vmxnet3"}
Fixes:
c67c9a5c646a ("tools: query binaries for HW and other support information")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Ferruh Yigit [Wed, 28 Sep 2016 10:05:19 +0000 (11:05 +0100)]
pmdinfogen: fix clang build
Compile error:
CC mlx5.o.pmd.o
mlx5.o.pmd.c:1:227:
error: no newline at end of file [-Werror,-Wnewline-eof]
...__attribute__((used)) = "PMD_INFO_STRING= {...}";
^
Produced with clang 3.8.0 and MLX5_PMD and MLX5_DEBUG
config options enabled.
Fixes:
98b0fdb0ffc6 ("pmdinfogen: add buildtools and pmdinfogen utility")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Pablo de Lara [Tue, 4 Oct 2016 07:16:34 +0000 (08:16 +0100)]
hash: fix free slot check
In function rte_hash_cuckoo_insert_mw_tm, while looking for
an empty slot, only the first entry in the bucket was being checked,
as key_idx array was not being iterated.
Fixes:
5fc74c2e146d ("hash: check if slot is empty with key index")
Reported-by: Bruce Richardson <bruce.richardson@intel.com>
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Olivier Matz [Fri, 9 Sep 2016 08:15:21 +0000 (10:15 +0200)]
ethdev: clarify API comment for imissed stats
The "imissed" stats represent RX packets dropped by the HW,
so we should not talk about mbufs as the hardware is not aware
of this structure. Buffer seems to be a better word.
Fixes:
4eadb8ba11b7 ("ethdev: do not deprecate imissed counter")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Ferruh Yigit [Wed, 28 Sep 2016 13:59:29 +0000 (14:59 +0100)]
mempool: fix comments for no contiguous flag
Fixes:
ce94a51ff05c ("mempool: add flag for removing phys contiguous constraint")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Ferruh Yigit [Wed, 28 Sep 2016 13:59:28 +0000 (14:59 +0100)]
mempool: fix comments of create functions
Fixes:
85226f9c526b ("mempool: introduce a function to create an empty pool")
Fixes:
d1d914ebbc25 ("mempool: allocate in several memory chunks by default")
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Jerin Jacob [Thu, 18 Aug 2016 11:51:30 +0000 (17:21 +0530)]
eal/armv8: use high-resolution cycle counter
Existing cntvct_el0 based rte_rdtsc() provides portable
means to get wall clock counter at user space. Typically
it runs at <= 100MHz.
The alternative method to enable rte_rdtsc() for high resolution
wall clock counter is through armv8 PMU subsystem.
The PMU cycle counter runs at CPU frequency, However,
access to PMU cycle counter from user space is not enabled
by default in the arm64 linux kernel.
It is possible to enable cycle counter at user space access
by configuring the PMU from the privileged mode (kernel space).
by default rte_rdtsc() implementation uses portable
cntvct_el0 scheme. Application can choose the PMU based
implementation with CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com>
Acked-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Yangchao Zhou [Thu, 29 Sep 2016 01:41:10 +0000 (09:41 +0800)]
pci: fix memory leak when detaching device
Fixes:
dbe6b4b61b0e ("pci: probe or close device")
Signed-off-by: Yangchao Zhou <zhouyates@gmail.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:36 +0000 (18:11 +0530)]
pci: create device list and fallback on its members
Now that rte_device is available, drivers can start using its members
(numa, name) as well as link themselves into another rte_device list.
As of now no one is using this list, but can be used for moving over all
devices (pdev/vdev/Xdev) and perform bulk actions (like cleanup).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log for extra rte_device list]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:35 +0000 (18:11 +0530)]
eal: introduce generalized device
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:34 +0000 (18:11 +0530)]
eal: register drivers explicitly
To register both vdev and pci drivers into the list of all rte_driver,
we have to call rte_eal_driver_register explicitly.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:33 +0000 (18:11 +0530)]
pci: inherit common driver in PCI driver
Remove the 'name' member from rte_pci_driver and move to generic
rte_driver.
Most of the PMD drivers were initially using DRIVER_REGISTER_PCI(<name>..)
as well as assigning a name to eth_driver.pci_drv.name member.
In this patch, only the original DRIVER_REGISTER_PCI(<name>..) name has
been populated into the rte_driver.name member - assignments through
eth_driver has been removed.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Rebase and expand changes to newly added files]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:32 +0000 (18:11 +0530)]
eal: rename and move PCI resource structure
There is no need to have a custom memory resource representation for
each infrastructure (PCI, ...) as it would always have the same members.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:31 +0000 (18:11 +0530)]
eal: include dev headers in place of PCI headers
Further refactoring and generalization of PCI infrastructure will
require access to the rte_dev.h contents.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:30 +0000 (18:11 +0530)]
eal: remove unused PMD types
- All devices register themselfs by calling a kind of DRIVER_REGISTER_XXX.
The PMD_REGISTER_DRIVER is not used anymore.
- PMD_VDEV type is also not being used - can be removed from all VDEVs.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:29 +0000 (18:11 +0530)]
drivers: use vdev registration
All PMD_VDEV drivers can now use rte_vdev_driver instead of the
rte_driver (which is embedded in the rte_vdev_driver).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:28 +0000 (18:11 +0530)]
eal: remove PCI/vdev unused code
- Remove checks for VDEV from rte_eal_vdev_(init/uninint) as all devices
are inherently virtual here.
- PDEVs perform PCI specific inits - rte_eal_dev_init() need not call
rte_driver->init();
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
[Shreyansh: Reword commit log]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
Jan Viktorin [Tue, 20 Sep 2016 12:41:27 +0000 (18:11 +0530)]
eal: extract vdev infra
Move all PMD_VDEV-specific code into a separate module and header
file to not polute the generic code anymore. There is now a list
of virtual devices available.
The rte_vdev_driver integrates the original rte_driver inside
(C inheritance). The rte_driver will be however change in the
future to serve as a common base for all other types of drivers.
The existing PMDs (PMD_VDEV) are to be modified later (there is
no change for them at the moment).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
David Marchand [Tue, 20 Sep 2016 12:41:26 +0000 (18:11 +0530)]
ethdev: get rid of device type
Now that hotplug has been moved to eal, there is no reason to keep the
device type in this layer.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:25 +0000 (18:11 +0530)]
ethdev: convert to EAL hotplug
Remove bus logic from ethdev hotplug by using eal for this.
Current api is preserved:
- the last port that has been created is tracked to return it to the
application when attaching,
- the internal device name is reused when detaching.
We can not get rid of ethdev hotplug yet since we still need some
mechanism to inform applications of port creation/removal to substitute
for ethdev hotplug api.
dev_type field in struct rte_eth_dev and rte_eth_dev_allocate are kept as
is, but this information is not needed anymore and is removed in the
following commit.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:24 +0000 (18:11 +0530)]
eal: add hotplug operations for PCI and vdev
Hotplug invocations, which deals with devices, should come from the layer
that already handles them, i.e. EAL.
For both attach and detach operations, 'name' is used to select the bus
that will handle the request.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:23 +0000 (18:11 +0530)]
ethdev: do not scan all PCI devices on attach
No need to scan all devices, we only need to update the device being
attached.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:22 +0000 (18:11 +0530)]
pci: introduce helpers for device name parsing/update
- Move rte_eth_dev_create_unique_device_name() from ether/rte_ethdev.c to
common/include/rte_pci.h as rte_eal_pci_device_name(). Being a common
method, can be used across crypto/net PCI PMDs.
- Remove crypto specific routine and fallback to common name function.
- Introduce a eal private Update function for PCI device naming.
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Merge crypto/pci helper patches]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:20 +0000 (18:11 +0530)]
drivers: use PCI registration macro
Simplify crypto and ethdev pci drivers init by using newly introduced
init macros and helpers.
Those drivers then don't need to register as "rte_driver"s anymore.
Exceptions:
- virtio and mlx* use RTE_INIT directly as they have custom initialization
steps.
- VDEV devices are not modified - they continue to use PMD_REGISTER_DRIVER.
Update documentation for replacing an example referring to
PMD_REGISTER_DRIVER.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:19 +0000 (18:11 +0530)]
drivers: export probe/remove helpers for PCI drivers
crypto and ethdev drivers aligned to PCI probe/remove. These wrappers are
mapped directly to PCI resources.
Existing handlers for init/uninit can be easily reused for this.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:18 +0000 (18:11 +0530)]
eal: introduce driver init macros
Introduce a RTE_INIT macro used to mark an init function as a constructor.
Current eal macros have been converted to use this (no functional impact).
DRIVER_REGISTER_PCI is added as a helper for pci drivers.
Suggested-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
[Shreyansh: Update PCI Registration macro name]
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:17 +0000 (18:11 +0530)]
drivers: align PCI driver definitions
Pure coding style, but it might make it easier later if we want to move
fields in rte_cryptodev_driver and eth_driver structures.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:16 +0000 (18:11 +0530)]
cryptodev: remove PMD type
This information is not used and just adds noise.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Shreyansh Jain [Tue, 20 Sep 2016 12:41:15 +0000 (18:11 +0530)]
pci: replace devinit/devuninit with probe/remove
Probe and Remove are more appropriate names for PCI init and uninint
operations. This is a cosmetic change.
Only MLX* uses the PCI direct registration, bypassing PMD_* macro.
The callbacks for this too have been updated.
VDEV are left out. For them, init/uninit are more appropriate.
Suggested-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: David Marchand <david.marchand@6wind.com>
David Marchand [Tue, 20 Sep 2016 12:41:14 +0000 (18:11 +0530)]
pci: initialize lists statically
These lists can be initialized once and for all at build time.
With this, those lists are only manipulated in a common place
(and we could even make them private).
A nice side effect is that pci drivers can now register in constructors.
Signed-off-by: David Marchand <david.marchand@6wind.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
David Marchand [Tue, 20 Sep 2016 12:41:13 +0000 (18:11 +0530)]
eal: remove duplicate function declaration
rte_eal_dev_init is declared in both eal_private.h and rte_dev.h since its
introduction.
This function has been exported in ABI, so remove it from eal_private.h
Fixes:
e57f20e05177 ("eal: make vdev init path generic for both virtual and pci devices")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Reviewed-by: Jan Viktorin <viktorin@rehivetech.com>
Flavio Leitner [Fri, 23 Sep 2016 14:47:31 +0000 (11:47 -0300)]
eal: move CPU flags check from constructor to init
An application might be linked to DPDK but not really use it,
so move the cpu flag check to the EAL initialization instead.
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Aaron Conole <aconole@redhat.com>
Maciej Czekaj [Wed, 28 Sep 2016 10:52:57 +0000 (12:52 +0200)]
mem: fix crash on hugepage mapping error
In ASLR-enabled system, it is possible that selected
virtual space is occupied by program segments. Therefore,
error path should not blindly unmap all memmory segments
but only those already mapped.
Steps that lead to crash:
1. memeseg 0 in secondary process overlaps with libc.so
2. mmap of /dev/zero fails for virtual space of memseg 0
3. munmap of memseg 0 leads to unmapping libc.so itself
4. app gets SIGSEGV after returning from syscall to libc
Fixes:
ea329d7f8e34 ("mem: fix leak after mapping failure")
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Yuanhan Liu [Fri, 23 Sep 2016 07:10:46 +0000 (15:10 +0800)]
mem: remove single file segments
RTE_EAL_SINGLE_FILE_SEGMENTS was introduced with ivshmem integration.
Now that ivshmem was removed (commit
c711ccb30987)
and a simple git grep shows no one else references it;
I think we can now remove it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Pablo de Lara [Fri, 26 Aug 2016 21:30:09 +0000 (22:30 +0100)]
hash: check if slot is empty with key index
Instead of checking if the current and alternative signatures are 0,
it is faster to check if the key index associated to an entry
is 0, meaning that the slot is empty.
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>
Pablo de Lara [Fri, 26 Aug 2016 21:30:08 +0000 (22:30 +0100)]
hash: fix false zero signature key hit lookup
This commit fixes a corner case scenario. When a key is deleted,
its signature in the hash table gets clear, which should prevent
a lookup of that same key, unless the signature of the key is all zeroes.
In that case, there will be a match, and key would be compared against
the key that is in the table (which does not get cleared,
as the performance penalty would be high), resulting in a wrong hit.
To prevent this from happening, the key index associated to that entry
should be set to zero when deleting it, so in case that same key
is looked up just after a deletion, it will point to the dummy key slot,
which guarantees a miss.
Fixes:
48a399119619 ("hash: replace with cuckoo hash implementation")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: Saikrishna Edupuganti <saikrishna.edupuganti@intel.com>