Ferruh Yigit [Mon, 26 Sep 2016 15:39:25 +0000 (16:39 +0100)]
kni: remove useless return
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:24 +0000 (16:39 +0100)]
kni: prefer unsigned int to unsigned
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:23 +0000 (16:39 +0100)]
kni: fix spacing and line lengths
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:22 +0000 (16:39 +0100)]
kni: make static struct const
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:21 +0000 (16:39 +0100)]
kni: uninitialize global variables
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Ferruh Yigit [Mon, 26 Sep 2016 15:39:20 +0000 (16:39 +0100)]
kni: move externs to the header file
Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
Vladyslav Buslov [Sat, 24 Sep 2016 13:13:02 +0000 (16:13 +0300)]
kni: support core id parameter in single threaded mode
Allow binding the KNI thread to a specific core in single-threaded mode
by setting the core_id and force_bind config parameters.
Signed-off-by: Vladyslav Buslov <vladyslav.buslov@harmonicinc.com>
Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>
Tested-by: Ferruh Yigit <ferruh.yigit@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:40:45 +0000 (14:40 +0800)]
app/test: verify LPM tbl8 recycle
Since a bug fix for LPM tbl8 recycling has been introduced, add a test
case to verify that a tbl8 group is correctly freed when it only
includes a rule with depth = 24.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:42:37 +0000 (14:42 +0800)]
lpm: remove redundant check when adding rule
When a rule with depth > 24 is added on top of an existing rule with
depth <= 24, a new tbl8 is allocated. The existing rule first fills the
whole new tbl8, so the valid field of each entry in this tbl8 is always
true and the depth of each entry is always <= 24 before the new rule
with depth > 24 is added.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
Wei Dai [Mon, 8 Aug 2016 06:39:51 +0000 (14:39 +0800)]
lpm: fix freeing unused sub-table on rule delete
When all rules with depth > 24 are deleted in the same sub-table
(tbl8 group) and only a rule with depth <= 24 is left in it,
this sub-table (tbl8 group) should be recycled.
Fixes: dc81ebbacaeb ("lpm: extend IPv4 next hop field")
Fixes: af75078fece3 ("first public release")
Signed-off-by: Wei Dai <wei.dai@intel.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
John Ousterhout [Wed, 12 Oct 2016 19:38:32 +0000 (12:38 -0700)]
log: respect logger configured before EAL init
Before this patch, application-specific loggers could not be
installed before rte_eal_init completed (the initialization process
called rte_openlog_stream, overwriting any previously installed
logger). This made it impossible for an application to capture the
initial log messages generated during rte_eal_init. This patch changes
initialization so that information from a previous call to
rte_openlog_stream is not lost. Specifically:
* The default log stream is now maintained separately from an
application-specific log stream installed with rte_openlog_stream.
* rte_eal_common_log_init has been renamed to eal_log_set_default,
since this is all it does. It no longer invokes rte_openlog_stream; it
just updates the default stream. Also, this method now returns void,
rather than int, since there are no errors.
This patch also removes the "early log" mechanism and cleans up the
log initialization mechanism:
* The default log stream defaults to stderr on all platforms if
eal_log_set_default hasn't been invoked (Linux used to use stdout
during the first part of initialization).
* Removed rte_eal_log_early_init; all of the desired functionality can
be achieved by calling eal_log_set_default.
* Removed lib/librte_eal/bsdapp/eal/eal_log.c: it contained only one
function, rte_eal_log_init, which is not needed or invoked for BSD.
* Removed declaration for eal_default_log_stream in rte_log.h (it's now
private to eal_common_log.c).
* Moved the call to rte_eal_log_init earlier in rte_eal_init for Linux,
so that it starts using the preferred log stream as soon as possible.
Signed-off-by: John Ousterhout <ouster@cs.stanford.edu>
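As a minimal sketch of the resulting usage for the commit above (file
name and error handling are illustrative only), an application can now
install its logger before EAL starts and still capture the
initialization messages:

    #include <stdio.h>
    #include <rte_log.h>
    #include <rte_eal.h>

    int main(int argc, char **argv)
    {
        /* Install the application logger before EAL starts; with this
         * patch, rte_eal_init() no longer overwrites it. */
        FILE *logf = fopen("app.log", "w");   /* illustrative path */
        if (logf != NULL)
            rte_openlog_stream(logf);

        /* Messages emitted during rte_eal_init() now go to app.log. */
        if (rte_eal_init(argc, argv) < 0)
            return -1;
        return 0;
    }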
Mauricio Vasquez B [Fri, 2 Sep 2016 11:01:51 +0000 (13:01 +0200)]
doc: fix file argument of debug functions
Previous patch updated the functions without updating all the comments.
Fixes: 591a9d7985c1 ("add FILE argument to debug functions")
Signed-off-by: Mauricio Vasquez B <mauricio.vasquez@polito.it>
Acked-by: John McNamara <john.mcnamara@intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:11 +0000 (16:16 +0200)]
net/virtio: support TSO
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:10 +0000 (16:16 +0200)]
net/virtio: support LRO
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:09 +0000 (16:16 +0200)]
net/virtio: support Tx checksum offload
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:08 +0000 (16:16 +0200)]
net/virtio: support Rx checksum offload
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:07 +0000 (16:16 +0200)]
app/testpmd: display LRO segment size
In the csumonly engine, display the LRO segment size if the LRO flag is
set.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:06 +0000 (16:16 +0200)]
mbuf: add flag for LRO
When receiving coalesced packets in virtio, the original size of the
segments is provided. This is useful information because it allows
resegmenting with the same size.
Add a new Rx flag in mbuf that can be set when packets are coalesced by
a hardware or virtual driver; when it is set, the m->tso_segsz field is
valid and holds the segment size of the original packets.
This flag is used in the next commits in the virtio pmd.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
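As a rough illustration of how an application might consume the flag
described above on the Rx path (a sketch; PKT_RX_LRO is the flag name
this series uses, but treat the exact macro as an assumption):

    #include <rte_mbuf.h>

    /* Pick the segment size to use when re-segmenting a coalesced packet. */
    static uint16_t
    lro_segment_size(const struct rte_mbuf *m, uint16_t default_mss)
    {
        /* When the driver coalesced several packets into this mbuf, it
         * sets the LRO flag and stores the original segment size. */
        if (m->ol_flags & PKT_RX_LRO)
            return m->tso_segsz;
        return default_mss;
    }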
Olivier Matz [Thu, 13 Oct 2016 14:16:04 +0000 (16:16 +0200)]
mbuf: add new Rx checksum flags
Following discussions in [1] and [2], introduce a new bit to
describe the Rx checksum status in mbuf.
Before this patch, only one flag was available:
PKT_RX_L4_CKSUM_BAD: L4 cksum of RX pkt. is not OK.
And same for L3:
PKT_RX_IP_CKSUM_BAD: IP cksum of RX pkt. is not OK.
This had 2 issues:
- it was not possible to differentiate "checksum good" from
"checksum unknown".
- it was not possible for a virtual driver to say "the checksum
in packet may be wrong, but data integrity is valid".
This patch tries to solve this issue by having 4 states (2 bits)
for the IP and L4 Rx checksums. New values are:
- PKT_RX_L4_CKSUM_UNKNOWN: no information about the RX L4 checksum
-> the application should verify the checksum by sw
- PKT_RX_L4_CKSUM_BAD: the L4 checksum in the packet is wrong
-> the application can drop the packet without additional check
- PKT_RX_L4_CKSUM_GOOD: the L4 checksum in the packet is valid
-> the application can accept the packet without verifying the
checksum by sw
- PKT_RX_L4_CKSUM_NONE: the L4 checksum is not correct in the packet
data, but the integrity of the L4 data is verified.
-> the application can process the packet but must not verify the
checksum by sw. It has to take care to recalculate the cksum
if the packet is transmitted (either by sw or using tx offload)
And same for L3 (replace L4 by IP in description above).
This commit tries to be compatible with existing applications that
only check the existing flag (CKSUM_BAD).
[1] http://dpdk.org/ml/archives/dev/2016-May/039920.html
[2] http://dpdk.org/ml/archives/dev/2016-June/040007.html
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
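A small sketch of how an application could act on the new 2-bit L4
status described above (assuming the companion PKT_RX_L4_CKSUM_MASK
macro from this series):

    #include <rte_mbuf.h>

    /* Return 0 to accept, -1 to drop, 1 to verify the L4 checksum in sw. */
    static int
    l4_cksum_action(const struct rte_mbuf *m)
    {
        switch (m->ol_flags & PKT_RX_L4_CKSUM_MASK) {
        case PKT_RX_L4_CKSUM_GOOD:
            return 0;   /* checksum already verified by the driver/hw */
        case PKT_RX_L4_CKSUM_BAD:
            return -1;  /* known bad, drop without additional checks */
        case PKT_RX_L4_CKSUM_NONE:
            return 0;   /* data valid; recompute the cksum before re-tx */
        default:        /* PKT_RX_L4_CKSUM_UNKNOWN */
            return 1;   /* verify in software */
        }
    }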
Olivier Matz [Thu, 13 Oct 2016 14:16:03 +0000 (16:16 +0200)]
net: add function to calculate checksum in mbuf
This function can be used to calculate the checksum of data embedded in
an mbuf, which can be composed of several segments.
It will be used by the virtio pmd in the next commits to calculate the
checksum in software when the protocol is not recognized.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
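A hedged usage sketch for the commit above; the helper name
rte_raw_cksum_mbuf() and its signature are given as I recall them from
this series, so treat them as assumptions:

    #include <rte_mbuf.h>
    #include <rte_ip.h>

    /* Checksum `len` bytes of (possibly multi-segment) mbuf data at `off`. */
    static int
    cksum_of_payload(const struct rte_mbuf *m, uint32_t off, uint32_t len,
                     uint16_t *cksum)
    {
        /* Works even when the requested range spans several segments. */
        return rte_raw_cksum_mbuf(m, off, len, cksum);
    }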
Olivier Matz [Thu, 13 Oct 2016 14:16:02 +0000 (16:16 +0200)]
net/virtio: reinitialize device when configuring
Add the ability to reset the virtio device in the configure callback if
the feature flags changed since the previous reset. This will be
possible with the introduction of offload support in the next commits.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:01 +0000 (16:16 +0200)]
net/virtio: move control queue configuration
Move the configuration of the control queue into the configure
callback. This is needed by the next commit, which introduces the
reinitialization of the device in the configure callback to change the
feature flags. Therefore, the control queue will have to be restarted
at the same place.
As virtio_dev_cq_queue_setup() is called from a place where
config->max_virtqueue_pairs is not available, we need to store this in
the private structure. It replaces max_rx_queues and max_tx_queues which
have the same value. The log showing the value of max_rx_queues and
max_tx_queues is also removed since config->max_virtqueue_pairs is
already displayed above.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Olivier Matz [Thu, 13 Oct 2016 14:16:00 +0000 (16:16 +0200)]
net/virtio: move device initialization in a function
Move all code related to device initialization into a new function,
virtio_init_device().
This commit brings no functional change; it prepares for the next
commits that will add offload support. For that, it will be necessary
to reinitialize the device from ethdev->configure(), using this new
function.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Zhihong Wang [Tue, 20 Sep 2016 02:00:12 +0000 (22:00 -0400)]
vhost: fix Windows VM hang
This patch fixes a Windows VM compatibility issue in the DPDK 16.07
vhost code, which caused the guest to hang once any packets were
enqueued when mrg_rxbuf was turned on, by setting the right id and len
in the used ring.
As defined in virtio spec 0.95 and 1.0, in each used ring element, id
means the index of the start of the used descriptor chain, and len
means the total length of the descriptor chain which was written to.
In the 16.07 code, however, the index of the last descriptor was
assigned to id, and the length of the last descriptor was assigned to
len.
How to test?
1. Start testpmd in the host with a vhost port.
2. Start a Windows VM image with qemu and connect to the vhost port.
3. Start io forwarding with tx_first in host testpmd.
With the 16.07 code, the Windows VM will hang once any packets are enqueued.
Signed-off-by: Zhihong Wang <zhihong.wang@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:28:00 +0000 (15:28 +0800)]
net/vhost: add an option to enable dequeue zero copy
Add an option, dequeue-zero-copy, to enable this feature in vhost-pmd.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:27:59 +0000 (15:27 +0800)]
examples/vhost: add --dequeue-zero-copy option
Add an option, --dequeue-zero-copy, to enable dequeue zero copy.
One thing worth noting while using dequeue zero copy is that nb_tx_desc
has to be small enough so that the eth driver hits the mbuf free
threshold easily and thus frees mbufs more frequently.
The reason is that, when dequeue zero copy is enabled, the guest Tx
used vring is updated only when the corresponding mbuf is freed. If
mbufs are not freed frequently, the guest Tx vring can be starved.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:27:58 +0000 (15:27 +0800)]
vhost: add a flag to enable dequeue zero copy
Dequeue zero copy is disabled by default. Here, add a new flag,
``RTE_VHOST_USER_DEQUEUE_ZERO_COPY``, to explicitly enable it.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
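A minimal sketch of enabling the flag above when registering a
vhost-user socket; the two-argument rte_vhost_driver_register() form is
what this DPDK generation uses as far as I recall, so treat the header
and signature as assumptions:

    #include <rte_virtio_net.h>   /* vhost API header in this release */

    static int
    register_vhost_socket(const char *path)
    {
        /* Opt in to dequeue zero copy; it stays off unless this flag is set. */
        return rte_vhost_driver_register(path,
                                         RTE_VHOST_USER_DEQUEUE_ZERO_COPY);
    }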
Yuanhan Liu [Sun, 9 Oct 2016 07:27:57 +0000 (15:27 +0800)]
vhost: add dequeue zero copy
The basic idea of dequeue zero copy is, instead of copying data from
the desc buf, to let the mbuf reference the desc buf addr directly.
Doing so, however, has one major issue: we can't update the used ring
at the end of rte_vhost_dequeue_burst. Because we don't do the copy
here, an update of the used ring would let the driver reclaim the
desc buf. As a result, DPDK might reference a stale memory region.
To update the used ring properly, this patch does several tricks:
- When an mbuf references a desc buf, its refcnt is incremented by 1.
This pins the mbuf, so that an mbuf free from DPDK won't actually
free it; instead, the refcnt is just decremented by 1.
- We chain all those mbufs together (by tailq) and check the chain on
every rte_vhost_dequeue_burst entrance, to see whether an mbuf has
been freed (its refcnt equals 1). If so, it means we are the last user
of this mbuf and we are safe to update the used ring.
- "struct zcopy_mbuf" is introduced, to associate an mbuf with the
right desc idx.
Dequeue zero copy is introduced for performance reasons; some rough
tests show about a 50% performance boost for 1500B packets. For small
packets (e.g. 64B), it actually slows things down a bit (by up to
15%). That is expected, because this patch introduces some extra work,
which outweighs the benefit of saving a few bytes of copying.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:27:56 +0000 (15:27 +0800)]
vhost: introduce last available index for dequeue
So far, we retrieve both the used ring and avail ring idx with the var
last_used_idx; it isn't a problem because the used ring is updated
immediately after those avail entries are consumed.
But that's not true when dequeue zero copy is enabled, where the used
ring is updated only when the mbuf is consumed. Thus, we need another
var to note the last avail ring idx we have consumed.
Therefore, last_avail_idx is introduced.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:27:55 +0000 (15:27 +0800)]
vhost: get guest/host physical address mappings
So that we can convert a guest physical address to a host physical
address, which will be used in the later Tx zero copy implementation.
MAP_POPULATE is set while mmapping guest memory regions, to make sure
the page tables are set up and rte_mem_virt2phy() can then yield the
proper physical address.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Yuanhan Liu [Sun, 9 Oct 2016 07:27:54 +0000 (15:27 +0800)]
vhost: simplify memory regions handling
For historical reasons (vhost-cuse comes before vhost-user), some
fields for maintaining the vhost-user memory mappings (such as mmapped
address and size, with which we can then unmap on destroy) are kept in
the "orig_region_map" struct, a structure that is defined only in the
vhost-user source file.
The right way to go is to remove that structure and move all those
fields into the virtio_memory_region struct. But we simply couldn't do
that before, because it would have broken the ABI.
Now, thanks to the ABI refactoring, it is no longer a blocking issue.
And here it goes: this patch removes orig_region_map and redefines
virtio_memory_region to include all necessary info.
With that, we can simplify the guest/host address conversion a bit.
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Tested-by: Qian Xu <qian.q.xu@intel.com>
Jason Wang [Wed, 28 Sep 2016 08:25:12 +0000 (16:25 +0800)]
net/virtio: support IOMMU platform
Negotiate VIRTIO_F_IOMMU_PLATFORM to have IOMMU support.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Jason Wang [Wed, 28 Sep 2016 08:25:11 +0000 (16:25 +0800)]
net/virtio: support modern device id
Add modern device id and rename VIRTIO_PCI_DEVICEID_MIN to
VIRTIO_PCI_LEGACY_DEVICEID_NET. While at it, remove unused macros too.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
David Marchand [Fri, 7 Oct 2016 13:03:13 +0000 (15:03 +0200)]
net/virtio: add missing driver name
The driver name has been lost with the eal rework.
Restore it.
Fixes: c830cb295411 ("drivers: use PCI registration macro")
Signed-off-by: David Marchand <david.marchand@6wind.com>
Acked-by: Shreyansh Jain <shreyansh.jain@nxp.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Souvik Dey [Sun, 9 Oct 2016 03:38:26 +0000 (11:38 +0800)]
net/virtio: set MTU
Virtio interfaces do not currently allow the user to specify a
particular Maximum Transmission Unit (MTU). Consequently, the MTU of
virtio interfaces is typically set to the Ethernet default value of
1500.
This is problematic in the case of cloud deployments, in which a
specific (and potentially non-standard) MTU needs to be set by a DHCP
server and honored by all interfaces across the traffic path. To
achieve this, virtio interfaces should support setting the MTU.
When GRE/VXLAN tunneling is used for internal communication, there is
an overhead added by the infrastructure to the packet over and above
the Ethernet MTU of 1518. To take care of this overhead in these cases,
the DHCP server corrects the L3 MTU to 1454. But since virtio
interfaces did not have the MTU-set functionality, the MTU sent by the
DHCP server was ignored and the instance would still send packets with
a 1500-byte MTU, which after encapsulation exceeds 1518 and eventually
gets dropped in the infrastructure.
By adding an additional 'set_mtu' function to the virtio driver, we can
honor the MTU sent by the DHCP server. The DHCP server/controller can
then leverage this 'set_mtu' functionality to resolve the
above-mentioned issue of packets getting dropped due to incorrect size.
Signed-off-by: Souvik Dey <sodey@sonusnet.com>
Reviewed-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Mark Kavanagh [Thu, 6 Oct 2016 10:36:36 +0000 (11:36 +0100)]
doc: fix typo in pdump guide
- Fix a copy/paste error in the description of how to capture both Rx
and Tx traffic in a single pcap file.
- Replace a duplicated word with what the original author presumably
intended, so that the description now makes sense.
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Mark Kavanagh [Fri, 9 Sep 2016 16:15:52 +0000 (17:15 +0100)]
doc: clarify usage of testpmd MAC forward mode
Explain the default testpmd behavior in MAC forwarding mode to remove
ambiguity/confusion regarding the user's ability to specify Ethernet
addresses.
Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Maryam Tahhan [Wed, 7 Sep 2016 10:45:57 +0000 (11:45 +0100)]
doc: add xstats commands in testpmd guide
Update the testpmd user guide with instructions for retrieving extended
NIC statistics.
Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Ajit Khaparde [Wed, 12 Oct 2016 21:26:31 +0000 (16:26 -0500)]
app/testpmd: support 25G and 50G speeds
Support for configuring 25G and 50G speeds was missing from testpmd.
This patch also updates the testpmd user guide accordingly.
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Maciej Czekaj [Fri, 26 Aug 2016 11:46:42 +0000 (13:46 +0200)]
app/testpmd: configure flowgen packet size with --txpkts
"flowgen" forwarding mode has fixed packet size (300).
Let it reuse the --txpkts option for specifying the generated packet size.
Signed-off-by: Maciej Czekaj <maciej.czekaj@caviumnetworks.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Wenzhuo Lu [Mon, 26 Sep 2016 01:11:11 +0000 (09:11 +0800)]
app/testpmd: fix DCB configuration
An issue was found where DCB cannot be configured on ixgbe NICs; the
reported error says the Tx queue number is not right.
On ixgbe, the max Tx queue number is not fixed; it depends on the
multi-queue mode.
This patch adds the device configuration before getting info in the
DCB configuration process, so the right info can be obtained depending
on the configuration.
Fixes: 1a572499 ("app/testpmd: setup DCB forwarding based on traffic class")
Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
Acked-by: Bernard Iremonger <bernard.iremonger@intel.com>
Mohammad Abdul Awal [Fri, 5 Aug 2016 15:34:51 +0000 (16:34 +0100)]
app/testpmd: fix RSS hash key size
The RSS hash key size is retrieved from the device configuration
instead of using a fixed size of 40 bytes.
Fixes: f79959ea1504 ("app/testpmd: allow to configure RSS hash key")
Signed-off-by: Mohammad Abdul Awal <mohammad.abdul.awal@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Olivier Matz [Thu, 13 Oct 2016 13:40:30 +0000 (15:40 +0200)]
app/testpmd: fix TSO with checksum engine
The commit that disabled TSO for small packets was broken during the
rebase. The problem is the IP checksum is not calculated in software if:
- TX IP checksum is disabled
- TSO is enabled
- the current packet is smaller than tso segment size
When checking if the PKT_TX_IP_CKSUM flag should be set (in case
of tso), use the local tso_segsz variable, which is set to 0 when the
packet is too small to require tso. Therefore the IP checksum will be
correctly calculated in software.
Moreover, we should not use the tunnel segment size for non-tunnel TSO,
otherwise TSO stays disabled for all packets.
Fixes: 97c21329d42b ("app/testpmd: do not use TSO for small packets")
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Thomas Monjalon [Thu, 6 Oct 2016 10:34:23 +0000 (12:34 +0200)]
app/testpmd: use consistent vdev names
The vdev eth_bond has been renamed to net_bond.
testpmd was creating a bonding device with the old prefix; it is
changed for consistency.
The script test-null.sh was failing because it used the old name for
the null vdev.
Also fixes the bonding and testpmd docs.
Fixes: 2f45703c17ac ("drivers: make driver names consistent")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Thomas Monjalon [Thu, 6 Oct 2016 10:34:22 +0000 (12:34 +0200)]
app/test: fix vdev names
The vdev eth_ring has been renamed to net_ring.
Some unit tests were using the old name and failed.
Also fixes the vdev comments in EAL and ethdev.
Fixes: 2f45703c17ac ("drivers: make driver names consistent")
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Thomas Monjalon [Mon, 3 Oct 2016 20:58:33 +0000 (22:58 +0200)]
app/test: add mempool walk
The mempool function rte_mempool_walk was not tested.
The new test prints the names of all mempools.
Signed-off-by: Thomas Monjalon <thomas.monjalon@6wind.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
Wei Dai [Thu, 29 Sep 2016 18:11:33 +0000 (02:11 +0800)]
app/test: reduce IPv6 LPM data size
Copy app/test/test_lpm6_routes.h to app/test/test_lpm6_data.h, then
delete app/test/test_lpm6_routes.h and clear large_ips_table[] to make
the LPM6 test case much smaller than before. Also add code in
app/test/test_lpm6_data.h to generate the test data in
large_ips_table[] at run time.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Wei Dai [Tue, 27 Sep 2016 17:38:27 +0000 (01:38 +0800)]
app/test: remove large IPv4 LPM data file
Remove the large file app/test/test_lpm_routes.h and add code to
auto-generate a similarly large route rule table which keeps the same
depth and IP class distribution as the previous one in
test_lpm_routes.h. With the rule table auto-generated at run time, the
lookup performance stays similar to that of the previous constant table.
Signed-off-by: Wei Dai <wei.dai@intel.com>
Pablo de Lara [Thu, 6 Oct 2016 22:34:50 +0000 (23:34 +0100)]
app/test: fix hash multiwriter sequence
The hash multiwriter test consists of two subtests.
If any of the subtests fails, the overall test should fail, but the
overall test only passed if the second subtest passed, because the
return value of the first subtest was being overwritten.
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Jianfeng Tan [Thu, 4 Aug 2016 07:58:49 +0000 (07:58 +0000)]
examples/tep_term: fix inner L4 checksum
When sending packets that need TSO by the hardware NIC from a virtual
machine, the inner L4 checksum is not correct on the other side of the
cable.
This is because get_psd_sum() depends on PKT_TX_TCP_SEG to calculate
the pseudo-header checksum, but currently this bit is set after
get_psd_sum() is called. The fix is straightforward: move the bit
setting before get_psd_sum() is called.
Fixes: a50245ede72a ("examples/tep_term: initialize VXLAN sample")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Jianfeng Tan [Thu, 4 Aug 2016 07:58:48 +0000 (07:58 +0000)]
examples/tep_term: fix offload on VXLAN
Based on the previous fix of offload on VXLAN using i40e, applications
need to set the proper tunneling type in ol_flags so that the i40e
driver can pass it to the NIC.
Fixes: a50245ede72a ("examples/tep_term: initialize VXLAN sample")
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Hemant Agrawal [Tue, 23 Aug 2016 14:54:40 +0000 (20:24 +0530)]
examples/l3fwd: enable 4M hash for all 64-bit archs
This patch enables support for 4 million hash entries on all 64-bit
architectures.
Signed-off-by: Hemant Agrawal <hemant.agrawal@nxp.com>
Maxime Coquelin [Fri, 23 Sep 2016 13:50:53 +0000 (15:50 +0200)]
examples/l2fwd: add option --[no-]mac-updating
l2fwd can be useful for testing virtual devices without the need for
physical ones.
To achieve this, this patch adds a new option to enable/disable the
MAC address updating done at forwarding time: --[no-]mac-updating.
It enables the use of l2fwd for basic VM-to-VM communication.
By default, MAC address updating remains enabled, to keep consistency
with previous usage.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: John McNamara <john.mcnamara@intel.com>
Jasvinder Singh [Thu, 1 Sep 2016 10:11:04 +0000 (11:11 +0100)]
examples/qos_sched: fix dequeue from ring
app_worker_thread() and app_mixed_thread() use rte_ring_sc_dequeue_bulk
to dequeue packets from the ring; this requires the number of packets
in the software ring to be greater than the specified value before the
actual dequeue operation starts, and thus adds latency to those
packets. Therefore, rte_ring_sc_dequeue_bulk is replaced with
rte_ring_sc_dequeue_burst.
Fixes: de3cfa2c9823 ("sched: initial import")
Suggested-by: Tao Y Yang <tao.y.yang@intel.com>
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
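A sketch of the change above, using the pre-17.05 single-consumer ring
API (return conventions assumed from that era):

    #include <rte_ring.h>

    #define BURST_SIZE 64

    static unsigned
    dequeue_packets(struct rte_ring *r, void *pkts[BURST_SIZE])
    {
        /* Old behavior (bulk): all-or-nothing, so nothing is dequeued
         * until BURST_SIZE packets are available, adding latency:
         *     if (rte_ring_sc_dequeue_bulk(r, pkts, BURST_SIZE) != 0)
         *         return 0;
         *     return BURST_SIZE;
         */

        /* New behavior (burst): take whatever is available, up to BURST_SIZE. */
        return rte_ring_sc_dequeue_burst(r, pkts, BURST_SIZE);
    }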
Jasvinder Singh [Thu, 13 Oct 2016 09:17:49 +0000 (10:17 +0100)]
examples/ip_pipeline: add configuration with TAP
To illustrate TAP port usage, a sample configuration file with a
passthrough pipeline connected to a TAP interface is added.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Thu, 13 Oct 2016 09:17:48 +0000 (10:17 +0100)]
examples/ip_pipeline: add TAP port
TAP port support is added to the ip_pipeline app. To parse
configuration files with TAP port entries, a parsing function is
implemented. The TAP port configuration check and initialization
routines have been included in the application code.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Thu, 13 Oct 2016 09:17:47 +0000 (10:17 +0100)]
port: support file descriptor
This patch adds a File Descriptor (FD) port type (e.g. a TAP port) to
the packet framework library, which allows interfacing with the kernel
network stack. The FD port APIs are defined to allow port creation, and
writing and reading packets to/from the kernel interface.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Gowrishankar Muthukrishnan [Tue, 4 Oct 2016 10:43:17 +0000 (16:13 +0530)]
examples/ip_pipeline: fix plugin loading
There is typo in init.c of ip_pipeline example due to which,
invalid file path is added to -d option of EAL i.e path starting
with =.
Signed-off-by: Gowrishankar Muthukrishnan <gowrishankar.m@linux.vnet.ibm.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Fri, 26 Aug 2016 21:21:45 +0000 (22:21 +0100)]
examples/ip_pipeline: add swap action in configuration
The network_layers configuration file (config/network_layers.cfg)
demonstrates the various network layer components such as TCP, UDP,
ICMP, etc., which can be easily integrated into the ip_pipeline
infrastructure.
The loopback function (implemented using the passthrough pipeline) is
updated to swap the IP source and destination addresses, and the UDP
source and destination ports.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Fri, 26 Aug 2016 21:21:44 +0000 (22:21 +0100)]
examples/ip_pipeline: add swap action in pass-through
The pass-through pipeline is updated with the addition of a packet
field swap action. To enable the swap action, a new entry, 'swap', is
required in the passthrough pipeline section of the configuration file;
this entry contains the offsets (in bytes) of the packet fields to be
swapped.
Each swap entry specifies one pair of packet field offsets to be
swapped. Therefore, to perform the swap action on more than one pair of
packet fields, separate swap entries, each responsible for a unique
pair of packet fields, are needed.
The following illustrates a pass-through pipeline configuration that
swaps the source and destination MAC addresses and TCP ports of the
received packets.
[EAL]
log_level = 0
[PIPELINE0]
type = MASTER
core = 0
[PIPELINE1]
type = PASS-THROUGH
core = 1
pktq_in = RXQ0.0 RXQ1.0 RXQ2.0 RXQ3.0
pktq_out = TXQ0.0 TXQ1.0 TXQ2.0 TXQ3.0
swap = 256 262; MACDST <-> MACSRC
;swap = 282 286; IPSRC <-> IPDST
swap = 290 292; PORTSRC <-> PORTDST
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Tue, 9 Aug 2016 16:30:56 +0000 (17:30 +0100)]
examples/ip_pipeline: set source port default
The default value of the ``file_name`` parameter of the source port
structure is changed from ``NULL`` to ``./config/packets.pcap``.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Jasvinder Singh [Tue, 9 Aug 2016 16:30:55 +0000 (17:30 +0100)]
port: modify source and sink port structure
The data type of ``file_name`` in ``struct rte_port_source_params`` and
``struct rte_port_sink_params`` is changed from ``char *`` to ``const char *``.
Signed-off-by: Jasvinder Singh <jasvinder.singh@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Guruprasad Rao [Thu, 22 Sep 2016 10:12:07 +0000 (03:12 -0700)]
app/test: add cuckoo hash table
This patch includes a cuckoo hash table for testing all the APIs.
The cuckoo hash is added for both the test_table_tables and
test_table_combined cases.
The testing is complete and the results are OK.
Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Signed-off-by: Guruprasad Rao <guruprasadx.rao@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Guruprasad Rao [Thu, 22 Sep 2016 10:12:06 +0000 (03:12 -0700)]
app/test-pipeline: add cuckoo hash
This patch includes the cuckoo hash table in test-pipeline.
This allows benchmarking the performance of the cuckoo hash table.
The following key sizes are supported for the cuckoo hash table:
8, 16, 32, 48, 64, 80, 96, 112 and 128.
The test-pipeline can be run using the following command,
e.g. for key size 8:
./app/testpipeline -c 0xe -n 4 -- -p 0xf --hash-cuckoo-8
Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Signed-off-by: Guruprasad Rao <guruprasadx.rao@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Guruprasad Rao [Thu, 22 Sep 2016 10:12:05 +0000 (03:12 -0700)]
table: add cuckoo hash
This patch provides table APIs for the dosig version of the cuckoo hash
via rte_table_hash_cuckoo_dosig_ops.
The following APIs are implemented for the cuckoo hash:
rte_table_hash_cuckoo_create
rte_table_hash_cuckoo_free
rte_table_hash_cuckoo_entry_add
rte_table_hash_cuckoo_entry_delete
rte_table_hash_cuckoo_lookup_dosig
rte_table_hash_cuckoo_stats_read
Signed-off-by: Sankar Chokkalingam <sankarx.chokkalingam@intel.com>
Signed-off-by: Guruprasad Rao <guruprasadx.rao@intel.com>
Acked-by: Cristian Dumitrescu <cristian.dumitrescu@intel.com>
Pablo de Lara [Wed, 12 Oct 2016 01:26:44 +0000 (02:26 +0100)]
hash: fix bucket size usage
The multiwriter insert function was using a fixed value for the bucket
size instead of the RTE_HASH_BUCKET_ENTRIES macro, whose value was
changed recently (making it inconsistent in this case).
Fixes: be856325cba3 ("hash: add scalable multi-writer insertion with Intel TSX")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Pablo de Lara [Wed, 12 Oct 2016 00:50:13 +0000 (01:50 +0100)]
hash: fix unlimited cuckoo path
When trying to insert a new entry, if its target bucket is full,
the alternative location (bucket) of one of its entries is checked
with make_space_bucket(), to try to find an empty slot.
This function is called recursively every time a new bucket is checked.
To avoid a very long insert operation (and to avoid filling up the
stack), a limit on the number of pushes is introduced.
Fixes: 48a399119619 ("hash: replace with cuckoo hash implementation")
Signed-off-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Reshma Pattan [Tue, 4 Oct 2016 16:42:22 +0000 (17:42 +0100)]
app/procinfo: free xstats memory upon failure
Some of the failure cases inside the nic_xstats_display() function did
not free the memory allocated for the xstats and their names; that
memory is freed now.
Fixes: e2aae1c1 ("ethdev: remove name from extended statistic fetch")
Fixes: 22561383 ("app: replace dump_cfg by proc_info")
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
Reshma Pattan [Mon, 10 Oct 2016 14:35:48 +0000 (15:35 +0100)]
pdump: fix created directory permissions
Inside the function pdump_get_socket_path(), pdump socket directories
are created using a mkdir() call with permissions 700, which assigned
the wrong permissions to the directories, i.e. "d-w-r-xr-T" instead of
"drwx------". The reason is that mkdir() does not treat 700 as an octal
value unless 0 is explicitly prepended to it. Because of this, a socket
creation failure was observed when a DPDK application was run in
non-root user mode; an application running in root user mode never
reported the issue.
So 0 is prefixed to the value to create the directories with the
correct permissions.
Fixes: e4ffa2d3 ("pdump: fix error handlings")
Fixes: bdd8dcc6 ("pdump: fix default socket path")
Reported-by: Jianfeng Tan <jianfeng.tan@intel.com>
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Acked-by: Remy Horton <remy.horton@intel.com>
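The underlying C pitfall, for reference (plain POSIX, not DPDK-specific):

    #include <sys/stat.h>
    #include <sys/types.h>

    /* 700 in decimal is 01274 in octal, which (after the umask) yields the
     * mangled "d-w-r-xr-T" mode; 0700 is the intended octal value and
     * yields "drwx------". */
    static int
    make_private_dir(const char *path)
    {
        return mkdir(path, 0700);
    }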
Reshma Pattan [Mon, 10 Oct 2016 21:33:13 +0000 (22:33 +0100)]
mk: use -march option with recent Intel processors names
The GCC 4.9 -march option supports the Intel code names for processors,
for example -march=silvermont or -march=broadwell.
The RTE_MACHINE config flag can be used to pass the code name to the
compiler as the -march flag.
The release notes are updated.
The Linux and FreeBSD getting started guides are updated with the
recommended GCC version of 4.9 and above.
Some of the gmake command examples in the sample application guide and
driver guides are updated to GCC 4.9.
Signed-off-by: Reshma Pattan <reshma.pattan@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:50 +0000 (17:39 +0200)]
app/testpmd: hide segment size when not relevant
When TSO is not requested, hide the segment size.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:49 +0000 (17:39 +0200)]
app/testpmd: do not use TSO for small packets
Asking for TSO (TCP Segmentation Offload) on packets that are already
smaller than (headers + MSS) does not work, for instance on ixgbe.
Fix the csumonly engine to only set the TSO flag when a segmentation
offload is really required, i.e. when the packet is large enough.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:48 +0000 (17:39 +0200)]
app/testpmd: display Rx port in checksum engine
This information is useful when debugging, especially with
bidirectional traffic.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:47 +0000 (17:39 +0200)]
app/testpmd: do not change IP addrs in checksum engine
The csum forward engine was updated to change the IP addresses in the
packet data in commit 51f694dd40f5 ("app/testpmd: rework checksum
forward engine").
This was done to ensure that the checksum is correctly reprocessed when
using hardware checksum offload. But the functions
process_inner_cksums() and process_outer_cksums() already reset the
checksum field to 0, so this is not necessary.
Moreover, this makes the engine more complex than needed, and prevents
it from being easily used to forward traffic (like iperf), as it
modifies the packets.
This patch drops this behavior.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:46 +0000 (17:39 +0200)]
app/testpmd: add option to enable LRO
Introduce a new argument '--enable-lro' to ask testpmd to enable the LRO
feature on enabled ports, like it's done for '--enable-rx-cksum' for
instance.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:45 +0000 (17:39 +0200)]
app/testpmd: dump Rx flags in checksum engine
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:44 +0000 (17:39 +0200)]
app/testpmd: dump offload flags with new functions
Use the functions introduced in the previous commit to dump the offload
flags.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
Olivier Matz [Wed, 12 Oct 2016 15:39:43 +0000 (17:39 +0200)]
mbuf: add functions to dump offload flags
The functions rte_get_rx_ol_flag_name() and rte_get_tx_ol_flag_name()
can dump one flag, or a set of flags that are part of the same mask
(e.g. PKT_TX_UDP_CKSUM, part of PKT_TX_L4_MASK). But they are not
designed to dump the list of flags contained in mbuf->ol_flags.
This commit introduces new functions to do that. Similarly to the
packet type dump functions, the goal is to factor out code that could
be used in several applications and reduce the risk of
desynchronization between the flags and the dump functions.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Acked-by: Pablo de Lara <pablo.de.lara.guarch@intel.com>
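A hedged sketch of dumping a full ol_flags value with the new helpers;
the function name rte_get_rx_ol_flag_list() and its signature are as I
recall them from this commit, so treat them as assumptions:

    #include <stdio.h>
    #include <rte_mbuf.h>

    static void
    print_rx_ol_flags(const struct rte_mbuf *m)
    {
        char buf[256];

        /* Format every Rx offload flag present in ol_flags into buf. */
        if (rte_get_rx_ol_flag_list(m->ol_flags, buf, sizeof(buf)) == 0)
            printf("rx ol_flags: %s\n", buf);
    }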
Olivier Matz [Mon, 3 Oct 2016 08:38:57 +0000 (10:38 +0200)]
app/testpmd: display software packet type
In addition to the packet type returned by the PMD, also display the
packet type calculated by parsing the packet in software. This is
particularly useful for comparing the two values.
Note: it does not mean that both hw and sw always have to provide the
same value, since it depends on what the hardware supports.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:56 +0000 (10:38 +0200)]
app/testpmd: dump packet type with new function
Use the function introduced in the previous commit to dump the packet
type of the received packets.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:55 +0000 (10:38 +0200)]
mbuf: clarify definition of fragment packet types
An IPv4 packet is considered a fragment if:
- the MF (more fragments) bit is set,
- or the Fragment_Offset field is non-zero.
Update the API documentation of packet types to reflect this.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:54 +0000 (10:38 +0200)]
mbuf: add functions to dump packet type
Dumping the packet type is useful for debug purposes. Instead of having
each application provide its own function to do that, introduce
functions to do it.
This factors out the code and reduces the risk of desynchronization
between new packet types and the dump functions.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:53 +0000 (10:38 +0200)]
net: get packet type for the first layers only
Add a parameter to rte_net_get_ptype() to select which layers should be
parsed. This avoids parsing all layers if only the first ones are
required.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
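A sketch of restricting the parse depth with the new parameter
(assuming the rte_net_get_ptype() signature and the RTE_PTYPE_*_MASK
macros as used by this series):

    #include <rte_mbuf.h>
    #include <rte_net.h>

    static uint32_t
    classify_l3_only(const struct rte_mbuf *m)
    {
        struct rte_net_hdr_lens hdr_lens;

        /* Stop after L2/L3: cheaper than a full parse when only the
         * network-layer type is needed. */
        return rte_net_get_ptype(m, &hdr_lens,
                                 RTE_PTYPE_L2_MASK | RTE_PTYPE_L3_MASK);
    }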
Olivier Matz [Mon, 3 Oct 2016 08:38:52 +0000 (10:38 +0200)]
net: support NVGRE in software packet type parser
Add support for NVGRE tunnels in rte_net_get_ptype(). At the same time,
as NVGRE transports Ethernet, we need to add support for inner VLAN,
QinQ, and MPLS.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:51 +0000 (10:38 +0200)]
net: support GRE in software packet type parser
Add support for GRE tunnels in rte_net_get_ptype().
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:50 +0000 (10:38 +0200)]
net: add GRE header structure
Add the GRE header structure to librte_net. It will be used by the next
patches, which add support for GRE tunnels to the software packet type
parser.
The extended headers (checksum, key or sequence number) are not defined.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:49 +0000 (10:38 +0200)]
net: support IP tunnels in software packet type parser
Add support of IP and IP6 tunnels in rte_net_get_ptype().
We need to duplicate some code because the packet types do not have the
same value for a given protocol between inner and outer.
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:48 +0000 (10:38 +0200)]
net: support QinQ in software packet type parser
Add a new RTE_PTYPE_L2_ETHER_QINQ packet type, and its support in
rte_net_get_ptype().
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:47 +0000 (10:38 +0200)]
net: support VLAN in software packet type parser
Add a new RTE_PTYPE_L2_ETHER_VLAN packet type, and its support in
rte_net_get_ptype().
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:46 +0000 (10:38 +0200)]
net: add function to get packet type from data
Introduce the function rte_net_get_ptype() that parses an mbuf and
returns its packet type. For now, the following packet types are parsed:
L2: Ether
L3: IPv4, IPv6
L4: TCP, UDP, SCTP
The goal here is to provide a reference implementation for packet type
parsing. This function will be used by testpmd in the next commits,
allowing its result to be compared with the value given by the hardware.
This function will also be useful when implementing Rx offload support
in the virtio pmd. Indeed, the virtio protocol gives the csum start and
offset, but it gives neither the L4 protocol nor whether the checksum
is relevant for the inner or outer headers. This information has to be
known to properly set the ol_flags in the mbuf.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Jean Dao <jean.dao@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:45 +0000 (10:38 +0200)]
net: introduce net library
Previously, librte_net only contained header files. Add a C file
(empty for now) and generate a library. It will contain network helpers
like checksum calculation, software packet type parser, ...
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:44 +0000 (10:38 +0200)]
mbuf: move packet type definitions in a new file
The file rte_mbuf.h is starting to get quite big, and the next commits
will introduce more functions related to packet types. Let's move them
into a new file.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:43 +0000 (10:38 +0200)]
net: move ethernet definitions to the net library
The proper place for rte_ether.h is in librte_net because it defines
network headers.
Moving it will also prevent circular references in the following
patches, which will require the Ethernet header definition in
rte_mbuf.c.
While at it, fix minor checkpatch issues.
Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
Olivier Matz [Mon, 3 Oct 2016 08:38:42 +0000 (10:38 +0200)]
mbuf: add function to read packet data
Introduce a new function to read the packet data from an mbuf chain. It
linearizes the data if required, and also ensures that the mbuf is
large enough.
This function is used in the next commits, which add a software parser
to retrieve the packet type.
Signed-off-by: Olivier Matz <olivier.matz@6wind.com>
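A hedged sketch of the read helper introduced above; the name
rte_pktmbuf_read() and its copy-or-point behavior are as I recall them
from this commit, so treat them as assumptions:

    #include <rte_mbuf.h>
    #include <rte_ether.h>

    /* Get a contiguous view of the Ethernet header, copying into the local
     * buffer only if the header straddles mbuf segments. */
    static const struct ether_hdr *
    peek_eth_hdr(const struct rte_mbuf *m, struct ether_hdr *copy_buf)
    {
        return rte_pktmbuf_read(m, 0, sizeof(struct ether_hdr), copy_buf);
    }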
David Marchand [Fri, 7 Oct 2016 13:01:16 +0000 (15:01 +0200)]
ethdev: fix vendor id in debug message
Fixes: af75078fece3 ("first public release")
Signed-off-by: David Marchand <david.marchand@6wind.com>
David Marchand [Fri, 7 Oct 2016 13:01:15 +0000 (15:01 +0200)]
ethdev: fix hotplug attach
If a PCI probe operation creates a port but, for any reason, fails to
finish this operation and decides to delete the newly created port,
then the last created port id cannot be trusted anymore and any
subsequent attach operation will fail.
This problem was noticed while working on a VM that had a virtio-net
management interface bound to the virtio-net kernel driver and no port
whitelisted on the command line:
root@ubuntu1404:~/dpdk# ./build/app/testpmd -c 0x6 --
-i --total-num-mbufs=2049
EAL: Detected 3 lcore(s)
EAL: Probing VFIO support...
EAL: Debug logs available - lower performance
EAL: WARNING: cpu flags constant_tsc=yes nonstop_tsc=no -> using
unreliable clock cycles !
EAL: PCI device 0000:00:03.0 on NUMA socket -1
EAL: probe driver: 1af4:1000 (null)
rte_eth_dev_pci_probe: driver (null): eth_dev_init(vendor_id=0x6900
device_id=0x1000) failed
EAL: No probed ethernet devices
^
|
Here, rte_eth_dev_pci_probe() fails since vtpci_init() reports an
error. This results in a rte_eth_dev_release_port() right after a
rte_eth_dev_allocate().
Then, if we try to attach a port using rte_eth_dev_attach:
testpmd> port attach net_ring0
Attaching a new port...
PMD: Initializing pmd_ring for net_ring0
PMD: Creating rings-backed ethdev on numa socket 0
Two solutions:
- either update the last created port index to something invalid
(when freeing an ethdev port),
- or rely on the port count, before and after the EAL attach.
The latter solution seems (well, not really more robust, but at least)
less fragile than the former.
We still have some issues with drivers that create multiple ethdev
ports with a single probe operation, but this was already the case.
Fixes: b0fb26685570 ("ethdev: convert to EAL hotplug")
Reported-by: Daniel Mrzyglod <danielx.t.mrzyglod@intel.com>
Signed-off-by: David Marchand <david.marchand@6wind.com>
Jianfeng Tan [Mon, 26 Sep 2016 13:48:34 +0000 (13:48 +0000)]
app/testpmd: support tunneled TSO in checksum engine
Add a new command "tunnel_tso set <tso_segsz> <port>" to enable
segmentation offload and set MSS to tso_segsz. Another command,
"tunnel_tso show <port>" is added to show tunneled packet MSS.
Result 0 means tunnel_tso is disabled.
The original commands, "tso set <tso_segsz> <port>" and "tso show
<port>", are only responsible for non-tunneled packets, while the new
commands are for tunneled packets.
The following conditions are needed to make it work:
a. tunnel TSO is supported by the NIC;
b. "csum parse_tunnel" must be set so that tunneled pkts are
recognized;
c. for tunneled pkts whose outer L3 is IPv4, "csum set outer-ip" must
be set to hw, because after TSO the total_len of the outer IP header
is changed, so a checksum of the outer IP header calculated by sw
would be wrong; that is not necessary for IPv6 tunneled pkts because
there is no checksum field to fill.
Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Jianfeng Tan [Mon, 1 Aug 2016 03:56:54 +0000 (03:56 +0000)]
net/i40e: support TSO on tunneling packet
To enable Tx side offload on tunneling packets, the driver should set
the correct tunneling parameters: (1) EIPT, external IP header type;
(2) EIPLEN, external IP; (3) L4TUNT; (4) L4TUNLEN. This parsing
behavior is based on (ol_flags & PKT_TX_TUNNEL_MASK), and when it's a
tunneling packet, MACLEN defines the outer L2 header.
Also, TSO on each kind of tunneling type is defined as a capability.
For now only i40e declares support for them.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Jianfeng Tan [Mon, 1 Aug 2016 03:56:53 +0000 (03:56 +0000)]
mbuf: add Tx side tunneling type
To support tunneling packet offload capabilities on the Tx side, PMDs
(e.g. i40e) need to know the tunneling type of the packet. Instead of
analyzing the packet itself, we depend on applications to correctly set
the tunneling type. These flags are defined inside rte_mbuf.ol_flags.
Signed-off-by: Zhe Tao <zhe.tao@intel.com>
Signed-off-by: Jianfeng Tan <jianfeng.tan@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Fiona Trahe [Thu, 6 Oct 2016 17:34:29 +0000 (18:34 +0100)]
app/test: remove crypto queue number hard-coding
ts_params->conf.nb_queue_pairs should not be hard-coded with a
device-specific number. It should be retrieved from the device info.
Any test that changes it should restore it to the original value.
Signed-off-by: Akhil Goyal <akhil.goyal@nxp.com>
Signed-off-by: Fiona Trahe <fiona.trahe@intel.com>