From: Bruce Richardson
Date: Thu, 11 Sep 2014 13:15:44 +0000 (+0100)
Subject: mbuf: split mbuf across two cache lines
X-Git-Tag: spdx-start~10426
X-Git-Url: http://git.droids-corp.org/?a=commitdiff_plain;h=f867492346bd271742dd34974e9cf8ac55ddb869;p=dpdk.git

mbuf: split mbuf across two cache lines

This change splits the mbuf in two, moving the pool and next pointers to
the second cache line. This frees up 16 bytes in the first cache line.

The reason for this change is that we believe there is no way we can
ever fit all the fields we need into a 64-byte mbuf, so we need to start
looking at a 128-byte mbuf instead. Examples of new fields that need to
fit in include:

* 32 more bits of filter information to support the new filters in the
  i40e driver (and possibly other future drivers)
* an additional 2-4 bytes for storing info on a second VLAN tag, to
  allow drivers to support double VLAN/QinQ
* 4 bytes for storing a sequence number, to enable out-of-order packet
  processing and subsequent packet reordering

as well as potentially a number of other fields, or splitting out fields
that are currently superimposed over each other, e.g. for the QoS
scheduler. We also want to leave space for use by other non-Intel NIC
drivers that may be open-sourced to dpdk.org in the future, where they
support fields and offloads that currently supported hardware doesn't.

If we accept the fact of a 2-cache-line mbuf, the issue becomes how to
rework things so that we spread our fields over the two cache lines
while causing the lowest slow-down possible. The general approach we are
looking to take is to focus the first cache line on fields that are
updated on RX, so that receive only deals with one cache line. The
second cache line can be used for application data and information that
will only be used on the TX leg.
This would allow us to work on the first cache line in RX as now, and
have the second cache line prefetched in the background so that it is
available when necessary. Hardware prefetches should help us out here.
We may also move rarely used or slow-path RX fields, e.g. those for
chained mbufs with jumbo frames, to the second cache line, depending
upon the performance impact and byte savings achieved.

Signed-off-by: Bruce Richardson
Acked-by: Thomas Monjalon
---

diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index 1b254812da..66bcbc54b0 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -782,7 +782,7 @@ test_failing_mbuf_sanity_check(void)
 static int
 test_mbuf(void)
 {
-	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != 64);
+	RTE_BUILD_BUG_ON(sizeof(struct rte_mbuf) != CACHE_LINE_SIZE * 2);
 
 	/* create pktmbuf pool if it does not exist */
 	if (pktmbuf_pool == NULL) {
diff --git a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
index ab022bd775..25ed672136 100644
--- a/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
+++ b/lib/librte_eal/linuxapp/eal/include/exec-env/rte_kni_common.h
@@ -108,7 +108,7 @@ struct rte_kni_fifo {
  * Padding is necessary to assure the offsets of these fields
  */
 struct rte_kni_mbuf {
-	void *buf_addr;
+	void *buf_addr __attribute__((__aligned__(64)));
 	char pad0[10];
 	uint16_t data_off;      /**< Start address of data in segment buffer. */
 	char pad1[4];
@@ -117,9 +117,9 @@ struct rte_kni_mbuf {
 	uint16_t data_len;      /**< Amount of data in segment buffer. */
 	uint32_t pkt_len;       /**< Total pkt len: sum of all segment data_len. */
 	char pad3[8];
-	void *pool;
+	void *pool __attribute__((__aligned__(64)));
 	void *next;
-} __attribute__((__aligned__(64)));
+};
 
 /*
  * Struct used to create a KNI device.
 * Passed to the kernel in IOCTL call
 */

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 34900d4bc2..508021bd90 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -176,7 +176,8 @@ struct rte_mbuf {
 			uint32_t sched;     /**< Hierarchical scheduler */
 		} hash;                 /**< hash information */
 
-	/* fields only used in slow path or on TX */
+	/* second cache line - fields only used in slow path or on TX */
+	MARKER cacheline1 __rte_cache_aligned;
 	struct rte_mempool *pool; /**< Pool from which mbuf was allocated. */
 	struct rte_mbuf *next;    /**< Next segment of scattered packet. */