..  BSD LICENSE

    Copyright(c) 2017 Intel Corporation. All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in
      the documentation and/or other materials provided with the
      distribution.
    * Neither the name of Intel Corporation nor the names of its
      contributors may be used to endorse or promote products derived
      from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
Generic Receive Offload Library
===============================
Generic Receive Offload (GRO) is a widely used SW-based offloading
technique to reduce per-packet processing overheads. By reassembling
small packets into larger ones, GRO enables applications to process
fewer large packets directly, thus reducing the number of packets to
be processed. To benefit DPDK-based applications, like Open vSwitch,
DPDK provides its own GRO implementation. In DPDK, GRO is implemented
as a standalone library. Applications explicitly use the GRO library to
merge packets.
In the GRO library, there are many GRO types, which are defined by
packet types. One GRO type is in charge of processing one kind of
packet. For example, TCP/IPv4 GRO processes TCP/IPv4 packets.
Each GRO type has a reassembly function, which defines its own algorithm
and table structure to reassemble packets. We assign input packets to
the corresponding GRO functions by ``MBUF->packet_type``.
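The dispatch step can be modeled as a lookup table from packet type to
reassembly function. The sketch below is a simplified Python model of
that idea, not the real C implementation: the actual library dispatches
on ``MBUF->packet_type``, and the names here are hypothetical.

```python
# Simplified model of dispatching packets to per-GRO-type reassembly
# functions. All names here are illustrative, not the DPDK API.

def tcp4_reassemble(pkt, table):
    """Stand-in for the TCP/IPv4 reassembly function."""
    table.setdefault("tcp4", []).append(pkt)
    return True  # packet accepted by the TCP/IPv4 table

# One GRO type is in charge of one kind of packet.
REASSEMBLE_BY_TYPE = {
    "TCP/IPv4": tcp4_reassemble,
}

def gro_dispatch(pkt, pkt_type, table):
    fn = REASSEMBLE_BY_TYPE.get(pkt_type)
    if fn is None:
        return False  # unsupported GRO type: leave the packet as-is
    return fn(pkt, table)

table = {}
assert gro_dispatch("pkt0", "TCP/IPv4", table) is True
assert gro_dispatch("pkt1", "UDP/IPv4", table) is False
```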
The GRO library doesn't check if input packets have correct checksums
and doesn't re-calculate checksums for merged packets. The GRO library
assumes the packets are complete (i.e., MF==0 && frag_off==0), when IP
fragmentation is possible (i.e., DF==0). Additionally, it complies with
RFC 6864 to process the IPv4 ID field.
Currently, the GRO library provides GRO support for TCP/IPv4 packets.
For different usage scenarios, the GRO library provides two sets of API.
One is the lightweight mode API, which enables applications to merge a
small number of packets rapidly; the other is the heavyweight mode API,
which provides fine-grained controls to applications and supports
merging a large number of packets.
The lightweight mode only has one function, ``rte_gro_reassemble_burst()``,
which processes N packets at a time. Using the lightweight mode API to
merge packets is very simple: calling ``rte_gro_reassemble_burst()`` is
enough. The GROed packets are returned to applications as soon as it
finishes.
In ``rte_gro_reassemble_burst()``, the table structures of different GRO
types are allocated on the stack. This design simplifies applications'
operations. However, limited by the stack size, the maximum number of
packets that ``rte_gro_reassemble_burst()`` can process in an invocation
should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
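Because of this limit, an application holding more packets than
``RTE_GRO_MAX_BURST_ITEM_NUM`` must split them across several calls. A
minimal Python sketch of that splitting follows; the value 32 merely
stands in for the real constant's value here, for illustration only.

```python
# Sketch: splitting a large packet burst into chunks no larger than
# RTE_GRO_MAX_BURST_ITEM_NUM before each rte_gro_reassemble_burst()
# call. The value 32 is an assumption made purely for illustration.
MAX_BURST_ITEM_NUM = 32

def split_burst(pkts, limit=MAX_BURST_ITEM_NUM):
    """Yield consecutive chunks of at most `limit` packets."""
    for i in range(0, len(pkts), limit):
        yield pkts[i:i + limit]

chunks = list(split_burst(list(range(70))))
assert [len(c) for c in chunks] == [32, 32, 6]
```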
Compared with the lightweight mode, using the heavyweight mode API is
relatively complex. Firstly, applications need to create a GRO context
by ``rte_gro_ctx_create()``. ``rte_gro_ctx_create()`` allocates the
table structures in the heap and stores their pointers in the GRO
context. Secondly, applications use ``rte_gro_reassemble()`` to merge
packets. If input packets have invalid parameters,
``rte_gro_reassemble()`` returns them to applications. For example,
packets of unsupported GRO types or TCP SYN packets are returned.
Otherwise, the input packets are either merged with the existing packets
in the tables or inserted into the tables. Finally, applications use
``rte_gro_timeout_flush()`` to flush packets from the tables, when they
want to get the GROed packets.
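The create/reassemble/flush sequence can be modeled in a few lines.
Everything below is a toy Python illustration of the call flow only,
not the real C API: the tables, timeout handling, and return values are
deliberately simplified.

```python
# Toy model of the heavyweight workflow: rte_gro_ctx_create(),
# rte_gro_reassemble() and rte_gro_timeout_flush(). An illustration
# of the call sequence, not the DPDK implementation.

def ctx_create():
    # The real function allocates per-GRO-type tables on the heap.
    return {"tcp4": []}  # table: list of (packet, arrival_time)

def reassemble(ctx, pkt, pkt_type, now):
    if pkt_type != "TCP/IPv4" or pkt.get("syn"):
        return pkt  # invalid/unsupported packets go back to the app
    ctx["tcp4"].append((pkt, now))  # merge or insert into the table
    return None

def timeout_flush(ctx, timeout, now):
    flushed = [p for p, t in ctx["tcp4"] if now - t >= timeout]
    ctx["tcp4"] = [(p, t) for p, t in ctx["tcp4"] if now - t < timeout]
    return flushed

ctx = ctx_create()
assert reassemble(ctx, {"seq": 1}, "TCP/IPv4", now=0.0) is None
assert reassemble(ctx, {"syn": True}, "TCP/IPv4", now=0.0) is not None
assert timeout_flush(ctx, timeout=1.0, now=2.0) == [{"seq": 1}]
```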
Note that all update/lookup operations on the GRO context are not thread
safe. So if different processes or threads want to access the same
context object simultaneously, some external syncing mechanisms must be
used.
The reassembly algorithm is used for reassembling packets. In the GRO
library, different GRO types can use different algorithms. In this
section, we will introduce the algorithm used by TCP/IPv4 GRO.
The reassembly algorithm determines the efficiency of GRO. There are two
challenges in the algorithm design:

- a high-cost algorithm/implementation would cause packet dropping in a
  high-speed network

- packet reordering makes it hard to merge packets. For example, Linux
  GRO fails to merge packets when it encounters packet reordering.
The above two challenges require that our algorithm be:

- lightweight enough to scale to fast networking speeds

- capable of handling packet reordering
In DPDK GRO, we use a key-based algorithm to address the two challenges.
Key-based Reassembly Algorithm
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
:numref:`figure_gro-key-algorithm` illustrates the procedure of the
key-based algorithm. Packets are classified into "flows" by some header
fields (we call them the "key"). To process an input packet, the
algorithm first searches for a matched "flow" (i.e., the same value of
the key) for the packet, then checks all packets in the "flow" and tries
to find a "neighbor" for it. If a "neighbor" is found, the two packets
are merged together. If no "neighbor" can be found, the packet is stored
into its "flow". If no matched "flow" exists, a new "flow" is inserted
and the packet is stored into the new "flow".
Packets in the same "flow" that can't be merged are always caused by
packet reordering.
The key-based algorithm has two characteristics:

- classifying packets into "flows" to accelerate packet aggregation is
  simple (addresses challenge 1).

- storing out-of-order packets makes it possible to merge them later
  (addresses challenge 2).
.. _figure_gro-key-algorithm:

.. figure:: img/gro-key-algorithm.*

   Key-based Reassembly Algorithm
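The procedure above can be sketched compactly. The Python model below
is only an illustration of the control flow (lookup flow, try to merge
with a neighbor, otherwise store); ``can_merge`` and ``merge`` are
hypothetical stand-ins for the real per-GRO-type neighbor checks.

```python
# Sketch of the key-based procedure: look up the packet's "flow" by
# key, try to find a mergeable "neighbor" inside it, otherwise store
# the packet. `can_merge`/`merge` are stand-ins for the real checks.

def process(tables, key, pkt, can_merge, merge):
    flow = tables.get(key)
    if flow is None:
        tables[key] = [pkt]          # no matched flow: insert a new one
        return
    for i, stored in enumerate(flow):
        if can_merge(stored, pkt):   # found a neighbor: merge them
            flow[i] = merge(stored, pkt)
            return
    flow.append(pkt)                 # out-of-order packet: store it

# Toy packets: (start, length); neighbors are contiguous byte ranges.
can_merge = lambda a, b: a[0] + a[1] == b[0] or b[0] + b[1] == a[0]
merge = lambda a, b: (min(a[0], b[0]), a[1] + b[1])

tables = {}
process(tables, "flow1", (0, 100), can_merge, merge)
process(tables, "flow1", (200, 100), can_merge, merge)  # out of order
process(tables, "flow1", (100, 100), can_merge, merge)  # fills the gap
assert tables["flow1"] == [(0, 200), (200, 100)]
```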
The table structure used by TCP/IPv4 GRO contains two arrays: the flow
array and the item array. The flow array keeps flow information, and
the item array keeps packet information.
The header fields used to define a TCP/IPv4 flow include:

- source and destination: Ethernet and IP addresses, TCP ports

- TCP acknowledgment number

TCP/IPv4 packets whose FIN, SYN, RST, URG, PSH, ECE or CWR bit is set
won't be processed.
The header fields deciding if two packets are neighbors include:

- TCP sequence number

- IPv4 ID. The IPv4 ID fields of the packets, whose DF bit is 0, should
  be increased by 1.
We comply with RFC 6864 to process the IPv4 ID field. Specifically,
we check IPv4 ID fields for the packets whose DF bit is 0 and ignore
IPv4 ID fields for the packets whose DF bit is 1. Additionally, packets
which have different values of the DF bit can't be merged.
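The neighbor rules above can be expressed as a small predicate. The
sketch below is a simplified Python model assuming already-parsed header
fields; it checks only the "b directly follows a" direction, and all
field names are illustrative.

```python
# Sketch of the neighbor check above: packet `b` directly follows `a`
# when its TCP sequence number equals a's sequence number plus a's
# payload length, and, for DF == 0 packets, when its IPv4 ID equals
# a's IPv4 ID plus 1 (per RFC 6864 the ID is ignored when DF == 1).

def is_neighbor(a, b):
    if a["df"] != b["df"]:
        return False                  # different DF bits: never merge
    if b["seq"] != a["seq"] + a["plen"]:
        return False                  # TCP payloads are not contiguous
    if a["df"] == 0 and b["ip_id"] != a["ip_id"] + 1:
        return False                  # DF==0: IPv4 IDs must increment
    return True                       # DF==1: IPv4 ID is ignored

a = {"df": 0, "seq": 1000, "plen": 100, "ip_id": 7}
assert is_neighbor(a, {"df": 0, "seq": 1100, "plen": 100, "ip_id": 8})
assert not is_neighbor(a, {"df": 0, "seq": 1100, "plen": 100, "ip_id": 9})
assert is_neighbor({"df": 1, "seq": 0, "plen": 50, "ip_id": 1},
                   {"df": 1, "seq": 50, "plen": 50, "ip_id": 99})
```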