-technique to reduce per-packet processing overhead. It gains performance
-by reassembling small packets into large ones. To enable more flexibility
-to applications, DPDK implements GRO as a standalone library. Applications
-explicitly use the GRO library to merge small packets into large ones.
-
-The GRO library assumes all input packets have correct checksums. In
-addition, the GRO library doesn't re-calculate checksums for merged
-packets. If input packets are IP fragmented, the GRO library assumes
-they are complete packets (i.e. with L4 headers).
-
-Currently, the GRO library implements TCP/IPv4 packet reassembly.
-
-Reassembly Modes
-----------------
-
-The GRO library provides two reassembly modes: lightweight and
-heavyweight mode. If applications want to merge packets in a simple way,
-they can use the lightweight mode API. If applications want more
-fine-grained controls, they can choose the heavyweight mode API.
-
-Lightweight Mode
-~~~~~~~~~~~~~~~~
-
-The ``rte_gro_reassemble_burst()`` function is used for reassembly in
-lightweight mode. It tries to merge N input packets at a time, where
-N should be less than or equal to ``RTE_GRO_MAX_BURST_ITEM_NUM``.
-
-In each invocation, ``rte_gro_reassemble_burst()`` allocates temporary
-reassembly tables for the desired GRO types. Note that the reassembly
-table is a table structure used to reassemble packets and different GRO
-types (e.g. TCP/IPv4 GRO and TCP/IPv6 GRO) have different reassembly table
-structures. The ``rte_gro_reassemble_burst()`` function uses the reassembly
-tables to merge the N input packets.
-
-For applications, performing GRO in lightweight mode is simple. They
-just need to invoke ``rte_gro_reassemble_burst()``. Applications can get
-GROed packets as soon as ``rte_gro_reassemble_burst()`` returns.
-
-Heavyweight Mode
-~~~~~~~~~~~~~~~~
-
-The ``rte_gro_reassemble()`` function is used for reassembly in heavyweight
-mode. Compared with the lightweight mode, performing GRO in heavyweight mode
-is relatively complicated.
-
-Before performing GRO, applications need to create a GRO context object
-by calling ``rte_gro_ctx_create()``. A GRO context object holds the
-reassembly tables of desired GRO types. Note that all update/lookup
-operations on the context object are not thread safe. So if different
-processes or threads want to access the same context object simultaneously,
-some external syncing mechanisms must be used.
-
-Once the GRO context is created, applications can then use the
-``rte_gro_reassemble()`` function to merge packets. In each invocation,
-``rte_gro_reassemble()`` tries to merge input packets with the packets
-in the reassembly tables. If an input packet is an unsupported GRO type,
-or other errors happen (e.g. SYN bit is set), ``rte_gro_reassemble()``
-returns the packet to applications. Otherwise, the input packet is either
-merged or inserted into a reassembly table.
-
-When applications want to get GRO processed packets, they need to use
-``rte_gro_timeout_flush()`` to flush them from the tables manually.
+technique to reduce per-packet processing overhead. By reassembling
+small packets into larger ones, GRO lets applications process fewer,
+larger packets. To benefit DPDK-based applications, such as Open vSwitch,
+DPDK provides its own GRO implementation as a standalone library.
+Applications explicitly use the GRO library to reassemble packets.
+
+Overview
+--------
+
+The GRO library has multiple GRO types, which are defined by packet
+type. Each GRO type is in charge of processing one kind of packet. For
+example, TCP/IPv4 GRO processes TCP/IPv4 packets.
+
+Each GRO type has a reassembly function, which defines its own algorithm
+and table structure to reassemble packets. Input packets are assigned to
+the corresponding GRO functions according to ``MBUF->packet_type``.
+
+The GRO library doesn't check whether input packets have correct
+checksums and doesn't re-calculate checksums for merged packets. When IP
+fragmentation is possible (i.e., DF==0), the GRO library assumes the
+packets are complete (i.e., MF==0 && frag_off==0). Additionally, it
+requires the IPv4 ID to be increased by one.
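
Because the library assumes these properties rather than verifying them, an application may want to check them itself before handing packets to GRO. The following is a minimal sketch of such a check in plain C, not part of the GRO library: the helper names and the ``IPV4_*`` macros are hypothetical, the fragment field is assumed to be already converted to host byte order, and the ID check assumes that "increased by one" applies between consecutive packets of the same flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout of the 16-bit IPv4 flags/fragment-offset field
 * (host byte order). Macro names are illustrative only. */
#define IPV4_FLAG_DF     0x4000u /* Don't Fragment */
#define IPV4_FLAG_MF     0x2000u /* More Fragments */
#define IPV4_OFFSET_MASK 0x1fffu /* fragment offset, in 8-byte units */

/* A packet is complete when MF == 0 and frag_off == 0.
 * The DF bit does not affect the result. */
bool ipv4_is_complete(uint16_t frag_field)
{
    return (frag_field & (IPV4_FLAG_MF | IPV4_OFFSET_MASK)) == 0;
}

/* Assumption: the IPv4 ID of a packet must be exactly one greater than
 * that of the previous packet in the flow, wrapping around at 16 bits. */
bool ipv4_id_is_next(uint16_t prev_id, uint16_t cur_id)
{
    return (uint16_t)(prev_id + 1u) == cur_id;
}

/* Small self-check showing the intended behaviour of the helpers. */
void gro_precheck_selftest(void)
{
    assert(ipv4_is_complete(IPV4_FLAG_DF));  /* DF set, not fragmented */
    assert(ipv4_is_complete(0x0000u));       /* DF clear, still complete */
    assert(!ipv4_is_complete(IPV4_FLAG_MF)); /* first fragment */
    assert(!ipv4_is_complete(0x0001u));      /* non-first fragment */
    assert(ipv4_id_is_next(0xffffu, 0x0000u)); /* wrap-around */
    assert(!ipv4_id_is_next(5u, 7u));        /* gap in IDs */
}
```

An application could run checks like these on each received TCP/IPv4 packet and bypass GRO for any packet that fails them.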