+Malloc
+------
+
+The EAL provides a malloc API to allocate any-sized memory.
+
+The objective of this API is to provide malloc-like functions to allow
+allocation from hugepage memory and to facilitate application porting.
+The *DPDK API Reference* manual describes the available functions.
+
+Typically, these kinds of allocations should not be done in data plane
+processing because they are slower than pool-based allocation and make
+use of locks within the allocation and free paths.
+However, they can be used in configuration code.
+
+Refer to the rte_malloc() function description in the *DPDK API Reference*
+manual for more information.
+
+Cookies
+~~~~~~~
+
+When CONFIG_RTE_MALLOC_DEBUG is enabled, the allocated memory contains
+overwrite protection fields to help identify buffer overflows.
+
+Alignment and NUMA Constraints
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The rte_malloc() takes an align argument that can be used to request a memory
+area that is aligned on a multiple of this value (which must be a power of two).
+
+On systems with NUMA support, a call to the rte_malloc() function will return
+memory that has been allocated on the NUMA socket of the core which made the call.
+A set of APIs is also provided, to allow memory to be explicitly allocated on a
+NUMA socket directly, or by allocated on the NUMA socket where another core is
+located, in the case where the memory is to be used by a logical core other than
+on the one doing the memory allocation.
+
+Use Cases
+~~~~~~~~~
+
+This API is meant to be used by an application that requires malloc-like
+functions at initialization time.
+
+For allocating/freeing data at runtime, in the fast-path of an application,
+the memory pool library should be used instead.
+
+Internal Implementation
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Data Structures
+^^^^^^^^^^^^^^^
+
+There are two data structure types used internally in the malloc library:
+
+* struct malloc_heap - used to track free space on a per-socket basis
+
+* struct malloc_elem - the basic element of allocation and free-space
+ tracking inside the library.
+
+Structure: malloc_heap
+""""""""""""""""""""""
+
+The malloc_heap structure is used to manage free space on a per-socket basis.
+Internally, there is one heap structure per NUMA node, which allows us to
+allocate memory to a thread based on the NUMA node on which this thread runs.
+While this does not guarantee that the memory will be used on that NUMA node,
+it is no worse than a scheme where the memory is always allocated on a fixed
+or random node.
+
+The key fields of the heap structure and their function are described below
+(see also diagram above):
+
+* lock - the lock field is needed to synchronize access to the heap.
+ Given that the free space in the heap is tracked using a linked list,
+ we need a lock to prevent two threads manipulating the list at the same time.
+
+* free_head - this points to the first element in the list of free nodes for
+ this malloc heap.
+
+* first - this points to the first element in the heap.
+
+* last - this points to the last element in the heap.
+
+.. _figure_malloc_heap:
+
+.. figure:: img/malloc_heap.*
+
+ Example of a malloc heap and malloc elements within the malloc library
+
+
+.. _malloc_elem:
+
+Structure: malloc_elem
+""""""""""""""""""""""
+
+The malloc_elem structure is used as a generic header structure for various
+blocks of memory.
+It is used in two different ways - all shown in the diagram above:
+
+#. As a header on a block of free or allocated memory - normal case
+
+#. As a padding header inside a block of memory
+
+The most important fields in the structure and how they are used are described below.
+
+Malloc heap is a doubly-linked list, where each element keeps track of its
+previous and next elements. Due to the fact that hugepage memory can come and
+go, neighboring malloc elements may not necessarily be adjacent in memory.
+Also, since a malloc element may span multiple pages, its contents may not
+necessarily be IOVA-contiguous either - each malloc element is only guaranteed
+to be virtually contiguous.
+
+.. note::
+
+ If the usage of a particular field in one of the above three usages is not
+ described, the field can be assumed to have an undefined value in that
+ situation, for example, for padding headers only the "state" and "pad"
+ fields have valid values.
+
+* heap - this pointer is a reference back to the heap structure from which
+ this block was allocated.
+ It is used for normal memory blocks when they are being freed, to add the
+ newly-freed block to the heap's free-list.
+
+* prev - this pointer points to previous header element/block in memory. When
+ freeing a block, this pointer is used to reference the previous block to
+ check if that block is also free. If so, and the two blocks are immediately
+ adjacent to each other, then the two free blocks are merged to form a single
+ larger block.
+
+* next - this pointer points to next header element/block in memory. When
+ freeing a block, this pointer is used to reference the next block to check
+ if that block is also free. If so, and the two blocks are immediately
+ adjacent to each other, then the two free blocks are merged to form a single
+ larger block.
+
+* free_list - this is a structure pointing to previous and next elements in
+ this heap's free list.
+ It is only used in normal memory blocks; on ``malloc()`` to find a suitable
+ free block to allocate and on ``free()`` to add the newly freed element to
+ the free-list.
+
+* state - This field can have one of three values: ``FREE``, ``BUSY`` or
+ ``PAD``.
+ The former two are to indicate the allocation state of a normal memory block
+ and the latter is to indicate that the element structure is a dummy structure
+ at the end of the start-of-block padding, i.e. where the start of the data
+ within a block is not at the start of the block itself, due to alignment
+ constraints.
+ In that case, the pad header is used to locate the actual malloc element
+ header for the block.
+
+* pad - this holds the length of the padding present at the start of the block.
+ In the case of a normal block header, it is added to the address of the end
+ of the header to give the address of the start of the data area, i.e. the
+ value passed back to the application on a malloc.
+ Within a dummy header inside the padding, this same value is stored, and is
+ subtracted from the address of the dummy header to yield the address of the
+ actual block header.
+
+* size - the size of the data block, including the header itself.
+
+Memory Allocation
+^^^^^^^^^^^^^^^^^
+
+On EAL initialization, all preallocated memory segments are setup as part of the
+malloc heap. This setup involves placing an :ref:`element header<malloc_elem>`
+with ``FREE`` at the start of each virtually contiguous segment of memory.
+The ``FREE`` element is then added to the ``free_list`` for the malloc heap.
+
+This setup also happens whenever memory is allocated at runtime (if supported),
+in which case newly allocated pages are also added to the heap, merging with any
+adjacent free segments if there are any.
+
+When an application makes a call to a malloc-like function, the malloc function
+will first index the ``lcore_config`` structure for the calling thread, and
+determine the NUMA node of that thread.
+The NUMA node is used to index the array of ``malloc_heap`` structures which is
+passed as a parameter to the ``heap_alloc()`` function, along with the
+requested size, type, alignment and boundary parameters.
+
+The ``heap_alloc()`` function will scan the free_list of the heap, and attempt
+to find a free block suitable for storing data of the requested size, with the
+requested alignment and boundary constraints.
+
+When a suitable free element has been identified, the pointer to be returned
+to the user is calculated.
+The cache-line of memory immediately preceding this pointer is filled with a
+struct malloc_elem header.
+Because of alignment and boundary constraints, there could be free space at
+the start and/or end of the element, resulting in the following behavior:
+
+#. Check for trailing space.
+ If the trailing space is big enough, i.e. > 128 bytes, then the free element
+ is split.
+ If it is not, then we just ignore it (wasted space).
+
+#. Check for space at the start of the element.
+ If the space at the start is small, i.e. <=128 bytes, then a pad header is
+ used, and the remaining space is wasted.
+ If, however, the remaining space is greater, then the free element is split.
+
+The advantage of allocating the memory from the end of the existing element is
+that no adjustment of the free list needs to take place - the existing element
+on the free list just has its size value adjusted, and the next/previous elements
+have their "prev"/"next" pointers redirected to the newly created element.
+
+In case when there is not enough memory in the heap to satisfy allocation
+request, EAL will attempt to allocate more memory from the system (if supported)
+and, following successful allocation, will retry reserving the memory again. In
+a multiprocessing scenario, all primary and secondary processes will synchronize
+their memory maps to ensure that any valid pointer to DPDK memory is guaranteed
+to be valid at all times in all currently running processes.
+
+Failure to synchronize memory maps in one of the processes will cause allocation
+to fail, even though some of the processes may have allocated the memory
+successfully. The memory is not added to the malloc heap unless primary process
+has ensured that all other processes have mapped this memory successfully.
+
+Any successful allocation event will trigger a callback, for which user
+applications and other DPDK subsystems can register. Additionally, validation
+callbacks will be triggered before allocation if the newly allocated memory will
+exceed threshold set by the user, giving a chance to allow or deny allocation.
+
+.. note::
+
+ Any allocation of new pages has to go through primary process. If the
+ primary process is not active, no memory will be allocated even if it was
+ theoretically possible to do so. This is because primary's process map acts
+ as an authority on what should or should not be mapped, while each secondary
+ process has its own, local memory map. Secondary processes do not update the
+ shared memory map, they only copy its contents to their local memory map.
+
+Freeing Memory
+^^^^^^^^^^^^^^
+
+To free an area of memory, the pointer to the start of the data area is passed
+to the free function.
+The size of the ``malloc_elem`` structure is subtracted from this pointer to get
+the element header for the block.
+If this header is of type ``PAD`` then the pad length is further subtracted from
+the pointer to get the proper element header for the entire block.
+
+From this element header, we get pointers to the heap from which the block was
+allocated and to where it must be freed, as well as the pointer to the previous
+and next elements. These next and previous elements are then checked to see if
+they are also ``FREE`` and are immediately adjacent to the current one, and if
+so, they are merged with the current element. This means that we can never have
+two ``FREE`` memory blocks adjacent to one another, as they are always merged
+into a single block.
+
+If deallocating pages at runtime is supported, and the free element encloses
+one or more pages, those pages can be deallocated and be removed from the heap.
+If DPDK was started with command-line parameters for preallocating memory
+(``-m`` or ``--socket-mem``), then those pages that were allocated at startup
+will not be deallocated.
+
+Any successful deallocation event will trigger a callback, for which user
+applications and other DPDK subsystems can register.