+* **Added the rte_cldemote API.**
+
+ Added a hardware hint, CLDEMOTE, which acts like a prefetch in reverse.
+ CLDEMOTE hints that a cache line should be moved from the cache closest to
+ the core to a more distant cache level, where sharing is expected to be more
+ efficient. Demoting the line in this way helps to accelerate core-to-core
+ communication.
+ This API is specific to x86 and is implemented as a stub for other
+ architectures.
+
+* **Added support for limiting maximum SIMD bitwidth.**
+
+ Added a new EAL config setting ``max_simd_bitwidth`` to limit the vector
+ path selection at runtime. This value can be set by applications using the
+ ``rte_vect_set_max_simd_bitwidth`` function, or by the user with the EAL flag
+ ``--force-max-simd-bitwidth``.
+
+* **Added zero copy APIs for rte_ring.**
+
+ For rings with producer/consumer in ``RTE_RING_SYNC_ST`` or
+ ``RTE_RING_SYNC_MT_HTS`` mode, these APIs split the enqueue/dequeue operation
+ into three phases (enqueue/dequeue start, copy data to/from ring,
+ enqueue/dequeue finish).
+ Along with the advantages of the peek APIs, these provide the ability to
+ copy the data to the ring memory directly without the need for temporary
+ storage.
+