..  BSD LICENSE
    Copyright(c) 2017 Intel Corporation. All rights reserved.

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

    * Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in
    the documentation and/or other materials provided with the
    distribution.
    * Neither the name of Intel Corporation nor the names of its
    contributors may be used to endorse or promote products derived
    from this software without specific prior written permission.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Event Device Library
====================

The DPDK Event device library is an abstraction that provides the application
with features to schedule events. This is achieved using a PMD architecture
similar to that of the ethdev or cryptodev APIs, which may already be familiar
to the reader.

The eventdev framework introduces the event driven programming model. In a
polling model, lcores poll ethdev ports and associated Rx queues directly
to look for a packet. By contrast, in an event driven model, lcores call the
scheduler, which selects packets for them based on programmer-specified
criteria. The Eventdev library adds support for an event driven programming
model, which offers applications automatic multicore scaling, dynamic load
balancing, pipelining, packet ingress order maintenance and synchronization
services to simplify application packet processing.

By introducing an event driven programming model, DPDK can support both polling
and event driven programming models for packet processing, and applications are
free to choose whatever model (or combination of the two) best suits their
needs.

Step-by-step instructions for using the eventdev API are available in the `API
Walk-through`_ section later in this document.

Event struct
------------

The eventdev API represents each event with a generic struct, which contains a
payload and the metadata required for scheduling by an eventdev. The
``rte_event`` struct is a 16 byte C structure, defined in
``lib/librte_eventdev/rte_eventdev.h``.

Event Metadata
~~~~~~~~~~~~~~

The rte_event structure contains the following metadata fields, which the
application fills in to have the event scheduled as required:

* ``flow_id`` - The targeted flow identifier for the enq/deq operation.
* ``event_type`` - The source of this event, e.g. RTE_EVENT_TYPE_ETHDEV or
  RTE_EVENT_TYPE_CPU.
* ``sub_event_type`` - Distinguishes events inside the application that have
  the same event_type (see above).
* ``op`` - This field takes one of the RTE_EVENT_OP_* values, and tells the
  eventdev about the status of the event - valid values are NEW, FORWARD or
  RELEASE.
* ``sched_type`` - Represents the type of scheduling that should be performed
  on this event; valid values are RTE_SCHED_TYPE_ORDERED,
  RTE_SCHED_TYPE_ATOMIC and RTE_SCHED_TYPE_PARALLEL.
* ``queue_id`` - The identifier for the event queue that the event is sent to.
* ``priority`` - The priority of this event, see the RTE_EVENT_DEV_PRIORITY_*
  values.

Event Payload
~~~~~~~~~~~~~

The rte_event struct contains a union for the payload, allowing flexibility in
what the actual event being scheduled is. The payload is a union of the
following:

* ``uint64_t u64``
* ``void *event_ptr``
* ``struct rte_mbuf *mbuf``

These three items in a union occupy the same 64 bits at the end of the rte_event
structure. The application can utilize the 64 bits directly by accessing the
u64 variable, while the event_ptr and mbuf are provided as convenience
variables. For example, the mbuf pointer in the union can be used to schedule a
DPDK packet.
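
The layout described above can be modelled in a few lines of plain C. The
sketch below is not the definition from ``rte_eventdev.h``: ``demo_event`` and
``demo_event_roundtrip()`` are hypothetical names, the metadata fields are
collapsed into a single 64-bit word, and ``mbuf`` points to an opaque
``struct demo_mbuf`` so that the example compiles without DPDK. It only
illustrates that the three payload members share the same 8 bytes and that the
whole event fits in 16 bytes.

.. code-block:: c

        #include <assert.h>
        #include <stdint.h>

        struct demo_mbuf;       /* opaque stand-in for struct rte_mbuf (assumption) */

        /* Hypothetical mirror of the 16 byte event: metadata word plus payload */
        struct demo_event {
                uint64_t metadata;      /* flow_id, op, queue_id, ... would be packed here */
                union {
                        uint64_t u64;
                        void *event_ptr;
                        struct demo_mbuf *mbuf;
                };
        };

        /* Compile-time check of the size claimed in the text */
        static_assert(sizeof(struct demo_event) == 16, "event model must be 16 bytes");

        /* Store a pointer payload and read it back through the same union member */
        static void *
        demo_event_roundtrip(struct demo_event *ev, void *payload)
        {
                ev->u64 = 0;                    /* clear all 64 payload bits first */
                ev->event_ptr = payload;        /* overwrites the same storage as u64 */
                return ev->event_ptr;
        }

Because the members overlap, writing ``event_ptr`` replaces whatever was stored
in ``u64``; an application would pick one member per event and use it
consistently.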

Queues
~~~~~~

An event queue is a queue containing events that are scheduled by the event
device. An event queue contains events of different flows associated with
scheduling types, such as atomic, ordered, or parallel.

Queue All Types Capable
^^^^^^^^^^^^^^^^^^^^^^^

If the RTE_EVENT_DEV_CAP_QUEUE_ALL_TYPES capability bit is set in the event
device, then events of any type may be sent to any queue. Otherwise, each queue
only supports events of the type that it was created with.

Queue All Types Incapable
^^^^^^^^^^^^^^^^^^^^^^^^^

In this case, each stage has a specified scheduling type. The application
configures each queue for a specific type of scheduling, and just enqueues all
events to the eventdev. An example of a PMD of this type is the eventdev
software PMD.

The Eventdev API supports the following scheduling types per queue:

*   Atomic
*   Ordered
*   Parallel

Atomic, Ordered and Parallel are load-balanced scheduling types: the output
of the queue can be spread out over multiple CPU cores.

Atomic scheduling on a queue ensures that a single flow is not present on two
different CPU cores at the same time. Ordered allows sending all flows to any
core, but the scheduler must ensure that on egress the packets are returned to
ingress order on downstream queue enqueue. Parallel allows sending all flows
to all CPU cores, without any re-ordering guarantees.
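
The atomic guarantee can be illustrated with a small model. The sketch below is
plain C and does not reflect how any eventdev PMD is implemented:
``demo_atomic_state`` and the ``demo_atomic_*()`` functions are hypothetical.
It pins a flow to the core that received its first in-flight event, and only
lets the flow move to another core once all of its events have been released.

.. code-block:: c

        #include <assert.h>
        #include <stdint.h>

        #define DEMO_NB_FLOWS 16
        #define DEMO_NO_CORE  (-1)

        /* Hypothetical per-flow bookkeeping for an atomic queue */
        struct demo_atomic_state {
                int owner[DEMO_NB_FLOWS];               /* core currently holding the flow */
                unsigned int inflight[DEMO_NB_FLOWS];   /* events not yet released */
        };

        static void
        demo_atomic_init(struct demo_atomic_state *s)
        {
                for (int i = 0; i < DEMO_NB_FLOWS; i++) {
                        s->owner[i] = DEMO_NO_CORE;
                        s->inflight[i] = 0;
                }
        }

        /* Return the core that must receive the next event of this flow */
        static int
        demo_atomic_schedule(struct demo_atomic_state *s, uint32_t flow_id, int idle_core)
        {
                uint32_t f = flow_id % DEMO_NB_FLOWS;

                /* Only a flow with nothing in flight may be given to a new core */
                if (s->inflight[f] == 0)
                        s->owner[f] = idle_core;
                s->inflight[f]++;
                return s->owner[f];
        }

        /* Mark one event of the flow as finished by the application */
        static void
        demo_atomic_release(struct demo_atomic_state *s, uint32_t flow_id)
        {
                uint32_t f = flow_id % DEMO_NB_FLOWS;

                assert(s->inflight[f] > 0);
                if (--s->inflight[f] == 0)
                        s->owner[f] = DEMO_NO_CORE;
        }

In this model, scheduling flow 5 twice, first with core 1 idle and then with
core 2 idle, sends both events to core 1. Once both events are released, the
flow may be scheduled to core 2.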

Single Link Flag
^^^^^^^^^^^^^^^^

There is a SINGLE_LINK flag which allows an application to indicate that only
one port will be connected to a queue. Queues configured with the single-link
flag follow a FIFO-like structure that maintains ordering, but they can only
be linked to a single port (see below for port and queue linking details).


Ports
~~~~~

Ports are the points of contact between worker cores and the eventdev. The
general use-case will see one CPU core using one port to enqueue and dequeue
events from an eventdev. Ports are linked to queues in order to retrieve events
from those queues (more details in `Linking Queues and Ports`_ below).


API Walk-through
----------------

This section will introduce the reader to the eventdev API, showing how to
create and configure an eventdev and use it for a two-stage atomic pipeline
with a single core for TX. The diagram below shows the final state of the
application after this walk-through:

.. _figure_eventdev-usage1:

.. figure:: img/eventdev_usage.*

   Sample eventdev usage, with RX, two atomic stages and a single-link to TX.


A high level overview of the setup steps is:

* rte_event_dev_configure()
* rte_event_queue_setup()
* rte_event_port_setup()
* rte_event_port_link()
* rte_event_dev_start()


Init and Config
~~~~~~~~~~~~~~~

The eventdev library uses vdev options to add devices to the DPDK application.
The ``--vdev`` EAL option allows adding eventdev instances to your DPDK
application, using the name of the eventdev PMD as an argument.

For example, to create an instance of the software eventdev scheduler, the
following vdev arguments should be provided to the application EAL command line:

.. code-block:: console

   ./dpdk_application --vdev="event_sw0"

In the following code, we configure an eventdev instance with 3 queues
and 6 ports as follows. The 3 queues consist of 2 Atomic and 1 Single-Link,
while the 6 ports consist of 4 workers, 1 RX and 1 TX.

.. code-block:: c

        const struct rte_event_dev_config config = {
                .nb_event_queues = 3,
                .nb_event_ports = 6,
                .nb_events_limit  = 4096,
                .nb_event_queue_flows = 1024,
                .nb_event_port_dequeue_depth = 128,
                .nb_event_port_enqueue_depth = 128,
        };
        int err = rte_event_dev_configure(dev_id, &config);

The remainder of this walk-through assumes that dev_id is 0.

Setting up Queues
~~~~~~~~~~~~~~~~~

Once the eventdev itself is configured, the next step is to configure queues.
This is done by setting the appropriate values in a queue_conf structure, and
calling the setup function. Repeat this step for each queue, starting from
0 and ending at ``nb_event_queues - 1`` from the event_dev config above.

.. code-block:: c

        struct rte_event_queue_conf atomic_conf = {
                .event_queue_cfg = RTE_EVENT_QUEUE_CFG_ATOMIC_ONLY,
                .priority = RTE_EVENT_DEV_PRIORITY_NORMAL,
                .nb_atomic_flows = 1024,
                .nb_atomic_order_sequences = 1024,
        };
        int dev_id = 0;
        int queue_id = 0;
        int err = rte_event_queue_setup(dev_id, queue_id, &atomic_conf);

The remainder of this walk-through assumes that the queues are configured as
follows:

* id 0, atomic queue #1
* id 1, atomic queue #2
* id 2, single-link queue

Setting up Ports
~~~~~~~~~~~~~~~~

Once queues are set up successfully, create the ports as required. Each port
should be set up with the port_conf that matches its role: ``worker_conf`` for
worker cores, and ``rx_conf`` and ``tx_conf`` for the RX and TX cores:

.. code-block:: c

        struct rte_event_port_conf rx_conf = {
                .dequeue_depth = 128,
                .enqueue_depth = 128,
                .new_event_threshold = 1024,
        };
        struct rte_event_port_conf worker_conf = {
                .dequeue_depth = 16,
                .enqueue_depth = 64,
                .new_event_threshold = 4096,
        };
        struct rte_event_port_conf tx_conf = {
                .dequeue_depth = 128,
                .enqueue_depth = 128,
                .new_event_threshold = 4096,
        };
        int dev_id = 0;
        int port_id = 0;
        /* Port 0 is the RX port, so it uses rx_conf. Repeat this call with
         * worker_conf for each worker port and tx_conf for the TX port. */
        int err = rte_event_port_setup(dev_id, port_id, &rx_conf);

It is now assumed that:

* port 0: RX core
* ports 1,2,3,4: Workers
* port 5: TX core
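
The port configurations above have to fit within the device-level limits passed
to ``rte_event_dev_configure()``. As a rough illustration, the plain C sketch
below checks the walk-through's numbers against each other. The ``demo_``
structs and ``demo_port_conf_fits()`` are hypothetical, and the rules encoded
here (port depths not exceeding the device depths, ``new_event_threshold`` not
exceeding ``nb_events_limit``) are assumptions about what an eventdev typically
enforces; the API documentation of ``rte_event_port_setup()`` remains the
authoritative source for the actual constraints.

.. code-block:: c

        #include <assert.h>
        #include <stdbool.h>
        #include <stdint.h>

        /* Hypothetical subset of the device-level configuration */
        struct demo_dev_limits {
                int32_t nb_events_limit;
                uint32_t max_dequeue_depth;
                uint32_t max_enqueue_depth;
        };

        /* Hypothetical subset of a port configuration */
        struct demo_port_conf {
                int32_t new_event_threshold;
                uint32_t dequeue_depth;
                uint32_t enqueue_depth;
        };

        /* Return true if the port configuration stays within the device limits */
        static bool
        demo_port_conf_fits(const struct demo_dev_limits *dev,
                        const struct demo_port_conf *port)
        {
                assert(dev && port);

                return port->new_event_threshold <= dev->nb_events_limit &&
                        port->dequeue_depth <= dev->max_dequeue_depth &&
                        port->enqueue_depth <= dev->max_enqueue_depth;
        }

With the device limits from the configuration step (4096 events, depths of
128), all three port configurations of this walk-through pass the check, while
a port asking for a dequeue depth of 256 would not.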

Linking Queues and Ports
~~~~~~~~~~~~~~~~~~~~~~~~

The final step is to "wire up" the ports to the queues. After this, the
eventdev is capable of scheduling events, and when cores request work to do,
the correct events are provided to that core. Note that the RX core takes its
input from elsewhere, e.g. a NIC, so it is not linked to any eventdev queues.

Linking all workers to atomic queues, and the TX core to the single-link queue
can be achieved like this:

.. code-block:: c

        uint8_t port_id = 0;
        uint8_t atomic_qs[] = {0, 1};
        uint8_t single_link_q = 2;
        uint8_t tx_port_id = 5;
        uint8_t priority = RTE_EVENT_DEV_PRIORITY_NORMAL;

        /* Link each of the 4 worker ports (1 to 4) to both atomic queues */
        for (int i = 0; i < 4; i++) {
                int worker_port = i + 1;
                int links_made = rte_event_port_link(dev_id, worker_port, atomic_qs, NULL, 2);
        }
        /* Link the TX port to the single-link queue */
        int links_made = rte_event_port_link(dev_id, tx_port_id, &single_link_q, &priority, 1);

Starting the EventDev
~~~~~~~~~~~~~~~~~~~~~

A single function call tells the eventdev instance to start processing
events. Note that all queues must be linked to a port for the instance to
start: if any queue is not linked, enqueuing to that queue will cause the
application to backpressure and eventually stall due to no space in the
eventdev.

.. code-block:: c

        int err = rte_event_dev_start(dev_id);
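
Whether every queue has at least one port linked to it can be checked by the
application before it calls the start function. The following plain C sketch
models the links from this walk-through as one bitmask of queue ids per port.
``demo_all_queues_linked()`` and the bitmask representation are hypothetical
and are not part of the eventdev API.

.. code-block:: c

        #include <assert.h>
        #include <stdbool.h>
        #include <stdint.h>

        #define DEMO_NB_QUEUES 3
        #define DEMO_NB_PORTS  6

        /* links[p] has bit q set if port p is linked to queue q (hypothetical model) */
        static bool
        demo_all_queues_linked(const uint8_t links[DEMO_NB_PORTS])
        {
                const uint8_t mask = (1u << DEMO_NB_QUEUES) - 1;
                uint8_t linked = 0;

                assert(links);

                /* Collect the queues reachable from any port */
                for (int p = 0; p < DEMO_NB_PORTS; p++)
                        linked |= links[p];

                /* Every queue id below DEMO_NB_QUEUES must appear in some mask */
                return (linked & mask) == mask;
        }

For the walk-through, the RX port has mask 0, worker ports 1 to 4 have mask 0x3
(queues 0 and 1) and the TX port has mask 0x4 (queue 2), so the check passes.
Dropping the TX link leaves queue 2 without a consumer and the check fails.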

Ingress of New Events
~~~~~~~~~~~~~~~~~~~~~

Now that the eventdev is set up, and ready to receive events, the RX core must
enqueue some events into the system for it to schedule. The events to be
scheduled are ordinary DPDK packets, received from rte_eth_rx_burst() as normal.
The following code shows how those packets can be enqueued into the eventdev:

.. code-block:: c

        const uint16_t nb_rx = rte_eth_rx_burst(eth_port, 0, mbufs, BATCH_SIZE);

        for (i = 0; i < nb_rx; i++) {
                ev[i].flow_id = mbufs[i]->hash.rss;
                ev[i].op = RTE_EVENT_OP_NEW;
                ev[i].sched_type = RTE_SCHED_TYPE_ATOMIC;
                ev[i].queue_id = 0;
                ev[i].event_type = RTE_EVENT_TYPE_ETHDEV;
                ev[i].sub_event_type = 0;
                ev[i].priority = RTE_EVENT_DEV_PRIORITY_NORMAL;
                ev[i].mbuf = mbufs[i];
        }

        const int nb_tx = rte_event_enqueue_burst(dev_id, port_id, ev, nb_rx);
        if (nb_tx != nb_rx) {
                /* Free the packets that the eventdev did not accept */
                for (i = nb_tx; i < nb_rx; i++)
                        rte_pktmbuf_free(mbufs[i]);
        }

Forwarding of Events
~~~~~~~~~~~~~~~~~~~~

Now that the RX core has injected events, there is work to be done by the
workers. Note that each worker will dequeue as many events as it can in a burst,
process each one individually, and then burst the packets back into the
eventdev.

The worker can look up the event's source from ``event.queue_id``, which should
indicate to the worker what workload needs to be performed on the event.
Once done, the worker can update the ``event.queue_id`` to a new value, to send
the event to the next stage in the pipeline.

.. code-block:: c

        int timeout = 0;
        struct rte_event events[BATCH_SIZE];
        uint16_t nb_rx = rte_event_dequeue_burst(dev_id, worker_port_id, events, BATCH_SIZE, timeout);

        for (i = 0; i < nb_rx; i++) {
                /* process mbuf using events[i].queue_id as pipeline stage */
                struct rte_mbuf *mbuf = events[i].mbuf;
                /* Send event to next stage in pipeline */
                events[i].queue_id++;
                events[i].op = RTE_EVENT_OP_FORWARD;
        }

        /* Enqueue on the same port that the events were dequeued from */
        uint16_t nb_tx = rte_event_enqueue_burst(dev_id, worker_port_id, events, nb_rx);
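
To make the stage hand-off concrete, the sketch below models it without DPDK.
``demo_fwd_event``, ``demo_worker_step()`` and the per-stage counters are
hypothetical; the only thing taken from the walk-through is the queue
numbering, with queues 0 and 1 as the two atomic stages and queue 2 as the
single-link queue feeding TX.

.. code-block:: c

        #include <assert.h>
        #include <stdint.h>

        #define DEMO_TX_QUEUE 2         /* single-link queue id from the walk-through */

        /* Hypothetical stand-in carrying only the field this example needs */
        struct demo_fwd_event {
                uint8_t queue_id;
        };

        /* Count of events handled per atomic stage, for illustration only */
        static unsigned int demo_stage_hits[DEMO_TX_QUEUE];

        /* Run the work for the event's current stage, then advance it by one queue */
        static void
        demo_worker_step(struct demo_fwd_event *ev)
        {
                /* Workers are linked to the atomic queues only, never to the TX queue */
                assert(ev->queue_id < DEMO_TX_QUEUE);

                demo_stage_hits[ev->queue_id]++;        /* stage-specific work goes here */
                ev->queue_id++;                         /* forward to the next stage */
        }

An event that enters at queue 0 passes through ``demo_worker_step()`` twice,
once per atomic stage, and then sits on queue 2 where only the TX port picks it
up.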


Egress of Events
~~~~~~~~~~~~~~~~

Finally, when the packet is ready for egress or needs to be dropped, we need
to inform the eventdev that the packet is no longer being handled by the
application. This can be done by calling dequeue() or dequeue_burst(), which
indicates that the previous burst of packets is no longer in use by the
application.

An event driven worker thread has the following typical workflow on the
fastpath:

.. code-block:: c

        while (1) {
                rte_event_dequeue_burst(...);
                /* event processing */
                rte_event_enqueue_burst(...);
        }


Summary
-------

The eventdev library allows an application to easily schedule events as it
requires, either using a run-to-completion or pipeline processing model. The
queues and ports abstract the logical functionality of an eventdev, providing
the application with a generic method to schedule events. With the flexible
PMD infrastructure, applications benefit from improvements in existing
eventdevs and from the addition of new ones without modification.