doc/guides/prog_guide/switch_representation.rst

   1 ..  SPDX-License-Identifier: BSD-3-Clause
   2     Copyright(c) 2018 6WIND S.A.
   3
   4 .. _switch_representation:
   5
   6 Switch Representation within DPDK Applications
   7 ==============================================
   8
   9 .. contents:: :local:
  10
  11 Introduction
  12 ------------
  13
  14 Network adapters with multiple physical ports and/or SR-IOV capabilities
  15 usually support the offload of traffic steering rules between their virtual
  16 functions (VFs), physical functions (PFs) and ports.
  17
  18 As with standard Ethernet switches, this involves a combination of
  19 automatic MAC learning and manual configuration. For most purposes it is
  20 managed by the host system and fully transparent to users and applications.
  21
  22 On the other hand, applications typically found on hypervisors that process
  23 layer 2 (L2) traffic (such as OVS) need to steer traffic themselves
  24 according to their own criteria.
  25
  26 Without a standard software interface to manage traffic steering rules
  27 between VFs, PFs and the various physical ports of a given device,
  28 applications cannot take advantage of these offloads; software processing is
  29 mandatory even for traffic which ends up re-injected into the device it
  30 originates from.
  31
  32 This document describes how such steering rules can be configured through
  33 the DPDK flow API (**rte_flow**), with emphasis on the SR-IOV use case
  34 (PF/VF steering) using a single physical port for clarity. The same
  35 logic applies to any number of ports without necessarily involving SR-IOV.
  36
  37 Port Representors
  38 -----------------
  39
  40 In many cases, traffic steering rules cannot be determined in advance;
  41 applications usually have to process some traffic in software before
  42 offloading specific flows to hardware.
  43
  44 Applications therefore need the ability to receive and inject traffic to
  45 various device endpoints (other VFs, PFs or physical ports) before
  46 connecting them together. Device drivers must provide a means to hook the
  47 "other end" of these endpoints and to refer to them when configuring flow
  48 rules.
  49
  50 This role is left to so-called "port representors" (also known as "VF
  51 representors" in the specific context of VFs), which are to DPDK what the
  52 Ethernet switch device driver model (**switchdev**) [1]_ is to Linux, and
  53 which can be thought of as a software "patch panel" front-end for applications.
  54
  55 - DPDK port representors are implemented as additional virtual Ethernet
  56   device (**ethdev**) instances, spawned on an as-needed basis through
  57   configuration parameters passed to the driver of the underlying
  58   device using devargs.
  59
  60 ::
  61
  62    -a pci:dbdf,representor=0
  63    -a pci:dbdf,representor=[0-3]
  64    -a pci:dbdf,representor=[0,5-11]
  65
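The list syntax above can be expanded into individual VF indices with a
few lines of C. The following is only a sketch: ``parse_representor_list()``
is a hypothetical helper, not a DPDK function, and it handles just the three
forms shown above (a single index, a bracketed range, and a bracketed
comma-separated mix), with indices limited to 0-63 so that the result fits
in a 64-bit mask.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of DPDK): expand the value of a
 * "representor=" devarg such as "0", "[0-3]" or "[0,5-11]" into a
 * 64-bit mask with one bit set per VF index.
 * Returns 0 on success, -1 on malformed input or an index above 63.
 * The content of *mask is unspecified on failure.
 */
static int
parse_representor_list(const char *s, uint64_t *mask)
{
	int bracketed = (*s == '[');
	char *end;

	*mask = 0;
	if (bracketed)
		s++;
	for (;;) {
		unsigned long lo, hi;

		/* every element starts with a decimal index */
		if (*s < '0' || *s > '9')
			return -1;
		lo = strtoul(s, &end, 10);
		hi = lo;
		s = end;
		/* optional "-N" turns the element into an inclusive range */
		if (*s == '-') {
			s++;
			if (*s < '0' || *s > '9')
				return -1;
			hi = strtoul(s, &end, 10);
			s = end;
		}
		if (lo > hi || hi > 63)
			return -1;
		for (unsigned long i = lo; i <= hi; i++)
			*mask |= UINT64_C(1) << i;
		/* commas are only accepted inside brackets */
		if (bracketed && *s == ',') {
			s++;
			continue;
		}
		break;
	}
	if (bracketed) {
		if (*s != ']')
			return -1;
		s++;
	}
	return *s == '\0' ? 0 : -1;
}
```

With this helper, ``[0,5-11]`` yields a mask with bit 0 and bits 5 to 11
set, one bit per requested VF representor.
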
  66 - As virtual devices, they may be more limited than their physical
  67   counterparts, for instance by exposing only a subset of device
  68   configuration callbacks and/or by not necessarily having Rx/Tx capability.
  69
  70 - Among other things, they can be used to assign MAC addresses to the
  71   resource they represent.
  72
  73 - Applications can tell port representors apart from other physical or virtual
  74   ports by checking the dev_flags field within their device information
  75   structure for the RTE_ETH_DEV_REPRESENTOR bit-field.
  76
  77 .. code-block:: c
  78
  79   struct rte_eth_dev_info {
  80       ...
  81       uint32_t dev_flags; /**< Device flags */
  82       ...
  83   };
  84
  85 - The device or group relationship of ports can be discovered using the
  86   switch ``domain_id`` field within the device's switch information structure. By
  87   default the switch ``domain_id`` of a port will be
  88   ``RTE_ETH_DEV_SWITCH_DOMAIN_ID_INVALID`` to indicate that the port doesn't
  89   support the concept of a switch domain, but ports which do support the concept
  90   will be allocated a unique switch ``domain_id``. Ports within the same switch
  91   domain will share the same ``domain_id``. The switch ``port_id`` is used to
  92   specify the ``port_id`` in terms of the switch, so in the case of SR-IOV devices
  93   the switch ``port_id`` would represent the virtual function identifier of the
  94   port.
  95
  96 .. code-block:: c
  97
  98    /**
  99     * Ethernet device associated switch information
 100     */
 101    struct rte_eth_switch_info {
 102        const char *name; /**< switch name */
 103        uint16_t domain_id; /**< switch domain id */
 104        uint16_t port_id; /**< switch port id */
 105    };
 106
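The domain comparison described above can be sketched as follows. To stay
self-contained, the example uses a local ``struct switch_info`` that only
mirrors the fields shown above, and a local ``SWITCH_DOMAIN_ID_INVALID``
constant whose value is an assumption; a real application would read both
from the device information reported by the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the "invalid domain" sentinel (assumed value). */
#define SWITCH_DOMAIN_ID_INVALID UINT16_MAX

/* Local stand-in mirroring the switch information fields shown above. */
struct switch_info {
	const char *name;   /* switch name */
	uint16_t domain_id; /* switch domain id */
	uint16_t port_id;   /* switch port id */
};

/*
 * True when both ports report the same, valid switch domain. Two ports
 * that both report the invalid sentinel are NOT considered related.
 */
static bool
same_switch_domain(const struct switch_info *a, const struct switch_info *b)
{
	return a->domain_id != SWITCH_DOMAIN_ID_INVALID &&
	       a->domain_id == b->domain_id;
}
```

An application could use such a check to decide whether a flow rule created
on one port may reasonably target another port.
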
 107
 108 .. [1] `Ethernet switch device driver model (switchdev)
 109        <https://www.kernel.org/doc/Documentation/networking/switchdev.txt>`_
 110
 111 Basic SR-IOV
 112 ------------
 113
 114 "Basic" in the sense that it is not managed by applications, which
 115 nonetheless expect traffic to flow between the various endpoints and the
 116 outside as if everything was linked by an Ethernet hub.
 117
 118 The following diagram pictures a setup involving a device with one PF, two
 119 VFs and one shared physical port:
 120
 121 ::
 122
 123        .-------------.                 .-------------. .-------------.
 124        | hypervisor  |                 |    VM 1     | |    VM 2     |
 125        | application |                 | application | | application |
 126        `--+----------'                 `----------+--' `--+----------'
 127           |                                       |       |
 128     .-----+-----.                                 |       |
 129     | port_id 3 |                                 |       |
 130     `-----+-----'                                 |       |
 131           |                                       |       |
 132         .-+--.                                .---+--. .--+---.
 133         | PF |                                | VF 1 | | VF 2 |
 134         `-+--'                                `---+--' `--+---'
 135           |                                       |       |
 136           `---------.     .-----------------------'       |
 137                     |     |     .-------------------------'
 138                     |     |     |
 139                  .--+-----+-----+--.
 140                  | interconnection |
 141                  `--------+--------'
 142                           |
 143                      .----+-----.
 144                      | physical |
 145                      |  port 0  |
 146                      `----------'
 147
 148 - A DPDK application running on the hypervisor owns the PF device, which is
 149   arbitrarily assigned port index 3.
 150
 151 - Both VFs are assigned to VMs and used by unknown applications; they may be
 152   DPDK-based or anything else.
 153
 154 - Interconnection is not necessarily done through a true Ethernet switch and
 155   may not even exist as a separate entity. The role of this block is to show
 156   that something brings PF, VFs and physical ports together and enables
 157   communication between them, with a number of built-in restrictions.
 158
 159 Subsequent sections in this document describe means for DPDK applications
 160 running on the hypervisor to freely assign specific flows between PF, VFs
 161 and physical ports based on traffic properties, by managing this
 162 interconnection.
 163
 164 Controlled SR-IOV
 165 -----------------
 166
 167 Initialization
 168 ~~~~~~~~~~~~~~
 169
 170 When a DPDK application gets assigned a PF device and is deliberately not
 171 started in `basic SR-IOV`_ mode, any traffic coming from physical ports is
 172 received by PF according to default rules, while VFs remain isolated.
 173
 174 ::
 175
 176        .-------------.                 .-------------. .-------------.
 177        | hypervisor  |                 |    VM 1     | |    VM 2     |
 178        | application |                 | application | | application |
 179        `--+----------'                 `----------+--' `--+----------'
 180           |                                       |       |
 181     .-----+-----.                                 |       |
 182     | port_id 3 |                                 |       |
 183     `-----+-----'                                 |       |
 184           |                                       |       |
 185         .-+--.                                .---+--. .--+---.
 186         | PF |                                | VF 1 | | VF 2 |
 187         `-+--'                                `------' `------'
 188           |
 189           `-----.
 190                 |
 191              .--+----------------------.
 192              | managed interconnection |
 193              `------------+------------'
 194                           |
 195                      .----+-----.
 196                      | physical |
 197                      |  port 0  |
 198                      `----------'
 199
 200 In this mode, interconnection must be configured by the application to
 201 enable VF communication, for instance by explicitly directing traffic with a
 202 given destination MAC address to VF 1 and allowing traffic with the same
 203 address as its source MAC to come out of it.
 204
 205 For this to work, hypervisor applications need a way to refer to either VF 1
 206 or VF 2 in addition to the PF. This is addressed by `VF representors`_.
 207
 208 VF Representors
 209 ~~~~~~~~~~~~~~~
 210
 211 VF representors are virtual but standard DPDK network devices (albeit with
 212 limited capabilities) created by PMDs when managing a PF device.
 213
 214 Since they represent VF instances used by other applications, configuring
 215 them (e.g. assigning a MAC address or setting up promiscuous mode) affects
 216 interconnection accordingly. If supported, they may also be used as two-way
 217 communication ports with VFs (assuming **switchdev** topology).
 218
 219
 220 ::
 221
 222        .-------------.                 .-------------. .-------------.
 223        | hypervisor  |                 |    VM 1     | |    VM 2     |
 224        | application |                 | application | | application |
 225        `--+---+---+--'                 `----------+--' `--+----------'
 226           |   |   |                               |       |
 227           |   |   `-------------------.           |       |
 228           |   `---------.             |           |       |
 229           |             |             |           |       |
 230     .-----+-----. .-----+-----. .-----+-----.     |       |
 231     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 232     `-----+-----' `-----+-----' `-----+-----'     |       |
 233           |             |             |           |       |
 234         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 235         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 236         `-+--'    `-----+-----' `-----+-----' `---+--' `--+---'
 237           |             |             |           |       |
 238           |             |   .---------'           |       |
 239           `-----.       |   |   .-----------------'       |
 240                 |       |   |   |   .---------------------'
 241                 |       |   |   |   |
 242              .--+-------+---+---+---+--.
 243              | managed interconnection |
 244              `------------+------------'
 245                           |
 246                      .----+-----.
 247                      | physical |
 248                      |  port 0  |
 249                      `----------'
 250
 251 - VF representors are assigned arbitrary port indices 4 and 5 in the
 252   hypervisor application and are respectively associated with VF 1 and VF 2.
 253
 254 - They can't be dissociated; even if VF 1 and VF 2 were not connected,
 255   representors could still be used for configuration.
 256
 257 - In this context, port index 3 can be thought of as a representor for physical
 258   port 0.
 259
 260 As previously described, the "interconnection" block represents a logical
 261 concept. Interconnection occurs when hardware configuration enables traffic
 262 flows from one place to another (e.g. physical port 0 to VF 1) according to
 263 some criteria.
 264
 265 This is discussed in more detail in `traffic steering`_.
 266
 267 Traffic Steering
 268 ~~~~~~~~~~~~~~~~
 269
 270 In the following diagram, each meaningful traffic origin or endpoint as seen
 271 by the hypervisor application is tagged with a unique letter from A to F.
 272
 273 ::
 274
 275        .-------------.                 .-------------. .-------------.
 276        | hypervisor  |                 |    VM 1     | |    VM 2     |
 277        | application |                 | application | | application |
 278        `--+---+---+--'                 `----------+--' `--+----------'
 279           |   |   |                               |       |
 280           |   |   `-------------------.           |       |
 281           |   `---------.             |           |       |
 282           |             |             |           |       |
 283     .----(A)----. .----(B)----. .----(C)----.     |       |
 284     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 285     `-----+-----' `-----+-----' `-----+-----'     |       |
 286           |             |             |           |       |
 287         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 288         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 289         `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
 290           |             |             |           |       |
 291           |             |   .---------'           |       |
 292           `-----.       |   |   .-----------------'       |
 293                 |       |   |   |   .---------------------'
 294                 |       |   |   |   |
 295              .--+-------+---+---+---+--.
 296              | managed interconnection |
 297              `------------+------------'
 298                           |
 299                      .---(F)----.
 300                      | physical |
 301                      |  port 0  |
 302                      `----------'
 303
 304 - **A**: PF device.
 305 - **B**: port representor for VF 1.
 306 - **C**: port representor for VF 2.
 307 - **D**: VF 1 proper.
 308 - **E**: VF 2 proper.
 309 - **F**: physical port.
 310
 311 Although uncommon, some devices do not enforce a one-to-one mapping between
 312 PF and physical ports. For instance, by default all ports of **mlx4**
 313 adapters are available to all their PF/VF instances, in which case
 314 additional ports appear next to **F** in the above diagram.
 315
 316 Assuming no interconnection is provided by default in this mode, setting up
 317 a `basic SR-IOV`_ configuration involving physical port 0 could be broken
 318 down as:
 319
 320 PF:
 321
 322 - **A to F**: let everything through.
 323 - **F to A**: PF MAC as destination.
 324
 325 VF 1:
 326
 327 - **A to D**, **E to D** and **F to D**: VF 1 MAC as destination.
 328 - **D to A**: VF 1 MAC as source and PF MAC as destination.
 329 - **D to E**: VF 1 MAC as source and VF 2 MAC as destination.
 330 - **D to F**: VF 1 MAC as source.
 331
 332 VF 2:
 333
 334 - **A to E**, **D to E** and **F to E**: VF 2 MAC as destination.
 335 - **E to A**: VF 2 MAC as source and PF MAC as destination.
 336 - **E to D**: VF 2 MAC as source and VF 1 MAC as destination.
 337 - **E to F**: VF 2 MAC as source.
 338
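The rules listed above can be read as a small decision function. The sketch
below models them in plain C, without any DPDK dependency. The MAC addresses
are made up, and the evaluation order (source check first, then the most
specific destination match, then the physical port as a fallback) is an
assumption of this example rather than something mandated by the list.

```c
#include <stdint.h>
#include <string.h>

/* Endpoints named after the letters in the diagram; EP_DROP means no match. */
enum endpoint { EP_DROP, EP_A, EP_D, EP_E, EP_F };

struct mac { uint8_t b[6]; };

/* Example addresses, made up for this sketch. */
static const struct mac pf_mac  = {{0x02, 0, 0, 0, 0, 0x0a}};
static const struct mac vf1_mac = {{0x02, 0, 0, 0, 0, 0x01}};
static const struct mac vf2_mac = {{0x02, 0, 0, 0, 0, 0x02}};

static int
mac_eq(const struct mac *x, const struct mac *y)
{
	return memcmp(x->b, y->b, sizeof(x->b)) == 0;
}

/* Return the endpoint a unicast frame entering at "from" is steered to. */
static enum endpoint
steer(enum endpoint from, const struct mac *src, const struct mac *dst)
{
	/* VFs may only send frames carrying their own MAC as source. */
	if (from == EP_D && !mac_eq(src, &vf1_mac))
		return EP_DROP;
	if (from == EP_E && !mac_eq(src, &vf2_mac))
		return EP_DROP;
	/* A known destination MAC selects a local endpoint other than the sender. */
	if (mac_eq(dst, &pf_mac) && from != EP_A)
		return EP_A;
	if (mac_eq(dst, &vf1_mac) && from != EP_D)
		return EP_D;
	if (mac_eq(dst, &vf2_mac) && from != EP_E)
		return EP_E;
	/* Everything else leaves through the physical port, unless it came from it. */
	return from == EP_F ? EP_DROP : EP_F;
}
```

For instance, a frame entering from **D** with VF 1 MAC as source and VF 2
MAC as destination is steered to **E**, while the same frame with a spoofed
source address is dropped.
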
 339 Devices may additionally support advanced matching criteria such as
 340 IPv4/IPv6 addresses or TCP/UDP ports.
 341
 342 The combination of matching criteria with target endpoints fits well with
 343 **rte_flow** [6]_, which expresses flow rules as combinations of patterns
 344 and actions.
 345
 346 Enhancing **rte_flow** with the ability to make flow rules match and target
 347 these endpoints provides a standard interface to manage their
 348 interconnection without introducing new concepts and a whole new API to
 349 implement them. This is described in `flow API (rte_flow)`_.
 350
 351 .. [6] :doc:`Generic flow API (rte_flow) <rte_flow>`
 352
 353 Flow API (rte_flow)
 354 -------------------
 355
 356 Extensions
 357 ~~~~~~~~~~
 358
 359 Compared to creating a brand new dedicated interface, **rte_flow** was
 360 deemed flexible enough to manage representor traffic with only minor
 361 extensions:
 362
 363 - Using physical ports, PF, VF or port representors as targets.
 364
 365 - Affecting traffic that is not necessarily addressed to the DPDK port ID a
 366   flow rule is associated with (e.g. forcing VF traffic redirection to PF).
 367
 368 For advanced uses:
 369
 370 - Rule-based packet counters.
 371
 372 - The ability to combine several identical actions for traffic duplication
 373   (e.g. VF representor in addition to a physical port).
 374
 375 - Dedicated actions for traffic encapsulation / decapsulation before
 376   reaching an endpoint.
 377
 378 Traffic Direction
 379 ~~~~~~~~~~~~~~~~~
 380
 381 From an application standpoint, "ingress" and "egress" flow rule attributes
 382 apply to the DPDK port ID they are associated with. They select a traffic
 383 direction for matching patterns, but have no impact on actions.
 384
 385 When matching traffic coming from or going to a different place than the
 386 immediate port ID a flow rule is associated with, these attributes keep
 387 their meaning while applying to the chosen origin, as highlighted by the
 388 following diagram:
 389
 390 ::
 391
 392        .-------------.                 .-------------. .-------------.
 393        | hypervisor  |                 |    VM 1     | |    VM 2     |
 394        | application |                 | application | | application |
 395        `--+---+---+--'                 `----------+--' `--+----------'
 396           |   |   |                               |       |
 397           |   |   `-------------------.           |       |
 398           |   `---------.             |           |       |
 399           | ^           | ^           | ^         |       |
 400           | | ingress   | | ingress   | | ingress |       |
 401           | | egress    | | egress    | | egress  |       |
 402           | v           | v           | v         |       |
 403     .----(A)----. .----(B)----. .----(C)----.     |       |
 404     | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 405     `-----+-----' `-----+-----' `-----+-----'     |       |
 406           |             |             |           |       |
 407         .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 408         | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 409         `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
 410           |             |             |         ^ |       | ^
 411           |             |             |  egress | |       | | egress
 412           |             |             | ingress | |       | | ingress
 413           |             |   .---------'         v |       | v
 414           `-----.       |   |   .-----------------'       |
 415                 |       |   |   |   .---------------------'
 416                 |       |   |   |   |
 417              .--+-------+---+---+---+--.
 418              | managed interconnection |
 419              `------------+------------'
 420                         ^ |
 421                 ingress | |
 422                  egress | |
 423                         v |
 424                      .---(F)----.
 425                      | physical |
 426                      |  port 0  |
 427                      `----------'
 428
 429 Ingress and egress are defined as relative to the application creating the
 430 flow rule.
 431
 432 For instance, matching traffic sent by VM 2 would be done through an ingress
 433 flow rule on VF 2 (**E**). Likewise for incoming traffic on the physical
 434 port (**F**). This also applies to **C** and **A** respectively.
 435
 436 Transferring Traffic
 437 ~~~~~~~~~~~~~~~~~~~~
 438
 439 Without Port Representors
 440 ^^^^^^^^^^^^^^^^^^^^^^^^^
 441
 442 `Traffic direction`_ describes how an application could match traffic coming
 443 from or going to a specific place reachable from a DPDK port ID. This makes
 444 sense when the traffic in question is normally seen (i.e. sent or received)
 445 by the application creating the flow rule (e.g. as in "redirect all traffic
 446 coming from VF 1 to local queue 6").
 447
 448 However, this does not force such traffic to take a specific route. Creating
 449 a flow rule on **A** matching traffic coming from **D** is only meaningful
 450 if it can be received by **A** in the first place, otherwise doing so simply
 451 has no effect.
 452
 453 A new flow rule attribute named "transfer" is necessary for that. Combining
 454 it with "ingress" or "egress" and a specific origin requests a flow rule to
 455 be applied at the lowest level:
 456
 457 ::
 458
 459              ingress only           :       ingress + transfer
 460                                     :
 461     .-------------. .-------------. : .-------------. .-------------.
 462     | hypervisor  | |    VM 1     | : | hypervisor  | |    VM 1     |
 463     | application | | application | : | application | | application |
 464     `------+------' `--+----------' : `------+------' `--+----------'
 465            |           | | traffic  :        |           | | traffic
 466      .----(A)----.     | v          :  .----(A)----.     | v
 467      | port_id 3 |     |            :  | port_id 3 |     |
 468      `-----+-----'     |            :  `-----+-----'     |
 469            |           |            :        | ^         |
 470            |           |            :        | | traffic |
 471          .-+--.    .---+--.         :      .-+--.    .---+--.
 472          | PF |    | VF 1 |         :      | PF |    | VF 1 |
 473          `-+--'    `--(D)-'         :      `-+--'    `--(D)-'
 474            |           | | traffic  :        | ^         | | traffic
 475            |           | v          :        | | traffic | v
 476         .--+-----------+--.         :     .--+-----------+--.
 477         | interconnection |         :     | interconnection |
 478         `--------+--------'         :     `--------+--------'
 479                  | | traffic        :              |
 480                  | v                :              |
 481             .---(F)----.            :         .---(F)----.
 482             | physical |            :         | physical |
 483             |  port 0  |            :         |  port 0  |
 484             `----------'            :         `----------'
 485
 486 With "ingress" only, traffic is matched on **A** and thus still goes to
 487 physical port **F** by default:
 488
 489
 490 ::
 491
 492    testpmd> flow create 3 ingress pattern vf id is 1 / end
 493               actions queue index 6 / end
 494
 495 With "ingress + transfer", traffic is matched on **D** and is therefore
 496 successfully assigned to queue 6 on **A**:
 497
 498
 499 ::
 500
 501     testpmd> flow create 3 ingress transfer pattern vf id is 1 / end
 502               actions queue index 6 / end
 503
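The difference between the two rules above can be captured in a toy model.
The sketch below is not DPDK code: the types and ``apply_rule()`` are
hypothetical and exist only to show where a rule gets evaluated. The rule is
assumed to be created on **A** with a "queue index 6" action, as in the
commands above.

```c
#include <stdbool.h>

/* Places a packet can be at; XFER_A_QUEUE6 is queue 6 of port A. */
enum xfer_ep { XFER_A, XFER_A_QUEUE6, XFER_D, XFER_F };

/* A rule created on port A (port_id 3) whose action is "queue index 6". */
struct xfer_rule {
	bool transfer;       /* was the "transfer" attribute given? */
	enum xfer_ep origin; /* origin matched by the pattern, e.g. VF 1 */
};

/*
 * Where does a packet end up? "from" is where it enters the device and
 * "default_dest" is where the interconnection sends it without any rule.
 */
static enum xfer_ep
apply_rule(const struct xfer_rule *r, enum xfer_ep from,
	   enum xfer_ep default_dest)
{
	if (from != r->origin)
		return default_dest; /* pattern does not match */
	if (r->transfer)
		return XFER_A_QUEUE6; /* matched at the origin itself */
	/* Without "transfer", the rule only sees traffic that reaches A anyway. */
	return default_dest == XFER_A ? XFER_A_QUEUE6 : default_dest;
}
```

Traffic from **D** whose default destination is **F** is left untouched by
the "ingress"-only rule, and lands in queue 6 on **A** once "transfer" is
added.
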
 504
 505 With Port Representors
 506 ^^^^^^^^^^^^^^^^^^^^^^
 507
 508 When port representors exist, implicit flow rules with the "transfer"
 509 attribute (described in `without port representors`_) are assumed to
 510 exist between them and their represented resources. These may be immutable.
 511
 512 In this case, traffic is received by default through the representor and
 513 neither the "transfer" attribute nor traffic origin in flow rule patterns
 514 are necessary. They simply have to be created on the representor port
 515 directly and may target a different representor as described in `PORT_ID
 516 action`_.
 517
 518 Implicit traffic flow with port representor
 519
 520 ::
 521
 522        .-------------.   .-------------.
 523        | hypervisor  |   |    VM 1     |
 524        | application |   | application |
 525        `--+-------+--'   `----------+--'
 526           |       | ^               | | traffic
 527           |       | | traffic       | v
 528           |       `-----.           |
 529           |             |           |
 530     .----(A)----. .----(B)----.     |
 531     | port_id 3 | | port_id 4 |     |
 532     `-----+-----' `-----+-----'     |
 533           |             |           |
 534         .-+--.    .-----+-----. .---+--.
 535         | PF |    | VF 1 rep. | | VF 1 |
 536         `-+--'    `-----+-----' `--(D)-'
 537           |             |           |
 538        .--|-------------|-----------|--.
 539        |  |             |           |  |
 540        |  |             `-----------'  |
 541        |  |              <-- traffic   |
 542        `--|----------------------------'
 543           |
 544      .---(F)----.
 545      | physical |
 546      |  port 0  |
 547      `----------'
 548
 549 Pattern Items And Actions
 550 ~~~~~~~~~~~~~~~~~~~~~~~~~
 551
 552 PORT Pattern Item
 553 ^^^^^^^^^^^^^^^^^
 554
 555 Matches traffic originating from (ingress) or going to (egress) a physical
 556 port of the underlying device.
 557
 558 Using this pattern item without specifying a port index matches the physical
 559 port associated with the current DPDK port ID by default. As described in
 560 `traffic steering`_, specifying it should be rarely needed.
 561
 562 - Matches **F** in `traffic steering`_.
 563
 564 PORT Action
 565 ^^^^^^^^^^^
 566
 567 Directs matching traffic to a given physical port index.
 568
 569 - Targets **F** in `traffic steering`_.
 570
 571 PORT_ID Pattern Item
 572 ^^^^^^^^^^^^^^^^^^^^
 573
 574 Matches traffic originating from (ingress) or going to (egress) a given DPDK
 575 port ID.
 576
 577 Normally only supported if the port ID in question is known by the
 578 underlying PMD and related to the device the flow rule is created against.
 579
 580 This must not be confused with the `PORT pattern item`_ which refers to the
 581 physical port of a device. ``PORT_ID`` refers to a ``struct rte_eth_dev``
 582 object on the application side (also known as "port representor" depending
 583 on the kind of underlying device).
 584
 585 - Matches **A**, **B** or **C** in `traffic steering`_.
 586
 587 PORT_ID Action
 588 ^^^^^^^^^^^^^^
 589
 590 Directs matching traffic to a given DPDK port ID.
 591
 592 Same restrictions as `PORT_ID pattern item`_.
 593
 594 - Targets **A**, **B** or **C** in `traffic steering`_.
 595
 596 PF Pattern Item
 597 ^^^^^^^^^^^^^^^
 598
 599 Matches traffic originating from (ingress) or going to (egress) the physical
 600 function of the current device.
 601
 602 If supported, should work even if the physical function is not managed by
 603 the application and thus not associated with a DPDK port ID. Its behavior is
 604 otherwise similar to `PORT_ID pattern item`_ using PF port ID.
 605
 606 - Matches **A** in `traffic steering`_.
 607
 608 PF Action
 609 ^^^^^^^^^
 610
 611 Directs matching traffic to the physical function of the current device.
 612
 613 Same restrictions as `PF pattern item`_.
 614
 615 - Targets **A** in `traffic steering`_.
 616
 617 VF Pattern Item
 618 ^^^^^^^^^^^^^^^
 619
 620 Matches traffic originating from (ingress) or going to (egress) a given
 621 virtual function of the current device.
 622
 623 If supported, should work even if the virtual function is not managed by
 624 the application and thus not associated with a DPDK port ID. Its behavior is
 625 otherwise similar to `PORT_ID pattern item`_ using VF port ID.
 626
 627 Note this pattern item does not match VF representor traffic which, as
 628 separate entities, should be addressed through their own port IDs.
 629
 630 - Matches **D** or **E** in `traffic steering`_.
 631
 632 VF Action
 633 ^^^^^^^^^
 634
 635 Directs matching traffic to a given virtual function of the current device.
 636
 637 Same restrictions as `VF pattern item`_.
 638
 639 - Targets **D** or **E** in `traffic steering`_.
 640
 641 \*_ENCAP actions
 642 ^^^^^^^^^^^^^^^^
 643
 644 These actions are named after the protocol they encapsulate traffic
 645 with (e.g. ``VXLAN_ENCAP``) and take protocol-specific parameters (e.g. VNI for
 646 VXLAN).
 647
 648 While they modify traffic and can be used multiple times (order matters),
 649 unlike `PORT_ID action`_ and friends, they have no impact on steering.
 650
 651 As described in `actions order and repetition`_ this means they are useless
 652 if used alone in an action list; the resulting traffic gets dropped unless
 653 combined with either ``PASSTHRU`` or other endpoint-targeting actions.
 654
 655 \*_DECAP actions
 656 ^^^^^^^^^^^^^^^^
 657
 658 They perform the reverse of `\*_ENCAP actions`_ by popping protocol headers
 659 from traffic instead of pushing them. They can be used multiple times as
 660 well.
 661
 662 Note that using these actions on non-matching traffic results in undefined
 663 behavior. It is recommended to match the protocol headers to decapsulate on
 664 the pattern side of a flow rule in order to use these actions or otherwise
 665 make sure only matching traffic goes through.
 666
 667 Actions Order and Repetition
 668 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 669
 670 Flow rules are currently restricted to at most a single action of each
 671 supported type, performed in an unpredictable order (or all at once). To
 672 repeat actions in a predictable fashion, applications have to make rules
 673 pass-through and use priority levels.
 674
 675 It's now clear that PMD support for chaining multiple non-terminating flow
 676 rules of varying priority levels is prohibitively difficult to implement
 677 compared to simply allowing multiple identical actions performed in a
 678 defined order by a single flow rule.
 679
 680 - This change is required to support protocol encapsulation offloads and the
 681   ability to perform them multiple times (e.g. VLAN then VXLAN).
 682
 683 - It makes the ``DUP`` action redundant since multiple ``QUEUE`` actions can
 684   be combined for duplication.
 685
 686 - The (non-)terminating property of actions must be discarded. Instead, flow
 687   rules themselves must be considered terminating by default (i.e. dropping
 688   traffic if there is no specific target) unless a ``PASSTHRU`` action is
 689   also specified.
 690
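The semantics listed above (ordered actions, repetition allowed, rules
terminating unless ``PASSTHRU`` is present) can be illustrated with a small
interpreter. This is a model only: the action names and ``run_actions()``
are hypothetical, and the assumption that an encapsulation affects only the
copies delivered after it is this sketch's reading of "defined order".

```c
#include <stdbool.h>
#include <stddef.h>

enum act_type { ACT_ENCAP, ACT_PORT_ID, ACT_PASSTHRU };

struct act {
	enum act_type type;
	int arg; /* port ID for ACT_PORT_ID, unused otherwise */
};

struct act_result {
	int n_encap;     /* headers pushed so far */
	int n_targets;   /* copies delivered to explicit targets */
	int targets[8];  /* port IDs, in delivery order */
	int encap_at[8]; /* encapsulation depth of each delivered copy */
	bool passthru;   /* packet also continues on its default path */
	bool dropped;    /* no target and no PASSTHRU */
};

/* Walk the action list once, in order, and report what happens. */
static struct act_result
run_actions(const struct act *acts, size_t n)
{
	struct act_result r = {0};

	for (size_t i = 0; i < n; i++) {
		switch (acts[i].type) {
		case ACT_ENCAP:
			r.n_encap++;
			break;
		case ACT_PORT_ID:
			/* repeated targets duplicate traffic (up to 8 here) */
			if (r.n_targets < 8) {
				r.targets[r.n_targets] = acts[i].arg;
				r.encap_at[r.n_targets] = r.n_encap;
				r.n_targets++;
			}
			break;
		case ACT_PASSTHRU:
			r.passthru = true;
			break;
		}
	}
	/* Rules are terminating by default: nothing targeted means drop. */
	r.dropped = (r.n_targets == 0 && !r.passthru);
	return r;
}
```

In this model, an action list made of a lone encapsulation reports
``dropped``, whereas three ``ACT_PORT_ID`` entries deliver three copies
without any dedicated duplication action.
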
 691 Switching Examples
 692 ------------------
 693
 694 This section provides practical examples based on the established testpmd
 695 flow command syntax [2]_, in the context described in `traffic steering`_:
 696
 697 ::
 698
 699       .-------------.                 .-------------. .-------------.
 700       | hypervisor  |                 |    VM 1     | |    VM 2     |
 701       | application |                 | application | | application |
 702       `--+---+---+--'                 `----------+--' `--+----------'
 703          |   |   |                               |       |
 704          |   |   `-------------------.           |       |
 705          |   `---------.             |           |       |
 706          |             |             |           |       |
 707    .----(A)----. .----(B)----. .----(C)----.     |       |
 708    | port_id 3 | | port_id 4 | | port_id 5 |     |       |
 709    `-----+-----' `-----+-----' `-----+-----'     |       |
 710          |             |             |           |       |
 711        .-+--.    .-----+-----. .-----+-----. .---+--. .--+---.
 712        | PF |    | VF 1 rep. | | VF 2 rep. | | VF 1 | | VF 2 |
 713        `-+--'    `-----+-----' `-----+-----' `--(D)-' `-(E)--'
 714          |             |             |           |       |
 715          |             |   .---------'           |       |
 716          `-----.       |   |   .-----------------'       |
 717                |       |   |   |   .---------------------'
 718                |       |   |   |   |
 719             .--|-------|---|---|---|--.
 720             |  |       |   `---|---'  |
 721             |  |       `-------'      |
 722             |  `---------.            |
 723             `------------|------------'
 724                          |
 725                     .---(F)----.
 726                     | physical |
 727                     |  port 0  |
 728                     `----------'
 729
 730 By default, PF (**A**) can communicate with the physical port it is
 731 associated with (**F**), while VF 1 (**D**) and VF 2 (**E**) are isolated
 732 and restricted to communicate with the hypervisor application through their
 733 respective representors (**B** and **C**) if supported.
 734
 735 Examples in subsequent sections apply to hypervisor applications only and
 736 are based on port representors **A**, **B** and **C**.
 737
 738 .. [2] :ref:`Flow syntax <testpmd_rte_flow>`
 739
 740 Associating VF 1 with Physical Port 0
 741 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 742
 743 Assign all port traffic (**F**) to VF 1 (**D**) indiscriminately through
 744 their representors:
 745
 746 ::
 747
 748    flow create 3 ingress pattern / end actions port_id id 4 / end
 749    flow create 4 ingress pattern / end actions port_id id 3 / end
 750
 751 More practical example with MAC address restrictions
 752
 753 ::
 754
 755    flow create 3 ingress
 756        pattern eth dst is {VF 1 MAC} / end
 757        actions port_id id 4 / end
 758
 759 ::
 760
 761    flow create 4 ingress
 762        pattern eth src is {VF 1 MAC} / end
 763        actions port_id id 3 / end
 764
 765
 766 Sharing Broadcasts
 767 ~~~~~~~~~~~~~~~~~~
 768
 769 From outside to PF and VFs
 770
 771 ::
 772
 773    flow create 3 ingress
 774       pattern eth dst is ff:ff:ff:ff:ff:ff / end
 775       actions port_id id 3 / port_id id 4 / port_id id 5 / end
 776
 777 Note ``port_id id 3`` is necessary otherwise only VFs would receive matching
 778 traffic.
 779
 780 From PF to outside and VFs
 781
 782 ::
 783
 784    flow create 3 egress
 785       pattern eth dst is ff:ff:ff:ff:ff:ff / end
 786       actions port / port_id id 4 / port_id id 5 / end
 787
 788 From VFs to outside and PF
 789
 790 ::
 791
 792    flow create 4 ingress
 793       pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 1 MAC} / end
 794       actions port_id id 3 / port_id id 5 / end
 795
 796    flow create 5 ingress
 797       pattern eth dst is ff:ff:ff:ff:ff:ff src is {VF 2 MAC} / end
 798       actions port_id id 3 / port_id id 4 / end
 799
 800 Similar ``33:33:*`` rules based on known MAC addresses should be added for
 801 IPv6 traffic.
 802
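As a sketch of what such rules might look like, the broadcast rules above
can be adapted by replacing the exact destination match with a masked one
covering the ``33:33`` prefix. The ``spec``/``mask`` form used below is
assumed to be accepted for the ``dst`` field in the same way as for other
fields, and ``{VF 1 MAC}`` remains a placeholder as in the previous
examples

::

   flow create 3 ingress
      pattern eth dst spec 33:33:00:00:00:00 dst mask ff:ff:00:00:00:00 / end
      actions port_id id 3 / port_id id 4 / port_id id 5 / end

   flow create 4 ingress
      pattern eth dst spec 33:33:00:00:00:00 dst mask ff:ff:00:00:00:00
                  src is {VF 1 MAC} / end
      actions port_id id 3 / port_id id 5 / end
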
 803 Encapsulating VF 2 Traffic in VXLAN
 804 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 805
 806 Assuming pass-through flow rules are supported
 807
 808 ::
 809
 810    flow create 5 ingress
 811       pattern eth / end
 812       actions vxlan_encap vni 42 / passthru / end
 813
 814 ::
 815
 816    flow create 5 egress
 817       pattern vxlan vni is 42 / end
 818       actions vxlan_decap / passthru / end
 819
 820 Here ``passthru`` is needed since, as described in `actions order and
 821 repetition`_, flow rules are otherwise terminating; if supported, a rule
 822 without a target endpoint will drop traffic.
 823
 824 Without pass-through support, ingress encapsulation on the destination
 825 endpoint might not be supported and the action list must provide one:
 826
 827 ::
 828
 829    flow create 5 ingress
 830       pattern eth src is {VF 2 MAC} / end
 831       actions vxlan_encap vni 42 / port_id id 3 / end
 832
 833    flow create 3 ingress
 834       pattern vxlan vni is 42 / end
 835       actions vxlan_decap / port_id id 5 / end