1 .. SPDX-License-Identifier: BSD-3-Clause
2 Copyright 2019 Olivier Matz <zer0@droids-corp.org>
7 Most of *libecoli* is written in C. It provides a C API to ease the
8 creation of interactive command line interfaces. It is split in several
11 - the core: it provides the main API to parse and complete strings.
12 - ecoli nodes: this is the modular part of *libecoli*. Each node
13 implements a specific parsing (integers, strings, regular expressions,
15 - yaml parser: load an ecoli tree described in yaml.
16 - editline: helpers to instantiate an *editline* and connect it to
18 - utilities: logging, string, strings vector, hash table, ...
23 The *ecoli nodes* are organized in a graph. Actually, it is mostly a
24 tree, but loops are allowed in some cases. An *ecoli grammar tree*
25 describes how the input is parsed and completed. Let's take a simple
28 .. figure:: simple-tree.svg
30 A simple *ecoli grammar graph*.
32 We can also represent it as text like this::
43 This tree matches the following strings:
48 But, for instance, it does **not** match the following strings:
57 When the *libecoli* parses an input, it browses the tree (depth first
58 search) and returns an *ecoli parse tree*. Let's decompose what is done
59 when ``ec_parse_strvec(root_node, input)`` is called, step by step:
61 1. The initial input is a string vector ``["foo bar"]``.
62 2. The *sh_lex* node splits the input as a shell would have done it, in
63 ``["foo", "bar"]``, and passes it to its child.
64 3. The *seq* node (sequence) passes the input to its first child.
65 4. The *str* node matches if the first element of the string vector is
66 ``foo``. This is the case, so it returns the number of eaten
67 strings (1) to its parent.
68 5. The *seq* node passes the remaining part ``["bar"]`` to its next
70 6. The option node passes the string vector to its child, which
71 matches. It returns 1 to its parent.
72 7. The *seq* node has no more child, it returns the number of eaten
73 strings in the vector (2).
74 8. Finally, *sh_lex* compares the returned value to the number of
75 strings in its initial vector, and return 1 (match) to its caller.
77 If a node does not match, it returns ``EC_PARSE_NOMATCH`` to its parent,
78 and it is propagated until it reaches a node that handle the case. For
79 instance, the *or* node is able to handle this error. If a child does
80 not match, the next one is tried until it finds one that matches.
85 Let's take another simple example.
87 .. figure:: simple-tree2.svg
89 Another simple *ecoli grammar graph*.
91 In that case, there is no lexer (the *sh_lex* node), so the input must
92 be a string vector that is already split. For instance, it matches:
98 But it does **not** match:
100 - ``["bar 1"]``, because there is no *sh_lex* node on the top of the tree
102 - ``[]`` (empty vector)
104 At the time the input is parsed, a *parse tree* is built. When it
105 matches, it describes which part of the *ecoli grammar graph* that
106 actually matched, and what input matched for each node.
108 Passing ``["bar", "1"]`` to the previous tree would result in the
109 following *parse tree*:
111 .. figure:: parse-tree.svg
113 The *ecoli parse tree*, result of parsing.
115 Each node of the *parse tree* references the node of the *grammar graph*
116 that matched. It also stores the string vector that matched.
118 .. figure:: parse-tree2.svg
120 The same *ecoli parse tree*, showing the references to the grammar
123 We can see that not all of the grammar nodes are referenced, since the
124 str("foo") node was not parsed. Similarly, one grammar node can be
125 referenced several times in the parse tree, as we can see it in the
128 .. figure:: simple-tree3.svg
130 A simple *ecoli grammar graph*, that matches ``[]``, ``[foo]``,
133 Here is the resulting *parse tree* when the input vector is ``[foo, foo,
136 .. figure:: parse-tree3.svg
138 The resulting *parse tree*.
152 - params are consumed
155 - extending lib with external nodes (in dev guide?)