1 | <html> |
---|
2 | <head> |
---|
3 | <title>The Rule</title> |
---|
4 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
---|
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> |
---|
6 | </head> |
---|
7 | |
---|
8 | <body> |
---|
9 | <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
---|
10 | <tr> |
---|
11 | <td width="10"> |
---|
12 | </td> |
---|
13 | <td width="85%"> |
---|
14 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>The Rule</b></font> |
---|
15 | </td> |
---|
16 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
---|
17 | </tr> |
---|
18 | </table> |
---|
19 | <br> |
---|
20 | <table border="0"> |
---|
21 | <tr> |
---|
22 | <td width="10"></td> |
---|
23 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
24 | <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
25 | <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
26 | </tr> |
---|
27 | </table> |
---|
28 | <p>The <b>rule</b> is a polymorphic parser that acts as a named place-holder capturing |
---|
29 | the behavior of an EBNF expression assigned to it. Naming an EBNF expression |
---|
30 | allows it to be referenced later. The <tt>rule</tt> is a template class parameterized |
---|
31 | by the type of the scanner (<tt>ScannerT</tt>), the rule's <a href="indepth_the_parser_context.html">context</a> |
---|
32 | and its <a href="#tag">tag</a>. Default template parameters are provided to |
---|
33 | make it easy to use the rule.</p> |
---|
34 | <pre><code><font color="#000000"><span class=identifier> </span><span class=keyword>template</span><span class=special>< |
---|
35 | </span><span class=keyword>typename </span><span class=identifier>ScannerT </span><span class=special>= </span><span class=identifier>scanner</span><span class=special><>, |
---|
36 | </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special><></span><span class=identifier>, |
---|
37 | </span><span class="keyword">typename</span><span class=identifier> TagT </span><span class="special">=</span><span class=identifier> parser_address_tag</span><span class=special>> |
---|
38 | </span><span class=keyword>class </span><span class=identifier>rule</span><span class=special>;</span></font></code></pre> |
---|
39 | <p>Default template parameters are supplied to handle the most common case. <tt>ScannerT</tt> |
---|
40 | defaults to <tt>scanner<></tt>, a plain vanilla scanner that acts on <tt>char |
---|
41 | const<span class="operators">*</span></tt> iterators and does nothing special |
---|
42 | at all other than iterate through all the chars in the null-terminated input |
---|
43 | a character at a time. The rule tag, <tt>TagT</tt>, typically used with <a href="trees.html">ASTs</a>, |
---|
44 | is used to identify a rule; it is explained <a href="#tag">here</a>. In trivial |
---|
45 | cases, declaring a rule as <tt>rule<></tt> is enough. You need not be |
---|
46 | concerned at all with the <tt>ContextT</tt> template parameter unless you wish |
---|
47 | to tweak the low level behavior of the rule. Detailed information on the <tt>ContextT</tt> |
---|
48 | template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>. |
---|
49 | </p> |
---|
50 | <h3><a name="order_of_parameters"></a>Order of parameters</h3> |
---|
51 | <p>As of v1.8.0, the <tt>ScannerT</tt>, <tt>ContextT</tt> and <tt>TagT</tt> can |
---|
52 | be specified in any order. Any template parameter that is omitted takes |
---|
53 | its default. Examples:</p> |
---|
54 | <pre><span class=identifier> rule</span><span class=special><> </span><span class=identifier>rx1</span><span class=special>; |
---|
55 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>scanner</span><span class=special><> </span><span class=special>> </span><span class=identifier>rx2</span><span class=special>; |
---|
56 | </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code> </span><span class=special>> </span><span class=identifier>rx3</span><span class=special>; |
---|
57 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code></span><span class=special>, </span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx4</span><span class=special>; |
---|
58 | </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx5</span><span class=special>; |
---|
59 | </span> <span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_address_tag</span><span class=special>, </span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code> </span><span class=special>> </span><span class=identifier>rx6</span><span class=special>; |
---|
60 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>parser_context<code><font color="#000000"><span class=special><></span></font></code></span><span class=special>, </span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>parser_address_tag</span><span class=special>> </span><span class=identifier>rx7</span><span class=special>;</span></pre> |
---|
61 | <h3><a name="multiple_scanner_support" id="multiple_scanner_support"></a>Multiple scanners</h3> |
---|
62 | <p>As of v1.8.0, rules can use one or more scanner types. There are cases, for |
---|
63 | instance, where we need a rule that can work on the phrase and character levels. |
---|
64 | Rule/scanner mismatch has been a source of confusion and is the no. 1 <a href="faq.html#scanner_business">FAQ</a>. |
---|
65 | To address this issue, we now have multiple scanner support. Example:</p> |
---|
66 | <pre><span class=special> </span><span class=keyword>typedef </span><span class=identifier>scanner_list</span><span class=special><</span><span class=identifier>scanner</span><span class=special><>, </span><span class=identifier>phrase_scanner_t</span><span class=special>> </span><span class=identifier>scanners</span><span class=special>; |
---|
67 | |
---|
68 | </span><span class=identifier>rule</span><span class=special><</span><span class=identifier>scanners</span><span class=special>> </span><span class=identifier>r </span><span class=special>= </span><span class=special>+</span><span class=identifier>anychar_p</span><span class=special>; |
---|
69 | </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>"abcdefghijk"</span><span class=special>, </span><span class=identifier>r</span><span class=special>).</span><span class=identifier>full</span><span class=special>); |
---|
70 | </span><span class=identifier>assert</span><span class=special>(</span><span class=identifier>parse</span><span class=special>(</span><span class=string>"a b c d e f g h i j k"</span><span class=special>, </span><span class=identifier>r</span><span class=special>, </span><span class=identifier>space_p</span><span class=special>).</span><span class=identifier>full</span><span class=special>);</span></pre> |
---|
71 | <p>Notice how rule <tt>r</tt> is used in both the phrase and character levels. |
---|
72 | </p> |
---|
73 | <p>By default support for multiple scanners is disabled. The macro |
---|
74 | <tt>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</tt> must be defined to the |
---|
75 | maximum number of scanners allowed in a scanner_list. The value must |
---|
76 | be greater than 1 to enable multiple scanners. Given the |
---|
77 | example above, to define a limit of two scanners for the list, the |
---|
78 | following line must be inserted into the source file before the |
---|
79 | inclusion of Spirit headers: |
---|
80 | </p> |
---|
81 | <pre><span class=special> </span><span class=preprocessor>#define </span><span class=identifier>BOOST_SPIRIT_RULE_SCANNERTYPE_LIMIT</span> <span class=literal>2</span></pre> |
---|
82 | <table width="80%" border="0" align="center"> |
---|
83 | <tr> |
---|
84 | <td class="note_box"><img src="theme/bulb.gif" width="13" height="18"> See |
---|
85 | the techniques section for an <a href="techniques.html#multiple_scanner_support">example</a> |
---|
86 | of a <a href="grammar.html">grammar</a> using a multiple scanner enabled |
---|
87 | rule, <a href="scanner.html#lexeme_scanner">lexeme_scanner</a> and <a href="scanner.html#as_lower_scanner">as_lower_scanner.</a></td> |
---|
88 | </tr> |
---|
89 | </table> |
---|
90 | <h3>Rule Declarations</h3> |
---|
91 | <p>The rule class models EBNF's production rule. Example:</p> |
---|
92 | <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>a_rule </span><span class=special>= </span><span class=special>*(</span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>) </span><span class=special>& </span><span class=special>+(</span><span class=identifier>c </span><span class=special>| </span><span class=identifier>d </span><span class=special>| </span><span class=identifier>e</span><span class=special>);</span></font></code></pre> |
---|
93 | <p>The type and behavior of the right-hand (rhs) EBNF expression, which may be |
---|
94 | arbitrarily complex, is encoded in the rule named a_rule. a_rule may now be |
---|
95 | referenced elsewhere in the grammar:</p> |
---|
96 | <pre><code><font color="#000000"> <span class=identifier>rule</span><span class=special><> </span><span class=identifier>another_rule </span><span class=special>= </span><span class=identifier>f </span><span class=special>>> </span><span class=identifier>g </span><span class=special>>> </span><span class=identifier>h </span><span class=special>>> </span><span class=identifier>a_rule</span><span class=special>;</span></font></code></pre> |
---|
97 | <table width="80%" border="0" align="center"> |
---|
98 | <tr> |
---|
99 | <td class="note_box"><img src="theme/alert.gif" width="16" height="16"> <b>Referencing |
---|
100 | rules <br> |
---|
101 | </b><br> |
---|
102 | When a rule is referenced anywhere in the right hand side of an EBNF expression, |
---|
103 | the rule is held by the expression by reference. It is the responsibility |
---|
104 | of the client to ensure that the referenced rule stays in scope and does |
---|
105 | not get destructed while it is being referenced. </td> |
---|
106 | </tr> |
---|
107 | </table> |
---|
108 | <pre><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>int_p</span><span class=special>; |
---|
109 | </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>a</span><span class=special>; |
---|
110 | </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>int_p </span><span class=special>>> </span><span class=identifier>b</span><span class=special>;</span></pre> |
---|
111 | <h3>Copying Rules</h3> |
---|
112 | <p>The rule is a weird C++ citizen, unlike any other C++ object. It does not have |
---|
113 | the proper copy and assignment semantics and cannot be stored and passed around |
---|
114 | by value. If you need to copy a rule you have to explicitly call its member |
---|
115 | function <tt>copy()</tt>:</p> |
---|
116 | <pre><span class=special> </span><span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>;</span></pre> |
---|
117 | <p>However, be warned that copying a rule will not deep copy other referenced |
---|
118 | rules of the source rule being copied. This might lead to dangling references. |
---|
119 | Again, it is the responsibility of the client to ensure that all referenced |
---|
120 | rules stay in scope and do not get destructed while they are being referenced. |
---|
121 | Caveat emptor.</p> |
---|
122 | <p>If you copy a rule, you will want to store the copy somewhere. The question |
---|
123 | is how. The storage can't be another rule:</p> |
---|
124 | <pre> <code><font color="#000000"><span class=identifier>rule</span><span class=special><></span></font></code> r2 <span class="special">=</span> <span class=identifier>r</span><span class="special">.</span><span class=identifier>copy()</span><span class=special>; </span><span class="comment">// BAD!</span></pre> |
---|
125 | <p>because rules are weird and do not have the expected C++ copy-constructor |
---|
126 | and assignment semantics! As a general rule: <strong>Don't put a copied rule |
---|
127 | into another rule! </strong>Instead, use the <a href="stored_rule.html">stored_rule</a> |
---|
128 | for that purpose.</p> |
---|
129 | <h3>Forward declarations</h3> |
---|
130 | <p>A <tt>rule</tt> may be declared before being defined to allow cyclic structures |
---|
131 | typically found in BNF declarations. Example:</p> |
---|
132 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>rule</span><span class=special><> </span><span class=identifier>a</span><span class=special>, </span><span class=identifier>b</span><span class=special>, </span><span class=identifier>c</span><span class=special>; |
---|
133 | |
---|
134 | </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>a</span><span class=special>; |
---|
135 | </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>c </span><span class=special>| </span><span class=identifier>a</span><span class=special>;</span></font></code></pre> |
---|
136 | <h3>Recursion</h3> |
---|
137 | <p>The right-hand side of a rule may reference other rules, including itself. |
---|
138 | The limitation is that direct or indirect left recursion is not allowed (this |
---|
139 | is an unchecked run-time error that results in an infinite loop). This is typical |
---|
140 | of top-down parsers. Example:</p> |
---|
141 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b</span><span class=special>; </span><span class=comment>// infinite loop!</span></font></code></pre> |
---|
142 | <table width="80%" border="0" align="center"> |
---|
143 | <tr> |
---|
144 | <td class="note_box"><img src="theme/lens.gif" width="15" height="16"> <b>What |
---|
145 | is left recursion?<br> |
---|
146 | </b><br> |
---|
147 | Left recursion happens when you have a rule that calls itself before anything |
---|
148 | else. A top-down parser will go into an infinite loop when this happens. |
---|
149 | See the <a href="faq.html#left_recursion">FAQ</a> for details on how to |
---|
150 | eliminate left recursion.</td> |
---|
151 | </tr> |
---|
152 | </table> |
---|
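<p>To make the left-recursion note concrete, here is a minimal hand-written recursive-descent sketch in plain C++. It is not Spirit code: the <tt>subtraction_parser</tt> struct and the <tt>evaluate</tt> function are hypothetical names introduced only for this illustration. It shows how a left-recursive production can be rewritten as a loop so that a top-down parser always consumes input before repeating.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical hand-written parser, not part of Spirit.
// Left-recursive grammar (a top-down parser would loop forever):
//     expr = expr '-' digit | digit
// Equivalent grammar without left recursion:
//     expr = digit ('-' digit)*
struct subtraction_parser
{
    explicit subtraction_parser(std::string const& text)
        : input(text), pos(0) {}

    // digit: consumes exactly one decimal digit.
    bool digit(int& value)
    {
        if (pos < input.size() &&
            std::isdigit(static_cast<unsigned char>(input[pos])))
        {
            value = input[pos] - '0';
            ++pos;
            return true;
        }
        return false;
    }

    // expr = digit ('-' digit)*
    // The loop replaces the recursive call. Every iteration consumes
    // input, so the function always terminates.
    bool expr(int& result)
    {
        if (!digit(result))
            return false;
        while (pos < input.size() && input[pos] == '-')
        {
            std::size_t const saved = pos;
            ++pos;
            int rhs = 0;
            if (!digit(rhs))
            {
                pos = saved; // back up: this '-' is not part of the expression
                break;
            }
            result -= rhs;   // keeps subtraction left-associative
        }
        return true;
    }

    // True if all of the input has been consumed.
    bool full() const { return pos == input.size(); }

    std::string input;
    std::size_t pos;
};

// Returns (true, value) if the whole text is an expression, else (false, 0).
std::pair<bool, int> evaluate(std::string const& text)
{
    subtraction_parser p(text);
    int value = 0;
    bool const ok = p.expr(value) && p.full();
    return std::make_pair(ok, ok ? value : 0);
}
```

<p>Under these assumptions, <tt>evaluate("9-3-2")</tt> yields a successful match with the value 4, while <tt>evaluate("9-")</tt> fails because the trailing <tt>-</tt> is left unconsumed.</p>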
153 | <h3>Undefined rules</h3> |
---|
154 | <p>An undefined rule matches nothing and is semantically equivalent to <tt>nothing_p</tt>.</p> |
---|
155 | <h3>Redeclarations</h3> |
---|
156 | <p>Like any other C++ assignment, a second assignment to a rule is destructive |
---|
157 | and will redefine it. The old definition is lost. Rules are dynamic. A rule |
---|
158 | can change its definition anytime:</p> |
---|
159 | <pre><code><font color="#000000"><span class=identifier> r </span><span class=special>= </span><span class=identifier>a_definition</span><span class=special>; |
---|
160 | </span><span class=identifier> r </span><span class=special>= </span><span class=identifier>another_definition</span><span class=special>;</span></font></code></pre> |
---|
161 | <p>Rule <tt>r</tt> loses the old definition when the second assignment is made. |
---|
162 | As mentioned, an undefined rule matches nothing and is semantically equivalent |
---|
163 | to <tt>nothing_p</tt>.</p> |
---|
164 | <h3>Dynamic Parsers</h3> |
---|
165 | <p>Hosting declarative EBNF in imperative C++ yields an interesting blend. We |
---|
166 | have the best of both worlds. We have the ability to conveniently modify the |
---|
167 | grammar at run time using imperative constructs such as <tt>if</tt> and <tt>else</tt> |
---|
168 | statements. Example:</p> |
---|
169 | <pre><code><font color="#000000"><span class=special> </span><span class=keyword>if </span><span class=special>(</span><span class=identifier>feature_is_available</span><span class=special>) |
---|
170 | </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>add_this_feature</span><span class=special>;</span></font></code></pre> |
---|
171 | <p>Rules are essentially dynamic parsers. A dynamic parser is characterized by |
---|
172 | its ability to modify its behavior at run time. Initially, an undefined rule |
---|
173 | matches nothing. At any time, the rule may be defined and redefined, thus, dynamically |
---|
174 | altering its behavior.</p> |
---|
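<p>The idea of a dynamic parser can be modelled with a small toy class in plain C++. This is only a sketch and does not reflect how Spirit implements <tt>rule</tt>: the names <tt>toy_rule</tt>, <tt>parser_fn</tt>, <tt>ch</tt> and <tt>accepts</tt> are all hypothetical. It shows an undefined rule matching nothing and a definition being assigned at run time.</p>

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Toy model of a redefinable rule; hypothetical, not Spirit's rule class.
// A parser is any callable that tries to match at position pos and
// advances pos on success.
typedef std::function<bool(std::string const&, std::size_t&)> parser_fn;

class toy_rule
{
public:
    // A rule starts out undefined.
    toy_rule() {}

    // Assignment is destructive: the old definition is lost.
    toy_rule& operator=(parser_fn const& definition)
    {
        definition_ = definition;
        return *this;
    }

    // An undefined rule matches nothing.
    bool parse(std::string const& input, std::size_t& pos) const
    {
        return definition_ ? definition_(input, pos) : false;
    }

private:
    parser_fn definition_;
};

// Builds a parser that matches one specific character.
parser_fn ch(char expected)
{
    return [expected](std::string const& input, std::size_t& pos) -> bool {
        if (pos < input.size() && input[pos] == expected)
        {
            ++pos;
            return true;
        }
        return false;
    };
}

// Mirrors the text: the rule is defined only if the feature is enabled.
bool accepts(bool feature_is_available, std::string const& input)
{
    toy_rule r;                 // undefined: matches nothing
    if (feature_is_available)
        r = ch('x');            // define the rule at run time
    std::size_t pos = 0;
    return r.parse(input, pos) && pos == input.size();
}
```

<p>In this model, <tt>accepts(false, "x")</tt> fails because the rule is never defined, and <tt>accepts(true, "x")</tt> succeeds once the definition has been assigned.</p>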
175 | <h3>No start rule</h3> |
---|
176 | <p>Typically, parsers have what is called a start symbol, chosen to be the root |
---|
177 | of the grammar where parsing starts. The Spirit parser framework has no notion |
---|
178 | of a start symbol. Any rule can be a start symbol. This feature promotes step-wise |
---|
179 | creation of parsers. We can build parsers from the bottom up while fully testing |
---|
180 | each level or module up until we get to the top-most level.</p> |
---|
181 | <h3><a name="tag"></a>Parser Tags</h3> |
---|
182 | <p>Rules may be tagged for identification purposes. This is necessary, especially |
---|
183 | when dealing with <a href="trees.html">parse trees and ASTs</a>, to see which |
---|
184 | rule created a specific AST/parse tree node. Each rule has an ID of type <tt>parser_id</tt>. |
---|
185 | This ID can be obtained through the rule's <tt>id()</tt> member function:</p> |
---|
186 | <pre><code><font color="#000000"><span class=identifier> my_rule</span><span class=special>.</span><span class=identifier>id</span><span class=special>(); </span><span class=comment>// get my_rule's id</span></font></code></pre> |
---|
187 | <p>The <tt>parser_id</tt> class is declared as:</p> |
---|
188 | <pre> <span class="keyword">class</span> <span class="identifier">parser_id</span><br> <span class="special">{</span><br> <span class="keyword">public</span><span class="special">:</span><br> parser_id<span class="special">();</span><br> <span class="keyword">explicit</span> parser_id<span class="special">(</span><span class="keyword">void const</span><span class="special">*</span> p<span class="special">);</span><br> parser_id<span class="special">(</span><span class="keyword">std::size_t</span> l<span class="special">);</span> |
---|
189 | |
---|
190 | <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">==(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span><br> <span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">!=(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span> |
---|
191 | <span class="keyword">bool</span> <span class="keyword"> operator</span><span class="special"><(</span><span class="identifier">parser_id</span> <span class="keyword">const</span><span class="special">&</span> x<span class="special">)</span> const<span class="special">;</span> |
---|
192 | <span class="special"></span><span class="keyword">std::size_t</span><span class="identifier"> to_long</span><span class="special">()</span> <span class="keyword">const</span><span class="special">; |
---|
193 | };</span></pre> |
---|
194 | <h3>parser_address_tag</h3> |
---|
195 | <p>The rule's <tt>TagT</tt> template parameter supplies this ID. This defaults |
---|
196 | to <tt>parser_address_tag</tt>. The <tt>parser_address_tag</tt> uses the address |
---|
197 | of the rule as its ID. This is often not the most convenient, since it is not |
---|
198 | always possible to get the address of a rule to compare against. </p> |
---|
199 | <h3>parser_tag</h3> |
---|
200 | <p>It is possible to have specific constant integers to identify a rule. For this |
---|
201 | purpose, we can use the <tt>parser_tag<N></tt>, where N is a constant |
---|
202 | integer:</p> |
---|
203 | <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special><</span><span class=identifier>parser_tag</span><span class="special"><</span><span class=identifier>123</span><span class="special">> > </span><span class="identifier">my_rule</span><span class="special">; </span><span class="comment">// set my_rule's id to 123</span></font></code></pre> |
---|
204 | <h3>dynamic_parser_tag</h3> |
---|
205 | <p>The <tt>parser_tag<N></tt> can only specify a <strong>static ID</strong>, |
---|
206 | which is defined at compile time. If you need the ID to be <strong>dynamic</strong> |
---|
207 | (changeable at runtime), you can use the <tt>dynamic_parser_tag</tt> class as |
---|
208 | the <tt>TagT</tt> template parameter. This template parameter enables the <tt>set_id()</tt> |
---|
209 | function, which may be used to set the required id at runtime:</p> |
---|
210 | <pre><code><font color="#000000"><span class=identifier> rule</span><span class=special><</span><span class=identifier>dynamic_parser_tag</span><span class="special">> </span><span class="identifier">my_dynrule</span><span class="special">;</span> |
---|
211 | my_dynrule.set_id(1234); <span class="comment">// set my_dynrule's id to 1234</span></font></code></pre> |
---|
212 | <p>If the <tt>set_id()</tt> function isn't called, the parser ID defaults to the |
---|
213 | address of the rule, just as it would with the <tt>parser_address_tag</tt> template |
---|
214 | parameter. </p> |
---|
215 | <table border="0"> |
---|
216 | <tr> |
---|
217 | <td width="10"></td> |
---|
218 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
219 | <td width="30"><a href="numerics.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
220 | <td width="30"><a href="epsilon.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
221 | </tr> |
---|
222 | </table> |
---|
223 | <br> |
---|
224 | <hr size="1"> |
---|
225 | <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> |
---|
226 | <br> |
---|
227 | <font size="2">Use, modification and distribution is subject to the Boost Software |
---|
228 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
---|
229 | http://www.boost.org/LICENSE_1_0.txt)</font></p> |
---|
230 | </body> |
---|
231 | </html> |
---|