Planet
navi homePPSaboutscreenshotsdownloaddevelopmentforum

source: downloads/boost_1_33_1/libs/spirit/doc/subrules.html @ 12

Last change on this file since 12 was 12, checked in by landauf, 17 years ago

added boost

File size: 24.8 KB
Line 
1<html>
2<head>
3<title>Subrules</title>
4<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
5<link rel="stylesheet" href="theme/style.css" type="text/css">
6</head>
7
8<body>
9<table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2">
10  <tr>
11    <td width="10">
12    </td>
13    <td width="85%">
14      <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Subrules</b></font>
15    </td>
16    <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td>
17  </tr>
18</table>
19<br>
20<table border="0">
21  <tr>
22    <td width="10"></td>
23    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
24    <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td>
25    <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td>
26  </tr>
27</table>
28<p>Spirit is implemented using expression templates. This is a very powerful technique.
29  Along with its power comes some complications. We almost take for granted that
30  when we write <tt>i | j &gt;&gt; k</tt> where <tt>i</tt>, <tt>j</tt> and <tt>k</tt>
31  are all integers the result is still an integer. Yet, with expression templates,
32  the same expression <tt>i | j &gt;&gt; k</tt> where <tt>i</tt>, <tt>j</tt> and
33  <tt>k</tt> are of type <tt>T</tt>, the result is a complex composite type [see
34  <a href="basic_concepts.html">Basic Concepts</a>]. Spirit expressions, which
35  are combinations of primitives and composites yield an infinite set of new types.
36  One problem is that C++ offers no easy facility to deduce the type of an arbitrarily
37  complex expression that yields a complex type. Thus, while it is easy to write:</p>
38<pre><code><font color="#000000"><span class=identifier>    </span><span class=keyword>int </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are ints</span></font></code></pre>
39<p>Expression templates yield an endless supply of types. Without the <a href="rule.html">rule</a>,
40  there is no easy way to do this in C++ if <tt>i</tt>, <tt>j</tt> and <tt>k</tt> 
41  are Spirit parsers:</p>
42<pre><code><font color="#000000"><span class=comment>    </span><span class=special>&lt;</span><span class=identifier>what_type???</span><span class=special>&gt; </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are Spirit parsers</span></font></code></pre>
43<p>If <tt>i</tt>, <tt>j</tt> and <tt>k</tt> are all <tt>chlit&lt;&gt;</tt> objects,
44  the type that we want is:</p>
45<pre><code><font color="#000000"><span class=comment>    </span><span class=keyword>typedef
46        </span><span class=identifier>alternative</span><span class=special>&lt;
47            </span><span class=identifier>chlit</span><span class=special>&lt;&gt;</span><span class=comment>      //  i
48          </span><span class=special>,</span> <span class=identifier>sequence</span><span class=special>&lt;
49                </span><span class=identifier>chlit</span><span class=special>&lt;&gt;  </span><span class=comment>//  j
50          </span><span class=special>    ,</span><span class=comment> </span><span class=identifier>chlit</span><span class=special>&lt;&gt;  </span><span class=comment>//  k
51            </span><span class=special>&gt;
52        &gt;
53    </span><span class=identifier>rule_t</span><span class=special>;
54
55    </span><span class=identifier>rule_t r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special>; </span><span class=comment>// where i, j, and k are chlit&lt;&gt; objects</span></font></code></pre>
56<p>We deliberately formatted the type declaration nicely to make it understandable.
57  Try that with a more complex expression. While it can be done, explicitly spelling
58  out the type of a Spirit expression template is tedious and error prone. The
59  right hand side (rhs) has to mirror the type of the left hand side (lhs). (<img src="theme/lens.gif" width="15" height="16"> 
60  Yet, if you still wish to do it, see this <a href="techniques.html#no_rules">link</a> 
61  for a technique). </p>
62<table width="80%" border="0" align="center">
63  <tr>
64    <td class="note_box"><p><img src="theme/lens.gif" width="15" height="16"><b> 
65        typeof and auto</b> <br>
66        <br>
67        Some compilers already support the <tt>typeof</tt> keyword. This can be
68        used to free us from having to explicitly type the type (pun intentional).
69        Using the <tt>typeof</tt>, we can rewrite the Spirit expression above
70        as:<br>
71        <br>
72        <span class="keyword"><code>typeof</code><code></code></span><code><span class=special>(</span><span class=identifier>i
73        </span><span class=special>| </span><span class=identifier>j </span><span class=special>&gt;&gt; 
74        </span><span class=identifier>k</span><span class=special>) </span><span class=identifier>r
75        </span><span class=special>= </span><span class=identifier>i </span><span class=special>|
76        </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special>;</span></code><br>
77        <br>
78        While this is better than having to explicitly declare a complex type,
79        it is redundant, error prone and still an eye sore. The expression is
80        typed twice. The only way to simplify this is to introduce a macro (See
81        this <a href="techniques.html#typeof">link</a> for more information).<br>
82        <br>
83        <a href="http://www.boost-consulting.com">David Abrahams</a> proposed
84        in comp.std.c++ to reuse the <tt>auto</tt> keyword for type deduced variables.
85        This has been extensibly discussed in <a href="http://www.boost.org">boost.org</a>. Example:
86        <br>
87        <br>
88        <span class=keyword><code>auto </code></span><code><span class=identifier>r
89        </span><span class=special>= </span><span class=identifier>i </span><span class=special>|
90        </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special>;</span></code><br>
91        <br>
92        Once such a C++ extension is accepted into the standard, this would be
93        a neat solution and a nice fit for our purpose. It's not a complete solution
94        though since there are still situations where we do not know the rhs beforehand;
95        for instance when pre-declaring cyclic dependent rules.</p>
96    </td>
97  </tr>
98</table>
99<p>Fortunately, rules come to the rescue. Rules can capture the type of the expression
100  assigned to it. Thus:</p>
101<pre><code><font color="#000000">    <span class=identifier>rule</span><span class=special>&lt;&gt; </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>i </span><span class=special>| </span><span class=identifier>j </span><span class=special>&gt;&gt; </span><span class=identifier>k</span><span class=special></span><span class=comment>// where i, j, and k are chlit&lt;&gt; objects</span></font></code></pre>
102<p>It might not be apparent but behind the scenes, plain rules are actually implemented
103  using a pointer to a runtime polymorphic abstract class that holds the dynamic
104  type of the parser assigned to it. When a Spirit expression is assigned to a
105  rule, its type is encapsulated in a concrete subclass of the abstract class.
106  A virtual parse function delegates the parsing to the encapsulated object.</p>
107<p>Rules have drawbacks though:</p>
108<p><img src="theme/bullet.gif" width="12" height="12"> It is coupled to a specific
109  scanner type. The rule is tied to a specific scanner [see <a href="faq.html#scanner_business">The
110  Scanner Business</a>].<br>
111  <img src="theme/bullet.gif" width="12" height="12"> The rule's parse member
112function has a virtual function call overhead that cannot be inlined.</p>
113<h2>Static rules: subrules</h2>
114<p>The subrule is a fully static version of the rule. The subrule does not have
115  the drawbacks listed above. </p>
116<p><img src="theme/bullet.gif" width="12" height="12"> The subrule is not tied
117  to a specific scanner so just about any scanner type may be used<br>
118  <img src="theme/bullet.gif" width="12" height="12"> The subrule also allows
119  aggressive inlining since there are no virtual function calls</p>
120<pre><code><font color="#000000"><span class=identifier>    </span><span class=keyword>template</span><span class=special>&lt;</span><span class=keyword>int </span></font><span class="identifier">ID</span><font color="#000000"><span class=special>, </span><span class=keyword>typename </span><span class=identifier>ContextT </span><span class=special>= </span><span class=identifier>parser_context</span><span class=special>&lt;&gt;</span> <span class=special>&gt;
121    </span><span class=keyword>class </span><span class=identifier>subrule</span><span class=special>;</span></font></code></pre>
122<p>The first template parameter gives the subrule an identification tag. Like
123  the <a href="rule.html">rule</a>, there is a ContextT template parameter that
124  defaults to <code><tt>parser_context</tt></code>. You need not be concerned
125  at all with the <tt>ContextT</tt> template parameter unless you wish to tweak
126  the low level behavior of the subrule. Detailed information on the <tt>ContextT</tt> 
127  template parameter is provided <a href="indepth_the_parser_context.html">elsewhere</a>.
128</p>
129<p>Presented above is the public API. There may actually be more template parameters
130  after <tt>ContextT</tt>. Everything after the <tt>ContextT</tt> parameter should
131  not be of concern to the client and are strictly for internal use only.</p>
132<p>Apart from a few minor differences, the subrule follows the usage and syntax
133  of the rule closely. Here's the calculator grammar using subrules:</p>
134<pre><code><font color="#000000"><span class=comment>    </span><span class=keyword>struct </span><span class=identifier>calculator </span><span class=special>: </span><span class=keyword>public </span><span class=identifier>grammar</span><span class=special>&lt;</span><span class=identifier>calculator</span><span class=special>&gt;
135    </span><span class=special>{
136        </span><span class=keyword>template </span><span class=special>&lt;</span><span class=keyword>typename </span><span class=identifier>ScannerT</span><span class=special>&gt;
137        </span><span class=keyword>struct </span><span class=identifier>definition
138        </span><span class=special>{
139            </span><span class=identifier>definition</span><span class=special>(</span><span class=identifier>calculator </span><span class=keyword>const</span><span class=special>& </span><span class=identifier>self</span><span class=special>)
140            </span><span class=special>{
141                </span><span class=identifier>first </span><span class=special>=
142                </span><span class=special>(
143                    </span><span class=identifier>expression  </span><span class=special>= </span><span class=identifier>term </span><span class=special>&gt;&gt; </span><span class=special>*((</span><span class=literal>'+' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'-' </span><span class=special>&gt;&gt; </span><span class=identifier>term</span><span class=special>)),
144                    </span><span class=identifier>term        </span><span class=special>= </span><span class=identifier>factor </span><span class=special>&gt;&gt; </span><span class=special>*((</span><span class=literal>'*' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>) </span><span class=special>| </span><span class=special>(</span><span class=literal>'/' </span><span class=special>&gt;&gt; </span><span class=identifier>factor</span><span class=special>)),
145                    </span><span class=identifier>factor      </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>group</span><span class=special>,
146                    </span><span class=identifier>group       </span><span class=special>= </span><span class=literal>'(' </span><span class=special>&gt;&gt; </span><span class=identifier>expression </span><span class=special>&gt;&gt; </span><span class=literal>')'
147                </span><span class=special>);
148            </span><span class=special>}
149
150            </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt;  </span><span class=identifier>expression</span><span class=special>;
151            </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>1</span><span class=special>&gt;  </span><span class=identifier>term</span><span class=special>;
152            </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>2</span><span class=special>&gt;  </span><span class=identifier>factor</span><span class=special>;
153            </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>3</span><span class=special>&gt;  </span><span class=identifier>group</span><span class=special>;
154
155            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=identifier>first</span><span class=special>;
156            </span><span class=identifier>rule</span><span class=special>&lt;</span><span class=identifier>ScannerT</span><span class=special>&gt; </span><span class=keyword>const</span><span class=special>&
157            </span><span class=identifier>start</span><span class=special>() </span><span class=keyword>const </span><span class=special>{ </span><span class=keyword>return </span><span class=identifier>first</span><span class=special>; </span><span class=special>}
158        </span><span class=special>};
159    </span><span class=special>};</span></font></code></pre>
160<p><img src="theme/lens.gif" width="15" height="16"> A fully working example with
161  <a href="semantic_actions.html">semantic actions</a> can be <a href="../example/fundamental/subrule_calc.cpp">viewed
162  here</a>. This is part of the Spirit distribution. </p>
163<table border="0" align="left">
164  <tr> 
165    <td width="199"><img src="theme/subrule1.png" width="234" height="224"></td>
166    <td width="2"></td>
167  </tr>
168</table>
169<p>The subrule as an efficient version of the rule. Compiler optimizations such
170  as aggressive inlining help reduce the code size and increase performance significantly.
171</p>
172<p>The subrule is not a panacea however. Subrules push the C++ compiler hard to
173  its knees. For example, current compilers have a limit on recursion depth that
174  may not be exceeded. Don't even think about writing a full pascal grammar using
175  subrules alone. A grammar using subrules is a single C++ expression. Current
176  C++ compilers cannot handle very complex expressions very well. Finally, a plain
177  rule is still needed to act as place holder for subrules.</p>
178<p>The code above is a good example of the recommended way to use subrules. Notice
179  the hierarchy. We have a grammar that encapsulates the whole calculator. The
180  start rule is a plain rule that holds the set of subrules. The subrules in turn
181  defines the actual details of the grammar.</p>
182<table width="80%" border="0" align="center">
183  <tr> 
184    <td class="note_box"><img src="theme/lens.gif" width="15" height="16"><b> 
185      Template instantiation depth</b> <br> <br>
186      Spirit pushes the C++ compiler hard. Current C++ compilers cannot handle
187      very complex heavily nested expressions very well. One restricting factor
188      is the typical compiler's limit on template recursion depth. Some, but not
189      all, compilers allow this limit to be configured.<br>
190      <br>
191      g++'s maximum can be set using a compiler flag: -ftemplate-depth. Set this
192      appropriately if you have a relatively complex grammar.<br>
193      <br>
194      Microsoft Visual C++ can take greater than 1000 for both template class
195      and function instantiation depths. However, the linker chokes with deep
196      template function instantiation unless inline recursion depth is set using
197      these pragmas:<br>
198      <br>
199      <span class="preprocessor">#pragma</span> inline_depth<span class="special">(</span>255<span class="special">)</span><br>
200      <span class="preprocessor">#pragma</span> inline_recursion<span class="special">(</span>on<span class="special">)<br>
201      <br>
202      </span>Perhaps this limitations no longer applies to more current versions
203      of these compilers. Be sure to check your compiler documentation.</td>
204  </tr>
205</table>
206<p>This setup gives a good balance. The subrules do all the work. Each grammar
207  will have only one rule: <tt>first</tt>. The rule is used just to hold the subrules
208  and make them visible to the grammar. </p>
209<h3>The subrule definition</h3>
210<p>Like the rule, the expression after assignment operator <tt>=</tt> defines
211  the subrule:</p>
212<pre>    <span class=identifier>identifier </span><span class=special>= </span><span class=identifier>expression</span></pre>
213<p>Unlike rules, subrules may be defined only once. Redefining a subrule is illegal
214  and will result to a compile time assertion.</p>
215<h3>Separators [ , ]</h3>
216<p>While rules are terminated by the semicollon <tt>';'</tt>. Subrules are not
217  terminated but are separated by the comma: <tt>','</tt>. Like Pascal statements,
218  the last subrule in a group may not have a trailing comma.</p>
219<pre><span class=identifier>    </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
220    </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>),
221    </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>), </span><span class=comment>// BAD, trailing comma</span><code><font color="#000000"><font color="#800000"><i></i></font></font></code><code><font color="#000000"><font color="#800000"><i></i></font></font><i></i></code></pre>
222<p>
223<pre><code><span class=comment>    </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
224    </span><span class=identifier>b </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'b'</span><span class=special>),
225    </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special></span><span class=comment>// OK</span></code></pre>
226<h3> The start subrule</h3>
227<p>Unlike rules, parsing proceeds from the start subrule. The first (topmost)
228  subrule in a group of subrules is called the <b>start subrule</b>. In our example
229  above, <tt>expression</tt> is the start subrule. When a group of subrules is
230  called forth, the start subrule <tt>expression</tt> is called first.</p>
231<h3>IDs</h3>
232<p>Each subrule has a corresponding ID; an integral constant that uniquely specifies
233  the subrule. Our example above has four subrules. They are declared as:</p>
234<pre><code><span class=comment>    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt;  </span><span class=identifier>expression</span><span class=special>;
235    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>1</span><span class=special>&gt;  </span><span class=identifier>term</span><span class=special>;
236    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>2</span><span class=special>&gt;  </span><span class=identifier>factor</span><span class=special>;
237    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>3</span><span class=special>&gt;  </span><span class=identifier>group</span><span class=special>;</span></code></pre>
238<h3> Aliases</h3>
239<p>It is possible to have subrules with similar IDs. A subrule with a similar
240  ID to will be an alias of the other. Both subrules may be used interchangeably.</p>
241<pre><code><span class=special>    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt;  </span><span class=identifier>a</span><span class=special>;
242    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt;  </span><span class=identifier>alias</span><span class=special></span><span class=comment>// alias of a</span></code></pre>
243<h3>Groups: scope and nesting</h3>
244<p>The scope of a subrule and its definition is the enclosing group, typically
245  (and by convention) enclosed inside the parentheses. IDs outside a scope are
246  not directly visible. Inner subrule groups can be nested by enclosing each sub-group
247  inside another set of parentheses. Each group is unique and acts independently.
248  Consequently, while it may not be advisable to do so, a subrule in a group may
249  share the same ID as a subrule in another group since both groups are independent
250  of each other.</p>
251<pre><code><span class=comment>    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt; </span><span class=identifier>a</span><span class=special>;
252    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>1</span><span class=special>&gt; </span><span class=identifier>b</span><span class=special>;
253    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>0</span><span class=special>&gt; </span><span class=identifier>c</span><span class=special>;
254    </span><span class=identifier>subrule</span><span class=special>&lt;</span><span class=number>1</span><span class=special>&gt; </span><span class=identifier>d</span><span class=special>;
255
256    </span><span class=special>(                       </span><span class=comment>// outer subrule group, scope of a and b
257        </span><span class=identifier>a </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'a'</span><span class=special>),
258        </span><span class=identifier>b </span><span class=special>=
259        </span><span class=special>(                   </span><span class=comment>// inner subrule group, scope of b and c
260            </span><span class=identifier>c </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'c'</span><span class=special>),
261            </span><span class=identifier>d </span><span class=special>= </span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'d'</span><span class=special>)
262        </span><span class=special>)
263    </span><span class=special>)</span></code></pre>
264<p>Subrule IDs need to be unique only within a group. A grammar is an implicit
265  group. Furthermore, even subrules in a grammar may have the same IDs without
266  clashing if they are inside a group. Subrules may be explicitly grouped using
267  the parentheses. Parenthesized groups have unique scopes. In the code above,
268  the outer subrule group defines the subrules <tt>a</tt> and <tt>b</tt> while
269  the inner subrule group defines the subrules <tt>c</tt> and <tt>d</tt>. Notice
270  that the definition of <tt>b</tt> is the inner subrule.</p>
271<table border="0">
272  <tr>
273    <td width="10"></td>
274    <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td>
275    <td width="30"><a href="grammar.html"><img src="theme/l_arr.gif" border="0"></a></td>
276    <td width="30"><a href="semantic_actions.html"><img src="theme/r_arr.gif" border="0"></a></td>
277  </tr>
278</table>
279<br>
280<hr size="1">
281<p class="copyright">Copyright &copy; 1998-2003 Joel de Guzman<br>
282  <br>
283  <font size="2">Use, modification and distribution is subject to the Boost Software
284    License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at
285    http://www.boost.org/LICENSE_1_0.txt)</font></p>
286<p>&nbsp;</p>
287<p><code><font color="#000000"><font color="#0000ff"></font></font></code></p>
288</body>
289</html>
Note: See TracBrowser for help on using the repository browser.