1 | <html> |
---|
2 | <head> |
---|
3 | <title>Directives</title> |
---|
4 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
---|
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> |
---|
6 | </head> |
---|
7 | |
---|
8 | <body> |
---|
9 | <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
---|
10 | <tr> |
---|
11 | <td width="10"> |
---|
12 | </td> |
---|
13 | <td width="85%"> |
---|
14 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Directives</b></font> |
---|
15 | </td> |
---|
16 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
---|
17 | </tr> |
---|
18 | </table> |
---|
19 | <br> |
---|
20 | <table border="0"> |
---|
21 | <tr> |
---|
22 | <td width="10"></td> |
---|
23 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
24 | <td width="30"><a href="epsilon.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
25 | <td width="30"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
26 | </tr> |
---|
27 | </table> |
---|
28 | <p>Parser directives have the form: <b>directive[expression]</b></p> |
---|
29 | <p>A directive modifies the behavior of its enclosed expression, essentially <em>decorating</em> |
---|
30 | it. The framework pre-defines a few directives. Clients of the framework are |
---|
31 | free to define their own directives as needed. Information on how this is done |
---|
32 | will be provided later. For now, we shall deal only with predefined directives.</p> |
---|
33 | <h2>lexeme_d</h2> |
---|
34 | <p>Turns off white space skipping. At the phrase level, the parser ignores white |
---|
35 | spaces, possibly including comments. Use <tt>lexeme_d</tt> in situations where |
---|
36 | we want to work at the character level instead of the phrase level. Parsers |
---|
37 | can be made to work at the character level by enclosing the pertinent parts |
---|
38 | inside the <tt>lexeme_d</tt> directive. For example, let us complete the example presented |
---|
39 | in the <a href="introduction.html">Introduction</a>. There, we skipped the definition |
---|
40 | of the <tt>integer</tt> rule. Here's how it is actually defined:</p> |
---|
41 | <pre><code><font color="#000000"><span class=identifier> </span><span class=identifier>integer </span><span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[ </span><span class=special>!(</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'+'</span><span class=special>) </span><span class=special>| </span><span class=literal>'-'</span><span class=special>) </span><span class=special>>> </span><span class=special>+</span><span class=identifier>digit </span><span class=special>];</span></font></code></pre> |
---|
42 | <p>The <tt>lexeme_d</tt> directive instructs the parser to work on the character |
---|
43 | level. Without it, the <tt>integer</tt> rule would have allowed erroneous embedded |
---|
44 | white spaces in inputs such as <span class="quotes">"1 2 345"</span> |
---|
45 | which would be parsed as <span class="quotes">"12345"</span>.</p> |
---|
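<p>The difference between the two levels can be sketched without Spirit at all. The following is a minimal standalone model, not the framework's implementation: <tt>match_integer</tt> and its <tt>skip_ws</tt> flag are hypothetical names introduced here for illustration, and the model assumes that leading white space is skipped in both modes.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Standalone model, not Spirit code. match_integer() is a hypothetical helper
// that mimics   !(ch_p('+') | '-') >> +digit   and returns the number of
// characters consumed, or -1 on no-match.
//   skip_ws == true : white space is skipped before each primitive (phrase level)
//   skip_ws == false: nothing is skipped inside the integer (as under lexeme_d)
// Leading white space is skipped in both modes in this model.
long match_integer(const std::string& in, bool skip_ws)
{
    std::size_t i = 0;
    auto skip = [&in, &i] {
        while (i < in.size() && std::isspace(static_cast<unsigned char>(in[i])))
            ++i;
    };
    auto is_digit_at = [&in](std::size_t k) {
        return k < in.size() && std::isdigit(static_cast<unsigned char>(in[k])) != 0;
    };

    skip();
    if (i < in.size() && (in[i] == '+' || in[i] == '-'))
        ++i;                              // optional sign

    std::size_t digits = 0;
    for (;;) {
        const std::size_t save = i;       // restore point if no digit follows
        if (skip_ws)
            skip();
        if (!is_digit_at(i)) {
            i = save;                     // give back any skipped white space
            break;
        }
        ++i;
        ++digits;
    }
    return digits > 0 ? static_cast<long>(i) : -1;
}
```

<p>In this model, <tt>match_integer("1 2 345", true)</tt> consumes all 7 characters, while <tt>match_integer("1 2 345", false)</tt> stops after the first digit and consumes only 1.</p>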
46 | <h2>as_lower_d</h2> |
---|
47 | <p>There are times when we want to inhibit case sensitivity. The <tt>as_lower_d</tt> |
---|
48 | directive converts all characters from the input to lower-case.</p> |
---|
49 | <table width="80%" border="0" align="center"> |
---|
50 | <tr> |
---|
51 | <td class="note_box"><img src="theme/alert.gif" width="16" height="16"><b> |
---|
52 | as_lower_d behavior</b> <br> |
---|
53 | <br> |
---|
54 | It is important to note that only the input is converted to lower case. |
---|
55 | Parsers enclosed inside the <tt>as_lower_d</tt> expecting upper case characters |
---|
56 | will fail to parse. Example: <tt>as_lower_d[<span class="quotes">'X'</span>]</tt> |
---|
57 | will never succeed because it expects an upper case <tt class="quotes">'X'</tt> |
---|
58 | that the <tt>as_lower_d</tt> directive will never supply.</td> |
---|
59 | </tr> |
---|
60 | </table> |
---|
61 | <p>For example, Pascal keywords and identifiers are case insensitive: the |
---|
62 | identifiers <tt>Id</tt>, <tt>ID</tt> and <tt>id</tt> are indistinguishable. Without the |
---|
63 | <tt>as_lower_d</tt> directive, it |
---|
64 | would be awkward to define a rule that recognizes this. Here's a possibility:</p> |
---|
65 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"id"</span><span class=special>) </span><span class=special>| </span><span class=string>"Id" </span><span class=special>| </span><span class=string>"iD" </span><span class=special>| </span><span class=string>"ID"</span><span class=special>;</span></font></code></pre> |
---|
66 | <p>Now, try doing that with the case insensitive Pascal keyword <span class="quotes">"BEGIN"</span>. |
---|
67 | The <tt>as_lower_d</tt> directive makes this simple:</p> |
---|
68 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>as_lower_d</span><span class=special>[</span><span class=string>"begin"</span><span class=special>];</span></font></code></pre> |
---|
69 | <table width="80%" border="0" align="center"> |
---|
70 | <tr> |
---|
71 | <td class="note_box"><div align="justify"><img src="theme/note.gif" width="16" height="16"> |
---|
72 | <b>Primitive arguments</b> <br> |
---|
73 | <br> |
---|
74 | The astute reader will notice that we did not explicitly wrap <span class="quotes">"begin"</span> |
---|
75 | inside an <tt>str_p</tt>. Whenever appropriate, directives should accept |
---|
76 | arguments of primitive types such as <tt>char</tt>, <tt>int</tt>, <tt>wchar_t</tt>, |
---|
77 | <tt>char const<span class="operators">*</span></tt>, <tt>wchar_t const<span class="operators">*</span></tt> |
---|
78 | and so on. Examples: <tt><br> |
---|
79 | <br> |
---|
80 | </tt><code><span class=identifier>as_lower_d</span><tt><span class=special>[</span><span class=string>"hello"</span><span class=special>] |
---|
81 | </span><span class=comment>// same as as_lower_d[str_p("hello")]</span></tt><code></code><span class=identifier><br> |
---|
82 | as_lower_d</span><span class=special>[</span><span class=literal>'x'</span><span class=special>] |
---|
83 | </span><span class=comment>// same as as_lower_d[ch_p('x')]</span></code></div></td> |
---|
84 | </tr> |
---|
85 | </table> |
---|
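<p>The behavior described in the alert box above (only the input is converted, never the parser's own literal) can be modeled in a few lines. This is a standalone sketch rather than Spirit code, and <tt>match_as_lower</tt> is a hypothetical helper introduced only for illustration.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Standalone model, not Spirit code. match_as_lower() is a hypothetical helper:
// each input character is lower-cased and then compared with the literal
// exactly as written. Returns true if the input starts with a match.
bool match_as_lower(const std::string& input, const std::string& literal)
{
    if (input.size() < literal.size())
        return false;
    for (std::size_t k = 0; k < literal.size(); ++k) {
        const char seen = static_cast<char>(
            std::tolower(static_cast<unsigned char>(input[k])));
        if (seen != literal[k])           // the literal itself is not converted
            return false;
    }
    return true;
}
```

<p>Under this model, <tt>match_as_lower("BEGIN", "begin")</tt> and <tt>match_as_lower("Begin", "begin")</tt> both succeed, whereas <tt>match_as_lower("X", "X")</tt> fails because the upper case literal is never supplied.</p>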
86 | <h2>no_actions_d</h2> |
---|
87 | <p>There are cases where you want <a href="semantic_actions.html">semantic actions</a> |
---|
88 | not to be triggered. By enclosing a parser in the <tt>no_actions_d</tt> directive, |
---|
89 | all semantic actions directly or indirectly attached to the parser will not |
---|
90 | fire. </p> |
---|
91 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>no_actions_d</span><span class=special>[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre> |
---|
92 | <h3>Tweaking the Scanner Type</h3> |
---|
93 | <p><img src="theme/note.gif" width="16" height="16"> How does <tt>lexeme_d, as_lower_d</tt> |
---|
94 | and <font color="#000000"><tt>no_actions_d</tt></font> work? These directives |
---|
95 | do their magic by tweaking the scanner policies. Well, you don't need to know |
---|
96 | what that means for now. Scanner policies are discussed <a href="indepth_the_scanner.html">later</a>. |
---|
97 | However, it is important to note that when the scanner policy is tweaked, the |
---|
98 | result is a different scanner. Why is this important to note? The <a href="rule.html">rule</a> |
---|
99 | is tied to a particular scanner (one or more scanners, to be precise). If you |
---|
100 | wrap a rule inside a <tt>lexeme_d, as_lower_d</tt> or <font color="#000000"><tt>no_actions_d</tt>, the |
---|
101 | compiler will complain about <a href="faq.html#scanner_business">scanner mismatch</a> |
---|
102 | unless you associate the required scanner with the rule. </font></p> |
---|
103 | <p><tt>lexeme_scanner</tt>, <tt>as_lower_scanner</tt> and <tt>no_actions_scanner</tt> |
---|
104 | are your friends if the need to wrap a rule inside these directives arises. Learn |
---|
105 | about these beasts in the next chapter on <a href="scanner.html#lexeme_scanner">The |
---|
106 | Scanner and Parsing</a>.</p> |
---|
107 | <h2>longest_d</h2> |
---|
108 | <p>Alternatives in the Spirit parser compiler are short-circuited (see <a href="operators.html">Operators</a>). |
---|
109 | Sometimes, this is not what is desired. The <tt>longest_d</tt> directive instructs |
---|
110 | the parser not to short-circuit alternatives enclosed inside this directive, |
---|
111 | but instead makes the parser try all possible alternatives and choose the one |
---|
112 | matching the longest portion of the input stream.</p> |
---|
113 | <p>Consider the parsing of integers and real numbers:</p> |
---|
114 | <pre><code><font color="#000000"><span class=comment> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>real </span><span class=special>| </span><span class=identifier>integer</span><span class=special>;</span></font></code></pre> |
---|
115 | <p>A number can be a real or an integer. This grammar is ambiguous. An input <span class="quotes">"1234"</span> |
---|
116 | should potentially match both real and integer. Recall though that alternatives |
---|
117 | are short-circuited. Thus, for inputs such as the above, the real alternative always |
---|
118 | wins. However, if we swap the alternatives:</p> |
---|
119 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real</span><span class=special>;</span></font></code></pre> |
---|
120 | <p>we still have a problem. Now, an input <span class="quotes">"123.456"</span> |
---|
121 | will be partially matched by integer until the decimal point. This is not what |
---|
122 | we want. The solution here is either to fix the ambiguity by factoring out the |
---|
123 | common prefixes of real and integer or, if that is not possible or not desired, |
---|
124 | use the <tt>longest_d</tt> directive:</p> |
---|
125 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>number </span><span class=special>= </span><span class=identifier>longest_d</span><span class=special>[ </span><span class=identifier>integer </span><span class=special>| </span><span class=identifier>real </span><span class=special>];</span></font></code></pre> |
---|
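<p>To make the selection policies concrete, here is a small standalone model that does not use Spirit. The names <tt>integer_len</tt>, <tt>real_len</tt>, <tt>first_match</tt>, <tt>longest_match</tt> and <tt>shortest_match</tt> are hypothetical, and the two matchers are deliberately simplified (no sign, no exponent); each returns the number of characters matched, or -1 on no-match.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Standalone model, not Spirit code; all names are hypothetical.
// A Matcher returns the number of characters it matches at the start of
// the input, or -1 on no-match.
using Matcher = long (*)(const std::string&);

std::size_t count_digits(const std::string& in, std::size_t pos)
{
    std::size_t n = 0;
    while (pos + n < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[pos + n])))
        ++n;
    return n;
}

// integer: one or more digits
long integer_len(const std::string& in)
{
    const std::size_t n = count_digits(in, 0);
    return n > 0 ? static_cast<long>(n) : -1;
}

// real: one or more digits, optionally followed by '.' and more digits
long real_len(const std::string& in)
{
    const std::size_t whole = count_digits(in, 0);
    if (whole == 0)
        return -1;
    if (whole < in.size() && in[whole] == '.') {
        const std::size_t frac = count_digits(in, whole + 1);
        if (frac > 0)
            return static_cast<long>(whole + 1 + frac);
    }
    return static_cast<long>(whole);
}

// Short-circuit: the first alternative that matches wins.
long first_match(const std::string& in, const std::vector<Matcher>& alts)
{
    for (Matcher alt : alts) {
        const long n = alt(in);
        if (n >= 0)
            return n;
    }
    return -1;
}

// Try every alternative and keep the longest match.
long longest_match(const std::string& in, const std::vector<Matcher>& alts)
{
    long best = -1;
    for (Matcher alt : alts)
        best = std::max(best, alt(in));
    return best;
}

// Try every alternative and keep the shortest successful match.
long shortest_match(const std::string& in, const std::vector<Matcher>& alts)
{
    long best = -1;
    for (Matcher alt : alts) {
        const long n = alt(in);
        if (n >= 0 && (best < 0 || n < best))
            best = n;
    }
    return best;
}
```

<p>With the alternatives ordered as <tt>{integer_len, real_len}</tt> and the input <tt>"123.456"</tt>, <tt>first_match</tt> stops at the decimal point and reports 3, while <tt>longest_match</tt> reports 7.</p>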
126 | <h2>shortest_d</h2> |
---|
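127 | <p>The opposite of the <tt>longest_d</tt> directive: the parser tries all the enclosed alternatives and chooses the one matching the shortest portion of the input stream.</p> |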
---|
128 | <table width="80%" border="0" align="center"> |
---|
129 | <tr> |
---|
130 | <td class="note_box"><img src="theme/note.gif" width="16" height="16"> <b>Multiple |
---|
131 | alternatives</b> <br> |
---|
132 | <br> |
---|
133 | The <tt>longest_d</tt> and <tt>shortest_d</tt> directives can accept two |
---|
134 | or more alternatives. Examples:<br> |
---|
135 | <br> |
---|
136 | <font color="#000000"><span class=identifier><code>longest_d</code></span><code><span class=special>[ |
---|
137 | </span><span class=identifier>a </span><span class=special>| </span><span class=identifier>b |
---|
138 | </span><span class=special>| </span><span class=identifier>c </span><span class=special>]; |
---|
139 | </span><span class=identifier><br> |
---|
140 | shortest_d</span><span class=special>[ </span><span class=identifier>a </span><span class=special>| |
---|
141 | </span><span class=identifier>b </span><span class=special>| </span><span class=identifier>c |
---|
142 | </span><span class=special>| </span><span class=identifier>d </span><span class=special>];</span></code></font></td> |
---|
143 | </tr> |
---|
144 | </table> |
---|
145 | <h2>limit_d</h2> |
---|
146 | <p>Ensures that the result of a parser is constrained to a given min..max range |
---|
147 | (inclusive). If not, then the parser fails and returns a no-match.</p> |
---|
148 | <p><b>Usage:</b></p> |
---|
149 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>, </span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre> |
---|
150 | <p>This directive is particularly useful in conjunction with parsers that parse |
---|
151 | specific scalar ranges (for example, <a href="numerics.html">numeric parsers</a>). |
---|
152 | Here's a practical example. Although the numeric parsers can be configured to |
---|
153 | accept only a limited number of digits (say, 0..2), there is no way to limit |
---|
154 | the result to a range (say -1.0..1.0). This design is deliberate. Doing so would |
---|
155 | have undermined Spirit's design rule that <i><span class="quotes">"the |
---|
156 | client should not pay for features that she does not use"</span></i>. We |
---|
157 | would have stored the min, max values in the numeric parser itself, used or |
---|
158 | unused. We could get by with static constants configured by a non-type |
---|
159 | template parameter, but that is not acceptable because that way, we can only |
---|
160 | accommodate integers. What about real numbers or user defined numbers such as |
---|
161 | big-ints?</p> |
---|
162 | <p><b>Example</b>, parse time of the form <b>HH:MM:SS</b>:</p> |
---|
163 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>uint_parser</span><span class=special><</span><span class=keyword>int</span><span class=special>, </span><span class=number>10</span><span class=special>, </span><span class=number>2</span><span class=special>, </span><span class=number>2</span><span class=special>> </span><span class=identifier>uint2_p</span><span class=special>; |
---|
164 | |
---|
165 | </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>lexeme_d |
---|
166 | </span><span class=special>[ |
---|
167 | </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>23u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>>> </span><span class=literal>':' </span><span class=comment>// Hours 00..23 |
---|
168 | </span><span class=special>>> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=special>>> </span><span class=literal>':' </span><span class=comment>// Minutes 00..59 |
---|
169 | </span><span class=special>>> </span><span class=identifier>limit_d</span><span class=special>(</span><span class=number>0u</span><span class=special>, </span><span class=number>59u</span><span class=special>)[</span><span class=identifier>uint2_p</span><span class=special>] </span><span class=comment>// Seconds 00..59 |
---|
170 | </span><span class=special>];</span></font></code> |
---|
171 | </pre> |
---|
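<p>The effect of the range check can be illustrated with a standalone model that does not depend on Spirit. In this sketch, <tt>two_digits_in_range</tt>, <tt>colon</tt> and <tt>parse_time</tt> are hypothetical helpers; the first one plays the role of <tt>limit_d(min, max)[uint2_p]</tt> by parsing exactly two digits and then rejecting any value outside the inclusive range.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Standalone model, not Spirit code; all names are hypothetical.
// two_digits_in_range() plays the role of limit_d(min, max)[uint2_p]:
// it first parses exactly two decimal digits, then reports a no-match
// (consuming nothing) if the value lies outside [min, max].
bool two_digits_in_range(const std::string& in, std::size_t& pos,
                         unsigned min, unsigned max, unsigned& value)
{
    assert(min <= max);
    if (pos + 2 > in.size())
        return false;
    const unsigned char hi = static_cast<unsigned char>(in[pos]);
    const unsigned char lo = static_cast<unsigned char>(in[pos + 1]);
    if (!std::isdigit(hi) || !std::isdigit(lo))
        return false;
    const unsigned parsed = (hi - '0') * 10u + (lo - '0');
    if (parsed < min || parsed > max)     // the limit check follows the parse
        return false;
    value = parsed;
    pos += 2;
    return true;
}

// Matches a single ':' separator.
bool colon(const std::string& in, std::size_t& pos)
{
    if (pos < in.size() && in[pos] == ':') {
        ++pos;
        return true;
    }
    return false;
}

// Accepts HH:MM:SS with no embedded white space and no trailing input.
bool parse_time(const std::string& in, unsigned& h, unsigned& m, unsigned& s)
{
    std::size_t pos = 0;
    return two_digits_in_range(in, pos, 0u, 23u, h) && colon(in, pos)   // hours
        && two_digits_in_range(in, pos, 0u, 59u, m) && colon(in, pos)   // minutes
        && two_digits_in_range(in, pos, 0u, 59u, s)                     // seconds
        && pos == in.size();
}
```

<p>In this model, <tt>"23:59:59"</tt> is accepted, while <tt>"24:00:00"</tt> and <tt>"12:60:00"</tt> are rejected by the range check and <tt>"1:02:03"</tt> is rejected because the hours field does not have two digits.</p>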
172 | <h2>min_limit_d</h2> |
---|
173 | <p>Sometimes, it is useful to unconstrain just the maximum limit. This will allow |
---|
174 | for an interval that's unbounded in one direction. The <tt>min_limit_d</tt> directive |
---|
175 | ensures that the result of a parser is not less than the minimum. If not, then the |
---|
176 | parser fails and returns a no-match.</p> |
---|
177 | <p><b>Usage:</b></p> |
---|
178 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>expression</span><span class=special>]</span></font></code></pre> |
---|
179 | <p><b>Example</b>, ensure that a year is not less than 1900:</p> |
---|
180 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=number>1900u</span><span class=special>)[</span><span class=identifier>uint_p</span><span class=special>]</span></font></code></pre> |
---|
181 | <h2>max_limit_d</h2> |
---|
182 | <p>Opposite of <tt>min_limit_d</tt>: ensures that the result of a parser is not greater than the maximum. Take note that <tt>limit_d(min, max)[p]</tt> is equivalent |
---|
183 | to:</p> |
---|
184 | <pre><code><font color="#000000"><span class=special> </span><span class=identifier>min_limit_d</span><span class=special>(</span><span class=identifier>min</span><span class=special>)[</span><span class=identifier>max_limit_d</span><span class=special>(</span><span class=identifier>max</span><span class=special>)[</span><span class=identifier>p</span><span class=special>]]</span></font></code></pre> |
---|
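<p>The equivalence above can be sketched as plain function composition. This is a standalone model under stated assumptions, not Spirit's implementation: <tt>Parser</tt>, <tt>parse_uint</tt>, <tt>min_limit</tt>, <tt>max_limit</tt> and <tt>limit</tt> are hypothetical names, and <tt>parse_uint</tt> only accepts strings of 1 to 9 digits so that the value fits in a <tt>long</tt>.</p>

```cpp
#include <cassert>
#include <cctype>
#include <functional>
#include <string>

// Standalone model, not Spirit code; all names are hypothetical.
// A Parser reports success and stores the parsed number in `value`.
using Parser = std::function<bool(const std::string&, long&)>;

// Parses a string made up of 1 to 9 decimal digits.
bool parse_uint(const std::string& in, long& value)
{
    if (in.empty() || in.size() > 9)
        return false;
    long result = 0;
    for (char c : in) {
        if (!std::isdigit(static_cast<unsigned char>(c)))
            return false;
        result = result * 10 + (c - '0');
    }
    value = result;
    return true;
}

// Succeeds only if p succeeds and its result is not less than min.
Parser min_limit(long min, Parser p)
{
    return [min, p](const std::string& in, long& value) -> bool {
        long parsed = 0;
        if (!p(in, parsed) || parsed < min)
            return false;
        value = parsed;
        return true;
    };
}

// Succeeds only if p succeeds and its result is not greater than max.
Parser max_limit(long max, Parser p)
{
    return [max, p](const std::string& in, long& value) -> bool {
        long parsed = 0;
        if (!p(in, parsed) || parsed > max)
            return false;
        value = parsed;
        return true;
    };
}

// The inclusive range check is the composition of the two one-sided checks.
Parser limit(long min, long max, Parser p)
{
    assert(min <= max);
    return min_limit(min, max_limit(max, p));
}
```

<p>For instance, <tt>min_limit(1900, parse_uint)</tt> accepts <tt>"1999"</tt> and rejects <tt>"1899"</tt>, and <tt>limit(0, 23, parse_uint)</tt> accepts <tt>"23"</tt> and rejects <tt>"24"</tt>.</p>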
185 | <table border="0"> |
---|
186 | <tr> |
---|
187 | <td width="10"></td> |
---|
188 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
189 | <td width="30"><a href="epsilon.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
190 | <td width="30"><a href="scanner.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
191 | </tr> |
---|
192 | </table> |
---|
193 | <br> |
---|
194 | <hr size="1"> |
---|
195 | <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> |
---|
196 | <br> |
---|
197 | <font size="2">Use, modification and distribution is subject to the Boost Software |
---|
198 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
---|
199 | http://www.boost.org/LICENSE_1_0.txt) </font> </p> |
---|
200 | <p> </p> |
---|
201 | </body> |
---|
202 | </html> |
---|