1 | <html> |
---|
2 | <head> |
---|
3 | <title> Loops</title> |
---|
4 | <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> |
---|
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> |
---|
6 | </head> |
---|
7 | |
---|
8 | <body> |
---|
9 | <table width="100%" border="0" background="theme/bkd2.gif" cellspacing="2"> |
---|
10 | <tr> |
---|
11 | <td width="10"> |
---|
12 | </td> |
---|
13 | <td width="85%"> |
---|
14 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b> Loops</b></font> |
---|
15 | </td> |
---|
16 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" width="112" height="48" align="right" border="0"></a></td> |
---|
17 | </tr> |
---|
18 | </table> |
---|
19 | <br> |
---|
20 | <table border="0"> |
---|
21 | <tr> |
---|
22 | <td width="10"></td> |
---|
23 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
24 | <td width="30"><a href="escape_char_parser.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
25 | <td width="30"><a href="character_sets.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
26 | </tr> |
---|
27 | </table> |
---|
28 | <p>So far we have introduced a couple of EBNF operators that deal with looping. |
---|
29 | We have the <tt>+</tt> positive operator, which matches the preceding symbol |
---|
30 | one (1) or more times, as well as the Kleene star <tt>*</tt> which matches the |
---|
31 | preceding symbol zero (0) or more times.</p> |
---|
32 | <p>Taking this further, we may want a generalized loop operator. To some |
---|
33 | this may seem like overkill. Yet some grammars are impractical and cumbersome, |
---|
34 | if not impossible, to specify with the basic EBNF iteration syntax. |
---|
35 | Examples:</p> |
---|
36 | <blockquote> |
---|
37 | <p><img src="theme/bullet.gif" width="12" height="12"> A file name may have |
---|
38 | a maximum of 255 characters only.<br> |
---|
39 | <img src="theme/bullet.gif" width="12" height="12"> A specific bitmap file |
---|
40 | format has exactly 4096 RGB color entries. <br> |
---|
41 | <img src="theme/bullet.gif" width="12" height="12"> A 32-bit binary string |
---|
42 | (1..32 1s or 0s).</p> |
---|
43 | </blockquote> |
---|
44 | <p>Beyond the Kleene star <tt>*</tt>, the positive closure <tt>+</tt>, and |
---|
45 | the optional <tt>!</tt>, the framework provides a more flexible mechanism |
---|
46 | for looping. <br> |
---|
47 | </p> |
---|
48 | <table width="80%" border="0" align="center"> |
---|
49 | <tr> |
---|
50 | <td colspan="2" class="table_title">Loop Constructs</td> |
---|
51 | </tr> |
---|
52 | <tr> |
---|
53 | <td class="table_cells" width="26%"><b>repeat_p (n) [p]</b></td> |
---|
54 | <td class="table_cells" width="74%">Repeat <b>p</b> exactly <b>n</b> times</td> |
---|
55 | </tr> |
---|
56 | <tr> |
---|
57 | <td class="table_cells" width="26%"><b>repeat_p (n1, n2) [p]</b></td> |
---|
58 | <td class="table_cells" width="74%">Repeat <b>p</b> at least <b>n1</b> times |
---|
59 | and at most <b>n2</b> times</td> |
---|
60 | </tr> |
---|
61 | <tr> |
---|
62 | <td class="table_cells" width="26%"><b>repeat_p (n, more) [p] </b></td> |
---|
63 | <td class="table_cells" width="74%">Repeat <b>p</b> at least <b>n</b> times, |
---|
64 | continuing until <b>p</b> fails or the input is consumed</td> |
---|
65 | </tr> |
---|
66 | </table> |
---|
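<p>To make the semantics in the table concrete, here is a minimal sketch in plain C++ that does not use Spirit at all. The helper <tt>repeat_match</tt> and the predicate <tt>is_bit</tt> are hypothetical names invented for illustration. The sketch assumes greedy matching up to the upper bound, succeeding only if the lower bound was reached; it is not the library's implementation.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Spirit: greedily matches `pred` against
// `input` starting at `pos`, at least min_count and at most max_count times.
// Returns the number of characters consumed, or -1 if fewer than
// min_count matches were found.
template <typename Pred>
long repeat_match(const std::string& input, std::size_t pos,
                  std::size_t min_count, std::size_t max_count, Pred pred)
{
    assert(min_count <= max_count);  // bounds must be ordered
    std::size_t count = 0;
    // Stop at the upper bound, at end of input, or at the first mismatch.
    while (count < max_count && pos + count < input.size()
           && pred(input[pos + count]))
        ++count;
    return count >= min_count ? static_cast<long>(count) : -1;
}

// Hypothetical predicate: accepts a single binary digit.
inline bool is_bit(char ch) { return ch == '0' || ch == '1'; }
```

<p>Under these assumptions, <tt>repeat_match("1011", 0, 1, 32, is_bit)</tt> consumes four characters, while an input that starts with a non-bit character fails because the lower bound of one is not met. Passing the same value for both bounds models the exact-count form.</p>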
67 | <p>Using the <tt>repeat_p</tt> parser, we can now write our examples above:</p> |
---|
68 | <p>A file name with a maximum of 255 characters:<br> |
---|
69 | </p> |
---|
70 | <pre> <span class=identifier>valid_fname_chars </span><span class=special>= </span><span class=comment>/*..*/</span><span class=special>; |
---|
71 | </span><span class=identifier>filename </span><span class=special>= </span><span class=identifier>repeat_p</span><span class=special>(</span><span class=number>1</span><span class=special>, </span><span class=number>255</span><span class=special>)[</span><span class=identifier>valid_fname_chars</span><span class=special>];</span></pre> |
---|
72 | <p>A specific bitmap file format that has exactly 4096 RGB color entries:<span class=special><br> |
---|
73 | </span></p> |
---|
74 | <pre> <span class=identifier>uint_parser</span><span class=special><</span><span class=keyword>unsigned</span><span class=special>, </span><span class=number>16</span><span class=special>, </span><span class=number>6</span><span class=special>, </span><span class=number>6</span><span class=special>> </span><span class=identifier>rgb_p</span><span class=special>; |
---|
75 | </span><span class=identifier>bitmap </span><span class=special>= </span><span class=identifier>repeat_p</span><span class=special>(</span><span class=number>4096</span><span class=special>)[</span><span class=identifier>rgb_p</span><span class=special>];</span></pre> |
---|
76 | <p>As for the 32-bit binary string (1..32 1s or 0s), we could of course have simply |
---|
77 | used the <tt>bin_p</tt> numeric parser instead. For the sake of demonstration, |
---|
78 | however:<span class=special><br> |
---|
79 | </span></p> |
---|
80 | <pre> <span class=identifier>bin32 </span><span class=special>= </span><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>repeat_p</span><span class=special>(</span><span class=number>1</span><span class=special>, </span><span class=number>32</span><span class=special>)[</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'1'</span><span class=special>) </span><span class=special>| </span><span class=literal>'0'</span><span class=special>]];</span></pre> |
---|
81 | <table width="80%" border="0" align="center"> |
---|
82 | <tr> |
---|
83 | <td class="note_box"><img src="theme/note.gif" width="16" height="16"> Loop |
---|
84 | parsers are run-time <a href="parametric_parsers.html">parametric</a>.</td> |
---|
85 | </tr> |
---|
86 | </table> |
---|
87 | <p>Loop parsers can be dynamic. Consider parsing a binary file containing a Pascal-style |
---|
88 | length-prefixed string, where the first byte determines the length of the incoming |
---|
89 | string. Here's a sample input:</p> |
---|
90 | <blockquote> |
---|
91 | <table width="363" border="0" cellspacing="0" cellpadding="0"> |
---|
92 | <tr> |
---|
93 | <td class="dk_grey_bkd"> |
---|
94 | <table width="100%" border="0" cellspacing="2" cellpadding="2"> |
---|
95 | <tr> |
---|
96 | <td class="white_bkd" width="8%"> |
---|
97 | <div align="center">11</div> |
---|
98 | </td> |
---|
99 | <td class="white_bkd" width="8%"> |
---|
100 | <div align="center">h</div> |
---|
101 | </td> |
---|
102 | <td class="white_bkd" width="8%"> |
---|
103 | <div align="center">e</div> |
---|
104 | </td> |
---|
105 | <td class="white_bkd" width="8%"> |
---|
106 | <div align="center">l</div> |
---|
107 | </td> |
---|
108 | <td class="white_bkd" width="8%"> |
---|
109 | <div align="center">l</div> |
---|
110 | </td> |
---|
111 | <td class="white_bkd" width="8%"> |
---|
112 | <div align="center">o</div> |
---|
113 | </td> |
---|
114 | <td class="white_bkd" width="8%"> |
---|
115 | <div align="center"> _</div> |
---|
116 | </td> |
---|
117 | <td class="white_bkd" width="8%"> |
---|
118 | <div align="center">w</div> |
---|
119 | </td> |
---|
120 | <td class="white_bkd" width="8%"> |
---|
121 | <div align="center">o</div> |
---|
122 | </td> |
---|
123 | <td class="white_bkd" width="8%"> |
---|
124 | <div align="center">r</div> |
---|
125 | </td> |
---|
126 | <td class="white_bkd" width="8%"> |
---|
127 | <div align="center">l</div> |
---|
128 | </td> |
---|
129 | <td class="white_bkd" width="8%"> |
---|
130 | <div align="center">d</div> |
---|
131 | </td> |
---|
132 | </tr> |
---|
133 | </table> |
---|
134 | </td> |
---|
135 | </tr> |
---|
136 | </table> |
---|
137 | |
---|
138 | </blockquote> |
---|
139 | <p>This trivial example cannot be practically defined in traditional EBNF. Although |
---|
140 | some EBNF dialects allow more powerful repetition constructs than the Kleene |
---|
141 | star, we are still limited to parsing strings of fixed length. The nature of EBNF forces |
---|
142 | the repetition factor to be a constant. Spirit, on the other hand, allows the |
---|
143 | repetition factor to be variable at run time. We could write a grammar that |
---|
144 | accepts the input string above:</p> |
---|
145 | <pre><span class=identifier> </span><span class=keyword>int </span><span class=identifier>c</span><span class=special>; |
---|
146 | </span><span class=identifier>r </span><span class=special>= </span><span class=identifier>anychar_p</span><span class=special>[</span><span class=identifier>assign_a</span><span class=special>(</span><span class=identifier>c</span><span class=special>)] </span><span class=special>>> </span><span class=identifier>repeat_p</span><span class=special>(</span><span class=identifier>boost</span><span class=special>::</span><span class=identifier>ref</span><span class=special>(</span><span class=identifier>c</span><span class=special>))[</span><span class=identifier>anychar_p</span><span class=special>];</span></pre> |
---|
147 | <p>The expression</p> |
---|
148 | <pre> <span class=identifier>anychar_p</span><span class=special>[</span><span class=identifier>assign_a</span><span class=special>(</span><span class=identifier>c</span><span class=special>)]</span></pre> |
---|
149 | <p>extracts the first character from the input and puts it in <tt>c</tt>. What |
---|
150 | is interesting is that in addition to constants, we can also use variables as |
---|
151 | parameters to <tt>repeat_p</tt>, as demonstrated in </p> |
---|
152 | <pre> <span class=identifier>repeat_p</span><span class=special>(</span><span class=identifier>boost</span><span class=special>::</span><span class=identifier>ref</span><span class=special>(</span><span class=identifier>c</span><span class=special>)</span><span class=special>)</span><span class=special>[</span><span class=identifier>anychar_p</span><span class=special>]</span></pre> |
---|
153 | <p>Notice that <tt>boost::ref</tt> is used to reference the integer <tt>c</tt>. |
---|
154 | This usage of <tt>repeat_p</tt> makes the parser defer the evaluation of the |
---|
155 | repetition factor until it is actually needed. Continuing our example, since |
---|
156 | the value 11 is already extracted from the input, <tt>repeat_p</tt> is now |
---|
157 | expected to loop exactly 11 times.</p> |
---|
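<p>The same idea can be sketched by hand in plain C++, without Spirit, to show why the count has to be read at parse time rather than fixed in the grammar. The function <tt>parse_pascal_string</tt> is a hypothetical name used only for this illustration, and the sketch assumes the length prefix is a single unsigned byte.</p>

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not Spirit code: reads one length byte, then exactly
// that many characters. Returns true and fills `out` on success.
inline bool parse_pascal_string(const std::string& input, std::string& out)
{
    if (input.empty())
        return false;  // no length byte to read
    // The count is only known once the first byte has been consumed,
    // which is the role boost::ref(c) plays in the repeat_p expression.
    std::size_t length = static_cast<unsigned char>(input[0]);
    if (input.size() - 1 < length)
        return false;  // fewer characters than the prefix promised
    out.assign(input, 1, length);  // copy exactly `length` characters
    return true;
}
```

<p>Given a first byte of 11 followed by <tt>hello_world</tt>, this sketch extracts the eleven characters that follow; if the input is shorter than the prefix claims, it reports failure, just as a fixed-count loop fails when its parser cannot match the required number of times.</p>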
158 | <table border="0"> |
---|
159 | <tr> |
---|
160 | <td width="10"></td> |
---|
161 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
162 | <td width="30"><a href="escape_char_parser.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
163 | <td width="30"><a href="character_sets.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
164 | </tr> |
---|
165 | </table> |
---|
166 | <br> |
---|
167 | <hr size="1"> |
---|
168 | <p class="copyright">Copyright © 1998-2003 Joel de Guzman<br> |
---|
169 | <br> |
---|
170 | <font size="2">Use, modification and distribution is subject to the Boost Software |
---|
171 | License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at |
---|
172 | http://www.boost.org/LICENSE_1_0.txt) </font> </p> |
---|
173 | </body> |
---|
174 | </html> |
---|