1 | <html> |
---|
2 | <head> |
---|
3 | <!-- Generated by the Spirit (http://spirit.sf.net) QuickDoc --> |
---|
4 | <title>Distinct Parser</title> |
---|
5 | <link rel="stylesheet" href="theme/style.css" type="text/css"> |
---|
6 | </head> |
---|
7 | <body> |
---|
8 | <table width="100%" height="48" border="0" background="theme/bkd2.gif" cellspacing="2"> |
---|
9 | <tr> |
---|
10 | <td width="10"> |
---|
11 | </td> |
---|
12 | <td width="85%"> |
---|
13 | <font size="6" face="Verdana, Arial, Helvetica, sans-serif"><b>Distinct Parser </b></font></td> |
---|
14 | <td width="112"><a href="http://spirit.sf.net"><img src="theme/spirit.gif" align="right" border="0"></a></td> |
---|
15 | </tr> |
---|
16 | </table> |
---|
17 | <br> |
---|
18 | <table border="0"> |
---|
19 | <tr> |
---|
20 | <td width="10"></td> |
---|
21 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
22 | <td width="30"><a href="scoped_lock.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
23 | <td width="30"><a href="symbols.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
24 | </tr> |
---|
25 | </table> |
---|
26 | <h3>Distinct Parsers</h3><p> |
---|
27 | The distinct parsers are utility parsers which ensure that matched input is |
---|
28 | not immediately followed by a forbidden pattern. Their typical usage is to |
---|
29 | distinguish keywords from identifiers.</p> |
---|
30 | <h3>distinct_parser</h3> |
---|
31 | <p> |
---|
32 | The basic usage of the <tt>distinct_parser</tt> is to replace the <tt>str_p</tt> parser. For |
---|
33 | example, the <tt>declaration_rule</tt> in the following snippet:</p> |
---|
34 | <pre> |
---|
35 | <code><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>str_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; |
---|
36 | </span></code></pre> |
---|
37 | <p> |
---|
38 | would correctly match the input "declare abc", but it would also match the input "declareabc", which is usually not intended. To avoid this, we can |
---|
39 | use <tt>distinct_parser</tt>:</p> |
---|
40 | <code> |
---|
41 | <pre> |
---|
42 | <span class=comment>// keyword_p may be defined in the global scope |
---|
43 | </span><span class=identifier>distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"a-zA-Z0-9_"</span><span class=special>); |
---|
44 | |
---|
45 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; |
---|
46 | </span></pre> |
---|
47 | </code> |
---|
48 | <p> |
---|
49 | The <tt>keyword_p</tt> works in the same way as the <tt>str_p</tt> parser but matches only |
---|
50 | when the matched input is not immediately followed by one of the characters |
---|
51 | from the set passed to the constructor of <tt>keyword_p</tt>. In the example, |
---|
52 | "declare" can't be immediately followed by a letter, a |
---|
53 | digit, or an underscore.</p> |
---|
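<p>
The matching rule described above can be sketched in plain C++ without Spirit. This is only an illustration of the semantics, not the library's implementation: <tt>is_identifier_char</tt> and <tt>matches_keyword</tt> are hypothetical helper names, and the predicate stands in for the character set <tt>"a-zA-Z0-9_"</tt>.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the character set "a-zA-Z0-9_".
bool is_identifier_char(char c)
{
    return std::isalnum(static_cast<unsigned char>(c)) != 0 || c == '_';
}

// Hypothetical helper: true if `input` begins with `keyword` and the keyword
// is not immediately followed by a character from the forbidden set.
bool matches_keyword(const std::string& input, const std::string& keyword)
{
    // The keyword itself must be present at the start of the input.
    if (input.compare(0, keyword.size(), keyword) != 0)
        return false;

    // End of input right after the keyword is acceptable; otherwise the
    // next character must not be an identifier character.
    return input.size() == keyword.size()
        || !is_identifier_char(input[keyword.size()]);
}
```

<p>
Under these assumptions, <tt>matches_keyword("declare abc", "declare")</tt> succeeds, while <tt>matches_keyword("declareabc", "declare")</tt> fails because the keyword is directly followed by a letter.</p>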
54 | <p> |
---|
55 | See the full <a href="../example/fundamental/distinct/distinct_parser.cpp">example here </a>.</p> |
---|
56 | <h3>distinct_directive</h3><p> |
---|
57 | For more sophisticated cases, for example when keywords are stored in a |
---|
58 | symbol table, we can use <tt>distinct_directive</tt>.</p> |
---|
59 | <pre> |
---|
60 | <code><span class=identifier>distinct_directive</span><span class=special><> </span><span class=identifier>keyword_d</span><span class=special>(</span><span class=string>"a-zA-Z0-9_"</span><span class=special>); |
---|
61 | |
---|
62 | </span><span class=identifier>symbols</span><span class=special><> </span><span class=identifier>keywords</span><span class=special>; </span><span class=identifier>keywords </span><span class=special>= </span><span class=string>"declare"</span><span class=special>, </span><span class=string>"begin"</span><span class=special>, </span><span class=string>"end"</span><span class=special>; |
---|
63 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>keyword </span><span class=special>= </span><span class=identifier>keyword_d</span><span class=special>[</span><span class=identifier>keywords</span><span class=special>]; |
---|
64 | </span></code></pre> |
---|
65 | <h3>dynamic_distinct_parser and dynamic_distinct_directive</h3><p> |
---|
66 | In some cases a set of forbidden follow-up characters is not sufficient. |
---|
67 | For example, ASN.1 naming conventions allow identifiers to contain dashes, |
---|
68 | but not double dashes (which mark the beginning of a comment). |
---|
69 | Furthermore, identifiers can't end with a dash. So, a matched keyword can't |
---|
70 | be followed by any alphanumeric character or exactly one dash, but can be |
---|
71 | followed by two dashes.</p> |
---|
72 | <p> |
---|
73 | This is when <tt>dynamic_distinct_parser</tt> and the <tt>dynamic_distinct_directive </tt>come into play. The constructor of the <tt>dynamic_distinct_parser</tt> accepts a |
---|
74 | parser which matches any input that <strong>must NOT</strong> follow the keyword.</p> |
---|
75 | <pre> |
---|
76 | <code><span class=comment>// Alphanumeric characters and a dash followed by a non-dash |
---|
77 | // may not follow an ASN.1 identifier. |
---|
78 | </span><span class=identifier>dynamic_distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=identifier>alnum_p </span><span class=special>| (</span><span class=literal>'-' </span><span class=special>>> ~</span><span class=identifier>ch_p</span><span class=special>(</span><span class=literal>'-'</span><span class=special>))); |
---|
79 | |
---|
80 | </span><span class=identifier>rule</span><span class=special><</span><span class="identifier">ScannerT</span><span class=special>> </span><span class=identifier>declaration_rule </span><span class=special>= </span><span class=identifier>keyword_p</span><span class=special>(</span><span class=string>"declare"</span><span class=special>) >> </span><span class=identifier>lexeme_d</span><span class=special>[+</span><span class=identifier>alpha_p</span><span class=special>]; |
---|
81 | </span></code></pre> |
---|
82 | <p> |
---|
83 | Since the <tt>dynamic_distinct_parser</tt> internally uses a rule, its type is |
---|
84 | dependent on the scanner type. So, the <tt>keyword_p</tt> shouldn't be defined |
---|
85 | globally, but rather within the grammar.</p> |
---|
86 | <p> |
---|
87 | See the full <a href="../example/fundamental/distinct/distinct_parser_dynamic.cpp">example here</a>.</p> |
---|
88 | <h3>How it works</h3><p> |
---|
89 | When the <tt>keyword_p_1</tt> and the <tt>keyword_p_2</tt> are defined as</p> |
---|
90 | <code><pre> |
---|
91 | <span class=identifier>distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>); |
---|
92 | </span><span class=identifier>dynamic_distinct_parser</span><span class=special><> </span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>); |
---|
93 | </span></pre></code> |
---|
94 | <p> |
---|
95 | the parsers</p> |
---|
96 | <code><pre> |
---|
97 | <span class=identifier>keyword_p_1</span><span class=special>(</span><span class=identifier>str</span><span class=special>) |
---|
98 | </span><span class=identifier>keyword_p_2</span><span class=special>(</span><span class=identifier>str</span><span class=special>) |
---|
99 | </span></pre></code> |
---|
100 | <p> |
---|
101 | are equivalent to the rules</p> |
---|
102 | <code><pre> |
---|
103 | <span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) >> ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>chset_p</span><span class=special>(</span><span class=identifier>forbidden_chars</span><span class=special>))] |
---|
104 | </span><span class=identifier>lexeme_d</span><span class=special>[</span><span class=identifier>chseq_p</span><span class=special>(</span><span class=identifier>str</span><span class=special>) >> ~</span><span class=identifier>epsilon_p</span><span class=special>(</span><span class=identifier>forbidden_tail_parser</span><span class=special>)] |
---|
105 | </span></pre></code> |
---|
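<p>
The second equivalence, a keyword followed by a negative lookahead on a forbidden tail, can be sketched in plain C++ as well. This is a rough illustration under stated assumptions rather than Spirit code: <tt>TailPredicate</tt>, <tt>match_distinct</tt> and <tt>asn1_forbidden_tail</tt> are hypothetical names, and the tail predicate mirrors the ASN.1 rule <tt>alnum_p | ('-' >> ~ch_p('-'))</tt> used earlier.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical type: inspects the input starting at `pos` without consuming
// anything and reports whether a forbidden tail begins there.
using TailPredicate = std::function<bool(const std::string&, std::size_t)>;

// Hypothetical helper: returns the number of characters consumed (the keyword
// length) on success, or std::string::npos when the match is rejected.
std::size_t match_distinct(const std::string& input,
                           const std::string& keyword,
                           const TailPredicate& forbidden_tail)
{
    // Counterpart of chseq_p(str): the keyword must be present.
    if (input.compare(0, keyword.size(), keyword) != 0)
        return std::string::npos;

    // Counterpart of ~epsilon_p(tail): fail if the tail matches, but never
    // consume any of it.
    if (forbidden_tail(input, keyword.size()))
        return std::string::npos;

    return keyword.size();
}

// Sketch of alnum_p | ('-' >> ~ch_p('-')) as a lookahead predicate.
bool asn1_forbidden_tail(const std::string& s, std::size_t pos)
{
    if (pos >= s.size())
        return false;
    if (std::isalnum(static_cast<unsigned char>(s[pos])) != 0)
        return true;
    // A dash is forbidden only when a following non-dash character exists.
    return s[pos] == '-' && pos + 1 < s.size() && s[pos + 1] != '-';
}
```

<p>
With this sketch, <tt>match_distinct("declare--note", "declare", asn1_forbidden_tail)</tt> accepts the keyword and leaves the comment marker unconsumed, whereas <tt>"declare-x"</tt> is rejected because a single dash would continue an identifier.</p>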
106 | <table border="0"> |
---|
107 | <tr> |
---|
108 | <td width="10"></td> |
---|
109 | <td width="30"><a href="../index.html"><img src="theme/u_arr.gif" border="0"></a></td> |
---|
110 | <td width="30"><a href="scoped_lock.html"><img src="theme/l_arr.gif" border="0"></a></td> |
---|
111 | <td width="30"><a href="symbols.html"><img src="theme/r_arr.gif" border="0"></a></td> |
---|
112 | </tr> |
---|
113 | </table> |
---|
114 | <br> |
---|
115 | <hr size="1"> |
---|
116 | <p class="copyright">Copyright © 2003-2004 |
---|
117 | |
---|
118 | |
---|
119 | Vaclav Vesely<br><br> |
---|
120 | <font size="2">Use, modification and distribution is subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt) </font> </p> |
---|
121 | </body> |
---|
122 | </html> |
---|