[section Localization and Regex Traits]

[h2 Overview]

Matching a regular expression against a string often requires locale-dependent information. For example,
how are case-insensitive comparisons performed? The locale-sensitive behavior is captured in a traits class.
xpressive provides three traits class templates: `cpp_regex_traits<>`, `c_regex_traits<>` and `null_regex_traits<>`.
The first wraps a `std::locale`, the second wraps the global C locale, and the third is a stub traits type for
use when searching non-character data. All traits templates conform to the
[link boost_xpressive.user_s_guide.concepts.traits_requirements Regex Traits Concept].

[h2 Setting the Default Regex Trait]

By default, xpressive uses `cpp_regex_traits<>` for all patterns. This causes all regex objects to use
the global `std::locale`. If you compile with `BOOST_XPRESSIVE_USE_C_TRAITS` defined, then xpressive will use
`c_regex_traits<>` by default.

[h2 Using Custom Traits with Dynamic Regexes]

To create a dynamic regex that uses a custom traits object, you must use _regex_compiler_.
The basic steps are shown in the following example:

    // Declare a regex_compiler that uses the global C locale
    regex_compiler<char const *, c_regex_traits<char> > crxcomp;
    cregex crx = crxcomp.compile( "\\w+" );

    // Declare a regex_compiler that uses a custom std::locale
    std::locale loc = /* ... create a locale here ... */;
    regex_compiler<char const *, cpp_regex_traits<char> > cpprxcomp(loc);
    cregex cpprx = cpprxcomp.compile( "\\w+" );

The `regex_compiler` objects act as regex factories. Once they have been imbued with a locale,
every regex object they create will use that locale.

[h2 Using Custom Traits with Static Regexes]

If you want a particular static regex to use a different set of traits, you can use the special `imbue()`
pattern modifier. For instance:

    // Define a regex that uses the global C locale
    c_regex_traits<char> ctraits;
    sregex crx = imbue(ctraits)( +_w );

    // Define a regex that uses a customized std::locale
    std::locale loc = /* ... create a locale here ... */;
    cpp_regex_traits<char> cpptraits(loc);
    sregex cpprx1 = imbue(cpptraits)( +_w );

    // A shorthand for the above
    sregex cpprx2 = imbue(loc)( +_w );

The `imbue()` pattern modifier must wrap the entire pattern. It is an error to `imbue` only
part of a static regex. For example:

    // ERROR! Cannot imbue() only part of a regex
    sregex error = _w >> imbue(loc)( _w );

[h2 Searching Non-Character Data With [^null_regex_traits]]

With xpressive static regexes, you are not limited to searching for patterns in character sequences.
You can search for patterns in raw bytes, integers, or anything that conforms to the
[link boost_xpressive.user_s_guide.concepts.chart_requirements Char Concept]. The `null_regex_traits<>` template makes this simple. It is a
stub implementation of the [link boost_xpressive.user_s_guide.concepts.traits_requirements Regex Traits Concept]. It recognizes
no character classes and performs no case conversions.

For example, with `null_regex_traits<>`, you can write a static regex to find a pattern in a
sequence of integers as follows:

    // some integral data to search
    int const data[] = {0, 1, 2, 3, 4, 5, 6};

    // create a null_regex_traits<> object for searching integers ...
    null_regex_traits<int> nul;

    // imbue a regex object with the null_regex_traits ...
    basic_regex<int const *> rex = imbue(nul)(1 >> +((set= 2,3) | 4) >> 5);
    match_results<int const *> what;

    // search for the pattern in the array of integers ...
    regex_search(data, data + 7, what, rex);

    assert(what[0].matched);
    assert(*what[0].first == 1);
    assert(*what[0].second == 6);

[endsect]