[section:tips_n_tricks Tips 'N Tricks]

Squeeze the most performance out of xpressive with these tips and tricks.

[h2 Use Static Regexes]

On average, static regexes execute about 10 to 15% faster than their
dynamic counterparts. It's worth familiarizing yourself with the static
regex dialect.

[h2 Reuse _match_results_ Objects]

The _match_results_ object caches dynamically allocated memory. For this
reason, it is far better to reuse the same _match_results_ object if you
have to do many regex searches.

Caveat: _match_results_ objects are not thread-safe, so don't go wild
reusing them across threads.

[h2 Prefer Algorithms That Take A _match_results_ Object]

This is a corollary to the previous tip. If you are doing multiple searches,
you should prefer the regex algorithms that accept a _match_results_ object
over the ones that don't, and you should reuse the same _match_results_ object
each time. If you don't provide a _match_results_ object, a temporary one
will be created for you and discarded when the algorithm returns. Any
memory cached in the object will be deallocated and will have to be reallocated
the next time.

[h2 Prefer Algorithms That Accept Iterator Ranges Over Null-Terminated Strings]

xpressive provides overloads of the _regex_match_ and _regex_search_
algorithms that operate on C-style null-terminated strings. You should
prefer the overloads that take iterator ranges. When you pass a
null-terminated string to a regex algorithm, the end iterator is calculated
immediately by calling `strlen`. If you already know the length of the string,
you can avoid this overhead by calling the regex algorithms with a `[begin, end)`
pair.

[h2 Compile Patterns Once And Reuse Them]

Compiling a regex (dynamic or static) is more expensive than executing a
match or search. If you have the option, prefer to compile a pattern into
a _basic_regex_ object once and reuse it rather than recreating it over
and over.

[h2 Understand [^syntax_option_type::optimize]]

The `optimize` flag tells the regex compiler to spend some extra time analyzing
the pattern. It can cause some patterns to execute faster, but it increases
the time to compile the pattern, and often increases the amount of memory
consumed by the pattern. If you plan to reuse your pattern, `optimize` is
usually a win. If you will only use the pattern once, don't use `optimize`.

[h1 Common Pitfalls]

Keep the following tips in mind to avoid stepping in potholes with xpressive.

[h2 Create Grammars On A Single Thread]

With static regexes, you can create grammars by nesting regexes inside one
another. When compiling the outer regex, both the outer and inner regex objects,
and all the regex objects to which they refer either directly or indirectly, are
modified. For this reason, it's dangerous for global regex objects to participate
in grammars. It's best to build regex grammars from a single thread. Once built,
the resulting regex grammar can be executed from multiple threads without
problems.

[h2 Beware Nested Quantifiers]

This is a pitfall common to many regular expression engines. Some patterns can
cause exponentially bad performance. Often these patterns involve one quantified
term nested within another quantifier, such as `"(a*)*"`, although in many
cases, the problem is harder to spot. Beware of patterns that have nested
quantifiers.

[endsect]