Boost.Regex

Collating Element Names


Contents

Digraphs
POSIX Symbolic Names
Named Unicode Characters

Digraphs

The following are treated as valid digraphs when used as a collating name:

"ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss", "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ".

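As a small illustration of the list above, the sketch below checks whether a name is one of the listed digraphs. The helper is_listed_digraph and the table kDigraphs are hypothetical names introduced for this example; this is not how Boost.Regex itself stores or looks up digraphs.

```cpp
#include <string>

// The digraphs listed above, copied verbatim.
static const char* const kDigraphs[] = {
    "ae", "Ae", "AE", "ch", "Ch", "CH", "ll", "Ll", "LL", "ss", "Ss",
    "SS", "nj", "Nj", "NJ", "dz", "Dz", "DZ", "lj", "Lj", "LJ"};

// Hypothetical helper: true if `name` is one of the listed digraphs.
// The comparison is case-sensitive, so a mixed form such as "aE",
// which is absent from the list, is rejected.
bool is_listed_digraph(const std::string& name) {
    for (const char* digraph : kDigraphs) {
        if (name == digraph) {
            return true;
        }
    }
    return false;
}
```

Each digraph is listed in exactly three forms (lower case, capitalised, and upper case), which is why the sketch compares names exactly instead of folding case. In a pattern, a digraph is written in the usual collating element form, for example [[.ae.]].
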
POSIX Symbolic Names

The following symbolic names are recognised as valid collating element names, in addition to any single character:

Name Character
NUL \x00
SOH \x01
STX \x02
ETX \x03
EOT \x04
ENQ \x05
ACK \x06
alert \x07
backspace \x08
tab \t
newline \n
vertical-tab \v
form-feed \f
carriage-return \r
SO \x0E
SI \x0F
DLE \x10
DC1 \x11
DC2 \x12
DC3 \x13
DC4 \x14
NAK \x15
SYN \x16
ETB \x17
CAN \x18
EM \x19
SUB \x1A
ESC \x1B
IS4 \x1C
IS3 \x1D
IS2 \x1E
IS1 \x1F
space \x20
exclamation-mark !
quotation-mark "
number-sign #
dollar-sign $
percent-sign %
ampersand &
apostrophe '
left-parenthesis (
right-parenthesis )
asterisk *
plus-sign +
comma ,
hyphen -
period .
slash /
zero 0
one 1
two 2
three 3
four 4
five 5
six 6
seven 7
eight 8
nine 9
colon :
semicolon ;
less-than-sign <
equals-sign =
greater-than-sign >
question-mark ?
commercial-at @
left-square-bracket [
backslash \
right-square-bracket ]
circumflex ^
underscore _
grave-accent `
left-curly-bracket {
vertical-line |
right-curly-bracket }
tilde ~
DEL \x7F
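
To show how a symbolic name resolves to a character, the sketch below uses the C++ standard library's std::regex in place of Boost.Regex, so that it builds without Boost. This is a substitution, and it rests on an assumption: that the standard library in use carries the same POSIX name table in std::regex_traits (libstdc++ and libc++ are believed to; other implementations may behave differently). The helper names resolve_collating_name and matches_named_element are hypothetical.

```cpp
#include <regex>
#include <string>

// Resolve a symbolic collating name (e.g. "hyphen") to the text it stands for.
// An empty result means this implementation did not recognise the name.
std::string resolve_collating_name(const std::string& name) {
    std::regex_traits<char> traits;
    return traits.lookup_collatename(name.begin(), name.end());
}

// True when `text` consists solely of the collating element `name`,
// written in the bracket form [[.name.]].
// Constructing the regex throws std::regex_error for an unrecognised name.
bool matches_named_element(const std::string& name, const std::string& text) {
    const std::regex re("[[." + name + ".]]");
    return std::regex_match(text, re);
}
```

Under that assumption, resolve_collating_name("hyphen") gives "-" and matches_named_element("hyphen", "-") is true. Boost.Regex accepts the same bracket form, so a pattern such as [[.hyphen.]] can be written directly.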

Named Unicode Characters

When using Unicode-aware regular expressions (the u32regex type), all the standard symbolic names for Unicode characters (those given in UnicodeData.txt) are recognised.


Revised 12 Jan 2005

© Copyright John Maddock 2004-2005

Use, modification and distribution are subject to the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)