In any particular syntax for regular expressions, some characters are
always special, others are sometimes special, and others are never
special. The particular syntax that Regex recognizes for a given
regular expression depends on the value in the syntax field of
the pattern buffer of that regular expression.
You get a pattern buffer by compiling a regular expression. See section GNU Pattern Buffers, and section POSIX Pattern Buffers, for more information
on pattern buffers. See section GNU Regular Expression Compiling, section POSIX Regular Expression Compiling, and section BSD Regular Expression Compiling, for more information on compiling.
Regex considers the value of the syntax field to be a collection
of bits; we refer to these bits as syntax bits. In most cases,
they affect what characters represent what operators. We describe the
meanings of the operators to which we refer in section Common Operators,
section GNU Operators, and section GNU Emacs Operators.
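Because the syntax is a collection of bits, a particular syntax is built by ORing bits together. The following is a minimal sketch, assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`, where `re_set_syntax` chooses the syntax that the next `re_compile_pattern` call uses. The names `try_match` and `unbackslashed_groups` and the -3 return value are hypothetical, introduced only for illustration.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);        /* Regex allocates the buffer */
  old = re_set_syntax (syntax);        /* syntax for the next compile */
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);                 /* restore the previous syntax */
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* A syntax in which neither groups nor alternation take a backslash.  */
static reg_syntax_t
unbackslashed_groups (void)
{
  return RE_NO_BK_PARENS | RE_NO_BK_VBAR;
}
```

With no bits set, `\(a\|b\)' is a group containing an alternation and `(a|b)' is five ordinary characters; under `unbackslashed_groups ()' the roles are reversed.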
For reference, here is the complete list of syntax bits, in alphabetical
order:
RE_BACKSLASH_ESCAPE_IN_LISTS
- If this bit is set, then `\' inside a list (see section List Operators
([ ... ] and [^ ... ]))
quotes (makes ordinary, if it's special) the following character; if
this bit isn't set, then `\' is an ordinary character inside lists.
(See section The Backslash Character, for what `\' does outside of lists.)
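As an illustration of the backslash-in-lists bit, here is a hedged sketch assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the function `bracket_list`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match the regular expression [\]] (written "[\\]]" in C) against S.  */
static int
bracket_list (reg_syntax_t syntax, const char *s, size_t len)
{
  return try_match (syntax, "[\\]]", s, len);
}
```

With the bit set, the list contains just `]', so the pattern matches the string `]'. With the bit clear, the list contains just `\' and is followed by an ordinary `]', so the pattern matches the two-character string `\]' instead.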
RE_BK_PLUS_QM
- If this bit is set, then `\+' represents the match-one-or-more
operator and `\?' represents the match-zero-or-one operator; if
this bit isn't set, then `+' represents the match-one-or-more
operator and `?' represents the match-zero-or-one operator. This
bit is irrelevant if RE_LIMITED_OPS is set.
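The effect of this bit can be sketched as follows, assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the functions `plain_plus` and `backslash_plus`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match `ab+' against "abbb".  */
static int
plain_plus (reg_syntax_t syntax)
{
  return try_match (syntax, "ab+", "abbb", 4);
}

/* Match `ab\+' against "abbb".  */
static int
backslash_plus (reg_syntax_t syntax)
{
  return try_match (syntax, "ab\\+", "abbb", 4);
}
```

With no bits set, `plain_plus' matches all four characters. With RE_BK_PLUS_QM set, `+' is ordinary, so `plain_plus' fails and `backslash_plus' matches all four characters.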
RE_CHAR_CLASSES
- If this bit is set, then you can use character classes in lists; if this
bit isn't set, then you can't.
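A small sketch of this bit, assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the function `digit_class`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match `[[:digit:]]' against the one-character string "7".  */
static int
digit_class (reg_syntax_t syntax)
{
  return try_match (syntax, "[[:digit:]]", "7", 1);
}
```

With RE_CHAR_CLASSES set, the list holds the digit class and matches `7'. With the bit clear, the same text is not read as a character class, so `7' does not match.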
RE_CONTEXT_INDEP_ANCHORS
- If this bit is set, then `^' and `$' are special anywhere outside
a list; if this bit isn't set, then these characters are special only in
certain contexts. See section The Match-beginning-of-line Operator (^),
and section The Match-end-of-line Operator ($).
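To see the difference between context-dependent and context-independent anchors, consider this sketch, which assumes glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the function `caret_in_middle`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match `a^b' against the literal three-character string "a^b".  */
static int
caret_in_middle (reg_syntax_t syntax)
{
  return try_match (syntax, "a^b", "a^b", 3);
}
```

With no bits set, the `^' in the middle of the pattern is ordinary and the literal string matches. With RE_CONTEXT_INDEP_ANCHORS set, that `^' is an anchor, which cannot be satisfied between `a' and `b', so the match fails.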
RE_CONTEXT_INDEP_OPS
- If this bit is set, then certain characters are special anywhere outside
a list; if this bit isn't set, then those characters are special only in
some contexts and are ordinary elsewhere. Specifically, if this bit
isn't set then `*', and (if the syntax bit RE_LIMITED_OPS isn't set)
`+' and `?' (or `\+' and `\?', depending on the syntax bit
RE_BK_PLUS_QM) represent repetition operators only if they're not
first in a regular expression or just after an open-group or
alternation operator. The same holds for `{' (or `\{', depending on
the syntax bit RE_NO_BK_BRACES) if it is the beginning of a valid
interval and the syntax bit RE_INTERVALS is set.
RE_CONTEXT_INVALID_OPS
- If this bit is set, then repetition and alternation operators can't be
in certain positions within a regular expression. Specifically, the
regular expression is invalid if it has:
- a repetition operator first in the regular expression or just after a
match-beginning-of-line, open-group, or alternation operator; or
- an alternation operator first or last in the regular expression, just
before a match-end-of-line operator, or just after an alternation or
open-group operator.
If this bit isn't set, then you can put the characters representing the
repetition and alternation operators anywhere in a regular expression.
Whether or not they will in fact be operators in certain positions
depends on other syntax bits.
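The first of the invalid positions, a repetition operator at the very start of a regular expression, can be sketched like this. The sketch assumes glibc's `<regex.h>` compiled with `_GNU_SOURCE`; the helper `try_match`, the function `leading_star`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Compile `*a' and match it against the literal string "*a".  */
static int
leading_star (reg_syntax_t syntax)
{
  return try_match (syntax, "*a", "*a", 2);
}
```

With no bits set, the leading `*' is ordinary and the literal string matches. With RE_CONTEXT_INVALID_OPS set, the pattern is rejected at compile time, which the helper reports as -3.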
RE_DOT_NEWLINE
- If this bit is set, then the match-any-character operator matches
a newline; if this bit isn't set, then it doesn't.
RE_DOT_NOT_NULL
- If this bit is set, then the match-any-character operator doesn't match
a null character; if this bit isn't set, then it does.
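The two bits that govern the match-any-character operator can be exercised together. This is a sketch under the assumption of glibc's `<regex.h>` compiled with `_GNU_SOURCE`; the helper `try_match`, the functions `dot_newline` and `dot_null`, and the -3 return value are hypothetical names used only for this example. The subject strings are passed with an explicit length so that an embedded null character is part of the text.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match `a.b' against `a', newline, `b'.  */
static int
dot_newline (reg_syntax_t syntax)
{
  return try_match (syntax, "a.b", "a\nb", 3);
}

/* Match `a.b' against `a', null character, `b'.  */
static int
dot_null (reg_syntax_t syntax)
{
  return try_match (syntax, "a.b", "a\0b", 3);
}
```

The two bits have opposite senses: RE_DOT_NEWLINE must be set for `.' to match a newline, whereas RE_DOT_NOT_NULL must be set for `.' to refuse a null character.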
RE_INTERVALS
- If this bit is set, then Regex recognizes interval operators; if this bit
isn't set, then it doesn't.
RE_LIMITED_OPS
- If this bit is set, then Regex doesn't recognize the match-one-or-more,
match-zero-or-one or alternation operators; if this bit isn't set, then
it does.
RE_NEWLINE_ALT
- If this bit is set, then newline represents the alternation operator; if
this bit isn't set, then newline is ordinary.
RE_NO_BK_BRACES
- If this bit is set, then `{' represents the open-interval operator
and `}' represents the close-interval operator; if this bit isn't
set, then `\{' represents the open-interval operator and
`\}' represents the close-interval operator. This bit is relevant
only if RE_INTERVALS is set.
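How RE_INTERVALS and RE_NO_BK_BRACES combine can be sketched as follows, assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the function `twice`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Match PAT, meant to stand for "exactly two a's", against "aa".  */
static int
twice (reg_syntax_t syntax, const char *pat)
{
  return try_match (syntax, pat, "aa", 2);
}
```

With only RE_INTERVALS set, the interval is written `a\{2\}'; with RE_NO_BK_BRACES set as well, it is written `a{2}'. With RE_NO_BK_BRACES set but RE_INTERVALS clear, `a{2}' is four ordinary characters and does not match `aa'.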
RE_NO_BK_PARENS
- If this bit is set, then `(' represents the open-group operator and
`)' represents the close-group operator; if this bit isn't set, then
`\(' represents the open-group operator and `\)' represents
the close-group operator.
RE_NO_BK_REFS
- If this bit is set, then Regex doesn't recognize `\'digit as
the back reference operator; if this bit isn't set, then it does.
RE_NO_BK_VBAR
- If this bit is set, then `|' represents the alternation operator;
if this bit isn't set, then `\|' represents the alternation
operator. This bit is irrelevant if RE_LIMITED_OPS is set.
RE_NO_EMPTY_RANGES
- If this bit is set, then a regular expression with a range whose ending
point collates lower than its starting point is invalid; if this bit
isn't set, then Regex considers such a range to be empty.
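A reversed range makes the difference visible. This sketch assumes glibc's `<regex.h>` compiled with `_GNU_SOURCE` and the default C locale; the helper `try_match`, the function `reversed_range`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Compile `[z-a]', whose end point collates before its start point,
   and match it against "m".  */
static int
reversed_range (reg_syntax_t syntax)
{
  return try_match (syntax, "[z-a]", "m", 1);
}
```

With RE_NO_EMPTY_RANGES set, the pattern is invalid and the helper reports -3. With the bit clear, the pattern compiles, but the range is empty, so nothing matches and the helper reports -1.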
RE_UNMATCHED_RIGHT_PAREN_ORD
- If this bit is set and the regular expression has no matching open-group
operator, then Regex considers what would otherwise be a close-group
operator (based on how RE_NO_BK_PARENS is set) to match `)'.
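An unmatched close-group operator can be sketched as follows, assuming glibc's `<regex.h>` compiled with `_GNU_SOURCE`. The helper `try_match`, the function `stray_paren`, and the -3 return value are hypothetical names used only for this example.

```c
#define _GNU_SOURCE   /* glibc: exposes re_set_syntax and the RE_* bits */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper, not part of Regex: compile PAT under SYNTAX and
   match it at the start of the LEN-byte string S.  Returns the match
   length, a negative re_match result (-1 means no match), or -3 if PAT
   does not compile.  */
static int
try_match (reg_syntax_t syntax, const char *pat, const char *s, size_t len)
{
  struct re_pattern_buffer buf;
  reg_syntax_t old;
  const char *err;
  int n;

  assert (pat != NULL && s != NULL);
  memset (&buf, 0, sizeof buf);
  old = re_set_syntax (syntax);
  err = re_compile_pattern (pat, strlen (pat), &buf);
  re_set_syntax (old);
  if (err != NULL)
    return -3;
  n = (int) re_match (&buf, s, (regoff_t) len, 0, NULL);
  regfree (&buf);
  return n;
}

/* Compile `a)', which has no open-group operator, and match it
   against the literal string "a)".  */
static int
stray_paren (reg_syntax_t syntax)
{
  return try_match (syntax, "a)", "a)", 2);
}
```

With RE_NO_BK_PARENS alone, `)' is a close-group operator with nothing to close, and the pattern is rejected. Adding RE_UNMATCHED_RIGHT_PAREN_ORD makes that `)' match itself. With no bits set, `)' is ordinary in any case, because the close-group operator is then written `\)'.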