Groups are special objects that can match a class of characters. A group
is indicated by a pair of square brackets, [ and ], with the characters
that belong to the class between the brackets. So the pattern
/[13579]/
will match any of the single characters 1, 3, 5, 7, or 9. This isn't very practical when many characters are involved, so there is a way to indicate a range of characters:
/[3-7f-p]/
The above group contains the characters 3 to 7 and f to p. If the first character in the group is the character ^, the group matches every character except the ones listed, so
/[^3-7f-p]/
matches anything except for the digits 3 to 7 and the characters f to p. The above leaves one problem: how are the characters [, ] or - included in a group? This can be done by putting
them in a position in which they `cannot' occur, or which would make the
whole group meaningless, as in:
/[]-[]/
/[^]^-[]/
/[-z]/
The first group contains exactly the three special characters ], - and [. The second group contains all characters except for the characters ], -, ^ and [. The third group contains the two
characters - and z.

To facilitate the use of the special non-ASCII
characters that occur in the native character fonts on some
computers there are some special ranges of characters: whereas a-z means
all lower case regular characters, a-ä means all lower case characters,
including the accented ones. The ä may be replaced by any other
accented character in the extended character set. Similarly A-Ä means
all uppercase characters including the ones in the extended set. The
range ä-ö (or any other two lower case extended characters) gives
all extended lower case characters, Ä-Ö gives all extended upper
case characters and ä-Ä gives all characters in the extended
character set.
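
As a rough illustration of the basic group syntax (character lists, ranges and negation), the following sketch uses Python's `re` module, whose bracket syntax agrees with the groups above for these simple cases. It is an analogy under stated assumptions, not this program's implementation: the helper `matching_chars` and the sample string are made up for the example, Python writes patterns without the surrounding slashes, and Python gives a range such as a-ä no special meaning.

```python
import re


def matching_chars(group, candidates):
    """Return the characters of `candidates` that the bracket group matches.

    `matching_chars` is a hypothetical helper written for this example.
    """
    regex = re.compile(group)
    return "".join(ch for ch in candidates if regex.fullmatch(ch))


# Sample alphabet to test against: digits, lower case letters and a dash.
sample = "0123456789abcdefghijklmnopqrstuvwxyz-"

odd_digits = matching_chars(r"[13579]", sample)    # explicit list of characters
ranged = matching_chars(r"[3-7f-p]", sample)       # two ranges in one group
negated = matching_chars(r"[^3-7f-p]", sample)     # leading ^ inverts the group
dash_and_z = matching_chars(r"[-z]", sample)       # leading - is taken literally

print(odd_digits)
print(ranged)
print(negated)
print(dash_and_z)
```

Note that Python does not accept every positional trick shown above: a group such as []-[] is rejected there as a bad character range, and the brackets and the dash are normally written with backslash escapes instead.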
Finally there are some shortcuts for groups that are used frequently. These are:
character | group |
# | [0-9] |
& | [a-äA-Ä] (all alphabetic characters) |
![]() | [0-9a-äA-Ä] (all alphanumerics) |
! | any `word' character |
!![]() | any character not in words |
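
For comparison, the first three shortcut groups can be approximated in Python's `re` module. This is only a sketch under assumptions: the names `DIGIT`, `ALPHA` and `ALNUM` are invented here, the shortcut characters themselves are not valid Python syntax, and Python's notion of a letter covers all of Unicode, which is broader than the extended character set described above. The `word' shortcuts are left out because their exact definition is not given in this section.

```python
import re

# Hypothetical Python counterparts of the shortcut groups.
DIGIT = re.compile(r"[0-9]")      # same set as [0-9]
ALPHA = re.compile(r"[^\W\d_]")   # roughly [a-äA-Ä]: letters, accented ones included
ALNUM = re.compile(r"[^\W_]")     # roughly [0-9a-äA-Ä]: letters and digits


def is_single(regex, ch):
    """True if the one-character string `ch` belongs to the group `regex`."""
    return regex.fullmatch(ch) is not None


for ch in ["7", "q", "ä", "Ö", "_", "-"]:
    print(repr(ch), is_single(DIGIT, ch), is_single(ALPHA, ch), is_single(ALNUM, ch))
```

The double negation in `[^\W\d_]` is a common way to say "word character that is neither a digit nor an underscore", which is how the alphabetic class is approximated here.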