Groups

Groups are special objects that can match a class of characters. A group is indicated by a pair of square brackets, [ and ], with the characters that belong to the class between the brackets. So the pattern

    //[13579]/
will match any of the single characters 1, 3, 5, 7, or 9. This isn't very practical when many characters are involved, so there is a way to indicate a range of characters:
    //[3-7f-p]/
The above group contains the characters 3 to 7 and f to p. If the first character in the group is the character ^, the group contains all characters except for those that are mentioned. So
    //[^3-7f-p]/
matches anything except for the digits 3 to 7 and the characters f to p. This leaves one problem: how are the characters [, ] or - included in a group? This is done by putting them in a position in which they 'cannot' occur, or in which they would otherwise make the whole group meaningless, as in:
    //[]-[]/
    //[^]^-[]/
    //[-z]/
The first group contains exactly the three special characters ], - and [. The second group contains all characters except for ], ^, - and [. The third group contains the two characters - and z.

To facilitate the use of the special non-ASCII characters that occur in the native character fonts on some computers, there are some special ranges of characters. Whereas a-z means all regular lower case characters, a-ä means all lower case characters, including the accented ones. The ä may be replaced by any other accented character in the extended character set. Similarly, A-Ä means all upper case characters, including the ones in the extended set. The range ä-ö (or any other two lower case extended characters) gives all extended lower case characters, Ä-Ö gives all extended upper case characters, and ä-Ä gives all characters in the extended character set.
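As a rough illustration, the basic groups above behave much like bracket expressions in Python's `re` module. This is only an analogy and not the editor's own implementation: Python does not use the // and / delimiters, has no equivalent of the extended ranges such as a-ä, and rejects a group like []-[] as a bad range, so the sketch escapes ], -, ^ and [ explicitly. All names in the block are hypothetical.

```python
import re

# Python analogues of the groups described above (illustrative names only).
ODD_DIGITS = re.compile(r"[13579]")        # any one of 1, 3, 5, 7, 9
RANGES = re.compile(r"[3-7f-p]")           # digits 3 to 7 and letters f to p
NOT_RANGES = re.compile(r"[^3-7f-p]")      # anything except those characters
SPECIALS = re.compile(r"[\]\-\[]")         # the three characters ], - and [
NOT_SPECIALS = re.compile(r"[^\]\^\-\[]")  # anything except ], ^, - and [
DASH_OR_Z = re.compile(r"[-z]")            # a leading - is literal: - or z


def matching_chars(pattern, text):
    """Return the characters of text that the one-character pattern matches."""
    return [ch for ch in text if pattern.fullmatch(ch)]


if __name__ == "__main__":
    sample = "a3f-z]9[^"
    groups = [
        ("ODD_DIGITS", ODD_DIGITS),
        ("RANGES", RANGES),
        ("NOT_RANGES", NOT_RANGES),
        ("SPECIALS", SPECIALS),
        ("NOT_SPECIALS", NOT_SPECIALS),
        ("DASH_OR_Z", DASH_OR_Z),
    ]
    for name, pattern in groups:
        print(name, matching_chars(pattern, sample))
```

Escaping the special characters is the safer choice in Python; the positional trick described in the text applies to the editor's own pattern syntax.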

Finally, there are some shortcuts for groups that are used frequently. These are:

    Character   Group
    #           [0-9]
    &           [a-äA-Ä] (all alphabetic characters)
    ~           [0-9a-äA-Ä] (all alphanumerics)
    !           any 'word' character
    !^          any character not in words

    Table: Special groups


The word characters are explained on page [*]. These shortcuts are an extension of the regular UNIX definitions.
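To make the first three shortcuts concrete, the following sketch expresses them as Python predicates on a single character. It is an approximation under stated assumptions: `str.isalpha()` accepts every Unicode letter, which is broader than the editor's extended character set, and the shortcuts ! and !^ are left out because the word characters are defined elsewhere. The names `SHORTCUTS` and `in_group` are hypothetical.

```python
DIGITS = "0123456789"

# Hypothetical predicates, one per shortcut; not the editor's implementation.
SHORTCUTS = {
    "#": lambda ch: ch in DIGITS,                  # [0-9]
    "&": lambda ch: ch.isalpha(),                  # letters, accented ones included
    "~": lambda ch: ch in DIGITS or ch.isalpha(),  # digits or letters
}


def in_group(shortcut, ch):
    """Return True if the single character ch belongs to the shortcut's group."""
    return len(ch) == 1 and SHORTCUTS[shortcut](ch)


if __name__ == "__main__":
    for shortcut in "#&~":
        members = [ch for ch in "7aÄ-" if in_group(shortcut, ch)]
        print(shortcut, members)
```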