Regular expressions

At times the user may wish to search for a pattern rather than for a fixed string. A pattern is a description of all strings that are acceptable during the search. An example of a pattern would be "all strings that start with an A, end with a B and don't contain any blanks". A language is of course needed to specify patterns. The language used here follows the definitions in the book by Aho, Sethi and Ullman ("Compilers: Principles, Techniques, and Tools", Addison-Wesley 1986, page 148) rather closely. People who are familiar with UNIX will therefore have to note only a few differences (mainly extensions) from what they are used to. In addition, the current implementation has fewer restrictions, and extensions have been made to facilitate the matching or replacement of linefeeds.
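To make the example pattern concrete, the sketch below expresses "starts with an A, ends with a B, contains no blanks" as a regular expression. It uses the syntax of Python's `re` module purely as an assumption for illustration; the pattern syntax accepted by stedi may differ in detail, and the names `PATTERN` and `matches` are hypothetical.

```python
import re

# Assumption: Python's regular expression syntax, not necessarily stedi's.
# 'A' first, then any number of non-blank characters, then 'B' last.
PATTERN = re.compile(r"A[^ ]*B")


def matches(s: str) -> bool:
    """Return True if the whole string fits the pattern."""
    return PATTERN.fullmatch(s) is not None


candidates = ["AB", "AxyzB", "A B", "AxB ", "BA", "A"]
accepted = [s for s in candidates if matches(s)]
print(accepted)
```

The pattern describes a whole set of acceptable strings at once, which is what distinguishes it from a fixed search string.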

Of course the greater generality of complete patterns makes a search operation much slower than a search for a fixed string. The user should therefore select the use of patterns explicitly by starting the search or search-and-replace operation with // rather than with a single /. In single slash mode the search is performed with the Boyer-Moore algorithm, while in double slash mode it uses a more complicated pattern matching "engine". The language that defines the patterns is the one described by Aho et al. and is referred to as regular expressions. It is possible to define patterns that take so much time to search for that the user may decide to discontinue the operation. In several implementations of stedi this can be done by pressing the key combination that indicates a break. In a UNIX version this would be accomplished by pressing Ctrl-C.
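To give an impression of why a fixed-string search can be fast, here is a minimal sketch of the Horspool simplification of the Boyer-Moore algorithm. It is not the code used in stedi, whose implementation is not shown here; the function name `horspool_find` is hypothetical, and the window comparison uses a plain slice for brevity where the full algorithm compares from right to left.

```python
def horspool_find(text: str, needle: str) -> int:
    """Return the index of the first occurrence of needle in text, or -1."""
    m, n = len(needle), len(text)
    if m == 0:
        return 0
    # For every needle character except the last: how far the window may
    # move when that character is found under the window's last position.
    # Later (rightmost) occurrences overwrite earlier ones.
    shift = {ch: m - 1 - i for i, ch in enumerate(needle[:-1])}
    pos = 0
    while pos + m <= n:
        if text[pos:pos + m] == needle:
            return pos
        # Characters that do not occur in the needle allow a full-length jump.
        pos += shift.get(text[pos + m - 1], m)
    return -1


print(horspool_find("hello world", "world"))
```

The key point is that the search can skip several characters of the text at a time, something a general pattern matcher usually cannot do.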

In addition to the speed advantage, the single / also offers the advantage of simplicity. It has very few special characters, so searching for strings that contain characters with a special meaning in the language of regular expressions requires no special thought.
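The difference can be illustrated with a string that contains characters which are operators in most regular expression languages. The sketch assumes Python's `re` module as a stand-in for a pattern search; the special characters of stedi's own pattern language may not be exactly the same.

```python
import re

text = "price = a.b*c"
needle = "a.b*c"

# Fixed-string search: every character stands for itself.
fixed_pos = text.find(needle)

# Pattern search with the same text: '.' and '*' act as operators, so the
# pattern describes other strings (such as "axbbc") and misses the literal.
as_pattern = re.search(needle, text)
other = re.search(needle, "axbbc")

# To find the literal text with a pattern, the operators must be escaped.
escaped = re.search(re.escape(needle), text)

print(fixed_pos, as_pattern, other is not None, escaped.start())
```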



Subsections