
About FORM

FORM is a Symbolic Manipulation System. Its original author is Jos Vermaseren of Nikhef, the Dutch institute for subatomic physics, which is part of the Dutch physics granting agency FOM. Over the years several other people have made contributions. These people are: Albert Retey, Denny Fliegner, Andrei Onyshenko, Markus Frank, Misha Tentyukov and Takahiro Ueda (all at the University of Karlsruhe and paid for by the DFG) and Jens Vollinga, Irina Pushkina and Jan Kuipers (all at Nikhef and paid for by FOM). Extensive systems support has been given by the late Eric Wassenaar and by Ton Damen, both at Nikhef. Organizational support has been provided by Hans Staudenmaier and Hans Kühn (Karlsruhe) and the directors of Nikhef. Many people have taken the trouble to provide bug reports.

Symbolic Manipulation Systems exist in two varieties. In the most popular variety the system is set up to mimic, as much as possible, the way humans work with formulas. This carries the penalty of a certain loss in efficiency, although for most users efficiency isn't an issue. In the other variety the system is set up to make efficient use of the possibilities of a computer, to allow fast processing of very large formulas. At first such systems are a bit harder to use, but they are far more powerful and hence can be used for solving problems that really tax the available resources. An analogy: far more people use calculators than program in FORTRAN or C. Most of the systems in the second variety are rather specialized.

FORM is a system that tries to use the computer resources as efficiently as possible, while at the same time trying to be as general as possible. Its internal representations are far more compact than those of the popular Computer Algebra Systems (CAS), but this has not been overdone, as speed of processing is the major optimization criterion. What is special about it is that formulas can be disk based without big penalties in performance. This means that the available disk space, rather than the amount of RAM, puts the practical limit on the size of expressions.

It would be a mistake to see FORM as a faster version of the popular Computer Algebra Systems (CAS). Its programming model is completely different. For some people there may be an initial threshold, but once this is passed the language is experienced as rather logical and natural. Very often people use both a CAS and FORM and create hybrid programs that move formulas between them. In this way complicated problems in both physics and mathematics have been solved.

The development of FORM started in 1984 and has been done on the following sequence of computers:

Apollo workstation (1984-1986)
Atari ST (1986-1991)
NeXTstation (1991-1994)
Pentium PCs using the NeXTSTEP operating system (1994-2000)
Pentium PCs and laptops using a variety of flavors of Linux (2000-now)
A dual Pentium system running Linux (2002-2006)
A quad Opteron system running Linux (2006-2011)
A 24 core Opteron system running Linux (2011-)

The special version called ParFORM has been developed at the University of Karlsruhe as part of a grant of the DFG. The computers used for this development are:

Compaq system with 8 Alpha processors (1999-2004)
SGI system with 32 Itanium processors (2004-now)
HP AMD Opteron cluster with 750 nodes of 4 cores each

The new multithreaded version TFORM has been developed on the quad Opteron system.

At the moment FORM exists in the following versions:

FORM: This is the sequential version which is meant to run on a single processor.
ParFORM: The multi-processor version, which can use clusters and systems in which the processors have their own memory. It can give benefits for more than two processors.
TFORM: The multi-threaded version of FORM for systems in which several processors share the same memory. It is mainly intended for systems with a limited number of processors. Benefits already start with two processors.

Parallel processing involves a certain amount of overhead. Hence the naive theoretical limits will never be reached. Currently the total running time is given roughly by $T = c_1 + c_2/N + c_3 \log_2 N$, in which the constants are problem dependent and N is the number of worker threads/processes.
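
To get a feel for the rough timing model above, the following Python sketch evaluates it for a few worker counts. The function name `estimated_time` and the constants are purely illustrative assumptions; real values of c_1, c_2 and c_3 are problem dependent and would have to be fitted to measured runs.

```python
import math


def estimated_time(n_workers, c1, c2, c3):
    """Evaluate the rough model T = c1 + c2/N + c3*log2(N) for N workers."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    return c1 + c2 / n_workers + c3 * math.log2(n_workers)


# Hypothetical constants, chosen only for illustration (not measured values).
C1, C2, C3 = 1.0, 8.0, 0.5

for n in (1, 2, 4, 8, 16, 32):
    print(f"N={n:2d}  T={estimated_time(n, C1, C2, C3):.3f}")
```

With these made-up constants the c_2/N term dominates for small N, while the logarithmic term takes over for large N, so the estimated time first drops and then slowly rises again.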

Because parallel processing involves extra overhead that is not needed in the sequential version, the theoretical limits for improvement will never be reached. In addition, the parallelization model employs a master and N workers, and some tasks keep only the master active while all the workers wait. This means that there will be saturation effects when the number of processors becomes rather large. The theoretical limits would be T > 1/P for TFORM and T > 1/(P-1) for ParFORM, in which P is the number of processors and T the total clock time, normalized to the time needed by the sequential version. Future improvements would involve reducing the time spent by the master process while the workers are waiting. The current versions have been constructed with correctness in mind, with the focus on getting the system to work properly. Some work has been done on optimization, but there is still room for improvement. Experience with using these systems should show where the bottlenecks are. The trends in hardware development will also play a role. The eventual improvement when using a system with several processors depends on the way the problem is solved and programmed. Many modules, each with a very small amount of work per term, don't give as dramatic an improvement as modules in which there is much work per term. The user should experiment.
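
The ideal bounds quoted above can be written down directly. The sketch below is only an illustration, using a hypothetical helper `ideal_time_bound`. The reading that the P-1 in the ParFORM bound comes from one processor being occupied by the master is an interpretation of the master/worker description, not something the text states explicitly.

```python
def ideal_time_bound(processors, variant):
    """Return the ideal lower bound on the normalized clock time T.

    T is normalized so that the sequential version takes T = 1.
    variant is "tform" (bound 1/P) or "parform" (bound 1/(P-1)).
    """
    if variant == "tform":
        if processors < 1:
            raise ValueError("TFORM bound needs at least 1 processor")
        return 1.0 / processors
    if variant == "parform":
        # The 1/(P-1) bound is undefined for fewer than 2 processors.
        if processors < 2:
            raise ValueError("ParFORM bound needs at least 2 processors")
        return 1.0 / (processors - 1)
    raise ValueError(f"unknown variant: {variant!r}")


for p in (2, 4, 8):
    print(p, ideal_time_bound(p, "tform"), ideal_time_bound(p, "parform"))
```

Real runs will stay above these bounds because of the overhead and the master-only phases described above.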