I. Recursive definitions.

II. A Simple Recursive Procedure.

III. The Principle of Information Hiding.

IV. When Do You Use Recursion?

V. The Four Rules.

VI. Another Simple Example.

A *recursive definition* is a definition in which the thing being defined occurs as part of its own definition.

Sounds circular, doesn't it? Circular definitions, in computer science as elsewhere, are valueless and should be avoided. The distinction between a recursive and a circular definition lies in the use of the word "part": any recursive definition has a noncircular part, or *basis,* which is like an ordinary definition, and directly specifies some objects that fit the definition; and also a recursive (circular) part, which allows you to use the objects that fit the definition in determining whether new objects also fit the definition.

Some examples should clarify these ideas.

Here is an ordinary definition:

*Vowel*: one of the letters "a," "e," "i," "o," "u," or "y".

Clearly, there are exactly six things that fit the above definition of a vowel. Now consider the following circular definition:

*Yowl*: any yowl followed by a vowel.

By this definition, any yowl followed by a vowel is also a yowl; thus, if we could find one thing which is a yowl, then any things formed by adding vowels to that yowl would themselves be yowls. The problem is in finding that first yowl. Since we have no place to start, the definition is circular, and valueless.

Now consider the following definition:

*Howl:*

- the letter "h," or
- any howl followed by a vowel.

This definition is recursive. The first part is an ordinary definition, and tells us that the letter "h" is a howl; this gives us a *basis,* or place to start. The second part is the recursive part; it tells us that, since "h" is a howl, so are "ha," "he," "hi," "ho," "hu," and "hy". But since these are howls too, so also must be "haa," "hae,"..., "hyy,".... "haaeeuooeea," etc.

Note that this is a "good" definition in the sense that some things are howls, some things are not howls, and it is easy to tell from the definition which are which. The "circularity" of the second part of the definition causes no harm, because the first part provides a basis.
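To see the basis at work, the howl definition can be turned directly into a recursive test. Here is a sketch in Python (the function name `is_howl` and the use of lowercase letters are my own choices, not part of the definition):

```python
VOWELS = set("aeiouy")

def is_howl(s):
    # Basis: the letter "h" is a howl.
    if s == "h":
        return True
    # Recursive part: any howl followed by a vowel is a howl,
    # so peel one vowel off the end and test what remains.
    return len(s) > 1 and s[-1] in VOWELS and is_howl(s[:-1])
```

The test mirrors the definition exactly: the basis handles "h", and the recursive part reduces every longer candidate toward that basis.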

Recursion abounds in computer science. The usual definition of "identifier" (or "variable name"), for example, is as follows.

*Identifier:*

- a letter, or
- an identifier followed by a letter or a digit.

Notice that the definitions of both "howl" and "identifier" allow arbitrarily long strings. It is possible to write recursive definitions which limit the size of the things being defined, but in general this is neither easy nor desirable. Instead, if there must be limits, they are set by some external agency. Some programming languages, for example, allow arbitrarily long identifiers, but require that two different identifiers must differ in the first k characters, where k is set by the implementation. In the same way, the maximum length of a howl might be determined by your lung capacity.

You have probably noticed that the above definitions don't *have* to be recursive; they could be made into ordinary definitions by using the phrase "any number of...". This is true--we don't need recursion. In fact, recursion is *never* absolutely necessary, merely useful.

The definitions we have considered so far have all been examples of *direct* recursion. For an example of *indirect* recursion, consider the following pair of definitions (adapted from Lisp):

*S-expression:* an identifier or a list.

*List:* a sequence consisting of

- a left parenthesis,
- zero or more S-expressions separated by blanks, and
- a right parenthesis.

Thus, the following things are lists (and also S-expressions):

( )

(ZIP ZAP ZOT)

(ONE (TWO THREE) ((FOUR)) FIVE)

( ( ) IXOHOXI ( ) )

These definitions are mutually recursive: each is defined in terms of the other. The basis for the S-expression is an identifier, while the basis for the list is the sequence ( ).

To show that these definitions "work," consider the S-expression (NOT (HARD) ).

- (NOT (HARD) ) is an S-expression because it is a list.
- (NOT (HARD) ) is a list because it consists of a left parenthesis, the two S-expressions NOT and (HARD), and a right parenthesis.
- NOT is an S-expression because it is an identifier.
- (HARD) is an S-expression because it is a list.
- (HARD) is a list because it consists of a left parenthesis, the S-expression HARD, and a right parenthesis.
- HARD is an S-expression because it is an identifier.

Q.E.D.

Of course you don't have to go through this complete process every time you see an S-expression, but we were being very formal.
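The same style of recursive check works for S-expressions. Below is a sketch in Python of a recognizer built from two mutually recursive functions, one per definition; the names `parse_sexp` and `parse_list` are mine, and the tokenizer assumes identifiers contain no parentheses:

```python
def tokenize(text):
    # Surround parentheses with blanks, then split on blanks.
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse_sexp(tokens, i):
    # S-expression: an identifier or a list.
    if i < len(tokens) and tokens[i] not in ("(", ")"):
        return True, i + 1          # an identifier
    return parse_list(tokens, i)

def parse_list(tokens, i):
    # List: "(", zero or more S-expressions, ")".
    if i >= len(tokens) or tokens[i] != "(":
        return False, i
    i += 1
    while i < len(tokens) and tokens[i] != ")":
        ok, i = parse_sexp(tokens, i)
        if not ok:
            return False, i
    if i < len(tokens) and tokens[i] == ")":
        return True, i + 1
    return False, i

def is_s_expression(text):
    tokens = tokenize(text)
    ok, i = parse_sexp(tokens, 0)
    return ok and i == len(tokens)
```

Calling `is_s_expression("(NOT (HARD) )")` walks through exactly the steps listed above, with each definition calling the other as needed.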

Now is a good time to put in a plug for the value of recursion. The definition of "list" given above may seem confusing at first (because you're not used to recursive definitions), but I challenge anyone to write a reasonable definition of "list" which is equivalent to the one given above, yet does not use any form of recursion.

A *recursive procedure* (or function, or subroutine) is a procedure which calls itself to do part of the work.

Again, the trick is in the use of the word "part". If it called itself to do *all* of the work, the result would be an infinite recursion (the recursive equivalent of an infinite loop). Just as a loop does some part of the work during each iteration of the loop, so must a recursive procedure do some part of the work at each level of recursion, until eventually all the work is done.

The usual first example of recursion is the factorial function. This is in many respects a poor example, but we will use it for the same reason everyone else does: it is simple.

The *factorial* of a positive integer n, written n!, is the product of all the positive integers from 1 up to and including n. Thus, we have

1! = 1

2! = 1 * 2 = 2

3! = 1 * 2 * 3 = 6

4! = 1 * 2 * 3 * 4 = 24

etc.

Note that, for example, 4! = 1 * 2 * 3 * 4 = (1 * 2 * 3) * 4 = 3! * 4, and in general n! = (n - 1)! * n. This leads to the following recursive definition:

The *factorial* of a positive integer n, written n!, is

(1) one, if n = 1 (basis), or

(2) (n - 1)! * n, if n > 1 (recursion).

This definition leads immediately to the following computer program (written in Pascal):

```pascal
function factorial (n: integer): integer;
begin
  if n = 1 then
    factorial := 1
  else
    factorial := factorial (n - 1) * n
end
```

This function has a basis (n = 1) and a recursive part (factorial (n - 1)), and it does part of the work (multiplying by n) at each level. Thus it seems as though the function might possibly work. But if this is the first time you have seen recursion, you probably are not at all comfortable with it.

In fact, the program does work. The main thing wrong with it is that it's a stupid way to compute a factorial--a loop would be simpler and better. Have patience.

Many authors suggest that the best way to understand a recursive routine is to trace through it, keeping track of what happens at every level of the recursion. By all means work through such a trace, if doing so helps you believe recursion can actually work. But you should *never* think of tracing through a recursive procedure as a means of understanding, or debugging, such a procedure. "Tracing through" has probably kept hundreds of people from ever really understanding recursion. The purpose of this paper is to describe a better technique.

Perhaps the single most important tool we have for controlling the complexity of programs is modularization: breaking a single complex program up into several simpler, logically independent modules (procedures, functions, etc.).

Whenever a program is broken into two parts, there comes into being an *interface* between them. This interface consists of the parameter list, any global or common variables they may share between them, and often more subtle ways of information transmission. If modularization is to succeed in reducing complexity, this interface must be kept as *narrow* and as *explicit* as possible.

The best way to keep the interface narrow is to ensure that each procedure (module) does just one thing, and that that thing can be easily described in an English sentence with few or no "buts" and "excepts" in it. Keeping parameter lists short and avoiding global variables also helps.

The best way to make the interface explicit is to ensure, whenever possible, that all information transmission occurs via the parameter list; shun global variables. There are those who feel global variables should be avoided entirely, but I don't take such an extreme position. It is often more convenient to have global access to arrays and other large data structures; just take special care not to mess them up.

Side effects should also be avoided. The *main effect* of a routine is the thing it's supposed to do; *side effects* are the little extra things it does. Side effects are unhealthful even when they're intentional. For example, if a routine `sort(A,n)` sorts an array `A` of `n` integers, that's its main effect; but if it changes `n` in the process, that's a side effect. The bad thing about it is that it happens in the sort routine, but errors which result from this side effect will show up in the routines which call the sort routine. Since these routines may themselves be correct, the likely result is that the programmer spends hours trying to debug the wrong routines.

We are now in a position to state the Principle of Information
Hiding: *every routine should mind its own business*. No routine
should meddle in the affairs of another.

If some routine calls a sort routine, that routine should neither know nor care just how the sort routine operates. Nor should the sort routine know why its calling routine wants the sorting done.

Here are three simple examples of violations of the Principle of Information Hiding. In each case the function is supposed to return the maximum value in an array of integers. Can you spot the violations?

```pascal
function max (var A: array [1..n] of integer): integer;
begin
  sort (A, n);
  max := A[1]
end;

function max (A: array [1..n] of integer): integer;
begin
  max := A[1];
  for i := 2 to n do
    if A[i] > max then max := A[i]
end;

function max (A: array [1..n] of integer): integer;
var i: integer;
begin
  max := 0;
  for i := 1 to n do
    if A[i] > max then max := A[i]
end;
```

Answers:

- Sorting the array is a rather drastic side effect.
- The function uses (and changes) the global variable "i".
- This function should not know that A contains only positive numbers.

These examples are trivial, and in real life are not likely to cause major problems. Things get worse when the rest of the program gets more interesting.

The following "main program" controls a two-player game between the human ant the computer, in which the players alternate moves. The Principle of Information Hiding has been largely respected in this example; indeed, it is impossible to deduce from the program just which game is being played.

```pascal
begin
  setup;
  choosefirst (player);
  repeat
    if player = "human" then
      begin
        repeat
          accept (move);
          checklegality (move, ok)
        until ok;
        make (player, move, gameover);
        player := "computer"
      end
    else (* player = "computer" *)
      begin
        decide (move);
        make (player, move, gameover);
        player := "human"
      end
  until gameover
end.
```

It is possible to understand this program and verify its correctness, given very simple assumptions about what the called routines do. For example, we would expect "choosefirst (player)" to somehow decide who plays first, and to set its parameter to "human" or "computer" accordingly. All information relevant to the operation of the main program is fully specified in the parameter lists.

I have not shown any of the declarations. There is, presumably, a structured variable "board" which represents the playing board; since there is only one of it, and since it is used by all routines, it seems to me to make more sense to make "board" a global variable rather than to put it in the parameter list of every routine. (It may be that procedure "decide" uses a scratch board to do its thinking on--but that's its business.)

A more serious violation of the Principle of Information Hiding is variable "move," which must somehow be declared in the main program; to do this, we must know how a move can be specified. True, the variable is not really used in the main program, just passed along from one procedure call to the next. There do exist languages which recognize this and allow you to defer the declaration, but you won't be programming in one of them anytime soon. Some things we just learn to live with.

Now consider an alternate version of the same program.

```pascal
begin
  setup;
  choosefirst (player);
  repeat
    if player = "human" then
      begin
        repeat
          accept (move);
          checklegality (move);
          k := k - 1
        until ok;
        make (move)
      end
    else (* player = "computer" *)
      begin
        decide (move);
        make (move);
        player := "computer"
      end
  until gameover
end
```

Due to the many violations of the Principle of Information Hiding, it is impossible to tell whether the program is correct or not, without extensive reference to the called routines.

- What routine or routines update "player"? Do they do it in such a way that the main program works?
- If the human's move is not ok, has "player" already been updated?
- How does "make" know whose move to make?
- Who sets "ok"?
- When does the game end? Who's responsible for deciding this?
- Is "k" being initialized properly? Computed properly? What is it anyway, and who uses it?

It may seem at this point that we have drifted away from the topic of recursion, and perhaps we have, since the Principle of Information Hiding applies to all programs. As we shall see in section V, however, it has a special importance for recursive programs.

It is an easily proved fact that you never *need* recursion; any recursive program can be changed into a nonrecursive one, for example by the use of stacks. (In the same way you can prove that any program could be written in absolute binary.)

The best answer to the question of when to use recursion is simply, when you happen to find it useful. You never just set out to "use recursion," any more than you just set out to "use a loop". You program; sometimes you loop, sometimes you use an array, sometimes you do input/output, and sometimes you recur.

But that's not a very useful answer when you're first getting started. We'll try to be more specific.

One fairly good rule of thumb is to use recursion when you're processing recursively defined data structures. If you try to evaluate an arithmetic expression, for example, parentheses may be used to enclose a "subexpression" which must be evaluated first, and is an expression in its own right. The only reasonable way to do this is to write a recursive expression evaluation routine; loops alone don't suffice.
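As a concrete illustration of this rule of thumb, here is a sketch in Python of a recursive evaluator for expressions built from numbers, + and -, and parentheses. It is a deliberately minimal toy (no error handling, integer operands only), but it shows why recursion is the natural tool: a parenthesized subexpression is evaluated by the very same routine that evaluates the whole expression.

```python
def evaluate(expr):
    tokens = expr.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def term():
        nonlocal pos
        # A term is a number or a parenthesized subexpression --
        # the parenthesized case is where the recursion happens.
        if tokens[pos] == "(":
            pos += 1                 # consume "("
            value = expression()
            pos += 1                 # consume ")"
            return value
        value = int(tokens[pos])
        pos += 1
        return value

    def expression():
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in "+-":
            op = tokens[pos]
            pos += 1
            rhs = term()
            value = value + rhs if op == "+" else value - rhs
        return value

    return expression()
```

For example, `evaluate("(2 + 3) - (1 + 1)")` returns 3: each parenthesized group triggers a recursive call, just as each nested S-expression did in the earlier definitions.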

Another (equivalent) rule is to handle nested structures with recursion. As an example, in many languages any statement may be nested in an IF statement, even another IF statement. There is a clear use for recursion in any processor (compiler, preprocessor, interpreter, debugger, pretty printer) for such a language.

Finally, programs which use a single stack can often be written more clearly as recursive programs which don't use a stack. (Programs which use two or more stacks, however, often cannot be rewritten as recursive programs without any stacks; but the proof of this fact is beyond the scope of the current paper.)

Here we propose four rules which, if understood and followed, will result in working recursive programs. Once you become comfortable with recursion you will realize that these rules are not absolute, and can be violated for good cause, if care is taken; but stick with them for now.

You already know that a recursive routine must have some base cases, cases that it can compute directly without need for recursion. For example, in the factorial program the only base case is n = 1, and the result returned is 1. The very first thing you should do upon entry to a recursive routine is to test whether you have a base case, and if you do, do whatever needs to be done for that case and *return*. Don't mess around, and don't recur; just do what needs to be done and get out.

This rule is crucial for preventing circularity and infinite recursion. If you ever, even once, recur with the same (or harder) problem, your program immediately disappears off into Cloud-Cuckoo Land.

What, however, does it mean to be "simpler"? That depends on the problem. "Simpler" can mean: with a smaller number, with a smaller array, with a more nearly sorted array, with a shorter expression, with a larger number, a more nearly random array, or just about anything else.

Once you have established your base cases, those are the "simplest" cases, and "simpler" is anything that is in some clear way "closer" to one or more of those base cases. Sometimes it is obvious when we are getting closer, as in the factorial function: for n > 1, clearly n - 1 is closer to 1 than n is. Similarly when we evaluate expressions, it may be obvious that a shorter expression is a simpler one; or perhaps it is an expression with fewer parentheses, or fewer arithmetic operators. When you write a recursive function, *you* must be clear in your mind just when a problem is "closer" to the base case (hence "simpler"), and you must stick to it.

Suppose we try to speed up the factorial function by doing two multiplications at each level of recursion, rather than just one. We might get

```pascal
function factorial (n: integer): integer;
begin
  if n = 1 then
    factorial := 1
  else
    factorial := n * (n - 1) * factorial (n - 2)
end
```

This doesn't work. Before you read any further, try to figure out the error for yourself.

The problem is that n - 2 isn't necessarily "closer" to 1 than n is. In particular, if n is even, so is n - 2, and soon we will overshoot 1 and try to compute the factorials of 0, -2, -4, -6, and so on. It is part of the notion of "closer" that (in a finite system), if we get closer enough times, sooner or later we will get there.

So you have to be careful that your notion of "closer" will sooner or later get you there.
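One way to repair the two-multiplication factorial, sketched here in Python rather than Pascal, is to add a second base case, so that stepping down by two can never skip past the bottom:

```python
def factorial(n):
    if n == 1:          # base case reached from odd n
        return 1
    if n == 2:          # base case reached from even n
        return 2
    # Recursive part: n - 2 is still "closer", and now every
    # n > 2, odd or even, eventually lands on a base case.
    return n * (n - 1) * factorial(n - 2)
```

With base cases at both 1 and 2, the notion of "closer" works again: every call strictly decreases n, and the chain is guaranteed to stop.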

If this all sounds vague and mysterious, don't panic. For any particular, concrete problem, it is 99% certain that it will be obvious what meaning to attach to the word "simpler". But *do* take twenty seconds, before you rush ahead with the program, to decide what the word means. Then each time you write a recursive call, make sure you recur with a simpler problem.

Consider the following sequence:

- compute the value of x,
- call a subprogram,
- use x for something.

If the subprogram destroys the value of x, then clearly there is a problem. We return, finally, to the Principle of Information Hiding: unless it is the specific purpose of the subprogram to modify x for us, the subprogram has no business changing x, or even knowing that x exists. In short, the subprogram should not use x.

Now for the kicker. If the subprogram call is recursive--if the routine we are in calls itself--then the called routine *must* use the same variable names as the calling routine. After all, it's really the same routine both times.

You should now be horrified. If you aren't, you missed something; go back to the beginning of rule 3, and read more carefully this time.

The implication seems to be that you can't use variables in a
recursive routine; or, more precisely, every recursive call destroys
the value of all your variables. This is pretty intolerable.
Fortunately, there is an easy solution: *never change the value of
a global variable*.

If you have forgotten about scope rules, review them now. They have suddenly become important.

Briefly, you cannot avoid using the same *names* over again in a recursive routine, but if you *declare* them inside the routine they are then local (hence, *different*) variables, and cannot interfere with other variables of the same name but at a different level of recursion.

A recursive routine thus may access three types of variables: (1) local variables, which are declared inside the routine itself, and are totally safe; (2) parameters, which behave rather like local variables, and are safe so long as you are careful to avoid side effects; and (3) global variables, which are extremely unsafe--you may look at their current values, but if you change those values you are playing with fire.

Moral: Make sure *all* your variables are either local or are parameters. If you can't, make sure you never change the value of a global variable. And if you must alter global variables, think long and hard about what you're doing.

Fortunately, it is easy to get into the habit of avoiding globals. You don't have to think through the reasons each time. Just avoid them.

That is, when you reach a recursive call, don't go back to the top of your routine and start working through it all over again. Don't try to "trace through" recursive calls. That way lies madness.

Consider this: if you are working your way through a complex but nonrecursive routine, and it calls, say a sorting routine, do you drop everything and rush off to see if the sorting routine works? Of course not. You simply assume that the sorting routine works (and if you have doubts you save them for later), and keep going in the routine you're checking. Rule 4 simply says to treat all calls this way, even recursive ones.

It may seem odd at first to assume the called routine works, when that called routine happens to be the very one you're checking. However, you must do so: the human mind does have limitations. Pretend, if it helps, that you are not actually recurring, but rather calling an entirely different routine (which happens to have the same name) written by someone you trust.

What is involved here is perhaps best described as an "act of
faith." You have to learn to trust recursion; to believe that, if you
can get the routine correct *at this level*, then you don't have
to worry about any deeper levels. Until you can bring yourself to
make this commitment, you will be compelled to look down into the
recursion, and you will never untangle what you see there.

Think about rules one to three, and try to convince yourself logically that a program written according to these rules ought to work. Then, even if you can't yet manage rule four, pretend you do when you write your program.

Now run your program. It will bomb, of course--programs always
do--but steadfastly refuse to look down. Go over rules one, two, and
three, and make sure you didn't violate one of these. Next, go over
the program the same way you always would, looking for errors that
have nothing to do with the recursion itself (the error doesn't
*have *to be in the recursion, remember). If this too fails, get
help.

Don't even think about looking down, for this one simple reason: it doesn't help.

If the notion of "faith" bothers you, think about the novice programmer who complains that his program can't be wrong, so the computer must have made a mistake. You may have once done this yourself. In time, all programmers learn to trust the computer; now you must learn to trust recursion.

This time we will devise a recursive Pascal program to find the maximum value in an array. Again, alas, it would be better to do this without recursion, but it is difficult to find simple examples that really need recursion, at least until we have learned about some recursively defined data structures.

We will use recursion to find the maximum value in an array. What is a simpler problem of the same sort? Obviously, finding the maximum of a smaller array. So we will plan to recur only with smaller arrays, and our base case will be the smallest array--say, an array of only one element.

Here is one approach: pluck a number off one end of the array; recursively find the maximum of the remaining array; compare the maximum to the one number plucked off, and the maximum is the larger of these. For a base case, the maximum of a single-element array is that element.
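That pluck-one-off-the-end approach looks like this as a sketch in Python (using a slice where the Pascal version would pass array bounds; the name `array_max` is my own):

```python
def array_max(a):
    # Base case: the maximum of a one-element array is that element.
    if len(a) == 1:
        return a[0]
    # Pluck the last number off, recursively find the maximum of
    # the rest, and return the larger of the two.
    rest_max = array_max(a[:-1])
    return a[-1] if a[-1] > rest_max else rest_max
```

Note that the recursive call is always with a smaller array, so every call is "closer" to the one-element base case.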

While this approach is workable, it isn't very interesting. Here's an approach I like better: divide the array (approximately) in half; recursively find the maximum of each half; return the larger of these two maxima as the grand maximum. Again, the base case is a one-element array, with that element as the maximum.

Here is a first cut at writing the program. Remember that the base case is put first.

```
function max(A):
begin
  if the array is only one element long then
    return that element
  else
    begin
      find the midpoint of array A;
      leftmax := max(left half of A);
      rightmax := max(right half of A);
      if leftmax > rightmax then
        max := leftmax
      else
        max := rightmax
    end
end
```

We have handled the base case first, and each recursive call is with a simpler case (the array size is closer to one). We now run up against a practical problem, having little or nothing to do with recursion: how do we pass part of an array as a parameter?

In Pascal, at least, the answer is that we don't. What we can do, instead, is to pass the entire array, and two variables, "low" and "high," to specify the lower bound and the higher bound, respectively, of the subarray that we will be using. Thus the original call to max will be of the form "max(A,1,n)" where n is the size of the array. Our next cut, using proper Pascal syntax for the parameters, is

```pascal
function max (A: array [1..n] of integer; low, high: integer): integer;
begin
  if low = high then
    max := A[low]
  else
    begin
      mid := trunc ((low + high) / 2);
      leftmax := max (A, low, mid);
      rightmax := max (A, mid + 1, high);
      if leftmax > rightmax then
        max := leftmax
      else
        max := rightmax
    end
end
```

This program satisfies rules one and two. When we check rule three, we find some undeclared variables that must be made *local* to max; declaring them globally would be a disaster. Hence we put the following declaration immediately before the first **begin**:

```pascal
var mid, leftmax, rightmax: integer;
```

and we're done.
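For comparison, here is my own translation of the finished routine into Python; the logic is identical, with low and high marking the bounds of the subarray, and it follows all four rules:

```python
def max_value(a, low, high):
    # Base case: a one-element subarray.
    if low == high:
        return a[low]
    # Recur on two smaller halves (each "closer" to the base case).
    mid = (low + high) // 2
    leftmax = max_value(a, low, mid)
    rightmax = max_value(a, mid + 1, high)
    return leftmax if leftmax > rightmax else rightmax
```

The original call is `max_value(A, 0, n - 1)`, since Python arrays start at index 0 rather than 1.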