We will show that, if L is a context free language, then strings of L that are at least m symbols long can be "pumped" to produce additional strings in L. (The value of m depends on the particular language.)

Let L be an infinite context-free language. Then there is some positive integer m such that, if S is a string of L of length at least m, then

- S = uvwxy (for some u, v, w, x, y)
- |vwx| m
- |vx| 1
- uv
^{i}wx^{i}y L

Let's see what this says.

- If S is a sufficiently long string, then there are two substrings, v and x, somewhere in S. There is stuff (u) before v, stuff (w) between v and x, and stuff (y) after x.
- The stuff between v and x won't be too long, because |vwx| can't be larger than m.
- Substrings v and x won't both be empty (though either one could be).
- If we duplicate substring v some number (i) of times, and duplicate x the same number of times, the resultant string will also be in L.

Copyright © 1996 by David Matuszek

Last modified Mar 19, 1996