## Recursion

A recursive process is one that refers to itself in its own definition^{1}. If you’re looking for some intuition, search for recursion on Google and try the *Did you mean recursion* link. Clearer yet, consider this function:

`def endless: Nothing = endless`

You don’t need to understand lambda calculus^{2} or evaluation-by-substitution to understand that a call to `endless` will never terminate. The first call makes a second call, which makes a third call, ad infinitum. But of course most recursive functions are more complex than this one, so we must ask the question: how *does* a recursive function evaluate?

### Substitution

Without discussing details like compiler optimizations, we can think of code evaluation in a functional paradigm as the substitution of a statement with its definition. Substitution only works for so-called “purely functional” code, where any two identical expressions always evaluate to the same result. Many examples of non-functional code would fail such an equivalence test. For example, consider constructions involving the current date time, or *any* two invocations of the same constructor, which look identical but create distinct, distinguishable instances of the same class.
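To make that failure concrete, here is a minimal Scala sketch (the `Counter` class is purely illustrative): the pure expression can be swapped for its value anywhere, while two textually identical `new Counter` expressions produce distinct, distinguishable instances.

```scala
// Pure: any call with the same argument can be replaced by its result.
def square(x: Int): Int = x * x

// Impure: a constructor for an object that tracks state.
class Counter { var count: Int = 0 }

val a = new Counter // looks identical to...
val b = new Counter // ...this, but they are distinct instances
a.count = 1

assert(square(3) == square(3)) // substitution-safe: always 9
assert(a.count != b.count)     // not substitution-safe: the instances have diverged
```

Substituting `square(3)` with `9` preserves a program’s meaning; substituting one `Counter` instance for the other does not.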

As an example to demonstrate substitution-based evaluation, we will consider two canonical algorithms: factorial and greatest common divisor. Here are naïve definitions:

```
// factorial computes n! = n * (n-1) * (n-2) * ... * 1
def factorial(n: Int): Long =
  if (n == 1) 1
  else n * factorial(n - 1)
```

```
// gcd computes the greatest integer that divides both a and b, using the
// Euclidean Algorithm, which works on the basis that:
// a == c (mod b) => gcd(a, b) == gcd(b, c)
def gcd(a: Int, b: Int): Int =
  if (b == 0) a
  else gcd(b, a % b)
```

These definitions are considered to be functional, in that they do not involve state mutations (i.e. no variables, no constructors for complex objects that track state). As such, we can apply a substitution model of expression evaluation, replacing each call to `factorial` or `gcd` with the given definitions.

The idea seems obvious, but it is powerful because it complies with mathematical intuition. After all, there is no “external” or “mutable” state lurking in a mathematical proof, capable of derailing the process; there is only a sequence of claims, one following the next by virtue of a legal substitution. In the context of functional programming, a legal substitution will generally mean substituting a function call with its definition, replacing parameter names with the corresponding values provided in the call.

As an example, let’s first use substitution to evaluate a simple instance of `factorial`:

`factorial(3)`

`if (3 == 1) 1 else 3 * factorial(3 - 1)`

`3 * factorial(3 - 1)`

`3 * factorial(2)`

`3 * (if (2 == 1) 1 else 2 * factorial(2 - 1))`

`3 * (2 * factorial(2 - 1))`

`3 * (2 * factorial(1))`

`3 * (2 * (if (1 == 1) 1 else 1 * factorial(1 - 1)))`

`3 * (2 * (1))`

`3 * (2)`

`6`

Take note of the shape of this substitutive process, how the expression grows and shrinks with respect to the parameter, `n`. Contrast it with the example of a simple `gcd` call:

`gcd(9, 6)`

`if (6 == 0) 9 else gcd(6, 9 % 6)`

`gcd(6, 9 % 6)`

`gcd(6, 3)`

`if (3 == 0) 6 else gcd(3, 6 % 3)`

`gcd(3, 6 % 3)`

`gcd(3, 0)`

`if (0 == 0) 3 else gcd(0, 3 % 0)`

`3`
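As a sanity check, both traces agree with direct evaluation (the definitions are repeated from above so the snippet stands alone):

```scala
def factorial(n: Int): Long =
  if (n == 1) 1
  else n * factorial(n - 1)

def gcd(a: Int, b: Int): Int =
  if (b == 0) a
  else gcd(b, a % b)

// The same calls we just expanded by hand:
assert(factorial(3) == 6L)
assert(gcd(9, 6) == 3)
```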

You will notice that the `factorial` expression grows with `n`; that is, a new operation is appended with each recursive call, and the chain does not collapse until the recursion completes and the multiplicative chain is rolled up. The `gcd` expressions, by contrast, do not grow with respect to `a` and `b`, because each recursive call simply replaces the previous call with different values. Thus, we can call `gcd` **tail-recursive**. And the fact that `factorial` is not tail-recursive will quickly become a serious problem.

### Stack overflow

If you search for “tail recursion” on Google, you will likely see a post entitled *algorithm - What is tail recursion?* on StackOverflow. Fittingly, if you do not understand tail recursion, you are likely to cause an error called **stack overflow**.

First, we must understand the basic idea of the call stack, which is the mechanism a process uses to manage its constituent sub-processes. Think of the process as your program and each function call as a sub-process, no matter how shallowly or deeply nested. Consider this example:

```
def process = {
  println("process starts")
  sub1
  println("process ends")
}

def sub1 = {
  println("sub1 starts")
  sub2
  println("sub1 ends")
}

def sub2 = {
  println("sub2 starts")
  println("sub2 ends")
}
```

```
> process starts
> sub1 starts
> sub2 starts
> sub2 ends
> sub1 ends
> process ends
```

Think of the call stack as, literally, a stack, which is a last-in-first-out (LIFO) data structure. As each process is pushed onto the stack, the stack grows. Only when a process returns a value does it get popped off the stack. The call stack has a fixed allocation of memory, so when it runs out, the program will report a stack overflow error and crash.
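How much room is there? As a rough, environment-dependent probe, we can count frames until the stack gives out (the `depth` helper is our own sketch, and the number it reports varies with the JVM’s stack-size settings):

```scala
// Recurse until the stack overflows, then report how deep we got.
// Catching StackOverflowError is normally bad practice; here it is just a probe.
def depth(n: Int): Int =
  try depth(n + 1)
  catch { case _: StackOverflowError => n }

println(s"stack overflowed after roughly ${depth(0)} frames")
```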

Now, recall the evaluation of `factorial`. How many calls can be pushed onto the stack before it runs out of memory? That depends on the environment in which the code is running, but in general, the stack will overflow relatively quickly. Factorial is not a good example here, because its value grows with `n` at such a pace that the stack *might* survive long enough to first exhaust the maximum `Int` or `Long` value, so let us consider an example that demonstrates how pernicious stack overflows can be.

```
// altSum computes the sum of -1 + 2 - 3 + 4 ... for n steps, starting at step 1
// e.g. altSum(4) = -1 + 2 - 3 + 4 = 2
def altSum(n: Long): Long = {
  if (n == 0) 0
  else {
    val sign = (if (n % 2 == 0) 1 else -1).toLong
    (n * sign) + altSum(n - 1)
  }
}
```

*Thanks to Brilliant.org for the inspiration^{4} for this example!*

Of course, there are proofs that can describe the behavior of this function as `n` goes to infinity, but what if we want to poke around computationally, possibly taking some action at each iterative step? Let’s give it a shot:

```
altSum(10)
> 5
altSum(11)
> -6
altSum(100)
> 50
altSum(101)
> -51
altSum(1000)
> 500
altSum(1001)
> -501
altSum(10000)
> java.lang.StackOverflowError
at recursion.A$A11$A$A11.f1(sum.sc:54)
at recursion.A$A11$A$A11.f1(sum.sc:54)
at recursion.A$A11$A$A11.f1(sum.sc:54)
at recursion.A$A11$A$A11.f1(sum.sc:54)
...
```

Pretty quickly, at that last call, we get a `java.lang.StackOverflowError`, even though we suspect that the answer is `5000`, which doesn’t even come close to the maximum `Int` value of `2147483647`, not to mention the maximum `Long` value of `9223372036854775807`.

Can we fix `altSum` to avoid stack overflows, pushing the results closer to the boundaries of these maximum values?

### Tail recursion

If the problem is a growing call stack, how can we restructure our code to exist in a bounded stack space? Ideally, the return value of the function should be a single recursive function call; in other words, the function needs to encapsulate and evaluate the dangling operations that would otherwise accrue. As we’ve already mentioned, this pattern is called **tail recursion**.

```
def altSum(n: Long): Long = {
  def asAcc(n: Long, acc: Long): Long = {
    if (n == 0) acc
    else {
      val sign = (if (n % 2 == 0) 1 else -1).toLong
      asAcc(n - 1, acc + (n * sign))
    }
  }
  asAcc(n, 0)
}
```

```
altSum(10000)
> 5000
altSum(10001)
> -5001
altSum(10000000)
> 5000000
altSum(10000001)
> -5000001
altSum(1000000000)
> 500000000
altSum(1000000001)
> -500000001
```

That’s better! To see this in action, and to make *extremely* clear why it works, let’s step through a smaller call, as we did for `factorial` and `gcd`:

`altSum(3)`

`asAcc(3, 0)`

```
if (3 == 0) 0
else {
  val sign = (if (3 % 2 == 0) 1 else -1).toLong
  asAcc(3 - 1, 0 + (3 * sign))
}
```

```
val sign = (if (3 % 2 == 0) 1 else -1).toLong
asAcc(3 - 1, 0 + (3 * sign))
```

```
val sign = -1
asAcc(3 - 1, 0 + (3 * sign))
```

`asAcc(3 - 1, 0 + (3 * -1))`

`asAcc(2, -3)`

At this point, we’ve only made it through one complete cycle, but that’s all we need to see that **the compounding operation must be evaluated for it to be passed as a parameter, known as the accumulator, into the recursive call**. That way, the stack flushes after a relatively small number of operations. It’s subtle, but we can be convinced that this new version of the function will operate in bounded stack space.
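The same accumulator trick repairs `factorial`. Here is one possible sketch; Scala’s `@tailrec` annotation (from the standard library) asks the compiler to verify that the recursive call really is in tail position, failing compilation otherwise:

```scala
import scala.annotation.tailrec

def factorial(n: Int): Long = {
  // The pending multiplication is evaluated now and carried in `acc`,
  // so each recursive call can replace the current stack frame.
  @tailrec
  def facAcc(n: Int, acc: Long): Long =
    if (n <= 1) acc
    else facAcc(n - 1, acc * n)
  facAcc(n, 1L)
}

assert(factorial(3) == 6L)
```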

#### Access to acc

Notice that there’s a side effect, which is really the primary effect: **we have access to the accumulator** within the function. What else can we do with it? In this case, we can see that the function oscillates and diverges, but we can also show via algebraic manipulation that the infinite sum evaluates to -1/4. Does having access to `acc` give us the ability to glean any insights about how the -1/4 could make intuitive sense?

For instance, we can compute a running average of the accumulating `sum` by turning `acc` into a tuple that records both at each iteration.

```
def altSum(n: Long): (Long, Double) = {
  def asAcc(i: Long, n: Long, acc: (Long, Double)): (Long, Double) = {
    if (i > n) acc
    else {
      val sign = (if (i % 2 == 0) 1 else -1).toLong
      val sum = acc._1 + (i * sign)
      val avg = ((acc._2 * (i - 1)) + sum) / i
      asAcc(i + 1, n, (sum, avg))
    }
  }
  asAcc(1, n, (0L, 0.0))
}
```

```
altSum(1)
> (-1,-1.0)
altSum(2)
> (1,0.0)
altSum(3)
> (-2,-0.6666666666666666)
altSum(4)
> (2,0.0)
altSum(100)
> (50,-2.842170943040401E-16)
altSum(101)
> (-51,-0.5049504950495052)
altSum(1000)
> (500,5.684341886080802E-17)
altSum(1001)
> (-501,-0.5004995004995004)
altSum(1000000)
> (500000,-1.0419171303510665E-14)
altSum(1000001)
> (-500001,-0.5000004999995105)
```

Thanks to our accumulator’s running average, we are able to gain an important insight. The running average oscillates between 0 and -1/2, lending some supporting intuition to the algebraic outcome of -1/4.

### To forget loops…

We’ve now seen a few recursive examples that could have been represented with loops. Can we convert *any* loop into a recursive call? There are certainly academic arguments, like the Church-Turing thesis^{6}, but let’s instead attempt to build a simple implementation. For instance, what is a functional way of looking at a simple `for` loop in JavaScript that increments and logs a counter?

```
for (let x = 1; x <= 3; x++) {
  console.log(x)
}
```

```
> 1
> 2
> 3
```

Functional design preaches to **parameterize everything**—after all, if we can’t mutate state, then the only way to generate new values is to feed what we have into a function as parameters, then return the new value(s). Here, the parameter `x` is obvious. But what else can we pull out? How about `console.log(x)`? Most functional languages support functions as first-class citizens, meaning a function is a value that can be passed along, so we can abstract everything that happens in the for loop closure as a parameter called `action`.

If we’re going to parameterize functions, we must also capture the for loop’s condition and final expression. Otherwise, without a breaking condition and a way to get there, our recursive function will run forever. We can call those `condition` and `next`, respectively.

Putting the parts together, we can mimic JavaScript, in which a for loop^{7} is evaluated thusly:

- initialization (`params`)
- condition (`condition`)
- statement (`action`)
- final expression (`next`)

```
const loop = (params, condition, action, next) => {
  if (condition(params)) {
    action(params);
    return loop(next(params), condition, action, next);
  }
};

const params = { x: 1 };
const condition = (params) => params.x <= 3;
const action = (params) => console.log(params.x);
const next = (params) => ({ x: params.x + 1 });

loop(params, condition, action, next);
```

```
> 1
> 2
> 3
```
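For symmetry with the rest of this chapter, the same idea can be rendered in Scala, with the loop’s pieces passed in as function values (the generic `loop` below is our own sketch, not a library function):

```scala
// A recursion-based "for loop": the state has any type A, and the
// condition, body, and step are all supplied as parameters.
def loop[A](params: A)(condition: A => Boolean)(action: A => Unit)(next: A => A): Unit =
  if (condition(params)) {
    action(params)
    loop(next(params))(condition)(action)(next)
  }

loop(1)(_ <= 3)(x => println(x))(_ + 1)
// prints 1, 2, 3
```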

### …or not to forget loops

Just because we *can* replace loops with recursion whenever we like doesn’t mean we should. In fact, there’s a sense in which all of our anti-loop efforts are just a delusion, given that as our code gets compiled and run it may be rewritten into equivalent but non-recursive forms. For an example of when loops may be the more beneficial choice, consider Scala’s own implementation of `foldLeft` and `foldRight`.

*This example was brought to my attention by Matt Malone on his blog,* Old Fashioned Software^{8}*.*

```
def foldLeft[B](z: B)(@deprecatedName('f) op: (B, A) => B): B = {
  var acc = z
  var these = this
  while (!these.isEmpty) {
    acc = op(acc, these.head)
    these = these.tail
  }
  acc
}
```

```
def foldRight[B](z: B)(@deprecatedName('f) op: (A, B) => B): B =
if (this.isEmpty) z
else op(head, tail.foldRight(z)(op))
```

At first glance, the functional programmer gasps! (Oh, the horror of `var`.) But the pragmatic programmer thinks, tests, and determines (like Matt Malone) that for very long lists, `foldLeft` is likely to succeed while `foldRight` is likely to cause a stack overflow. In fact, the best way to `foldRight` safely is to `reverse.foldLeft`!
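A sketch of that trick (the name `safeFoldRight` is ours): reversing the list first lets the stack-safe `foldLeft` do the work, with the operator’s arguments flipped to preserve `foldRight` semantics.

```scala
// foldRight expressed as reverse + foldLeft, so it runs in constant stack space.
def safeFoldRight[A, B](xs: List[A])(z: B)(op: (A, B) => B): B =
  xs.reverse.foldLeft(z)((acc, a) => op(a, acc))

// Behaves like foldRight:
assert(safeFoldRight(List(1, 2, 3))(List.empty[Int])(_ :: _) == List(1, 2, 3))

// And survives a list long enough to overflow a naive recursive foldRight:
val big = List.fill(1000000)(1)
assert(safeFoldRight(big)(0L)((a, acc) => acc + a) == 1000000L)
```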

The bottom line is that as often as recursion is compact, elegant, and exciting, it can also be difficult, inefficient, and impractical. Knowing why (or, better, knowing how and when to use it) is an essential tool for any functional programmer.

### Proofs

#### Euclid’s algorithm for greatest common divisor

See WolframMathWorld: Euclidean Algorithm for more^{9}.

```
a == c (mod b)
(a - c) == 0 (mod b)
b | (a - c)
exists y : b*y == a - c
c == a - b*y
if d|b and d|c, then d|a    (since a == b*y + c)
if d|a and d|b, then d|c    (since c == a - b*y)
therefore gcd(a, b) == gcd(b, c)
```

```
gcd(10, 4)
10 == 2 (mod 4)
(10 - 2) == 0 (mod 4)
4y == (10 - 2), with y == 2
2 == 10 - 4y
=> gcd(10, 4) == gcd(4, 2)
```

```
gcd(4, 10 % 4 = 2)
gcd(2, 4 % 2 = 0)
> 2
```

#### Alternating sum

See the original Brilliant.org problem for several explanations^{4}.

```
S = -1 + 2 - 3 + 4 - ...

define T as:
T     = +1 - 1 + 1 - 1 ...
T - 1 = -1 + 1 - 1 + 1 ...

T + (T - 1) = (+1 - 1 + 1 - 1 ...)
            + (-1 + 1 - 1 + 1 ...)
            = 0
2T = 1
T = 1/2

S + T = (-1 + 2 - 3 + 4 ...)
      + (+1 - 1 + 1 - 1 ...)
      = 0 + 1 - 2 + 3 - ...
      = -S
2S = -T = -1/2
S = -1/4
```