What are free variables?
Consider the function literal for curried multiplication:
(x: Int) => (y: Int) => x * y
If you consider only the subexpression e := (y: Int) => x * y, then you can compute the set of free variables in e recursively as follows (starting from leafs x and y, and then building up the expression from its parts):
FREE(x) = {x} because x is a variable
FREE(y) = {y} because y is a variable
FREE(*) = {} because * is a built-in constant, for all practical purposes
FREE(x * y) = FREE(*) U FREE(x) U FREE(y) = {} U {x} U {y} = {x, y} by recursive descent into subtrees
FREE( (y: Int) => x * y ) = FREE(x * y) - {y} = {x, y} - {y} = {x}, because the variable y is bound by the in-line function definition, and therefore the variable y from FREE(x * y) becomes no longer free.
Thus the variable y is bound and x is free in the expression (y: Int) => x * y.
What are closures?
Now, what happens if you take the whole expression (x: Int) => (y: Int) => x * y and apply it to some constant, 7, say? What is returned?
The returned object is a closure of the expression (y: Int) => x * y, where the free variable x is set to value 7. So, in a sense, the result of
((x: Int) => (y: Int) => x * y)(7)
is something like an object with a single method, that looks roughly like
class y_to_x_times_y_closure(x: Int) {
def apply(y: Int) = x * y
}
instantiated as y_to_x_times_y_closure(7). Note that the value 7 cannot reside in the call stack of some function, because you can easily produce a whole bunch of closures like
for (i <- 0 to 1000) yield ((x: Int) => (y: Int) => x * y)(i))
that will have to coexist simultaneously in the same thread, with different captured values of i bound to the variable x. Therefore, the closures in Scala are implemented as heap-resident objects.
Why is this useful?
This is actually a way too broad question: it enables you to do functional programming. It's built into the language at such a deep level, that you can't really do anything without using tons of closures everywhere.
Consider the following trivial example: printing the multiplication table of one-digit numbers:
for (x <- 1 to 9) {
for (y <- 1 to 9) {
printf("%3d", x * y)
}
println()
}
This produces:
1 2 3 4 5 6 7 8 9
2 4 6 8 10 12 14 16 18
3 6 9 12 15 18 21 24 27
4 8 12 16 20 24 28 32 36
5 10 15 20 25 30 35 40 45
6 12 18 24 30 36 42 48 54
7 14 21 28 35 42 49 56 63
8 16 24 32 40 48 56 64 72
9 18 27 36 45 54 63 72 81
Those for-loops are actually desugared into the following method applications on the Range objects 1 to 9:
(1 to 9) foreach { x =>
(1 to 9) foreach { y =>
printf("%3d", x * y)
}
println()
}
What's the y => printf("%3d", x * y)-thing that is passed to the inner foreach? It is actually an expression with the free variable x. In order to print anything meaningful, it has to capture the value of x, which is in turn set to values 1 to 9 by the outer foreach. Thus, the inner foreach works with a closure of y => printf("%3d", x * y), where x is set to values 1 to 9.