The order of evaluation in Haskell is determined by one and only one thing: Data Dependence.
An expression will only be evaluated when it is needed, and will not be evaluated if it is not needed.
Pure Code
Consider the following code:
f x = v2 + 4
  where v0 = undefined + 1
        v1 = 2 * x
        v2 = v1 / 3 + x
main = putStrLn $ show (f 10)
The order of the variable definitions in the where clause has no effect whatsoever on the evaluation of this function.
f' x = v2 + 4
  where v2 = v1 / 3 + x
        v0 = undefined + 1
        v1 = 2 * x
main = putStrLn $ show (f' 10)
v0 is never evaluated in either f or f', since it is not needed by the result value v2 + 4.
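To see that it really is data dependence doing the work, consider a hypothetical variant (fBad is not part of the original example) in which the result does depend on v0. Forcing the result now forces undefined, and the program crashes:
fBad x = v2 + 4
  where v0 = undefined + 1
        v1 = 2 * x
        v2 = v1 / 3 + x + v0   -- v0 is now needed by the result
main = putStrLn $ show (fBad 10)   -- throws an exception (Prelude.undefined)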
Now that we've shown what won't be evaluated, how do we go about seeing when things are evaluated?
Detour: Tracing in Haskell
The standard Haskell libraries come with a module Debug.Trace which provides a trace function
trace :: String -> a -> a
which will output the given string to stderr before returning its second argument.
However, we can't use trace by itself to track when things are evaluated, because trace, like everything else, is lazy. Evaluating trace str x will output str without first forcing x.
import Debug.Trace
main = putStrLn $ show $ trace "res" $ trace "1" 1 + trace "2" 2
As we can see, res is output before 1 and 2.
However, we can write a function to force evaluation before tracing.
import Debug.Trace
tr msg x = seq x $ trace msg x
main = putStrLn $ show $ tr "res" $ tr "1" 1 + tr "2" 2
Now, as expected, res is output after 1 and 2.
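The helper relies on seq, which evaluates its first argument to weak head normal form before returning its second. A minimal sketch of seq on its own (the names fine and boom are purely illustrative):
fine :: String
fine = seq (1 + 1 :: Int) "returned once 1 + 1 has been evaluated"

boom :: String
boom = seq undefined "forcing boom throws an exception instead"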
Order of Evaluation
We can now see explicitly when each term is evaluated.
import Debug.Trace
tr msg x = seq x $ trace msg x
f' x = v2 + 4
  where v2 = tr "v2" $ v1 / 3 + x'
        v0 = tr "v0" $ undefined + 1
        v1 = tr "v1" $ 2 * x'
        x' = tr "x" x
main = putStrLn $ show (f' 10)
I've introduced x' so we can also see when x is evaluated. x is needed by v1, which is needed by v2, which is needed by the result value that gets printed. Therefore the order of evaluation is: x, v1, v2.
The same principle holds true for expressions anywhere, not just in where clauses.
data E a b = L a | R b
keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs
g = sum . keepRs
main = putStrLn $ show (g [R 10, L undefined, R 20])
No expression underneath an L constructor in the input list to g ever gets evaluated, since keepRs never uses the expressions underneath L constructors.
To get a more detailed picture of how evaluation is working, we can have a traced version of the above.
import Debug.Trace
tr msg x = seq x $ trace msg x
data E a b = L a | R b
keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs
g = sum . keepRs
main = putStrLn $ show (g [(tr "R_0" R) (tr "R_0's 10" 10),
                           (tr "L" L) (tr "undefined" undefined),
                           (tr "R_1" R) (tr "R_1's 20" 20)])
Notice that the outer constructors (L and R) of each list element are evaluated; they are needed by keepRs to figure out which clause to apply. However, only the arguments of the R constructors are evaluated, and only when they are needed by sum.
One thing to mention is that strictness annotations can be used to create strict data constructors, which force the evaluation of their arguments. For example, suppose we rewrite the previous code with a strict version of E:
data E a b = L !a | R !b
keepRs [] = []
keepRs (R x : xs) = x : keepRs xs
keepRs (L _ : xs) = keepRs xs
g = sum . keepRs
main = putStrLn $ show (g [R 10, L undefined, R 20])
Now, even though keepRs doesn't use the arguments of L, they still get evaluated since L is strict.
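The difference is visible even without keepRs. A small, hypothetical sketch (E', L', and R' are illustrative names, with only L' made strict for contrast): merely evaluating a strict constructor application to weak head normal form forces its argument.
data E' a b = L' !a | R' b   -- L' is strict in its argument, R' is not

forcesIt :: ()
forcesIt = (L' undefined :: E' Int Int) `seq` ()   -- forcing forcesIt forces undefined

staysOk :: ()
staysOk = (R' undefined :: E' Int Int) `seq` ()    -- fine: R' never forces its argument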
Monadic Code
Even in monadic code, data dependence is the only thing that determines if and when an expression gets evaluated. Although the order of monadic actions affects a program (it determines the order in which the monadic operations are performed and the order in which variables are brought into scope), it does not determine the order in which expressions get evaluated.
Consider the following monadic version of our first example, extended with some extra monadic actions.
import Debug.Trace
tr msg x = seq x $ trace msg x
fM x = do
  x' <- return $ tr "x" x
  v0 <- return $ tr "v0" $ undefined + 1
  v1 <- return $ tr "v1" $ 2 * x'
  v2 <- return $ tr "v2" $ v1 / 3 + x'
  putStrLn "Enter an Int: "
  v3 <- fmap (\x -> tr "v3" $ v0 + 2 + read x) getLine
  return $ v2 + 4
main = putStrLn . show =<< fM 10
Even though the getLine action is executed, there is no error during evaluation. v0 does not get evaluated since it is only used in v3, but v3 is not evaluated since it is not used by the result value. In fact, you can enter complete gibberish and there will still be no error.
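As a sanity check, here is a hypothetical variant (fM' is not part of the original) whose result does use v3. Now v3, and therefore v0, must be evaluated, and printing the result throws the undefined exception once the input has been read:
import Debug.Trace
tr msg x = seq x $ trace msg x
fM' x = do
  x' <- return $ tr "x" x
  v0 <- return $ tr "v0" $ undefined + 1
  v1 <- return $ tr "v1" $ 2 * x'
  v2 <- return $ tr "v2" $ v1 / 3 + x'
  putStrLn "Enter an Int: "
  v3 <- fmap (\x -> tr "v3" $ v0 + 2 + read x) getLine
  return $ v3 + v2 + 4          -- v3 is now needed by the result
main = putStrLn . show =<< fM' 10   -- throws once the result is forced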
To understand better what exactly monadic code causes to happen, we just need to look a little closer. do notation is just syntactic sugar and can be expanded into regular Haskell syntax using >>=:
do x <- m
   f
where f is some code which depends upon x (i.e. in which x occurs free) gets translated to
m >>= (\x -> f)
I have added parentheses to make the two arguments to >>= clear.
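As a concrete illustration, here is a small hypothetical example (the names sugared and desugared are not from the original) showing a do block alongside its hand-desugared equivalent:
sugared :: IO String
sugared = do
  line <- getLine
  putStrLn line
  return line

desugared :: IO String
desugared =
  getLine >>= (\line ->
    putStrLn line >>= (\_ ->
      return line))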
The type of >>=
(>>=) :: Monad m => m a -> (a -> m b) -> m b
along with the left identity monad law
return x >>= f == f x
tells us that >>= does not look at the expression it takes out of its monad and passes on to its second argument; consider the case where x is undefined and f is const (return ()).
The first argument of >>= is only evaluated enough to get a monadic value, but anything under the constructor for the monad type is not evaluated. For example, if we are in the Maybe monad
data Maybe a = Nothing | Just a
instance Monad Maybe where
  return x = Just x
  (Just x) >>= f = f x
  Nothing >>= _ = Nothing
>>= will not cause anything under the Just constructor to be evaluated.
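For instance, a hypothetical one-liner (using the Prelude's Maybe rather than the redefinition above):
main :: IO ()
main = print (Just undefined >>= \_ -> Just (5 :: Int))   -- prints Just 5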