Foldable.mapM_, Maybe, and recursive functions

As of March 2020, School of Haskell has been switched to read-only mode.

I've run into this issue myself, and seen others hit it too. Let's start off with some very simple code:

sayHi :: Maybe String -> IO ()
sayHi mname =
    case mname of
        Nothing -> return ()
        Just name -> putStrLn $ "Hello, " ++ name

main :: IO ()
main = sayHi $ Just "Alice"

There's nothing amazing about this code, it's pretty straight-forward pattern matching Haskell. And at some point, many Haskellers end up deciding that they don't like the explicit pattern matching, and instead want to use a combinator. So the code above might get turned into one of the following:

import qualified Data.Foldable as F
hiHelper :: String -> IO ()
hiHelper name = putStrLn $ "Hello, " ++ name

sayHi1 :: Maybe String -> IO ()
sayHi1 = maybe (return ()) hiHelper

sayHi2 :: Maybe String -> IO ()
sayHi2 = F.mapM_ hiHelper

main :: IO ()
main = do
    sayHi1 $ Just "Alice"
    sayHi2 $ Just "Bob"
    -- or often times this:
    F.forM_ (Just "Charlie") hiHelper

The theory is that all three approaches (maybe, mapM_, and forM_) will end up being identical. We can fairly conclusively state that forM_ will be the exact same thing as mapM_, since it's just mapM_ flipped. So the question is: will the maybe and mapM_ approaches do the same thing? In this case, the answer is yes, but let's spice it up a bit more. First, the maybe version:

import qualified Data.Text.Lazy as T
import qualified Data.Foldable as F
import Control.Monad (when)

printChars :: Int -> T.Text -> IO ()
printChars idx t = maybe (return ()) (\(c, t') -> do
    when (idx `mod` 100000 == 0)
        $ putStrLn $ "Character #" ++ show idx ++ ": " ++ show c
    printChars (idx + 1) t') (T.uncons t)

main :: IO ()
main = printChars 1 $ T.replicate 5000000 $ T.singleton 'x'

The code above works correctly in constant space. However, the usage of maybe makes this a bit ugly. This is a common time to use forM_ to syntactically clean things up. So let's give that a shot:

import qualified Data.Text.Lazy as T
import qualified Data.Foldable as F
import Control.Monad (when)

printChars :: Int -> T.Text -> IO ()
printChars idx t = F.forM_ (T.uncons t) $ \(c, t') -> do
    when (idx `mod` 100000 == 0)
        $ putStrLn $ "Character #" ++ show idx ++ ": " ++ show c
    printChars (idx + 1) t'

main :: IO ()
main = printChars 1 $ T.replicate 5000000 $ T.singleton 'x'

The code is certainly cleaner and easier to follow. However, try running it: you'll get a stack overflow. The issue is that the implementation of mapM_ in Data.Foldable is not tail recursive. As a result, each recursive call ends up accumulating a bunch of "do nothing" actions to perform after completing the recursive call, which wipes out the stack.

Fortunately, solving this issue is pretty easy: write a tail-recursive version of forM_ for Maybe:

import qualified Data.Text.Lazy as T
import qualified Data.Foldable as F
import Control.Monad (when)

forM_Maybe :: Monad m => Maybe a -> (a -> m ()) -> m ()
forM_Maybe Nothing _ = return ()
forM_Maybe (Just x) f = f x

printChars :: Int -> T.Text -> IO ()
printChars idx t = forM_Maybe (T.uncons t) $ \(c, t') -> do
    when (idx `mod` 100000 == 0)
        $ putStrLn $ "Character #" ++ show idx ++ ": " ++ show c
    printChars (idx + 1) t'

main :: IO ()
main = printChars 1 $ T.replicate 5000000 $ T.singleton 'x'

There's one slight difference in the type of forM_Maybe and forM_ specialized to Maybe. The former takes a second argument of type a -> m (), while the latter takes a second argument of type a -> m b. This difference is unfortunately necessary; if we try to get back the original type signature, we have to add an extra action to wipe out the return value, which again reintroduces the stack overflow:

import qualified Data.Text.Lazy as T
import qualified Data.Foldable as F
import Control.Monad (when)

forM_Maybe :: Monad m => Maybe a -> (a -> m b) -> m ()
forM_Maybe Nothing _ = return ()
-- show
forM_Maybe (Just x) f = f x {-hi-}>> return (){-/hi-}
-- /show

printChars :: Int -> T.Text -> IO ()
printChars idx t = forM_Maybe (T.uncons t) $ \(c, t') -> do
    when (idx `mod` 100000 == 0)
        $ putStrLn $ "Character #" ++ show idx ++ ": " ++ show c
    printChars (idx + 1) t'

main :: IO ()
main = printChars 1 $ T.replicate 5000000 $ T.singleton 'x'

mono-traversable

I'd like to address this issue in mono-traversable, but it would require changing the type of mapM_ and forM_. I'm tempted to do so, but am interested if it causes breakage for anyone. If you have an opinion on this, please comment on the Github issue. For the record, here's the same stack overflow with mono-traversable.

import qualified Data.Text.Lazy as T
import Data.MonoTraversable (oforM_)
import Control.Monad (when)

printChars :: Int -> T.Text -> IO ()
printChars idx t = oforM_ (T.uncons t) $ \(c, t') -> do
    when (idx `mod` 100000 == 0)
        $ putStrLn $ "Character #" ++ show idx ++ ": " ++ show c
    printChars (idx + 1) t'

main :: IO ()
main = printChars 1 $ T.replicate 5000000 $ T.singleton 'x'
comments powered by Disqus