The Service Pattern

This tutorial describes a design pattern that I discovered when working at Better. It explains how to structure the IO-layer of a Haskell application as a Service-oriented architecture. By IO-layer, I mean the set of all functions that can execute IO actions. At Better, we used this pattern extensively to modularize the IO-layer of our application.

Acknowledgements: thanks to Jasper Van der Jeugt for proof reading.

When to use the service pattern

The service pattern allows you to modularize the use of subsystems that are shared between different parts of the IO-layer of your application. Typical examples of such subsystems are logging and monitoring support, database access, caches, authentication and authorization support, or connection pools. We found that it is easier to refer to these sub-systems as services, which is what we are going to use in the remainder of this tutorial.

What is the service pattern

The service pattern defines a particular module structure for implementing services. Before giving a formal definition of this structure, I'm going to demonstrate it on a simple example.

A simple logging service

Logging is a central functionality in many applications. It is a must for web-application servers, and it provides a principled means to implement console output at different verbosity levels for command-line tools. Interestingly, logging is a functionality for which most applications require at least two different implementations. During execution all code should log to the same file or socket according to the verbosity level, while during testing we would like each test to have its own logger and we would like to verify that no warnings or errors are logged. We can achieve this polymorphicity as follows. We first introduce an abstract interface for a logger.

-- | This is an abstract interface for a simple logger. It is intended to be
-- imported qualified as follows.
--
-- > import qualified Acme.System.Logger as Logger
--
module Acme.System.Logger
  ( -- * Abstract handles
    Handle(..)

    -- * Priorities and convenience functions
  , Priority(..)
    -- | The following functions simplify logging at a fixed priority level.
  , logDebug
  , logInfo
  , logWarning
  , logError
  ) where

import qualified Data.Text as T

data Priority
    = Debug    -- ^ Debug messages
    | Info     -- ^ Notable information that requires no immediate action.
    | Warning  -- ^ Something is probably wrong, and we should investigate.
    | Error    -- ^ Something is wrong and immediate action is required.
    deriving (Eq, Ord, Show)

newtype Handle = Handle
    { log :: Priority -> T.Text -> IO () }

logDebug, logInfo, logWarning, logError :: Handle -> T.Text -> IO ()
logDebug   = (`log` Debug)
logInfo    = (`log` Info)
logWarning = (`log` Warning)
logError   = (`log` Error)

Functions that require access to logging can now be written against this abstract interface. For example, a naive web-crawler could have the signature.

import qualified Acme.System.Logger as Logger

collectLinksBelow :: Logger.Handle -> T.Text -> IO [T.Text]

We can write concrete implementations of this logger are as follows.

-- | A logger implementation that logs all messages to a 'System.IO.Handle'.
module Acme.System.Logger.Impl.FileHandle
  ( newHandle
  ) where

import qualified Acme.System.Logger      as Logger
import           Data.Monoid             (<>)
import           Control.Concurrent.MVar (newMVar, withMVar)
import qualified System.IO

-- | Create a new 'Logger.Handle' that logs to a 'System.IO.Handle'.
newHandle :: System.IO.Handle -> IO Logger.Handle
newHandle fileH = do
    -- We use a mutex to make our logger thread-safe.
    -- (Note that we should take this mutex as an argument for maximal
    -- compositionality.)
    mutex <- newMVar ()

    return $ Logger.Handle
      { Logger.log = \prio msg ->
            withMVar mutex $ \() ->
                hPutStrLn fileH $ show prio <> ": " <> T.unpack msg
      }

-- | A logger implementation that stores all messages in an `IORef` for latter
-- inspection by a test-harness.
module Acme.System.Logger.Impl.Pure
  ( withHandle
  ) where

import qualified Acme.System.Logger as Logger

import           Data.IORef (newIORef, atomicModifyIORef, readIORef)


-- | Create a 'Logger.Handle' records all messages and returns them once the
-- inner 'IO' action terminates.
withHandle :: (Logger.Handle -> IO a) -> (a, [(Priority, T.Text)])
withHandle io = do
    -- We use a shared 'IORef' to records all log messages in reverse order.
    msgsRef <- newIORef []
    -- Call inner IO action
    x <- io $ Handle
      { Logger.log = \prio msg ->
            -- We don't need the strict variant here, as we are just
            -- accumulating constructors.
            atomicModifyIORef msgsRef $ \msgs -> ((prio, msg) : msgs, ())
      }
    -- return the logged messages
    msgs <- readIORef msgsRef
    return (x, reverse msgs)

Formal definition of the service pattern

The service pattern is built around the two concepts of service specifications and service implementations. A service specification for a service X is a module of the following form.

-- | Specification of the X service. This module is intended to be imported
-- qualified as follows.
--
-- > import qualified YourModulePrefix.X as X
--
module YourModulePrefix.X
  (
    -- * Abstract handle
    Handle(..)

    -- * Pure types
    -- Here you export types that are necessary to interact with service X,
    -- but are too specific to fit into their own module, e.g., the priorities
    -- above in the logger service.

    -- * Derived functions
    -- Here you export derived functions that work for any implementation of
    -- service X.
    -- They are typically of the shape `fun :: Handle -> params -> IO result`

  ) where

data Handle = Handle
    { function1 :: !(param1_1 -> ... -> param1_N1 -> IO result1)
      ...
    , functionM :: !(paramM_1 -> ... -> paramM_NM -> IO resultM)
    }

Note that this is analogous to manually constructing type-class dictionaries. See the Section "Discussion of the service pattern" for a discussion on why we prefer this formulation to one based on type-classes.

A service implementation for a service X is a module of the following form.

-- | Implementation of the X service that uses ...
module YourModulePrefix.X.Impl.DescriptiveImplName
  ( Config(..)
  , withHandle
  ) where

-- Import specifications for services that we require for this implementation.
import qualified YourModulePrefix.Service1 as Service1
...
import qualified YourModulePrefix.ServiceN as ServiceN

-- Import specification that we are implementing.
import qualified YourModulePrefix.X        as X


-- | Pure representation of the configuration of your service implementation.
-- This type provides both named parameters for the 'withHandle' function and
-- a representation that can be easily serialized or parsed from an external
-- config file.
data Config = Config
    { cParam1 :: !t1
      ...
    , cParamM :: !tM
    }
    deriving (Eq, Show)

-- | Run an 'IO' action with access to an 'X.Handle'.
withHandle
    :: Config             -- ^ Configuration
    -> Service1.Handle    -- ^ Service dependency 1 (e.g. a logger)
    -> ...
    -> ServiceN.Handle    -- ^ Service dependency N
    -> (X.Handle -> IO a)
    -> IO a
withHandle config reqService1H ... reqServiceNH = do
    -- allocate implementation resources (e.g., a mutex)
    -- construct 'X.Handle'
    -- run inner action while making sure that we deallocate our resources
    --   both on exceptions and normal termination

-- For complicated services it often helps to define an implementation handle
-- as follows. It might even be that you have whole module structure below
-- 'YourModulePrefix.X.Impl.DescriptiveImplName' that you use to implement
-- service 'X'.
data IHandle = IHandle
    { ihConfig        :: !Config
    , ihService1      :: !Service1.Handle
      ...
    , ihServiceN      :: !ServiceN.Handle
    , ihImplResource1 :: !t1
      ...
    , ihImplResource1 :: !tM
    }

-- | Implementation of 'X.Handle.function1'.
function1 :: IHandle -> param1_1 -> ... -> param1_N -> IO result1

-- ... remaining service function implementations ...

These two definitions are quite long. To understand the definitions, it helps to compare them to the logger example given earlier. I also invite you to ponder how you could apply this pattern in your codebase. Sadly, I cannot point to the two codebases where I used this pattern, as both of them are closed-source.

Discussion of the service pattern

The key benefits gained from the service pattern are modularity and increased compositionality. We can use this pattern to structure our application purely as a composition of services. In fact the main function of the Better web-application, looks roughly as follows.

import Better.Http.Impl.Snap             as Http.Impl.Snap

import Better.Storage.Impl.Postgres      as Storage.Impl.Postgres

import Better.System.Monitor.Impl.Ekg    as Monitor.Impl.Ekg
import Better.System.Logger.Impl.Default as Logger.Impl.Default

-- NOTE that Postgres and LRUCache are services where we combine the
-- specification and the implementation, as we currently have only one
-- implementation. We can easily abstract later, once we need additional
-- implementations.
import Better.System.Postgres            as Postgres
import Better.System.LRUCache            as LRUCache

-- We use https://hackage.haskell.org/package/managed to avoid the
-- right-indent drift from the 'withXXX' functions.
import Control.Monad.Managed (runManaged, managed)

-- | Start the Better web-service.
main :: IO ()
main = do
    -- get name of config file from arguments
    configFile <- parseArguments

    -- parse service implementation configurations from the file
    monitorC  <- parseMonitorC configFile
    loggerC   <- parseLoggerC  configFile
    postgresC <- parsePostresC configFile
    -- ... more config parsing

    -- construct service handles and run until the process is killed
    runManged $ do
        monitorH  <- managed $ Monitor.Impl.Ekg.withHandle monitorC
        loggerH   <- managed $ Logger.Impl.Default.withHandle loggerC monitorH
        postgresH <- managed $ Postgres.withHandle postgresC monitorH loggerH
        lruCacheH <- managed $ LRUCache.withHandle lruCacheC monitorH

        -- Our storage service provides all functionality related to
        -- persistence. Its API is quite large. Our production implementation
        -- of this service uses PostgreSQL.
        storageH  <- managed $ Storage.Impl.Postgres storageC lruCacheC loggerH postgresH

        -- ... more services for implementing our business logic and in the end
        -- ... our snap handlers

        snapH <- managed $ Http.Impl.Snap.withHandle snapC monitorH loggerH handlers

        -- wait until the the process is killed
        forever $ threadDelay 100000

Note that with the service pattern we can push all implementation decisions to the main function, i.e., the root of our dependency tree. All service implementations themselves can be implemented completely independent of each other. Moreover, services implementations can make all their dependencies explicit, which enables us to mock them appropriately during testing. It also means that more of our code becomes platform independent, as we can switch dependencies easily.

In my opinion, there are two non-obvious design decisions in the above definition of the service pattern. First, it does not abstract over the concrete monad, but uses IO pervasively. Second, it does not use type-classes to specify abstract Handles. We discuss the rationale for these two decision in the following two sections.

Relying on just IO

The main reason why the functions in a service specification just use IO result instead of custom monad stacks is that this composes better. Monad stacks fix the order of arguments and produce friction at abstraction boundaries due to the necessary transformations between different monad stacks. We found it to be more pleasant to just pass around the handles to the necessary services as parameters (possibly grouped in a record). This way all dependencies are explicit and function application is all we need pass around handles.

On a related note, we found at Better that adding effects to IO is often unnecessary, as IO is powerful enough on its own. The typical use was just a MaybeT or EitherT to track exceptions explicitly in the types. On service boundaries, we typically translated these exceptions to a descriptive log-message and a simple Nothing.

There can be made a case for abstracting Handles over the the monad used to compute results, i.e., as data Handle m = Handle { fun1 :: ... -> m result }. This can be useful for building pure models of an application for testing. We found that is often sufficient to build semi-pure implementations that are based on hidden IORefs.

Why not use type classes to define handles?

The short answer is that we do not use type classes because we would neither gain further functionality nor code that is easier to reason about. The long answer is the following.

We see two options on how one can use type classes to manage service dependencies. The first one is to attach service specifications to the monad in which we compute the results. For example, we could define the following two type classes to model monads that support logging and database access.

class MonadLogger m where
    log :: Priority -> T.Text -> m ()

class MonadDB m  where
    runTransaction :: Transaction a -> m a

The service dependencies of any function are then captured by corresponding type class constraints, as shown in the following example.

someBusinessLogic
    :: (MonadLogger m, MonadDB m)
    => param1 -> ... -> paramN -> m result
someBusinessLogic ... = do
    mbResult <- runTransaction someTransaction
    case mbResult of
      Just result -> return result
      Nothing     -> do
        log Warning "falling back to default"
        return defaultResult

The main problem that we see with this approach is that it strongly favors the situation where all service dependencies are satisfied by a single instance. This situation is however rather seldom, as for example configuring using different loggers for different service instances is common (e.g., to set different verbosity levels). One can handle this case using mtl-style transformer stacks, but this requires more effort compared to just passing the handles explicitly. Moreover, the explicit handles also work in the case where we need to abstract over some pure functions, e.g., how to render a link to an HTTP API call on the backend. In this case, the monad typeclasses introduce unnecessary sequencing.

The second option to use type classes is to use functions of the form fun1 :: X.Handle h => h -> param1 -> ... -> paramN -> IO result. This would allow a single handle to implement multiple services. However, we are not sure where this would be beneficial. A service's dependencies should always be kept private; and in a service implementation we can just store the handles of service dependencies as fields in the implementation handle. In general, we found that being explicit about dependencies simplified reasoning about our service implementations and did not negatively impact refactoring.

How does this relate to object-oriented programming?

The implementation strategy used in the service pattern corresponds closely to how interfaces and classes work in OOP. Our service specifications declare both an interface at the type level, and a value representation for the interface at runtime.

In terms of memory consumption, our interfaces are heavier than typical class instances in an OOP language. We store at least one pointer to the instance-data per closure, whereas OOP implementations typically store a single pointer to the virtual method table together with the other instance-data.

In terms of call overhead, we pay the cost of an unknown function call, which is probably a bit slower than a virtual method invocation in an OOP langauge like Java. If this becomes a performance bottleneck, we will have to avoid the abstraction and specialize at compile time. Backpack will allow us to do this in a principled fashion without losing modularity.

Compared to OOP languages, our interfaces are more powerful, as they have a first order representation. Their functions can be mixed and matched freely, and we are not constrained to explain extension via inheritance.

Conclusion

This tutorial documents a pattern that I found to apply pervasively in the codebases I worked on. The description of the pattern is a bit long, but I invite you to give it a shot in your codebase. I am looking forward to hearing how it worked out. Happy coding :-)

The Service Pattern

Sections