The Service Pattern
This tutorial describes a design pattern
that I discovered when working at Better.
It explains how to structure
the IO
-layer of a Haskell application as a
Service-oriented architecture.
By IO
-layer,
I mean the set of all functions that can execute IO
actions.
At Better,
we used this pattern extensively
to modularize the IO
-layer of our application.
Acknowledgements: thanks to Jasper Van der Jeugt for proof reading.
When to use the service pattern
The service pattern allows you to modularize the use of subsystems
that are shared between different parts of the IO
-layer of your
application.
Typical examples of such subsystems are
logging and monitoring support,
database access,
caches,
authentication and authorization support,
or
connection pools.
We found that it is easier to refer to these sub-systems as services,
which is what we are going to use in the remainder of this tutorial.
What is the service pattern
The service pattern defines a particular module structure for implementing services. Before giving a formal definition of this structure, I'm going to demonstrate it on a simple example.
A simple logging service
Logging is a central functionality in many applications. It is a must for web-application servers, and it provides a principled means to implement console output at different verbosity levels for command-line tools. Interestingly, logging is a functionality for which most applications require at least two different implementations. During execution all code should log to the same file or socket according to the verbosity level, while during testing we would like each test to have its own logger and we would like to verify that no warnings or errors are logged. We can achieve this polymorphicity as follows. We first introduce an abstract interface for a logger.
-- | This is an abstract interface for a simple logger. It is intended to be
-- imported qualified as follows.
--
-- > import qualified Acme.System.Logger as Logger
--
module Acme.System.Logger
( -- * Abstract handles
Handle(..)
-- * Priorities and convenience functions
, Priority(..)
-- | The following functions simplify logging at a fixed priority level.
, logDebug
, logInfo
, logWarning
, logError
) where
import qualified Data.Text as T
data Priority
= Debug -- ^ Debug messages
| Info -- ^ Notable information that requires no immediate action.
| Warning -- ^ Something is probably wrong, and we should investigate.
| Error -- ^ Something is wrong and immediate action is required.
deriving (Eq, Ord, Show)
newtype Handle = Handle
{ log :: Priority -> T.Text -> IO () }
logDebug, logInfo, logWarning, logError :: Handle -> T.Text -> IO ()
logDebug = (`log` Debug)
logInfo = (`log` Info)
logWarning = (`log` Warning)
logError = (`log` Error)
Functions that require access to logging can now be written against this abstract interface. For example, a naive web-crawler could have the signature.
import qualified Acme.System.Logger as Logger
collectLinksBelow :: Logger.Handle -> T.Text -> IO [T.Text]
We can write concrete implementations of this logger are as follows.
-- | A logger implementation that logs all messages to a 'System.IO.Handle'.
module Acme.System.Logger.Impl.FileHandle
( newHandle
) where
import qualified Acme.System.Logger as Logger
import Data.Monoid (<>)
import Control.Concurrent.MVar (newMVar, withMVar)
import qualified System.IO
-- | Create a new 'Logger.Handle' that logs to a 'System.IO.Handle'.
newHandle :: System.IO.Handle -> IO Logger.Handle
newHandle fileH = do
-- We use a mutex to make our logger thread-safe.
-- (Note that we should take this mutex as an argument for maximal
-- compositionality.)
mutex <- newMVar ()
return $ Logger.Handle
{ Logger.log = \prio msg ->
withMVar mutex $ \() ->
hPutStrLn fileH $ show prio <> ": " <> T.unpack msg
}
-- | A logger implementation that stores all messages in an `IORef` for latter
-- inspection by a test-harness.
module Acme.System.Logger.Impl.Pure
( withHandle
) where
import qualified Acme.System.Logger as Logger
import Data.IORef (newIORef, atomicModifyIORef, readIORef)
-- | Create a 'Logger.Handle' records all messages and returns them once the
-- inner 'IO' action terminates.
withHandle :: (Logger.Handle -> IO a) -> (a, [(Priority, T.Text)])
withHandle io = do
-- We use a shared 'IORef' to records all log messages in reverse order.
msgsRef <- newIORef []
-- Call inner IO action
x <- io $ Handle
{ Logger.log = \prio msg ->
-- We don't need the strict variant here, as we are just
-- accumulating constructors.
atomicModifyIORef msgsRef $ \msgs -> ((prio, msg) : msgs, ())
}
-- return the logged messages
msgs <- readIORef msgsRef
return (x, reverse msgs)
Formal definition of the service pattern
The service pattern is built around the two concepts of
service specifications and service implementations.
A service specification for a service X
is a module of the following form.
-- | Specification of the X service. This module is intended to be imported
-- qualified as follows.
--
-- > import qualified YourModulePrefix.X as X
--
module YourModulePrefix.X
(
-- * Abstract handle
Handle(..)
-- * Pure types
-- Here you export types that are necessary to interact with service X,
-- but are too specific to fit into their own module, e.g., the priorities
-- above in the logger service.
-- * Derived functions
-- Here you export derived functions that work for any implementation of
-- service X.
-- They are typically of the shape `fun :: Handle -> params -> IO result`
) where
data Handle = Handle
{ function1 :: !(param1_1 -> ... -> param1_N1 -> IO result1)
...
, functionM :: !(paramM_1 -> ... -> paramM_NM -> IO resultM)
}
Note that this is analogous to manually constructing type-class dictionaries. See the Section "Discussion of the service pattern" for a discussion on why we prefer this formulation to one based on type-classes.
A service implementation for a service X
is a module of the following form.
-- | Implementation of the X service that uses ...
module YourModulePrefix.X.Impl.DescriptiveImplName
( Config(..)
, withHandle
) where
-- Import specifications for services that we require for this implementation.
import qualified YourModulePrefix.Service1 as Service1
...
import qualified YourModulePrefix.ServiceN as ServiceN
-- Import specification that we are implementing.
import qualified YourModulePrefix.X as X
-- | Pure representation of the configuration of your service implementation.
-- This type provides both named parameters for the 'withHandle' function and
-- a representation that can be easily serialized or parsed from an external
-- config file.
data Config = Config
{ cParam1 :: !t1
...
, cParamM :: !tM
}
deriving (Eq, Show)
-- | Run an 'IO' action with access to an 'X.Handle'.
withHandle
:: Config -- ^ Configuration
-> Service1.Handle -- ^ Service dependency 1 (e.g. a logger)
-> ...
-> ServiceN.Handle -- ^ Service dependency N
-> (X.Handle -> IO a)
-> IO a
withHandle config reqService1H ... reqServiceNH = do
-- allocate implementation resources (e.g., a mutex)
-- construct 'X.Handle'
-- run inner action while making sure that we deallocate our resources
-- both on exceptions and normal termination
-- For complicated services it often helps to define an implementation handle
-- as follows. It might even be that you have whole module structure below
-- 'YourModulePrefix.X.Impl.DescriptiveImplName' that you use to implement
-- service 'X'.
data IHandle = IHandle
{ ihConfig :: !Config
, ihService1 :: !Service1.Handle
...
, ihServiceN :: !ServiceN.Handle
, ihImplResource1 :: !t1
...
, ihImplResource1 :: !tM
}
-- | Implementation of 'X.Handle.function1'.
function1 :: IHandle -> param1_1 -> ... -> param1_N -> IO result1
-- ... remaining service function implementations ...
These two definitions are quite long. To understand the definitions, it helps to compare them to the logger example given earlier. I also invite you to ponder how you could apply this pattern in your codebase. Sadly, I cannot point to the two codebases where I used this pattern, as both of them are closed-source.
Discussion of the service pattern
The key benefits gained from the service pattern are modularity and increased compositionality. We can use this pattern to structure our application purely as a composition of services. In fact the main function of the Better web-application, looks roughly as follows.
import Better.Http.Impl.Snap as Http.Impl.Snap
import Better.Storage.Impl.Postgres as Storage.Impl.Postgres
import Better.System.Monitor.Impl.Ekg as Monitor.Impl.Ekg
import Better.System.Logger.Impl.Default as Logger.Impl.Default
-- NOTE that Postgres and LRUCache are services where we combine the
-- specification and the implementation, as we currently have only one
-- implementation. We can easily abstract later, once we need additional
-- implementations.
import Better.System.Postgres as Postgres
import Better.System.LRUCache as LRUCache
-- We use https://hackage.haskell.org/package/managed to avoid the
-- right-indent drift from the 'withXXX' functions.
import Control.Monad.Managed (runManaged, managed)
-- | Start the Better web-service.
main :: IO ()
main = do
-- get name of config file from arguments
configFile <- parseArguments
-- parse service implementation configurations from the file
monitorC <- parseMonitorC configFile
loggerC <- parseLoggerC configFile
postgresC <- parsePostresC configFile
-- ... more config parsing
-- construct service handles and run until the process is killed
runManged $ do
monitorH <- managed $ Monitor.Impl.Ekg.withHandle monitorC
loggerH <- managed $ Logger.Impl.Default.withHandle loggerC monitorH
postgresH <- managed $ Postgres.withHandle postgresC monitorH loggerH
lruCacheH <- managed $ LRUCache.withHandle lruCacheC monitorH
-- Our storage service provides all functionality related to
-- persistence. Its API is quite large. Our production implementation
-- of this service uses PostgreSQL.
storageH <- managed $ Storage.Impl.Postgres storageC lruCacheC loggerH postgresH
-- ... more services for implementing our business logic and in the end
-- ... our snap handlers
snapH <- managed $ Http.Impl.Snap.withHandle snapC monitorH loggerH handlers
-- wait until the the process is killed
forever $ threadDelay 100000
Note that with the service pattern we can push all implementation decisions
to the main
function, i.e., the root of our dependency tree.
All service implementations themselves can be implemented completely
independent of each other.
Moreover,
services implementations can make all their dependencies explicit,
which enables us to mock them appropriately during testing.
It also means that more of our code becomes platform independent,
as we can switch dependencies easily.
In my opinion,
there are two non-obvious design decisions in the above definition of
the service pattern.
First,
it does not abstract over the concrete monad, but uses IO
pervasively.
Second,
it does not use type-classes to specify abstract Handle
s.
We discuss the rationale for these two decision in the following two sections.
Relying on just IO
The main reason why the functions in a service specification just use
IO result
instead of custom monad stacks is that this composes better.
Monad stacks fix the order of arguments and produce friction
at abstraction boundaries due to the necessary transformations between
different monad stacks.
We found it to be more pleasant to just pass around the handles to the
necessary services as parameters (possibly grouped in a record).
This way all dependencies are explicit
and function application is all we need pass around handles.
On a related note,
we found at Better that adding effects to IO
is often unnecessary,
as IO
is powerful enough on its own.
The typical use was just a MaybeT
or EitherT
to track exceptions
explicitly in the types.
On service boundaries,
we typically translated these exceptions to a descriptive log-message and
a simple Nothing
.
There can be made a case for abstracting Handle
s over the the monad used to
compute results, i.e., as
data Handle m = Handle { fun1 :: ... -> m result }
.
This can be useful for building pure models of an application for testing.
We found that is often sufficient to build semi-pure implementations
that are based on hidden IORef
s.
Why not use type classes to define handles?
The short answer is that we do not use type classes because we would neither gain further functionality nor code that is easier to reason about. The long answer is the following.
We see two options on how one can use type classes to manage service dependencies. The first one is to attach service specifications to the monad in which we compute the results. For example, we could define the following two type classes to model monads that support logging and database access.
class MonadLogger m where
log :: Priority -> T.Text -> m ()
class MonadDB m where
runTransaction :: Transaction a -> m a
The service dependencies of any function are then captured by corresponding type class constraints, as shown in the following example.
someBusinessLogic
:: (MonadLogger m, MonadDB m)
=> param1 -> ... -> paramN -> m result
someBusinessLogic ... = do
mbResult <- runTransaction someTransaction
case mbResult of
Just result -> return result
Nothing -> do
log Warning "falling back to default"
return defaultResult
The main problem that we see with this approach is that it strongly favors the situation where all service dependencies are satisfied by a single instance. This situation is however rather seldom, as for example configuring using different loggers for different service instances is common (e.g., to set different verbosity levels). One can handle this case using mtl-style transformer stacks, but this requires more effort compared to just passing the handles explicitly. Moreover, the explicit handles also work in the case where we need to abstract over some pure functions, e.g., how to render a link to an HTTP API call on the backend. In this case, the monad typeclasses introduce unnecessary sequencing.
The second option to use type classes is to use
functions of the form
fun1 :: X.Handle h => h -> param1 -> ... -> paramN -> IO result
.
This would allow a single handle to implement multiple services.
However,
we are not sure where this would be beneficial.
A service's dependencies should always be kept private;
and in a service implementation we can just store the handles of service
dependencies as fields in the implementation handle.
In general,
we found that being explicit about dependencies simplified reasoning about
our service implementations and did not negatively impact refactoring.
How does this relate to object-oriented programming?
The implementation strategy used in the service pattern corresponds closely to how interfaces and classes work in OOP. Our service specifications declare both an interface at the type level, and a value representation for the interface at runtime.
In terms of memory consumption, our interfaces are heavier than typical class instances in an OOP language. We store at least one pointer to the instance-data per closure, whereas OOP implementations typically store a single pointer to the virtual method table together with the other instance-data.
In terms of call overhead, we pay the cost of an unknown function call, which is probably a bit slower than a virtual method invocation in an OOP langauge like Java. If this becomes a performance bottleneck, we will have to avoid the abstraction and specialize at compile time. Backpack will allow us to do this in a principled fashion without losing modularity.
Compared to OOP languages, our interfaces are more powerful, as they have a first order representation. Their functions can be mixed and matched freely, and we are not constrained to explain extension via inheritance.
Conclusion
This tutorial documents a pattern that I found to apply pervasively in the codebases I worked on. The description of the pattern is a bit long, but I invite you to give it a shot in your codebase. I am looking forward to hearing how it worked out. Happy coding :-)