Saga: compensate for failures in a terse and composable way

Motivation for a Saga

Imagine you want to book a trip that includes a car, a hotel, and a flight. If any one of them cannot be booked, why would you bother going? If we built this in Scala, it would probably involve calling some external APIs. These APIs can go down, so what happens when one of them is unavailable?

Saga: example of a process which can fail, but takes failure into account

You could use recover to execute a compensating action (an action which reverses the side effect), but that only covers a single action. Of course, you could apply it to every action, but this gets messy because you have to keep track of all the compensating actions yourself.

How to describe that nicely?

To describe a program that can handle failure, you need to couple the outcome of a successful action to its compensating action.
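
One way to picture this coupling, using only the standard library, is the minimal sketch below. Each step carries a lazily evaluated action together with a compensation that is built from the action's result. The names Step, step, and runAll are hypothetical and are not part of goedverhaal, which works with an effect type F rather than plain thunks.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.control.NonFatal

// Hypothetical illustration; not the goedverhaal API.
object CoupledSteps {
  // Executing a step performs the action and hands back the matching undo.
  final class Step(val execute: () => (() => Unit))

  // Couples a lazily evaluated action with a compensation that sees its result.
  def step[A](action: => A)(compensate: A => Unit): Step =
    new Step(() => { val result = action; () => compensate(result) })

  // Runs the steps in order. On failure, the completed steps are compensated
  // in reverse order and the error is returned as a Left.
  def runAll(steps: List[Step]): Either[Throwable, Unit] = {
    var undo = List.empty[() => Unit]
    try {
      steps.foreach(s => undo = s.execute() :: undo)
      Right(())
    } catch {
      case NonFatal(e) =>
        undo.foreach(_.apply())
        Left(e)
    }
  }

  // Books a car and a hotel, then fails on the flight.
  def demo(): (Either[Throwable, Unit], List[String]) = {
    val log = ListBuffer.empty[String]
    val result = runAll(List(
      step { log += "book car"; "car-1" }(id => log += s"cancel $id"),
      step { log += "book hotel"; "hotel-1" }(id => log += s"cancel $id"),
      step[String] { throw new RuntimeException("flight API down") }(_ => ())
    ))
    (result, log.toList)
  }
}
```

Because the compensation receives the result of its own action (here a booking id), the caller never has to track which undo belongs to which step.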

A short example of a Saga program

import cats.effect.IO
import cats.implicits._
import cats.effect.concurrent.Ref
import goedverhaal._
import scala.util.control.NonFatal

def prg(ref: Ref[IO, Int]): Saga[IO, Unit] = for {
  _ <- Saga.recoverable(ref.tryUpdate(_ + 1))(_ => ref.tryUpdate(_ - 1) *> IO.unit).replicateA(500)
  _ <- Saga.recoverable(ref.tryUpdate(_ + 1))(_ => ref.tryUpdate(_ - 1) *> IO.unit).replicateA(500)
  _ <- Saga.nonRecoverable[IO, Nothing](IO.raiseError(new Throwable("Error")))
} yield ()

def main: IO[Int] = for {
  ref <- Ref.of[IO, Int](0)
  _ <- prg(ref).run.recoverWith { case NonFatal(_) => IO.unit }
  current <- ref.get
} yield current

The outcome of main will be zero, because prg fails at its last step. The first step increases the Ref[IO, Int] to 500 and the second adds another 500, but since the final step raises an error, the compensating actions roll the value back to 0.

The importance of lazy evaluation

The compensating action needs to be a description of an action. A description means that it is not executed immediately (lazy evaluation), even though it may perform a side effect once it runs. In functional programming this is called a computation. The opposite of lazy evaluation is eager evaluation. Future and Try are examples of eager types.
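
The difference can be seen with nothing but the standard library. The sketch below uses a plain thunk (() => Int) as a stand-in for a lazy description. This is an assumption made for illustration, since the article itself relies on IO for that role. The object and value names are hypothetical.

```scala
import scala.util.Try

// Hypothetical illustration of eager versus lazy evaluation.
object LazyVsEager {
  def demo(): (Int, Int, Int) = {
    var bookings = 0

    // Eager: the body of Try runs right here, so the side effect has already happened.
    val eager: Try[Int] = Try { bookings += 1; bookings }
    val afterEager = bookings

    // Lazy: the thunk only describes the side effect; nothing happens yet.
    val described: () => Int = () => { bookings += 1; bookings }
    val afterDescribing = bookings

    // The description runs only when it is explicitly invoked.
    described()
    val afterRunning = bookings

    (afterEager, afterDescribing, afterRunning)
  }
}
```

A compensating action built with an eager type would run as soon as it is constructed, before we even know whether a rollback is needed.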

Couple success with compensation

In my Saga, the signature of the recoverable combinator is defined as:

def recoverable[F[_] : Sync, A](comp: F[A])(rollback: A => F[Unit]): Saga[F, A]

The Sync type class constraint on F[_] enforces a type that supports lazy evaluation, which is what we need for our description of a Saga. The function itself takes two arguments: comp (short for computation), which is the do action, and rollback, which uses the outcome of the do action to construct a rollback/compensating action.

Saga, a specialized Free Monad

As you can see, it returns a Saga[F, A]. A Saga itself is a description of several computations. In fact, it's a slightly altered variant of a Free Monad:

case class Pure[F[_], A](action: A) extends Saga[F, A]
case class Next[F[_], A](action: F[A], compensate: A => F[Unit]) extends Saga[F, A]
case class Bind[F[_], A, B](fa: Saga[F, A], f: A => Saga[F, B]) extends Saga[F, B]

Pure and Bind describe operations that you'll find on a Monad as well. The Next case, however, does not: it stores the parameters of the recoverable method as-is for later evaluation.

This data is interpreted by the decide method on Saga, which looks like this:

def decide[B](f: (A, List[F[Unit]]) => F[B]): F[B]

It folds the description of computations, as captured in the Saga data type, into an F[B]. If anything fails (detected via Sync.onError), it executes the compensating actions accumulated so far. If it succeeds, it executes the f: (A, List[F[Unit]]) => F[B] function. This function lets you decide what to do with the outcome of the computation, which is useful when you work with an EitherT or OptionT: the outcome may be None or Left, and in that case you might want to roll back all the actions.

You can also use the run variant on Saga, which is built on decide:

def run: F[A] = decide { case (a, _) => F.pure(a) }
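
To make the fold more tangible, here is a minimal, standard-library-only sketch of the same idea. It is not the library's implementation: MiniSaga and its helpers are hypothetical, plain thunks replace F[_], errors are ordinary exceptions instead of Sync.onError, and the interpreter is not stack-safe. It only shows how the three cases can be interpreted while compensations accumulate, and how decide can roll back on a successful but empty outcome.

```scala
import scala.collection.mutable.ListBuffer
import scala.util.control.NonFatal

// Hypothetical, simplified model; not the goedverhaal implementation.
object MiniSagaSketch {
  sealed trait MiniSaga[A] {
    def flatMap[B](f: A => MiniSaga[B]): MiniSaga[B] = Bind[A, B](this, f)
    def map[B](f: A => B): MiniSaga[B] = Bind[A, B](this, a => Pure(f(a)))
  }
  final case class Pure[A](value: A) extends MiniSaga[A]
  final case class Next[A](action: () => A, compensate: A => Unit) extends MiniSaga[A]
  final case class Bind[A, B](fa: MiniSaga[A], f: A => MiniSaga[B]) extends MiniSaga[B]

  // Stores the action and its compensation as data; nothing runs yet.
  def recoverable[A](action: => A)(compensate: A => Unit): MiniSaga[A] =
    Next(() => action, compensate)

  // Folds the description. If a step throws, the accumulated compensations run
  // (most recent first) and the error is rethrown. Otherwise f decides what to
  // do with the outcome and the compensations collected so far.
  def decide[A, B](saga: MiniSaga[A])(f: (A, List[() => Unit]) => B): B = {
    var undo = List.empty[() => Unit]
    def go[X](s: MiniSaga[X]): X = s match {
      case Pure(value) => value
      case Next(action, compensate) =>
        val result = action()
        undo = (() => compensate(result)) :: undo
        result
      case Bind(fa, k) => go(k(go(fa)))
    }
    val outcome =
      try go(saga)
      catch {
        case NonFatal(e) =>
          undo.foreach(_.apply())
          throw e
      }
    f(outcome, undo)
  }

  // The run variant: keep the outcome and ignore the compensations.
  def run[A](saga: MiniSaga[A]): A = decide(saga)((a, _) => a)

  // The flight lookup succeeds but yields None, so decide rolls back the car.
  def demo(): (Option[String], List[String]) = {
    val log = ListBuffer.empty[String]
    val trip: MiniSaga[Option[String]] = for {
      car    <- recoverable { log += "book car"; "car-1" }(id => log += s"cancel $id")
      flight <- recoverable { log += "no seats"; Option.empty[String] }(_ => ())
    } yield flight.map(f => s"$car+$f")

    val result = decide(trip) { (outcome, undo) =>
      if (outcome.isEmpty) undo.foreach(_.apply())
      outcome
    }
    (result, log.toList)
  }
}
```

Handing the compensations to the callback is what allows a None or Left outcome to be treated as a failure, even though no error was raised.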


As you can see, Saga is a useful tool when interacting with multiple APIs that cross an asynchronous boundary and might not offer transactional guarantees. It might not be the best solution, but in a lot of cases you don't have a better choice, I guess (welcome to the microservice/API era)!

If you want a closer look at how that's done, or have feedback, have a look at the source code on GitHub.

Happy hacking!

Created by

Mark de Jong

Software Creator