Service Fabric

Today at Build, Microsoft detailed Service Fabric, which was previously known as Windows Fabric. I’ve taken to just calling it Fabric.

At its core, Fabric is a framework for hosting services. It handles high availability, service discovery, partitioning, zero-downtime upgrades, monitoring, load balancing, failure detection, and data replication. It has a bunch of other features, too, but we will focus on the major ones.

Most services are stateless, simply needing to be hosted in a highly available, discoverable manner. Fabric allows you to deploy packages to it under a URI of your choosing (fabric:/mystuff/demo). It will keep your service running on available nodes and help you to look up your service’s endpoints.

When you need to upgrade your service, you deploy a newer version of the package and Fabric will maintain your service’s availability during the upgrade. If the upgrade goes south (for example, because of a programming error), then Fabric will automatically roll back the upgrade for you and let you try again with a fixed version. All the while, your service remains available.
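To make the upgrade-and-rollback idea concrete, here is a toy sketch in Python. It is not Fabric's API: Node, rolling_upgrade, and the health_check callback are names I made up for illustration, and I'm assuming the simplest possible policy (upgrade one node at a time, undo everything on the first failed health check).

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for a machine hosting one copy of the service."""
    name: str
    version: str


def rolling_upgrade(nodes: List[Node], new_version: str,
                    health_check: Callable[[Node], bool]) -> bool:
    """Upgrade nodes one at a time; roll back on the first failed health check.

    Returns True if every node ends up on new_version, False if rolled back.
    """
    previous = {node.name: node.version for node in nodes}
    upgraded: List[Node] = []
    for node in nodes:
        node.version = new_version
        upgraded.append(node)
        if not health_check(node):
            # Restore only the nodes we touched; the rest never left the old version.
            for touched in upgraded:
                touched.version = previous[touched.name]
            return False
    return True
```

Because only one node changes at a time, the remaining nodes keep serving requests, which is the property that keeps the service available throughout.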

One of the hardest parts of writing a reliable stateful service is data replication. Fabric allows you to replicate a log of operations/events within a ring of replicas. Under the hood, this is implemented using distributed consensus. Primary replicas receive commands and replicate operations to secondaries. If the primary fails, an up-to-date secondary will be elected as the new primary. You provide the data, and Fabric handles distributed consensus. If you want to allow reads from secondaries, that is an option, too. Fabric supports both in-memory (volatile) and persisted stateful services. Fabric helps a great deal, but creating stateful services from scratch is not trivial. The recommended approach for most is to create services using either the Distributed Collections API or the Virtual Actor API.
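Here is a rough Python sketch of the primary/secondary flow described above. It is a toy, not Fabric's replicator and not a real consensus protocol: Replica, ReplicaSet, write, and fail_over are hypothetical names, everything runs in one process, and I'm assuming a write counts as committed once a majority of replicas hold it.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """Hypothetical replica holding an ordered log of operations."""
    name: str
    log: List[str] = field(default_factory=list)
    alive: bool = True


class ReplicaSet:
    def __init__(self, replicas: List[Replica]) -> None:
        self.replicas = replicas
        self.primary = replicas[0]

    def write(self, operation: str) -> bool:
        """Append to the primary, copy to live secondaries, report majority."""
        if not self.primary.alive:
            raise RuntimeError("no live primary; call fail_over() first")
        self.primary.log.append(operation)
        acks = 1  # the primary itself
        for replica in self.replicas:
            if replica is self.primary or not replica.alive:
                continue
            # Send whatever tail of the primary's log this secondary is missing.
            replica.log.extend(self.primary.log[len(replica.log):])
            acks += 1
        return acks > len(self.replicas) // 2

    def fail_over(self) -> Replica:
        """Promote the live replica with the longest (most up-to-date) log."""
        candidates = [r for r in self.replicas if r.alive]
        if not candidates:
            raise RuntimeError("no live replicas left")
        self.primary = max(candidates, key=lambda r: len(r.log))
        return self.primary
```

The interesting bit is fail_over: picking the replica with the longest log is a crude stand-in for "an up-to-date secondary", and a secondary that was offline simply catches up on the next write.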

The Distributed Collections API lets you create stateful services by composing replicated dictionaries and queues alongside your business logic. The Actor API is by far the quickest way to create stateful services, though. Based on the Virtual Actor model, first implemented in Project Orleans, it allows you to very easily create stateful, actor-based services.
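If the virtual actor idea is new to you, this small Python sketch shows the gist: an actor always exists logically, gets activated the first time you address it, and can be deactivated and later brought back with its state intact. CounterActor and ActorRuntime are made-up names for illustration, and the in-memory dict standing in for persisted state is my assumption; neither Fabric's nor Orleans' actual API looks like this.

```python
from typing import Dict


class CounterActor:
    """Hypothetical actor whose only state is a counter."""

    def __init__(self, actor_id: str, count: int = 0) -> None:
        self.actor_id = actor_id
        self.count = count

    def increment(self) -> int:
        self.count += 1
        return self.count


class ActorRuntime:
    """Toy runtime: activates actors on demand and saves state on deactivation."""

    def __init__(self) -> None:
        self._active: Dict[str, CounterActor] = {}
        self._saved: Dict[str, int] = {}  # stand-in for durable, replicated state

    def get(self, actor_id: str) -> CounterActor:
        actor = self._active.get(actor_id)
        if actor is None:
            # First use (or first use since deactivation): activate from saved state.
            actor = CounterActor(actor_id, self._saved.get(actor_id, 0))
            self._active[actor_id] = actor
        return actor

    def deactivate(self, actor_id: str) -> None:
        actor = self._active.pop(actor_id, None)
        if actor is not None:
            self._saved[actor_id] = actor.count
```

Callers never create or destroy actors explicitly; they just ask for one by ID, for example runtime.get("cart-42").increment().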

Service partitioning is used to scale Fabric services. There are a few kinds of partitions in Fabric: named (string), range (long), and singleton. Both stateless and stateful services can be partitioned.
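As a rough illustration of the three partition kinds, here is a Python sketch of how a client might map a key to a partition. The function names are hypothetical, and the even split of the integer range is my assumption for the example, not a statement about how Fabric divides ranges internally.

```python
from typing import Sequence

SINGLETON = "singleton"


def named_partition(name: str, known_names: Sequence[str]) -> str:
    """Named scheme: the key is the partition's name."""
    if name not in known_names:
        raise KeyError(f"unknown partition name: {name!r}")
    return name


def range_partition(key: int, low: int, high: int, count: int) -> int:
    """Range scheme: split [low, high] into `count` contiguous, near-equal
    ranges and return the zero-based index of the range containing `key`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    if not low <= key <= high:
        raise ValueError(f"key {key} is outside [{low}, {high}]")
    span = high - low + 1
    return (key - low) * count // span


def singleton_partition() -> str:
    """Singleton scheme: there is exactly one partition, so no key is needed."""
    return SINGLETON
```

With low=0, high=99, and count=4, keys 0 to 24 land in partition 0, 25 to 49 in partition 1, and so on.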

That’s Fabric in a nutshell, but there is a lot more to it. For example, the Naming Service that Fabric exposes is conceptually similar to ZooKeeper or etcd.

Oh, for the Orleans people: We host Orleans on Fabric, and you can gain most of the benefits of Fabric very easily. Of course, Orleans is OSS, so we can change it to use whatever functionality we like. I’ll post code for hosting Orleans services on Fabric if there’s demand. EDIT: There was demand, so the code is on GitHub:

Hit me up on Twitter (@ReubenBond) if you have any questions.