The Hidden Costs of Microservices

Recently there has been much buzz around the concept of "microservices." Microservices take service-oriented architecture to the next level by dividing applications into even smaller sub-units, often along lines of business capabilities ¹. Proponents of microservices point to the ability to scale their applications more readily in response to demand, and to the encapsulation gained by placing each unit behind the barrier of an Application Programming Interface (API).

In the recent hype behind microservices, relatively little has been written regarding the costs associated with their implementation. Are microservices a justified architectural investment, or a costly, premature optimization? This article examines the costs of microservice architectures, specifically considering the increased surface area of their published interface and the realities of distributed computing.

Microservices and Maintenance

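It's probably a safe assumption that as a software developer, you’ve come across some form of "dependency hell" at least once. This may have involved an unexpected interface change in a library that you use, or incompatible versions of libraries. One of the more popular conventions for helping to mitigate dependency issues is Semantic Versioning, which uses the version number to signal whether a release breaks the published interface. Inside a single program, only a small part of each module needs to be treated as published. By putting your dependencies in services instead of a single software program, you’re effectively publishing your entire interface: every endpoint that another service calls becomes a contract that you have to version.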

Once a microservice makes an endpoint public, subsequent development must pay the cost of either supporting multiple versions of that service or deploying dependent services in sync. Microservices, in this sense, increase the maintenance cost of a software system by increasing the number of published API endpoints.
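To make the cost of "supporting multiple versions" concrete, here is a minimal sketch of one service keeping two versions of the same endpoint alive. The endpoint, the field names and the handler functions are all hypothetical; they are not taken from any real system.

```python
from typing import Callable, Dict


def get_balance_v1(account: dict) -> dict:
    # Hypothetical v1 response: the balance is exposed under "balance".
    return {"balance": account["credits"]}


def get_balance_v2(account: dict) -> dict:
    # Hypothetical v2 response: the field was renamed and a currency added,
    # which is a breaking change for clients written against v1.
    return {"credits": account["credits"], "currency": account["currency"]}


# Every published major version stays here until its last client has migrated.
HANDLERS: Dict[int, Callable[[dict], dict]] = {
    1: get_balance_v1,
    2: get_balance_v2,
}


def handle_balance_request(major_version: int, account: dict) -> dict:
    """Route a request to the handler for the major version the client asked for."""
    handler = HANDLERS.get(major_version)
    if handler is None:
        raise ValueError(f"unsupported API version: {major_version}")
    return handler(account)
```

Each entry in the table is code that has to be tested, documented and kept working, which is the maintenance cost described above.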

Practices such as integration testing, semantic versioning, and continuous delivery may minimize the costs incurred by increasing the size of the published API. Whether the remaining cost is worth the benefits of microservices in the first place is something that each software team has to assess for itself.
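As a rough illustration of how Semantic Versioning can be checked mechanically, the sketch below decides whether a service version should satisfy a client built against an earlier one. It assumes plain MAJOR.MINOR.PATCH strings, and it ignores pre-release tags and the special treatment of 0.x versions; the function names are my own.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse a plain MAJOR.MINOR.PATCH string (no pre-release or build tags)."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(required: str, offered: str) -> bool:
    """Return True if `offered` should satisfy a client built against `required`.

    Under Semantic Versioning, a release is backward compatible when the
    major version is unchanged and the release is not older than required.
    """
    req = parse_version(required)
    off = parse_version(offered)
    return off[0] == req[0] and off >= req
```

For example, `is_compatible("1.2.0", "1.4.1")` is true, while `is_compatible("1.2.0", "2.0.0")` is false because the major version changed.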

Microservices as Distributed Systems

Distributed systems are characterized by the use of message passing in order to communicate, as opposed to some other mechanism such as locks over shared memory. Thus, all microservices (and many other parts of web-based software systems) are inherently distributed systems. As distributed systems, microservices face many engineering challenges. Below, I will discuss how microservices are affected by complexities related to transactions, time, referential integrity and querying in a distributed environment.

Transactions in a Distributed Environment

Atomic transactions are one of the most useful abstractions available to the programmer, and they end up being utilized in most applications which rely on a relational database system. Many database systems, such as PostgreSQL, provide ACID semantics which allow us to rely on transactions without worrying about their implementation. Within a single PostgreSQL server, these guarantees come from mechanisms such as multiversion concurrency control and write-ahead logging. Once a transaction spans more than one database or service, a coordination protocol such as two-phase commit is needed to keep the commit atomic.

It’s easy to think of a scenario where you would miss having an already-implemented algorithm for two-phase commit. Let's imagine we're working on a startup implementing a streaming radio system.

In our hypothetical application, we need to implement a system where you can use payment credits to buy a song. Since we’ve nicely broken up our application into separate services for "account balance" and "purchased songs," we must debit the balance, which is one API call, and then record the purchase, which is another. In a consolidated database this is easily accomplished via a couple of SQL statements inside of a transaction. In contrast, with microservices you should be prepared to implement two-phase commit on your own, or you may have problems with data consistency across services: if the second call fails after the first succeeds, the user has paid for a song that they do not own.
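For comparison, here is a minimal sketch of the consolidated-database version, using Python's built-in sqlite3 module in place of a production database. The table and column names are assumptions made up for this example.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical balance and purchase tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE balances (user_id INTEGER PRIMARY KEY, credits INTEGER NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE purchases ("
        " user_id INTEGER NOT NULL,"
        " song_id INTEGER NOT NULL,"
        " UNIQUE (user_id, song_id))"
    )
    conn.execute("INSERT INTO balances (user_id, credits) VALUES (1, 10)")
    conn.commit()
    return conn


def buy_song(conn: sqlite3.Connection, user_id: int, song_id: int, price: int) -> bool:
    """Debit the balance and record the purchase atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes the block
            cur = conn.execute(
                "UPDATE balances SET credits = credits - ? "
                "WHERE user_id = ? AND credits >= ?",
                (price, user_id, price),
            )
            if cur.rowcount != 1:
                raise ValueError("insufficient credits or unknown user")
            conn.execute(
                "INSERT INTO purchases (user_id, song_id) VALUES (?, ?)",
                (user_id, song_id),
            )
        return True
    except (ValueError, sqlite3.Error):
        return False
```

If the insert fails (here, because the user already owns the song), the debit is rolled back with it. That rollback is exactly what has to be rebuilt by hand once the two statements become two API calls.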

The complexity involved in achieving transactional consistency in a distributed system should warn us strongly not to be overly aggressive in breaking systems into granular microservices. In general, as your application splits into a multitude of microservices, the probability of needing to perform two actions together (i.e., transactionally) across a service boundary increases, so be prepared to run into this situation again as your application evolves. You may not realize where you need inter-service consistency until long after your system has been released to production, when it is relatively costly to change architecture in significant ways. That should be a strong warning against advocating the construction of microservices early in development.
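To give a sense of what implementing two-phase commit yourself involves, here is a deliberately simplified, in-process sketch. The `Participant` class and its methods are hypothetical stand-ins for remote services. A real implementation would also need durable logs, timeouts, and recovery when the coordinator or a participant crashes between phases, none of which is shown here.

```python
from typing import Dict, List, Optional, Tuple


class Participant:
    """A hypothetical service that can stage a change, then commit or abort it."""

    def __init__(self, name: str, state: Dict[str, int]) -> None:
        self.name = name
        self.state = dict(state)
        self._staged: Optional[Dict[str, int]] = None

    def prepare(self, changes: Dict[str, int]) -> bool:
        """Phase 1: vote yes only if no value would become negative."""
        proposed = dict(self.state)
        for key, delta in changes.items():
            proposed[key] = proposed.get(key, 0) + delta
            if proposed[key] < 0:
                return False  # vote "no"
        self._staged = proposed
        return True  # vote "yes"

    def commit(self) -> None:
        """Phase 2: make the staged state visible."""
        if self._staged is not None:
            self.state = self._staged
            self._staged = None

    def abort(self) -> None:
        """Phase 2, failure path: discard the staged state."""
        self._staged = None


def two_phase_commit(work: List[Tuple[Participant, Dict[str, int]]]) -> bool:
    """Ask every participant to prepare; commit only if all of them vote yes."""
    prepared: List[Participant] = []
    for participant, changes in work:
        if participant.prepare(changes):
            prepared.append(participant)
        else:
            for earlier in prepared:
                earlier.abort()
            return False
    for participant in prepared:
        participant.commit()
    return True
```

Even this toy version has more moving parts than the single-database transaction, and it leaves out the failure handling that makes the real protocol hard.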

Got the Time?

Dealing with distributed systems requires throwing away some of your most cherished assumptions, which are valid only when dealing with software running on a single machine with shared memory. For example, it’s not safe to assume that the clocks on different machines in a network agree. As Leslie Lamport said, "The concept of the temporal ordering of events pervades our thinking about systems. ... However, we will see that this concept must be carefully reexamined when considering events in a distributed system." ²

In most systems, at least for auditing or reporting purposes, it’s useful to know roughly when things happened. Depending on your application, there may be much more precise requirements for logical ordering of events, or for time synchronization. In distributed systems, it's crucial to think about how you'll achieve a good enough understanding of time across your cluster. You should read up on synchronization algorithms, logical ordering and clock synchronization to figure out which considerations are important for your particular application. Make sure your budget is sufficient to address the complex issue of time in your distributed system, both during initial development and during subsequent expansion of applications with microservice architectures.
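One classic technique for logical ordering is the logical clock described in Lamport's paper. The sketch below is a minimal version of that idea. The class and method names are my own, and it orders events only partially; it says nothing about wall-clock time.

```python
class LamportClock:
    """A logical clock: a counter that respects the happened-before relation."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance the clock for a local event and return the new timestamp."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Timestamp an outgoing message; sending counts as a local event."""
        return self.tick()

    def receive(self, message_time: int) -> int:
        """Move past both the local time and the sender's timestamp."""
        self.time = max(self.time, message_time) + 1
        return self.time
```

Each service keeps its own clock and attaches the result of `send()` to every outgoing message. A receiver that calls `receive()` with that value is guaranteed to stamp the receipt later than the send, whatever the machines' physical clocks say.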

Foreign Keys, Where Art Thou?

Foreign keys are another useful abstraction provided by database systems. Foreign keys are useful because they allow you to add constraints to make sure that the data in your system reflects real-world scenarios. Unfortunately, in applications engineered as microservices, the automatic enforcement of referential integrity is often a casualty of a drive to create separate services isolating business products ¹.

For example, in the streaming radio application I presented above, you may have a table of song purchases, each corresponding to a system user. It doesn’t make sense to have a song purchased by a non-user, and so database systems allow you to specify "referential integrity constraints" to ensure that this property holds true.
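Here is a minimal sketch of such a constraint, again using Python's built-in sqlite3 module as a stand-in for a production database. The `users` and `purchases` tables are hypothetical. Note that SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on for the connection.

```python
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory database where each purchase must reference a user."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE purchases ("
        " user_id INTEGER NOT NULL REFERENCES users (id),"
        " song_id INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO users (id) VALUES (1)")
    conn.commit()
    return conn


def record_purchase(conn: sqlite3.Connection, user_id: int, song_id: int) -> bool:
    """Insert a purchase; return False if the database rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO purchases (user_id, song_id) VALUES (?, ?)",
                (user_id, song_id),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when user_id does not match any row in users.
        return False
```

With this schema, `record_purchase(conn, 99, 42)` fails for a user id that does not exist, and no application code had to check for it.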

With services relying on isolated database schemas, the options are, at best, complicated. Common solutions include writing software to denormalize data, and incurring the risks associated with data duplication. Ultimately, it's just extremely hard to ensure data consistency once you give up the abstraction of foreign keys in a centralized database. In the end, referential integrity constraints are a simple-to-use abstraction that you may just have to let go in the move to microservices ². In this case, you should make a mental note that you’ll likely have to spend more time manually verifying that your programs preserve referential integrity, a property that you would otherwise get for free from the database system.
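As a sketch of what that manual verification can look like, the function below scans purchase records from one service for user ids that another service does not know about. The record shape and the function name are assumptions for illustration. A check like this only detects violations after the fact, whereas a foreign key prevents them.

```python
from typing import Dict, Iterable, List, Set


def find_orphaned_purchases(
    purchases: Iterable[Dict[str, int]], known_user_ids: Set[int]
) -> List[Dict[str, int]]:
    """Return the purchases whose user_id has no matching user record."""
    return [p for p in purchases if p["user_id"] not in known_user_ids]
```

In practice a job like this would have to fetch both data sets over the network and run on a schedule, which is part of the extra cost.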

Ultimately, the constraints in a database schema give you a measure of confidence similar to a type system in a programming language. A programmer working with a database that has appropriate constraints in place can often read those constraints to discover the domain of values that it accepts. Not having foreign key constraints, or relying on run-time checks in a web application, can increase the cost of writing subsequent programs, as developers may have to investigate the data manually to ensure that it is valid in terms of its relationships to other data.

As a programmer, I’ve spent countless hours dealing with bugs in programs related to data that didn’t have proper relationships, so I think that lack of easy access to foreign key constraints is a significant cost to development of microservice architectures.

Complexities in Ad-Hoc Querying

Having a consolidated database schema makes it easy to do ad-hoc queries. Eventually, you may hit performance issues due to size of data or number of queries when using a standard relational database with typical modeling techniques. At this point, developers generally move to solutions like data warehousing or NoSQL.

It’s also conceivable to address performance problems by having microservices on top of isolated database servers which can be scaled independently. While this may be the first tool that a microservice-oriented architecture reaches for, it has considerable disadvantages. Specifically, rather than using built-in tools to aggregate data, you're left either to write your own algorithms for merging data from disparate sources, or to add another piece of middleware such as Hive to consolidate data for you. These solutions at best add significant complexity to infrastructure, and at worst create a substantial amount of work and extra code that needs to be written by the application developer.
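To illustrate what "writing your own algorithms for merging data" means, here is a minimal sketch of a report that would be a one-line JOIN in a consolidated database. It assumes the user and purchase records have already been fetched from two separate services; the record shapes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# In a single database this report would be roughly:
#   SELECT u.name, SUM(p.price)
#   FROM users u JOIN purchases p ON p.user_id = u.id
#   GROUP BY u.name;


def credits_spent_per_user(users: List[dict], purchases: List[dict]) -> Dict[str, int]:
    """Hand-written equivalent of the JOIN and GROUP BY above."""
    names_by_id = {u["id"]: u["name"] for u in users}  # build side of a hash join
    totals: Dict[str, int] = defaultdict(int)
    for purchase in purchases:
        name = names_by_id.get(purchase["user_id"])
        if name is None:
            continue  # orphaned purchase: nothing guarantees a matching user
        totals[name] += purchase["price"]
    return dict(totals)
```

The logic is short, but every new ad-hoc question needs another function like it, plus the code to pull the data out of each service first.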

It’s Not Over Yet

Above, I’ve given some concrete cases where dividing an application into microservices will add to your list of responsibilities as a developer. This list is far from comprehensive. If you’re not already familiar with the Eight Fallacies of Distributed Computing, now would be a good time to check them out, and to make a mental note of what it will cost to address each of those real-world scenarios before you build a microservice.

Other Concerns: Microservices and Their Discontents

Blog posts and recent conference talks tout many of the potential benefits of microservices. After writing the draft of this article, I had discussions with teammates at Stack Builders regarding characteristics of microservice architectures that worked well, and ones that didn't.

One developer mentioned a project divided into microservices for a government agency. While the individual products started out encapsulated in terms of their data, a framework eventually developed for copying parts of databases among applications, based on the shared user data that they required. A benefit of this system, though, was that isolation of unrelated system parts added an element of fault tolerance.

Several years ago I helped to start one project that was initially built as microservices, but which was brought back together to speed development. In that project, microservices didn't allow rapid evolution of component parts; instead, they fostered an anti-agile fragmentation of software product teams that was detrimental to project progress. Other team members with whom I discussed the idea of microservices also indicated concerns about isolating team members from each other. It's usually very important to encourage developers to work toward the same goals, which may be hard to do in a microservice environment.

Developers on our team also noted that microservices encourage "big design up front." Software practitioners in recent years have emphasized how iterative software development and design reduce costs, but microservices have the opposite effect of encouraging us to lock in a specific division of services.

Some proponents of microservices tout their ability to reduce coupling. Sam Gibson, in a recent blog post, suggests that the modularity offered by traditional object-oriented languages isn't enough to deter strongly-coupled programs. He suggests relying on microservices to reduce coupling. Many of our developers, however, feel that a more pragmatic approach is simply to be more disciplined with good design practices in traditional languages, or to adopt languages such as Haskell, which encourage decoupled design via pure functions. Microservices are a "sledgehammer" approach to decoupling when compared to the features of modern languages intended to facilitate modular design.

Conclusion: On Fairly Assessing Costs of Microservice Implementation

Microservices are one of the hottest trends in web application architecture. Benefits include scalability of individual application components, fault tolerance via isolation between products, and encapsulation of application logic behind the wall of an API. In addition, they're often the most practical way to combine services written in different languages (in our case, we'd love to use them more to combine Ruby and Haskell applications).

While these benefits can be substantial, I think it's important to carefully consider the costs of building microservices. These costs include the addition of complexity due to the realities of distributed systems. The costs of microservices increase, often dramatically, as initial expectations about the isolation of systems turn out to be false, or as microservices encourage the separation of teams that should be working more closely towards common goals.

While there are undoubtedly use cases where microservices are an excellent choice, I believe that we should be careful about jumping into microservices too soon on any given project. In many cases, a more pragmatic decision may be just to rely on the methods of achieving modularity offered by your programming language of choice. That leaves open the path to microservices in cases where they are truly justified by experience gained in initial development.

Credits

I would like to thank Eric Jones, Enio Lopes, Juan Pablo Santos, Marcin Olichwirowicz, Juan Pedro Villa and Juanda Zapata for providing useful comments on an early version of this post.

¹ See "Products not Projects" in the larger post describing microservices by James Lewis, titled "Microservices."
² "Time, Clocks, and the Ordering of Events in a Distributed System" by Leslie Lamport was written in 1978, and is still a great way to get your feet wet in the messy world of distributed computing: http://dl.acm.org/citation.cfm?id=359545.359563
³ There is an interesting paper on an algorithm called p-flood that helps to maintain referential integrity in distributed systems ("A Scalable Architecture for Maintaining Referential Integrity in Distributed Information Systems"). I don’t know of anyone who has implemented it, but I’d be interested to hear if readers know of a real-world implementation.
⁴ "Microservice architectures make change less expensive, freeing your business to weaponize architecture," from "4 reasons why microservices resonate: Microservices optimize evolutionary change at a granular level" by Neal Ford.