Whenever somebody sets out to build an HTTP API for their website, REST is almost always the go-to architectural style, preferred over alternatives such as RPC and GraphQL.
REST works well in most cases, especially for a monolithic application without complex internal communication. The problems come when you build an application that consists of dozens of backend services regularly communicating with each other. An example of such an architecture can look like this.
When Architecture Becomes Complex
It is important for the architecture to be fast, flexible, easy to scale, and simple. But the more services, endpoints, and business logic you have, the more complex the architecture becomes. There are a number of design concerns.
First, each endpoint requires Swagger documentation and contract tests. The more API endpoints you have, the more documentation and tests you need to write. For example, supporting both a public API for end users and a simplified API for internal calls requires the API to be written twice. This might be manageable, if not for the other factors that can multiply the complexity for developers.
If the underlying services are called from several other services, each written in a different language, we must reimplement the API SDK, the data models, and the related code for each language. This rewrite covers not only the code itself but also unit tests, documentation, continuous integration tooling, and more. Multi-language support therefore demands great accuracy and attention from the development team.
Additional languages are not the only source of complexity; different versions of the same language add it as well. Backward compatibility requires close attention, especially when there are many dependencies, and a change made in a library for one language can easily be missed in the corresponding library for another. Even one extra dependency can add significant complexity.
Because of all these issues, making changes consumes substantial engineering time and carries the risk of introducing errors. Meanwhile, the network overhead must stay manageable: it is easy to overload a high-traffic network when several services in a row encode and decode JSON for every API call.
Looking Toward a Solution
While GraphQL is a good solution, it is still not the best fit for direct microservice communication. With a single GraphQL schema at the API gateway, the gateway-side schema would need to change every time a microservice's input or output contract changes. With a separate GraphQL schema per microservice, the purpose of GraphQL is defeated: GraphQL is meant to be one schema over the entire application's data, so that clients can obtain what they need in a single round trip.
For our example, we will consider the RPC approach. To understand why RPC is a better fit than the approaches above, let’s look at this diagram of a simple distributed architecture.
Note a few things here. First, when making a request, the client needs to serialize the data before sending it. The server must deserialize the data after receiving it, and the same encode and decode steps happen again for the response.
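As a minimal sketch of this round trip, assuming JSON is the wire format (the payload below is purely illustrative):

```python
import json

# Hypothetical request payload the client wants to send.
request = {"user_id": 42, "action": "get_profile"}

# Client side: serialize the data structure into bytes before sending.
wire_bytes = json.dumps(request).encode("utf-8")

# Server side: deserialize the received bytes back into a data structure.
received = json.loads(wire_bytes.decode("utf-8"))

assert received == request
```

Every hop between services pays this encode/decode cost twice: once for the request and once for the response.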
It is important to use the right communication protocol and data format for interactions between the client and the server. Ideally, the data should be compact and fast to encode and decode, and the protocol should be optimized.
Compared to the JSON format, the binary format is clearly more compact.
Data Encode/Decode Time
The binary format is clearly faster than the JSON format as well.
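Both effects can be seen in a small sketch comparing JSON with a fixed-layout binary encoding via Python's `struct` module; the record and its layout here are made up for illustration:

```python
import json
import struct

# A small record with three numeric fields (hypothetical payload).
record = {"id": 123456, "score": 98.5, "flags": 7}

# JSON encoding: human-readable, but the field names travel with every message.
json_bytes = json.dumps(record).encode("utf-8")

# Binary encoding with a fixed layout: u32 id, f64 score, u16 flags.
# The layout string "<IdH" is the implicit schema; both sides must agree on it.
binary_bytes = struct.pack("<IdH", record["id"], record["score"], record["flags"])

print(len(json_bytes), len(binary_bytes))  # binary is a fraction of the size

# Decoding recovers the same values without any text parsing.
rec_id, score, flags = struct.unpack("<IdH", binary_bytes)
assert (rec_id, score, flags) == (123456, 98.5, 7)
```

The binary form skips field names entirely and needs no text parsing on decode, which is where both the size and speed advantages come from.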
So Why Not Just Use the Binary Format?
Whether a raw binary format is actually easier to use is debatable. To make it practical, we would need to create a binary specification for our data, implement the encode and decode functions, develop good debugging tools, and write good documentation.
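As a sketch of what such a hand-written binary specification involves, consider a hypothetical user record laid out as a 4-byte id, a 4-byte name length, and the UTF-8 name bytes:

```python
import struct

def encode_user(user_id: int, name: str) -> bytes:
    """Encode per the hand-written spec: u32 user_id, u32 name length, UTF-8 name."""
    name_bytes = name.encode("utf-8")
    return struct.pack("<II", user_id, len(name_bytes)) + name_bytes

def decode_user(data: bytes):
    """Decode the same layout; any drift between encoder and decoder corrupts data."""
    user_id, name_len = struct.unpack_from("<II", data, 0)
    name = data[8:8 + name_len].decode("utf-8")
    return user_id, name

assert decode_user(encode_user(7, "alice")) == (7, "alice")
```

Every new field means updating the spec, both functions, the tests, and the docs, in every language that speaks this format.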
Even if we improve the performance of our API communication, we still need to address multi-language microservice communication. This is where RPC comes in. A possible solution is to describe our data models in a language-agnostic way, so that the same models can be generated for each language. This requires an IDL, or interface definition language. The data models defined in the IDL can then serve as the inputs and outputs of RPC calls.
This also means we need a multi-language RPC framework that supports our data structures. Fortunately, ready-made options exist, so we do not have to develop IDL libraries and RPC frameworks for every language in the stack ourselves.
Where gRPC and Thrift Come In
Yes, using an IDL can address the communication problems between microservices. With an IDL, you define a model once and then, using a code generation tool, generate the model for each target language and include it in your business logic. This solves the problem of reusing the same models across different languages.
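For instance, a Protocol Buffers definition, the IDL used by gRPC, for a hypothetical profile service might look like this (all names below are illustrative):

```protobuf
syntax = "proto3";

package example;

// Data model, defined once and generated for every target language.
message UserProfile {
  int64 user_id = 1;
  string name = 2;
  repeated string roles = 3;
}

message ProfileRequest {
  int64 user_id = 1;
}

// The RPC contract; the same .proto file doubles as documentation.
service ProfileService {
  rpc GetProfile (ProfileRequest) returns (UserProfile);
}
```

Running `protoc` with the appropriate language plugin generates the data models and client/server stubs for each language in the stack.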
A good IDL is also readable enough to serve as documentation, eliminating the need for Swagger and separate API specifications. The RPC framework, in turn, removes the need to duplicate the client request library across languages and simplifies the API itself, since an RPC call is much simpler than a REST request: there is no need to write your own REST client.
gRPC and Apache Thrift are two of the most popular RPC solutions and are based on IDLs. They help solve the problem of managing complex networks of microservices.
This article was co-written with Oleksandr Piekhota, software engineer at airSlate, as part of a discussion of airSlate’s technologies.