Low Latency Microservices, a Retrospective

December 11, 2021
in Software Development


I wrote an article on low latency microservices almost five years ago now. In that time, my company has worked with a number of tier-one investment banks to implement and support those systems. What has changed in that time and what lessons have we learned?

This article covers what we learned over five years of developing and supporting low latency microservices.

Separation of Concerns Gives Better Testability

Microservices repeatedly demonstrated that testing and debugging business components is much easier with simple, stand-alone components and clear contracts between them.

While unit tests were still used to start with, in 2017 we moved almost entirely to behavior-driven development of microservices. Unit tests are still used for lower-level libraries and utilities. As our microservices are all based on Kappa Architecture, all our behavior-driven tests are modeled as a series of events in and out of the service.

An input test might look like this:

---
oms: OMS1 # the service to receive this message
newOrder: {
  eventId: orderevent1,
  eventTime: 2017-04-27T07:26:40.9836487,
  triggerTime: 2017-05-05T12:56:40.7534873,
  instrument: USDCHF,
  settlementTime: 2017-04-26T09:46:40,
  market: FXALL_MID,
  orderId: orderid1,
  hedgerName: Hedger1,
  user: MidHedger,
  orderType: MARKET,
  side: SELL,
  quantity: 500E3,
  maxShowQuantity: 500E3,
  timeInForceType: GTC,
  timeInForceExpireTime: 2018-01-01T01:00:00
}
---
# more messages

The output looks very similar, as this is an Order Management Service and its job is to filter and track orders.

---
newOrder: {
  eventId: orderevent1,
  eventTime: 2017-04-27T07:26:40.9836487,
  triggerTime: 2017-05-05T12:56:40.7534873,
  instrument: USDCHF,
  settlementTime: 2017-04-26T09:46:40,
  market: FXALL_MID,
  orderId: orderid1,
  hedgerName: Hedger1,
  orderType: MARKET,
  quantity: 500E3,
  user: MidHedger,
  side: SELL,
  maxShowQuantity: 500E3,
  timeInForceType: GTC,
  timeInForceExpireTime: 2018-01-01T01:00:00
}
---
# more results

Building variations on tests to explore all the things which could go wrong and check how they are handled is easy.

What We Needed to Add

Beyond implementing what we envisioned five years ago, there were some features we discovered we needed to add.

To ensure our services produced the same results every time, whether in tests or between production and any redundant system, we made time an input. This appeared in our tests like this:

periodicUpdate: 2017-04-27T07:26:51
---

This ensured that all time-outs or events triggered by the clock could be tested, and also ensured that each redundant system did the same things at the same point and produced the same output.

We started with millisecond timestamps but quickly found we needed greater resolution and switched to microsecond timestamps. We now use nanosecond resolution timestamps.
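A minimal sketch of what treating time as an input can look like in Java. The `TimeoutTracker` class and its methods are hypothetical names for illustration, not part of any library; only the `periodicUpdate` event name comes from the test above. The point is that the service never reads the system clock, so replaying the same events always produces the same result.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical service: expires orders using only the time carried by input events.
final class TimeoutTracker {
    private final Map<String, Long> expiryByOrderId = new LinkedHashMap<>();

    // An order arrives with its own expiry time (nanoseconds since the epoch).
    void newOrder(String orderId, long expireTimeNanos) {
        expiryByOrderId.put(orderId, expireTimeNanos);
    }

    // Time is an input event; the system clock is never consulted.
    // Returns the ids of the orders that expired at or before nowNanos.
    List<String> periodicUpdate(long nowNanos) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = expiryByOrderId.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowNanos) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

Because the expiry decision depends only on the events received, a test (or a redundant system) that is fed the same sequence of `newOrder` and `periodicUpdate` events sees the same orders expire at the same point.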

We found that nanosecond timestamps were more useful if we could also ensure some level of uniqueness. We added support for ensuring nanosecond timestamps are unique on a single host in a low latency way. This takes well under 100 nanoseconds. That way a timestamp can be used as a unique id for tracing events through a system (adding a pre-set host id if needed).
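One way to get unique, increasing nanosecond timestamps is sketched below. This is an illustration under stated assumptions, not the implementation described above: `UniqueNanoTime` is a hypothetical class, it only guarantees uniqueness within a single JVM, and it makes no claim about meeting the sub-100-nanosecond cost mentioned. Uniqueness across processes on one host would need shared state, which this sketch does not attempt.

```java
import java.time.Instant;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper: hands out strictly increasing nanosecond timestamps within one JVM.
final class UniqueNanoTime {
    private final AtomicLong last = new AtomicLong();

    // Wall-clock nanoseconds since the epoch; the real resolution depends on the platform.
    static long wallClockNanos() {
        Instant now = Instant.now();
        return now.getEpochSecond() * 1_000_000_000L + now.getNano();
    }

    long next() {
        return next(wallClockNanos());
    }

    // The clock reading is a parameter so the behaviour can be tested deterministically.
    long next(long clockNanos) {
        while (true) {
            long prev = last.get();
            // If the clock has not advanced past the last value issued, bump by one nanosecond.
            long candidate = Math.max(clockNanos, prev + 1);
            if (last.compareAndSet(prev, candidate))
                return candidate;
        }
    }
}
```

When two events arrive within the same clock tick, the second is nudged forward by a nanosecond, so the value stays close to the real time while remaining usable as an id.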

  • Ability to store complex data as primitives

For testing purposes, all messages appear as text; however, for performance reasons all data is written and read in a binary form. The typical latency of a persisted message between microservices is less than a microsecond, so how the objects are stored can make a big difference.

The use of String and LocalDateTime was very expensive for our use case. To reduce the impact of storing this information, we developed a number of strategies for:

  • Encoding Strings and dates in long fields.
  • Object pooling Strings
  • Storing text in a mutable/reusable field
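As an illustration of the first strategy, the sketch below packs a short identifier and a date-time into `long` values. The `LongCodec` class, its alphabet, and the base-38 scheme are assumptions made for this example and are not necessarily the encodings we use; the idea is only that a field such as an instrument or market name can travel as a single primitive.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical codec: stores short upper-case identifiers and date-times as longs.
final class LongCodec {
    // 37 symbols; digit 0 is reserved to mean "no more characters".
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_";
    private static final int BASE = ALPHABET.length() + 1; // 38
    static final int MAX_LENGTH = 12;                       // 38^12 is less than 2^63
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    static long encode(CharSequence text) {
        if (text.length() > MAX_LENGTH)
            throw new IllegalArgumentException("too long: " + text);
        long value = 0;
        for (int i = 0; i < text.length(); i++) {
            int index = ALPHABET.indexOf(text.charAt(i));
            if (index < 0)
                throw new IllegalArgumentException("unsupported character in: " + text);
            value = value * BASE + index + 1;
        }
        return value;
    }

    static String decode(long value) {
        StringBuilder sb = new StringBuilder();
        // Digits come out least significant first, so reverse at the end.
        for (long v = value; v != 0; v /= BASE)
            sb.append(ALPHABET.charAt((int) (v % BASE) - 1));
        return sb.reverse().toString();
    }

    // Date-times are treated as UTC and stored as nanoseconds since the epoch.
    static long toEpochNanos(LocalDateTime dateTime) {
        return dateTime.toEpochSecond(ZoneOffset.UTC) * NANOS_PER_SECOND + dateTime.getNano();
    }

    static LocalDateTime fromEpochNanos(long nanos) {
        return LocalDateTime.ofEpochSecond(
                Math.floorDiv(nanos, NANOS_PER_SECOND),
                (int) Math.floorMod(nanos, NANOS_PER_SECOND),
                ZoneOffset.UTC);
    }
}
```

For example, `LongCodec.encode("USDCHF")` yields a `long` that `LongCodec.decode` turns back into `"USDCHF"`, and comparing two encoded instruments becomes a primitive comparison with no object allocation. Note that `decode` still allocates a `String`, so on a latency-critical path you would keep the value as a `long` for as long as possible.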

Ultimately, this led to the support of Trivially Copyable objects, a concept adopted from C++, where most (or all) of a Java object can be copied as a memory copy without any serialization logic. This allowed us to support passing complex market data with around 50 fields between microservices in different processes at well under a microsecond most of the time.

  • Ability to run microservices as a single process or a single thread

One problem with microservices is that they are less productive when it comes to integration tests and development. To solve this we created a means of running multiple microservices in a single container/JVM, or even in a single thread. This gave our customers the flexibility of running parts of the system as a compound microservice in the test environment while they could still deploy individual microservices independently in production.
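A minimal sketch of the idea, assuming each service is written against an interface for its output rather than against a transport. The `OrderListener`, `LimitChecker`, and `OrderTracker` names are hypothetical. In a deployed system the `out` of one service would typically be a writer to a persisted queue; in a test it can simply be the next service, so the whole chain runs in one thread.

```java
import java.util.ArrayList;
import java.util.List;

// The contract between services: events are method calls on an interface.
interface OrderListener {
    void newOrder(String orderId, double quantity);
}

// Hypothetical service: forwards orders within a limit and drops the rest.
final class LimitChecker implements OrderListener {
    private final double maxQuantity;
    private final OrderListener out;

    LimitChecker(double maxQuantity, OrderListener out) {
        this.maxQuantity = maxQuantity;
        this.out = out;
    }

    @Override
    public void newOrder(String orderId, double quantity) {
        if (quantity <= maxQuantity)
            out.newOrder(orderId, quantity);
    }
}

// Hypothetical downstream service: records the orders it has accepted.
final class OrderTracker implements OrderListener {
    final List<String> orderIds = new ArrayList<>();

    @Override
    public void newOrder(String orderId, double quantity) {
        orderIds.add(orderId);
    }
}
```

Wiring `new LimitChecker(1e6, tracker)` directly to an `OrderTracker` gives a compound service that can be stepped through in a debugger as ordinary method calls, while neither class needs to change to be deployed separately.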

  • Simplifying restarts with idempotency

Idempotency allows an operation to be harmlessly attempted multiple times. Restarts are simplified by replaying any messages you are not sure were fully processed.

In low latency systems, transactionality needs to be as simple as possible. This suits most needs; however, when you get a complex transaction, idempotency can significantly reduce the complexity of recovery.
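A minimal sketch of an idempotent handler, assuming every event carries a strictly increasing sequence number (for example its position in the input queue). The `PositionKeeper` class is hypothetical. Replaying events that were already applied leaves the state unchanged, so after a restart it is safe to replay from a point you know is early enough.

```java
// Hypothetical service state that tolerates replayed events.
final class PositionKeeper {
    // In a real service this would be persisted together with the state it protects.
    private long lastAppliedSequence = -1;
    private long position;

    // Returns true if the fill was applied, false if it was a replay and was ignored.
    boolean onFill(long sequence, long signedQuantity) {
        if (sequence <= lastAppliedSequence)
            return false;
        position += signedQuantity;
        lastAppliedSequence = sequence;
        return true;
    }

    long position() {
        return position;
    }
}
```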

  • Multiple replay strategies

Having a complete deterministic persisted record of every message made restarts and failover simpler. However, we found that different use cases needed different strategies for how that is done. We have learned a lot about real replayability/restartability requirements from customers’ real use cases and the trade-offs they make for service start time vs accuracy etc.

One key feature is ensuring upstream messages are replicated before downstream ones are to simplify rebuilding the current state.

What Became Less Important

Some of the features we thought would be vital five years ago turned out to be not always required as clients used our technology in broader contexts.

  • Ultra-low garbage collection

Five years ago, we were entirely focused on no minor collections over a day of processing. However, many clients wanted the productivity gains we offered but didn’t have stringent performance requirements and occasional garbage collections were not a concern.  Creating microservices naturally supports having latency-critical services as well as less latency-critical services in the same system. 

This required a shift in some of our thinking which assumed GCs never occurred normally. We significantly reduced our use of WeakReferences for example.

  • Low throughput data processing

The lowest latencies tend to be at around 1% of the peak capacity. At lower throughputs, your hardware tends to try to save power, increasing latency when an event does occur. At higher throughputs, you increase the chance of a message coming in while you are still processing previous ones, adding to latency.

Again, we saw a broadening of use cases to very low message rates, which exposed behavior that is only seen when the machine is spending most of its time waiting. We added performance test tools to see how our code behaves when the CPU is running cold.

We also saw the need to support higher throughputs of messages for hours at a time (rather than seconds or minutes). Previously, if we had a microservice that could process a million messages per second, we would test the latency at 1% of this, as this was considered the normal volume. It also wasn’t possible to get high-performance, high-volume drives that could sustain this rate for hours without filling up. Today, we are testing the latency of systems sustaining long bursts of one million messages per second for many hours at a time.

If you are looking for such a drive that you can test on a desktop, I suggest looking at the Corsair MP600 Pro series.

Make Your Infrastructure as Fast as Your Application Needs

Over the last five years, the requirements for core systems have become more stringent. However, as we need to integrate with existing systems, we have also seen the need to easily support systems that don’t have the same requirements (and whose owners would rather they be easy and natural to work with).

For the more stringent systems, clients:

  • Have moved from caring about the 99%ile (worst 1 in 100) to the 99.9%ile or 99.99%ile
  • Are looking to achieve wire-to-wire latencies (as measured on the network) of around 20 microseconds at the 99.99%ile or 100 microseconds at the 99.9%ile
  • Have a clearer idea of the worst-case latencies they need to see. Many are looking for low milliseconds end to end in the worst case

At the same time, our clients need to integrate with existing systems, where all they need is for that to be as easy as possible.
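To make the "worst 1 in N" reading of these percentiles concrete, here is a small sketch that computes them from recorded samples. The `LatencyPercentiles` class is hypothetical, and sorting a copy of the samples is only suitable for offline analysis; a production system would more likely use a histogram.

```java
import java.util.Arrays;

// Hypothetical helper for offline analysis of recorded latencies.
final class LatencyPercentiles {
    // Returns the latency that only the worst 1 in n samples exceed:
    // n = 100 is the 99%ile, n = 1_000 the 99.9%ile, n = 10_000 the 99.99%ile.
    static long worstOneIn(long[] latenciesNanos, int n) {
        if (latenciesNanos.length == 0)
            throw new IllegalArgumentException("no samples");
        if (n < 1)
            throw new IllegalArgumentException("n must be at least 1");
        long[] sorted = latenciesNanos.clone();
        Arrays.sort(sorted);
        int allowedWorse = sorted.length / n; // how many samples may be slower than the result
        return sorted[sorted.length - 1 - allowedWorse];
    }
}
```

With fewer than `n` samples the method returns the maximum, which is a reminder that a 99.99%ile is only meaningful once you have well over ten thousand measurements.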

Make the Message Format a Configuration Consideration

Five years ago, I imagined we would need to support all sorts of formats; however, a relatively small number turned out to be really useful:

  • An existing format. Make using a corporate standard format easy, no need to use ours
  • Text format. YAML seems the best one, especially for readability and typed data
  • The binary form of YAML. A good trade-off between ease of use and performance
  • Typed JSON for use over WebSockets to JavaScript-based UIs
  • Marshaling of fields as binary
  • Direct memory copy objects e.g., Trivially Copyable, to maximize speed
  • FIX protocol

We regularly use a single DTO in multiple formats, depending on what is most appropriate, without the need to copy data between DTOs specialized for a given format.

Conclusion

For the use cases of our clients, I believe most of the concerns around replacing microservices with a monolith have been solved. However, having the option to run multiple microservices as if they were a monolith handles those cases where that is easier to work with, e.g., testing and debugging multiple services.

OpenHFT open-source libraries


