Building a low-latency cloud infrastructure for algorithmic trading

The project shown in the diagram is one I designed and built last Wednesday, and it hosts software for printing money.

-

Algorithmic trading is a well-established field that has existed for decades.
It focuses on executing buy and sell orders in the capital markets according to predefined parameters, rather than executing them manually based on a human trader's judgment.

If you try to purchase access to one of the algorithmic trading platforms available on the market, the person most likely to profit from it is the one who sold it to you — through fees — not you.
The truly sophisticated trading systems that succeed in generating returns are closely guarded assets owned by large financial institutions, or by small groups of deep-pocketed investors who pool their resources into the software and share the profits.

This time, my client was exactly that kind of person.
He runs an extensive network of businesses around the world and brings decades of experience as an active trader in the capital markets.
He decided to turn that experience into software. To that end, he brought in a number of heavyweight investors, and together they are developing the project.
A project like this cannot be outsourced to a standard development firm, because the source code is a valuable asset that must not be exposed. Only a handful of trusted developers work on it.

-

My part in the project was to design the storage server architecture and build it end-to-end.
The main requirements were near-zero latency, massive GPU-based processing capabilities, scalability, maximum security, and continuous backup, both point-in-time and in a separate geographic location.

Requirements like these have a significant impact on infrastructure design.
For example, adding an external firewall or load balancer is not recommended, as they add meaningful latency to server response times.

--

And only if you're really into infrastructure and humorless terminal screens, here are a few interesting technical details:
I chose a Google server farm that is co-located with the target address, bringing average response time to 0.3 milliseconds — a remarkable achievement.
I separated the database onto a dedicated Cloud SQL server, with the connection between it and the application server running over an internal line with no public IP.
The servers are located in the United States, but backups are sent to a European server for extreme DR scenarios.
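
For anyone who wants to check a response-time figure like the 0.3 milliseconds above, here is a minimal Python sketch. It is not the measurement method used in this project: it assumes that timing a TCP handshake is a reasonable stand-in for one network round trip, and the function names are made up for illustration.

```python
import socket
import statistics
import time


def measure_tcp_connect_ms(host: str, port: int, attempts: int = 100) -> list[float]:
    """Time TCP handshakes to host:port and return each duration in milliseconds."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter_ns()
        # create_connection returns once the TCP handshake has completed,
        # so the elapsed time is roughly one network round trip.
        with socket.create_connection((host, port), timeout=1.0):
            elapsed_ns = time.perf_counter_ns() - start
        samples.append(elapsed_ns / 1_000_000)
    return samples


def summarize_ms(samples: list[float]) -> dict[str, float]:
    """Return the mean, median and worst-case latency of the samples."""
    return {
        "mean": statistics.fmean(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

Calling `summarize_ms(measure_tcp_connect_ms(host, port))` from the application server, with the host and port of the target address filled in, prints the three numbers. The worst case is worth watching alongside the average, since a single slow round trip can matter more to a trading system than a good mean.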

The database backup was a bit of a challenge, because the client required backups to be stored in Europe — but Cloud SQL's built-in backup replicates to the same server farm, providing point-in-time recovery but no geographic redundancy.
I asked Gemini; it suggested a cumbersome process of exporting a SQL file every hour and shipping it to Europe — an inelegant solution that also demands significant processing resources and 50% free disk space on the database server.

I had a better idea.
The primary server is in the United States, and I spun up an identical replica of it in Europe that synchronizes changes in near real time. This way, the local backup handles point-in-time recovery, while the European copy serves as a fallback in case of an alien attack on the U.S. server farm.
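
A cross-region replica like this can be requested through the Cloud SQL Admin API by creating a new instance that names the primary as its master. The sketch below only builds the JSON body for such an `instances.insert` call. The instance names, region, machine tier and network path in the usage note are hypothetical, and disabling the public IP on the replica is my assumption here, mirroring the private connection described earlier.

```python
def build_replica_request(primary_name: str, replica_name: str,
                          replica_region: str, tier: str,
                          private_network: str) -> dict:
    """Build the request body for a Cloud SQL Admin API instances.insert
    call that creates a read replica of primary_name in another region."""
    return {
        "name": replica_name,
        # A region different from the primary's makes this a cross-region replica.
        "region": replica_region,
        # Naming a master instance is what marks the new instance as a replica.
        "masterInstanceName": primary_name,
        "settings": {
            "tier": tier,
            "ipConfiguration": {
                # Assumption: keep the replica on a private network only.
                "ipv4Enabled": False,
                "privateNetwork": private_network,
            },
        },
    }
```

For example, `build_replica_request("trading-db", "trading-db-eu", "europe-west1", "db-custom-4-16384", "projects/my-project/global/networks/my-vpc")` returns a body that could be sent to the API with an authenticated client. The replica then receives changes from the primary asynchronously, which is why the copy is near real time rather than instantaneous.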

The project was particularly enjoyable, and to me it's part of a bigger story.
We live in an era where a single person can build, with their own two hands, a system capable of serving a million users or more.
Over the past year I've been driven by the dream of possessing exactly that capability, and mastering scalable cloud infrastructure is another very important step toward it.

And if you're wondering — no, I don't have permission to use the software (even though I technically have access to the source code, which I will of course never use). So for now I'm here with you, working hard, diligently, and without complaint ⛄.

--
👋 Hi, I'm Shlomo Strauss — follow me for more interesting content on science and technology.
