Low latency is becoming essential across a wide range of use cases in modern applications. In e-commerce, for example, going beyond single-digit-millisecond latencies is believed to hurt the bottom line: a decade ago, Amazon reported that every 100 milliseconds of added response time cost them 1% of sales. And it’s not just e-commerce; low latency is becoming more important for OLAP and analytics workloads as well.
But how is this need for speed driving infrastructure change today? Let’s start with the constraints.
Geographical latency is a fundamental limit that infrastructure needs to hide from the end user. For example, according to Cloudflare, it can take up to 150 milliseconds for a packet sent from New York to arrive in London¹, but we don’t want users to experience that delay.
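As a back-of-the-envelope check on why this limit is fundamental, we can compute the physical floor for the New York–London round trip. The distance and fiber-slowdown figures below are rough assumptions, not measurements:

```typescript
// Lower bound on New York <-> London latency from physics alone.
// Assumptions: ~5,570 km great-circle distance, and light in optical
// fiber travelling ~1.5x slower than in vacuum (refractive index ~1.5).
const distanceKm = 5570;
const vacuumLightKmPerMs = 300; // speed of light in vacuum, km per millisecond
const fiberSlowdown = 1.5;

const oneWayMs = (distanceKm / vacuumLightKmPerMs) * fiberSlowdown;
const roundTripMs = 2 * oneWayMs;

console.log(`one way: ~${Math.round(oneWayMs)} ms`);       // ~28 ms
console.log(`round trip: ~${Math.round(roundTripMs)} ms`); // ~56 ms
```

The ~56 ms figure is the floor from physics alone; the 150 ms that Cloudflare cites reflects real-world routing, queuing, and protocol overhead stacked on top of it.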
Content delivery platforms such as Vercel and Netlify let you deploy web application assets across multiple geographically distributed locations closer to end users, hiding geographical latency and making our websites snappy! More generally, edge computing platforms such as Cloudflare and Fastly have the infrastructure to run your code on the edge. Cloudflare Workers, for example, provides an abstraction for deploying code that automatically runs everywhere. This deploy-once, run-everywhere abstraction of edge computing is pretty powerful, but it also creates an infrastructure problem: edge locations are closer to the end user but have little capacity, while the data centers that power the cloud can be far away but have far more storage and compute capacity. How do we reconcile the two in our infrastructure?
“Fine-Grained Sandboxing with V8 Isolates” - Kenton Varda
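To make the deploy-once, run-everywhere model concrete, here is a minimal Workers-style handler in TypeScript (modules syntax), written as a sketch rather than a production Worker. The `cf.colo` field is a Cloudflare-specific request property naming the edge location that served the request, so the sketch falls back to a placeholder when it runs outside Workers:

```typescript
// A minimal "deploy once, run everywhere" handler in the Cloudflare Workers
// modules style. The same code is pushed to every edge location; the platform
// routes each request to the location nearest the user.
const worker = {
  async fetch(request: Request): Promise<Response> {
    // On Cloudflare, request.cf?.colo names the serving edge location;
    // outside Workers it is undefined, so we fall back to a placeholder.
    const colo = (request as { cf?: { colo?: string } }).cf?.colo ?? "local";
    return new Response(`Hello from edge location: ${colo}`);
  },
};

export default worker;
```

Note that there is no region, zone, or replica count anywhere in this code: placement is entirely the platform's job, which is precisely what makes the abstraction powerful.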
The database access problem has two parts:
Serverless functions have no state, which complicates database access.
Bulk storage capacity will likely remain in the data centers powering cloud computing.
The lack of state in serverless functions means that, with a traditional database, every invocation has to establish a new connection to the database, which is slow. Prisma, for example, addresses this problem with a proxy that handles connection pooling, and tools such as PgBouncer solve the same problem for Postgres.
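To make the cost model concrete, here is a toy pool in TypeScript illustrating the idea behind PgBouncer and Prisma’s proxy. `FakeConnection` is purely illustrative and stands in for a real database connection; the point is that the expensive handshake happens once, and subsequent invocations only pay a cheap checkout:

```typescript
// A toy connection pool illustrating what PgBouncer / a pooling proxy do on
// behalf of stateless functions: connections are established once and reused,
// so each invocation pays a cheap checkout instead of a full TCP + auth
// handshake. FakeConnection is a stand-in for a real database connection.
class FakeConnection {
  constructor(public readonly id: number) {}
  query(sql: string): string {
    return `conn#${this.id} ran: ${sql}`;
  }
}

class Pool {
  private idle: FakeConnection[] = [];
  private created = 0;
  constructor(private readonly max: number) {}

  acquire(): FakeConnection {
    if (this.idle.length > 0) return this.idle.pop()!; // reuse, no handshake
    if (this.created >= this.max) throw new Error("pool exhausted");
    return new FakeConnection(++this.created); // the expensive part, paid once
  }

  release(conn: FakeConnection): void {
    this.idle.push(conn); // returned connections are kept open, not closed
  }
}

const pool = new Pool(2);
const c1 = pool.acquire();
pool.release(c1);
const c2 = pool.acquire(); // same connection object comes back — no new handshake
console.log(c1 === c2);    // true
```

A real pooler also has to cap per-backend connections and handle idle timeouts, but the reuse principle is the same.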
However, while a proxy or PgBouncer-style approach solves the connection pooling issue (and pooling will likely be integrated into future serverless runtimes), database access latency itself remains: databases are deployed in the cloud, but application logic increasingly runs on the edge.
We at ChiselStrike have been exploring an option that goes even further to drive down latency: integrating the relational database engine (in our case, SQLite) directly into the serverless runtime.
An embedded database engine solves the latency issue, but it’s not a complete solution: you still need to replicate the data, and storage capacity on edge locations is limited. ChiselStore uses the Raft consensus protocol to replicate data, but it still lacks the optimizations needed to make it fast. And although we have ideas on how to combine an edge database with a centralized database that has higher storage capacity, we have not started that work.
Database vendors are also tackling this problem via an approach dubbed serverless databases. For example, CockroachDB released their serverless offering at the end of last year, and YugaByte is working on theirs.
In summary, while there’s a lot of exciting work happening in the serverless database space, my belief is that combining the low latency of the edge with the storage capacity of the cloud will be one of the big frontiers of edge and serverless computing infrastructure in the years to come, and it will transform how we think about serverless and data!
¹ As various people on Twitter point out, the geographical latency between New York and London can be as low as 30–50 milliseconds if you are willing to pay for it.