SQLite-based databases on the postgres protocol? Yes we can!

Applications built on SQLite are very easy to get started with. SQLite requires no setup, no maintenance, and no scaling, and the result of that execution lies entirely in a single file that you…

Cover image for SQLite-based databases on the postgres protocol? Yes we can!

Notice:

After some months of development, we have decided to deprecate the Postgres protocol, and focus exclusively on HTTP-based transports.

Applications built on SQLite are very easy to get started with. SQLite requires no setup, no maintenance, and no scaling, and the result of that execution lies entirely in a single file that you could drop into your CI/CD for quick verification. What's not to like?

But it's not a common choice for production backends because of its lack of network accessibility for things like monitoring and backups. Most modern applications also need replicas for availability. And thanks to platforms such as Netlify, Vercel, and Cloudflare. applications are moving to the edge: they are now deployed everywhere, instead of somewhere.

Historically, these situations have made SQLite an infeasible option. But no more!

#SQLite gets network accessibility

Aside from any technical consideration, the SQLite project is Open Source, but not Open Contribution, which prevents the creation of a community of people and organizations pushing the project in new directions.

In October 2022, we announced a fork of SQLite that is also Open Contribution and distributed under an Open Source License (MIT), indicating our intent to evolve the project into some new and spicy directions. We call it libSQL.

The latest addition is called “server mode” (known as sqld) that enables network access to libSQL, as well as replication to multiple instances.

With this first iteration, we support:

  • The Postgres wire protocol
  • The Postgres wire protocol over websocket
  • HTTP

#sqld in action

To see how it works, we start the server and indicate we want to serve it over http and postgres. The -dswitch specifies the database file.

$ sqld -d foo.db -p 127.0.0.1:5432 --http-listen-addr=127.0.0.1:8000

Since sqld supports the Postgres wire protocol, standard Postgres tooling works, including the psqlcommand shell. We'll use it to create a table and insert a row. The commands use SQLite syntax and types:

$ psql -q postgres://127.0.0.1
glaubercosta=> create table databases (name text);
 -
(0 rows)
glaubercosta=> insert into databases (name) values ('libsql');
 -
(0 rows)
glaubercosta=> select * from databases;
name
 - - - -
libsql
(1 row)

SQLite standard tooling also works, so we can use the sqlite3command shell to inspect the resulting file:

$ sqlite3 foo.db
SQLite version 3.37.0 2021–12–09 01:34:53
Enter ".help" for usage hints.
sqlite> select * from databases;
libsql

#HTTP and a native client

Last, but not least, you can issue commands over HTTP, so you can just curlto it with a JSON payload, without managing connection pools or anything of the sort.

$ curl -s -d "{\"statements\": [\"SELECT * from databases;\"] }" \
  http://127.0.0.1:8000
[[{"name":"libsql"}]]

HTTP support was added to support restricted environments, such as edge functions, where very little besides HTTP is present. To make that even easier, we also provide a native TypeScript client that encapsulates the details of the protocol:

Transpiling and executing the following code:

import { connect } from '@libsql/client';
async function example() {
  const config = {
    url: process.env.DB_URI,
  };
  const db = connect(config);
  const rs = await db.execute('SELECT * FROM databases');
  console.log(rs);
}
example();

Will yield output similar to the prior examples:

$ DB_URI=http://127.0.0.1:8000 node index.js
{
  results: [ { name: 'libsql' } ],
  success: true,
  meta: { duration: 0 }
}

#Is a fork really needed?

A fair question to ask is this: “There are other projects attempting to merge SQLite and networking, and they don't fork SQLite. So is a fork really needed?”

A full explanation of how SQLite writes and reads data is out of the scope of this article. There are many other sources that go into great detail about that.

Suffice to say, there are two entities that are relevant when interacting with SQLite's storage: the VFS, and the WAL (Write-Ahead Log).

#WAL Virtualization

While SQLite does offer a virtualized interface for the VFS, it does not allow for the virtualization of the WAL methods. The main work in the core of libSQL was to allow for WAL virtualization.

Once the WAL is virtualized, we can capture any new updates to the database as they happen. This gives us the flexibility of VFS, but with a log-structured API.

Log-structured APIs are easier to replicate, since it is a natural point for streaming changes. In WAL mode, a writer can work in parallel with readers, which provides better concurrency guarantees for mixed and read-intensive workloads, compared to the default rollback journal mode.

Moreover, SQLite has a very interesting feature that isn't generally available yet — BEGIN CONCURRENT transactions, which use the power of optimistic concurrency control to allow multiple writers to work in parallel. This feature is built on top of WAL mode only.

#For us, it's more than sqld

This is as good a time as any to remember the primary reason for our fork: our goals with libSQL go beyond server mode. Many projects extend SQLite, making it hard to unify these efforts.

As an example, before server-mode, we had already integrated WASM user-defined functions, allowing users to to write close-to-the-data functions and triggers in WASM. libSQL also allows for randomized row ids, among other features, with even more planned in the future.

#What next?

We would love to hear how you might like to use libSQL's server mode in your next project. Give us a shout out on Twitter and join our Discord community.

More importantly, we aim to be a welcoming home for new contributions around the idea of what an embeddable database can be, as described in our motivations. Your support is welcome on our Github!

scarf