Something you probably want to know about if you're using SQLite in Golang

I recently published an article on how code using database/sql can easily deadlock. That article started several interesting conversations, and one of them led to the discovery of a related issue when using SQLite in Golang. The issue is especially nasty because it does not manifest as a crash. Instead, it makes your database use orders of magnitude more disk space than necessary.

The topic is close to my heart because I spend every day working on a database for the Edge called Turso that's based on LibSQL — a SQLite fork.

#How storage works in SQLite

To understand the problem we first need to look under the hood of SQLite and see how it manages storage. There are actually multiple modes of SQLite operation, but we're going to focus on Write-Ahead Logging (WAL). It's not the default mode, so one needs to enable it by running the PRAGMA journal_mode=WAL SQL statement.
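As a quick illustration, here is a minimal sketch of switching to WAL mode from Go and confirming the change. The driver is an assumption (any database/sql driver registered under the name "sqlite", such as modernc.org/sqlite, will do); the PRAGMA conveniently returns the resulting journal mode as a single row, so we can scan it back:

// Minimal sketch: switch to WAL mode and read back the resulting journal mode.
db, err := sql.Open("sqlite", "test.db") // driver name "sqlite" is an assumption
if err != nil {
	panic(err)
}
defer db.Close()

var mode string
// PRAGMA journal_mode=WAL returns the new mode as a single row ("wal" on success).
if err := db.QueryRow("PRAGMA journal_mode=WAL").Scan(&mode); err != nil {
	panic(err)
}
fmt.Println("journal mode:", mode)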

In case you're wondering why anyone would use WAL, the SQLite documentation has a useful list of advantages:

1. WAL is significantly faster in most scenarios.

2. WAL provides more concurrency as readers do not block writers and a writer does not block readers. Reading and writing can proceed concurrently.

3. Disk I/O operations tend to be more sequential using WAL.

4. WAL uses many fewer fsync() operations and is thus less vulnerable to problems on systems where the fsync() system call is broken.

At a very high level, here's how WAL works:
1. There are two important files: the main db file and the write-ahead log (WAL) file.
2. Writes only append to the WAL file and don't change the main db file at all.
3. Reads are served from both the main db file and the WAL file.
4. The database periodically runs a checkpoint operation that transfers updates from the WAL file to the main db file and truncates the WAL file. A checkpoint can also be triggered manually, for example by running the PRAGMA wal_checkpoint(TRUNCATE) SQL statement (see the sketch after this list).
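Point 4 can also be observed from Go. Here is a minimal sketch (the function name is mine, and it assumes an *sql.DB opened with the same "sqlite" driver as above) that triggers a checkpoint and scans the single row that PRAGMA wal_checkpoint returns, which is handy for spotting a checkpoint that could not run to completion:

// checkpoint runs a TRUNCATE checkpoint and reports what it managed to do.
// PRAGMA wal_checkpoint(TRUNCATE) returns one row with three integers:
//   busy         - 1 if the checkpoint could not run to completion
//   log          - number of frames currently in the WAL file
//   checkpointed - number of frames transferred into the main db file
func checkpoint(db *sql.DB) error {
	var busy, logFrames, checkpointed int
	err := db.QueryRow("PRAGMA wal_checkpoint(TRUNCATE)").Scan(&busy, &logFrames, &checkpointed)
	if err != nil {
		return err
	}
	fmt.Printf("busy=%d log=%d checkpointed=%d\n", busy, logFrames, checkpointed)
	return nil
}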

#The problem

Consider the following code written in Go that creates an SQLite database, populates it with data, and immediately queries that data back. At the end it prints the size of the main db file and of its associated WAL file. (The imports in the listing below assume the gofakeit and go-humanize helper packages and a driver registered under the name "sqlite", such as modernc.org/sqlite.)

package main

import (
	"database/sql"
	"fmt"
	"os"

	"github.com/brianvoe/gofakeit/v6"
	"github.com/dustin/go-humanize"
	_ "modernc.org/sqlite" // assumed driver; registers itself under the name "sqlite"
)

func insertUsers(db *sql.DB, count int) error {
	for i := 0; i < count; i++ {
		_, err := db.Exec("INSERT INTO users VALUES (?)", gofakeit.Email())
		if err != nil {
			return err
		}
	}
	return nil
}

func printFileSize(prefix, name string) error {
	fi, err := os.Stat(name)
	if err != nil {
		return err
	}
	fmt.Println(prefix, humanize.Bytes(uint64(fi.Size())))
	return nil
}

func main() {
	dbDir, err := os.MkdirTemp("", "test")
	if err != nil { panic(err) }
	defer os.RemoveAll(dbDir)

	db, err := sql.Open("sqlite", dbDir+"/test.db")
	if err != nil { panic(err) }
	defer db.Close()

	_, err = db.Exec("PRAGMA journal_mode=WAL")
	if err != nil { panic(err) }

	_, err = db.Exec("CREATE TABLE users (email TEXT)")
	if err != nil { panic(err) }

	err = insertUsers(db, 10)
	if err != nil { panic(err) }

	rows, err := db.Query("SELECT email FROM users")
	if err != nil { panic(err) }

	// Do stuff with rows here but
	//   don't call rows.Close()
	//   and don't read all the rows

	rows.Close() // <-- REMOVING THIS WILL HAVE AN INTERESTING EFFECT

	err = insertUsers(db, 10000)
	if err != nil { panic(err) }

	err = printFileSize("Main DB file size:", dbDir+"/test.db")
	if err != nil { panic(err) }
	err = printFileSize("WAL file size:", dbDir+"/test.db-wal")
	if err != nil { panic(err) }
}

When we run this code it prints:

Main DB file size: 406 kB
WAL file size: 4.1 MB

This means the main db file is 406 kB and the WAL file is 4.1 MB. We can see that at least some of the db updates generated by insertUsers were transferred from the WAL file to the main db file. To run a checkpoint manually, we could add the following code before printing the size of the files:

_, err := db.Exec("PRAGMA wal_checkpoint(TRUNCATE)")
if err != nil {
 panic(err)
}

When we run the code with this addition, it prints:

Main DB file size: 414 kB
WAL file size: 0 B

After such a manual checkpoint, the main db file is 414 kB in size and the WAL file is empty.

Now, let's pretend that we have forgotten to call rows.Close(). Without running a manual checkpoint, the new code prints:

Main DB file size: 4.1 kB
WAL file size: 42 MB

Meaning that the main db file is only 4.1 kB while the WAL file is 42 MB. Running the checkpoint manually does not change this result. It's worth noticing that in this second case, when we forgot to close the rows object, the storage taken by the same number of db rows is significantly (10–100x) higher.

#What happened here?

The rows object represents a read operation on the SQLite database. Unless rows is closed, either explicitly by calling rows.Close() or implicitly by reading all the data from it, SQLite treats the operation as ongoing. It turns out that ongoing reads prevent the checkpoint operation from advancing. As a result, one leaked rows object can prevent the database from ever transferring changes from the WAL file to the main db file and truncating the WAL file. This means the WAL file grows indefinitely, at least until the process is restarted, which can take a long time for a server application.
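The fix is mechanical: close every rows object, ideally with defer, and drain it fully while checking rows.Err(). A minimal sketch of the pattern (the function name is mine, the table matches the example above):

// listEmails shows the safe pattern: defer Close and drain all rows.
func listEmails(db *sql.DB) ([]string, error) {
	rows, err := db.Query("SELECT email FROM users")
	if err != nil {
		return nil, err
	}
	// Closing the rows ends the read operation, which lets the checkpoint
	// advance and the WAL file be truncated again.
	defer rows.Close()

	var emails []string
	for rows.Next() {
		var email string
		if err := rows.Scan(&email); err != nil {
			return nil, err
		}
		emails = append(emails, email)
	}
	// rows.Err reports any error that ended the iteration early.
	return emails, rows.Err()
}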

#Conclusions

It's very important to remember to close each result of calls to sql.DB.Query and sql.DB.QueryContext. When using SQLite, failing to do so can prevent the checkpoint operation from happening and lead to the database using orders of magnitude more disk space than necessary.

#Side note

SQLite has a special operation for reclaiming disk space and compacting the database, called VACUUM. It turns out that the problem described in this article affects VACUUM in the same way it affects checkpointing.
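For completeness, here is a minimal sketch of running VACUUM from Go, assuming db is the *sql.DB from the example above. Per the note above, it too is held back while a leaked rows object keeps a read open:

// Minimal sketch: compact the database file. Like a checkpoint, this
// is blocked while any rows object from an earlier query stays open.
if _, err := db.Exec("VACUUM"); err != nil {
	panic(err)
}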
