Fortran Relational Database

Fortran Relational Database

This is a self-contained relational database written entirely in Fortran, see the README below for details.

All the code has been written by Claude Opus 4.8 - about 13k lines in total including about 5.6k for the engine. My contribution has been the overall design, style, refactoring, testing requirements, clean builds, platform requirements and code reviews. It has had many reviews for which I used Codex, Antigravity, Fable 5 as well as Opus 4.8 in ultrathink mode. We even found a bug in gfortran (125866) which is being fixed.

All development was done on Linux (Mageia 9). Both ‘make’ and ‘fpm’ can be used to build it and FORD documentation is included.

It is known to build and run on Linux with ifx 2026.0.0, gfortran 16.1.0, flang 23.0.0git ; on Windows with mingw64 and tested with Wine in podman.

So where might this be useful? In keeping with the development, I asked Claude:

A Fortran program that needs indexed, transactional, queryable local storage — but doesn’t want to leave the language, link C/SQLite, or stand up a database server. Anywhere you’d currently reach for ad-hoc flat files, unformatted dumps, or a fragile NetCDF/CSV scheme, sqr offers indexed lookup + crash-safety instead.

                         Abridged README.md

sqr — a pure-Fortran relational store

sqr is a lightweight, embeddable relational storage engine written entirely
in modern Fortran. It stores tables as fixed-record binary files in a
directory, with on-disk B+tree secondary indices, a physical rollback
journal for crash-safe transactions, and two interactive front-ends — a
state-graph shell (sqrsh) and a small SQL-subset REPL (sqlsh).

It is deliberately scoped for the small-to-medium workloads a single program
needs (10⁴–10⁶ rows), not for postgres-scale concurrency. Access is
single-writer / multi-reader: an advisory lock admits one read-write
connection or any number of read-only ones at a time — there is no concurrent-writer (per-row / MVCC) isolation. The design goal is integrity first: every mutation is write-ahead journalled and survives a crash, even at the cost of an fsync per write.


Features

  • A database is a directory. No server, no daemon — open a path.
  • Typed columns: DT_INT (32-bit), DT_REAL (64-bit), DT_CHAR
    (fixed-width, NUL-padded), DT_TEXT (length-prefixed blob, binary-safe).
  • Per-row NULL bitmap — a NULL column reads back as absent and is omitted
    from any index it belongs to (partial-index semantics).
  • On-disk B±tree indices, including composite keys, uniqueness, range
    scans and cursors. The b_tree module is fully decoupled from sqr and is
    independently reusable.
  • Schema evolution: ADD COLUMN / DROP COLUMN by table rewrite, with
    row ids preserved (no index rebuild) and DROP cascading to dependent
    indices.
  • Crash-safe transactions: explicit db_begin/db_commit/db_rollback,
    plus auto-commit brackets around every single-row mutator. See
    ACID guarantees below.
  • Single-writer / multi-reader locking: an advisory lock taken on open
    admits one writer or many readers; contention is reported as SQR_LOCKED,
    and db_set_readonly demotes a writer to let readers in.
  • No error stop in library code — every entry point reports via optional
    stat / errmsg arguments.
  • Procedural and object-oriented APIs: call db_insert(db, ...) or
    db%insert(...).
  • Two front-ends: sqrsh, a cmdgraph state-graph shell over the engine,
    and sqlsh, a small SQL subset (a separate sql front-end layer that
    only calls the public API — no SQL in the store). See below.
  • Portable: clean on ifx and gfortran, builds under fpm, and
    cross-builds to Windows (mingw-w64, wine-validated).

Quick start (Fortran API)

use :: sqr
type(db_t), target :: db
type(column_t)     :: cols(2)
character(len=:), allocatable :: buf
integer        :: st, ti
integer(int32) :: rid

call db_open(db, 'mydb', stat=st)                 ! a database is a directory

cols(1)%name = 'id';   cols(1)%dtype = DT_INT;  cols(1)%csize = 4
cols(2)%name = 'name'; cols(2)%dtype = DT_CHAR; cols(2)%csize = 32
call db_create_table(db, 'people', cols, st)
ti = db_table_index(db, 'people')

call row_alloc(buf, db%tables(ti)%record_size)
call row_set_int (buf, db%tables(ti)%cols(1), 1_int32)
call row_set_char(buf, db%tables(ti)%cols(2), 'Ada')
call db_insert(db, 'people', buf, rid, st)        ! rid = new row id

call db_create_index(db, 'people', 'id', st)      ! on-disk B+-tree
call db_find_by_int(db, 'people', 'id', 1_int32, rid, st)

call db_close(db, st)

Wrap a group of changes in an explicit transaction when you need them to
commit (and fail) as a unit:

call db_begin(db, st)
! ... several inserts / updates / deletes ...
call db_commit(db, st)     ! durable here; or db_rollback(db, st)

The sqrsh shell

sqrsh is a small state-graph REPL over the engine. The command set:

root:    open <dir>   close   readonly   tables   desc <table>
         create <table>   use <table>   drop <table>   quit
creator: col <name> <type>   done   cancel   quit
table:   insert ...   select   get <id>   delete <id>   compact
         addcolumn <name> <type>   dropcolumn <name>
         index [unique] <col>...   dropindex <col>...   verify
         find <col> <value>   range <col> <lo> <hi>   match <col> <regex>
         getk ...   delk ...   back   quit

The sqlsh shell (SQL subset)

sqlsh is a second, independent front-end: a familiar SQL “shop window”
over the same engine. It is a front-end layer only — the sql module
(lexer, parser, executor) and the REPL call nothing but the public db_*
API, so the dependency runs one way (sql uses sqr, never the reverse)
and nothing about the on-disk format changes. The store itself has no
notion of SQL.

sqlsh mydb < script.sql        # run a script (results on stdout)
sqlsh mydb                     # interactive (prompts/errors on stderr)

Meta-commands: .open <dir>, .close, .tables, .schema [table],
.help, .quit. Everything else is SQL:

CREATE TABLE employee (id INTEGER, name CHAR(20), dept CHAR(12), salary REAL);
CREATE INDEX ON employee (dept);
INSERT INTO employee VALUES (1,'Alice','eng',55000.0), (2,'Bob','eng',48000.0);
SELECT name, salary FROM employee WHERE dept = 'eng' ORDER BY salary DESC LIMIT 5;
UPDATE employee SET salary = 50000.0 WHERE dept = 'sales' AND salary < 50000.0;
DELETE FROM employee WHERE salary < 40000.0;

On-disk layout

A database is a directory containing:

File Contents
_catalog.dat top-level catalog: the list of table names
<table>.schema per-table schema header + column definitions
<table>.dat fixed-size records (recl = record_size)
<table>.blob length-prefixed DT_TEXT values
<table>__i<slot>.idx one paged B±tree per secondary index
_journal.dat rollback (undo) journal; present only while a txn is open or pending recovery
_lock zero-byte sentinel carrying the advisory open lock

Each record is: 1 status byte (ROW_ALIVE / ROW_TOMBSTONE), then a
(ncols+7)/8-byte NULL bitmap, then column data at fixed offsets.


ACID guarantees

sqr is honest about what it provides. The store is single-writer /
multi-reader
: at most one read-write connection, or any number of
read-only connections, at a time — enforced by an advisory lock taken on
db_open. It does not provide fine-grained (per-row / MVCC) isolation
between concurrent writers.

Property Status How
Atomicity :white_check_mark: Yes Physical undo journal; a failed or rolled-back transaction restores every touched region.
Consistency :white_check_mark: Yes, including across a crash Recovery on db_open replays the journal, restoring data, blob and index files together.
Durability :white_check_mark: Yes Strict write-ahead: each undo image is fsync’d to the hot journal before the base write it guards. Commit fsyncs every modified file, then voids the journal header — the single durable commit point.
Isolation ◑ Coarse An advisory lock on _lock admits one writer xor many readers. A second writer (or a reader while a writer is active) is refused with SQR_LOCKED. db_set_readonly downgrades a writer so readers may attach. No concurrent-writer / row-level isolation.

Locking is whole-database advisory: flock(2) on POSIX, LockFileEx on
Windows. It is released on db_close and automatically by the OS if the
process dies, so a crashed writer never wedges the database.

The durability path is deliberately conservative — an fsync per write —
because the project prioritises integrity over throughput.


5 Likes