Columnar DB

A column-oriented analytical database built from scratch in Go, inspired by DuckDB & ClickHouse.

~19K lines of Go
474 tests
0 external deps
5 phases

Quick Start

go run ./cmd/columnar-db

columnar-db> SELECT city, COUNT(*), AVG(age) FROM people GROUP BY city ORDER BY city
city     | COUNT(*) | AVG(age)
---------+----------+---------
Beijing  | 166      | 47.82
Kunming  | 184      | 45.51
Lyon     | 156      | 44.48
Paris    | 165      | 44.72
Shanghai | 174      | 44.94
Tokyo    | 155      | 45.02
(6 rows)

The REPL ships with a 1000-row demo dataset. Type .schema to see columns, .help for syntax, .quit to exit.

SQL Subset

SELECT { * | col [, ...] | agg(col) [, ...] }
FROM table
[WHERE col { = | != | < | <= | > | >= } literal]
[GROUP BY col [, col ...]]
[ORDER BY col [ASC | DESC]]
[LIMIT n]

Aggregates: COUNT(*), COUNT(col), SUM, MIN, MAX, AVG on Int64 and Float64.
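
As an illustration of how an aggregate such as AVG can be computed batch by batch, here is a minimal sketch. The type name AvgState and its exact method signatures are assumptions made for this example, not the project's actual API. It keeps a running sum and count, so each batch folds in without allocating.

```go
package main

import "fmt"

// AvgState is a hypothetical running state for AVG over Int64 input.
// It is an illustrative sketch, not the project's real aggregate type.
type AvgState struct {
	Sum   int64
	Count int64
}

// Update folds one batch of values into the state. It only touches the
// two fields above, so it performs no allocation.
func (s *AvgState) Update(batch []int64) {
	for _, v := range batch {
		s.Sum += v
	}
	s.Count += int64(len(batch))
}

// Finalize returns the average. The boolean is false when no rows were
// seen, which stands in for SQL NULL in this sketch.
func (s *AvgState) Finalize() (float64, bool) {
	if s.Count == 0 {
		return 0, false
	}
	return float64(s.Sum) / float64(s.Count), true
}

func main() {
	var s AvgState
	s.Update([]int64{40, 50}) // first batch
	s.Update([]int64{45})     // second batch
	avg, ok := s.Finalize()
	fmt.Println(avg, ok) // prints "45 true"
}
```

Keeping sum and count separate until Finalize means partial states from different batches can be combined by simple addition. How the project treats AVG over zero rows is not stated here, so the boolean return is only one possible convention.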

Components

Key Concepts

Columnar storage: Contiguous typed arrays + null bitmaps, not row tuples
Vectorized execution: Process ~1024 values per batch in tight typed loops
Late materialization: Selection vectors narrow live rows without copying data
Zero-alloc hot path: Per-batch Update/Finalize allocates nothing at steady state
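
The concepts above can be sketched in a few lines of Go. This is a simplified illustration under stated assumptions: Int64Column, FilterGreater, and SumSelected are hypothetical names, and the bitmap layout (one bit per row, set means non-null) is one common choice rather than a description of this project's file format.

```go
package main

import "fmt"

// Int64Column is a hypothetical column: a contiguous value array plus a
// validity bitmap in which bit i set means row i is non-null.
type Int64Column struct {
	Values []int64
	Valid  []uint64
}

// NewInt64Column allocates a column of n rows, all initially null.
func NewInt64Column(n int) *Int64Column {
	return &Int64Column{
		Values: make([]int64, n),
		Valid:  make([]uint64, (n+63)/64),
	}
}

// Set stores v at row i and marks the row as non-null.
func (c *Int64Column) Set(i int, v int64) {
	c.Values[i] = v
	c.Valid[i/64] |= uint64(1) << (uint(i) % 64)
}

// IsValid reports whether row i holds a non-null value.
func (c *Int64Column) IsValid(i int) bool {
	return c.Valid[i/64]&(uint64(1)<<(uint(i)%64)) != 0
}

// FilterGreater fills sel with the indices of non-null rows whose value
// exceeds threshold. It reuses sel's backing array, so a caller that
// passes a buffer with enough capacity triggers no allocation.
func FilterGreater(c *Int64Column, threshold int64, sel []int) []int {
	sel = sel[:0]
	for i, v := range c.Values {
		if c.IsValid(i) && v > threshold {
			sel = append(sel, i)
		}
	}
	return sel
}

// SumSelected adds up only the rows named by the selection vector.
// The column data itself is never copied or compacted.
func SumSelected(c *Int64Column, sel []int) int64 {
	var total int64
	for _, i := range sel {
		total += c.Values[i]
	}
	return total
}

func main() {
	col := NewInt64Column(5)
	col.Set(0, 10)
	col.Set(1, 25)
	// Row 2 is left null.
	col.Set(3, 40)
	col.Set(4, 5)

	sel := make([]int, 0, 1024) // reusable per-batch buffer
	sel = FilterGreater(col, 8, sel)
	fmt.Println(sel, SumSelected(col, sel)) // prints "[0 1 3] 75"
}
```

The filter produces row indices instead of a new column, which is the essence of late materialization: downstream operators read the original arrays through the selection vector.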

Build Phases

Phase | Topic           | Highlight
------+-----------------+-----------------------------------------------------
1     | Column Storage  | Types, null bitmap, row groups, file format
2     | Encoding        | RLE, dictionary, delta, LZ4
3     | Vectorized Scan | Batch processing, filters, selection vectors
4     | Aggregation     | Hash GROUP BY, 1.67x speedup on high-cardinality
5     | SQL Frontend    | Lexer, parser, planner; +0.3% overhead vs direct API
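
To make the Phase 2 encodings more concrete, here is a minimal sketch of run-length and delta encoding for Int64 values. The function names and the Run type are hypothetical, and the project's actual on-disk layout may differ; the sketch only shows the idea behind each scheme.

```go
package main

import "fmt"

// Run is one (value, length) pair of a run-length encoded column.
type Run struct {
	Value  int64
	Length int
}

// RLEEncode collapses consecutive equal values into runs.
func RLEEncode(values []int64) []Run {
	var runs []Run
	for _, v := range values {
		if n := len(runs); n > 0 && runs[n-1].Value == v {
			runs[n-1].Length++
			continue
		}
		runs = append(runs, Run{Value: v, Length: 1})
	}
	return runs
}

// RLEDecode expands runs back into the original sequence.
func RLEDecode(runs []Run) []int64 {
	total := 0
	for _, r := range runs {
		total += r.Length
	}
	out := make([]int64, 0, total)
	for _, r := range runs {
		for i := 0; i < r.Length; i++ {
			out = append(out, r.Value)
		}
	}
	return out
}

// DeltaEncode stores the first value followed by successive differences,
// which keeps the numbers small for sorted or slowly changing columns.
func DeltaEncode(values []int64) []int64 {
	out := make([]int64, len(values))
	var prev int64
	for i, v := range values {
		out[i] = v - prev
		prev = v
	}
	return out
}

// DeltaDecode rebuilds the original values with a running sum.
func DeltaDecode(deltas []int64) []int64 {
	out := make([]int64, len(deltas))
	var acc int64
	for i, d := range deltas {
		acc += d
		out[i] = acc
	}
	return out
}

func main() {
	fmt.Println(RLEEncode([]int64{7, 7, 7, 3, 3, 9}))     // prints "[{7 3} {3 2} {9 1}]"
	fmt.Println(DeltaEncode([]int64{100, 101, 103, 106})) // prints "[100 1 2 3]"
}
```

RLE pays off on columns with long repeats, such as a sorted city column, while delta encoding suits monotonic values such as timestamps or row IDs.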