Benchmarks

Where Gribouille stays fast, and where a chart grows too heavy to compile.

Every Gribouille chart is compiled by Typst, and most layers draw one mark per data row. That makes compile cost rise with the number of elements on the page, so a chart that is instant at a few hundred points can become impractical at tens of thousands. This page measures that growth so you can judge, before you build a chart, whether your data volume suits the library.

How the numbers are produced

A small harness under tools/benchmark/ compiles a fixed set of charts across a range of element counts and the three output formats (PNG, SVG, and PDF). For each cell it records the compile time, the output size, and whether the compile finished within a fixed time budget. A chart that exceeds the budget is recorded as a timeout rather than measured, because past that point the library is no longer a practical choice for that workload.

The figures below read the recorded data and are themselves drawn with Gribouille. The numbers are illustrative and depend on the machine that produced them, so read the shape of each curve rather than its absolute values.
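To inspect the recorded data without compiling a figure, a short script can report the largest element count that completed for each chart type and format. The sketch below is not part of the harness. It assumes the column names that the figure sources on this page read (case, n, format, time_s, bytes, status), and it assumes that timeout rows leave time_s and bytes empty. The sample rows are made up for illustration and are not benchmark results.

```python
import csv
import io

# Made-up rows for illustration only; they are NOT benchmark results.
# Column names follow those read by the figure sources on this page.
SAMPLE = """case,n,format,time_s,bytes,status
point,100,png,0.5,20480,ok
point,1000,png,2.0,40960,ok
point,10000,png,,,timeout
boxplot,100,png,0.4,10240,ok
boxplot,10000,png,3.0,12288,ok
"""


def largest_completed(rows):
    """Map (case, format) to the largest n whose status is "ok"."""
    best = {}
    for row in rows:
        if row["status"] != "ok":
            continue  # skip timeouts: they carry no measurement
        key = (row["case"], row["format"])
        best[key] = max(best.get(key, 0), int(row["n"]))
    return best


ceilings = largest_completed(csv.DictReader(io.StringIO(SAMPLE)))
for (case, fmt), n in sorted(ceilings.items()):
    print(f"{case:10s} {fmt:4s} largest completed n = {n}")
```

To run it on the committed dataset, replace io.StringIO(SAMPLE) with an open file handle on docs/benchmarks/results.csv.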

Procedure

Each case is a standalone Typst file. It reads its element count from sys.inputs, so one file covers every size, and it builds deterministic synthetic data, so a given size renders the same chart on every run. The harness compiles the cases one at a time, never in parallel, so that concurrent compiles cannot contend for the processor and distort the timings. It wraps each compile in /usr/bin/time, takes the median wall time over the requested repetitions, and records the output file size. Any compile that exceeds the time budget is killed and marked as a timeout.
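The measurement loop described above can be sketched in a few lines. This is a Python illustration of the idea, not the Lua harness itself. The function name time_command is hypothetical, and the two stand-in commands replace the real typst compile call so that the sketch runs without Typst installed.

```python
import statistics
import subprocess
import sys
import time


def time_command(cmd, reps=3, timeout_s=90.0):
    """Run cmd reps times, one after another; return (status, median_seconds).

    A run that exceeds timeout_s is killed and reported as a timeout.
    A non-zero exit raises CalledProcessError, which is left to propagate.
    """
    samples = []
    for _ in range(reps):
        start = time.perf_counter()
        try:
            # subprocess.run kills the child process when the timeout expires.
            subprocess.run(
                cmd,
                check=True,
                timeout=timeout_s,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except subprocess.TimeoutExpired:
            return ("timeout", None)
        samples.append(time.perf_counter() - start)
    return ("ok", statistics.median(samples))


# Stand-in commands (assumptions for illustration, not benchmark cases).
quick = [sys.executable, "-c", "pass"]
slow = [sys.executable, "-c", "import time; time.sleep(5)"]

fast_result = time_command(quick, reps=3, timeout_s=30.0)
slow_result = time_command(slow, reps=1, timeout_s=0.5)
print(fast_result)
print(slow_result)
```

For a real measurement, cmd would be a typst compile invocation that passes the element count through the --input option, which is how values reach sys.inputs. Running the commands strictly in sequence mirrors the harness's rule against parallel compiles.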

The committed dataset behind the figures on this page was produced with one repetition per cell and a 90-second budget:

lua tools/benchmark/run.lua \
  --variants base \
  --sizes 100,1000,10000,100000 \
  --formats png,svg,pdf --reps 1 --timeout 90 \
  --out docs/benchmarks/results.csv

Every case follows the same shape: read the count, build the data, draw one layer. The point case, for instance, is:

Benchmark case: point.typ

#import "../../../lib.typ": *

#set page(width: auto, height: auto, margin: 0cm)

#let n = int(sys.inputs.at("n", default: "100"))
#let variant = sys.inputs.at("variant", default: "base")

#let settings = (
  base: (size: 1.5pt, alpha: 0.6),
  large: (size: 4pt, alpha: 0.6),
  star: (size: 2.5pt, shape: "star", alpha: 0.8),
  alpha: (size: 1.5pt, alpha: 0.25),
)
#let opts = settings.at(variant, default: settings.base)

#let d = range(0, n).map(i => {
  let t = i / n
  let theta = t * 6 * calc.pi
  (x: t * 12, y: calc.sin(theta) + t * 3)
})

#plot(
  data: d,
  mapping: aes(x: "x", y: "y"),
  layers: (geom-point(..opts),),
  width: 12cm,
  height: 7cm,
)

The full harness, every other case, and the command-line options are in tools/benchmark/.

Compile time against element count

Typst source for this figure

// Compile time versus element count, read from the committed benchmark dataset
// and drawn with gribouille itself.
//
// Compile from the project root for debugging:
//
//   typst compile --root . docs/guides/_benchmarks-time.typ docs/guides/_benchmarks-time.pdf
//
// The .qmd page reuses this file via the `file: _benchmarks-time.typ` chunk
// option; do not move or rename it without updating that reference.

#import "/lib.typ": *

#set page(width: auto, height: auto, margin: 0.25cm)

// Time budget in seconds; matches the --timeout value used to produce the dataset.
#let budget = 90

#let rows = csv("/docs/benchmarks/results.csv", row-type: dictionary)
#let done = (
  rows
    .filter(r => r.status == "ok")
    .map(r => (
      case: r.case,
      n: int(r.n),
      format: r.format,
      time: float(r.time_s),
    ))
)
#let stalled = (
  rows
    .filter(r => r.status == "timeout")
    .map(r => (case: r.case, n: int(r.n), format: r.format, time: budget))
)

#plot(
  data: done,
  mapping: aes(x: "n", y: "time", colour: "format"),
  layers: (
    geom-hline(yintercept: budget, linetype: "dashed", colour: rgb("#999999")),
    geom-line(),
    geom-point(size: 2.5pt),
    geom-point(data: stalled, shape: "cross", size: 3.5pt),
  ),
  scales: (scale-x-log10(), scale-y-log10()),
  facet: facet-wrap("case", ncolumn: 4),
  labels: labels(
    title: "Compile time grows superlinearly with element count",
    subtitle: "Crosses mark sizes that exceeded the "
      + str(budget)
      + "s budget",
    x: "Elements (log scale)",
    y: "Compile time, seconds (log scale)",
    colour: "Format",
  ),
  theme: theme-minimal(),
  width: 24cm,
  height: 11cm,
)

A grid of small line charts, one per chart type, plotting compile time against element count on logarithmic axes for the PNG, SVG, and PDF formats. The point, column, tile, line, and faceted smoother panels rise sharply and end in crosses that mark sizes which exceeded the time budget. The two-dimensional bin and boxplot panels rise more gently and reach the largest size without a cross.

Figure 1: Compile time against element count for each chart type, on log scales. Per-row layers climb steeply and hit the time budget, while binning and aggregating layers stay within it.

Per-row layers (geom-point, geom-col, geom-tile) slow down far faster than the data grows. On the test machine a scatter of one thousand points compiles in about two seconds, ten thousand takes close to a minute, and one hundred thousand never finishes inside the budget. Most of the cost comes from drawing a separate mark for every row, so a single connected geom-line is markedly cheaper than the same number of separate markers, although it too eventually exceeds the budget.

Aggregating layers move the ceiling outward rather than removing it. A two-dimensional bin or a boxplot collapses the rows to a small, fixed number of marks, so it still completes at one hundred thousand rows where every per-row layer times out, even though it too slows as the row count climbs.
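To put a rough number on "faster than the data grows", the two scatter timings quoted above can be turned into a growth exponent: the slope of the line joining them on log-log axes. The sketch below uses the rounded figures from the text and takes "close to a minute" as 60 seconds, which is an assumption. The result is an order-of-magnitude guide, not a measurement.

```python
import math


def growth_exponent(n1, t1, n2, t2):
    """Slope of the straight line through two points on log-log axes."""
    return math.log(t2 / t1) / math.log(n2 / n1)


# Rounded readings from the text; 60 s stands in for "close to a minute".
n_small, t_small = 1_000, 2.0
n_large, t_large = 10_000, 60.0

k = growth_exponent(n_small, t_small, n_large, t_large)

# Extrapolate one more decade, assuming the same exponent keeps holding.
projected = t_large * (100_000 / n_large) ** k

print(f"time grows roughly as n^{k:.2f}")
print(f"projected time at 100000 points: about {projected:.0f} s")
```

With these inputs the exponent comes out near 1.5, and the extrapolated time at one hundred thousand points is about half an hour, far beyond the 90-second budget. That is consistent with the timeout recorded at that size, though an extrapolation from two rounded points should not be read more precisely than that.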

Output size against element count

Typst source for this figure

// Output size versus element count, read from the committed benchmark dataset
// and drawn with gribouille itself.
//
// Compile from the project root for debugging:
//
//   typst compile --root . docs/guides/_benchmarks-size.typ docs/guides/_benchmarks-size.pdf
//
// The .qmd page reuses this file via the `file: _benchmarks-size.typ` chunk
// option; do not move or rename it without updating that reference.

#import "/lib.typ": *

#set page(width: auto, height: auto, margin: 0.25cm)

#let rows = csv("/docs/benchmarks/results.csv", row-type: dictionary)
#let done = (
  rows
    .filter(r => r.status == "ok" and r.bytes != "")
    .map(r => (
      case: r.case,
      n: int(r.n),
      format: r.format,
      kb: float(r.bytes) / 1024,
    ))
)

#plot(
  data: done,
  mapping: aes(x: "n", y: "kb", colour: "format"),
  layers: (
    geom-line(),
    geom-point(size: 2.5pt),
  ),
  scales: (scale-x-log10(), scale-y-log10()),
  facet: facet-wrap("case", ncolumn: 4),
  labels: labels(
    title: "Vector output balloons with element count, raster stays compact",
    subtitle: "SVG carries one node per mark; PNG and PDF grow far more slowly",
    x: "Elements (log scale)",
    y: "Output size, KB (log scale)",
    colour: "Format",
  ),
  theme: theme-minimal(),
  width: 24cm,
  height: 11cm,
)

A grid of small line charts, one per chart type, plotting output file size against element count on logarithmic axes for the PNG, SVG, and PDF formats. The SVG line rises steeply for the per-row layers because every mark becomes a node in the file, while the PNG and PDF lines stay much lower and flatter. The binning and boxplot panels stay flat for all three formats.

Figure 2: Output size against element count for each chart type, on log scales. Vector output grows with the number of marks, while raster output stays compact.

Format matters as much as count. SVG stores one node per mark, so a dense scatter produces a very large file, whereas PNG rasterises to a fixed grid and stays compact regardless of mark count. PDF sits between the two. For a chart with many thousands of marks, prefer a raster format unless you specifically need vector output.

What this means for your charts

A few hundred to a few thousand marks per chart compile comfortably in any format.
Tens of thousands of per-row marks are slow, taking tens of seconds, and a raster format is the only sensible choice.
One hundred thousand per-row marks exceed the budget, so reshape the work instead of drawing every row.
When the data is large, reach for a layer that aggregates first, such as geom-bin-2d, geom-hex, geom-histogram, or geom-boxplot. These collapse the rows to a few marks and push the practical ceiling much further out, though very large row counts still cost time.
When even the aggregation is heavy, do the summarising in a language built for data analysis, such as R or Python, then pass the small, pre-computed result to Gribouille to draw.
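As an illustration of that last point, the sketch below bins one hundred thousand points into a fixed grid with the Python standard library and writes one row per non-empty cell. It reuses the synthetic curve from the point case so that it is self-contained. The function name bin_2d, the grid size, and the output column names x, y, and count are choices made for this example, not part of Gribouille.

```python
import csv
import io
import math
from collections import Counter


def bin_2d(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid.

    Returns one dict per non-empty cell, holding the cell centre and its count.
    Assumes every point lies inside x_range and y_range.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    counts = Counter()
    for x, y in points:
        # Clamp so that points on the upper edge fall in the last cell.
        i = min(int((x - x0) / (x1 - x0) * nx), nx - 1)
        j = min(int((y - y0) / (y1 - y0) * ny), ny - 1)
        counts[(i, j)] += 1
    wx, wy = (x1 - x0) / nx, (y1 - y0) / ny
    return [
        {"x": x0 + (i + 0.5) * wx, "y": y0 + (j + 0.5) * wy, "count": c}
        for (i, j), c in sorted(counts.items())
    ]


# Same synthetic curve as the point case: x = 12t, y = sin(6*pi*t) + 3t.
n = 100_000
points = [
    (12 * i / n, math.sin(6 * math.pi * i / n) + 3 * i / n) for i in range(n)
]

cells = bin_2d(points, x_range=(0, 12), y_range=(-1, 4), nx=40, ny=25)

# Write the small summary as CSV; swap io.StringIO for a real file to keep it.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["x", "y", "count"])
writer.writeheader()
writer.writerows(cells)

print(f"{n} rows reduced to {len(cells)} cells")
```

The chart then has at most 40 × 25 = 1000 marks to draw, whatever the original row count, which keeps it in the range that compiles comfortably. The resulting CSV can be read on the Typst side with the csv function, as the figure sources on this page do.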