Executable Code and Diagrams in Technical Books: A Practical Setup Guide
How to handle snippet extraction pitfalls, integrate Mermaid and Draw.io diagrams, and build a CI pipeline for technical books with infrastructure-dependent code examples.
This is a follow-up to Writing Technical Books in 2026: Tools, Workflows, and the Case for Executable Code, where I compared Quarto, Jupyter Book, Pandoc, and other tools for technical book authoring. That post ended with a recommendation: use Quarto with a hybrid approach — inline executable code for simple examples, snippet extraction from a tested repo for infrastructure-heavy ones.
This post gets into the practical details. How do you actually set up snippet extraction without it becoming a maintenance nightmare? How do you handle diagrams across PDF and HTML output? And what does the CI pipeline look like for a book that needs Docker infrastructure to validate its code examples?
I’ve published a working template repository that implements everything discussed here — a Quarto book with Postgres via Docker Compose, executable code, snippet extraction, pre-commit hooks, supply chain protections, and a full CI pipeline. Fork it and start writing.
## The Hybrid Architecture
Most technical books fall into a pattern: early chapters set up infrastructure (databases, message queues, cloud services), and later chapters build application logic on top. The infrastructure chapters need real, tested code but can’t execute during a book render. The application chapters can often run inline.
```mermaid
flowchart TB
    subgraph "Chapters 1-2: Infrastructure"
        A[Tested repo] -->|snippet extraction| B[eval: false<br/>code blocks]
    end
    subgraph "Chapters 3+: Application"
        C[Inline code in .qmd] -->|Quarto executes| D[Output captured<br/>in document]
    end
    E[Docker Compose] -->|pre-render script| C
    A -->|CI tests| F[Validated independently]
```
The project structure:
```
book/
  _quarto.yml                      # Book config with pre/post render hooks
  chapters/
    01-infrastructure-setup.qmd    # Snippets from tested repo
    02-platform-basics.qmd         # Snippets + some inline
    03-data-pipeline.qmd           # Inline executable code
    04-monitoring.qmd              # Inline + diagrams
  code/                            # Tested source code repo
    ch01/
    ch02/
    tests/
  diagrams/                        # Draw.io/Excalidraw sources
  scripts/
    start-infra.sh                 # docker compose up -d
    stop-infra.sh                  # docker compose down
    lint-snippets.py               # Validate snippet references
    export-diagrams.sh             # Draw.io → SVG
  .github/workflows/
    book.yml                       # Full CI pipeline
```
## Snippet Extraction Done Right
The basic idea is simple: mark regions in your source code with named tags, reference them from the manuscript, and a build script resolves the includes. The pitfalls are also well-known — chapter reordering breaks narrative context, refactoring cascades into the manuscript, dead snippets accumulate. Here’s how to handle each one.
### Snippet markers
Use a consistent format that works as valid comments in any language:
```python
# code/ch01/kafka_setup.py

def create_producer():
    # <<< snippet: kafka-producer-setup >>>
    producer = KafkaProducer(
        bootstrap_servers=['localhost:9092'],
        value_serializer=lambda v: json.dumps(v).encode('utf-8')
    )
    return producer
    # <<< /snippet: kafka-producer-setup >>>

def send_event(producer, topic, event):
    # <<< snippet: kafka-send-message >>>
    producer.send(topic, event)
    producer.flush()
    # <<< /snippet: kafka-send-message >>>
```
Reference them in the manuscript with `eval: false` so Quarto renders the code but doesn’t try to execute it:
````markdown
## Setting Up the Producer

```{python}
#| eval: false
{{< include ../code/ch01/kafka_setup.py#kafka-producer-setup >}}
```
````
The producer serializes each event as JSON before sending.
Quarto’s include shortcode handles the extraction natively — no custom build script needed for basic cases.
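If your toolchain doesn’t resolve the `#tag` fragment (support varies across versions and output formats), a pre-render resolver is easy to write yourself. A minimal sketch, assuming the marker format above; `resolve_snippet` is a name invented here, not part of Quarto:

```python
import textwrap
from pathlib import Path

def resolve_snippet(source: Path, name: str) -> str:
    """Return the dedented lines between the named snippet markers."""
    start_marker = f"<<< snippet: {name}"
    end_marker = f"<<< /snippet: {name}"
    captured, inside = [], False
    for line in source.read_text().splitlines():
        if end_marker in line:
            break
        if inside:
            captured.append(line)
        if start_marker in line:
            inside = True
    if not captured:
        raise KeyError(f"snippet '{name}' not found in {source}")
    return textwrap.dedent("\n".join(captured))
```

A pre-render hook can then substitute the resolved text into the manuscript before `quarto render` runs.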
### Pitfall 1: Chapter reordering breaks narrative references
The rule: never reference chapters by number in prose. Use Quarto cross-references instead:
```markdown
<!-- BAD — breaks when you reorder -->
As we saw in Chapter 3, the producer connects to the broker.

<!-- GOOD — Quarto resolves automatically -->
As we saw in @sec-kafka-setup, the producer connects to the broker.
```
Every section that might be referenced gets an explicit label:
```markdown
## Setting Up Kafka {#sec-kafka-setup}
```
Quarto resolves `@sec-kafka-setup` to the correct chapter and section number regardless of ordering. If you move the section, every reference updates automatically. This applies to figures (`@fig-architecture`), tables (`@tbl-metrics`), and code listings (`@lst-producer`) too.
### Pitfall 2: Refactoring code cascades into the manuscript
When you rename a function or restructure a module, three things can break: snippet markers, include references, and prose descriptions.
Snippet markers and includes — a CI linter catches these:
```python
#!/usr/bin/env python3
"""Validate that all snippet references in chapters resolve to defined snippets."""
import re
import sys
from pathlib import Path

SNIPPET_DEF = re.compile(r"<<<\s*snippet:\s*([\w-]+)")
SNIPPET_REF = re.compile(r"include\s+\S+#([\w-]+)")

def main():
    # Scan code/ for defined snippets
    defined = set()
    for path in Path("code").rglob("*"):
        if path.is_file():
            for match in SNIPPET_DEF.finditer(path.read_text()):
                defined.add(match.group(1))

    # Scan chapters/ for referenced snippets
    used = set()
    for path in Path("chapters").rglob("*.qmd"):
        for match in SNIPPET_REF.finditer(path.read_text()):
            used.add(match.group(1))

    errors = False

    # Broken references (used but not defined)
    broken = used - defined
    if broken:
        for name in sorted(broken):
            print(f"::error::Broken snippet reference: '{name}'")
        errors = True

    # Dead snippets (defined but not used)
    dead = defined - used
    if dead:
        for name in sorted(dead):
            print(f"::warning::Dead snippet (defined but never referenced): '{name}'")

    sys.exit(1 if errors else 0)

if __name__ == "__main__":
    main()
```
Run this in CI alongside your tests. Broken references fail the build. Dead snippets generate warnings.
Prose descriptions — no tool catches “we call `send_event()`” when you renamed it to `publish_event()`. The best defence is to reference behaviour, not implementation details:
```markdown
<!-- BAD — breaks on rename -->
The `send_event()` function handles serialization and delivery.

<!-- BETTER — describes behaviour -->
The event publishing function handles serialization and delivery.

<!-- BEST — reference the snippet directly -->
The function shown in @lst-kafka-send handles serialization and delivery.
```
### Pitfall 3: Cross-snippet dependencies are invisible
Chapter 5’s snippet uses a producer variable that was created in Chapter 1’s snippet. The extraction tool doesn’t know this — it just pulls text. If you reorder or remove the Chapter 1 snippet, Chapter 5’s code still extracts fine but won’t make sense to the reader.
Solution: make dependencies explicit in tests.
```python
# tests/test_ch05.py
from code.ch01.kafka_setup import create_producer
from code.ch05.consumer import process_events

def test_end_to_end():
    """Validates that ch05 examples work with ch01's infrastructure setup."""
    producer = create_producer()
    # Test ch05 snippets with ch01's setup
    result = process_events(producer, topic="test-events")
    assert result.processed_count > 0
```
The import chain makes the dependency graph explicit. If Chapter 1 refactors and breaks the interface, Chapter 5’s test fails with a clear import error — not a mysterious runtime problem weeks later when a reader tries the code.
For additional safety, declare dependencies in the snippet markers themselves:
```python
# <<< snippet: consume-events depends: kafka-producer-setup >>>
```
Your linter can parse these and validate that every dependency is included earlier in the chapter ordering.
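A sketch of that check, using the `depends:` marker grammar above. `check_dependencies` and its argument names are conventions invented for this post, not part of any published tool:

```python
import re
from pathlib import Path

# "<<< snippet: name depends: other >>>" declares that `name` assumes
# `other` has already appeared earlier in the book.
DEP_MARKER = re.compile(r"<<<\s*snippet:\s*([\w-]+)\s+depends:\s*([\w-]+)")
SNIPPET_REF = re.compile(r"include\s+\S+#([\w-]+)")

def check_dependencies(code_dir: Path, chapters_in_order: list[Path]) -> list[str]:
    """Return one error per snippet that is referenced before its dependency."""
    deps = {}
    for path in code_dir.rglob("*"):
        if path.is_file():
            for name, dep in DEP_MARKER.findall(path.read_text(errors="ignore")):
                deps[name] = dep
    errors, seen = [], set()
    for chapter in chapters_in_order:
        for name in SNIPPET_REF.findall(chapter.read_text()):
            dep = deps.get(name)
            if dep is not None and dep not in seen:
                errors.append(f"{chapter.name}: '{name}' used before its dependency '{dep}'")
            seen.add(name)
    return errors
```

Run it with the chapter list from `_quarto.yml` so the check tracks the real reading order, not the filesystem order.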
### Pitfall 4: Dead snippets accumulate
The linter script above catches this — defined but unreferenced snippets generate warnings. Run it in CI so they surface on every PR.
Go further: add a pre-commit hook that runs the linter on staged `.qmd` and source files. Catch dead snippets before they’re committed, not in CI.
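A minimal `.pre-commit-config.yaml` for that hook might look like this — a sketch assuming the repo-local linter script from earlier (the hook id and name are invented here):

```yaml
repos:
  - repo: local
    hooks:
      - id: lint-snippets
        name: Lint snippet references
        entry: python scripts/lint-snippets.py
        language: system
        files: '\.(qmd|py)$'
        pass_filenames: false
```

`pass_filenames: false` because the linter scans `code/` and `chapters/` itself; the `files` filter just decides whether the hook runs at all.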
### Pitfall 5: Context loss in extracted snippets
The reader sees 8 lines pulled from a 200-line file. Where do the imports come from? What class is this method in?
Solution: snippet groups with optional context.
```python
# <<< snippet-context: kafka-imports >>>
from kafka import KafkaProducer
import json
# <<< /snippet-context: kafka-imports >>>

# <<< snippet: kafka-producer-setup context: kafka-imports >>>
producer = KafkaProducer(
    bootstrap_servers=['localhost:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)
# <<< /snippet: kafka-producer-setup >>>
```
Your extraction script can render the context as a collapsible block above the snippet:
````markdown
::: {.callout-note collapse="true" title="Full imports for this example"}
```python
from kafka import KafkaProducer
import json
```
:::

```{python}
#| eval: false
{{< include ../code/ch01/kafka_setup.py#kafka-producer-setup >}}
```
````
Readers who need the full picture expand the callout; readers who don’t can skip past it. The context is always accurate because it’s extracted from the same source file.
## Diagrams in Multi-Format Books
Technical books need diagrams. Books also need those diagrams to work across PDF (vector, static), HTML (potentially interactive), and EPUB (static, constrained). Here’s how to handle each diagram tool.
### Mermaid — your default choice
Quarto renders Mermaid natively. Write it inline in the chapter:
```{mermaid}
%%| label: fig-data-flow
%%| fig-cap: "Event processing pipeline"
flowchart LR
A[Producer] --> B[Kafka]
B --> C[Stream Processor]
C --> D[(Database)]
C --> E[Monitoring]
```
This gives you:
- PDF: auto-rendered to SVG, embedded as a vector image
- HTML: rendered client-side, interactive (hover, zoom)
- EPUB: rendered to static SVG
- Version control: the diagram source is text in the `.qmd` file, clean diffs
Use Mermaid for flowcharts, sequence diagrams, ERDs, state diagrams, Gantt charts, and decision trees. It handles 80% of technical book diagrams.
Tip: keep diagrams under 12 nodes. If a diagram is getting complex, split it into two diagrams with a connecting narrative paragraph.
### Draw.io — for complex architecture diagrams
When Mermaid can’t handle the layout — 15+ nodes, custom positioning, overlapping layers, network topology — use Draw.io. The `.drawio` XML format is version-controllable (diffable, mergeable).
Store sources and exports together:
```
diagrams/
  ch03-platform-architecture.drawio   # Source
  ch03-platform-architecture.svg      # Exported for book
  ch07-data-flow.drawio
  ch07-data-flow.svg
```
Reference in the chapter:
```markdown
![The platform architecture](../diagrams/ch03-platform-architecture.svg){#fig-platform}
```
Automate the export in CI so you never forget to re-export after editing:
```bash
#!/bin/bash
# scripts/export-diagrams.sh
for f in diagrams/*.drawio; do
  svg="${f%.drawio}.svg"
  drawio --export --format svg --border 10 --output "$svg" "$f"
done
```
The Draw.io CLI (`drawio` or `draw.io`) works headless on Linux CI runners.
### Excalidraw — for hand-drawn style diagrams
Excalidraw produces a distinctive hand-drawn aesthetic that works well for conceptual diagrams, whiteboard-style explanations, and informal system overviews. The `.excalidraw` source is JSON — version-controllable.
The workflow is the same as Draw.io: store source + exported SVG, automate export in CI. Excalidraw’s CLI export is less mature than Draw.io’s, so you may need to export manually or use the `excalidraw-export` tool.
### D2 — for layout-critical diagrams
D2 is a text-based diagramming language with the TALA layout engine, which produces better automatic layouts than Mermaid on complex graphs. The source is a `.d2` text file.
```d2
Producer -> Kafka: events
Kafka -> "Stream\nProcessor": consume
"Stream\nProcessor" -> Database: write
"Stream\nProcessor" -> Monitoring: metrics
```
Export to SVG:
```bash
d2 --theme=0 diagram.d2 diagram.svg
```
Use D2 when you have complex 15+ node diagrams where automatic layout quality matters and Mermaid’s output looks messy.
### PlantUML — for UML compliance
If your book needs formal UML diagrams (class diagrams with visibility modifiers, sequence diagrams with activation bars and alt/else fragments, component diagrams), PlantUML is the right tool. Quarto has a PlantUML filter available.
### Choosing the right tool per diagram
```mermaid
flowchart TD
    A[Need a diagram] --> B{How complex?}
    B -->|Under 12 nodes| C{Need UML<br/>compliance?}
    B -->|12+ nodes| D{Need custom<br/>positioning?}
    C -->|No| E[Mermaid<br/>inline in .qmd]
    C -->|Yes| F[PlantUML]
    D -->|Yes| G[Draw.io or<br/>Excalidraw]
    D -->|No| H[D2<br/>text-based]
```
For most technical books: Mermaid for 80% of diagrams, Draw.io for the complex architecture diagrams, and PlantUML only if you need formal UML.
### Theme handling across output formats
PDF books are printed — light theme only. HTML versions may support dark mode. EPUB readers vary.
The pragmatic approach:
- Mermaid: Quarto handles theming per output format automatically
- Draw.io / Excalidraw / D2: export with light theme, set a white background on the SVG
Don’t generate two versions of every diagram unless your HTML version genuinely needs dark mode. For most technical books, light-theme diagrams on a white or transparent background work everywhere.
## The CI Pipeline
Everything comes together in a GitHub Action:
```yaml
name: Build Book

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'

      - name: Install dependencies
        run: pip install -r requirements.txt

      - name: Lint snippet references
        run: python scripts/lint-snippets.py

      - name: Export Draw.io diagrams
        run: |
          # Install Draw.io CLI
          scripts/export-diagrams.sh

      - name: Start infrastructure
        run: docker compose up -d
        working-directory: code

      - name: Run code tests
        run: pytest code/tests/ -v

      - name: Install Quarto
        uses: quarto-dev/quarto-actions/setup@v2

      - name: Render book
        run: quarto render

      - name: Stop infrastructure
        if: always()
        run: docker compose down
        working-directory: code

      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: book
          path: |
            _book/*.pdf
            _book/*.epub
```
The pipeline:
- Lint snippets — catch broken references and dead snippets before anything else
- Export diagrams — regenerate SVGs from Draw.io sources
- Start infrastructure — Docker Compose brings up Kafka, Postgres, etc.
- Run code tests — validate all snippets against real infrastructure
- Render book — Quarto executes inline code (connecting to the running infrastructure) and resolves snippet includes
- Upload artifacts — PDF and EPUB available as build artifacts
```mermaid
flowchart LR
    A[Push / PR] --> B[Lint snippets]
    B --> C[Export diagrams]
    C --> D[Start Docker infra]
    D --> E[Run code tests]
    E --> F[quarto render]
    F --> G[PDF + HTML + EPUB]
    D --> H[Stop infra]
    F --> H
    I[Renovate PR] --> A
```
When Renovate or Dependabot bumps a dependency, this entire pipeline runs. If the upgrade breaks a code example — whether it’s an inline Quarto block or an extracted snippet — the PR fails before merge.
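If you use Renovate, a minimal `renovate.json` is enough to get those PRs flowing — this sketch just enables the recommended preset; any repo-specific grouping rules are up to you:

```json
{
  "$schema": "https://docs.renovatebot.com/renovate-schema.json",
  "extends": ["config:recommended"]
}
```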
## Putting It All Together
The complete workflow for a technical book with infrastructure-dependent code and rich diagrams:
- Write chapters in `.qmd` files with Mermaid diagrams inline and snippet includes for infrastructure code
- Maintain tested code in `code/` with snippet markers and CI tests
- Create complex diagrams in Draw.io, store `.drawio` sources in `diagrams/`, auto-export SVGs in CI
- Pre-render script starts Docker infrastructure so inline code can execute against real services
- CI validates everything: snippet linting, code tests, diagram export, full book render
- Dependency upgrades trigger full validation — broken examples block the PR
The result is a book where every code example is tested, every diagram is version-controlled, and a library upgrade that breaks something is caught automatically before it reaches readers.