Architecture

Sploink's central abstraction is the bipartite mapping between workflow steps and hardware types. This page explains the two-layer model and how to open the interactive viz.

The two layers, named

WORKFLOW IR (customer)                     HARDWARE TYPES (sploink targets)
                                             ┌───── CPU ───────┐
classify ────────────────────────────────►   │ • Ollama        │
                                             │ ○ Salad         │
rerank ──────────────────────────────────►   │ ○ RunPod        │
                                             └─────────────────┘

extract ─────────────────────────────────►   ┌───── LPU ───────┐
                                             │ • Groq          │
reason ──────────────────────────────────►   │ ○ Cerebras      │
                                             └─────────────────┘

verify ──────────────────────────────────►   ┌──── GPU ────────┐
                                             │ ○ Together      │
                                             │ ○ RunPod        │
                                             └─────────────────┘

                                             ┌─ Frontier API ──┐
                                             │ ○ Anthropic     │
                                             │ ○ OpenAI        │
                                             └─────────────────┘

Legend: • active substrate instance, ○ listed but not yet active.

Layer 1: Hardware-type policy

The routing decision: which kind of hardware is best for this step type?

HW_ROUTED_POLICY: dict[str, str] = {
    "classify": "cpu",
    "rerank":   "cpu",
    "extract":  "cpu",
    "verify":   "cpu",
    "reason":   "lpu",
}

This policy is the strategic layer. Sploink's job is to learn good policies from observed telemetry — which (step, hardware-type) pairs preserve quality at lower cost.
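
As a rough illustration of what "learning a policy from telemetry" could mean, the sketch below picks, for each step, the cheapest hardware type whose mean quality clears a threshold. The record shape, the `learn_policy` name, the `min_quality` parameter, and all numbers are hypothetical placeholders, not Sploink's trace format or its actual learning method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical telemetry: (step, hardware_type, quality in [0, 1], cost in USD).
# Values are made-up placeholders for illustration only.
TELEMETRY = [
    ("classify", "cpu", 0.97, 0.0001),
    ("classify", "lpu", 0.98, 0.0004),
    ("reason",   "cpu", 0.71, 0.0002),
    ("reason",   "lpu", 0.93, 0.0006),
]

def learn_policy(records, min_quality=0.9):
    """Per step, pick the cheapest hardware type whose mean quality clears the bar."""
    grouped = defaultdict(list)
    for step, hw, quality, cost in records:
        grouped[(step, hw)].append((quality, cost))

    candidates = defaultdict(list)
    for (step, hw), samples in grouped.items():
        avg_quality = mean(q for q, _ in samples)
        avg_cost = mean(c for _, c in samples)
        if avg_quality >= min_quality:
            candidates[step].append((avg_cost, hw))

    # min() on (cost, hw) tuples picks the lowest mean cost; ties break by name.
    return {step: min(options)[1] for step, options in candidates.items()}

policy = learn_policy(TELEMETRY)
print(policy)
```

Steps with no hardware type above the threshold are simply left out of the result, so a caller would need a fallback for them.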

Layer 2: Substrate selection

The substrate resolution: given a hardware type, which provider instance serves it right now?

SUBSTRATE_INSTANCES: dict[str, list[dict]] = {
    "cpu": [
        {"provider": "ollama", "model": "llama3.1:8b"},
        # future: {"provider": "salad", "model": "llama3.1:8b"},
    ],
    "lpu": [
        {"provider": "groq", "model": "llama-3.1-8b-instant"},
        # future: {"provider": "cerebras", "model": "llama-3.1-8b"},
    ],
    # ... (other hardware types elided)
}

def select_substrate(hardware_type: str) -> dict:
    """First-available today. Tomorrow: filter by availability, pick by cost."""
    return SUBSTRATE_INSTANCES[hardware_type][0]

This layer is operational. Adding a new provider for an existing hardware type is one line of config. Switching providers based on availability, region, or rate-limit headroom is the selector's job, invisible to the policy.
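
The docstring above mentions filtering by availability and picking by cost. One way that selector might look is sketched here. The `available` and `cost_per_1k_tokens` fields, the `CATALOG` name, and the prices are assumptions made for illustration; they are not part of `SUBSTRATE_INSTANCES` as shown.

```python
# Hypothetical catalog with extra fields; prices are placeholders, not real quotes.
CATALOG: dict[str, list[dict]] = {
    "lpu": [
        {"provider": "groq", "model": "llama-3.1-8b-instant",
         "available": False, "cost_per_1k_tokens": 0.05},
        {"provider": "cerebras", "model": "llama-3.1-8b",
         "available": True, "cost_per_1k_tokens": 0.10},
    ],
}

def select_substrate_by_cost(
    hardware_type: str,
    catalog: dict[str, list[dict]] = CATALOG,
) -> dict:
    """Return the cheapest currently available instance for a hardware type."""
    instances = catalog.get(hardware_type, [])
    live = [inst for inst in instances if inst["available"]]
    if not live:
        raise LookupError(f"no available substrate for hardware type {hardware_type!r}")
    return min(live, key=lambda inst: inst["cost_per_1k_tokens"])

chosen = select_substrate_by_cost("lpu")
print(chosen["provider"])
```

Because the function takes only a hardware type and returns an instance, the Layer 1 policy never needs to know that the provider changed.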

Why decouple?

	Coupled (before)	Decoupled (now)
Adding a new provider	Edit every routing rule that uses that hardware type	One line in `SUBSTRATE_INSTANCES`
Adding a new hardware type	Edit every strategy that should be aware of it	One line + a dispatcher branch
Switching providers by region	Hardcoded per strategy	Selector logic
Learned routing trains on...	Provider-specific decisions	Hardware-type decisions (transferable across providers)

This mirrors how compiler back ends are organized: instruction selection (which kind of operation to emit) is a separate phase from scheduling and resource allocation (which physical resource executes it, and when).
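
A minimal sketch of the decoupling, reusing the policy and catalog entries shown earlier on this page: resolving a step goes through Layer 1 and then Layer 2, and swapping a provider touches only the catalog. The `resolve` helper is a hypothetical name introduced for this example, and promoting Cerebras to first position is purely illustrative.

```python
HW_ROUTED_POLICY: dict[str, str] = {
    "classify": "cpu",
    "rerank":   "cpu",
    "extract":  "cpu",
    "verify":   "cpu",
    "reason":   "lpu",
}

SUBSTRATE_INSTANCES: dict[str, list[dict]] = {
    "cpu": [{"provider": "ollama", "model": "llama3.1:8b"}],
    "lpu": [{"provider": "groq", "model": "llama-3.1-8b-instant"}],
}

def select_substrate(hardware_type: str) -> dict:
    """First-available selection, as in the Layer 2 snippet."""
    return SUBSTRATE_INSTANCES[hardware_type][0]

def resolve(step: str) -> tuple[str, dict]:
    """Hypothetical helper: Layer 1 lookup followed by Layer 2 selection."""
    hardware_type = HW_ROUTED_POLICY[step]        # Layer 1: which kind of hardware
    return hardware_type, select_substrate(hardware_type)  # Layer 2: which provider

before = resolve("reason")

# A Layer 2 change only: put another LPU provider first in the catalog.
SUBSTRATE_INSTANCES["lpu"].insert(0, {"provider": "cerebras", "model": "llama-3.1-8b"})

after = resolve("reason")
print(before[1]["provider"], "->", after[1]["provider"])
```

The hardware type returned for `reason` is the same before and after the swap; only the provider changes, and `HW_ROUTED_POLICY` is never edited.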

Interactive diagram

Open the bipartite architecture viz in your browser:

python -m sploink.architecture

This generates sploink_architecture.html, a single self-contained HTML file containing an SVG diagram with:

Workflow IR (left) — read live from bench.graphs.GRAPHS
Hardware types (right) — each containing nested substrate instances
Bipartite edges — Layer 1 (workflow step → hardware type) for the selected strategy
Active instance dots inside each hardware box — Layer 2 (which provider serves this type)

The strategy switcher in the header shows how Layer 1 changes between the cpu_only, lpu_only, and hw_routed strategies.

# Switch the default workflow or strategy shown:
python -m sploink.architecture --workflow parallel_dag --strategy hw_routed
python -m sploink.architecture --workflow linear     --strategy cpu_only

How it relates to the rest of sploink

Layer	Owned by	Code location
Workflow IR	Customer (their framework — LangGraph, DSPy, plain Python)	their code, observed via `sploink.wrap()`
Workflow Graph data structure	Sploink	`sploink/graph.py`
Hardware-type policy (Layer 1)	Sploink (and configurable by customer)	`bench/strategies.py` `HW_POLICIES`
Substrate instances (Layer 2 catalog)	Sploink + customer's available providers	`bench/strategies.py` `SUBSTRATE_INSTANCES`
Dispatch	Sploink (substrate-specific call adapters)	`bench/strategies.py` `_dispatch`
Trace	Sploink	`sploink/trace.py`

The customer brings the IR. Sploink brings everything from policy down.