Architecture¶
Sploink's central abstraction is the bipartite mapping between workflow steps and hardware types. This page explains the two-layer model and how to open the interactive viz.
The two layers, named¶
WORKFLOW IR (customer) HARDWARE TYPES (sploink targets)
┌───── CPU ───────┐
classify ────────────────────────────────► │ • Ollama │
│ ○ Salad │
rerank ──────────────────────────────────► │ ○ RunPod │
└─────────────────┘
extract ─────────────────────────────────► ┌───── LPU ───────┐
│ • Groq │
reason ──────────────────────────────────► │ ○ Cerebras │
└─────────────────┘
verify ──────────────────────────────────► ┌──── GPU ────────┐
│ ○ Together │
│ ○ RunPod │
└─────────────────┘
┌─ Frontier API ──┐
│ ○ Anthropic │
│ ○ OpenAI │
└─────────────────┘
Layer 1 — Hardware-type policy¶
The routing decision: which kind of hardware is best for this step type?
HW_ROUTED_POLICY: dict[str, str] = {
"classify": "cpu",
"rerank": "cpu",
"extract": "cpu",
"verify": "cpu",
"reason": "lpu",
}
This policy is the strategic layer. Sploink's job is to learn good policies from observed telemetry — which (step, hardware-type) pairs preserve quality at lower cost.
Layer 2 — Substrate selection¶
The substrate resolution: given a hardware type, which provider instance serves it right now?
SUBSTRATE_INSTANCES: dict[str, list[dict]] = {
"cpu": [
{"provider": "ollama", "model": "llama3.1:8b"},
# future: {"provider": "salad", "model": "llama3.1:8b"},
],
"lpu": [
{"provider": "groq", "model": "llama-3.1-8b-instant"},
# future: {"provider": "cerebras", "model": "llama-3.1-8b"},
],
...
}
def select_substrate(hardware_type: str) -> dict:
"""First-available today. Tomorrow: filter by availability, pick by cost."""
return SUBSTRATE_INSTANCES[hardware_type][0]
This layer is operational. Adding a new provider for an existing hardware type is one line of config. Switching providers based on availability, region, or rate-limit headroom is the selector's job, invisible to the policy.
Why decouple?¶
| Together (today) | Decoupled (now) | |
|---|---|---|
| Adding a new provider | Edit every routing rule that uses that hardware type | One line in SUBSTRATE_INSTANCES |
| Adding a new hardware type | Edit every strategy that should be aware of it | One line + a dispatcher branch |
| Switching providers by region | Hardcoded per strategy | Selector logic |
| Learned routing trains on... | Provider-specific decisions | Hardware-type decisions (transferable across providers) |
This mirrors how compilers work — instruction selection (which kind of op) is decoupled from instruction scheduling (which physical resource).
Interactive diagram¶
Open the bipartite architecture viz in your browser:
This generates sploink_architecture.html — a single-file SVG diagram with:
- Workflow IR (left) — read live from
bench.graphs.GRAPHS - Hardware types (right) — each containing nested substrate instances
- Bipartite edges — Layer 1 (workflow step → hardware type) for the selected strategy
- Active instance dots inside each hardware box — Layer 2 (which provider serves this type)
Strategy switcher in the header shows how Layer 1 changes between cpu_only / lpu_only / hw_routed.
# Switch the default workflow or strategy shown:
python -m sploink.architecture --workflow parallel_dag --strategy hw_routed
python -m sploink.architecture --workflow linear --strategy cpu_only
How it relates to the rest of sploink¶
| Layer | Owned by | Code location |
|---|---|---|
| Workflow IR | Customer (their framework — LangGraph, DSPy, plain Python) | their code, observed via sploink.wrap() |
| Workflow Graph data structure | Sploink | sploink/graph.py |
| Hardware-type policy (Layer 1) | Sploink (and configurable by customer) | bench/strategies.py HW_POLICIES |
| Substrate instances (Layer 2 catalog) | Sploink + customer's available providers | bench/strategies.py SUBSTRATE_INSTANCES |
| Dispatch | Sploink (substrate-specific call adapters) | bench/strategies.py _dispatch |
| Trace | Sploink | sploink/trace.py |
The customer brings the IR. Sploink brings everything from policy down.