Focus.AI Press
Cornwall, Connecticut
Issue No. 01
APRIL 2026
Local.
A Tract on Inference You Own · Est. 2026
Contents · APRIL 2026
In this issue.
No centralization. No data exfiltration. Cheap. The Linux moment for AI looks like this.
Three tiers of inference: hosted APIs, self-hosted clusters, and the GPU in your laptop. This issue argues that the third tier has been underrated — and that a consumer-grade Mac mini plus an open-weights model is already enough to unmake a chunk of the SaaS stack.
Hosted. Self-hosted. On-device. Three tiers, and we have spent too much time pretending the third one doesn’t matter.
- The economics — when electricity is your only marginal cost
- The hardware — Mac mini, Framework, Strix Halo, and the NVIDIA DGX Spark
- The models — gpt-oss, Qwen, Llama, and the open-weights bench
- The attack surface — local models as a payload, and why that’s coming
- The counterweight — what flips when inference is free
— Local Editorial
Cross-References
- State · № 01 The hallway story on local inference as an attack vector — and its positive inverse — is where this argument started.
- Harness · № 01 Gen-3 SDKs assume cheap inference. Local changes the economics.
- Wire · № 01 MCP / skills / code mode are transport-agnostic. Local-first changes which of them you pick.