Focus.AI Press
Cornwall, Connecticut
Issue No. 01
APRIL 2026
Local.
A Tract on Inference You Own · Est. 2026
Preview · Forthcoming

Inference or Die

A tract on running it yourself


LOC—01
WMB · The Local Counter
DRAFT

In this issue.

  1. Preface: A Tract. Why this issue is angrier than the others. The argument for running in…

No centralization. No data exfiltration. Cheap. The Linux moment for AI looks like this.

Three tiers of inference: hosted APIs, self-hosted clusters, and the GPU in your laptop. This issue argues that the third tier has been underrated — and that a consumer-grade Mac mini plus an open-weights model is already enough to unmake a chunk of the SaaS stack.

Hosted. Self-hosted. On-device. Three tiers, and we have spent too much time pretending the third one doesn’t matter.

  • The economics — when electricity is your only marginal cost
  • The hardware — Mac mini, Framework, Strix Halo, and the NVIDIA DGX Spark
  • The models — gpt-oss, Qwen, Llama, and the open-weights bench
  • The attack surface — local models as a payload, and why that’s coming
  • The counterweight — what flips when inference is free
— Local Editorial