Serious AI research is supposed to need a data center. I’m wagering it doesn’t — a quiet home Blackwell cluster, custom Mojo kernels nobody else is writing, and upstream contributions to MAX. Here’s the why behind the push.
We’re building activation-level memory for AI inference — a model that thinks differently because of what it has experienced before, not just one with more text in the prompt. Early results, a real selectivity number, and a provisional patent on the way.
If topological integration were universal, state space models should show it too. They don’t. Mamba-370m maintains fragmented representations end-to-end.
Dense transformers develop bimodal processing gates at layers 3-4 that nobody designed. Confirmed across three model families. Standard SAEs fail at -3,059% on deep layers; a SipIt + SAE + GLP pipeline recovers them.
What if you could read what a model is actually computing while it generates an answer? 182 runs across 29 models, 0.96 calibration accuracy. Activation-based detection runs about 10× more reliable than text-only analysis.
4 of 5 frontier models correctly explained a discount calculation but produced the wrong final answer. The gap shrinks with scale but persists. 61,678 math problems evaluated.