Newsletter
Weekly automated briefings on the state of inference.

2026-W20·16 min read·Latest
llama.cpp Shoves MTP Into the Mainstream
3,702 commits · 1,340 issues · 3,109 PRs · 95 releases

2026-W19·19 min read
DeepSeek V4 Drags Every Runtime
3,961 commits · 1,385 issues · 3,190 PRs · 99 releases

2026-W18·27 min read
Google Bets LiteRT-LM Owns Edge LLMs
5,247 commits · 1,900 issues · 4,147 PRs · 150 releases

2026-W17·19 min read
DeepSeek V4 Sets Off a Stackwide Sprint
4,069 commits · 1,381 issues · 3,134 PRs · 118 releases

2026-W16·20 min read
Inference Layers Collapse Into One
3,741 commits · 1,437 issues · 2,899 PRs · 107 releases

2026-W15·20 min read
Local Runtimes Turn Into Serving Platforms
3,731 commits · 1,535 issues · 2,941 PRs · 114 releases

2026-W14·17 min read
Gemma 4 Ignites the KV-Cache Wars
1,816 commits · 996 issues · 1,330 PRs · 101 releases

2026-W13·18 min read
KV Cache Wars Go Local
1,956 commits · 870 issues · 1,714 PRs · 92 releases