post/
11 pages · Updated April 26, 2026
Pages
- post/steerling-8b-alignment-without-retraining/index.html
- post/interpretable-intelligence/index.html
- post/the-fineweb-concept-atlas/index.html
- post/scaling-interpretable-models-8b/index.html
- post/block-causal-diffusion-language-model/index.html
- Discovering human-understandable concepts in Steerling-8B
- Introducing Guide Labs: Engineering Interpretable and Auditable AI Systems
- Steering Interpretable Language Models
- Steerling-8B: The First Inherently Interpretable Language Model
- Atlas: Orienting the Pre-Training data of an LLM
- PRISM: Training Data Prototypes for Language Models