Research Journal
Building a language model from scratch is a milestone — but finishing the tutorial is not the same as understanding the model. This journal is about what comes after: actually running experiments, reading the training curves, and learning to speak the language of LLMs. The goal is to go as deep as possible — from pretraining to SFT to RLVR — and share every observation along the way.
Experiments
Running LiquidAI's LFM2.5-8B-A1B on hardware ranging from a MacBook Air to an RTX 5060 Ti: throughput sweeps, quantization fidelity (perplexity vs KL-divergence), and a private eval against Claude. The reasoning I brought to each run, and where it was corrected.
Baseline, optimizer × warmup, init × depth, normalization × LR. What the training curves actually say.