NOTES Field notes from an AI engineer
Notes in the margins.
Shorter pieces. Three a week. Data, ML, tools, the industry. The labs are the marquee form on this site. These are everything that doesn’t fit there. Subscribe via RSS, or check back from the homepage.
-
LeWorldModel (LeWM) explained · JEPA + SIGReg · code
Seven loss terms become one Gaussian-matching regularizer. The JEPA world model in the simplification branch, walked through with code.
-
DeepSeek-V4 architecture, in plain language
Muon optimizer, Compressed Sparse Attention (CSA), FP4 KV cache, hyper-connections. Five compounding choices that yield the 50x cache reduction.
-
What is Subquadratic? The SubQ company, paper, and architecture
A $29M Miami startup with a closed-weights one-million-token LLM. SSA, the benchmarks, the GPU contract, and the honest reasons to discount the headline numbers.
-
33 OG images in 80 lines of Python
Per-page OpenGraph cards without Figma. A one-shot Python script over your existing og:title tags.
-
The k=2 problem in cluster selection
Silhouette at k=2 is almost always highest and almost always meaningless. Why my segmentation lab grays it out instead of pretending it’s a real choice.
-
Why my A/B simulator says false positives are 26%, not 5%
Peeking every 50 users turns a 5% nominal false-positive rate into 26% empirical. Monte Carlo from the A/B Test Simulator lab.
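The peeking effect in that last note is easy to reproduce. What follows is a minimal sketch, not the lab's actual code. It runs an A/A test (both arms share the same true conversion rate, so every "significant" result is a false positive) and compares checking a two-proportion z-test at every peek against checking once at the end. The names `simulate` and `z_stat`, the 20% base rate, the 1,000-user-per-arm horizon, and reading "every 50 users" as per arm are all assumptions made for illustration; the note's exact setup may differ.

```python
import math
import random


def z_stat(conv_a, conv_b, n):
    """Two-proportion z statistic for two arms of equal size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return 0.0  # no variation observed yet, so no evidence either way
    return (conv_a - conv_b) / n / se


def simulate(n_sims=2000, n_max=1000, peek_every=50,
             base_rate=0.2, z_crit=1.96, seed=0):
    """Return (false-positive rate with peeking, rate with one final look).

    All defaults are illustrative assumptions, not values from the lab.
    """
    rng = random.Random(seed)
    peek_hits = 0
    fixed_hits = 0
    for _ in range(n_sims):
        conv_a = conv_b = 0
        stopped = False
        for i in range(1, n_max + 1):
            # Both arms draw from the same rate: there is no real effect.
            conv_a += rng.random() < base_rate
            conv_b += rng.random() < base_rate
            # Peeking rule: declare a winner at the first significant look.
            if not stopped and i % peek_every == 0:
                if abs(z_stat(conv_a, conv_b, i)) > z_crit:
                    stopped = True
        peek_hits += stopped
        # Fixed-horizon rule: test exactly once, at the planned sample size.
        fixed_hits += abs(z_stat(conv_a, conv_b, n_max)) > z_crit
    return peek_hits / n_sims, fixed_hits / n_sims


if __name__ == "__main__":
    peeking, fixed = simulate()
    print(f"peeking: {peeking:.1%}  fixed horizon: {fixed:.1%}")
```

Under these assumptions the fixed-horizon rate should land near the nominal 5%, while the peeking rate comes out several times higher, because each extra look is another chance for noise to cross the threshold. The exact figure depends on the number of looks, the base rate, and the seed.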