Jiahuang (Jacob) Lin
Jacob Lin

Machine Learning Engineer

Hi, I'm Jacob.

I'm a machine learning engineer working on machine-learning systems and RL: training and serving large models, GPU kernels, and the infrastructure around them. I like understanding systems from first principles and writing the explanation I wish I'd had.

Explore the library

A first-principles reference library: seven areas, ~497 lesson pages across 23 tracks. Pick an area, or browse the full catalog.

Foundations

SICP in JavaScript, functional programming, classical ML, the deep-learning core, and computer vision: programming abstraction and typed effects through bias–variance, backprop, attention, detection, segmentation, and VLMs.

Open area →

Generative models

Diffusion, flow matching, DiT, and tokenizers, plus a GPT built end-to-end from pretrain → SFT → CoT → DPO → RLVR.

Open area →

Reinforcement learning

One linear track: MDPs → value & policy methods → TRPO/PPO → RLHF/GRPO → post-training systems → twenty applied domains.

Open area →

GPU, kernels & serving

CUDA and Triton from first principles (including kernel-interview coding), the vLLM and SGLang serving engines, distributed training, and GenAI operations on Kubernetes.

Open area →

Systems, data & design

Designing ML, Ray, distributed, agentic, and data-intensive systems end-to-end, and the data plane behind them.

Open area →

Search, ads & recsys

Production ranking from first principles in three linear tracks: search (query understanding, BM25, dense & hybrid retrieval, learning-to-rank, reranking, relevance eval), recommender systems, and ads & auctions.

Open area →

Model compression

Knowledge distillation from soft targets to on-policy and reasoning-model distillation, set against quantization and pruning.

Open area →

✍️ Or read the Writing archive: earlier posts on algorithms, distributed systems, compilers, and databases.