Institution

Stanford University

A leading research university with major contributions in AI, systems, language, and robotics.

DPO: The Alignment Trick That Removed the RL Loop

Direct Preference Optimization turns preference tuning into a simple classification-style objective over preferred and rejected responses, avoiding both an explicit reward model and a reinforcement learning loop.
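
To make "classification-style objective" concrete, here is a minimal NumPy sketch of the DPO loss for a batch of preference pairs. It assumes you already have summed log-probabilities of each response under the policy being trained and under a frozen reference model. The function name, argument names, and the default beta of 0.1 are illustrative choices, not taken from the article.

```python
import numpy as np

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Mean DPO loss over a batch of (chosen, rejected) response pairs.

    Each argument is an array of summed log-probabilities, one entry per pair.
    beta (illustrative default) scales how strongly the policy is pushed
    away from the reference model.
    """
    # How much more (or less) likely each response is under the policy
    # than under the frozen reference model.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp

    # The pair is scored like a binary classifier: a positive logit means
    # the policy favors the chosen response more than the reference does.
    logits = beta * (chosen_ratio - rejected_ratio)

    # -log(sigmoid(logits)) written as log(1 + exp(-logits)) for stability.
    return np.mean(np.logaddexp(0.0, -logits))
```

When the policy matches the reference on both responses, the logit is zero and the loss equals log 2, the same starting point as an untrained binary classifier. No reward model is trained and no sampling loop runs; the loss is computed directly from log-probabilities.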

Efficient AI · Stanford University

FlashAttention: The Attention Speedup That Came From Reading GPU Memory

FlashAttention keeps attention exact but makes it IO-aware, using tiling to cut reads and writes to slow GPU memory, which makes long-sequence Transformers faster and cheaper.
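
The tiling idea in the summary above can be illustrated in NumPy. This is only a sketch of the underlying arithmetic, not the fused GPU kernel: it processes keys and values one block at a time while keeping a running row maximum and row sum, so the full score matrix is never materialized. The function names and the block size are illustrative assumptions, and NumPy on a CPU will not show the memory-traffic savings that the real kernel gets from on-chip memory.

```python
import numpy as np

def naive_attention(q, k, v):
    """Reference softmax attention that builds the full score matrix."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v

def tiled_attention(q, k, v, block_size=4):
    """Same result, computed one key/value block at a time."""
    n, d = q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros((n, v.shape[1]))      # unnormalized output accumulator
    row_max = np.full((n, 1), -np.inf)   # running max of scores per query
    row_sum = np.zeros((n, 1))           # running softmax denominator

    for start in range(0, k.shape[0], block_size):
        k_blk = k[start:start + block_size]
        v_blk = v[start:start + block_size]
        scores = (q @ k_blk.T) * scale

        # Update the running max, then rescale earlier partial results
        # so everything is expressed relative to the new max.
        new_max = np.maximum(row_max, scores.max(axis=-1, keepdims=True))
        correction = np.exp(row_max - new_max)
        p = np.exp(scores - new_max)

        row_sum = row_sum * correction + p.sum(axis=-1, keepdims=True)
        out = out * correction + p @ v_blk
        row_max = new_max

    return out / row_sum

rng = np.random.default_rng(0)
q = rng.normal(size=(6, 8))
k = rng.normal(size=(10, 8))
v = rng.normal(size=(10, 8))
print(np.allclose(tiled_attention(q, k, v), naive_attention(q, k, v)))
```

The final check compares the blockwise result with the naive one, which is the sense in which the attention stays exact: the rescaling step makes the tiled computation mathematically equivalent to the full softmax rather than an approximation.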