LLM Reasoning
Eliciting and improving step-by-step reasoning in large language models.
Theorem Proving · Google DeepMind
AlphaGeometry combines a neural language model with symbolic deduction, using synthetic theorems and proofs to reach near gold-medal performance on olympiad geometry.
Alignment · Stanford University
Direct Preference Optimization turns preference tuning into a simple classification-style objective, avoiding an explicit reward model and reinforcement learning loop.
Multimodal Models · OpenAI
The GPT-4 technical report was less a full recipe than a measurement document: it described a multimodal Transformer whose benchmark performance, scaling predictability, and post-training alignment reset expectations for frontier AI.
Open Models · Meta AI
Llama 3 is not just a bigger open-weight model; it is Meta's attempt to package multilingual, coding, reasoning, tool use, and safety into a coherent public model family.
LLM Reasoning · DeepSeek
Reinforcement learning alone, with no supervised reasoning traces, can lead a base language model to develop strong step-by-step reasoning that rivals top closed models.