GPT-3: The Moment Few-Shot Prompting Became the Interface
GPT-3 showed that a 175-billion-parameter autoregressive language model could perform many tasks from a handful of examples placed in the prompt, without gradient updates or task-specific fine-tuning.
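To make "examples in the prompt" concrete, here is a minimal sketch of how a few-shot prompt might be assembled as plain text. The function name `build_few_shot_prompt` and the `Input:`/`Output:` labels are illustrative assumptions, not a format prescribed by the source; the model call itself is omitted, since the point is only that the task is specified by the text the model conditions on rather than by weight updates.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Join a task description, K solved demonstrations, and one unsolved query.

    The layout (hypothetical "Input:"/"Output:" labels) is an assumption made
    for illustration; any consistent pattern could serve the same purpose.
    """
    lines = [task, ""]
    for source, target in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final item is left unanswered so the model continues the pattern.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Two demonstrations make this a "two-shot" prompt; no parameters are updated.
demos = [("cheese", "fromage"), ("sea otter", "loutre de mer")]
prompt = build_few_shot_prompt("Translate English to French.", demos, "peppermint")
print(prompt)
```

Changing the number of demonstrations changes the setting: zero demonstrations gives a zero-shot prompt, one gives one-shot, and several give few-shot, all with the same frozen model.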
Institution
OpenAI, an AI research and deployment company behind GPT, CLIP, DALL·E, and other frontier systems.
InstructGPT showed that human preference data and RLHF could make smaller models more helpful and aligned than much larger raw language models.
Whisper showed that large, diverse, weakly supervised audio data can produce robust multilingual speech recognition and translation models.
DALL·E 2 splits text-to-image generation into a prior that predicts a CLIP image embedding and a decoder that turns that embedding into an image.
CLIP trains image and text encoders on 400 million internet image-text pairs, making natural language a flexible interface for zero-shot visual recognition.
The GPT-4 technical report was less a full recipe than a measurement document: it described a multimodal Transformer whose benchmark performance, scaling predictability, and post-training alignment reset expectations for frontier AI.