2022 AI 编年史：基础模型 Foundation Model 范式 | AI Timeline 2022: Foundation Models

Posted on 2022-02-15 | Category: machine

Stanford CRFM introduced the Foundation Model concept in 2021. This post covers the pre-train/fine-tune paradigm, emergent abilities, and industry impact, with parallel Chinese and English text.

一、背景与核心概念 | Background & Core Concepts

English

In August 2021, Stanford’s Center for Research on Foundation Models (CRFM) coined the term Foundation Model in the report “On the Opportunities and Risks of Foundation Models” (Bommasani et al.). By early 2022, the concept had become the dominant framing for modern AI — replacing the narrower “pre-trained model” vocabulary with a term that captures both scale and generality.

A Foundation Model is defined by two properties:

Pre-trained on broad data at scale (often self-supervised), using transfer learning so downstream tasks require minimal additional training.
Adaptable to a wide range of downstream tasks — not via task-specific architectures, but through prompting, fine-tuning, or in-context learning.

Representative foundation models in use during 2022 spanned modalities (GPT-3, CLIP, and Codex were released before 2022 but remained widely used):

Model	Organization	Modality	Scale (parameters)
GPT-3	OpenAI	Text	175B
PaLM	Google	Text	540B
DALL·E 2	OpenAI	Text→Image	~3.5B
Stable Diffusion	Stability AI	Text→Image	~860M
Whisper	OpenAI	Speech→Text	~1.5B
CLIP	OpenAI	Image+Text	~400M
Codex	OpenAI	Code	GPT-3 derivative
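
To give the parameter counts in the table a physical meaning, the sketch below converts them into the storage needed for the weights alone. It is a back-of-the-envelope estimate, not a deployment guide: the byte widths (4 for fp32, 2 for fp16) are standard, while the function name `weight_gigabytes` and the choice to ignore activations, optimizer state, and caches are assumptions made for illustration.

```python
# Rough weight-memory estimate: parameters x bytes per parameter.
# Parameter counts come from the table above. Activations, optimizer
# state and inference caches are deliberately ignored (an assumption).

PARAMS = {
    "GPT-3": 175e9,
    "PaLM": 540e9,
    "Whisper": 1.5e9,
    "Stable Diffusion": 860e6,
}

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_gigabytes(n_params: float, dtype: str = "fp16") -> float:
    """Return the storage needed for the weights alone, in GB (1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for name, n_params in PARAMS.items():
        print(f"{name:>16}: {weight_gigabytes(n_params):8.1f} GB in fp16")
```

Under these assumptions GPT-3 needs roughly 350 GB for its fp16 weights, which is one reason most 2022 users reached such models through hosted APIs rather than local hardware.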

The paradigm shift: instead of building N task-specific models, train one large model and adapt it. This “pre-train, prompt, and predict” workflow (Liu et al., 2021) became the industry default.

Key concepts tied to foundation models in 2022:

Emergent abilities: Capabilities (e.g., chain-of-thought reasoning, arithmetic) that appear only above certain scale thresholds.
In-context learning (ICL): The model learns from examples embedded in the prompt without gradient updates.
Homogenization: One architecture (Transformer) dominates across modalities, enabling unified tooling.
Centralization risk: A few organizations control models that underpin vast downstream applications.
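
In-context learning, as described in the list above, can be illustrated by how a few-shot prompt is assembled: solved examples are placed in the prompt and the model is asked to continue the pattern, with no weight updates. The sketch below only builds the prompt string. The `Input:`/`Output:` layout, the function name `build_few_shot_prompt`, and the sentiment examples are hypothetical choices for illustration, not a format prescribed by any particular model or API.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble an instruction, solved examples, and one unsolved query."""
    lines = [instruction, ""]
    for text, label in examples:
        # Each demonstration shows the model the expected input/output pattern.
        lines.append(f"Input: {text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final item is left open so the model completes the answer.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Classify the sentiment of each review.",
        [("A wonderful film.", "positive"), ("Dull and far too long.", "negative")],
        "I would watch it again.",
    )
    print(prompt)
```

Passing an empty `examples` list yields a zero-shot prompt, so the same helper covers both ends of the zero/few-shot range.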

中文

2021 年 8 月，斯坦福大学基础模型研究中心（CRFM）在报告《On the Opportunities and Risks of Foundation Models》（Bommasani 等）中正式提出基础模型（Foundation Model）概念。到 2022 年初，该术语已成为现代 AI 的主流话语框架，以强调规模与通用性的词汇取代了较窄的「预训练模型」表述。

基础模型由两大属性定义：

在大规模广泛数据上预训练（常为自监督），借助迁移学习（Transfer Learning）使下游任务只需极少额外训练。
可适配广泛下游任务：不依赖任务专用架构，而通过提示（Prompting）、微调（Fine-tuning）或上下文学习（In-Context Learning）完成适配。

2022 年广泛使用的代表性基础模型横跨多种模态（GPT-3、CLIP、Codex 发布于 2022 年之前，但当年仍被广泛使用）：

模型	机构	模态	规模
GPT-3	OpenAI	文本	1750 亿参数
PaLM	Google	文本	5400 亿参数
DALL·E 2	OpenAI	文生图	约 35 亿参数
Stable Diffusion	Stability AI	文生图	约 8.6 亿参数
Whisper	OpenAI	语音转文本	约 15 亿参数
CLIP	OpenAI	图文	约 4 亿参数
Codex	OpenAI	代码	GPT-3 衍生

范式转变：不再构建 N 个任务专用模型，而是训练一个大型模型再适配。这一「预训练、提示、预测」工作流（Liu 等，2021）成为产业默认路径。

2022 年与基础模型关联的核心概念：

涌现能力（Emergent Abilities）：仅在超过特定规模阈值后才出现的能力（如思维链推理、算术）。
上下文学习（ICL）：模型从提示词中的示例学习，无需梯度更新。
同质化（Homogenization）：Transformer 架构统治多模态，催生统一工具链。
中心化风险（Centralization Risk）：少数机构掌控支撑海量下游应用的基础模型。

二、架构与范式 | Architecture & Paradigm

2.1 基础模型技术栈 | Foundation Model Stack

English

┌─────────────────────────────────────────────────┐
│           Downstream Applications               │
│  (QA, summarization, code gen, image gen, ...)  │
└────────────────────┬────────────────────────────┘
                     │ Adaptation Layer
        ┌────────────┼────────────┐
        ▼            ▼            ▼
   Prompting    Fine-tuning   RLHF/RLAIF
   (zero/few-shot) (SFT/LoRA)  (alignment)
        └────────────┼────────────┘
                     ▼
┌─────────────────────────────────────────────────┐
│         Foundation Model (Pre-trained)          │
│  Transformer backbone + modality-specific head  │
│  Trained on internet-scale self-supervised data │
└─────────────────────────────────────────────────┘

Layer	Function	2022 Examples
Pre-training	Learn general representations	Next-token prediction, masked LM, contrastive learning
Adaptation	Specialize for tasks/users	GPT-3 prompts, Codex fine-tune, SD LoRA
Alignment	Match human preferences	Early RLHF experiments (InstructGPT, Jan 2022)
Deployment	Serve at scale	OpenAI API, Hugging Face Inference Endpoints
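
The adaptation layer above lists LoRA next to full fine-tuning. A short calculation shows why low-rank adaptation is attractive: the pre-trained weight matrix stays frozen and only two thin matrices are trained. The sketch counts trainable parameters for a single weight matrix. The 4096 x 4096 layer size and rank 8 are illustrative assumptions, and the function names are hypothetical.

```python
def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole d_out x d_in matrix W is updated."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters when W is frozen and the update is B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_out + d_in)


def reduction_factor(d_out: int, d_in: int, rank: int) -> float:
    """How many times fewer parameters LoRA trains for this one matrix."""
    return full_finetune_params(d_out, d_in) / lora_params(d_out, d_in, rank)


if __name__ == "__main__":
    d_out, d_in, rank = 4096, 4096, 8  # illustrative sizes, not from the article
    print("full fine-tune:", full_finetune_params(d_out, d_in))
    print("LoRA:          ", lora_params(d_out, d_in, rank))
    print("reduction:     ", reduction_factor(d_out, d_in, rank))
```

For these assumed sizes the trainable parameter count drops from about 16.8 million to 65,536 per matrix, a 256-fold reduction, which is what makes per-task or per-user adapters on one shared foundation model practical.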

中文

基础模型技术栈自上而下分为：下游应用层 → 适配层（提示/微调/对齐）→ 预训练基础模型层。2022 年各层均已出现成熟实践：GPT-3 提示、Codex 代码微调、InstructGPT 的 RLHF 对齐实验，以及 OpenAI API 与 Hugging Face 推理端点的规模化部署。

2.2 与传统 ML 流水线对比 | vs Traditional ML Pipeline

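维度 Dimension	传统机器学习 Traditional ML	基础模型范式 Foundation Model Paradigm
数据需求 / Data needs	每任务大量标注数据 / Large labeled dataset per task	预训练无标注/弱标注，下游少量样本 / Unlabeled or weakly labeled pre-training data, few downstream examples
模型数量 / Number of models	每任务一个模型 / One model per task	一个模型服务多任务 / One model serves many tasks
开发周期 / Development cycle	特征工程 + 训练数周 / Feature engineering plus weeks of training	提示词调试数小时 / Hours of prompt iteration
算力集中 / Compute concentration	分散在各团队 / Spread across teams	集中在预训练阶段 / Concentrated in the pre-training stage
能力边界 / Capability boundary	任务内稳定 / Stable within a task	跨任务涌现，行为难预测 / Emergent across tasks, behavior hard to predict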
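
The compute-concentration row can be made concrete with a common rule of thumb from the scaling-law literature: training compute is roughly C ≈ 6·N·D floating-point operations for N parameters and D training tokens. Both the formula and the 300 billion token figure for GPT-3 come from outside this article (the latter from the GPT-3 paper), so the sketch below is an order-of-magnitude illustration under those stated assumptions, not a cost model.

```python
def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate pre-training compute using the C ~ 6 * N * D rule of thumb.

    The factor 6 bundles the forward and backward pass cost per parameter
    per token; it is an approximation, not an exact count.
    """
    return 6.0 * n_params * n_tokens


if __name__ == "__main__":
    gpt3_params = 175e9   # from the model table in this article
    gpt3_tokens = 300e9   # assumption: token count reported in the GPT-3 paper
    flops = training_flops(gpt3_params, gpt3_tokens)
    print(f"GPT-3 pre-training compute: about {flops:.2e} FLOPs")
```

Under these assumptions GPT-3 comes out at about 3.15e23 FLOPs, spent once during pre-training, whereas adapting the model by prompting adds no training compute at all.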

三、2022 年趋势 | Trends in 2022

English

Vocabulary goes mainstream: “Foundation model” entered policy papers, earnings calls, and university curricula.
Multimodal unification: CLIP + diffusion + LLM suggested a path toward general-purpose AI systems.
API economy: Most users accessed foundation models via APIs (OpenAI, Cohere, AI21) rather than self-hosting.
Open vs closed tension: Meta’s OPT (May 2022) and BigScience BLOOM (July 2022) challenged closed-model dominance.
Risk awareness: CRFM report highlighted bias, misinformation, environmental cost, and labor exploitation in data labeling.
Regulatory foreshadowing: EU AI Act drafts began referencing “general-purpose AI” — a legal cousin of foundation models.

中文

术语主流化：「基础模型」进入政策文件、财报电话会与大学课程。
多模态统一：CLIP + 扩散 + LLM 昭示通用 AI 系统路径。
API 经济：多数用户通过 API（OpenAI、Cohere、AI21）而非自托管使用基础模型。
开放与封闭博弈：Meta OPT（2022 年 5 月）与 BigScience BLOOM（2022 年 7 月）挑战闭源垄断。
风险意识觉醒：CRFM 报告强调偏见、虚假信息、环境成本与标注劳工问题。
监管前奏：欧盟 AI 法案草案开始提及「通用目的 AI」—— 基础模型的法律对应概念。

四、优缺点分析 | Pros and Cons

优点 Advantages	缺点 Disadvantages
一次预训练、多任务复用，研发效率极高 / Train once, adapt everywhere	预训练成本动辄数百万美元 / Pre-training costs millions of USD
小样本/零样本即可启动新任务 / Zero/few-shot task bootstrapping	行为涌现难以预测与测试 / Emergent behavior hard to predict
统一 Transformer 架构降低工程碎片化 / Unified Transformer reduces fragmentation	少数科技巨头垄断基础能力 / Oligopoly of foundation model providers
加速科学发现（AlphaFold、材料设计）/ Accelerates scientific discovery	训练数据版权与隐私争议 / Data provenance and privacy concerns
降低中小企业 AI 门槛（通过 API）/ Lowers AI barrier via APIs	环境碳足迹巨大 / Massive carbon footprint
促进开源生态（Hugging Face Hub）/ Fuels open-source ecosystem	同质化导致系统性脆弱 / Homogenization creates systemic risk

五、典型应用场景 | Use Cases

场景 Scenario	中文说明	English Description
通用语言接口	一个 GPT-3 API 支撑客服、写作、翻译	Single GPT-3 API for support, writing, translation
代码智能	Codex 驱动 Copilot 等编程助手	Codex-powered coding assistants like Copilot
视觉内容生成	SD/CLIP 作为创意产业基础能力	SD/CLIP as creative industry infrastructure
语音无障碍	Whisper 实现多语言转写普惠	Whisper democratizes multilingual transcription
生物医药	蛋白质结构预测（AlphaFold）作为科学基础模型	AlphaFold as scientific foundation model
搜索引擎增强	微软将 GPT 集成 Bing 的前期探索	Microsoft's early exploration of integrating GPT into Bing
教育个性化	基于 LLM 的辅导与内容生成	LLM-based tutoring and content generation

六、GitHub 开源项目 | GitHub Projects

项目 Project	说明 Description	链接 Link
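huggingface/transformers	加载与推理数百种基础模型的标准库 / Standard library for loading and running hundreds of foundation models	github.com/huggingface/transformers
facebookresearch/llama	Meta 开源 LLM 系列（2023 发布，2022 年酝酿）/ Meta's open LLM family (released 2023, in development during 2022)	github.com/facebookresearch/llama
bigscience-workshop/petals	分布式运行大模型的协作推理框架 / Collaborative framework for running large models distributed across machines	github.com/bigscience-workshop/petals
stanford-crfm/helm	基础模型标准化评测框架 / Standardized evaluation framework for foundation models	github.com/stanford-crfm/helm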

七、总结 | Summary

中文：2022 年，基础模型概念将分散的大模型实践统一为可讨论、可治理的技术范式。它解释了为何 AI 能力在 2022 年出现「阶跃式」跃迁：不是单个算法突破，而是规模 + 通用架构 + 适配方法论的系统性胜利。这一框架直接铺垫了 ChatGPT（2022 年 11 月发布）在 2023 年的爆发与「大模型时代」的产业共识。

English: The Foundation Model framing unified disparate large-model practices into a coherent paradigm in 2022. It explained the “step-change” in AI capabilities: not a single algorithmic breakthrough, but a systemic win of scale + general architecture + adaptation methodology. This framework directly set the stage for the explosive growth of ChatGPT (released in November 2022) and the industry consensus on the “LLM era” in 2023.

参考链接 | References

CRFM 报告 / CRFM report：On the Opportunities and Risks of Foundation Models (Bommasani et al., 2021)
涌现能力 / Emergent abilities：Emergent Abilities of Large Language Models (Wei et al., 2022)
预训练提示预测 / Pre-train, prompt, predict：Pre-train, Prompt, and Predict (Liu et al., 2021)
HELM 评测 / HELM evaluation：github.com/stanford-crfm/helm
Hugging Face Transformers：github.com/huggingface/transformers