2023 AI 编年史：LangChain LLM 应用框架 | AI Timeline 2023: LangChain Framework

Published 2023-09-15 | Category: framework

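AI Timeline 2023: the architecture of the LangChain LLM application framework, its Chains, Agents, and RAG modules, and its impact on the ecosystem, presented in parallel Chinese and English.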

一、背景 | Background

English

In September 2023, LangChain — created by Harrison Chase in October 2022 — reached 70,000+ GitHub stars and became the de facto standard for building LLM applications. LangChain provided the abstraction layer that connected LLMs to external data, tools, and memory — turning raw API calls into composable application pipelines.

Before LangChain, every developer reinvented the wheel: writing custom prompt templates, parsing LLM outputs, connecting vector databases, and orchestrating multi-step workflows. LangChain standardized these patterns into reusable primitives: Models, Prompts, Chains, Agents, Memory, and Retrievers.

Key terms:

LangChain: Open-source framework for developing applications powered by LLMs.
Chain: Sequence of calls to LLM, tools, or other chains — the basic composable unit.
Agent: LLM that decides which tools to call and in what order (ReAct pattern).
Retriever: Component that fetches relevant documents from a vector store for RAG.
Memory: Mechanism to persist conversation history across chain invocations.
LCEL (LangChain Expression Language): Declarative syntax for composing chains with | pipe operator.
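
The Memory term above can be made concrete with a small sketch. This is an illustration only, not LangChain's implementation: `BufferMemory`, `fake_llm`, and `chat` are hypothetical names, and the "model" is a placeholder that just reports how many prompt lines it received, so the effect of persisted history is visible without an API key.

```python
from dataclasses import dataclass, field


@dataclass
class BufferMemory:
    """Hypothetical conversation buffer; not a LangChain class."""
    turns: list = field(default_factory=list)

    def save(self, user: str, ai: str) -> None:
        # Store both sides of one exchange
        self.turns.append(("Human", user))
        self.turns.append(("AI", ai))

    def as_prompt(self) -> str:
        # Render the stored history as plain text for the next prompt
        return "\n".join(f"{role}: {text}" for role, text in self.turns)


def fake_llm(prompt: str) -> str:
    # Placeholder model: reports how many prompt lines it received
    return f"(saw {len(prompt.splitlines())} prompt lines)"


def chat(memory: BufferMemory, user_input: str) -> str:
    history = memory.as_prompt()
    prompt = f"{history}\nHuman: {user_input}" if history else f"Human: {user_input}"
    reply = fake_llm(prompt)
    memory.save(user_input, reply)  # persist the turn for later invocations
    return reply


memory = BufferMemory()
first = chat(memory, "Hi, I'm Ada.")
second = chat(memory, "What's my name?")
print(first)
print(second)
```

The second call sees a longer prompt than the first because the earlier exchange was replayed from the buffer, which is the behaviour the Memory definition describes.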

中文

2023 年 9 月，Harrison Chase 于 2022 年 10 月创建的 LangChain 达到 70,000+ GitHub Stars，成为构建 LLM 应用的事实标准。LangChain 提供了连接 LLM 与外部数据、工具、记忆的抽象层——将原始 API 调用转化为可组合的应用流水线。

LangChain 出现之前，每个开发者都在重复造轮子：自定义 Prompt 模板、解析 LLM 输出、连接向量数据库、编排多步工作流。LangChain 将这些模式标准化为可复用原语：Models、Prompts、Chains、Agents、Memory 与 Retrievers。

关键词解释：

LangChain：基于 LLM 开发应用的开源框架。
Chain（链）：LLM、工具或其他链的调用序列——基本可组合单元。
Agent（智能体）：决定调用哪些工具及顺序的 LLM（ReAct 模式）。
Retriever（检索器）：从向量库获取相关文档用于 RAG 的组件。
Memory（记忆）：在链调用间持久化对话历史的机制。
LCEL：用 | 管道运算符组合链的声明式语法。

二、架构 | Architecture

2.1 LangChain 核心模块 | Core Modules

English

LangChain’s architecture evolved into a modular monorepo, a split that began with langchain-core and langchain-community in December 2023:

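langchain (main package)
├── langchain-core        # core abstractions: Runnable, Prompt, OutputParser
├── langchain-community   # third-party integrations (1000+)
├── langchain-openai      # OpenAI-specific integration
├── langchain-anthropic   # Anthropic-specific integration
├── langchain-text-splitters  # document chunking
└── langchain-cli         # project scaffolding

langgraph               # stateful agent orchestration (matured in 2024)
langserve               # REST API deployment
langsmith               # debugging, evaluation, and monitoring platform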

Core abstractions:

抽象 Abstraction	职责 Role	示例 Example
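Runnable	Unified interface: invoke/stream/batch	Base interface for all components
PromptTemplate	Parameterized prompt template	ChatPromptTemplate
LLM / ChatModel	Wrapper around model calls	ChatOpenAI, ChatAnthropic
OutputParser	Parses LLM output into structured data	JsonOutputParser, PydanticOutputParser
Retriever	Document retrieval	VectorStoreRetriever
Tool	External function an agent can call	Search, Calculator, SQL
Memory	Conversation history management	ConversationBufferMemory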
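
To show why a single invoke/stream/batch interface makes components composable, here is a toy sketch in plain Python. `MiniRunnable` is a hypothetical class written for this illustration and is not the langchain-core `Runnable`; the "model" is an uppercase function standing in for an LLM call, and `stream` simply assumes a string result and yields it word by word.

```python
from typing import Any, Callable, Iterable, Iterator, List


class MiniRunnable:
    """Toy analogue of a Runnable; hypothetical, not the langchain-core class."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def batch(self, values: Iterable[Any]) -> List[Any]:
        # Run the same component over several inputs
        return [self.invoke(v) for v in values]

    def stream(self, value: Any) -> Iterator[str]:
        # Assumes a string result and yields it word by word
        for word in str(self.invoke(value)).split():
            yield word

    def __or__(self, other: "MiniRunnable") -> "MiniRunnable":
        # a | b builds a new runnable that feeds a's output into b
        return MiniRunnable(lambda value: other.invoke(self.invoke(value)))


prompt = MiniRunnable(lambda topic: f"Tell me about {topic}")
fake_model = MiniRunnable(lambda text: text.upper())      # stands in for an LLM call
parser = MiniRunnable(lambda text: {"answer": text})      # stands in for an output parser

chain = prompt | fake_model | parser
single = chain.invoke("chains")
many = chain.batch(["agents", "memory"])
words = list((prompt | fake_model).stream("RAG"))
print(single, many, words)
```

Because every piece exposes the same three methods, the composed `chain` is itself a `MiniRunnable` and can be piped into further steps; that uniformity is the design idea the table attributes to Runnable.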

中文

LangChain 架构演进为模块化 Monorepo：核心包提供 Runnable/Prompt/OutputParser 抽象，community 包含 1000+ 第三方集成，langgraph 提供有状态 Agent 编排，langserve 用于 REST 部署，langsmith 提供调试与监控。

2.2 RAG 流水线 | RAG Pipeline

English

LangChain’s most popular pattern — Retrieval-Augmented Generation:

Document Loader → Text Splitter → Embedding Model → Vector Store
                                                          ↓
User Query → Retriever（similarity search）→ Context + Query → LLM → Answer
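
The stages in this diagram can be traced end to end with a dependency-free sketch. Everything here is an assumption made for illustration: `split_text`, `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical helpers, the "embedding" is a bag-of-words count rather than a learned model, and the final LLM call is left out so that only the retrieval and prompt-assembly steps are shown.

```python
import math
import re
from collections import Counter


def split_text(text: str, chunk_size: int = 9, overlap: int = 3) -> list:
    """Split text into chunks of `chunk_size` words that share `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (a real system uses a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyVectorStore:
    """In-memory store that ranks chunks by cosine similarity to the query."""

    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]

    def retrieve(self, query: str, k: int = 1) -> list:
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list) -> str:
    context = "\n".join(context_chunks)
    return f"Answer based on context:\n{context}\n\nQuestion: {question}"


document = (
    "Chains compose calls to models and tools into pipelines. "
    "Retrievers fetch relevant documents from a vector store quickly. "
    "Memory keeps the conversation history across many chain invocations."
)
chunks = split_text(document)                 # Text Splitter
store = ToyVectorStore(chunks)                # Embedding + Vector Store
question = "Which component fetches documents from the vector store?"
top = store.retrieve(question, k=1)           # Retriever (similarity search)
prompt = build_prompt(question, top)          # Context + Query, ready for an LLM
print(prompt)
```

Note that the retrieved chunk cuts across sentence boundaries; word-count splitting is the simplest possible strategy and is chosen here only to keep the sketch short.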

LCEL implementation:

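from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Placeholder corpus; in practice `docs` comes from a document loader and a text splitter.
docs = [Document(page_content="PagedAttention manages the KV cache in fixed-size blocks.")]

# Requires OPENAI_API_KEY in the environment and the chromadb package.
retriever = Chroma.from_documents(docs, OpenAIEmbeddings()).as_retriever()

def format_docs(documents):
    # Join the retrieved chunks into one context string for the prompt.
    return "\n\n".join(d.page_content for d in documents)

prompt = ChatPromptTemplate.from_template(
    "Answer based on context:\n{context}\n\nQuestion: {question}"
)
chain = (
    # The dict sends the question to the retriever and also passes it through unchanged.
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | ChatOpenAI()
    | StrOutputParser()  # return the answer as a plain string
)
answer = chain.invoke("What is PagedAttention?")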

中文

LangChain 最流行模式——RAG：Document Loader → Text Splitter → Embedding → Vector Store；用户查询 → Retriever 相似度搜索 → Context + Query → LLM → 回答。LCEL 用 | 管道运算符声明式组合各组件。

2.3 Agent 架构 | Agent Architecture

English

LangChain's AgentExecutor implements the ReAct loop:

User Input
    ↓
Agent（LLM + Tools + Prompt）
    ↓
┌───────────────────────────────────────┐
│ Thought: "I need to search for..."    │
│ Action: search("PagedAttention")      │
│ Observation: "PagedAttention is..."   │
│ Thought: "Now I can answer"           │
│ Action: Final Answer                  │
└───────────────────────────────────────┘
    ↓
Response to User
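
The loop in this diagram can be sketched in a few lines of plain Python. This is a simplified illustration, not the AgentExecutor source: `scripted_llm` is a hypothetical stand-in that returns canned Thought/Action text, `search` is a stub tool with a fixed answer, and the text format being parsed is an assumption chosen to match the diagram.

```python
import re


def search(query: str) -> str:
    # Stub tool with a canned answer; a real tool would call a search API
    return f"{query} is a memory-management technique for LLM serving."


TOOLS = {"search": search}


def scripted_llm(transcript: str) -> str:
    """Stand-in for the model: picks the next step from the transcript so far."""
    if "Observation:" not in transcript:
        return 'Thought: I need to search for this.\nAction: search("PagedAttention")'
    return (
        "Thought: Now I can answer.\n"
        "Final Answer: PagedAttention is a memory-management technique."
    )


def run_agent(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        step = scripted_llm(transcript)
        transcript += "\n" + step
        final = re.search(r"Final Answer:\s*(.+)", step)
        if final:
            return final.group(1).strip()       # loop ends on a final answer
        action = re.search(r'Action:\s*(\w+)\("(.*)"\)', step)
        if action is None:
            raise ValueError("model output had neither an action nor a final answer")
        tool_name, tool_input = action.groups()
        observation = TOOLS[tool_name](tool_input)   # run the chosen tool
        transcript += f"\nObservation: {observation}"  # feed the result back
    raise RuntimeError("agent stopped: step limit reached")


answer = run_agent("What is PagedAttention?")
print(answer)
```

The `max_steps` guard matters in practice: a model that never emits a final answer would otherwise loop forever.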

Agent types in 2023:

ZERO_SHOT_REACT_DESCRIPTION — tool descriptions in prompt
OPENAI_FUNCTIONS — native function calling
STRUCTURED_CHAT — multi-input tools
Custom agents built on the base agent classes (stateful, cyclical graphs arrived with LangGraph in early 2024)
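
The difference between prompt-described tools and native function calling is easiest to see in the dispatch step. The sketch below assumes the model returns a tool name plus a JSON string of arguments, the shape used by function-calling APIs; `REGISTRY`, `dispatch`, and the two tools are hypothetical names for illustration, and the model output is hard-coded.

```python
import json


def add(a: float, b: float) -> float:
    return a + b


def word_count(text: str) -> int:
    return len(text.split())


# Hypothetical tool registry mapping tool names to Python callables
REGISTRY = {"add": add, "word_count": word_count}


def dispatch(function_call: dict):
    """Run the tool named in a function-call message and return its result."""
    name = function_call["name"]
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(function_call["arguments"])  # arguments arrive as a JSON string
    return REGISTRY[name](**kwargs)


# Hard-coded stand-in for what a model with function calling would return
call = {"name": "add", "arguments": '{"a": 2, "b": 3}'}
result = dispatch(call)
print(result)
```

Because the arguments are structured JSON rather than free text, tools with several inputs need no regex parsing, which is the practical advantage over the zero-shot ReAct format.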

中文

LangChain Agent 执行器实现 ReAct 循环：LLM 生成 Thought → 选择 Action（工具调用）→ 接收 Observation → 循环直至 Final Answer。2023 年支持 Zero-Shot ReAct、OpenAI Functions、Structured Chat 等 Agent 类型。

三、趋势 | Trends

English

September–December 2023 LangChain trends:

LangSmith launch: Commercial observability platform for LLM app debugging.
LangGraph emergence: Stateful, cyclical agent workflows replacing simple chains.
“LangChain is too complex” backlash: Developers criticized over-abstraction; simpler alternatives (LlamaIndex, direct API calls) gained traction.
1000+ integrations: Every vector DB, LLM provider, and tool had a LangChain wrapper.
Enterprise adoption: Morgan Stanley, Klarna, and others built production apps on LangChain.
LangServe: One-command REST API deployment for any chain.

中文

2023 年 9–12 月 LangChain 趋势：

LangSmith 上线：LLM 应用调试的商业可观测性平台。
LangGraph 涌现：有状态循环 Agent 工作流替代简单 Chain。
「LangChain 过于复杂」反弹：开发者批评过度抽象；LlamaIndex 等更简单替代方案获关注。
1000+ 集成：每个向量库、LLM 提供商、工具都有 LangChain 封装。
企业采纳：Morgan Stanley、Klarna 等构建生产级 LangChain 应用。
LangServe：一条命令将 Chain 部署为 REST API。

四、优缺点 | Pros and Cons

4.1 优点 | Advantages

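Rapid prototyping (快速原型): a RAG pipeline or an agent takes only a few lines of code.
1000+ integrations (1000+ 集成): wrappers exist for almost any LLM, database, or tool.
Standardized patterns (标准化模式): Chain, Agent, and RAG best practices are built in.
LCEL composability (LCEL 可组合): declarative, pipe-style composition.
LangSmith debugging (LangSmith 调试): production-grade tracing and evaluation.
Active ecosystem (活跃生态): the largest LLM developer community.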

4.2 缺点 | Disadvantages

Over-abstraction (过度抽象): even simple tasks pass through several wrapper layers.
Frequent breaking changes (API 频繁变更): the API broke several times during 2023.
Performance overhead (性能开销): the abstraction layers add latency.
Hard to debug (调试困难): stack traces are deep and opaque.
Dependency bloat (依赖膨胀): langchain-community is very large.
Production reliability gaps (生产可靠性): early versions handled edge cases poorly.

五、应用场景 | Use Cases

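场景 Scenario	LangChain 组件 Components	说明 Description
Enterprise knowledge-base Q&A	RAG Chain + Chroma	Upload documents → vector retrieval → Q&A
Coding assistant agent	ReAct Agent + Code Tool	Write code → execute → fix, in a loop
Customer-service chatbot	ConversationChain + Memory	Multi-turn dialogue with conversation history
Data analysis	SQL Agent + Database Tool	Natural language → SQL → interpretation of results
Document summarization	MapReduce Chain	Summarize a long document chunk by chunk, then merge
API orchestration	Sequential Chain	Multi-step pipeline of API calls
Evaluation benchmarks	LangSmith Evaluators	Automated prompt and chain evaluation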
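
The map-reduce summarization row can be illustrated with a short sketch. It is a simplification under stated assumptions: `summarize` is a hypothetical placeholder that truncates to the first few words instead of calling an LLM, and `map_reduce_summary` is an illustrative helper, not the LangChain MapReduce chain.

```python
def summarize(text: str, max_words: int) -> str:
    """Placeholder summarizer: keeps the first `max_words` words (no LLM involved)."""
    return " ".join(text.split()[:max_words])


def map_reduce_summary(chunks: list, max_words: int = 2) -> str:
    # Map: summarize every chunk independently (this step can run in parallel)
    partials = [summarize(chunk, max_words) for chunk in chunks]
    # Reduce: summarize the concatenation of the partial summaries
    return summarize(" ".join(partials), max_words * 2)


chunks = ["alpha beta gamma delta epsilon", "one two three"]
summary = map_reduce_summary(chunks, max_words=2)
print(summary)
```

The pattern exists because a long document may not fit in one context window; each chunk does, and so does the much shorter list of partial summaries.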

六、GitHub 与开源生态 | GitHub and Open Source

项目 Project	说明 Description
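langchain-ai/langchain	Main LangChain repository
langchain-ai/langgraph	Stateful agent orchestration framework
langchain-ai/langsmith-sdk	Debugging and evaluation SDK
run-llama/llama_index	Competing framework focused on RAG
microsoft/semantic-kernel	Microsoft's LLM orchestration framework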

七、总结 | Summary

中文：2023 年 9 月，LangChain 以 70k+ Stars 成为 LLM 应用开发的「标准库」。它将 Prompt、Chain、Agent、RAG 等模式标准化为可复用原语，极大降低了 LLM 应用开发门槛。尽管面临过度抽象批评，LangChain 定义的架构模式深刻影响了整个 AI 应用生态，LangGraph 与 LangSmith 则指向 Agent 工程化的未来。

English: In September 2023, LangChain became the “standard library” for LLM application development with 70k+ stars. It standardized Prompt, Chain, Agent, and RAG patterns into reusable primitives, dramatically lowering the barrier to LLM app development. Despite over-abstraction criticism, LangChain’s architectural patterns deeply influenced the entire AI application ecosystem, with LangGraph and LangSmith pointing toward the future of Agent engineering.

参考链接 | References

LangChain documentation: python.langchain.com
GitHub: github.com/langchain-ai/langchain
Harrison Chase's blog: blog.langchain.dev
Paper: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (Lewis et al., 2020)