Module 6 · Tool Calling

Function Calling

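The core capability that takes an Agent from "can talk" to "can act". Function Calling lets the LLM decide on its own when to call external tools, and Structured Output makes the model return data in a fixed format. Together they give an Agent its hands-on ability.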

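What Is Function Calling

Function Calling is an extension capability of LLMs. The developer gives the LLM a set of function definitions (name, parameter descriptions, return-value descriptions). The model does not run anything itself: it emits a function name and arguments, and the application executes the call. Based on the user's question, the LLM decides on its own:

Which function to call (possibly several)
What arguments to pass
The call order (when functions depend on one another)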

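"Check today's weather in Beijing for me"

LLM decides: call get_weather with city=Beijing

"Send Zhang San an email telling him the meeting is postponed"

LLM decides: look up Zhang San's email address first, then call send_email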


Function Calling vs. Structured Output

	Function Calling	Structured Output
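Purpose	Lets the Agent call external tools	Makes the LLM return data in a fixed format
Trigger	The LLM decides on its own when to call	The developer or user requests formatted output
Typical scenarios	Checking the weather, sending email, querying a database	Returning JSON / YAML / tables / enums
Representative APIs	OpenAI tools / Anthropic tool_use	OpenAI JSON mode / response_format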

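In real projects the two are often combined: Function Calling lets the Agent take actions, and Structured Output makes the Agent's results directly parseable by a program (written to a database, passed to a downstream API, and so on).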

Hands-On Development

1. Tool Definition (JSON Schema)

Tool definition example: weather lookup

tools = [
  {
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Get the current weather for the given city",
      "parameters": {
        "type": "object",
        "properties": {
          "city": {
            "type": "string",
            "description": "City name; must be one of the supported cities listed in enum",
            "enum": ["Beijing", "Shanghai", "Guangzhou", "Shenzhen"]
          },
          "unit": {
            "type": "string",
            "description": "Temperature unit",
            "enum": ["celsius", "fahrenheit"],
            "default": "celsius"
          }
        },
        "required": ["city"]
      }
    }
  }
]

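2. The Agent Execution Loop

User asks: "What is the temperature in Beijing today?"

LLM decides: get_weather(city=Beijing) needs to be called

Execute the tool: call the weather API, which returns {"temp": 22, "weather": "sunny"}

LLM integrates: generate a natural-language answer from the tool result

Reply to the user: "It is 22°C and sunny in Beijing today."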
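
The five steps above can be sketched as the message list an application accumulates during one round trip. This is a minimal sketch that makes no API call: `fake_weather_api`, the id `call_1`, and the hard-coded tool call are illustrative stand-ins, and the message shapes assume the OpenAI Chat Completions convention.

```python
import json


def fake_weather_api(city: str) -> dict:
    """Hypothetical stand-in for a real weather API."""
    return {"temp": 22, "weather": "sunny"}


# Step 1: the user asks a question.
messages = [{"role": "user", "content": "What is the temperature in Beijing today?"}]

# Step 2: the model answers with a tool call instead of text (hard-coded here).
tool_call = {
    "id": "call_1",  # illustrative id; real ids are generated by the API
    "type": "function",
    "function": {"name": "get_weather", "arguments": json.dumps({"city": "Beijing"})},
}
messages.append({"role": "assistant", "content": None, "tool_calls": [tool_call]})

# Step 3: the application runs the tool and records the result under the same id.
args = json.loads(tool_call["function"]["arguments"])
result = fake_weather_api(**args)
messages.append(
    {"role": "tool", "tool_call_id": tool_call["id"], "content": json.dumps(result)}
)

# Steps 4-5: a second model call would turn this history into the final answer.
roles = [m["role"] for m in messages]
print(roles)
```

The point of the sketch is the ordering: the assistant message carrying the tool call has to sit in the history before the matching tool result.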

3. Multi-Tool Orchestration

Handling dependencies between tools

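from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Scenario: the user asks "What is the latest progress on the project Zhang San owns?"
# Needed: look up Zhang San -> look up the project -> look up progress (three tools in series)

messages = [
    {"role": "user", "content": "What is the latest progress on the project Zhang San owns?"}
]

while True:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,  # all tool definitions
        tool_choice="auto"
    )
    message = response.choices[0].message

    if message.tool_calls:
        # The LLM asked for tool calls. Its own message must be appended first,
        # otherwise the API rejects the tool results that follow.
        messages.append(message)
        for call in message.tool_calls:
            # call.function.arguments is a JSON string; execute_tool is a helper
            # (not defined here) that parses it and returns a string result.
            result = execute_tool(call.function.name, call.function.arguments)
            messages.append({"role": "tool", "tool_call_id": call.id, "content": result})
    else:
        # The LLM has produced the final answer
        print(message.content)
        break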
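
The loop above calls `execute_tool` without defining it. One possible shape, as a minimal sketch: a registry maps tool names to Python callables, and the helper parses the JSON argument string the model sends. `TOOL_REGISTRY` and this `get_weather` stub are assumptions made for illustration, not part of any SDK.

```python
import json


def get_weather(city: str, unit: str = "celsius") -> dict:
    """Hypothetical tool implementation that returns canned data."""
    return {"city": city, "temp": 22, "unit": unit, "weather": "sunny"}


# Maps tool names (as declared in the schema) to Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def execute_tool(name: str, arguments: str) -> str:
    """Run one tool call and always return a string for the tool message."""
    func = TOOL_REGISTRY.get(name)
    if func is None:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        kwargs = json.loads(arguments)  # the model sends arguments as JSON text
        return json.dumps(func(**kwargs), ensure_ascii=False)
    except (json.JSONDecodeError, TypeError) as exc:
        # Report the failure to the model instead of crashing the loop.
        return json.dumps({"error": str(exc)})


print(execute_tool("get_weather", '{"city": "Beijing"}'))
```

Returning errors as tool results, rather than raising, gives the model a chance to correct its arguments on the next turn.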

Error Handling and Retry Strategies

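Tool call failure

API timeout / invalid parameters / service unavailable

Retry 1-2 times → if it still fails, return "The service is temporarily unavailable. Please try again later."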
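
The retry rule above could look like the following. This is a minimal sketch: `call_with_retry`, `FALLBACK_MESSAGE`, and the exponential delay are illustrative choices, not a prescribed implementation.

```python
import time

FALLBACK_MESSAGE = "The service is temporarily unavailable. Please try again later."


def call_with_retry(func, *args, retries=2, base_delay=0.5, **kwargs):
    """Call func; on an exception retry up to `retries` more times, then fall back."""
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except Exception:
            if attempt == retries:
                return FALLBACK_MESSAGE
            # Wait longer after each failure: base_delay, then 2x, then 4x, ...
            time.sleep(base_delay * 2 ** attempt)
```

Retrying mainly helps with transient failures such as timeouts. For invalid parameters, repeating the same call will likely fail again, so passing the error back to the model is usually the more useful response.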

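Runaway call loops

The LLM keeps calling the same tool

Set a turn limit in the agent loop, such as max_turns=10 → force a stop once the limit is exceeded and return the best result so far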
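
A turn limit can be enforced by the loop itself. The sketch below assumes a simplified, hypothetical interface: `ask_model` returns either `("tool", name)` or `("final", text)`, and `run_tool` returns a string. Neither is a real SDK call.

```python
def run_agent(ask_model, run_tool, max_turns=10):
    """Drive the agent loop, stopping after max_turns model calls.

    ask_model(history) -> ("tool", name) or ("final", text)  # hypothetical interface
    run_tool(name) -> str
    """
    history = []
    last_result = None
    for _ in range(max_turns):
        kind, value = ask_model(history)
        if kind == "final":
            return value
        last_result = run_tool(value)
        history.append((value, last_result))
    # Turn budget exhausted: return what we have instead of looping forever.
    return f"Stopped after {max_turns} turns. Latest tool result: {last_result}"


# A model stub that never stops calling the same tool.
stuck_model = lambda history: ("tool", "get_weather")
print(run_agent(stuck_model, lambda name: "22°C", max_turns=3))
```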

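Invalid output handling

The LLM returns arguments in the wrong format

Enable strict mode ("strict": true on the function definition in the OpenAI API) → the model's arguments are constrained to match the JSON Schema. Strict mode also requires "additionalProperties": false and every property to be listed in required.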
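
Where schema enforcement on the model side is unavailable, or as a second line of defense, the application can check arguments itself before running a tool. The following is a minimal hand-rolled sketch covering only required fields, string types, and enums; `validate_arguments` and `CITY_SCHEMA` are illustrative names, and a full JSON Schema validator would cover far more.

```python
import json

CITY_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string", "enum": ["Beijing", "Shanghai", "Guangzhou", "Shenzhen"]},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}


def validate_arguments(raw: str, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments passed."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    errors = [
        f"missing required field: {key}"
        for key in schema.get("required", [])
        if key not in args
    ]
    for key, value in args.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unexpected field: {key}")
        elif spec.get("type") == "string" and not isinstance(value, str):
            errors.append(f"{key} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            errors.append(f"{key} must be one of {spec['enum']}")
    return errors
```

The returned messages can be sent back to the model as the tool result so it can repair the call on the next turn.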

Structured Output in Practice

Scenario: extracting structured data from unstructured text

from typing import Literal, Optional

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

# In the OpenAI Python SDK, responses.parse takes a Pydantic model as text_format
# (not a raw JSON Schema dict). Optional fields may come back as None.
class Person(BaseModel):
    name: str
    gender: Optional[Literal["male", "female"]]
    birth_year: Optional[int]
    city: Optional[str]
    phone: str
    email: Optional[str]

response = client.responses.parse(
    model="gpt-4o-mini",
    input="Zhang San, male, born in 1985, from Beijing, phone 13800138000, email zhangsan@example.com",
    text_format=Person,
)

# response.output_parsed is a Person instance, for example:
# Person(name="Zhang San", gender="male", birth_year=1985, ...)
print(response.output_parsed)

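When should you use Structured Output?

Use it when LLM output has to be processed by a program: written to a database, passed to a downstream API, or rendered into a fixed-format report. When combined with Function Calling, Function Calling defines what to do and Structured Output defines what format to return.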


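Function Calling in the MCP Protocol

MCP's tools/ methods (tools/list to discover tools, tools/call to invoke one) are, in essence, a standardized form of Function Calling. MCP defines a common tool description format, so any AI client that supports MCP can call any tool server that supports MCP.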
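
To make the correspondence concrete, the sketch below maps an OpenAI-style function definition onto the shape MCP uses for a tool (name, description, inputSchema) and builds the JSON-RPC 2.0 request a client would send to invoke it. The helper names `to_mcp_tool` and `tools_call_request` are illustrative; transport, initialization, and response handling are left out.

```python
import json

openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def to_mcp_tool(tool: dict) -> dict:
    """Map an OpenAI-style function definition onto the MCP tool shape."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn["description"],
        "inputSchema": fn["parameters"],  # both sides use JSON Schema here
    }


def tools_call_request(request_id: int, name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


print(json.dumps(tools_call_request(1, "get_weather", {"city": "Beijing"}), indent=2))
```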
