Structured Output in Practice: Getting Stable JSON out of LLMs

Lao Zhang | 2026/4/30 | about 8 min read

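Audience: backend engineers running LLMs in production | Reading time: about 13 minutes | Key takeaway: a systematic fix for unstable LLM JSON output, with complete working code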

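On the third day after launch, the alert went off.

The feature auto-generates weekly report summaries. The pipeline is: collect data from each system -> call the LLM to generate a summary -> parse the JSON -> write it into the report template. The first two days of testing went fine. On the third day one summary contained an unusually long project name, the LLM put a raw line break inside a JSON string (which JSON does not allow), parsing failed, and the whole weekly report pipeline went down.

That was the first time I thought seriously about how to get reliable JSON out of an LLM in production.

It's not that the model is unstable. You just haven't set it up right.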

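How LLM JSON output goes wrong

After two years of building AI applications, the JSON output problems I have run into fall into these categories:

1. Wrapped in a Markdown code block

The model thinks it is answering a question that needs explaining, so instead of pure JSON it returns a string with explanatory text and ```json code block markers. Calling json.loads() on it directly raises a parse exception.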
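
To make the failure concrete, here is a minimal sketch of what it looks like and the simplest possible fix. The sample reply and the helper name `strip_code_fence` are illustrative assumptions, not output captured from a real model.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way so the example is easy to paste

# Hypothetical model reply: explanation text wrapped around a fenced JSON block.
raw_reply = (
    "Here is the analysis:\n"
    + FENCE + "json\n"
    + '{"status": "ok", "score": 85}\n'
    + FENCE
    + "\nLet me know if you need more."
)

FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

def strip_code_fence(text: str) -> str:
    """Return the body of the first fenced block, or the text unchanged."""
    match = FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()

try:
    json.loads(raw_reply)
    parsed_directly = True
except json.JSONDecodeError:
    # The surrounding prose and the fence markers are not valid JSON.
    parsed_directly = False

result = json.loads(strip_code_fence(raw_reply))
print(parsed_directly, result)
```

This only handles the fenced case. Replies that mix prose and bare JSON need the extraction logic covered later in the parsing layer.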

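2. Comments inside the JSON

{
  "status": "ok",  // processed successfully
  "score": 85  // score
}

Python's standard-library json.loads() does not support comments and raises an error. This shows up most often when the model feels an explanation is needed.

3. Numbers turned into strings, or the reverse

{"amount": "1500.00", "count": "32"}

You expected numbers and the model gave you strings. Any numeric calculation downstream then breaks.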
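
One defensive option is to coerce the fields you know must be numeric right after parsing. The sketch below assumes the two field names from the example above. `NUMERIC_FIELDS` and `coerce_numeric_fields` are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict

# Assumed mapping from field name to the numeric type the application expects.
NUMERIC_FIELDS: Dict[str, Callable[[Any], Any]] = {"amount": float, "count": int}

def coerce_numeric_fields(record: Dict[str, Any]) -> Dict[str, Any]:
    """Convert known numeric fields from strings to numbers; raise on junk."""
    fixed = dict(record)  # shallow copy so the caller's dict is left untouched
    for field, caster in NUMERIC_FIELDS.items():
        value = fixed.get(field)
        if isinstance(value, str):
            try:
                fixed[field] = caster(value.strip())
            except ValueError as exc:
                raise ValueError(f"field {field!r} is not numeric: {value!r}") from exc
    return fixed

print(coerce_numeric_fields({"amount": "1500.00", "count": "32"}))
# prints {'amount': 1500.0, 'count': 32}
```

Failing loudly on values that cannot be converted is a deliberate choice here: a silent fallback to 0 would hide the problem from monitoring.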

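4. Empty fields are omitted

You defined 8 fields, the model decides some of them are "unnecessary" and leaves them out, and your code, which parses against a fixed structure, crashes.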
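
A small guard for this case is to merge the parsed object over a set of defaults and record which fields were missing. This is a sketch under assumptions: `FIELD_DEFAULTS` lists only three illustrative fields rather than a full 8-field schema, and `fill_missing_fields` is a hypothetical helper name.

```python
import copy
from typing import Any, Dict, List, Tuple

# Illustrative defaults; a real schema would list every expected field.
FIELD_DEFAULTS: Dict[str, Any] = {"summary": "", "issues": [], "recommendation": None}

def fill_missing_fields(parsed: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return the object with defaults filled in, plus the names that were missing."""
    missing = [name for name in FIELD_DEFAULTS if name not in parsed]
    # Deep-copy the defaults so mutable values such as lists are never shared.
    complete = {**copy.deepcopy(FIELD_DEFAULTS), **parsed}
    return complete, missing

complete, missing = fill_missing_fields({"summary": "ok"})
print(complete, missing)
```

Returning the list of missing names lets the caller decide whether a gap is acceptable or should count as a failure.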

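5. Large inputs trigger truncation

The input is too long, or max_tokens is set too low, so the JSON is cut off, the closing } is missing, and parsing fails outright.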
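
Truncation is usually visible in the response metadata before you try to parse anything: OpenAI reports `finish_reason == "length"` and Anthropic reports `stop_reason == "max_tokens"` when the output budget ran out. The helper below is a rough sketch. `looks_truncated` is a hypothetical name, and the fallback check on the last character is only a heuristic.

```python
from typing import Optional

# Stop reasons that mean the model ran out of output budget.
TRUNCATION_REASONS = {"length", "max_tokens"}

def looks_truncated(content: str, stop_reason: Optional[str] = None) -> bool:
    """Best-effort check for a cut-off JSON reply."""
    if stop_reason in TRUNCATION_REASONS:
        return True
    # Fallback heuristic: a complete JSON object or array ends with } or ].
    return not content.rstrip().endswith(("}", "]"))

print(looks_truncated('{"summary": "sales rose', "length"))
print(looks_truncated('{"summary": "ok"}', "stop"))
```

Checking the stop reason first is cheaper and more reliable than guessing from the text, so the text heuristic only matters when that metadata is unavailable.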

6. Deep nesting breaks the format

With complex nested JSON, the model occasionally mixes up bracket levels and produces invalid output.

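The solution at a glance

What I use now is "three lines of defense":

First: the prompt layer, which reduces format errors at the source
Second: the provider layer, which uses API features to force JSON output
Third: the parsing layer, with robust parsing and repair logic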

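First line of defense: the prompt layer

Tell the model explicitly to output only JSON and nothing else:

You are a data analysis assistant. Analyze the data below and return the result.

Important: return only a valid JSON object. Do not include any explanatory text, do not use Markdown code blocks, and do not add comments.

Output format:
{
  "summary": "string",
  "score": number,
  "issues": ["string"],
  "recommendation": "string | null"
}

Data: {input_data}

The key phrases are "return only", "do not include any explanatory text", "do not use Markdown code blocks", and "do not add comments".

Just saying "return JSON" is not enough, because the model's default behavior is to "answer the question", and answering a question usually involves explaining. You have to rule out explicitly the things you do not want.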

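Provide an output example:

Output example:
{"summary": "Sales this week rose 12% week over week", "score": 85, "issues": ["Inventory warning"], "recommendation": null}

A compact single-line JSON example works better than a pretty-printed, indented one, because the model tends to imitate the format of the example.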
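
When the prompt is assembled in code, two details are easy to get wrong: the example should be serialized on a single line, and the template contains literal braces, which `str.format` would try to interpret. The sketch below sidesteps both with `json.dumps` and `string.Template`. The names `PROMPT_TEMPLATE`, `EXAMPLE`, and `build_prompt` are assumptions made for illustration.

```python
import json
from string import Template

# $-style placeholders, so literal { } in the prompt need no escaping.
PROMPT_TEMPLATE = Template(
    "You are a data analysis assistant. Analyze the data below and return the result.\n"
    "Return only a valid JSON object: no explanatory text, "
    "no Markdown code blocks, no comments.\n"
    "Output example:\n$example\n"
    "Data: $input_data"
)

EXAMPLE = {
    "summary": "Sales this week rose 12% week over week",
    "score": 85,
    "issues": ["Inventory warning"],
    "recommendation": None,
}

def build_prompt(input_data: str) -> str:
    # json.dumps emits a single line by default; ensure_ascii=False keeps
    # any non-ASCII text in the example readable instead of \u-escaped.
    compact_example = json.dumps(EXAMPLE, ensure_ascii=False)
    return PROMPT_TEMPLATE.substitute(example=compact_example, input_data=input_data)

print(build_prompt("region=east, revenue=120000"))
```

Serializing the example from a real Python object also guarantees that the example itself is valid JSON, which a hand-typed one may not be.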

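Second line of defense: forcing JSON output at the API layer

Most mainstream LLM APIs now support a JSON mode or structured output, and this is far more reliable than prompt constraints alone.

OpenAI's response_format: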

from openai import OpenAI
import json

client = OpenAI()

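# input_data is assumed to be defined elsewhere: the raw data to analyze
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a data analysis assistant. Return the analysis result as JSON only."},
        {"role": "user", "content": f"Analyze the following data: {input_data}"}
    ],
    response_format={"type": "json_object"},  # force JSON output
    temperature=0.1  # lower randomness for structured tasks
)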

result = json.loads(response.choices[0].message.content)

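OpenAI's structured outputs (a stronger constraint: JSON mode only guarantees syntactically valid JSON, while this also enforces the schema):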

from pydantic import BaseModel
from typing import Optional, List

class AnalysisResult(BaseModel):
    summary: str
    score: int
    issues: List[str]
    recommendation: Optional[str] = None

response = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[...],
    response_format=AnalysisResult,
)

# You get a Pydantic object back directly, no manual parsing needed
result = response.choices[0].message.parsed
print(result.score)  # access fields directly

Forcing structured output from Anthropic Claude through tool calling:

import anthropic
import json

client = anthropic.Anthropic()

tools = [
    {
        "name": "save_analysis_result",
        "description": "Save the analysis result",
        "input_schema": {
            "type": "object",
            "properties": {
                "summary": {
                    "type": "string",
                    "description": "Summary of the analysis"
                },
                "score": {
                    "type": "integer",
                    "minimum": 0,
                    "maximum": 100,
                    "description": "Overall score"
                },
                "issues": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "List of issues found"
                },
                "recommendation": {
                    "type": ["string", "null"],
                    "description": "Recommendation, or null when there is none"
                }
            },
            "required": ["summary", "score", "issues", "recommendation"]
        }
    }
]

response = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=tools,
    tool_choice={"type": "tool", "name": "save_analysis_result"},  # force a call to this tool
    messages=[
        {"role": "user", "content": f"Analyze the following data: {input_data}"}
    ]
)

# Pull the data out of the tool call block
for block in response.content:
    if block.type == "tool_use" and block.name == "save_analysis_result":
        result = block.input
        print(result)
        break

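Using tool calling to force structured output is an underrated technique. With tool_choice pinned to a single tool, Claude returns its answer as that tool's input and follows the JSON Schema you defined closely, including field types and required fields. It is still worth validating the result rather than treating the schema as a hard guarantee.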
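
Whatever the provider promises, a cheap check of the returned tool input against the schema catches the odd deviation before it reaches downstream code. The sketch below is a deliberately small hand-rolled check covering required fields, top-level types, and integer bounds. It does not validate array items, and a full validator such as the `jsonschema` package is the sturdier choice. `check_tool_input` and `SCHEMA` are hypothetical names.

```python
from typing import Any, Dict, List

# JSON Schema type names used in the tool definition, mapped to Python types.
_TYPE_MAP = {"string": str, "integer": int, "array": list, "null": type(None)}

def check_tool_input(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of readable problems; an empty list means the check passed."""
    problems: List[str] = []
    for name in schema.get("required", []):
        if name not in data:
            problems.append(f"missing required field: {name}")

    for name, spec in schema.get("properties", {}).items():
        if name not in data:
            continue
        value = data[name]
        declared = spec["type"]
        allowed = declared if isinstance(declared, list) else [declared]
        expected = tuple(_TYPE_MAP[t] for t in allowed)
        # bool is a subclass of int in Python, so rule it out explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
            continue
        if isinstance(value, int):
            if "minimum" in spec and value < spec["minimum"]:
                problems.append(f"{name} below minimum {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                problems.append(f"{name} above maximum {spec['maximum']}")
    return problems

# Trimmed copy of the tool schema, enough for a quick check.
SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "score": {"type": "integer", "minimum": 0, "maximum": 100},
        "issues": {"type": "array", "items": {"type": "string"}},
        "recommendation": {"type": ["string", "null"]},
    },
    "required": ["summary", "score", "issues", "recommendation"],
}

print(check_tool_input({"summary": "ok", "score": 120, "issues": []}, SCHEMA))
# prints ['missing required field: recommendation', 'score above maximum 100']
```

Collecting every problem instead of stopping at the first one makes the resulting log line more useful when tuning the prompt or the schema.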

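Third line of defense: a robust parsing layer

Even with the first two lines of defense in place, production code still needs defensive parsing.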

import json
import re
from typing import Any, Optional

class RobustJSONParser:
    """Robust parser for LLM JSON output."""

    @classmethod
    def parse(cls, raw_output: str) -> Any:
        """
        Multi-strategy parsing, tried from most to least reliable.
        """
        # Strategy 1: parse directly
        result = cls._try_direct_parse(raw_output)
        if result is not None:
            return result

        # Strategy 2: extract a Markdown code block
        result = cls._try_extract_from_markdown(raw_output)
        if result is not None:
            return result

        # Strategy 3: extract the first complete JSON object or array
        result = cls._try_extract_json_object(raw_output)
        if result is not None:
            return result

        # Strategy 4: strip comments, then parse
        result = cls._try_remove_comments_and_parse(raw_output)
        if result is not None:
            return result

        # Strategy 5: repair common formatting problems, then parse
        result = cls._try_repair_and_parse(raw_output)
        if result is not None:
            return result

        raise ValueError(f"Unable to parse JSON output: {raw_output[:200]}...")
    
    @classmethod
    def _try_direct_parse(cls, text: str) -> Optional[Any]:
        try:
            return json.loads(text.strip())
        except json.JSONDecodeError:
            return None
    
    @classmethod
    def _try_extract_from_markdown(cls, text: str) -> Optional[Any]:
        """Extract the contents of a ```json ... ``` code block."""
        # Match fenced code blocks, with or without the json language tag
        pattern = r'```(?:json)?\s*\n?(.*?)\n?```'
        matches = re.findall(pattern, text, re.DOTALL)

        for match in matches:
            try:
                return json.loads(match.strip())
            except json.JSONDecodeError:
                continue

        return None
    
    @classmethod
    def _try_extract_json_object(cls, text: str) -> Optional[Any]:
        """Find the first complete JSON object or array."""
        # Try the opening brackets in the order they appear in the text, so an
        # array that wraps objects is not skipped in favor of its first element
        starts = sorted((text.find(ch), ch) for ch in '{[' if ch in text)
        for start_idx, char in starts:
            # Scan forward from the start position for the matching closing character
            end_char = '}' if char == '{' else ']'
            depth = 0
            in_string = False
            escape_next = False

            for i, c in enumerate(text[start_idx:], start_idx):
                if escape_next:
                    escape_next = False
                    continue
                if c == '\\' and in_string:
                    escape_next = True
                    continue
                if c == '"':
                    in_string = not in_string
                if not in_string:
                    if c == char:
                        depth += 1
                    elif c == end_char:
                        depth -= 1
                        if depth == 0:
                            candidate = text[start_idx:i+1]
                            try:
                                return json.loads(candidate)
                            except json.JSONDecodeError:
                                break

        return None
    
    @classmethod
    def _try_remove_comments_and_parse(cls, text: str) -> Optional[Any]:
        """Strip JSON comments (// and /* */), then parse."""
        # Match string literals and comments in one pattern and drop only the
        # comments. String literals are matched first, which protects a // or
        # /* that sits inside a string value (a URL, for example).
        token_re = re.compile(r'"(?:\\.|[^"\\])*"|//[^\n]*|/\*.*?\*/', re.DOTALL)
        cleaned = token_re.sub(
            lambda m: m.group(0) if m.group(0).startswith('"') else '',
            text,
        )

        try:
            return json.loads(cleaned.strip())
        except json.JSONDecodeError:
            return None
    
    @classmethod
    def _try_repair_and_parse(cls, text: str) -> Optional[Any]:
        """Repair common formatting problems."""
        # Use the json_repair library if it is installed
        try:
            import json_repair
            return json_repair.repair_json(text, return_objects=True)
        except ImportError:
            pass

        # Manual repair: handle missing closing braces at the end
        text = text.strip()
        if text.startswith('{'):
            # Count how many } are missing (rough: braces inside strings are counted too)
            open_count = text.count('{') - text.count('}')
            if open_count > 0:
                text += '}' * open_count
                try:
                    return json.loads(text)
                except json.JSONDecodeError:
                    pass

        return None


# Usage example
import logging

def parse_llm_json_output(raw: str) -> Any:
    try:
        return RobustJSONParser.parse(raw)
    except ValueError as e:
        # Log the failure so it can trigger an alert
        logging.error(f"JSON parse failed: {e}")
        raise
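
As a complement to the hand-written bracket scanner in `_try_extract_json_object`, the standard library can do much of this work itself: `json.JSONDecoder.raw_decode` parses one JSON value starting at a given index and ignores whatever follows, and `strict=False` accepts raw control characters such as a line break inside a string, which is the failure from the incident at the start of this article. The function name `extract_first_json` and the sample text are assumptions for illustration.

```python
import json
from typing import Any, Optional

def extract_first_json(text: str) -> Optional[Any]:
    """Return the first JSON object or array embedded in text, or None."""
    # strict=False tolerates raw newlines and tabs inside string values.
    decoder = json.JSONDecoder(strict=False)
    for idx, ch in enumerate(text):
        if ch not in "{[":
            continue
        try:
            value, _end = decoder.raw_decode(text, idx)
            return value
        except json.JSONDecodeError:
            continue  # not valid JSON from here, try the next bracket
    return None

# Hypothetical reply with prose around the JSON and a raw newline in a string.
reply = 'Here is the result:\n{"title": "Project\nAlpha", "score": 85}\nHope this helps.'
print(extract_first_json(reply))
```

Because it returns the first value that parses, a bracketed fragment in the prose such as `[1]` would win over a later object, so this works best on replies where the JSON is the only structured part.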

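Handling truncation

If the output gets truncated (incomplete JSON), the repair logic above can help, but prevention is better: size max_tokens to the output you expect.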

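def estimate_output_tokens(schema: dict, data_size: int) -> int:
    """Roughly estimate how many tokens the output needs."""
    # fields * average tokens per field + an allowance that scales with input size
    field_count = len(schema.get('properties', {}))
    base_tokens = field_count * 30  # about 30 tokens per field on average
    data_tokens = data_size // 4  # rough estimate
    return base_tokens + data_tokens + 200  # 200 tokens of headroom

# Set max_tokens dynamically at call time
# (output_schema and input_text come from the surrounding application code)
max_tokens = max(512, estimate_output_tokens(output_schema, len(input_text)))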

Another approach is to have the model "continue" the truncated JSON:

def continue_truncated_json(client, partial_json: str, original_prompt: str) -> str:
    """Ask the model to finish a truncated JSON document."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": original_prompt},
            {"role": "assistant", "content": partial_json},
            {"role": "user", "content": "The JSON above is incomplete. Continue it and output only the remaining part."}
        ],
        max_tokens=500
    )
    continuation = response.choices[0].message.content
    return partial_json + continuation
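
A continuation does not always pick up exactly where the partial output stopped. Sometimes the model re-emits the whole document instead. The sketch below checks the stitched result before trusting it and falls back to the continuation alone. `merge_continuation` is a hypothetical helper, and the two candidates it tries are an assumption about the common cases, not an exhaustive list.

```python
import json
from typing import Any

def merge_continuation(partial_json: str, continuation: str) -> Any:
    """Parse partial + continuation, or the continuation alone if that fails."""
    candidates = [
        partial_json + continuation,  # the model continued where it left off
        continuation,                 # the model restarted and emitted everything
    ]
    for candidate in candidates:
        try:
            return json.loads(candidate.strip())
        except json.JSONDecodeError:
            continue
    raise ValueError("continuation did not produce valid JSON")

print(merge_continuation('{"a": 1, "b": [1, ', '2]}'))
# prints {'a': 1, 'b': [1, 2]}
```

Raising when neither candidate parses keeps the failure visible to the monitoring described in the next section.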

Monitoring and alerting

In production, JSON parse failures need to be monitored:

from dataclasses import dataclass
from datetime import datetime
import threading

@dataclass
class ParseMetrics:
    total_requests: int = 0
    parse_failures: int = 0
    repair_successes: int = 0

_metrics = ParseMetrics()
_lock = threading.Lock()

def tracked_parse(raw: str, prompt_key: str) -> dict:
    with _lock:
        _metrics.total_requests += 1
    
    try:
        result = RobustJSONParser.parse(raw)
        return result
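    except ValueError:
        # Update the counter and read the rate under the same lock
        with _lock:
            _metrics.parse_failures += 1
            failure_rate = _metrics.parse_failures / _metrics.total_requests

        # Alert when the failure rate exceeds 5%
        # (send_alert is your own alerting hook; it is not defined here)
        if failure_rate > 0.05:
            send_alert(f"JSON parse failure rate {failure_rate:.1%}, prompt_key={prompt_key}")

        raise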
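
One caveat with a cumulative counter: a single failure right after startup reads as a 100% failure rate, and a long healthy history dilutes a recent spike. A sliding window with a minimum sample size is one common way around both. The sketch below is an assumption about how that could look. `FailureRateWindow` and its parameters are hypothetical, and the defaults are placeholders to tune.

```python
import threading
from collections import deque

class FailureRateWindow:
    """Tracks the failure rate over the most recent N parse attempts."""

    def __init__(self, window_size: int = 200, min_samples: int = 50,
                 threshold: float = 0.05) -> None:
        self._outcomes = deque(maxlen=window_size)  # True = success, False = failure
        self._min_samples = min_samples
        self._threshold = threshold
        self._lock = threading.Lock()

    def record(self, success: bool) -> bool:
        """Record one outcome; return True when the alert condition is met."""
        with self._lock:
            self._outcomes.append(success)
            total = len(self._outcomes)
            failures = total - sum(self._outcomes)
        # Stay quiet until there are enough samples for the rate to mean something.
        return total >= self._min_samples and failures / total > self._threshold

window = FailureRateWindow(window_size=10, min_samples=5)
alerts = [window.record(False)] + [window.record(True) for _ in range(4)]
print(alerts)
# prints [False, False, False, False, True]
```

In `tracked_parse`, the return value of `record` would take the place of the `failure_rate > 0.05` check.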

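Results in practice

Since rolling this out, our JSON parse failures have dropped from 3-5 per day to almost zero. The occasional failure that remains is nearly always a genuine API error rather than a formatting problem.

In one sentence: prompt constraints + API-enforced mode + robust parsing. Leave out any one of them and the setup is incomplete.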