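Model failover

OpenClaw handles failures in two stages:

Auth profile rotation within the current provider.
Model fallback to the next model in agents.defaults.model.fallbacks.

This document covers the runtime rules and the data structures that back them.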

Auth storage (keys + OAuth)

OpenClaw uses auth profiles to manage both API keys and OAuth tokens.

Secrets live in ~/.openclaw/agents/<agentId>/agent/auth-profiles.json (legacy: ~/.openclaw/agent/auth-profiles.json).
The auth.profiles / auth.order config entries are metadata and routing information only (no secrets).
Legacy import-only OAuth file: ~/.openclaw/credentials/oauth.json (imported into auth-profiles.json on first use).

More detail: /concepts/oauth

Credential types:

type: "api_key" → { provider, key }
type: "oauth" → { provider, access, refresh, expires, email? } (some providers also carry projectId/enterpriseUrl)
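The two credential shapes above can be modeled as a discriminated union. This is a minimal sketch: the field names come from the list above, while the type names, the describeCredential helper, and the assumption that expires is an epoch timestamp in milliseconds are illustrative and not taken from the OpenClaw source.

```typescript
// Hypothetical type names; field names follow the credential shapes listed above.
type ApiKeyCredential = {
  type: "api_key";
  provider: string;
  key: string;
};

type OAuthCredential = {
  type: "oauth";
  provider: string;
  access: string;
  refresh: string;
  expires: number; // assumed: epoch timestamp in milliseconds
  email?: string;
  projectId?: string; // only some providers
  enterpriseUrl?: string; // only some providers
};

type Credential = ApiKeyCredential | OAuthCredential;

// Hypothetical helper: narrowing on `type` tells the two shapes apart
// without ever printing the secret fields.
function describeCredential(c: Credential): string {
  if (c.type === "oauth") {
    return `oauth for ${c.provider}${c.email ? ` (${c.email})` : ""}`;
  }
  return `api_key for ${c.provider}`;
}
```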

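Profile IDs

OAuth logins create distinct profiles so that multiple accounts can coexist.

Default: provider:default when no email is available.
OAuth with email: provider:<email> (for example google-antigravity:[email protected]).

Profiles are stored under profiles in ~/.openclaw/agents/<agentId>/agent/auth-profiles.json.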
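The naming rule can be written as a one-line function. The name profileIdFor is hypothetical; only the provider:default and provider:<email> formats come from the text above.

```typescript
// Hypothetical helper: builds a profile ID following the naming rule above.
function profileIdFor(provider: string, email?: string): string {
  return email ? `${provider}:${email}` : `${provider}:default`;
}
```

For instance, profileIdFor("anthropic") yields "anthropic:default", and passing an email appends it after the colon instead.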

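Rotation order

When a provider has multiple profiles, OpenClaw determines the order from the first of these sources that applies:

Explicit config: auth.order[provider] (if set).
Configured profiles: auth.profiles filtered by provider.
Stored profiles: entries in auth-profiles.json for that provider.

If no explicit order is configured, OpenClaw uses round-robin ordering:

Primary key: profile type (OAuth before API keys).
Secondary key: usageStats.lastUsed (within each type, least recently used first).
Profiles in cooldown or disabled are moved to the end, ordered by soonest recovery time.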
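The round-robin ordering described above can be sketched as a comparator-based sort. This is an illustration under stated assumptions, not the OpenClaw implementation: ProfileState, availableAt, and orderProfiles are hypothetical names, timestamps are assumed to be epoch milliseconds, and a profile that has never been used is assumed to sort as the oldest.

```typescript
type ProfileType = "oauth" | "api_key";

// Hypothetical flattened view of a profile plus its usageStats entry.
interface ProfileState {
  id: string;
  type: ProfileType;
  lastUsed?: number; // epoch ms
  cooldownUntil?: number; // epoch ms
  disabledUntil?: number; // epoch ms
}

// Time at which the profile becomes usable again; 0 means usable now.
function availableAt(p: ProfileState, now: number): number {
  const until = Math.max(p.cooldownUntil ?? 0, p.disabledUntil ?? 0);
  return until > now ? until : 0;
}

function orderProfiles(profiles: ProfileState[], now: number): string[] {
  const typeRank = (t: ProfileType): number => (t === "oauth" ? 0 : 1);
  const ready = profiles.filter((p) => availableAt(p, now) === 0);
  const blocked = profiles.filter((p) => availableAt(p, now) !== 0);

  // Usable profiles: OAuth first, then least recently used first.
  ready.sort(
    (a, b) =>
      typeRank(a.type) - typeRank(b.type) ||
      (a.lastUsed ?? 0) - (b.lastUsed ?? 0),
  );
  // Cooling-down or disabled profiles go last, soonest recovery first.
  blocked.sort((a, b) => availableAt(a, now) - availableAt(b, now));

  return [...ready, ...blocked].map((p) => p.id);
}
```

Splitting the list into usable and blocked groups before sorting keeps each comparator simple and makes the "blocked profiles go last" rule impossible to violate by accident.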

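Session stickiness (cache-friendly)

OpenClaw pins the chosen auth profile per session to keep provider caches warm. It does not rotate on every request. The pinned profile is reused until:

the session is reset (/new / /reset)
a compaction completes (the compaction count increments)
the profile enters cooldown or is disabled

Selecting a profile manually with /model …@<profileId> sets a user override for that session; it is not rotated automatically until a new session starts.

Auto-pinned profiles (chosen by the session router) are treated as a preference: they are tried first, but OpenClaw may rotate to another profile on rate limits or timeouts. User-pinned profiles stay fixed: if one fails and model fallbacks are configured, OpenClaw moves to the next model instead of switching profiles.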
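The difference between auto-pinned and user-pinned profiles can be summarized as a small decision function. This is a rough sketch with hypothetical names (Pin, NextStep, nextStepOnFailure); the "give_up" outcome is an assumption added so the function is total, since the text above does not say what happens when nothing is left to try.

```typescript
// Hypothetical model of a session pin: who chose the profile.
type Pin = { profileId: string; source: "auto" | "user" };
type NextStep = "rotate_profile" | "fallback_model" | "give_up";

// Decides what to try after the pinned profile hits a rate limit or timeout.
function nextStepOnFailure(
  pin: Pin,
  hasOtherProfiles: boolean,
  hasModelFallback: boolean,
): NextStep {
  // Auto pins are only a preference, so another profile may be tried.
  if (pin.source === "auto" && hasOtherProfiles) return "rotate_profile";
  // User pins stay fixed; the only way forward is the next model.
  if (hasModelFallback) return "fallback_model";
  return "give_up"; // assumption: nothing left to try
}
```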

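Why OAuth can look "lost"

If the same provider has both an OAuth profile and an API key profile, round-robin can switch between them from one message to the next unless a profile is pinned. To force a single profile:

Pin it with auth.order[provider] = ["provider:profileId"], or
Use a per-session override via /model … with a profile override (when your UI or chat surface supports it).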

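Cooldowns

When a profile fails with an auth or rate-limit error (or a timeout that looks like rate limiting), OpenClaw marks it as in cooldown and moves to the next profile. Format and invalid-request errors (for example, Cloud Code Assist tool call ID validation failures) are also treated as failover-worthy and use the same cooldown mechanism. OpenAI-compatible stop-reason errors such as Unhandled stop reason: error, stop reason: error, and reason: error are classified as timeout/failover signals.

Cooldowns use exponential backoff:

1 minute
5 minutes
25 minutes
1 hour (cap)

State is stored in auth-profiles.json under usageStats: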
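The schedule above matches a base of 1 minute multiplied by 5 per step and capped at 1 hour. A minimal sketch of that arithmetic follows; the function name cooldownMs and the idea of indexing the schedule by a consecutive error count are assumptions, not confirmed details of the implementation.

```typescript
const MINUTE_MS = 60 * 1000;
const COOLDOWN_CAP_MS = 60 * MINUTE_MS; // 1 hour cap

// Hypothetical helper: cooldown length for the n-th consecutive failure (n >= 1).
function cooldownMs(errorCount: number): number {
  const n = Math.max(1, Math.floor(errorCount));
  // Clamp the exponent so very large counts cannot overflow; 5^3 already exceeds the cap.
  const exponent = Math.min(n - 1, 3);
  return Math.min(COOLDOWN_CAP_MS, MINUTE_MS * Math.pow(5, exponent));
}
```

The fourth step would be 125 minutes uncapped, which is why the schedule flattens at 1 hour from that point on.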

{
  "usageStats": {
    "provider:profile": {
      "lastUsed": 1736160000000,
      "cooldownUntil": 1736160600000,
      "errorCount": 2
    }
  }
}

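Billing disables

Billing and credit failures (for example "insufficient credits" / "credit balance too low") are treated as failover-worthy, but they are usually not transient. Instead of a short cooldown, OpenClaw marks the profile as disabled (with a longer backoff) and rotates to the next profile/provider.

State is stored in auth-profiles.json: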

{
  "usageStats": {
    "provider:profile": {
      "disabledUntil": 1736178000000,
      "disabledReason": "billing"
    }
  }
}

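Defaults:

Billing backoff starts at 5 hours, doubles on each billing failure, and caps at 24 hours.
Backoff counters reset once the profile has gone 24 hours without a failure (configurable).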
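The billing defaults work out to 5, 10, and 20 hours, then 24 hours for every later failure. The sketch below reproduces that arithmetic; the function and parameter names are hypothetical, and the reset check is one plausible reading of the "24 hours without a failure" rule rather than the actual logic.

```typescript
const HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: disable duration for the n-th billing failure (n >= 1).
function billingBackoffMs(
  failureCount: number,
  baseHours: number = 5,
  maxHours: number = 24,
): number {
  const n = Math.max(1, Math.floor(failureCount));
  const exponent = Math.min(n - 1, 10); // clamp to avoid huge intermediate values
  return Math.min(maxHours * HOUR_MS, baseHours * HOUR_MS * Math.pow(2, exponent));
}

// Hypothetical helper: the counter is treated as reset once the profile has
// gone a full window without failing (assumed interpretation of the reset rule).
function effectiveFailureCount(
  count: number,
  lastFailureAt: number | undefined,
  now: number,
  windowHours: number = 24,
): number {
  if (lastFailureAt === undefined) return 0;
  return now - lastFailureAt >= windowHours * HOUR_MS ? 0 : count;
}
```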

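Model fallback

If all profiles for a provider fail, OpenClaw moves to the next model in agents.defaults.model.fallbacks. Auth failures, rate limits, and timeouts that have exhausted profile rotation trigger fallback (other errors do not advance fallback).

When a run starts with a model override (from a hook or the CLI), fallbacks still end at agents.defaults.model.primary after the configured fallbacks have been tried.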
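The order in which models are tried can be sketched as a list builder. The function name modelCandidates and the model names in the usage note are hypothetical, and dropping duplicates is an assumption added for the sketch; only the ordering (override, then configured fallbacks, then primary) comes from the text above.

```typescript
// Hypothetical helper: returns the models to try, in order.
function modelCandidates(
  primary: string,
  fallbacks: string[],
  override?: string,
): string[] {
  const chain = override
    ? [override, ...fallbacks, primary] // override first, primary as the last resort
    : [primary, ...fallbacks];
  // Assumption: a model listed twice is only tried once, at its first position.
  return chain.filter((model, index) => chain.indexOf(model) === index);
}
```

With a primary of "model-a" and fallbacks ["model-b", "model-c"], a run started with the override "model-x" would try model-x, model-b, model-c, and finally model-a.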