Morning Review - 𓀚 转了码的刘公子

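# 2026-05-02 Morning Review Candidate Pack

Sessions scanned: 44. Candidates: 5. Today's signals cluster into three groups: local/cloud API diagnostics, turning subscription-source management into tooling, and the input quality of Night Gym itself.

## 1. Priority Candidates

| No. | id | Type | Risk | Title |
|---:|---|---|---|---|
| 1 | `local-model-thinking-latency-diagnostic` | diagnostic-tool | low | Thinking/port/daemon diagnostic tool for local OpenAI-compatible models |
| 2 | `x-subscribe-hub-api-readiness-and-resume` | tool-upgrade | medium | API readiness, cost, permission, and rate-limit recovery chain for X/subscription-source management |
| 3 | `azure-speech-pronunciation-resource-diagnostic` | diagnostic-tool | medium | Azure Pronunciation Assessment resource/key readiness diagnostic and smoke test |
| 4 | `night-gym-reflection-ingest-quality-gate` | workflow-pattern | medium | Extraction and redaction quality gate for Night Gym agent_reflections |
| 5 | `credential-handoff-browser-workflow` | workflow-pattern | medium | Handoff workflow for browser login, OAuth authorization, and key retrieval |

## 2. Candidate Details

### 2.1 Thinking/port/daemon diagnostic tool for local OpenAI-compatible models

Type: `diagnostic-tool`; risk: `low`; approval: `pending`.

This topic has clear value. A single real troubleshooting session exposed four frequently recurring problems at once: a misjudged port, a client-side thinking switch that was never mapped to the request, large latency caused by reasoning_content, and a failed LaunchAgent/TCC hosting attempt. It is a good fit for a local diagnostic tool that first uses non-intrusive curl comparisons to locate the failing layer, then recommends server-side default parameters and a background hosting method.

Evidence:

- `019de7ad-72e7-7c12-af0a-e973feeb0a18` / `~/Documents/learning-bu`: the user asked a series of questions about a local vmlx/Gemma4 service: deployment method, quantization version, port, a speed of 2.0 tok/s, and Cherry Studio still thinking after thinking was switched off. The user then asked for thinking to be disabled by default on the server side.
- `019de7ad-72e7-7c12-af0a-e973feeb0a18` / `~/Documents/learning-bu`: agent_reflection: "I will first run two small tests without changing the service configuration: one limits the output length, the other tries to turn off thinking/reasoning, to see whether the speed bottleneck comes mainly from reasoning_content."
- `019de7ad-72e7-7c12-af0a-e973feeb0a18` / `~/Documents/learning-bu`: during execution, port 8080 was unreachable and port 8000 was available; the default (thinking) request took 104.89s for 213 tokens, while the `enable_thinking:false` request took 3.21s for 16 tokens; the LaunchAgent was blocked by TCC, so the service was ultimately kept resident with screen and verified to return reasoning_content:null.

Proposed changes:

- New tool entry point: add a CLI at `/Users/bytedance/Documents/info-bu/info-workspace/projects/local-gemma4-api/tools/diagnose_openai_compat.py`, invoked as `python tools/diagnose_openai_compat.py --base-url http://127.0.0.1:8000 --model gemma4 --ports 8000,8080 --log /tmp/vmlx_serve.log`.
- Input sources: read base_url, model, candidate ports, and log path from the CLI arguments, and optionally read a request-body JSON exported from Cherry Studio or another client. Also collect `lsof -nP -iTCP:<port> -sTCP:LISTEN`, `/v1/models`, `/v1/chat/completions` responses, `screen -ls`, `launchctl print`, and the key lines of the service log.
- Core check steps and success/failure signals: first probe ports such as 8000/8080 and confirm the OpenAI-compatible endpoint. Then run two minimal requests, a default request and an `enable_thinking:false` request, recording HTTP status, total_time, completion_tokens, and whether reasoning_content is null. Success signals: the target port returns HTTP 200, the no-thinking request is markedly faster with reasoning_content:null, and the background session/process exists. Failure signals: the port is unreachable, the client request does not carry the correct field, the default request still returns reasoning_content, or the LaunchAgent hits a TCC/pyvenv.cfg PermissionError.
- Suggested landing path: wrap the diagnostic tool as a `local-model-diagnostic` skill written to `/Users/bytedance/Documents/info-bu/.agents/skills/local-model-diagnostic/SKILL.md`. It triggers when the user says things like "the local model is slow", "turning off thinking has no effect", "the port is unreachable", or "keep it running for me". It outputs a one-page diagnostic report and recommends a server-side fallback parameter such as `--default-enable-thinking false`, or hosting in a detached `screen`.
- Example file: `out/examples/local-model-thinking-latency-diagnostic.md`

### 2.2 API readiness, cost, permission, and rate-limit recovery chain for X/subscription-source management

Type: `tool-upgrade`; risk: `medium`; approval: `pending`.

This topic is valid, not noise. Within one X subscription-source management workflow, several prerequisite gaps surfaced in succession: API credentials, paid credits, OAuth scope, App write permission, and recovery from 429 rate limiting. The most valuable thing to capture is upgrading subscribe-hub from a one-off script to a toolchain with readiness diagnostics, status explanations, audit logging, and checkpoint resume, so that a high-risk operation such as bulk unfollow does not lose its real progress when it fails midway.

Evidence:

- `019de822-b157-7af3-a59c-5a244670d33e` / `~/Documents/info-bu`: starting from a YouTube/X subscription inventory, the user asked for a new subscribe-hub skill and advanced the X API work step by step: creating the App, OAuth, paid credits, reading Lists/Following, write permission, and bulk unfollow.
- `019de822-b157-7af3-a59c-5a244670d33e` / `~/Documents/info-bu`: the process repeatedly ran into a missing X_CLIENT_ID, client-not-enrolled, CreditsDepleted, OAuth scope/permission upgrades, the need to Authorize app several times, and 429 Too Many Requests on write operations.
- `019de822-b157-7af3-a59c-5a244670d33e` / `~/Documents/info-bu`: when bulk-unfollowing 306 accounts, the first version of the script hit a 429 and never wrote the final audit. It was then changed to write the audit immediately, generate a remaining list, and resume in batches; about 100 accounts have been unfollowed in total, with 206 remaining.

Proposed changes:

- Add a fixed `x-api-readiness` step to `~/Documents/info-bu/.agents/skills/subscribe-hub/SKILL.md`: before any X Lists/Following/Unfollow operation, the diagnostic must run first and report five kinds of status: credentials, plan/credits, OAuth scopes, app permission, and rate-limit bucket.
- Add a new tool entry point `scripts/x_api_readiness.py`. Input sources: `X_CLIENT_ID`, `X_CLIENT_SECRET`, and access token/refresh token from `.env` or the skill config, plus the target operation type `read_lists|read_following|write_unfollow`. Core check steps: environment variable presence, OAuth token scope introspection or a lightweight API smoke call, credits/plan error probing, write-permission probing, and rate-limit header parsing. Success signal: every check returns `ok` with an actionable next step. Failure signals include `missing_client_id`, `client_not_enrolled`, `credits_depleted`, `insufficient_scope`, `write_permission_missing`, and `rate_limited`. Suggested form: a CLI plus a JSON report that later batch scripts are required to read.
- Make the bulk unfollow script resume-safe: write `pending` before each account operation, write `unfollowed` immediately on success, and write `failed` with the raw error on failure. At the end of each batch, generate `audit.jsonl`, `remaining.csv`, and `rate_limit_state.json`; the next run resumes from `remaining.csv` by default.
- Add an `x_unfollow_resume.md` runbook covering the wait/batching strategy after a 429, the conditions for re-running Authorize app, the difference between CreditsDepleted and scope/permission errors, and the dry-run and rollback boundaries before a bulk unfollow.
- Example file: `out/examples/x-subscribe-hub-api-readiness-and-resume.md`

### 2.3 Azure Pronunciation Assessment resource/key readiness diagnostic and smoke test

Type: `diagnostic-tool`; risk: `medium`; approval: `pending`.

This topic has clear value, not noise. A REST short-audio smoke script and a sample WAV already exist, but the real blocker sits in the prerequisite chain of Speech resource, key/region, and Portal/tenant permissions. The suggestion is to build a local diagnostic entry point that chains environment variables, Azure CLI login state, tenant/subscription permissions, Speech resource visibility, key retrieval, and the Pronunciation Assessment smoke test into one reproducible check.

Evidence:

- `019de91c-170c-7402-9254-50e2607239b3` / `~/Documents/product-bu`: the user asked to get the Azure AI Speech Pronunciation Assessment API working and tested end to end; the agent bypassed the Portal and delivered a smoke script and sample WAV based on REST short-audio plus the Pronunciation-Assessment header.
- `019de939-7b1b-7143-929c-ec04ced72629` / `~/Documents/product-bu`: the user then asked to create or read a Speech resource key from the already signed-in Azure account; the agent found that AZURE_SPEECH_KEY/REGION were missing locally, that the Foundry page does not expose the key directly, and that Azure Portal login state/tenant permission had become the blocker.
- `019de939-7b1b-7143-929c-ec04ced72629` / `~/Documents/product-bu`: the specific error: [email protected] is not in the Microsoft Services tenant and cannot access the application. The only options are to switch to an account with Azure subscription permissions, add an external user, or fill in key/region manually and rerun the smoke test.

Proposed changes:

- New diagnostic tool entry point: add a CLI at `/Users/bytedance/Documents/product-bu/tools/azure_speech_pronunciation_diag.py` supporting `--audio sample.wav --reference-text "..." --resource-name optional --resource-group optional`. Input sources include the environment variables `AZURE_SPEECH_KEY`/`AZURE_SPEECH_REGION`, the current Azure CLI account and subscription, and an optional sample WAV and reference text.
- Core check steps: first check that `az` is installed and that `az account show` returns a valid account; then check that the tenant/subscription is usable; next read or locate the Speech resource and call `az cognitiveservices account keys list` to fetch the key; finally send the smoke test using REST short-audio plus the `Pronunciation-Assessment` header. Success signal: HTTP 200 with `NBest`/pronunciation score in the response. Failure signals are reported by category: `missing_env`, `az_not_logged_in`, `tenant_denied`, `no_subscription`, `resource_not_found`, `key_fetch_denied`, `smoke_http_error`.
- Suggested landing path: wrap the diagnostic script as a product-bu workspace skill `azure-speech-pronunciation-diagnostic`, with `SKILL.md` stating that it triggers on Azure Speech/Pronunciation/key/tenant/smoke test. By default, run the diagnostic first, then decide whether to open the Portal or ask the user to switch accounts.
- Add `docs/azure_speech_pronunciation_setup.md`: record the resource creation/read path, how to write the environment variables, the Azure CLI alternative when Portal/Foundry does not expose the key, the handling branch for the tenant error `not in Microsoft Services tenant`, and a sample of the expected smoke test output.
- Example file: `out/examples/azure-speech-pronunciation-resource-diagnostic.md`

### 2.4 Extraction and redaction quality gate for Night Gym agent_reflections

Type: `workflow-pattern`; risk: `medium`; approval: `pending`.

This topic has clear value. The agent_reflections newly added to Night Gym are the key input for turning an agent's self-reported bottlenecks into diagnostic-tool candidates, but they also expose two quality risks: missing extraction and over-aggressive redaction. The suggestion is to turn this into an ingest quality gate, so that every generated input.json keeps the bottleneck evidence without leaking secrets or truncating Chinese body text.

Evidence:

- `019de944-5bba-7d83-8754-6eb66970e304` / `~/Documents/product-bu`: the user asked to improve agent gym: periodically ask the agent about its hardest and most time-consuming recent tasks, and turn the concrete bottlenecks into diagnostic-tool candidates.
- `019de944-5bba-7d83-8754-6eb66970e304` / `~/Documents/product-bu`: "The current system hands candidate discovery to the master prompt and the sub-analysis prompts; the Python layer mainly handles cleaning and packaging. I will make the mechanism of periodically asking the agent about its hardest, most time-consuming tasks a first-class input."
- `019de944-5bba-7d83-8754-6eb66970e304` / `~/Documents/product-bu`: "Red light confirmed: the current `input.json` has no `agent_reflections` field. Next I will do a minimal implementation in the ingest layer and update Night Gym's master/sub prompts in step."

Proposed changes:

- In `~/Documents/product-bu/tools/night-gym/tests/test_night_gym.py`, add ingest quality gate cases: construct assistant/event messages and verify that `input.json.sessions[].agent_reflections` exists, includes only explicit bottleneck self-reports, and excludes ordinary progress updates.
- In `~/Documents/product-bu/tools/night-gym/tests/test_night_gym.py`, add redaction regression samples containing text such as `Bearer sk-xxx，后面是中文瓶颈描述` (a bearer token followed by a Chinese bottleneck description) and `API_KEY=xxx，后面是证据句` (a key assignment followed by an evidence sentence), asserting that the secret is replaced while the body text after the Chinese comma is preserved in full.
- In `~/Documents/product-bu/tools/night-gym/night_gym/prompts/master.md` and `night_gym/prompts/subagent.md`, fix the candidate contract: master must review `agent_reflections` first, and if one becomes a `diagnostic-tool`, subagent must output the tool entry point, input sources, core check steps, success/failure signals, and landing path.
- In `~/Documents/product-bu/tools/night-gym/README.md` or the runbook, add a section `Reflection ingest quality gate` listing the local verification command, a minimal synthetic session, the pass criteria, and a note that when evidence is truncated, the boundary rules in `privacy.py` should be checked first.
- Example file: `out/examples/night-gym-reflection-ingest-quality-gate.md`

### 2.5 Handoff workflow for browser login, OAuth authorization, and key retrieval

Type: `workflow-pattern`; risk: `medium`; approval: `pending`.

This topic has clear value. Several cloud service/API configuration tasks stalled at the human-agent boundary: browser login, OAuth authorization, paid upgrades, tenant permissions, and key retrieval. It should not be captured as a single-platform tutorial but as a credential handoff workflow: the agent does the verifiable preparation first, sensitive actions are handed to the user, and a scripted smoke test then verifies permissions, callbacks, key persistence, and real API availability.

Evidence:

- `019de822-b157-7af3-a59c-5a244670d33e` / `~/Documents/info-bu`: during X API configuration, the user repeatedly had to click OAuth Authorize, confirm saving one-time credentials, buy credits, and upgrade permissions; the agent needs to distinguish what it may click on the user's behalf, what requires confirmation, and what it must never fill in.
- `019de939-7b1b-7143-929c-ec04ced72629` / `~/Documents/product-bu`: in the Azure Portal the agent cannot fill in the account/MFA; even after the user signs in, tenant permission problems may still prevent reading the resource key.
- `019de91c-170c-7402-9254-50e2607239b3` / `~/Documents/product-bu`: the Azure smoke project has already turned the missing-key scenario into an explicit diagnosis; this class of cloud API task needs a unified checklist for credentials, permissions, callbacks, and key persistence.

Proposed changes:

- Add or update a global workflow document: add a `credential-handoff-browser-workflow` section to `/Users/bytedance/.codex/AGENTS.md` or to Night Gym's pattern library, making three action boundaries explicit: navigation and copying of public configuration that the agent may do on the user's behalf; purchases, authorizations, and saving one-time credentials that the user must confirm; and account passwords, MFA, and payment-sensitive information that the agent never fills in.
- Add a `credential-handoff` skill whose entry description is "use for browser login, OAuth, reading keys from a cloud console, and API credits/permission upgrades". The skill contains pre-checks, a handoff message template, the recovery steps after the user finishes, a convention for where credentials are persisted, and the diagnostic branches on failure.
- Add a minimal smoke script template for common cloud/API projects, for example `scripts/smoke_credentials.py` or a project-local `tools/check_credentials.*`: read `.env`, the system keychain, or a config file; check that required variables exist; call a least-privilege API; and output stable signals such as `missing_key`, `permission_denied`, `tenant_mismatch`, `callback_not_configured`, and `credits_required`.
- Require a "handoff checklist" in the work log of every browser/OAuth task: the current page URL, the action the user needs to complete, the files/environment variables/API the agent must verify afterwards, and any failure screenshot or error code. A resumed session then does not have to guess the login and permission state from scratch.
- Example file: `out/examples/credential-handoff-browser-workflow.md`

## 3. Approval

Run `cd out && ./approve.sh <candidate-id>` to mark a candidate as approved.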
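
To make the success/failure signals in 2.1 concrete, the comparison between the default request and the `enable_thinking:false` request could be reduced to a small classifier like the one below. This is only a sketch under stated assumptions: the `ProbeResult` fields, the verdict names, and the 5x speedup threshold are hypothetical and not part of any existing tool; the two sample measurements are the figures quoted in the 2.1 evidence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProbeResult:
    """One /v1/chat/completions measurement (field names are illustrative)."""
    http_status: int
    total_time_s: float
    completion_tokens: int
    reasoning_content: Optional[str]


def classify_thinking_latency(default: ProbeResult, no_thinking: ProbeResult,
                              speedup_threshold: float = 5.0) -> str:
    """Return a coarse verdict from the default vs. enable_thinking:false pair.

    The threshold and the verdict strings are assumptions made for this sketch.
    """
    if default.http_status != 200 or no_thinking.http_status != 200:
        return "endpoint_error"
    if no_thinking.reasoning_content is not None:
        # The flag was sent, but the server still produced reasoning output.
        return "thinking_flag_ignored"
    speedup = default.total_time_s / max(no_thinking.total_time_s, 1e-9)
    if default.reasoning_content is not None and speedup >= speedup_threshold:
        return "thinking_dominates_latency"
    return "latency_not_explained_by_thinking"


# Figures from the session evidence; the reasoning text is a placeholder.
default_run = ProbeResult(200, 104.89, 213, "<reasoning text>")
no_thinking_run = ProbeResult(200, 3.21, 16, None)
print(classify_thinking_latency(default_run, no_thinking_run))
```

With the quoted measurements the speedup is roughly 33x, so the sketch prints `thinking_dominates_latency`. A real tool would fill `ProbeResult` from actual HTTP responses rather than hard-coded values.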
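
The resume-safe batch proposed in 2.2 can be illustrated with a minimal sketch. It assumes a hypothetical `unfollow` callable standing in for the real X API client, and it simplifies the proposal by stopping at the first failure and returning the remaining accounts instead of writing `remaining.csv` and `rate_limit_state.json`. The point it shows is that each status is appended to `audit.jsonl` immediately, so progress survives a mid-batch 429.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def append_audit(path: Path, account: str, status: str, error: str = "") -> None:
    """Append one JSON line right away so progress survives a crash."""
    record = {"account": account, "status": status, "error": error}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def run_batch(accounts: Iterable[str], unfollow: Callable[[str], None],
              audit: Path) -> List[str]:
    """Unfollow in order; stop at the first failure and return what is left."""
    pending = list(accounts)
    for i, account in enumerate(pending):
        append_audit(audit, account, "pending")
        try:
            unfollow(account)
        except Exception as exc:  # e.g. an HTTP 429 surfaced by the client
            append_audit(audit, account, "failed", str(exc))
            return pending[i:]  # the failed account is retried on the next run
        append_audit(audit, account, "unfollowed")
    return []


def fake_unfollow(account: str) -> None:
    """Hypothetical stand-in for the API call; fails once to mimic a 429."""
    if account == "carol":
        raise RuntimeError("429 Too Many Requests")


with tempfile.TemporaryDirectory() as tmp:
    audit_path = Path(tmp) / "audit.jsonl"
    remaining = run_batch(["alice", "bob", "carol", "dave"], fake_unfollow, audit_path)
    lines = audit_path.read_text(encoding="utf-8").splitlines()
    statuses = [json.loads(line)["status"] for line in lines]

print(remaining)  # ['carol', 'dave']
print(statuses)   # pending/unfollowed twice, then pending, failed
```

Stopping at the first error is a deliberate simplification: after a rate-limit response, continuing would likely fail for every remaining account, so the sketch hands the rest back for a later batch.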
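
The redaction regression described in 2.4 could look roughly like the following. The actual rules in `privacy.py` are not shown in this report, so the patterns here are assumptions chosen to illustrate one way of avoiding over-redaction: the token class is ASCII-only, which means a match cannot run past a full-width comma into the Chinese body text (in Python 3, `\w` would match CJK characters and could swallow the evidence sentence).

```python
import re

# ASCII-only token class: it cannot match a full-width comma or CJK text,
# so redaction stops exactly where the secret ends. (Assumed rule, not the
# real privacy.py implementation.)
_TOKEN = r"[A-Za-z0-9._\-]+"
_PATTERNS = [
    (re.compile(r"(Bearer\s+)" + _TOKEN), r"\1[REDACTED]"),
    (re.compile(r"([A-Z0-9_]*KEY=)" + _TOKEN), r"\1[REDACTED]"),
]


def redact(text: str) -> str:
    """Replace secret values while keeping the label and the following text."""
    for pattern, replacement in _PATTERNS:
        text = pattern.sub(replacement, text)
    return text


# The two regression samples named in 2.4.
samples = [
    "Bearer sk-xxx，后面是中文瓶颈描述",
    "API_KEY=xxx，后面是证据句",
]
for sample in samples:
    print(redact(sample))
```

Both samples keep everything after the Chinese comma: the output is `Bearer [REDACTED]，后面是中文瓶颈描述` and `API_KEY=[REDACTED]，后面是证据句`. In a test file these would become assertions rather than prints.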