Video Script Writer

Generate timestamped Chinese narration scripts for videos with validation.

Installation

  1. Make sure Claude is on your device and in your terminal.

    Skills load from ~/.claude/skills/ when Claude Code starts up — so you need it on your machine first. If you don't have it yet, install it once with the command below, then run claude in any terminal to verify.

    One-time setup
    npm i -g @anthropic-ai/claude-code

    Already have it? Skip ahead.

  2. Paste into Claude Code or into your terminal.

    This copies the whole skill folder into ~/.claude/skills/video-script-worldwonderer/ — the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.

    Faster alternative (instruction-only skills)

    Skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates — they won't be downloaded and the skill will fail when it tries to load them.

    Quick install (SKILL.md only)
  3. Restart Claude Code.

    Quit and reopen Claude Code (or any other agent that loads from ~/.claude/skills/). New skills are picked up on startup.

  4. Just ask Claude.

    Skills auto-activate when your request matches the skill's description, so no slash command is needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “When Claude uses it” section below.

Prefer to read the source first? Open on GitHub.

When Claude uses it

Write a timestamped Chinese narration script (解说词 / 旁白) for an already-analyzed video, then lint and validate it. Use it after video-understanding has produced agent_narration_brief.md and vlm_analysis.json, when you need to author the recap narration (style, anti-hallucination, character-count formula, density, hook/throughline). Input: the understanding index in work_dir. Output: narration.json (validated). Trigger phrases: 解说词, 写解说, 视频旁白, narration script, 写稿, 解说文案.

What this skill does

Authoring + validation of the narration script. The agent writes work_dir/narration.json following the rules below; then validate.py lints it against the understanding index, and in full mode time-aligns it to quiet windows.

Step 1 — read the brief

Read work_dir/agent_narration_brief.md (scenes, durations, quiet windows, char budget) first. Digest long dialogue via asr_writing_chunks.json; judge "is there speech/a silent slot here?" via timeline_fusion.json. Check raw vlm_analysis.json / asr_result.json for details. All timestamps are original-video time.

Step 2 — write narration.json

[
  {"start": 5.0, "end": 12.0, "narration": "解说文本。", "pause_after_ms": 250, "overlaps_speech": true}
]
Field              Meaning
start / end        narration start/end in seconds (original-video time)
narration          narration text
pause_after_ms     pause after the segment in milliseconds; default 250 (keeps a tight rhythm)
overlaps_speech    whether the segment overlaps original dialogue; default true for continuous-bed style, false only in true silence
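
The field table can be read as a small schema. The sketch below is illustrative only: it fills in the two documented defaults and checks the basic shape of each segment. The function name `normalize_segments` and the error messages are hypothetical, and this is not what validate.py does internally.

```python
import json

# Defaults taken from the field table; everything else here is an assumption.
DEFAULTS = {"pause_after_ms": 250, "overlaps_speech": True}
REQUIRED = ("start", "end", "narration")


def normalize_segments(raw_json: str) -> list:
    """Parse narration.json text, apply documented defaults, check basic shape."""
    segments = json.loads(raw_json)
    if not isinstance(segments, list):
        raise ValueError("narration.json must be a JSON array of segments")
    normalized = []
    for i, seg in enumerate(segments):
        for key in REQUIRED:
            if key not in seg:
                raise ValueError(f"segment {i} is missing '{key}'")
        if seg["end"] <= seg["start"]:
            raise ValueError(f"segment {i} has end <= start")
        # Explicit values in the segment win over the defaults.
        normalized.append({**DEFAULTS, **seg})
    return normalized


example = '[{"start": 5.0, "end": 12.0, "narration": "解说文本。"}]'
print(normalize_segments(example))
```

Running a check like this before review.py or validate.py can catch a missing field early, but validate.py remains the authority on what counts as valid.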

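Writing rules (high-density recap over a continuous original-audio bed)

  1. Narrate continuously: cover the whole timeline with short, punchy beats; the original audio stays underneath as a ducked background throughout.
  2. Hit the density target: follow the target given in the brief header (about 9.6 segments per minute, minimum 6.24), and keep the gap between adjacent beats to 11 seconds or less.
  3. Overlap the original audio by default: overlaps_speech defaults to true; set it to false only for beats deliberately placed in a truly silent slot.
  4. Keep each segment short: about one short sentence (1-2 subtitle lines); when in doubt, go shorter. Character count ≤ (end - start - 0.25) × 3.
  5. Don't describe the picture: viewers can already see the actions and expressions, so narrate motive, relationships, subtext, and plot significance (grounded in visible evidence, never invented).
  6. Use known character names: when --context or background_research.json supplies character names, prefer them.
  7. Write complete sentences: end with a full stop, question mark, or exclamation mark; no half sentences.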
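
Rules 2 and 4 above are arithmetic, so a quick self-check is possible before running the real lint. The sketch below is a minimal illustration under stated assumptions: it counts every character of the narration (punctuation included) with `len`, and it measures the gap from the previous beat's end to the next beat's start. validate.py may count or measure differently, and the helper names are hypothetical.

```python
import math

PAUSE_S = 0.25     # seconds subtracted in the rule-4 formula
CHARS_PER_S = 3    # multiplier in the rule-4 formula
MAX_GAP_S = 11.0   # rule 2: largest allowed gap between adjacent beats


def char_budget(start: float, end: float) -> int:
    """Largest whole count satisfying chars <= (end - start - 0.25) * 3."""
    # The tiny epsilon guards against floating-point noise at exact boundaries.
    return max(0, math.floor((end - start - PAUSE_S) * CHARS_PER_S + 1e-9))


def check_beats(beats: list) -> list:
    """Return (index, problem) pairs for over-budget text and over-long gaps."""
    problems = []
    for i, beat in enumerate(beats):
        budget = char_budget(beat["start"], beat["end"])
        used = len(beat["narration"])  # assumption: punctuation counts too
        if used > budget:
            problems.append((i, f"{used} chars > budget {budget}"))
        if i > 0:
            gap = beat["start"] - beats[i - 1]["end"]  # assumption: end-to-start
            if gap > MAX_GAP_S:
                problems.append((i, f"gap {gap:.1f}s > {MAX_GAP_S}s"))
    return problems


def beats_per_minute(beats: list, duration_s: float) -> float:
    """Density to compare against the target in the brief header."""
    return len(beats) / (duration_s / 60.0)


print(char_budget(5.0, 12.0))  # (12 - 5 - 0.25) * 3 = 20.25, so 20
```

For the 5.0 to 12.0 second segment from the narration.json example, the budget comes out at 20 characters.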

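Narration craft (what separates "narration" from "subtitles")

  • Hook: use the first 1-2 beats to create suspense or stakes, not to set the scene.
  • Throughline: pick one throughline (a goal, a relationship, or a suspense question) and make every beat advance it.
  • Escalation: raise information and tension step by step.
  • Suspense gap: plant a consequence early and pay it off later.
  • Ending: use the last 1-2 beats to deliver an outcome or a twist; avoid a generic wrap-up.
  • Give information rather than reading out the picture. Cut filler: use concrete nouns and verbs, and delete vague adjectives.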

Step 2.5 — review GATE (advisory, logged, overridable)

A separate quality pass (LLM-as-judge), distinct from the mechanical lint below. Needs the chat API key (same as VLM).

  1. Run: python3 scripts/review.py --work-dir <work_dir>
  2. Open narration_review.md. For every error finding (ESPECIALLY category=hallucination — a claim not grounded in the visual/ASR evidence), revise narration.json and re-run review until either:
    • (a) verdict == OK with zero error findings, OR
    • (b) you consciously OVERRIDE a remaining finding (next step).
  3. To OVERRIDE: append a block to work_dir/narration_review_override.md naming WHICH finding (segment + category), WHY it is acceptable, and who signed off. Unaddressed error findings with no override entry mean the draft is NOT ready.
  4. Only then proceed to Step 3 (validate.py — the hard gate).

GATE rule: review NEVER blocks the tooling (it leans on a flaky chat API and a re-render is cheap). validate.py is the deterministic hard gate. The override log makes "we saw the finding and chose to ship it" auditable — review.py / validate.py never read it; it is a record for the human in the loop.

Override block shape — work_dir/narration_review_override.md (append-only):

## Override — <date>
- Finding: segment 4 / category=hallucination
- Reviewer said: "‘他早已知情’ (‘he knew all along’) has no visual or dialogue evidence"
- Decision: KEEP — grounded in the --context synopsis (s2 reveal); reviewer lacked that context.
- Signed: <agent/human>
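
Because the override log is append-only, it can help to write entries through a small helper rather than editing the file by hand. The following is a sketch under assumptions: the function name `append_override` and its parameters are hypothetical, and the ISO date format is a guess at what `<date>` should hold. Only the block shape is taken from the example above.

```python
from datetime import date
from pathlib import Path


def append_override(work_dir, segment, category, reviewer_said, decision, signed, on=None):
    """Append one override block to narration_review_override.md; return its text."""
    on = on or date.today()
    block = (
        f"## Override — {on.isoformat()}\n"  # heading shape copied from the example
        f"- Finding: segment {segment} / category={category}\n"
        f'- Reviewer said: "{reviewer_said}"\n'
        f"- Decision: {decision}\n"
        f"- Signed: {signed}\n"
        "\n"
    )
    path = Path(work_dir) / "narration_review_override.md"
    # Mode "a" only appends, so earlier override entries are never rewritten.
    with path.open("a", encoding="utf-8") as fh:
        fh.write(block)
    return block
```

A call such as `append_override(work_dir, 4, "hallucination", reviewer_quote, "KEEP: grounded in the --context synopsis", "agent")` would add one entry. Since review.py and validate.py never read the file, nothing in the tooling depends on this exact layout.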

Step 3 — validate

python3 scripts/validate.py --work-dir <work_dir> --mode full   # or --mode cut

Writes narration_lint.json; in full mode rewrites narration.json with quiet-window alignment. Fix any lint errors and re-run until clean.

Cut mode (long video → short recap)

Before narration, write work_dir/clip_plan.json (original-time source ranges to keep), optionally self-review it in clip_plan_review.md (agent-only; the tooling does not read it), then write narration.json with timestamps that fall inside the kept clips. The video-cut skill maps both to the shortened timeline. Validate with --mode cut.

{"target_duration": "10m", "clips": [{"start": 12.0, "end": 38.0, "reason": "冲突开端"}]}

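Cut-mode writing notes (the narration must match the cut footage, not the original video):

  • Write to the final-cut duration: the number of beats matches the target cut (the brief header has already computed it from the target duration), not the length of the original. Any beat whose midpoint falls outside every kept clip is dropped, so don't write narration over passages that were cut.
  • Keep each segment inside a single clip: a beat's [start, end] must not cross a clip boundary; otherwise it is trimmed to the clip length and the voice-over ends up reading over removed footage (--mode cut warns with crosses_clip_boundary).
  • Narrate in final-cut order: order beats by the order in which their clips appear in the final cut, so the narration forms one continuous line over the cut footage.
  • Repeated source ranges: when clips reuse overlapping source ranges, tag each beat with source_clip_id so it maps to the correct clip.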
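
The midpoint and clip-boundary rules above can be checked by hand before running `--mode cut`. The sketch below is a minimal illustration, not the logic of validate.py or video-cut: the function names are hypothetical, clip bounds are treated as inclusive, and `source_clip_id` is ignored, so it assumes the kept clips do not overlap.

```python
def clip_containing(t: float, clips: list):
    """Index of the first kept clip whose [start, end] contains t, else None."""
    for i, clip in enumerate(clips):
        if clip["start"] <= t <= clip["end"]:
            return i
    return None


def check_cut_beat(beat: dict, clips: list) -> str:
    """Classify a beat as 'dropped', 'crosses_clip_boundary', or 'ok'."""
    midpoint = (beat["start"] + beat["end"]) / 2.0
    idx = clip_containing(midpoint, clips)
    if idx is None:
        # Midpoint lies outside every kept clip: the beat would be discarded.
        return "dropped"
    clip = clips[idx]
    if beat["start"] < clip["start"] or beat["end"] > clip["end"]:
        # Midpoint is inside a clip but one edge spills over its boundary.
        return "crosses_clip_boundary"
    return "ok"


clips = [{"start": 12.0, "end": 38.0}]
for beat in ({"start": 14.0, "end": 20.0},
             {"start": 34.0, "end": 40.0},
             {"start": 36.0, "end": 42.0}):
    print(beat, check_cut_beat(beat, clips))
```

With the single kept clip from the clip_plan.json example, the three beats come out as ok, crosses_clip_boundary, and dropped respectively.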

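When the title or genre is clear but plot context is missing, first follow the background research guide to produce background_research.json, and only then write the narration; otherwise the narration can do no more than describe what is on screen. When the substrate is thin, the brief turns the density target into a ceiling rather than a quota: write less and stay factual rather than padding with picture description to hit a number.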

What this skill does NOT do

  • Does NOT run ASR/VLM or analyze the video — it consumes the understanding index.
  • Does NOT synthesize audio or render video.
  • review.py does NOT edit narration.json and does NOT block the pipeline — it is advisory.
  • validate.py does NOT rewrite the meaning of the text — it only checks/aligns timing and quiet windows.
