Codex History Ingest
Import your past Codex CLI sessions into your Obsidian wiki for knowledge mining.
Installation
- Make sure Claude is on your device and in your terminal.
Skills load from
~/.claude/skills/when Claude Code starts up — so you need it on your machine first. If you don't have it yet, install it once with the command below, then runclaudein any terminal to verify.One-time setupnpm i -g @anthropic-ai/claude-codeAlready have it? Skip ahead.
- Paste into Claude Code or into your terminal.
This copies the whole skill folder into
~/.claude/skills/codex-history-ingest-ar9av/— the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.Faster alternative (instruction-only skills)
Skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates — they won't be downloaded and the skill will fail when it tries to load them.
Quick install (SKILL.md only)Sign up to copy - Restart Claude Code.
Quit and reopen Claude Code (or any other agent that loads from
~/.claude/skills/). New skills are picked up on startup. - Just ask Claude.
Skills auto-activate when your request matches the skill's description — no slash command needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “What this skill does” section above.
Prefer to read the source first? Open on GitHub.
When Claude uses it
Ingest Codex CLI conversation history into the Obsidian wiki. Use this skill when the user wants to mine their past Codex sessions for knowledge, import their ~/.codex folder, extract insights from previous coding sessions, or says things like "process my Codex history", "add my Codex conversations to the wiki", or "what have I discussed in Codex before". Also triggers when the user mentions .codex sessions, rollout files, session_index.jsonl, or Codex transcript logs.
What this skill does
Codex History Ingest — Conversation Mining
You are extracting knowledge from the user's past Codex sessions and distilling it into the Obsidian wiki. Session logs are rich but noisy: focus on durable knowledge, not operational telemetry.
This skill can be invoked directly or via the wiki-history-ingest router (/wiki-history-ingest codex).
Before You Start
- Resolve config — follow the Config Resolution Protocol in
llm-wiki/SKILL.md(walk up CWD for.env→~/.obsidian-wiki/config→ prompt setup). This givesOBSIDIAN_VAULT_PATHandCODEX_HISTORY_PATH(defaults to~/.codex) - Read
.manifest.jsonat the vault root to check what has already been ingested - Read
index.mdat the vault root to understand what the wiki already contains
Ingest Modes
Append Mode (default)
Check .manifest.json for each source file. Only process:
- Files not in the manifest (new session rollouts, new index files)
- Files whose modification time is newer than
ingested_atin the manifest
Use this mode for regular syncs.
Full Mode
Process everything regardless of manifest. Use after wiki-rebuild or if the user explicitly asks for a full re-ingest.
Codex Data Layout
Codex stores local artifacts under ~/.codex/.
~/.codex/
├── sessions/ # Session rollout logs by date
│ └── YYYY/MM/DD/
│ └── rollout-<timestamp>-<id>.jsonl
├── archived_sessions/ # Archived rollout logs
├── session_index.jsonl # Lightweight index of thread id/name/updated_at
├── history.jsonl # Local transcript history (if persistence enabled)
├── config.toml # User config (contains history settings)
└── state_*.sqlite / logs_*.sqlite # Runtime DBs (usually skip)
Key data sources ranked by value
session_index.jsonl— best inventory source for IDs, titles, and freshnesssessions/**/rollout-*.jsonl— rich structured transcript eventshistory.jsonl— useful fallback/timeline aid if enabled
Avoid ingesting SQLite internals unless the user explicitly asks.
Step 1: Survey and Compute Delta
Scan CODEX_HISTORY_PATH and compare against .manifest.json:
~/.codex/session_index.jsonl~/.codex/sessions/**/rollout-*.jsonl~/.codex/archived_sessions/**(optional; only if user asks for archived history)~/.codex/history.jsonl(optional fallback)
Classify each file:
- New — not in manifest
- Modified — in manifest but file is newer than
ingested_at - Unchanged — already ingested and unchanged
Report a concise delta summary before deep parsing.
Step 2: Parse Session Index First
session_index.jsonl typically has entries like:
{"id":"...","thread_name":"...","updated_at":"..."}
Use it to:
- Build a canonical session inventory
- Prioritize recent/high-signal sessions
- Map rollout IDs to human-readable thread names
Step 3: Parse Rollout JSONL Safely
Each rollout-*.jsonl line is an event envelope with:
{
"timestamp": "...",
"type": "session_meta|turn_context|event_msg|response_item",
"payload": { ... }
}
Extraction rules
- Prioritize user intent and assistant-visible outputs
- Favor
response_itemrecords with user/assistant message content - Use
event_msgselectively for meaningful milestones; ignore pure telemetry - Treat
session_metaas metadata (cwd, model, ids), not user knowledge
Skip/noise filters
- Token accounting events
- Tool plumbing with no semantic content
- Raw command output unless it contains reusable decisions/patterns
- Repeated plan snapshots unless they add novel decisions
Critical privacy filter
Rollout logs can include injected instructions, tool payloads, and sensitive text. Do not ingest verbatim system/developer prompts or secrets.
- Remove API keys, tokens, passwords, credentials
- Redact private identifiers unless relevant and approved
- Summarize instead of quoting raw transcripts
Step 4: Cluster by Topic
Do not create one wiki page per session.
- Group by stable topics across many sessions
- Split mixed sessions into separate themes
- Merge recurring concepts across dates/projects
- Use
cwdfrom metadata to infer project scope
Step 5: Distill into Wiki Pages
Route extracted knowledge using existing wiki conventions:
- Project-specific architecture/process ->
projects/<name>/... - General concepts ->
concepts/ - Recurring techniques/debug playbooks ->
skills/ - Tools/services ->
entities/ - Cross-session patterns ->
synthesis/
For each impacted project, create/update projects/<name>/<name>.md (project name as filename, never _project.md).
Writing rules
- Distill knowledge, not chronology
- Avoid "on date X we discussed..." unless date context is essential
- Add
summary:frontmatter on each new/updated page (1-2 sentences, <= 200 chars) - Add confidence and lifecycle fields to every new page:
Leavebase_confidence: 0.42 lifecycle: draft lifecycle_changed: <ISO date today>lifecycleunchanged on update. - Add provenance markers:
^[extracted]when directly grounded in explicit session content^[inferred]when synthesizing patterns across events/sessions^[ambiguous]when sessions conflict
- Add/update
provenance:frontmatter mix for each changed page
Step 6: Update Manifest, Log, and Index
Update .manifest.json
For each processed source file:
ingested_at,size_bytes,modified_atsource_type:codex_rollout|codex_index|codex_historyproject: inferred project name (when applicable)pages_created,pages_updated
Add/update a top-level project/session summary block:
{
"project-name": {
"source_path": "~/.codex/sessions/...",
"last_ingested": "TIMESTAMP",
"sessions_ingested": 12,
"sessions_total": 40,
"index_updated_at": "TIMESTAMP"
}
}
Update special files
Update index.md and log.md:
- [TIMESTAMP] CODEX_HISTORY_INGEST sessions=N pages_updated=X pages_created=Y mode=append|full
hot.md — Read $OBSIDIAN_VAULT_PATH/hot.md (create from the template in wiki-ingest if missing). Update Recent Activity with a one-line summary — e.g. "Ingested 12 Codex sessions; surfaced recurring patterns in CLI tooling and shell scripting." Keep the last 3 operations. Update updated timestamp.
Privacy and Compliance
- Distill and synthesize; avoid raw transcript dumps
- Default to redaction for anything that looks sensitive
- Ask the user before storing personal/sensitive details
- Keep references to other people minimal and purpose-bound
Reference
See references/codex-data-format.md for field-level parsing notes and extraction guidance.
QMD Refresh After Vault Writes
QMD is a search index, not the source of truth. If $QMD_WIKI_COLLECTION is empty or unset, skip this step. Run it only after this skill has written or rewritten vault markdown. If QMD refresh fails, do not roll back the vault changes; report the QMD status separately.
Use $QMD_CLI if set; otherwise use qmd.
${QMD_CLI:-qmd} update
If the output says vectors are needed or embeddings may be stale, run:
${QMD_CLI:-qmd} embed
Verify the collection with either:
${QMD_CLI:-qmd} ls "$QMD_WIKI_COLLECTION"
or, when a specific page path is known:
${QMD_CLI:-qmd} get "qmd://$QMD_WIKI_COLLECTION/<page>.md" -l 5
Record one of:
QMD refreshed: update + embed + verifiedQMD refreshed: update only + verifiedQMD skipped: QMD_WIKI_COLLECTION unsetQMD skipped: qmd CLI unavailableQMD failed: <short error summary>
Related skills
Word Document Editor
anthropics
Create, edit, and format Word documents with tables, images, and tracked changes.
Skill Template
anthropics
A template for creating and configuring new Claude skills.
Kanban TUI
Zaloog
Manage tasks, boards, and workflows in your terminal with a CLI kanban tool.
Superpowers
obra
Jesse Vincent's full Claude Code methodology — TDD, brainstorming, git worktrees, all auto-triggering.