Author Strategy
Analyze a researcher's PubMed portfolio to understand their publication strategy and research focus.
Installation
- Make sure Claude Code is installed on your device and available in your terminal.
  Skills load from ~/.claude/skills/ when Claude Code starts up, so you need it on your machine first. If you don't have it yet, install it once with the command below, then run claude in any terminal to verify.
  One-time setup: npm i -g @anthropic-ai/claude-code
  Already have it? Skip ahead.
- Paste into Claude Code or into your terminal.
  This copies the whole skill folder into ~/.claude/skills/author-strategy-aperivue/: the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.
  Faster alternative (instruction-only skills): the quick install skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates; they won't be downloaded and the skill will fail when it tries to load them.
- Restart Claude Code.
  Quit and reopen Claude Code (or any other agent that loads from ~/.claude/skills/). New skills are picked up on startup.
- Just ask Claude.
  Skills auto-activate when your request matches the skill's description, with no slash command needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “What this skill does” section below.
When Claude uses it
PubMed author profile analysis. Author name → PubMed fetch → study-type classification → visualization → strategy report → optional trajectory-archetype classification.
What this skill does
/author-strategy — PubMed Author Strategy Analysis
Purpose
Analyze a researcher's PubMed publication portfolio to reverse-engineer their research strategy. Produces a CSV dataset, 7 visualizations, and a strategy report.
Prerequisites
- Python 3.10+ with biopython, pandas, matplotlib, seaborn, and pyyaml (PyYAML is required by the archetype classifier and the rubric renderer)
- Scripts: ${CLAUDE_SKILL_DIR}/fetch_pubmed.py, ${CLAUDE_SKILL_DIR}/analyze_patterns.py, ${CLAUDE_SKILL_DIR}/pubmed_parse.py (stdlib parser), ${CLAUDE_SKILL_DIR}/classify_archetypes.py, ${CLAUDE_SKILL_DIR}/render_archetype_doc.py
- Rubric: ${CLAUDE_SKILL_DIR}/references/trajectory_archetypes.yaml (canonical) and ${CLAUDE_SKILL_DIR}/references/trajectory_archetypes.md (generated)
Workflow
Step 1: Gather Input
Ask the user for:
- Author name (PubMed format, e.g., "Kim DK" or "Lee KS")
- Last name for position classification (auto-detected if ambiguous)
- Output directory (default: ~/.local/cache/author-strategy/{AuthorName}/)
Step 2: Fetch PubMed Data
python "${CLAUDE_SKILL_DIR}/fetch_pubmed.py" "{Author Name}" \
--last-name "{LastName}" \
--output "{output_dir}/data/{name}_publications.csv" \
--email "{user_email}"
Review the console summary (total count, study type distribution, author position). If count is 0, suggest alternative name formats (e.g., "Yon DK" vs "Yon D" vs "Yon Dong Keon").
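When the fetch returns 0 results, the alternative spellings to suggest can be generated mechanically. The sketch below is only an illustration: `pubmed_name_variants` is a hypothetical helper, not part of the skill's scripts, and it assumes the name is supplied family-name-first, as in the "Yon Dong Keon" example above.

```python
def pubmed_name_variants(full_name: str) -> list[str]:
    """Return alternative PubMed-style spellings for a 'Family Given [Given...]' name.

    Hypothetical helper: assumes the family name comes first.
    """
    parts = full_name.split()
    if len(parts) < 2:
        return [full_name.strip()] if parts else []
    family, given = parts[0], parts[1:]
    initials = "".join(g[0].upper() for g in given)
    variants = [
        f"{family} {initials}",     # all initials, e.g. "Yon DK"
        f"{family} {initials[0]}",  # first initial only, e.g. "Yon D"
        " ".join(parts),            # full spelling as supplied
    ]
    # Remove duplicates while preserving the suggestion order
    return list(dict.fromkeys(variants))


if __name__ == "__main__":
    for candidate in pubmed_name_variants("Yon Dong Keon"):
        print(candidate)
```

Each variant can then be passed as the first argument to fetch_pubmed.py in turn until one returns a non-zero count.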
Step 3: Generate Visualizations and Report
python "${CLAUDE_SKILL_DIR}/analyze_patterns.py" "{output_dir}/data/{name}_publications.csv" \
--output-dir "{output_dir}/report/" \
--author-name "{Author Name}"
This produces:
- 7 PNG charts (01-07)
- analysis_report.md with strategy breakdown
Step 4: Interpret and Present
Read analysis_report.md and present to the user:
- Executive summary: total publications, growth trajectory, high-tier rate
- Primary strategy: what study type dominates and why
- Author position analysis: first/last positional rate vs middle (positional heuristic only — not leadership or corresponding-author metadata, which are unavailable here)
- Topic clusters: research focus areas
- ROI quadrant: which strategies yield high-tier + leadership vs. volume only
- Replication opportunities: which patterns are replicable with Claude Code + public databases
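The author position analysis in the list above is purely positional. A minimal sketch of such a heuristic follows; `author_position` is a hypothetical function, not the code in fetch_pubmed.py, and it assumes each author entry is formatted family-name-first (for example "Kim DK"). Returning "unknown" when several co-authors share the family name is a design choice made for this sketch.

```python
def author_position(authors: list[str], last_name: str) -> str:
    """Classify where last_name sits in an ordered author list.

    Returns "first", "middle", "last", or "unknown". This is a positional
    heuristic only; it says nothing about leadership or corresponding-author role.
    """
    target = last_name.casefold()
    # Assumption: entries look like "Kim DK", with the family name as the first token
    hits = [
        i for i, entry in enumerate(authors)
        if entry.split() and entry.split()[0].casefold() == target
    ]
    if len(hits) != 1:
        # No match, or several co-authors share the family name: do not guess
        return "unknown"
    index = hits[0]
    if index == 0:
        return "first"
    if index == len(authors) - 1:
        return "last"
    return "middle"


if __name__ == "__main__":
    byline = ["Kim DK", "Lee KS", "Park J"]
    for name in ("Kim", "Lee", "Park", "Choi"):
        print(name, author_position(byline, name))
```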
Step 5: Optional — MA Gap Identification
If the user asks "what MA topics are feasible with this professor?" (MA = meta-analysis):
- Cross-reference topic clusters with existing MA plans in memory
- Identify gaps where the professor has domain expertise but no MA published
- Output a prioritized list of MA proposals
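The gap identification above amounts to a set difference: topics where the author publishes often, minus topics that already have an SR/MA paper or an existing plan. The sketch below illustrates that idea under stated assumptions: the record keys `topic` and `study_type`, the `min_papers` threshold, and the `planned` argument are all hypothetical and do not reflect the skill's actual CSV schema.

```python
from collections import Counter


def ma_gap_topics(papers, min_papers=3, planned=frozenset()):
    """Return (topic, paper_count) pairs with no SR/MA yet, busiest topic first.

    papers: iterable of dicts with hypothetical keys "topic" and "study_type".
    planned: topics already covered by existing MA plans, excluded from the result.
    """
    papers = list(papers)
    per_topic = Counter(p["topic"] for p in papers)
    covered = {p["topic"] for p in papers if p["study_type"] == "SR/MA"}
    gaps = [
        (topic, count)
        for topic, count in per_topic.items()
        if count >= min_papers and topic not in covered and topic not in planned
    ]
    # Prioritize by volume (a rough proxy for domain expertise), then alphabetically
    return sorted(gaps, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    demo = (
        [{"topic": "asthma", "study_type": "NHIS/Claims"}] * 3
        + [{"topic": "covid", "study_type": "GBD"}] * 3
        + [{"topic": "covid", "study_type": "SR/MA"}]
    )
    print(ma_gap_topics(demo))
```

Paper volume is only a proxy for expertise, so the resulting list is a starting point for discussion with the user rather than a final ranking.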
Optional: Trajectory-Archetype Classification
A second, opt-in capability that classifies the author's trajectory into abstract
career archetypes (A1–A6 + a composite) as an explainable, multi-label,
confidence-scored heuristic — not an objective verdict. The rubric is the canonical
references/trajectory_archetypes.yaml. This path is gated: a surname alone does not
resolve an author, so the corpus must pass an explicit disambiguation review before it
can be classified.
Step 6: Disambiguation Gate (required before classification)
Pass disambiguators so the target author is uniquely attributed (a surname alone is never sufficient):
python "${CLAUDE_SKILL_DIR}/fetch_pubmed.py" "{Author Name}" \
--initials "{Initials}" --orcid "{ORCID}" \
--affiliation "{Institution}" --year-from "{YYYY}" --year-to "{YYYY}" \
--output "{output_dir}/data/{name}_publications.csv" --email "{user_email}"
This writes the CSV, a candidates.json of affiliation/year candidate clusters, and a
corpus_manifest.json with review_status: pending. Present the candidate clusters to
the user for review. The user decides include/exclude. Only after the user has reviewed
the clusters do you finalize and approve the corpus (the --approve flag is a human gate
— never set it without explicit user review/approval):
python "${CLAUDE_SKILL_DIR}/fetch_pubmed.py" "{Author Name}" \
--initials "{Initials}" --affiliation "{Institution}" \
--include-pmids "{included.txt}" --exclude-pmids "{excluded.txt}" --approve \
--output "{output_dir}/data/{name}_publications.csv" --email "{user_email}"
The manifest is cryptographically bound to the CSV (csv_sha256 + pmid_set_hash); the
classifier refuses to run on an unapproved or mismatched corpus.
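To make the binding concrete, here is a rough sketch of how a check like this could work. It is not the classifier's actual code. The manifest keys `review_status`, `csv_sha256`, and `pmid_set_hash` come from the text above; the `"approved"` status value, the `pmid` column name, and the way the PMID set is hashed are assumptions made for illustration.

```python
import csv
import hashlib
import json
from pathlib import Path


def file_sha256(path: Path) -> str:
    """SHA-256 of the file's raw bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def pmid_set_hash(pmids) -> str:
    """Order-independent hash of a PMID collection.

    Assumed scheme: SHA-256 over the sorted, de-duplicated PMIDs joined by newlines.
    """
    joined = "\n".join(sorted({str(p).strip() for p in pmids}))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def corpus_is_bound(csv_path: Path, manifest_path: Path) -> bool:
    """True only if the manifest is approved and matches the CSV on disk."""
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    with csv_path.open(newline="", encoding="utf-8") as handle:
        # Assumption: the CSV has a column named "pmid"
        pmids = [row["pmid"] for row in csv.DictReader(handle)]
    return (
        manifest.get("review_status") == "approved"  # assumed value; the text only shows "pending"
        and manifest.get("csv_sha256") == file_sha256(csv_path)
        and manifest.get("pmid_set_hash") == pmid_set_hash(pmids)
    )
```

Hashing both the file bytes and the PMID set means that editing the CSV after approval, or swapping in a different corpus, invalidates the approval and forces a fresh review.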
Step 7: Run the Classifier and Present
python "${CLAUDE_SKILL_DIR}/classify_archetypes.py" \
"{output_dir}/data/{name}_publications.csv" \
--manifest "{output_dir}/data/corpus_manifest.json" \
--rubric "${CLAUDE_SKILL_DIR}/references/trajectory_archetypes.yaml" \
--output-dir "{output_dir}/report/"
Read archetype_report.md and present it to the user, stating up front that the labels
are explainable heuristics, not objective classifications. For each surfaced archetype,
show the score, confidence band, and the author's own evidence PMIDs. Honor the [VERIFY]
markers (h-index/citation/venue-tier are unavailable) and the A5 participation flag. List
the insufficient evidence archetypes too.
To retune the rubric, edit only the YAML and regenerate the narrative doc:
python "${CLAUDE_SKILL_DIR}/render_archetype_doc.py" # regenerate the .md
python "${CLAUDE_SKILL_DIR}/render_archetype_doc.py" --check # CI/test sync gate
Study Type Classifier
The classifier is tuned for Korean epidemiology and public health researchers. Categories:
| Type | Detection Pattern |
|---|---|
| GBD | "global burden" or "gbd" in title/abstract |
| SR/MA | "systematic review" or "meta-analysis" |
| NHIS/Claims | "national health insurance", "nhis", "claims database", "nationwide cohort" |
| Cross-national | Country pairs or "cross-national"/"binational" |
| National survey | "knhanes", "nhanes", "kchs", "national survey" |
| Biobank | "biobank" |
| AI/ML | "machine learning", "deep learning", "artificial intelligence" |
| Clinical trial | "randomized" or publication type |
| Case report | "case report" |
| Letter/Commentary | Publication type = letter/comment/editorial |
Known limitation: The classifier may undercount NHIS studies when they appear in Cross-national or Other categories. The report notes this.
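The detection patterns in the table can be approximated with a small keyword matcher. The sketch below is an illustration, not the classifier shipped in fetch_pubmed.py: the first-match-wins ordering, the word-boundary matching, and the function name `classify_study_type` are assumptions, and country-pair detection for the Cross-national row is left out.

```python
import re

# (label, keywords) in table order; the first matching rule wins (assumed priority)
RULES = [
    ("GBD", ("global burden", "gbd")),
    ("SR/MA", ("systematic review", "meta-analysis")),
    ("NHIS/Claims", ("national health insurance", "nhis", "claims database", "nationwide cohort")),
    ("Cross-national", ("cross-national", "binational")),  # country-pair detection omitted
    ("National survey", ("knhanes", "nhanes", "kchs", "national survey")),
    ("Biobank", ("biobank",)),
    ("AI/ML", ("machine learning", "deep learning", "artificial intelligence")),
    ("Clinical trial", ("randomized",)),
    ("Case report", ("case report",)),
]
LETTER_TYPES = {"letter", "comment", "editorial"}


def classify_study_type(title, abstract="", pub_types=()):
    """Return a study-type label, or "Other" when nothing matches."""
    text = f"{title} {abstract}".lower()
    types = {t.lower() for t in pub_types}
    for label, keywords in RULES:
        # Word boundaries keep short tokens such as "gbd" from matching inside other words
        if any(re.search(rf"\b{re.escape(k)}\b", text) for k in keywords):
            return label
        if label == "Clinical trial" and any("trial" in t for t in types):
            return label
    if types & LETTER_TYPES:
        return "Letter/Commentary"
    return "Other"  # never guess a category


if __name__ == "__main__":
    print(classify_study_type("Global burden of asthma, 1990-2019"))
    print(classify_study_type("Reply to a recent article", pub_types=["Letter"]))
```

Because the first match wins, a cross-national paper that also mentions a nationwide cohort is labeled NHIS/Claims in this sketch; a different rule order would shift such papers between categories, which is the kind of sensitivity the known limitation describes.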
Known Limitations
- The study type classifier is tuned for epidemiology and public health researchers. May undercount specialized study types for other fields.
- NHIS studies may be undercounted when they appear in cross-national or "other" categories.
- PubMed search requires an email for NCBI E-utilities (set via the --email flag).
Anti-Hallucination
- Never fabricate publication counts, h-index, or journal metrics. All numbers must come from PubMed API output.
- Never invent study classifications. If a paper cannot be classified, label it as "Other" rather than guessing.
- If PubMed returns 0 results, suggest alternative name formats rather than generating fake data.
- Archetype labels are explainable heuristics, not objective classifications. Every label must carry a score, a confidence band, and evidence (the queried author's own PMIDs). Below the minimum sample or with conflicting signals, report insufficient evidence; never force a label.
- Metadata + stored abstract only. Signals are computed from PubMed metadata and the title/abstract text already fetched. Do not retrieve full text, follow external links, or resolve preprints. Signals that need citations, citation half-life, venue-impact tier, repository/preprint links, or corresponding-author role are unavailable and surface as [VERIFY]; they are never inferred.
- Author position is a positional heuristic (first/middle/last/unknown + real EqualContrib). Never present it as authoritative leadership or corresponding-author metadata.
- Never resolve an author by surname alone. Classification requires an approved, CSV-bound corpus_manifest.json; present candidate clusters for the user to confirm.
Output Structure
{output_dir}/
data/
{name}_publications.csv
candidates.json # disambiguation candidate clusters (Step 6)
corpus_manifest.json # review_status + csv_sha256 + pmid_set_hash (Step 6)
report/
analysis_report.md
01_yearly_stacked.png
02_study_type_pie.png
03_author_position.png
04_journal_tier_heatmap.png
05_topic_distribution.png
06_growth_curve.png
07_strategy_roi.png
archetype_report.md # trajectory-archetype classification (Step 7)
archetype_results.json # machine-readable labels + scores + evidence
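Before presenting results, it can help to confirm that a run actually produced the files listed in the tree above. The following is a small sketch of such a check; `missing_outputs` is a hypothetical helper, not something the skill ships, and the file names are copied from the tree.

```python
from pathlib import Path

# Files written by Steps 2-3 (names taken from the tree above)
CORE_OUTPUTS = [
    "report/analysis_report.md",
    "report/01_yearly_stacked.png",
    "report/02_study_type_pie.png",
    "report/03_author_position.png",
    "report/04_journal_tier_heatmap.png",
    "report/05_topic_distribution.png",
    "report/06_growth_curve.png",
    "report/07_strategy_roi.png",
]
# Files written only by the optional archetype path (Steps 6-7)
ARCHETYPE_OUTPUTS = [
    "data/candidates.json",
    "data/corpus_manifest.json",
    "report/archetype_report.md",
    "report/archetype_results.json",
]


def missing_outputs(output_dir, include_archetype=False):
    """Return the relative paths of expected files that are not present."""
    root = Path(output_dir)
    expected = CORE_OUTPUTS + (ARCHETYPE_OUTPUTS if include_archetype else [])
    missing = [rel for rel in expected if not (root / rel).is_file()]
    # The CSV name depends on the author, so match it by suffix
    if not list((root / "data").glob("*_publications.csv")):
        missing.append("data/{name}_publications.csv")
    return missing


if __name__ == "__main__":
    for rel in missing_outputs(".", include_archetype=True):
        print("missing:", rel)
```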