AugmentClaude

AAAI Artifact Evaluation

Package code, data, and appendices for AAAI conference artifact review.

Installation

  1. Make sure Claude is on your device and in your terminal.

    Skills load from ~/.claude/skills/ when Claude Code starts up — so you need it on your machine first. If you don't have it yet, install it once with the command below, then run claude in any terminal to verify.

    One-time setup
    npm i -g @anthropic-ai/claude-code

    Already have it? Skip ahead.

  2. Paste into Claude Code or into your terminal.

    This copies the whole skill folder into ~/.claude/skills/aaai-artifact-evaluation-brycewang-stanford/ — the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.

    Faster alternative (instruction-only skills)

    Skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates — they won't be downloaded and the skill will fail when it tries to load them.

    Quick install (SKILL.md only)
    Sign up to copy
  3. Restart Claude Code.

    Quit and reopen Claude Code (or any other agent that loads from ~/.claude/skills/). New skills are picked up on startup.

  4. Just ask Claude.

    Skills auto-activate when your request matches the skill's description — no slash command needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “What this skill does” section above.

Prefer to read the source first? Open on GitHub.

When Claude uses it

Use when packaging AAAI code, data, multimedia appendices, technical appendices, reproducibility evidence, and post-acceptance artifact releases without violating double-blind or immutable-supplement rules.

What this skill does

AAAI Artifact Evaluation

Use this to prepare artifacts that reviewers can use to assess reproducibility. AAAI supplementary material is part of the submission record; after review starts, do not assume it can be updated.

Artifact package

  • Provide a technical appendix for proofs, algorithms, assumptions, hyperparameters, and extended experiments.
  • Provide code/data ZIPs that reproduce main tables or figures, with a short README, environment, commands, seeds, expected outputs, and runtime.
  • Provide multimedia appendices only when they support the technical claim.
  • Remove author names, usernames, paths, repository history, cloud buckets, API keys, and metadata.
  • Avoid web pointers in the reviewed submission unless current rules explicitly allow them.
  • Include licensing and access notes for datasets, models, and third-party code.

AAAI-specific discipline

  • Treat the supplementary deadline as final.
  • Verify ZIP integrity before submission; missing or corrupted files may not be fixable during rebuttal.
  • Make the reproducibility checklist consistent with the artifact package.
  • Prepare a post-acceptance public release path but keep review artifacts anonymous.

What an AAAI reviewer actually opens

AAAI does not run a separate badged artifact-evaluation committee the way some systems venues do; the same broad-AI reviewer who scores the paper also inspects whatever supplement you attach. That reviewer may be a planning, knowledge-representation, or constraint-satisfaction specialist rather than a deep-learning engineer, so the artifact has to be legible without insider tooling. Optimize for a reviewer who skims, not one who will spend an afternoon configuring a cluster.

Reviewer actionPassesFails
Opens the ZIPsane tree, top READMEnested archives, 0-byte files
Reads appendixmaps to numbered claimscontradicts the paper
Tries one commandreproduces one headline numberneeds private data or credentials
Scans for identitynothing reveals authorsGit logs or home paths leak

Phase-1 artifact red flags

Because clearly-below-bar papers can be cut before author feedback, a supplement that looks thin or unrunnable is a cheap reason to summary-reject. Avoid these:

  • Checklist promises released code, but the ZIP only holds figures and no scripts.
  • A "see our repository" pointer to a mutable, deanonymizing URL.
  • Multimedia attached for spectacle that carries no technical claim, inflating size with no rigor.
  • Datasets shipped with no license note, leaving reuse legality unverifiable.

Worked vignette

A constraint-solving paper claims a 30% node-expansion reduction. The team ships a large ZIP of raw solver logs but no driver script. The reproduction path is empty, so artifact status is "risky"; the fix is a small run_main.py that regenerates Table 2 from seeds, a trimmed log sample, and a license for the benchmark instances. The raw dump moves to the post-acceptance release.

Output format

[Artifact status] complete / partial / risky / unavailable
[Submitted files] technical appendix / multimedia appendix / code-data ZIP
[Reviewer reproduction path] <commands and expected output>
[Anonymity risks] <metadata, links, paths, logs>
[Missing items] <data, code, seeds, licenses, hardware>

Related skills