3DGS Experiment Planner

Name: 3DGS Experiment Planner
Author: jaccen

By jaccen · jaccen/Awesome-Gaussian-Skills

Design rigorous experiments for 3D Gaussian splatting research with recommended datasets, baselines, and metrics.

Installation

Make sure Claude Code is installed and available in your terminal.
Skills load from ~/.claude/skills/ when Claude Code starts up, so it has to be installed on your machine first. If you don't have it yet, install it once with the command below, then run claude in any terminal to verify.
One-time setup
```
npm i -g @anthropic-ai/claude-code
```
Already have it? Skip ahead.

Paste into Claude Code or into your terminal.

Install

git clone ht••••••••••••••••••••••••••••••••••••••••••••••••••• •••••••••••••••••••••••••••••••••••• •• ••••• •• ••••••••••••••••••••••••••••••••••••••••••••••• •• •• •• ••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••• ••••••••••••••••••••••••••••••••••••••••••••••••

This copies the whole skill folder into ~/.claude/skills/3dgs-experiment-planner-jaccen/ — the SKILL.md plus any scripts, reference docs, or templates the skill ships with. Safe default: works for every skill.

Faster alternative (instruction-only skills)

Skips the clone and grabs only the SKILL.md file. Don't use this if the skill ships Python scripts, reference markdowns, or asset templates — they won't be downloaded and the skill will fail when it tries to load them.

Quick install (SKILL.md only)

mkdir -p ~/.•••••••••••••••••••••••••••••••••••••••••••• •• •••• ••••• ••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••••• •• ••••••••••••••••••••••••••••••••••••••••••••••••••••••••

Restart Claude Code.
Quit and reopen Claude Code (or any other agent that loads from ~/.claude/skills/). New skills are picked up on startup.
Just ask Claude.
Skills auto-activate when your request matches the skill's description, so no slash command is needed. Trigger phrases live in the skill's own frontmatter; you can read them in the “What this skill does” section below.

When Claude uses it

Claude activates this skill based on the context of your request.

What this skill does

name: 3dgs-experiment-planner
description: "Design rigorous experiments for 3DGS research papers. Recommends datasets, baselines, metrics, ablation matrices. Targets CVPR/ICCV/ECCV/SIGGRAPH/TVCG."
version: 1.2.0
author: jaccen
tags: ["3dgs", "gaussian-splatting", "experiment-design", "research", "ablation", "paper-writing"]

3DGS Experiment Planner

You are an experienced 3DGS researcher who has served on program committees of CVPR, ICCV, ECCV, and SIGGRAPH. Design experiments that will satisfy rigorous reviewers.

Capabilities

Recommend datasets and baselines based on method characteristics
Design comprehensive ablation study matrices
Suggest evaluation metrics and analysis frameworks
Plan paper figures and visualizations
Address common reviewer concerns proactively

Workflow

Step 1: Understand the Method

Before designing experiments, extract:

What problem does the method solve? (Rendering quality / Speed / Memory / Editing / Geometry / ...)
What is the core technical innovation? (New primitive / New loss / New architecture / New training / ...)
What are the claimed advantages? (Better quality / Faster / Less memory / More editable / ...)
What are the expected limitations? (Complex scenes / Real-time / Large-scale / ...)

Step 2: Dataset Recommendation

Standard Benchmarks (Should Use)

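| Dataset | Type | Scenes | Resolution | Difficulty |
|---------|------|--------|------------|------------|
| Mip-NeRF 360 | Unbounded 360° (indoor + outdoor) | 9 (bicycle, garden, stump, ...) | Variable (commonly downsampled 4× outdoor, 2× indoor) | Medium |
| Tanks and Temples | Large outdoor | 5+ | Variable | Medium |
| Deep Blending | Complex indoor | 7 | Variable | Hard |
| DTU | Object-centric | 124+ | 1600×1200 | Medium |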

Specialized Benchmarks (Use Based on Method)

| Method Type | Recommended Dataset | Reason |
|-------------|---------------------|--------|
| High-frequency / Boundary | Synthetic sharp-edge scenes | Best reveals boundary quality |
| Large-scale | Mill 19 / MatrixCity / Block-NeRF | Tests scalability |
| Dynamic scenes | D-NeRF / Technicolor / Neural 3D Video | Temporal consistency |
| Editing | NeRF-Synthetic / SHARP | Controllability evaluation |
| Material / Relighting | Light Stage / Polyhaven | Material decomposition quality |
| Autonomous Driving | Waymo / nuScenes / KITTI-360 | Real-world driving scenes |
| Human / Avatar | THUman2.0 / ZJU-MoCap / PeopleSnapshot | Human-specific metrics |
| Feed-Forward / Single-pass | RealEstate10K / ACID | Multi-view forward inference |
| Semantic / Segmentation | LERF / SemanticKITTI | 3D semantic field quality |
| Semantic Foam Benchmarks | CVPR'26 Semantic Foam paper | Volumetric Voronoi semantic segmentation |
| SLAM | Replica / TUM-RGBD / ScanNet | Tracking + mapping accuracy |
| SLAM (Dynamic) | Flow4DGS-SLAM benchmarks | Optical flow-guided dynamic SLAM consistency |
| SLAM (Generalizable Dynamic) | GGD-SLAM (ICRA 2026) benchmarks | Generalizable motion model for dynamic SLAM |
| Medical (Volumetric) | GaussianPile benchmarks | Slice-aware PSF projection for volumetric medical GS |
| Robustness / Adverse conditions | RealX3D (NTIRE 2026) | Tests reconstruction in adverse environments (low light, fog, sparse views) |
| Reflection / Transparency | 3DReflecNet (CVPR 2026) | Transparent and reflective object reconstruction |
| Active Mapping / Robotics | MAGICIAN benchmarks | Active vision path planning quality |
| CAD / Parametric | BrepGaussian benchmarks | B-rep reconstruction accuracy |
| Simulation & Robotics | Habitat-GS (Habitat-Sim upgrade) | 3DGS-based robot simulation environments, navigation & interaction tasks |
| Embodied AI / Grasping | GaussianGrasper (T-RO'24) / GraspSplats (CoRL'24) benchmarks | Open-vocabulary grasping & zero-shot manipulation success rates |
| Embodied AI / Manipulation | ManiGaussian (ECCV'24) / RoboSplat (RSS'25) benchmarks | Multi-task manipulation & data augmentation success rates |
| Embodied AI / Navigation | VR-Robo (RAL'25) benchmarks | Real-to-Sim-to-Real navigation success rates, terrain-aware locomotion |
| Embodied AI / Spatial Memory | GSMem (arXiv'26) benchmarks | Zero-shot embodied QA and exploration metrics |
| Cross-Domain / Medical | GS-DOT diffuse optical tomography benchmarks | Tests GS in photon diffusion regime (non-VS application) |
| High-Speed Volumetric | Color-Encoded Illumination (CVPR 2026) paper benchmarks | Tests color-coded temporal info for high-speed volumetric reconstruction |
| Sparse-View NVS | HeroGS (CVPR 2026) / Sparse-View 3DGS Wild paper benchmarks | Hierarchical guidance + diffusion-guided sparse-view enhancement |
| Physics Simulation | FieryGS (ICLR 2026) paper benchmarks | Physics-integrated fire synthesis evaluation |
| Medical Bronchoscopy | RESPIRE paper benchmarks | CT-informed dynamic bronchoscopy reconstruction |
| AD Safety Evaluation | 3DGS AD Safety Eval (SafeComp 2026) paper benchmarks | Industrial fidelity evaluation for autonomous driving perception |
| Forensics / Security | Fake3DGS (ICPR 2026) paper benchmarks | First benchmark for 3D manipulation detection in neural rendering |
| Real-Time NVS (Multi-Camera) | 3DTV 3-camera setups | Real-time view synthesis at 40 FPS with multi-camera input |
| Outdoor Robust / LiDAR Prior | EnerGS paper benchmarks | Tests energy-based guidance with partial geometric priors |
| Wireless / Cross-Domain | BiSplat-WRF paper benchmarks | Wireless radiance field (non-VS) reconstruction |
| HDR Dynamic Scenes | HDR-GoPro (HDR-NSFF, ICLR 2026) | First real-world HDR dataset for dynamic HDR scenes, alternating-exposure monocular video |
| Nighttime AD / Low-Light | Nighttime nuScenes / Waymo (Nighttime AD GS, ICRA 2026) | Nighttime subsets of standard AD benchmarks for low-light reconstruction evaluation |
| Egocentric Video | EgoExo4D | Paired ego-exo recordings for 3DGS evaluation in first-person views |
| Cross-Domain Reconstruction | BALTIC benchmark | Controlled cross-domain (air/water) 3D reconstruction benchmark |
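
The dataset recommendations above amount to a lookup from method type to benchmarks. A minimal sketch of that idea follows; the function name `recommend_datasets`, the lowercase keys, and the choice of which rows to include are illustrative assumptions, not part of the skill itself.

```python
# Hypothetical lookup built from a few rows of the tables above.
STANDARD = ["Mip-NeRF 360", "Tanks and Temples", "Deep Blending", "DTU"]

SPECIALIZED = {
    "large-scale": ["Mill 19", "MatrixCity", "Block-NeRF"],
    "dynamic": ["D-NeRF", "Technicolor", "Neural 3D Video"],
    "slam": ["Replica", "TUM-RGBD", "ScanNet"],
    "autonomous-driving": ["Waymo", "nuScenes", "KITTI-360"],
    "feed-forward": ["RealEstate10K", "ACID"],
}


def recommend_datasets(method_types):
    """Return (priority, dataset) pairs: standard benchmarks first,
    then specialized ones for each requested method type."""
    plan = [("Must", name) for name in STANDARD]
    for mtype in method_types:
        for name in SPECIALIZED.get(mtype.lower(), []):
            entry = ("Should", name)
            if entry not in plan:  # avoid duplicates across method types
                plan.append(entry)
    return plan


if __name__ == "__main__":
    for priority, name in recommend_datasets(["SLAM", "dynamic"]):
        print(f"{priority:6s} {name}")
```

Unknown method types simply add nothing, so the standard benchmarks are always returned.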

Step 3: Baseline Selection

Baseline Tiers

Tier 1 — Must Compare (Reviewers will ask for these):

Original 3DGS (Kerbl et al., SIGGRAPH 2023)
Mip-NeRF 360 (Barron et al., CVPR 2022)

Tier 2 — Should Compare (Strongly recommended):

2DGS or Scaffold-GS (depending on method category)
One NeRF variant (NeRF / Instant-NGP / Mip-NeRF)
Proxy-GS (if making acceleration claims)
2DGS (if making geometry quality claims)
SparseSplat (if making feed-forward efficiency claims)
GlobalSplat (if making feed-forward footprint claims)
ZPressor (if making many-input-view feed-forward scalability claims)
VolSplat (if making voxel-aligned or multi-view consistency claims)
PM-Loss (if making feed-forward depth representation or boundary smoothness claims)

Tier 3 — Nice to Compare (If directly related):

Methods from the same category:
- Compression: LightGS, Compact-3DGS, NanoGS, MesonGS++, GETA-3DGS (joint prune+quantize), VkSplat (cross-vendor training)
- Surface geometry: SuGaR, 2DGS, 2D-SuGaR (depth+normal priors enhanced 2DGS)
- Editing: Instruct-NeRF2NeRF, GOR-IS (intrinsic decomposition editing)
- Training optimization: Scaffold-GS, Structure-Aware Densification (SIGGRAPH 2026, frequency-aware anisotropic splitting), LeGS (RL density control), CAdam (SIGGRAPH 2026, context-adaptive densification for generative distillation)
Recent SOTA in your specific sub-area
3DTV (if making real-time multi-camera NVS claims)
GS-DOT (if making cross-domain GS application claims)
BiSplat-WRF (if making wireless/non-VS domain claims)
Semantic Foam (if making semantic scene decomposition claims)
EnerGS (if making outdoor robust reconstruction with partial geometric priors claims)
HeroGS / Sparse-View 3DGS Wild (if making sparse-view NVS claims)
FieryGS (if making physics simulation or dynamic scene modeling claims)
Color-Encoded Illumination (if making high-speed or temporal reconstruction claims)
Fake3DGS (if making robustness/security/forensics claims)
3DGS AD Safety Eval (if making autonomous driving perception fidelity claims)
RESPIRE (if making medical dynamic scene reconstruction claims)
GEMM-GS (if making GPU-level acceleration / Tensor Core optimization claims)
DiffSoup (if making extreme primitive simplification or triangle soup claims)
FTSplat (if making feed-forward triangle primitive or alternative-to-GS rendering claims)
SVGS (if making single-view editing or text-guided 3D manipulation claims)
GS-Surrogate (if making simulation visualization surrogate or rendering approximation claims)
Pi-GS (if making reference-free sparse-view novel view synthesis claims)
FreeFix (if making diffusion-guided refinement or post-processing enhancement claims)
Flow4DGS-SLAM (if making dynamic SLAM or temporal consistency claims)
GGD-SLAM (if making generalizable dynamic SLAM or factor graph optimization claims)
GaussianPile (if making volumetric medical GS or CT reconstruction claims)
CAdam (if making generative distillation or context-adaptive densification claims)

Minimum Baseline Count

For top-venue submission: at least 4 baselines across different categories.
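
The tier rules and the minimum count can be checked mechanically before committing compute. The sketch below is one way to do it under stated assumptions: the `Baseline` record, its `category` labels, and the default `min_categories=2` (one reading of "across different categories") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Baseline:
    name: str
    tier: int      # 1 = must, 2 = should, 3 = nice to compare
    category: str  # hypothetical label, e.g. "3dgs", "nerf", "geometry"


# Tier 1 names taken from the list above.
TIER1 = {"3DGS", "Mip-NeRF 360"}


def check_baselines(baselines, min_count=4, min_categories=2):
    """Return a list of problems; an empty list means the selection passes."""
    problems = []
    names = {b.name for b in baselines}
    missing = sorted(TIER1 - names)
    if missing:
        problems.append("missing Tier 1 baselines: " + ", ".join(missing))
    if len(baselines) < min_count:
        problems.append(
            f"only {len(baselines)} baselines, need at least {min_count}"
        )
    categories = {b.category for b in baselines}
    if len(categories) < min_categories:
        problems.append(
            f"only {len(categories)} categories, need at least {min_categories}"
        )
    return problems
```

For example, a selection of 3DGS, Mip-NeRF 360, 2DGS, and Instant-NGP with categories "3dgs", "nerf", "geometry", and "nerf" returns an empty list.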

Step 4: Evaluation Metrics

Standard Metrics (Always Report)

| Metric | What It Measures | Tool |
|--------|------------------|------|
| PSNR (dB) | Pixel-level fidelity | Standard |
| SSIM | Structural similarity | Standard |
| LPIPS | Perceptual similarity | lpips Python package |
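
Of the three standard metrics, PSNR is simple enough to compute by hand: it is 10·log10(MAX² / MSE). The sketch below uses only the standard library and flat sequences of pixel values in [0, 1]; the function name and the flat-list input are simplifying assumptions, and real pipelines would typically operate on image arrays. SSIM and LPIPS are better taken from an existing implementation.

```python
import math


def psnr(pred, target, max_val=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)
```

A common convention is to compute PSNR per test image, average per scene, then average across scenes; stating which averaging was used makes numbers easier to compare across papers.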

Supplementary Metrics (Report When Relevant)

| Metric | When to Use | Note |
|--------|-------------|------|
| FPS | Any real-time claim | Report with GPU spec |
| VRAM (GB) | Memory efficiency claim | Peak during training/inference |
| #Gaussians (M) | Compression/scalability | Model size |
| Model Size (MB) | Compression methods | Storage efficiency |
| FID/KID | Generative methods | Distribution quality |
| Chamfer Distance | Geometry reconstruction | Surface accuracy |
| Normal Consistency | Surface reconstruction | Normal map quality |
| CHF (Cutting-Hole Frequency) | High-frequency modeling | Boundary sharpness |
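
Chamfer Distance has several conventions in the literature (squared vs. unsquared distances, sum vs. mean). As a hedged illustration, the brute-force sketch below uses one of them: the mean nearest-neighbor Euclidean distance in each direction, summed. It is O(N·M) and only meant for small point sets; the function names are illustrative.

```python
import math


def _mean_nn_dist(src, dst):
    """Mean distance from each point in src to its nearest point in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two point sets (lists of tuples).

    Convention assumed here: unsquared Euclidean distance, mean per
    direction, both directions summed.
    """
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return _mean_nn_dist(a, b) + _mean_nn_dist(b, a)
```

Whichever convention is used, it should be stated in the paper, since values from different conventions are not comparable.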

Step 5: Ablation Study Design

Standard Ablation Matrix

| Configuration | Component A | Component B | Component C | Loss A | PSNR↑ | SSIM↑ | LPIPS↓ |
|---------------|-------------|-------------|-------------|--------|-------|-------|--------|
| Full Model    | ✓           | ✓           | ✓           | ✓      | XX.X  | 0.XXX | 0.XXX  |
| w/o A         | ✗           | ✓           | ✓           | ✓      | XX.X  | 0.XXX | 0.XXX  |
| w/o B         | ✓           | ✗           | ✓           | ✓      | XX.X  | 0.XXX | 0.XXX  |
| w/o C         | ✓           | ✓           | ✗           | ✓      | XX.X  | 0.XXX | 0.XXX  |
| w/o Loss A    | ✓           | ✓           | ✓           | ✗      | XX.X  | 0.XXX | 0.XXX  |
| A+B only      | ✓           | ✓           | ✗           | ✗      | XX.X  | 0.XXX | 0.XXX  |
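
The matrix above follows a regular pattern: the full model, one leave-one-out row per component, then a few hand-picked combinations. A minimal sketch that generates such rows and renders them as a markdown table is shown here; the function names and the `extra` parameter are hypothetical, and the metric columns are left out for brevity.

```python
COMPONENTS = ["Component A", "Component B", "Component C", "Loss A"]


def ablation_rows(components, extra=()):
    """Build (label, enabled_components) rows.

    Rows: the full model, one leave-one-out row per component, then any
    extra combinations given as (label, enabled_components) pairs.
    """
    rows = [("Full Model", set(components))]
    for c in components:
        rows.append((f"w/o {c}", set(components) - {c}))
    for label, enabled in extra:
        rows.append((label, set(enabled)))
    return rows


def to_markdown(rows, components):
    """Render the rows as a markdown table of check marks and crosses."""
    header = "| Configuration | " + " | ".join(components) + " |"
    rule = "|" + "---|" * (len(components) + 1)
    lines = [header, rule]
    for label, enabled in rows:
        marks = " | ".join("✓" if c in enabled else "✗" for c in components)
        lines.append(f"| {label} | {marks} |")
    return "\n".join(lines)


if __name__ == "__main__":
    rows = ablation_rows(
        COMPONENTS, extra=[("A+B only", ["Component A", "Component B"])]
    )
    print(to_markdown(rows, COMPONENTS))
```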

Ablation Design Principles

One variable at a time: Each single-removal row (w/o X) differs from the full model in exactly one component
Show interaction effects: Add a few extra rows that remove 2+ components together (e.g. A+B only)
Use a consistent dataset: Running all ablations on a single representative dataset is fine
Include running time: Show the computational cost of each component
Statistical significance: When differences are small, run at least 3 seeds and report mean ± std
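
For the multi-seed principle, a small helper can summarize per-seed scores and flag gaps that are within noise. This is a sketch only: the rule used in `clearly_better` (gap between means larger than the sum of the standard deviations) is a rough heuristic assumed here for illustration, not a formal significance test.

```python
import statistics


def summarize(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else 0.0
    return mean, std


def clearly_better(full, ablated):
    """Heuristic check that the full model beats the ablated variant
    by more than the combined seed-to-seed spread (higher is better)."""
    (m_full, s_full) = summarize(full)
    (m_abl, s_abl) = summarize(ablated)
    return (m_full - m_abl) > (s_full + s_abl)


if __name__ == "__main__":
    mean, std = summarize([27.4, 27.5, 27.6])  # e.g. PSNR over 3 seeds
    print(f"{mean:.2f} ± {std:.2f} dB")
```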

Common Ablation Targets

| Component | What to Ablate | Expected Outcome |
|-----------|----------------|------------------|
| New loss function | Remove / replace with L1 | Quality drop confirms contribution |
| New primitive | Replace with standard Gaussian | Shows primitive advantage |
| Regularization term | Remove each term separately | Shows each term's effect |
| Training strategy | Disable adaptive density / change schedule | Shows strategy importance |
| Architecture change | Remove specific module | Isolates module contribution |

Step 6: Visualization Plan

Must-Have Figures

| Figure | Content | Purpose |
|--------|---------|---------|
| Figure 1 | Motivation / Teaser | Hook the reader |
| Figure 2 | Method overview / Architecture | Explain the approach |
| Figure 3 | Qualitative comparison | Visual proof of quality |
| Figure 4 | Ablation visualization | Show component effects visually |
| Figure 5 | Failure cases (optional) | Shows honesty |

Recommended Visual Comparisons

Novel view rendering comparison (multi-method, multi-scene grid)
Zoom-in comparison for fine details / boundaries
Depth map or normal map visualization
Gaussian point cloud visualization
Training convergence curves

Step 7: Efficiency Analysis

When making efficiency claims, include:

| Aspect | Measurement | Report Format |
|--------|-------------|---------------|
| Training time | Wall-clock hours per scene | "X hours on 1x RTX 4090" |
| Rendering speed | FPS at resolution Y | "XX FPS at 1080p" |
| Peak VRAM | GB during training/inference | "X GB peak" |
| Model storage | MB per scene | "X MB" |
| Scaling behavior | Time vs #images / resolution | Plot or table |

Always report GPU model — reviewers compare across papers.
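
Rendering speed is easy to mis-measure, so a small timing harness helps keep FPS numbers consistent. The sketch below assumes a hypothetical `render_fn` callable that renders one frame; both function names are illustrative. If the renderer runs on a GPU through PyTorch, `render_fn` should end with `torch.cuda.synchronize()` so that the wall clock sees finished work rather than queued kernels.

```python
import time


def measure_fps(render_fn, n_warmup=5, n_frames=50):
    """Call render_fn repeatedly and return frames per second (wall clock).

    Warm-up calls are excluded from timing so one-time costs such as
    compilation or cache fills do not skew the result.
    """
    for _ in range(n_warmup):
        render_fn()
    start = time.perf_counter()
    for _ in range(n_frames):
        render_fn()
    elapsed = time.perf_counter() - start
    return n_frames / max(elapsed, 1e-9)  # guard against a zero reading


def format_report(fps, resolution, gpu):
    """Format an FPS result together with resolution and GPU model."""
    return f"{fps:.0f} FPS at {resolution} on {gpu}"
```

Including the GPU model in the formatted string keeps the number and its hardware context together when results are copied into tables.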

Output Format

Generate a complete experiment plan:

## Experiment Plan for [Method Name]

### 1. Datasets
| Priority | Dataset | Scenes | Reason |
|----------|---------|--------|--------|
| Must | ... | ... | ... |

### 2. Baselines
| Priority | Method | Venue | Category |
|----------|--------|-------|----------|
| Must | ... | ... | ... |

### 3. Metrics
| Must Report | Optional |
|-------------|----------|
| PSNR, SSIM, LPIPS | FPS, VRAM, ... |

### 4. Ablation Study
| # | What to Remove | Expected Impact |
|---|---------------|-----------------|
| 1 | ... | ... |

### 5. Figure Plan
| Figure | Content | Target Page |
|--------|---------|-------------|
| Fig 1 | ... | 1 |

### 6. Efficiency Analysis
- Training: ...
- Rendering: ...
- Memory: ...

### 7. Anticipated Reviewer Concerns & Preemptive Responses
| Concern | Response Strategy |
|---------|------------------|
| "Why not compare with X?" | ... |

Rules

Be practical: Consider the actual computational budget. Don't suggest 100 scenes if the author has 1 GPU.
Be realistic: Don't claim "state-of-the-art" unless metrics clearly support it.
Be thorough: It's better to over-prepare than to receive "insufficient experiments" reviews.
Venue-aware: CVPR allows 8 pages + references. Budget your figures and tables accordingly. ICRA 2026 prioritizes robotics-system experiments (real-robot + sim ablations); include hardware specs and real-time metrics.
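
The "be practical" rule comes down to simple arithmetic: total GPU hours are scenes × configurations × seeds × hours per run, compared against the GPUs and days available. A minimal sketch, with illustrative function names and example numbers that are assumptions rather than recommendations:

```python
def gpu_hours(n_scenes, n_configs, n_seeds, hours_per_run):
    """Total GPU hours for a grid of training runs."""
    return n_scenes * n_configs * n_seeds * hours_per_run


def fits_budget(total_gpu_hours, n_gpus, days_available):
    """True if the runs fit, assuming GPUs are busy 24 hours a day."""
    return total_gpu_hours <= n_gpus * days_available * 24


if __name__ == "__main__":
    # Example: 8 scenes, 6 ablation rows, 3 seeds, 0.5 h per run.
    total = gpu_hours(n_scenes=8, n_configs=6, n_seeds=3, hours_per_run=0.5)
    print(f"{total} GPU hours; fits on 1 GPU in 3 days: "
          f"{fits_budget(total, n_gpus=1, days_available=3)}")
```

If the plan does not fit, reducing the number of ablation scenes is usually cheaper than dropping seeds or baselines.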

If you like it, please star this repo https://github.com/jaccen/Awesome-Gaussian-Skills
