Project: EDA Analyser (Developer Track)
This project builds a Python EDA (exploratory data analysis) tool with a coding agent, using structured guardrails so the work doesn’t drift. The goal is to demonstrate that structured agentic coding is not slower than vibe-coding: it produces better results with fewer corrections because the agent has context before the first prompt.
The product is a CLI that accepts a CSV file and writes a Markdown report to profiles/{basename}-profile.md covering schema, missingness analysis, column distributions, and outlier flags.
flowchart TD
A["data.csv\ninput"]
B[".github/copilot-instructions.md\nPython 3.11, uv, polars, pytest, type hints"]
C["tests/.instructions.md\napplyTo: tests/**"]
D[".github/prompts/new-analyzer.prompt.md\n/new-analyzer command"]
E[".github/skills/dataset-profile/SKILL.md\n+ profile.py + profile-template.md"]
F[".vscode/mcp.json\nDuckDB MCP server"]
G[".github/agents/CriticReviewer.agent.md\nuser-invocable: false"]
H["hook: PostToolUse\nruff format on .py edits"]
I["profiles/data-profile.md\noutput report"]
A --> B --> C --> D --> E
E <-->|"SQL queries\nover CSV"| F
E --> G
G -->|"critique loop"| E
E --> H --> I
The critique loop (Skill → CriticReviewer → back to Skill) is where structured agentic coding earns its keep. Without it, the agent optimizes for "looks right" rather than "is correct".
Create .github/copilot-instructions.md declaring the toolchain. This is the single most important step. Without it, the agent will guess: pandas instead of polars, pip instead of uv, notebooks instead of a CLI. Write the rules before writing any code.
# EDA Analyser — Project Context
## Toolchain
- Python 3.11
- Package manager: uv (not pip, not poetry)
- DataFrames: polars (never pandas — no `import pandas`, no `pd.`)
- Tests: pytest with pytest-cov
- Type hints: mandatory on all functions and return types
- Linting and formatting: ruff
## Structure
src/eda_analyser/ — source code
tests/ — pytest tests (mirror src/ structure)
profiles/ — output profile reports (markdown)
## Rules
- No Jupyter notebooks in production code
- No raw SQL strings — use parameterized queries or the DuckDB MCP
- CLI entry point: src/eda_analyser/cli.py using Click
- Tests must not mock file I/O — use pytest's tmp_path fixture
Create tests/.instructions.md with applyTo: "tests/**". These TDD norms only load when the agent is writing test files. Scoping them prevents the agent from applying test conventions to source files and vice versa.
---
applyTo: "tests/**"
---
## Test Structure
Use Arrange-Act-Assert. Label each phase with a comment: # Arrange, # Act, # Assert.
## Fixtures
Use tmp_path for any temporary files. Never mock file I/O.
Use @pytest.mark.parametrize for multiple input variants of the same test case.
## Naming Convention
test_{function_name}_{scenario} — e.g., test_profile_csv_empty_input
## Required Coverage
Every new function in src/ needs at minimum:
- One test covering the happy path with a representative input
- One test covering an edge case (empty DataFrame, single row, all-null column)
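A minimal sketch of a test file that follows these norms. The function under test, count_data_rows, is hypothetical and is defined inline only to keep the example self-contained; in the project it would be imported from src/eda_analyser/. pytest injects tmp_path by argument name, so no fixture import is needed.

```python
import csv
from pathlib import Path


def count_data_rows(csv_path: Path) -> int:
    """Hypothetical function under test: number of rows after the header."""
    with csv_path.open(newline="") as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row if there is one
        return sum(1 for _ in reader)


def test_count_data_rows_happy_path(tmp_path: Path) -> None:
    # Arrange
    csv_path = tmp_path / "data.csv"
    csv_path.write_text("depth,GR\n100,45.2\n101,47.9\n")

    # Act
    result = count_data_rows(csv_path)

    # Assert
    assert result == 2


def test_count_data_rows_empty_input(tmp_path: Path) -> None:
    # Arrange: header only, no data rows
    csv_path = tmp_path / "empty.csv"
    csv_path.write_text("depth,GR\n")

    # Act
    result = count_data_rows(csv_path)

    # Assert
    assert result == 0
```

Both tests write real files under tmp_path instead of mocking file I/O, and the names follow test_{function_name}_{scenario}.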
Create .github/prompts/new-analyzer.prompt.md. This becomes /new-analyzer. Given a name and a one-line description, it scaffolds a complete analyzer module — source file, test file, and CLI registration — following all project conventions from the instructions files.
---
name: new-analyzer
description: Scaffold a new EDA analyzer module — source file, tests, and CLI registration
tools: ['codebase', 'edit']
argument-hint: "[analyzer-name] [one-line description]"
---
Read .github/copilot-instructions.md and tests/.instructions.md first.
Create three files:
1. src/eda_analyser/{name}.py — analyzer with full type hints, accepts polars.DataFrame
2. tests/test_{name}.py — happy path + edge case tests
3. Register the CLI command in src/eda_analyser/cli.py
The analyzer function signature: def analyze(df: pl.DataFrame) -> dict[str, Any]
Return a flat dict of findings — the profile skill will format them.
Create .github/skills/dataset-profile/ with SKILL.md, scripts/profile.py, and templates/profile-template.md. The skill runs a full profile pass on a CSV using DuckDB for SQL queries and writes a structured report.
SKILL.md:
---
name: dataset-profile
description: Run a full EDA profile on a CSV file and write a profile.md report.
  Use when given a CSV file path. Produces schema, missingness, distributions, outlier flags.
user-invocable: true
argument-hint: "[path/to/data.csv]"
---
## Steps
1. Use DuckDB MCP to inspect schema: SELECT * FROM read_csv_auto('{path}') LIMIT 5
2. Run missingness: SELECT COUNT(*) - COUNT(col) AS nulls FROM read_csv_auto('{path}')
3. Run distributions: min, max, mean, p25, p50, p75 for numerics; value_counts for categoricals
4. Flag outliers: values beyond 3 standard deviations from the column mean
5. Fill in templates/profile-template.md with all findings
6. Write output to profiles/{basename}-profile.md
7. Request CriticReviewer subagent review before finalizing the report
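The arithmetic behind step 4 can be sketched in plain Python. This is an illustration of the 3-standard-deviation rule only, not the contents of scripts/profile.py; the function name is hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the steps do not say which is meant.

```python
from statistics import mean, stdev


def flag_outliers(values: list[float], threshold: float = 3.0) -> list[float]:
    """Return values more than `threshold` sample standard deviations from the mean."""
    if len(values) < 2:
        return []  # stdev is undefined for fewer than two values
    centre = mean(values)
    spread = stdev(values)
    if spread == 0:
        return []  # constant column: no value deviates from the mean
    return [v for v in values if abs(v - centre) / spread > threshold]


readings = [10.0] * 19 + [100.0]
print(flag_outliers(readings))  # [100.0]
print(flag_outliers([1.0, 2.0, 3.0, 50.0]))  # []
```

The second call shows a limit of the rule on small inputs: with n values, no point can sit more than (n - 1) / sqrt(n) sample standard deviations from the mean, so a column needs at least 11 rows before anything can exceed 3. That is the kind of edge case the review step is there to catch.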
Create .vscode/mcp.json. The dataset-profile skill queries the CSV directly via DuckDB rather than loading it into Python first. The agent writes SQL queries and iterates on them — faster than writing polars aggregations for exploratory work.
{
  "servers": {
    "duckdb": {
      "command": "uvx",
      "args": ["mcp-server-duckdb", "--db-path", ":memory:"]
    }
  }
}
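Install: uvx fetches mcp-server-duckdb on first run, so no separate install step is required. To pre-install it, use uv tool install mcp-server-duckdb (uv rather than pip, in line with the project toolchain).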
DuckDB reads CSV directly — no import step:
-- Schema inspection
DESCRIBE SELECT * FROM read_csv_auto('data/well_logs.csv');
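-- Missingness for one column (repeat per column; COUNT(col) skips NULLs)
SELECT COUNT(*) AS total_rows, COUNT(*) - COUNT(GR) AS null_count
FROM read_csv_auto('data/well_logs.csv');
-- Or all columns at once: SUMMARIZE reports null_percentage per column
SUMMARIZE SELECT * FROM read_csv_auto('data/well_logs.csv');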
-- Distribution of a numeric column
SELECT
MIN(GR) AS min_val, MAX(GR) AS max_val,
AVG(GR) AS mean_val,
PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY GR) AS median_val
FROM read_csv_auto('data/well_logs.csv');
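To sanity-check a missingness figure without a database, the same per-column count can be computed with the standard library. A sketch, assuming that an empty field is what counts as missing (DuckDB's own null handling depends on its CSV options); the function name and the sample rows are made up for illustration.

```python
import csv
import io


def null_counts(csv_text: str) -> dict[str, int]:
    """Count empty fields per column in CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames or []}
    for row in reader:
        for name in counts:
            # Short rows yield None for absent fields; treat those as empty too.
            if not (row[name] or "").strip():
                counts[name] += 1
    return counts


sample = "depth,GR,RHOB\n100,45.2,2.31\n101,,2.35\n102,47.9,\n103,,\n"
print(null_counts(sample))  # {'depth': 0, 'GR': 2, 'RHOB': 2}
```

A test can compare this against the null count the SQL query returns for the same file.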
Create .github/agents/CriticReviewer.agent.md with user-invocable: false. The dataset-profile skill invokes this subagent to review the draft report before finalizing. The critic runs in an isolated context: it reads the output cold, without the implementation conversation, and catches assumptions the implementer stopped noticing.
---
name: CriticReviewer
description: Reviews EDA profile reports for statistical correctness, missing edge cases, and analytical blind spots.
  Use before finalizing any profile report to check for bias and incorrect assumptions.
tools: ['codebase']
model: claude-opus-4-7
user-invocable: false
---
You are a senior data scientist reviewing an EDA report. Read only — do not edit files.
Review for:
- Statistical correctness: does each metric measure what it claims?
- Missing edge cases: empty columns, all-null columns, mixed-type columns, single-row input
- Distribution assumptions: does the outlier detection handle skewed distributions? Log-normal? Categorical misclassified as numeric?
- Misleading aggregations: means on bimodal distributions, medians on sparse data
Return:
- PASS: findings are correct, report is ready to finalize
- WARN: minor issues with specific section references and suggested corrections
- FAIL: substantive issues that would produce misleading analysis — list each with its correction
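If a script rather than the agent has to act on the critic's reply, one option is to read the leading verdict and finalize only on PASS. A sketch under stated assumptions: the reply is assumed to start with PASS, WARN, or FAIL on its first non-empty line, and treating WARN as "revise before finalizing" is a choice made here, not something the agent file specifies. All names are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    WARN = "WARN"
    FAIL = "FAIL"


def parse_verdict(reply: str) -> Verdict:
    """Read the verdict from the first non-empty line of the critic's reply."""
    for line in reply.splitlines():
        token = line.strip().split(":", 1)[0].strip().upper()
        if token:
            return Verdict(token)  # raises ValueError on an unknown verdict
    raise ValueError("empty critic reply")


def ready_to_finalize(reply: str) -> bool:
    """Only PASS finalizes; WARN and FAIL send the draft back (an assumption)."""
    return parse_verdict(reply) is Verdict.PASS


print(ready_to_finalize("PASS: findings are correct"))  # True
print(ready_to_finalize("FAIL: mean reported on a bimodal column"))  # False
```

Raising on an unrecognized verdict keeps a malformed reply from being mistaken for approval.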
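Create .github/hooks/post-edit.json. After every Edit or Write tool call, the hook runs ruff format on the edited path, so Python files the agent touches are formatted without the agent having to remember. This is the evaluator layer: deterministic and unconditional. Note that 2>/dev/null || true suppresses failures, so a file ruff cannot parse is left as-is rather than blocking the agent.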
{
  "hooks": {
    "PostToolUse": [{
      "matcher": "Edit|Write",
      "hooks": [{
        "type": "command",
        "command": "ruff format \"$CLAUDE_TOOL_INPUT_PATH\" 2>/dev/null || true"
      }]
    }]
  }
}
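As written, the matcher selects on tool name rather than file extension. If formatting should be limited to .py edits, the path check can live in a small wrapper that the hook command calls instead of ruff directly. A sketch: the wrapper and its function names are hypothetical, and it assumes the edited path arrives as the first command-line argument.

```python
import subprocess
import sys
from pathlib import Path


def should_format(path: str) -> bool:
    """Only Python source files are handed to the formatter."""
    return Path(path).suffix == ".py"


def format_command(path: str) -> list[str]:
    """Build the ruff invocation as an argument list, so no shell quoting is needed."""
    return ["ruff", "format", path]


def main(argv: list[str]) -> int:
    """Format argv[1] if it is a Python file; always return 0."""
    if len(argv) != 2 or not should_format(argv[1]):
        return 0
    try:
        # check=False mirrors the `|| true` in the hook: a formatter failure is not fatal.
        subprocess.run(format_command(argv[1]), check=False, stderr=subprocess.DEVNULL)
    except FileNotFoundError:
        pass  # ruff is not installed; skip formatting rather than fail the hook
    return 0


if __name__ == "__main__":
    main(sys.argv)
```

Keeping the decision in should_format makes the filter testable on its own, without invoking ruff.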
Final project structure
eda-analyser/
├── .github/
│   ├── copilot-instructions.md
│   ├── agents/
│   │   └── CriticReviewer.agent.md
│   ├── skills/
│   │   └── dataset-profile/
│   │       ├── SKILL.md
│   │       ├── scripts/profile.py
│   │       └── templates/profile-template.md
│   ├── prompts/
│   │   └── new-analyzer.prompt.md
│   └── hooks/
│       └── post-edit.json
├── .vscode/
│   └── mcp.json
├── tests/
│   └── .instructions.md
└── src/
    └── eda_analyser/
        └── cli.py
Write the instructions before the first prompt. The copilot-instructions.md here takes five minutes to write and prevents 10+ corrections over the course of the project — toolchain drift, pandas imports, missing type hints. Front-load the context.
Let the subagent critique cold. The CriticReviewer sees the output without the implementation conversation's anchoring effects. Isolation is the feature — a fresh read catches assumptions the implementer stopped questioning.
Don't give the main agent unrestricted write access. An agent with access to the entire filesystem will drift into config files, CI scripts, and infrastructure when the problem feels related. Restrict tools to the directories the task actually requires.
Don't encode dataset-specific logic in instructions. copilot-instructions.md describes the project's toolchain and conventions — not the structure of a specific CSV. Dataset-specific context belongs in the prompt that invokes the skill, or in the skill itself.