nbctl clean

Remove outputs and metadata from notebooks for version control.

Description

The clean command strips unnecessary data from Jupyter notebooks to make them git-friendly. It removes cell outputs, execution counts, and metadata that create noise in version control systems.

Cleaning notebooks before committing reduces git diff size by 90-95%, which makes code reviews more meaningful, helps prevent merge conflicts, and keeps the repository small.

Usage

nbctl clean NOTEBOOK [OPTIONS]

Arguments

NOTEBOOK (required) Path to the Jupyter notebook file

Options

--output, -o PATH Save cleaned notebook to different file (default: modify in-place)

--keep-outputs Preserve cell outputs

--keep-execution-count Preserve execution counts

--keep-metadata Preserve metadata

--dry-run Preview changes without modifying file

What Gets Cleaned

By default, the command removes:

Cell outputs: All text, images, and plot outputs are removed
Execution counts: All execution counts are set to null
Metadata: Non-essential metadata is cleaned
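
To make the three default removals concrete, here is a minimal Python sketch of the same transformation applied to a notebook's JSON structure. It is not nbctl's implementation: the function name clean_notebook, its keyword arguments, and the choice of kernelspec and language_info as the "essential" metadata to keep are all assumptions made for illustration.

```python
import copy
import json


def clean_notebook(nb, keep_outputs=False, keep_execution_count=False,
                   keep_metadata=False):
    """Return a cleaned copy of a notebook dict (nbformat 4 layout).

    Hypothetical helper: mirrors the documented defaults, not nbctl's code.
    """
    cleaned = copy.deepcopy(nb)  # leave the caller's notebook untouched
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            if not keep_outputs:
                cell["outputs"] = []
            if not keep_execution_count:
                cell["execution_count"] = None
                # execute_result outputs carry their own counter
                for out in cell.get("outputs", []):
                    if "execution_count" in out:
                        out["execution_count"] = None
        if not keep_metadata:
            cell["metadata"] = {}
    if not keep_metadata:
        # Assumption: these two keys are the "essential" notebook metadata
        essential = {"kernelspec", "language_info"}
        cleaned["metadata"] = {
            key: value
            for key, value in cleaned.get("metadata", {}).items()
            if key in essential
        }
    return cleaned


example = {
    "cells": [{
        "cell_type": "code",
        "execution_count": 7,
        "metadata": {"collapsed": False},
        "outputs": [{"output_type": "stream", "name": "stdout",
                     "text": ["hello\n"]}],
        "source": ["print('hello')"],
    }],
    "metadata": {"kernelspec": {"name": "python3"}, "widgets": {}},
    "nbformat": 4,
    "nbformat_minor": 5,
}
print(json.dumps(clean_notebook(example), indent=2))
```

The keyword arguments loosely correspond to the --keep-outputs, --keep-execution-count, and --keep-metadata options; the real command may decide differently which metadata counts as non-essential.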

Output

Success message:

Notebook cleaned successfully: notebook.ipynb

Dry run output:

Dry run - no changes made
Would clean: notebook.ipynb
- Outputs removed: 15 cells
- Execution counts reset: 20 cells
- Metadata cleaned: 1 notebook

Exit Codes

0: Success
1: File not found or invalid notebook
2: Permission error
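
A script that calls the command can branch on these codes. The sketch below wraps the call with Python's subprocess module; the helper names describe_exit and run_clean are hypothetical, and only the three codes documented above are mapped.

```python
import subprocess

# Meanings taken from the exit code list above
EXIT_MEANINGS = {
    0: "success",
    1: "file not found or invalid notebook",
    2: "permission error",
}


def describe_exit(code):
    """Translate an nbctl clean exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_clean(path, runner=subprocess.run):
    """Run `nbctl clean PATH` and return (exit_code, message).

    `runner` is injectable so the function can be exercised without nbctl.
    """
    result = runner(["nbctl", "clean", path])
    return result.returncode, describe_exit(result.returncode)
```

Calling run_clean("notebook.ipynb") requires nbctl on the PATH; if the executable is missing, subprocess.run raises FileNotFoundError rather than returning an exit code.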

Examples

Basic usage:

nbctl clean notebook.ipynb

Preview changes first:

nbctl clean notebook.ipynb --dry-run

Save to new file:

nbctl clean original.ipynb -o cleaned.ipynb

Keep outputs but still reset execution counts and clean metadata:

nbctl clean notebook.ipynb --keep-outputs

Notes

In-place modification: By default, the original file is overwritten. Run with --dry-run first, or use -o to write to a separate file, to check the result before overwriting.

Git integration: Use with pre-commit hooks for automatic cleaning.

Reversible: Original outputs can be regenerated by re-running the notebook.
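
The pre-commit integration mentioned in these notes could look roughly like the following hand-written hook. This is a sketch under assumptions, not a mechanism shipped with nbctl: the script layout and the helper names staged_notebooks and main are hypothetical, and the only nbctl invocation used is the documented `nbctl clean NOTEBOOK`.

```python
#!/usr/bin/env python3
"""Hypothetical .git/hooks/pre-commit script that cleans staged notebooks."""
import subprocess
import sys


def staged_notebooks(names):
    """Keep only the paths that look like Jupyter notebooks."""
    return [name for name in names if name.endswith(".ipynb")]


def main():
    # Files that are added, copied, or modified in the index
    listing = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    ).stdout
    for path in staged_notebooks(listing.splitlines()):
        if subprocess.run(["nbctl", "clean", path]).returncode != 0:
            print(f"nbctl clean failed for {path}", file=sys.stderr)
            return 1  # a non-zero exit aborts the commit
        # Re-stage the cleaned file so the commit contains the cleaned version
        subprocess.run(["git", "add", path], check=True)
    return 0
```

To use it as a hook, the file would end with `sys.exit(main())` and be made executable. Note that re-staging with git add also stages any unstaged edits to the same notebook. The git-setup command listed among the related commands is described as configuring automatic cleaning and may make a hand-written hook unnecessary.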

Common Workflow

# 1. Make changes to notebook in Jupyter

# 2. Preview what will be cleaned
nbctl clean notebook.ipynb --dry-run

# 3. Clean the notebook
nbctl clean notebook.ipynb

# 4. Check the git diff (much smaller now!)
git diff notebook.ipynb

# 5. Commit
git add notebook.ipynb
git commit -m "Update analysis"

Related Commands

run - Execute notebooks to regenerate outputs
diff - Compare cleaned notebooks
git-setup - Configure automatic cleaning in git

See Also

Examples - Practical usage examples
Getting Started - Introduction to nbctl