Tutorial: Differential Privacy Training
This tutorial shows how to train a model with DP-SGD and empirically check the privacy gain by running a membership inference attack (MIA) against both the baseline model and the DP-trained model.
Prerequisites
Understanding the parameters
| Parameter | Effect |
|---|---|
| `epsilon` | Lower = stronger privacy. Start at 5, tighten to 1. |
| `delta` | Set to 1/N where N is training set size. For 5000 samples: 2e-4. |
| `max_grad_norm` | Clip gradients per sample. 0.5–1.0 is typical. |
| `noise_multiplier` | Amount of Gaussian noise. Auto-computed if null. |
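To make the roles of `max_grad_norm` and `noise_multiplier` concrete, here is a minimal pure-Python sketch of one DP-SGD gradient aggregation step, along with the 1/N rule for `delta`. It is an illustration of the general technique, not the code that `auditml` runs internally; the helper names `delta_for`, `clip`, and `dp_sgd_gradient` are hypothetical.

```python
import math
import random


def delta_for(n_train: int) -> float:
    """Rule of thumb from the table: delta = 1/N."""
    return 1.0 / n_train


def clip(grad, max_grad_norm):
    """Scale one per-sample gradient so its L2 norm is at most max_grad_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_grad_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def dp_sgd_gradient(per_sample_grads, max_grad_norm, noise_multiplier, rng):
    """Clip each sample's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_grad_norm) for g in per_sample_grads]
    dim = len(per_sample_grads[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is tied to the clipping bound.
    sigma = noise_multiplier * max_grad_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [x / len(per_sample_grads) for x in noisy]


rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]  # toy per-sample gradients
update = dp_sgd_gradient(grads, max_grad_norm=1.0, noise_multiplier=1.1, rng=rng)
print(delta_for(5000))
print(update)
```

Because the noise scale is `noise_multiplier * max_grad_norm`, raising the clipping bound also raises the absolute amount of noise added to the summed gradient.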
Step 1: Baseline (no DP)
# configs/dp_tutorial_baseline.yaml
experiment_name: dp_tutorial_baseline
data:
  dataset: mnist
  train_size: 5000
  batch_size: 256
training:
  epochs: 20
  learning_rate: 0.001
  device: auto
auditml train --config configs/dp_tutorial_baseline.yaml
auditml audit --config configs/dp_tutorial_baseline.yaml --attack mia_threshold
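For readers unfamiliar with the MIA AUC metric, the sketch below shows one common way a loss-threshold membership inference attack is scored. It assumes training members tend to have lower loss than non-members and computes the AUC as the probability that a random member is ranked above a random non-member. Whether `mia_threshold` in `auditml` works exactly this way is an assumption; the function name `mia_auc` and the toy losses are hypothetical.

```python
def mia_auc(member_losses, nonmember_losses):
    """AUC of the attack 'lower loss means member'.

    Counts, over all (member, non-member) pairs, how often the member has
    the lower loss; ties count as half a win.
    """
    wins = 0.0
    for m in member_losses:
        for n in nonmember_losses:
            if m < n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_losses) * len(nonmember_losses))


# Toy per-example losses: members (seen in training) vs. held-out samples.
members = [0.1, 0.2, 0.3]
nonmembers = [0.25, 0.4, 0.5]
print(mia_auc(members, nonmembers))
```

An AUC near 0.5 means the attacker does no better than guessing; values well above 0.5 indicate that the model leaks membership information.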
Step 2: DP training (ε=5)
# configs/dp_tutorial_eps5.yaml
experiment_name: dp_tutorial_eps5
data:
  dataset: mnist
  train_size: 5000
  batch_size: 256 # larger batch = better privacy per epoch
training:
  epochs: 20
  learning_rate: 0.001
  device: auto
dp:
  enabled: true
  epsilon: 5.0
  delta: 2e-4
  max_grad_norm: 1.0
auditml train --config configs/dp_tutorial_eps5.yaml
auditml audit --config configs/dp_tutorial_eps5.yaml --attack mia_threshold
After training you'll see:
Step 3: DP training (ε=1)
Setting epsilon to 1.0 gives stronger privacy at some cost in accuracy.
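The configuration for this step is not shown, so the following is only a sketch of what it plausibly looks like: the Step 2 settings with `epsilon` lowered to 1.0. The nesting of the keys, the experiment name `dp_tutorial_eps1`, and the helper `with_epsilon` are all assumptions made for illustration, not values confirmed by the tutorial.

```python
import copy

# Step 2 settings as a plain dict (key nesting is assumed).
eps5_config = {
    "experiment_name": "dp_tutorial_eps5",
    "data": {"dataset": "mnist", "train_size": 5000, "batch_size": 256},
    "training": {"epochs": 20, "learning_rate": 0.001, "device": "auto"},
    "dp": {"enabled": True, "epsilon": 5.0, "delta": 2e-4, "max_grad_norm": 1.0},
}


def with_epsilon(config, epsilon, name):
    """Return a copy of config with a new privacy budget and experiment name."""
    new = copy.deepcopy(config)
    new["experiment_name"] = name
    new["dp"]["epsilon"] = epsilon
    return new


# Hypothetical experiment name for the eps=1 run.
eps1_config = with_epsilon(eps5_config, 1.0, "dp_tutorial_eps1")
print(eps1_config["dp"])
```

Keeping every other setting fixed makes the ε=5 and ε=1 runs directly comparable, since only the privacy budget differs.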
Step 4: Compare
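from auditml.reporting.dp_comparison import DPComparison

# Compare the baseline run from Step 1 with the ε=5 run from Step 2.
cmp = DPComparison(
    baseline_dir="results/dp_tutorial_baseline",
    dp_dir="results/dp_tutorial_eps5",
)
# Write the comparison report to results/dp_comparison.
cmp.generate_report("results/dp_comparison")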
Typical results on MNIST:
| Model | Val Acc | MIA AUC |
|---|---|---|
| No DP | 98.8% | 0.671 |
| ε=5 | 97.8% | 0.558 |
| ε=1 | 96.2% | 0.512 |
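One way to read the table is to convert each row into an accuracy cost relative to the baseline and an attack advantage over random guessing (AUC minus 0.5). The short sketch below does that arithmetic on the numbers above; the function name `summarize` and the dictionary layout are illustrative choices, not part of `auditml`.

```python
# Values copied from the results table.
results = {
    "No DP": {"val_acc": 98.8, "mia_auc": 0.671},
    "eps=5": {"val_acc": 97.8, "mia_auc": 0.558},
    "eps=1": {"val_acc": 96.2, "mia_auc": 0.512},
}


def summarize(results, baseline="No DP"):
    """Accuracy drop (percentage points) and MIA advantage over chance."""
    base = results[baseline]
    rows = {}
    for name, r in results.items():
        rows[name] = {
            "acc_drop_pts": round(base["val_acc"] - r["val_acc"], 1),
            "attack_advantage": round(r["mia_auc"] - 0.5, 3),
        }
    return rows


for name, row in summarize(results).items():
    print(name, row)
```

On these figures, ε=5 costs 1.0 accuracy points and cuts the attack advantage from 0.171 to 0.058, while ε=1 costs 2.6 points and leaves an advantage of 0.012.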
Tips
- Increase batch size when using DP: larger batches dilute the noise.
- Train longer: DP models converge more slowly; 2–3× more epochs may be needed.
- Tune `max_grad_norm`: too small clips valid gradients, too large lets noise dominate.
- Don't use Batch Normalisation: it's incompatible with per-sample gradient computation. Normalisation layers that do not mix samples within a batch, such as GroupNorm or LayerNorm, are the usual substitutes.
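The batch-size tip can be illustrated with a little arithmetic. In DP-SGD the Gaussian noise is added once to the summed clipped gradients and the result is divided by the batch size, so the noise on the averaged gradient shrinks as the batch grows. The sketch below assumes `noise_multiplier` is held fixed across batch sizes, which is a simplification: when the noise is auto-computed for a target `epsilon`, it may change with batch size. The function name is hypothetical.

```python
def noise_std_on_mean_gradient(noise_multiplier, max_grad_norm, batch_size):
    """Per-coordinate noise std after averaging the noisy gradient sum."""
    return noise_multiplier * max_grad_norm / batch_size


# Fixed noise_multiplier and clipping bound, varying batch size.
for batch_size in (64, 256, 1024):
    std = noise_std_on_mean_gradient(1.0, 1.0, batch_size)
    print(batch_size, std)
```

The same formula shows the `max_grad_norm` trade-off: the noise grows linearly with the clipping bound, so a bound far above the typical gradient norm adds noise without preserving any extra signal.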