Saving and Resuming Simulations

This notebook demonstrates xftsim’s checkpoint system for saving and resuming simulations. Long-running simulations can be saved to disk and resumed later without losing state.

We will:

Run a simulation for 5 generations
Save a checkpoint to disk
Inspect the checkpoint directory contents
Load and resume the simulation from the checkpoint
Verify generation count and results continuity
Save and load individual components (haplotypes, phenotypes, effects)

1. Imports

[ ]:

import os
import json
import tempfile
import shutil
import numpy as np

from xftsim.effect import AdditiveEffects
from xftsim.arch import Architecture
from xftsim.mate import RandomMating
from xftsim.reproduce import RecombinationMap
from xftsim.sim import Simulation
from xftsim.stats import SampleStatistics
from xftsim.founders import founder_haplotypes_uniform_AFs
from xftsim.io import (
    save_simulation_checkpoint,
    load_simulation_checkpoint,
    save_haplotypes_npz,
    load_haplotypes_npz,
    save_phenotypes_npz,
    load_phenotypes_npz,
    save_effects_npz,
    load_effects_npz,
)

2. Run a Simulation for 5 Generations

We set up a simple univariate simulation and run it for 5 generations. We set retain_haplotypes=2 and retain_phenotypes=5 so that the checkpoint holds several generations of history: the two most recent generations of haplotypes and the five most recent generations of phenotypes.

[ ]:

np.random.seed(42)

n = 1000
m = 200

founder_hap = founder_haplotypes_uniform_AFs(n=n, m=m)
eff = AdditiveEffects.from_h2(h2=0.5, m=m, seed=123)

formula = """
Y.G ~ genetic(eff)
Y.E ~ noise(0.5)
Y ~ Y.G + Y.E
"""

arch = Architecture(formula=formula, effects={'eff': eff})
mating = RandomMating(offspring_per_pair=2)
recomb = RecombinationMap.constant_map(m=m, p=0.5)

sim = Simulation(
    founder_haplotypes=founder_hap,
    architecture=arch,
    mating_regime=mating,
    recombination_map=recomb,
    retain_haplotypes=2,
    retain_phenotypes=5,
    statistics=[SampleStatistics()],
    seed=42,
)

sim.run(5)
print(f"Simulation completed at generation {sim.generation}")
print(f"Haplotype history generations: {list(sim.haplotype_history.keys())}")
print(f"Phenotype history generations: {list(sim.phenotype_history.keys())}")
print(f"Number of result entries: {len(sim.results)}")
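The retention settings above can be read as a sliding window over generations. The following is a minimal sketch of that idea, not xftsim's implementation: prune_history is a hypothetical helper, and the zero-based generation indexing is an assumption that may differ from what xftsim does internally.

```python
def prune_history(history, current_generation, retain):
    """Keep only the `retain` most recent generations of a history dict.

    `history` maps generation index -> stored object.
    Hypothetical helper for illustration; not part of xftsim.
    """
    oldest_kept = current_generation - retain + 1
    return {gen: obj for gen, obj in history.items() if gen >= oldest_kept}


# Generations 0..4 have been simulated; strings stand in for real data.
full_history = {gen: f"data for gen {gen}" for gen in range(5)}

kept_haplotypes = prune_history(full_history, current_generation=4, retain=2)
kept_phenotypes = prune_history(full_history, current_generation=4, retain=5)

print(sorted(kept_haplotypes))  # [3, 4]
print(sorted(kept_phenotypes))  # [0, 1, 2, 3, 4]
```

Under these assumptions, a window of 2 keeps only the last two generations, while a window of 5 keeps everything simulated so far.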

3. Save Checkpoint

save_simulation_checkpoint(sim, dir_path) saves the full simulation state to a directory. This includes:

Architecture (JSON + effect .npz files)
Metadata (generation count, retention settings, mating regime config)
Recombination map
RNG state
Haplotype history (all retained generations)
Phenotype history (all retained generations)
Pedigree history

[ ]:

# Create a temporary directory for the checkpoint
checkpoint_dir = tempfile.mkdtemp(prefix='xftsim_checkpoint_')

save_simulation_checkpoint(sim, checkpoint_dir)
print(f"Checkpoint saved to: {checkpoint_dir}")
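Saving the RNG state is what lets a resumed run draw the same random numbers the original run would have drawn next. The sketch below illustrates the idea with plain NumPy. How xftsim stores its state inside rng_state.npz is not shown here, and JSON is used only to keep the example short.

```python
import json

import numpy as np

rng = np.random.default_rng(42)
rng.normal(size=10)  # advance the generator, as a simulation would

# Snapshot the bit generator state (a plain dict of ints and strings).
saved_state = json.dumps(rng.bit_generator.state)

# Draws made by the original generator after the snapshot ...
expected = rng.normal(size=3)

# ... are reproduced by a fresh generator restored from the snapshot.
restored = np.random.default_rng()
restored.bit_generator.state = json.loads(saved_state)
resumed = restored.normal(size=3)

print(np.array_equal(expected, resumed))  # True
```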

4. Inspect the Checkpoint Directory

The checkpoint directory has a well-defined structure:

checkpoint_dir/
  meta.json                 # generation, retention settings, mating config
  recombination_map.npz     # recombination probabilities
  rng_state.npz             # NumPy RNG state for reproducibility
  history_keys.npz          # which generations are stored
  architecture/             # architecture definition
    architecture.json       # node structure
    effect_0.npz            # effect size arrays
  haplotypes/               # one .npz per retained generation
    gen_3.npz
    gen_4.npz
  phenotypes/               # one .npz per retained generation
    gen_0.npz
    gen_1.npz
    ...
  pedigrees/                # one .npz per retained generation
    gen_1.npz
    gen_2.npz
    ...

[ ]:

# List all files in the checkpoint directory
print("Checkpoint directory contents:")
for root, dirs, files in os.walk(checkpoint_dir):
    level = root.replace(checkpoint_dir, '').count(os.sep)
    indent = '  ' * level
    dirname = os.path.basename(root) or os.path.basename(checkpoint_dir)
    print(f"{indent}{dirname}/")
    sub_indent = '  ' * (level + 1)
    for f in sorted(files):
        fpath = os.path.join(root, f)
        size_kb = os.path.getsize(fpath) / 1024
        print(f"{sub_indent}{f}  ({size_kb:.1f} KB)")
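Because each retained generation is stored as its own gen_<N>.npz file, the set of stored generations can be recovered from the file names alone. The sketch below assumes the naming shown in the layout above. stored_generations is a hypothetical helper, and the example runs against a throwaway directory of empty files rather than a real checkpoint.

```python
import re
import tempfile
from pathlib import Path


def stored_generations(subdir):
    """Return the sorted generation numbers of gen_<N>.npz files in subdir.

    Hypothetical helper for illustration; not part of xftsim.
    """
    pattern = re.compile(r"gen_(\d+)\.npz$")
    gens = []
    for path in Path(subdir).iterdir():
        match = pattern.match(path.name)
        if match:
            gens.append(int(match.group(1)))
    return sorted(gens)  # numeric sort, so gen_10 comes after gen_4


with tempfile.TemporaryDirectory() as tmp:
    hap_dir = Path(tmp) / "haplotypes"
    hap_dir.mkdir()
    for gen in (10, 3, 4):
        (hap_dir / f"gen_{gen}.npz").touch()  # empty stand-in files
    (hap_dir / "notes.txt").touch()  # ignored: does not match the pattern
    found = stored_generations(hap_dir)

print(found)  # [3, 4, 10]
```

Parsing the number out of the name avoids the lexical ordering trap in which gen_10.npz sorts before gen_3.npz.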

5. Read the Checkpoint Metadata

The meta.json file contains key simulation parameters. This is what you would inspect to understand a checkpoint without loading it fully.

[ ]:

with open(os.path.join(checkpoint_dir, 'meta.json'), 'r') as f:
    meta = json.load(f)

print("Checkpoint metadata (meta.json):")
print(json.dumps(meta, indent=2))

6. Load and Resume the Simulation

Simulation.from_checkpoint(dir_path) reconstructs the simulation from saved state. The architecture, mating regime, recombination map, history dicts, and RNG state are all restored. The simulation can then be continued with sim.continue_run(n) for n additional generations.

Note: from_checkpoint restores the raw state. Statistics and filters must be re-supplied if you want them for the continued run, since Python callables are not serialized.

[ ]:

# Load simulation from checkpoint
sim_loaded = Simulation.from_checkpoint(
    checkpoint_dir,
    statistics=[SampleStatistics()],  # re-supply for continued run
)

print(f"Loaded simulation at generation {sim_loaded.generation}")
print(f"Haplotype history generations: {list(sim_loaded.haplotype_history.keys())}")
print(f"Phenotype history generations: {list(sim_loaded.phenotype_history.keys())}")

[ ]:

# Verify that the loaded state matches the original
orig_y = sim.phenotypes['Y']
loaded_y = sim_loaded.phenotypes['Y']

print("Verification that loaded state matches original:")
print(f"  Phenotype Y mean (original):  {np.mean(orig_y):.6f}")
print(f"  Phenotype Y mean (loaded):    {np.mean(loaded_y):.6f}")
print(f"  Arrays match (np.allclose): {np.allclose(orig_y, loaded_y)}")
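The check above uses np.allclose, which compares within a floating-point tolerance rather than bit for bit. The short NumPy example below shows how it differs from np.array_equal, in case a strictly exact comparison is wanted. It is independent of xftsim.

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])  # 0.1 + 0.2 is 0.30000000000000004 in floating point
b = np.array([0.3, 1.0])

exact = np.array_equal(a, b)  # element-wise equality, no tolerance
close = np.allclose(a, b)     # equality within default rtol/atol

print(exact)  # False
print(close)  # True
```

For data that has only been written to disk and read back, np.array_equal is the stricter test. np.allclose is the safer choice once any arithmetic has been applied.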

[ ]:

# Continue the simulation for 5 more generations
print(f"Before continue_run: generation {sim_loaded.generation}")

sim_loaded.continue_run(5)

print(f"After continue_run:  generation {sim_loaded.generation}")
print(f"Total result entries: {len(sim_loaded.results)}")
print(f"Phenotype history generations: {list(sim_loaded.phenotype_history.keys())}")

[ ]:

# Show statistics from the continued run
print("Variance of Y across continued generations:")
for result in sim_loaded.results:
    stats = result.statistics['SampleStatistics']
    keys = stats['keys']
    y_idx = keys.index('Y')
    print(f"  Gen {result.generation}: Var(Y) = {stats['var'][y_idx]:.4f}")

7. Save and Load Individual Components

Besides full simulation checkpoints, xftsim provides functions to save and load individual data structures:

save_haplotypes_npz / load_haplotypes_npz
save_phenotypes_npz / load_phenotypes_npz
save_effects_npz / load_effects_npz

These are useful for exporting specific simulation outputs for analysis in other tools or for sharing data.

[ ]:

# Create a temporary directory for individual component files
component_dir = tempfile.mkdtemp(prefix='xftsim_components_')

# Save haplotypes
hap_path = os.path.join(component_dir, 'haplotypes.npz')
save_haplotypes_npz(sim_loaded.haplotypes, hap_path)
print(f"Saved haplotypes: {os.path.getsize(hap_path) / 1024:.1f} KB")

# Save phenotypes
pheno_path = os.path.join(component_dir, 'phenotypes.npz')
save_phenotypes_npz(sim_loaded.phenotypes, pheno_path)
print(f"Saved phenotypes: {os.path.getsize(pheno_path) / 1024:.1f} KB")

# Save effects
eff_path = os.path.join(component_dir, 'effects.npz')
save_effects_npz(eff, eff_path)
print(f"Saved effects:    {os.path.getsize(eff_path) / 1024:.1f} KB")

[ ]:

# Load them back and verify
hap_loaded = load_haplotypes_npz(hap_path)
pheno_loaded = load_phenotypes_npz(pheno_path)
eff_loaded = load_effects_npz(eff_path)

print("Loaded haplotypes:")
print(f"  n = {hap_loaded.n}, m = {hap_loaded.m}")
print(f"  Genotypes match: {np.array_equal(hap_loaded.genotypes, sim_loaded.haplotypes.genotypes)}")

print("\nLoaded phenotypes:")
print(f"  Keys: {list(pheno_loaded.keys)}")
print(f"  Y values match: {np.allclose(pheno_loaded['Y'], sim_loaded.phenotypes['Y'])}")

print("\nLoaded effects:")
print(f"  Shape: {eff_loaded.effects.shape}")
print(f"  Standardized: {eff_loaded.standardized}")
print(f"  Values match: {np.allclose(eff_loaded.effects, eff.effects)}")
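The .npz files written above are NumPy archives, so a similar round trip can be sketched with NumPy alone. The array names and shapes below are illustrative stand-ins. The keys that xftsim actually writes into its .npz files are not documented here and may differ.

```python
import os
import tempfile

import numpy as np

rng = np.random.default_rng(0)
genotypes = rng.integers(0, 2, size=(100, 20), dtype=np.int8)  # stand-in haplotype matrix
effects = rng.normal(size=20)                                  # stand-in effect sizes

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "component.npz")
    # Several named arrays go into one compressed archive.
    np.savez_compressed(path, genotypes=genotypes, effects=effects)

    # np.load returns a lazy archive; arrays are read on access.
    with np.load(path) as data:
        names = sorted(data.files)
        genotypes_back = data["genotypes"]
        effects_back = data["effects"]

print(names)  # ['effects', 'genotypes']
print(genotypes_back.dtype, np.array_equal(genotypes_back, genotypes))
print(np.array_equal(effects_back, effects))
```

Dtypes and values survive the round trip unchanged, which is why files of this kind can be opened in other tools with np.load, provided the array names inside the archive are known.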

8. Low-Level Checkpoint Inspection

The load_simulation_checkpoint function returns a raw dict with all saved state. This is useful for manual inspection or for building custom analysis pipelines without reconstructing a full simulation.

[ ]:

checkpoint_data = load_simulation_checkpoint(checkpoint_dir)

print("Checkpoint dict keys:")
for key in checkpoint_data:
    val = checkpoint_data[key]
    if isinstance(val, dict):
        print(f"  {key}: dict with keys {list(val.keys())}")
    elif isinstance(val, int):
        print(f"  {key}: {val}")
    else:
        print(f"  {key}: {type(val).__name__}")

print(f"\nCheckpoint generation: {checkpoint_data['generation']}")
print(f"Retain haplotypes: {checkpoint_data['retain_haplotypes']}")
print(f"Retain phenotypes: {checkpoint_data['retain_phenotypes']}")
print(f"Architecture: {checkpoint_data['architecture']}")

9. Cleanup

Remove the temporary directories we created for this demo.

[ ]:

shutil.rmtree(checkpoint_dir)
shutil.rmtree(component_dir)
print("Temporary directories cleaned up.")

Summary

This notebook demonstrated:

Full simulation checkpoints via save_simulation_checkpoint / Simulation.from_checkpoint
Checkpoint directory structure – meta.json, architecture/, haplotypes/, phenotypes/, pedigrees/
Metadata inspection – reading meta.json for generation count and mating config
Resuming simulations via sim.continue_run(n) from a loaded checkpoint
Individual component I/O – save/load haplotypes, phenotypes, and effects separately
Low-level checkpoint access via load_simulation_checkpoint returning a raw dict

Key points:

Checkpoints save all state needed to resume: haplotypes, phenotypes, pedigrees, RNG state
Statistics, filters, and callbacks must be re-supplied when loading (not serialized)
The checkpoint format uses compressed NumPy (.npz) for arrays and JSON for metadata
Retention policy applies: only generations within the retention window are saved