Statistics

Per-generation statistics computed during simulation. Each Statistic subclass receives the phenotype history and any filtered views, and returns results stored in GenerationResult.

class xftsim.stats.GenerationResult(generation, statistics=<factory>)[source]

Bases: object

Results from a single generation of simulation.

Parameters:
  • generation (int) – Generation number.

  • statistics (dict) – Name → value mapping of computed statistics.

generation: int
statistics: Dict[str, Any]
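
The `statistics=<factory>` default in the signature suggests a dataclass whose mapping is created fresh for every instance. The following is a minimal stand-in under that assumption, not the xftsim source; the class body and the example key "n" are illustrative only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class GenerationResult:
    """Stand-in mirroring the documented fields (not the xftsim class)."""

    generation: int
    # default_factory gives each instance its own dict instead of a shared one.
    statistics: Dict[str, Any] = field(default_factory=dict)


a = GenerationResult(generation=0)
b = GenerationResult(generation=1)
a.statistics["n"] = 100  # hypothetical statistic name and value
```

Because each instance owns its own mapping, writing to `a.statistics` leaves `b.statistics` empty.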
class xftsim.stats.Statistic[source]

Bases: ABC

Abstract base class for per-generation statistics.

Subclasses implement estimate() to compute a summary statistic from phenotype history and filtered views each generation.

abstract estimate(phenotype_history, filtered_views, generation, **kwargs)[source]

Compute the statistic for a given generation.

Parameters:
  • phenotype_history (dict[int, PhenotypeArray]) – Generation → phenotypes mapping.

  • filtered_views (dict[str, FilteredView]) – Named filtered views (from filters).

  • generation (int) – Current generation number.

  • **kwargs – Additional context. May include haplotype_history (dict[int, HaplotypeOperator]).

Return type:

Any

Returns:

Any – The computed statistic value.
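
A custom statistic only needs to implement estimate() with the signature above. The sketch below uses a stand-in base class and plain dicts of lists so that it runs without xftsim; MeanPhenotype, the "height" key, and the assumption that a generation's phenotypes can be indexed by name are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Any


class Statistic(ABC):
    """Stand-in mirroring the documented interface (not the xftsim class)."""

    @abstractmethod
    def estimate(self, phenotype_history, filtered_views, generation, **kwargs) -> Any:
        ...


class MeanPhenotype(Statistic):
    """Hypothetical statistic: mean of one phenotype in the current generation."""

    def __init__(self, key):
        self.key = key

    def estimate(self, phenotype_history, filtered_views, generation, **kwargs):
        # Assumption: each generation's phenotypes behave like a mapping
        # from phenotype name to a sequence of values.
        values = phenotype_history[generation][self.key]
        if len(values) == 0:
            return None  # nothing to summarize
        return sum(values) / len(values)


history = {0: {"height": [1.0, 2.0, 3.0]}}
result = MeanPhenotype("height").estimate(history, {}, generation=0)
```

Returning None when there is nothing to summarize mirrors the `| None` return types of the concrete statistics below; whether xftsim skips such results is not stated here.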

class xftsim.stats.SampleStatistics[source]

Bases: Statistic

Compute the sample covariance matrix across all phenotype components.

Returns a dict with 'cov' (k x k matrix), 'var' (its diagonal), and 'keys'.
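
The shape of such a result can be sketched with NumPy. The data and key names below are made up, and the use of np.cov with its default ddof=1 is an assumption; the normalization xftsim applies is not stated in this section.

```python
import numpy as np

# Hypothetical data: 4 individuals (rows), 2 phenotype components (columns).
keys = ["height.G", "height.E"]
Y = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])

# rowvar=False treats columns as variables, giving a k x k matrix.
cov = np.cov(Y, rowvar=False)
result = {"cov": cov, "var": np.diag(cov), "keys": keys}
```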

estimate(phenotype_history, filtered_views, generation, **kwargs)[source]

Compute the statistic for a given generation.

Parameters:
  • phenotype_history (dict[int, PhenotypeArray]) – Generation → phenotypes mapping.

  • filtered_views (dict[str, FilteredView]) – Named filtered views (from filters).

  • generation (int) – Current generation number.

  • **kwargs – Additional context. May include haplotype_history (dict[int, HaplotypeOperator]).

Return type:

dict[str, Any] | None

Returns:

dict[str, Any] | None – The computed statistic value.

class xftsim.stats.HasemanElstonEstimator(phenotype_keys=None, n_probe=0)[source]

Bases: Statistic

GRM-based Haseman-Elston regression estimator.

Estimates genetic covariance (and heritability) using the GRM (genomic relationship matrix) computed from standardized genotypes. Works with any sample: it does not require siblings, trios, or any specific family structure, and it works at generation 0 (founders).

The estimator solves:

cov_g = Y' (K Y - Y) / (tr(K^2) - n)

where K = G G' / m is the GRM built from per-SNP standardized genotypes (n individuals, m SNPs), Y is the standardized (n x k) phenotype matrix, and ' denotes transpose.

This matches the legacy haseman_elston() function.
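
The formula above can be written out directly in NumPy. This is a sketch on simulated data, not the xftsim implementation: the genotype simulation, allele frequency, and variable names are assumptions, and the phenotypes carry no genetic signal, so the estimate is only noise around zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2

# Hypothetical genotypes: allele counts (0/1/2) for n individuals at 200 SNPs.
G_raw = rng.binomial(2, 0.3, size=(n, 200)).astype(float)
G_raw = G_raw[:, G_raw.std(axis=0) > 0]  # drop monomorphic SNPs

# Per-SNP standardization, then K = G G' / m.
G = (G_raw - G_raw.mean(axis=0)) / G_raw.std(axis=0)
m = G.shape[1]
K = G @ G.T / m

# Hypothetical phenotypes, standardized column by column.
Y = rng.normal(size=(n, k))
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)

# cov_g = Y' (K Y - Y) / (tr(K^2) - n)
denom = np.trace(K @ K) - n
cov_g = Y.T @ (K @ Y - Y) / denom
```

Since K is symmetric, cov_g is a symmetric k x k matrix. Its diagonal holds the per-phenotype estimates, which correspond to heritabilities because Y is standardized.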

Parameters:
  • phenotype_keys (list[str], optional) – Phenotype names to estimate heritability for. If None, uses all phenotype keys that do NOT contain a '.' (i.e., top-level phenotypes like 'height', not sub-components like 'height.G').

  • n_probe (int) – Number of random probes for stochastic trace estimation. Set to 0 for a deterministic (exact) trace. Default is 0.
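
One common way to estimate tr(K^2) with random probes is a Hutchinson-style estimator with Rademacher (+1/-1) vectors. Whether xftsim uses this exact scheme is an assumption; the sketch below only illustrates the idea, and all data and names are made up.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, n_probe = 40, 300, 2000

# Hypothetical standardized genotype matrix (n individuals x m SNPs).
G = rng.normal(size=(n, m))
G = (G - G.mean(axis=0)) / G.std(axis=0)

# Exact trace: forms the n x n GRM explicitly.
K = G @ G.T / m
exact = np.trace(K @ K)

# Stochastic trace: for symmetric K, E[z' K^2 z] = E[||K z||^2] = tr(K^2)
# when z has independent +1/-1 entries. K is never formed; only
# matrix products with G are needed.
Z = rng.choice([-1.0, 1.0], size=(n, n_probe))
KZ = G @ (G.T @ Z) / m
approx = np.sum(KZ ** 2) / n_probe
```

The probe version costs on the order of n * m * n_probe operations and avoids the n x n matrix, which matters when n is large; the price is sampling noise that shrinks as n_probe grows.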

estimate(phenotype_history, filtered_views, generation, **kwargs)[source]

Compute the statistic for a given generation.

Parameters:
  • phenotype_history (dict[int, PhenotypeArray]) – Generation → phenotypes mapping.

  • filtered_views (dict[str, FilteredView]) – Named filtered views (from filters).

  • generation (int) – Current generation number.

  • **kwargs – Additional context. May include haplotype_history (dict[int, HaplotypeOperator]).

Return type:

dict[str, dict[str, Any]] | None

Returns:

dict[str, dict[str, Any]] | None – The computed statistic value.

class xftsim.stats.ParentOffspringRegression(filter_name='trio')[source]

Bases: Statistic

Parent-offspring regression estimator of heritability.

Regresses offspring phenotype on the mid-parent value (the mean of the two parents' phenotypes). Under an additive model, the slope equals the narrow-sense heritability h2.

Requires a TrioFilter (keyed by filter_name) to be active.
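
The regression itself is an ordinary least-squares fit of offspring on mid-parent values. The sketch below uses made-up, noise-free data constructed so that the slope is exactly 0.5; it does not use a TrioView, and the array names are hypothetical.

```python
import numpy as np

# Hypothetical phenotypes for five trios.
mother = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
father = np.array([2.0, 2.0, 4.0, 3.0, 6.0])
midparent = (mother + father) / 2.0

# Offspring values built with a known slope of 0.5 and intercept of 1.0.
offspring = 0.5 * midparent + 1.0

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope, intercept = np.polyfit(midparent, offspring, 1)
h2_estimate = slope
```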

Parameters:

filter_name (str) – Key in filtered_views that contains a TrioView. Default is 'trio'.

estimate(phenotype_history, filtered_views, generation, **kwargs)[source]

Compute the statistic for a given generation.

Parameters:
  • phenotype_history (dict[int, PhenotypeArray]) – Generation → phenotypes mapping.

  • filtered_views (dict[str, FilteredView]) – Named filtered views (from filters).

  • generation (int) – Current generation number.

  • **kwargs – Additional context. May include haplotype_history (dict[int, HaplotypeOperator]).

Return type:

dict[str, dict[str, Any]] | None

Returns:

dict[str, dict[str, Any]] | None – The computed statistic value.

class xftsim.stats.MatingStatistics(filter_name='trio')[source]

Bases: Statistic

Compute mating statistics from pedigree structure and parent phenotypes.

Returns a per-generation dict with:
  • n_mating_pairs: number of unique parent pairs

  • mean_offspring_count: mean offspring per pair

  • spouse_correlations: dict of phenotype name -> spousal Pearson r

Requires a TrioFilter (keyed by filter_name) to be active so that parent phenotypes are available. Alternatively, it works directly from the pedigree if phenotype_history contains the parent generation.
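
The three quantities can be illustrated on a toy pedigree. Everything below is hypothetical: the IDs, the phenotype values, and the choice to compute the spousal correlation over unique pairs rather than once per offspring, which this section does not specify.

```python
from collections import Counter

import numpy as np

# Hypothetical pedigree: one (mother_id, father_id) entry per offspring.
parents = [("m1", "f1"), ("m1", "f1"),
           ("m2", "f2"),
           ("m3", "f3"), ("m3", "f3"), ("m3", "f3")]
# Hypothetical parent phenotypes for a single trait.
height = {"m1": 1.0, "m2": 2.0, "m3": 3.0,
          "f1": 1.5, "f2": 2.5, "f3": 2.0}

# Offspring count per unique parent pair.
pair_counts = Counter(parents)
n_mating_pairs = len(pair_counts)
mean_offspring_count = sum(pair_counts.values()) / n_mating_pairs

# Pearson correlation between spouses, one observation per unique pair.
mothers = [height[m] for m, _ in pair_counts]
fathers = [height[f] for _, f in pair_counts]
r = float(np.corrcoef(mothers, fathers)[0, 1])

result = {
    "n_mating_pairs": n_mating_pairs,
    "mean_offspring_count": mean_offspring_count,
    "spouse_correlations": {"height": r},
}
```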

Parameters:

filter_name (str) – Key in filtered_views for a TrioView (used for spouse correlations). Default is 'trio'.

estimate(phenotype_history, filtered_views, generation, **kwargs)[source]

Compute the statistic for a given generation.

Parameters:
  • phenotype_history (dict[int, PhenotypeArray]) – Generation → phenotypes mapping.

  • filtered_views (dict[str, FilteredView]) – Named filtered views (from filters).

  • generation (int) – Current generation number.

  • **kwargs – Additional context. May include haplotype_history (dict[int, HaplotypeOperator]).

Return type:

dict[str, Any] | None

Returns:

dict[str, Any] | None – The computed statistic value.
