Filters

Filters extract structured views (trios, sib-pairs, etc.) from a simulation's phenotype and pedigree histories. Each filter produces a FilteredView, which the statistics modules consume.

class xftsim.filters.FilteredView[source]

Bases: object

Base class for filtered data views produced by Filter.apply().

class xftsim.filters.TrioView(offspring_phenotypes, mother_phenotypes, father_phenotypes, n_trios)[source]

Bases: FilteredView

Aligned trio data: offspring, mother, and father phenotypes.

All dicts map phenotype name -> (n_trios,) array.

Parameters:
  • offspring_phenotypes (Dict[str, ndarray])

  • mother_phenotypes (Dict[str, ndarray])

  • father_phenotypes (Dict[str, ndarray])

  • n_trios (int)

class xftsim.filters.SibPairView(sib1_phenotypes, sib2_phenotypes, n_pairs, sib1_idx=None, sib2_idx=None)[source]

Bases: FilteredView

Sibling pair data: two aligned sets of sibling phenotypes.

All dicts map phenotype name -> (n_pairs,) array. sib1_idx and sib2_idx are the original sample indices.

Parameters:
  • sib1_phenotypes (Dict[str, ndarray])

  • sib2_phenotypes (Dict[str, ndarray])

  • n_pairs (int)

  • sib1_idx (ndarray) – Defaults to None.

  • sib2_idx (ndarray) – Defaults to None.

class xftsim.filters.Filter[source]

Bases: ABC

Abstract base class for filters.

Filters extract structured views from simulation history.

abstract apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

Optional[FilteredView]

Returns:

FilteredView or None – The filtered view, or None if not applicable.
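The contract described above can be sketched with a toy subclass. This is only an illustration of the documented signature: `FilteredView` and `Filter` are redefined locally as stand-ins rather than imported from xftsim, `LatestView` and `LatestFilter` are hypothetical names, and plain dicts of lists stand in for PhenotypeArray.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional


class FilteredView:
    """Local stand-in for the documented base view class."""


class Filter(ABC):
    """Local stand-in mirroring the documented abstract interface."""

    @abstractmethod
    def apply(self, generation, phenotype_history, pedigree_history) -> Optional[FilteredView]:
        ...


@dataclass
class LatestView(FilteredView):
    # Hypothetical view: the phenotypes of the requested generation, unchanged.
    generation: int
    phenotypes: dict


class LatestFilter(Filter):
    # Hypothetical filter: returns None when the generation is not in the history.
    def apply(self, generation, phenotype_history, pedigree_history):
        if generation not in phenotype_history:
            return None
        return LatestView(generation, phenotype_history[generation])


history = {0: {"Y": [0.1, -0.4, 1.2]}}
view = LatestFilter().apply(0, history, {})
missing = LatestFilter().apply(1, history, {})
```

Because `apply` is abstract, instantiating the base class directly raises `TypeError`; callers should also be prepared for a `None` result, as `missing` shows.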


class xftsim.filters.TrioFilter[source]

Bases: Filter

Extract complete trios (offspring + both parents) from adjacent generations.

At generation 0, returns None (there is no parent generation). At any later generation, parent phenotypes are looked up in generation - 1 using the parent indices stored in the pedigree.

apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

TrioView | None

Returns:

FilteredView or None – The filtered view, or None if not applicable.
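The parent lookup described for TrioFilter can be sketched as follows. This is not the xftsim implementation: the pedigree layout (`mother_idx` and `father_idx` lists, with -1 marking an unknown parent) and the function name `extract_trios` are assumptions made for the example, and plain lists stand in for NumPy arrays.

```python
def extract_trios(generation, phenotype_history, pedigree_history):
    """Return aligned offspring/mother/father dicts plus n_trios, or None at generation 0."""
    if generation == 0:
        return None  # no parent generation to index into

    child = phenotype_history[generation]
    parent = phenotype_history[generation - 1]
    ped = pedigree_history[generation]

    # Assumed layout: position of each child's parent in generation - 1, -1 if unknown.
    mothers = ped["mother_idx"]
    fathers = ped["father_idx"]

    # Keep only complete trios (both parents known).
    keep = [j for j, (m, f) in enumerate(zip(mothers, fathers)) if m >= 0 and f >= 0]

    offspring = {k: [v[j] for j in keep] for k, v in child.items()}
    mother = {k: [v[mothers[j]] for j in keep] for k, v in parent.items()}
    father = {k: [v[fathers[j]] for j in keep] for k, v in parent.items()}
    return offspring, mother, father, len(keep)


phenotype_history = {
    0: {"Y": [1.0, 2.0, 3.0, 4.0]},
    1: {"Y": [1.5, 3.5, 2.5]},
}
pedigree_history = {1: {"mother_idx": [0, 2, -1], "father_idx": [1, 3, 3]}}

offspring, mother, father, n_trios = extract_trios(1, phenotype_history, pedigree_history)
founders = extract_trios(0, phenotype_history, pedigree_history)
```

Every returned list has length `n_trios`, so position i in each dict refers to the same trio, which is the alignment TrioView documents.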

class xftsim.filters.SibPairFilter[source]

Bases: Filter

Extract sibling pairs (individuals sharing the same FID).

Groups offspring by FID and forms all unique within-family pairs.

apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

SibPairView | None

Returns:

FilteredView or None – The filtered view, or None if not applicable.
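Grouping by FID and forming all unique within-family pairs can be sketched with the standard library. The function name `sib_pair_indices` and the example FIDs are hypothetical, and how xftsim orders the pairs is not stated here, so the ordering below is only one possibility.

```python
from collections import defaultdict
from itertools import combinations


def sib_pair_indices(fids):
    """Return (sib1_idx, sib2_idx) covering every unordered pair within each family."""
    by_family = defaultdict(list)
    for idx, fid in enumerate(fids):
        by_family[fid].append(idx)

    sib1_idx, sib2_idx = [], []
    for members in by_family.values():
        # combinations() yields each unordered pair exactly once;
        # families with a single member contribute nothing.
        for i, j in combinations(members, 2):
            sib1_idx.append(i)
            sib2_idx.append(j)
    return sib1_idx, sib2_idx


fids = ["f1", "f1", "f2", "f1", "f3", "f2"]
y = [10, 11, 20, 12, 30, 21]

sib1_idx, sib2_idx = sib_pair_indices(fids)
sib1_y = [y[i] for i in sib1_idx]
sib2_y = [y[i] for i in sib2_idx]
n_pairs = len(sib1_idx)
```

A family of size k contributes k(k - 1)/2 pairs, so the three-member family f1 yields three pairs and the singleton f3 yields none.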

class xftsim.filters.UnrelatedView(indices=<factory>, phenotypes=None)[source]

Bases: FilteredView

View of one individual per family (unrelated subsample).

indices

Indices into the original sample array (one per family).

Type:

np.ndarray

phenotypes

Subset of phenotypes for the selected individuals.

Type:

PhenotypeArray

Parameters:
  • indices (ndarray)

  • phenotypes (PhenotypeArray) – Defaults to None.

class xftsim.filters.UnrelatedFilter[source]

Bases: Filter

Select one individual per family (first occurrence per FID).

Produces an UnrelatedView with the first individual encountered for each unique FID value.

apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

UnrelatedView | None

Returns:

FilteredView or None – The filtered view, or None if not applicable.
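The "first occurrence per FID" rule can be sketched as a single pass over the sample. The helper name `first_per_family` and the example data are hypothetical, and plain lists stand in for the arrays held by UnrelatedView.

```python
def first_per_family(fids):
    """Return the index of the first individual seen for each distinct FID."""
    seen = set()
    indices = []
    for idx, fid in enumerate(fids):
        if fid not in seen:
            seen.add(fid)
            indices.append(idx)
    return indices


fids = ["f2", "f1", "f2", "f3", "f1"]
y = [5, 6, 7, 8, 9]

indices = first_per_family(fids)
unrelated_y = [y[i] for i in indices]
```

The result contains exactly one index per distinct FID, in the order in which families first appear in the sample.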

class xftsim.filters.AscertainedView(indices=<factory>, phenotypes=None, ascertainment_key='', threshold=0.0)[source]

Bases: FilteredView

View of individuals passing an ascertainment threshold.

indices

Indices into the original sample array.

Type:

np.ndarray

phenotypes

Subset of phenotypes for selected individuals.

Type:

PhenotypeArray

ascertainment_key

The phenotype key used for ascertainment.

Type:

str

threshold

The quantile threshold value(s) used.

Type:

float

Parameters:
  • indices (ndarray)

  • phenotypes (PhenotypeArray) – Defaults to None.

  • ascertainment_key (str) – Defaults to ''.

  • threshold (float) – Defaults to 0.0.

class xftsim.filters.AscertainmentFilter(phenotype_key, quantile, tail='both')[source]

Bases: Filter

Select individuals from the tails of a phenotype distribution.

Parameters:
  • phenotype_key (str) – Which phenotype to ascertain on (e.g. ‘Y’, ‘Y.G’).

  • quantile (float) – Proportion of the distribution to select. E.g. 0.1 selects the top 10%, bottom 10%, or both (depending on tail).

  • tail (str) – Which tail(s) to select. Defaults to ‘both’.

      • ‘upper’: individuals above the 100 × (1 - quantile)th percentile

      • ‘lower’: individuals below the 100 × quantile-th percentile

      • ‘both’: individuals in either tail (union of upper and lower)

apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

AscertainedView | None

Returns:

FilteredView or None – The filtered view, or None if not applicable.
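Tail selection can be sketched as two quantile cut-offs and a comparison per individual. This is an illustration, not the library's code: the helper names are hypothetical, the quantile uses linear interpolation between order statistics, and the strict comparisons follow the "above"/"below" wording of the tail parameter. Whether xftsim uses the same interpolation and boundary handling is an assumption.

```python
def quantile(values, q):
    """Quantile of values at proportion q, linearly interpolated between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


def ascertain(values, q, tail="both"):
    """Return indices of individuals in the requested tail(s)."""
    if tail not in ("upper", "lower", "both"):
        raise ValueError("tail must be 'upper', 'lower', or 'both'")
    lower_cut = quantile(values, q)
    upper_cut = quantile(values, 1 - q)
    selected = []
    for idx, v in enumerate(values):
        in_lower = tail in ("lower", "both") and v < lower_cut
        in_upper = tail in ("upper", "both") and v > upper_cut
        if in_lower or in_upper:
            selected.append(idx)
    return selected


y = [5.0, 0.0, 2.0, 8.0, 1.0, 7.0, 3.0, 4.0, 6.0]

upper = ascertain(y, 0.25, tail="upper")
lower = ascertain(y, 0.25, tail="lower")
both = ascertain(y, 0.25, tail="both")
```

With tail set to ‘both’ the result is the union of the two tails, so roughly twice the requested proportion of the sample is selected.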

class xftsim.filters.SubsampleView(indices=<factory>, phenotypes=None, n_subsample=0)[source]

Bases: FilteredView

View of a random subsample of individuals.

indices

Indices into the original sample array.

Type:

np.ndarray

phenotypes

Subset of phenotypes for selected individuals.

Type:

PhenotypeArray

n_subsample

Number of individuals in the subsample.

Type:

int

Parameters:
  • indices (ndarray)

  • phenotypes (PhenotypeArray) – Defaults to None.

  • n_subsample (int) – Defaults to 0.

class xftsim.filters.SubsampleFilter(n=None, fraction=None, seed=None)[source]

Bases: Filter

Randomly subsample individuals.

Exactly one of n or fraction must be provided.

Parameters:
  • n (int, optional) – Exact number of individuals to sample. If larger than the population, all individuals are returned.

  • fraction (float, optional) – Fraction of individuals to sample, in (0, 1].

  • seed (int, optional) – Random seed for reproducibility.

apply(generation, phenotype_history, pedigree_history)[source]

Apply the filter to extract a view.

Parameters:
  • generation (int) – Current generation number.

  • phenotype_history (dict[int, PhenotypeArray]) – Generation -> phenotypes mapping.

  • pedigree_history (dict[int, PedigreeArray]) – Generation -> pedigree mapping.

Return type:

SubsampleView | None

Returns:

FilteredView or None – The filtered view, or None if not applicable.
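The argument rules for SubsampleFilter (exactly one of n or fraction, n capped at the population size, optional seed) can be sketched with the standard library. The function name `subsample_indices`, the rounding of `fraction * n_total`, and the sorted output are assumptions made for the example; the library may use a different random generator and ordering.

```python
import random


def subsample_indices(n_total, n=None, fraction=None, seed=None):
    """Return sorted indices of a random subsample drawn without replacement."""
    if (n is None) == (fraction is None):
        raise ValueError("Exactly one of n or fraction must be provided")
    if fraction is not None:
        if not 0 < fraction <= 1:
            raise ValueError("fraction must be in (0, 1]")
        n = int(round(fraction * n_total))  # rounding rule is an assumption
    n = min(n, n_total)  # asking for more than the population returns everyone
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_total), n))


by_count = subsample_indices(10, n=3, seed=42)
repeat = subsample_indices(10, n=3, seed=42)
by_fraction = subsample_indices(10, fraction=0.5, seed=42)
everyone = subsample_indices(10, n=50, seed=42)
```

Fixing the seed makes the draw reproducible, which is why `by_count` and `repeat` are identical.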
