arch

Below is an auto-generated summary of the xftsim.arch submodule API.

Architecture system for the new xftsim design.

Provides the ArchComponent abstract base class, its concrete components, ArchNode, and the Architecture class. Architectures can be built programmatically (arch.add()) or parsed from a formula string.

class xftsim.arch.ArchComponent

Bases: ABC

Abstract base class for architecture components (DSL built-in functions).

name

Component name (e.g. ‘genetic’, ‘noise’).

Type:

str

kind

One of ‘genetic’, ‘generative’, ‘aggregating’.

Type:

str

accepts_grouping

Whether this component can use the | operator.

Type:

bool

name: str = ''
kind: str = ''
accepts_grouping: bool = False
abstract compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.GeneticComponent(effects)

Bases: ArchComponent

Univariate genetic component: computes diploid G @ effects.

Uses standardized_matvec when effects.standardized is True, otherwise plain matvec.

Parameters:

effects (EffectSpec) – Effect sizes (shape (m,) for univariate). Typically an AdditiveEffects or SparseEffects instance.
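
The "diploid G @ effects" computation can be illustrated with plain NumPy. This is a sketch, not the xftsim implementation: the (n, m, 2) haplotype layout is inferred from the hap[:,:,0] indexing used elsewhere on this page, and the sample-based standardization shown here is an assumption about what standardized_matvec does.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 6, 4, 3

# Hypothetical haplotype array: (individuals, variants, 2 copies), entries 0/1.
hap = rng.integers(0, 2, size=(n, m, 2))
effects = rng.normal(size=m)

# Diploid genotype counts (0, 1, or 2) are the sum over the two copies.
G = hap.sum(axis=2)

# Plain matvec: raw allele counts times per-variant effects.
raw = G @ effects

# Standardized matvec (assumed form): center and scale each variant first.
mean = G.mean(axis=0)
sd = G.std(axis=0)
sd[sd == 0] = 1.0  # guard monomorphic variants against division by zero
standardized = ((G - mean) / sd) @ effects

# The same @ operator covers the multivariate case: (n, m) @ (m, k) -> (n, k).
mv_effects = rng.normal(size=(m, k))
multi = G @ mv_effects
```

The last two lines show why a multivariate component can reuse the same code path: only the shape of the effect array changes.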

name: str = 'genetic'
kind: str = 'genetic'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.MVGeneticComponent(effects)

Bases: GeneticComponent

Multivariate genetic component: computes G @ effects for k traits.

Inherits compute() from GeneticComponent since numpy’s matvec handles both 1D and 2D effect arrays.

Parameters:

effects (EffectSpec) – Effect sizes with shape (m, k). Typically a MultivariateEffects instance.

name: str = 'mvGenetic'
class xftsim.arch.HaplotypeGeneticComponent(effects, haplotype='maternal')

Bases: ArchComponent

Haplotype-specific genetic component.

Computes hap[:,:,0] @ effects (maternal) or hap[:,:,1] @ effects (paternal). Enables indirect genetic effects (IGE) formulas where maternal and paternal contributions are modeled separately.

Parameters:
  • effects (EffectSpec) – Effect sizes (shape (m,)).

  • haplotype (str) – Which haplotype copy to use: 'maternal' or 'paternal'.

Raises:

ValueError – If haplotype is not 'maternal' or 'paternal'.
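
A minimal NumPy sketch of the haplotype-specific score follows. The function name haplotype_score is hypothetical; only the hap[:,:,0] / hap[:,:,1] indexing and the ValueError behavior come from the description above.

```python
import numpy as np

def haplotype_score(hap, effects, haplotype="maternal"):
    """Score one haplotype copy: index 0 is maternal, index 1 is paternal."""
    if haplotype not in ("maternal", "paternal"):
        raise ValueError("haplotype must be 'maternal' or 'paternal'")
    copy = 0 if haplotype == "maternal" else 1
    return hap[:, :, copy] @ effects

# Two individuals, two variants, two copies each.
hap = np.array([[[1, 0], [0, 1]],
                [[1, 1], [0, 0]]])
effects = np.array([0.5, -1.0])

maternal = haplotype_score(hap, effects, "maternal")
paternal = haplotype_score(hap, effects, "paternal")

# The two haplotype scores add up to the diploid score G @ effects.
diploid = hap.sum(axis=2) @ effects
```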

name: str = 'haplotypeGenetic'
kind: str = 'genetic'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.NoiseComponent(variance)

Bases: ArchComponent

Univariate noise component.

Draws iid N(0, variance) per individual. When grouping is active (e.g., | FID), draws one shared value per group and broadcasts to all members.

Parameters:

variance (float) – Noise variance (used as the variance of the normal distribution).
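
The difference between per-individual and per-group draws can be sketched as below. The helper grouped_noise is hypothetical, and it uses NumPy's Generator API for brevity; how xftsim maps group labels to draws internally is not documented here.

```python
import numpy as np

def grouped_noise(variance, n, groups=None, rng=None):
    """Draw N(0, variance) noise, shared within groups when groups is given."""
    rng = np.random.default_rng() if rng is None else rng
    sd = np.sqrt(variance)
    if groups is None:
        # One independent draw per individual.
        return rng.normal(0.0, sd, size=n)
    # One draw per distinct group label, broadcast back to its members.
    _, inverse = np.unique(groups, return_inverse=True)
    per_group = rng.normal(0.0, sd, size=inverse.max() + 1)
    return per_group[inverse]

fid = np.array(["f1", "f1", "f2", "f3", "f3", "f3"])
shared = grouped_noise(0.5, n=len(fid), groups=fid, rng=np.random.default_rng(1))
independent = grouped_noise(0.5, n=len(fid), rng=np.random.default_rng(1))
```

With grouping on FID, members of the same family receive identical values, which is one way to model a shared family environment.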

name: str = 'noise'
kind: str = 'generative'
accepts_grouping: bool = True
compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.CNoiseComponent(cov)

Bases: ArchComponent

Correlated multivariate noise component.

Draws N(0, cov) per individual. When grouping is active, draws one shared vector per group and broadcasts. Returns an (n, k) array.

Parameters:

cov (np.ndarray) – (k, k) covariance matrix. Must be square.

Raises:

ValueError – If cov is not a square matrix.
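
The multivariate case follows the same pattern with a covariance matrix. The sketch below is illustrative only; correlated_noise is a hypothetical helper built on NumPy's Generator.multivariate_normal, not the library's code.

```python
import numpy as np

def correlated_noise(cov, n, groups=None, rng=None):
    """Draw N(0, cov) vectors; one shared vector per group if groups is given."""
    cov = np.asarray(cov, dtype=float)
    if cov.ndim != 2 or cov.shape[0] != cov.shape[1]:
        raise ValueError("cov must be a square (k, k) matrix")
    rng = np.random.default_rng() if rng is None else rng
    k = cov.shape[0]
    if groups is None:
        return rng.multivariate_normal(np.zeros(k), cov, size=n)
    _, inverse = np.unique(groups, return_inverse=True)
    per_group = rng.multivariate_normal(np.zeros(k), cov, size=inverse.max() + 1)
    return per_group[inverse]  # shape (n, k)

cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
fid = np.array([1, 1, 2, 2])
draws = correlated_noise(cov, n=len(fid), groups=fid, rng=np.random.default_rng(0))
```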

name: str = 'cnoise'
kind: str = 'generative'
accepts_grouping: bool = True
property k: int

Number of correlated traits.

compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.ThresholdComponent(source, threshold)

Bases: ArchComponent

Binarizing threshold component: returns 1 where input exceeds threshold.

Implements the liability threshold model: given a continuous phenotype (liability) and a threshold, produces a binary indicator (diagnosis).

Parameters:
  • source (str) – Name of the input phenotype to threshold.

  • threshold (float) – Threshold value. Output is 1.0 where input > threshold, else 0.0.
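
The thresholding rule itself is a strict comparison followed by a cast to float, as in this small NumPy illustration (the liability values are made up for the example):

```python
import numpy as np

# Hypothetical liabilities for five individuals.
liability = np.array([-1.2, 0.3, 1.7, 0.9, 1.0])
threshold = 1.0

# Strict inequality: a liability exactly equal to the threshold maps to 0.0.
diagnosis = (liability > threshold).astype(float)
# diagnosis -> array([0., 0., 1., 0., 0.])
```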

name: str = 'threshold'
kind: str = 'aggregating'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.AggregationComponent(expression)

Bases: ArchComponent

Aggregation component: evaluates arithmetic expressions over phenotype values.

Uses a custom tokenizer + shunting-yard evaluator (no eval()). Supports +, -, *, /, scalar multiplication, parentheses, and dotted names (e.g., 'Y.G + Y.E').

Parameters:

expression (str) – Arithmetic expression referencing phenotype names, e.g. 'height.G + height.E' or '0.5 * (Y.G + Y.E)'.
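
The tokenizer plus shunting-yard approach can be sketched in pure Python. This illustrates the general technique only and is not the xftsim evaluator: tokenize, to_rpn, and evaluate are hypothetical names, and unary minus is deliberately left unsupported.

```python
import operator
import re

# One token per match: a number, a (possibly dotted) name, or an operator.
TOKEN = re.compile(r"\s*(?:(\d+\.?\d*|\.\d+)|([A-Za-z_]\w*(?:\.\w+)*)|([-+*/()]))")
# Operator -> (precedence, function); all four are left-associative.
OPS = {"+": (1, operator.add), "-": (1, operator.sub),
       "*": (2, operator.mul), "/": (2, operator.truediv)}

def tokenize(expr):
    tokens, pos = [], 0
    expr = expr.rstrip()
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        number, name, symbol = match.groups()
        if number is not None:
            tokens.append(("num", float(number)))
        elif name is not None:
            tokens.append(("name", name))
        else:
            tokens.append(("op", symbol))
        pos = match.end()
    return tokens

def to_rpn(tokens):
    """Shunting-yard: convert infix tokens to reverse Polish notation."""
    output, stack = [], []
    for kind, value in tokens:
        if kind != "op":
            output.append((kind, value))
        elif value == "(":
            stack.append(value)
        elif value == ")":
            while stack and stack[-1] != "(":
                output.append(("op", stack.pop()))
            if not stack:
                raise ValueError("unbalanced parentheses")
            stack.pop()  # discard the matching "("
        else:
            # Pop operators of equal or higher precedence (left-associative).
            while stack and stack[-1] != "(" and OPS[stack[-1]][0] >= OPS[value][0]:
                output.append(("op", stack.pop()))
            stack.append(value)
    while stack:
        top = stack.pop()
        if top == "(":
            raise ValueError("unbalanced parentheses")
        output.append(("op", top))
    return output

def evaluate(expr, values):
    """Evaluate expr, looking names up in the values mapping."""
    stack = []
    for kind, value in to_rpn(tokenize(expr)):
        if kind == "num":
            stack.append(value)
        elif kind == "name":
            stack.append(values[value])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[value][1](left, right))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

values = {"Y.G": 2.0, "Y.E": 4.0}
total = evaluate("0.5 * (Y.G + Y.E)", values)
```

Because the operators come from the operator module, the same evaluator works elementwise when the mapping holds NumPy arrays instead of floats.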

name: str = 'aggregation'
kind: str = 'aggregating'
accepts_grouping: bool = False
compute(node, haplotypes, phenotypes, **kwargs)

Execute this component and return the result array.

Parameters:
  • node (ArchNode) – The node being executed (provides inputs, outputs, grouping).

  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray) – Current phenotype array (may already have upstream values).

  • **kwargs – Additional context: phenotype_history, pedigree_history, generation.

Return type:

ndarray

Returns:

np.ndarray – Result array of shape (n,) or (n, k) for multi-output.

class xftsim.arch.MotherComponent(phenotype_name, founder_component=None, normalize=False)

Bases: _ParentalComponent

Vertical transmission: mother’s phenotype from previous generation.

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.

name: str = 'mother'
phenotype_name: str
founder_component: ArchComponent | None
normalize: bool
class xftsim.arch.FatherComponent(phenotype_name, founder_component=None, normalize=False)

Bases: _ParentalComponent

Vertical transmission: father’s phenotype from previous generation.

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.

name: str = 'father'
phenotype_name: str
founder_component: ArchComponent | None
normalize: bool
class xftsim.arch.ParentComponent(phenotype_name, founder_component=None, normalize=False)

Bases: _ParentalComponent

Vertical transmission: midparent (average of mother and father).

Parameters:
  • phenotype_name (str) – Phenotype to look up in previous generation.

  • founder_component (ArchComponent, optional) – Fallback at generation 0.

  • normalize (bool) – Standardize parental values before lookup. Default False.
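
The three vertical-transmission lookups can be illustrated with index arrays into the previous generation. This is a sketch under stated assumptions: parental_values, mother_idx, and father_idx are hypothetical names, normalization is assumed to standardize the whole previous-generation vector before indexing, and the generation-0 founder fallback is not modeled.

```python
import numpy as np

def standardize(x):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    return (x - x.mean()) / x.std()

def parental_values(prev_pheno, mother_idx, father_idx, normalize=False):
    """Return (mother, father, midparent) values for each offspring."""
    values = standardize(prev_pheno) if normalize else prev_pheno
    mother = values[mother_idx]
    father = values[father_idx]
    return mother, father, 0.5 * (mother + father)

# Previous-generation phenotype, one entry per parent-generation individual.
prev_pheno = np.array([1.0, 3.0, 5.0, 7.0])
# Row positions of each offspring's mother and father in prev_pheno.
mother_idx = np.array([0, 0, 2])
father_idx = np.array([1, 1, 3])

mother, father, midparent = parental_values(prev_pheno, mother_idx, father_idx)
```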

name: str = 'parent'
phenotype_name: str
founder_component: ArchComponent | None
normalize: bool
class xftsim.arch.SiblingMeanComponent(source_name)

Bases: _SiblingComponent

Sibling mean: average of source phenotype within group.

Parameters:

source_name (str)

name: str = 'sibling_mean'
class xftsim.arch.SiblingSumComponent(source_name)

Bases: _SiblingComponent

Sibling sum: sum of source phenotype within group.

Parameters:

source_name (str)

name: str = 'sibling_sum'
class xftsim.arch.SiblingAnyComponent(source_name)

Bases: _SiblingComponent

Sibling any: 1.0 if any member in group has value > 0, else 0.0.

Parameters:

source_name (str)

name: str = 'sibling_any'
class xftsim.arch.SiblingCountComponent(source_name)

Bases: _SiblingComponent

Sibling count: number of individuals in each group.

Parameters:

source_name (str)

name: str = 'sibling_count'
class xftsim.arch.SiblingEldestComponent(source_name)

Bases: _SiblingComponent

Sibling eldest: value of the first (lowest IID) member in each group.

Parameters:

source_name (str)

name: str = 'sibling_eldest'
class xftsim.arch.SiblingYoungestComponent(source_name)

Bases: _SiblingComponent

Sibling youngest: value of the last (highest IID) member in each group.

Parameters:

source_name (str)

name: str = 'sibling_youngest'
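
The six sibling aggregations can be reproduced with NumPy group operations, as sketched below. The arrays fid, iid, and y are made-up inputs, and the sketch assumes each aggregate includes the focal individual, which the descriptions above ("within group") suggest but do not state outright.

```python
import numpy as np

fid = np.array([10, 10, 20, 20, 20, 30])   # grouping variable
iid = np.array([1, 2, 3, 4, 5, 6])         # individual IDs
y = np.array([1.0, 0.0, 0.0, 0.0, 2.0, 0.0])

# Map each individual to a dense group index and count group sizes.
_, inverse, counts = np.unique(fid, return_inverse=True, return_counts=True)

sibling_sum = np.bincount(inverse, weights=y)[inverse]
sibling_count = counts[inverse].astype(float)
sibling_mean = sibling_sum / sibling_count
any_per_group = np.bincount(inverse, weights=(y > 0).astype(float)) > 0
sibling_any = any_per_group.astype(float)[inverse]

# Sort by group, then by IID, and mark the first and last row of each group.
order = np.lexsort((iid, inverse))
sorted_groups = inverse[order]
boundary = sorted_groups[1:] != sorted_groups[:-1]
first = order[np.r_[True, boundary]]
last = order[np.r_[boundary, True]]
sibling_eldest = y[first][inverse]      # value of lowest-IID member
sibling_youngest = y[last][inverse]     # value of highest-IID member
```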
class xftsim.arch.ArchNode(outputs, component, inputs=<factory>, grouping=None)

Bases: object

A single node in the architecture DAG.

Parameters:
  • outputs (list[str]) – Names written to PhenotypeArray.

  • component (ArchComponent) – The computation to perform.

  • inputs (list[str]) – Names read from PhenotypeArray (for aggregation) or [] (for generative).

  • grouping (str or None) – Grouping variable for | operator, or None (implicit | IID).

outputs: list[str]
component: ArchComponent
inputs: list[str]
grouping: str | None = None
xftsim.arch.BUILTINS: dict[str, type[ArchComponent]] = {
    'cnoise': <class 'xftsim.arch.CNoiseComponent'>,
    'father': <class 'xftsim.arch.FatherComponent'>,
    'genetic': <class 'xftsim.arch.GeneticComponent'>,
    'haplotypeGenetic': <class 'xftsim.arch.HaplotypeGeneticComponent'>,
    'mother': <class 'xftsim.arch.MotherComponent'>,
    'mvGenetic': <class 'xftsim.arch.MVGeneticComponent'>,
    'noise': <class 'xftsim.arch.NoiseComponent'>,
    'parent': <class 'xftsim.arch.ParentComponent'>,
    'sibling_any': <class 'xftsim.arch.SiblingAnyComponent'>,
    'sibling_count': <class 'xftsim.arch.SiblingCountComponent'>,
    'sibling_eldest': <class 'xftsim.arch.SiblingEldestComponent'>,
    'sibling_mean': <class 'xftsim.arch.SiblingMeanComponent'>,
    'sibling_sum': <class 'xftsim.arch.SiblingSumComponent'>,
    'sibling_youngest': <class 'xftsim.arch.SiblingYoungestComponent'>,
    'threshold': <class 'xftsim.arch.ThresholdComponent'>,
}

Registry mapping DSL function names to ArchComponent subclasses.

Used by the formula parser to resolve function calls like genetic(eff) to the corresponding component class.

class xftsim.arch.Architecture(formula=None, effects=None)

Bases: object

Phenogenetic architecture: a DAG of ArchNodes executed each generation.

Can be constructed programmatically via add() or from a formula string (parsed by the parser module). Nodes are topologically sorted so that dependencies are resolved before dependents.
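
The ordering guarantee can be illustrated with the standard library's graphlib. The dependency table and the lambda "components" below are hypothetical stand-ins; how Architecture performs its own sort internally is not documented here.

```python
from graphlib import TopologicalSorter

# Hypothetical node table: output name -> names it reads as inputs.
node_inputs = {
    "D": ["Y"],
    "Y": ["Y.G", "Y.E"],
    "Y.G": [],
    "Y.E": [],
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(node_inputs).static_order())

# Stand-in computations keyed by output name; each reads upstream values.
compute = {
    "Y.G": lambda p: 0.3,
    "Y.E": lambda p: -0.1,
    "Y": lambda p: p["Y.G"] + p["Y.E"],
    "D": lambda p: float(p["Y"] > 0.0),
}

# Executing in sorted order means inputs always exist before they are read.
phenotypes = {}
for name in order:
    phenotypes[name] = compute[name](phenotypes)
```

A cyclic table (for example, Y depending on D and D on Y) makes TopologicalSorter raise graphlib.CycleError, which is the kind of failure a DAG-based design has to reject.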

Parameters:
  • formula (str, optional) – Multi-line formula string (parsed into ArchNodes). See parser.parse_formula for the grammar.

  • effects (dict, optional) – Name → EffectSpec mapping for resolving effect references in the formula.


Examples

Programmatic construction:

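>>> from xftsim.arch import (Architecture, GeneticComponent,
...                          NoiseComponent, AggregationComponent)
>>> from xftsim.effect import AdditiveEffects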
>>> eff = AdditiveEffects.from_h2(h2=0.5, m=100, seed=1)
>>> arch = Architecture()
>>> arch.add('Y.G', GeneticComponent(eff))
>>> arch.add('Y.E', NoiseComponent(0.5))
>>> arch.add('Y', AggregationComponent('Y.G + Y.E'))

Formula construction:

>>> arch = Architecture(
...     formula="""
...     Y.G ~ genetic(eff)
...     Y.E ~ noise(0.5)
...     Y ~ Y.G + Y.E
...     """,
...     effects={'eff': eff},
... )
classmethod from_formula(formula, effects=None)

Construct an Architecture from a DSL formula string.

Parameters:
  • formula (str) – Multi-line formula string (see parser module for grammar).

  • effects (dict, optional) – Name → EffectSpec mapping for resolving effect references.

Return type:

Architecture

Returns:

Architecture

add(outputs, component, inputs=None, grouping=None)

Programmatically add a node to the architecture.

Parameters:
  • outputs (str or list[str]) – Output name(s).

  • component (ArchComponent) – The component to execute.

  • inputs (list[str], optional) – Input names (for aggregation). Auto-detected for AggregationComponent.

  • grouping (str, optional) – Grouping variable.

Return type:

None

property nodes: list[ArchNode]

Return the topologically sorted node list.

compute(haplotypes, phenotypes=None, rng=None, **kwargs)

Execute all nodes in topological order.

Parameters:
  • haplotypes (HaplotypeOperator) – Current generation’s haplotype data.

  • phenotypes (PhenotypeArray, optional) – Existing phenotype array to write into. Created if None.

  • rng (np.random.RandomState, optional) – Random state for noise components.

  • **kwargs – Additional context (phenotype_history, pedigree_history, generation).

Return type:

PhenotypeArray

Returns:

PhenotypeArray – The phenotype array with all computed values.
